The second day of ACM Multimedia 2013 was packed with challenges and competitions.
The Multimedia Grand Challenge is a plenary session in which researchers present, at a very fast pace (5-6 minutes each), their solutions to a number of challenges proposed by companies such as Yahoo!, Microsoft, Technicolor and Huawei.
These challenges are indeed quite grand: think about solving large-scale Flickr-tag image classification on 2 million images (proposed by Yahoo!), providing rich multimedia retrieval from query videos (proposed by Technicolor), finding beautiful shots in videos (proposed by NHK) or developing a web-scale image retrieval system for Bing (proposed by Microsoft).
Regarding the Bing challenge, I'd like to highlight that Microsoft has done something that's quite unusual and extremely useful for the scientific community: they have provided Bing query logs and information about user clicks. The dataset is still available for future development.
Of the proposed solutions, the one I liked most (and that won the second prize) is presented in "Towards a Comprehensive Computational Model for Aesthetic Assessment of Videos". The authors propose an approach that uses an aesthetic model emphasizing psycho-visual statistics extracted from multiple levels, in contrast with more typical approaches that rely on sets of visual concept classifiers. In particular, the middle level of the system uses a new large-scale visual sentiment ontology with more than 1,200 associated detectors. Details on this ontology can be found in the paper "Large-scale Visual Sentiment Ontology and Detectors Using Adjective Noun Pairs", which was presented in the "Brave new Topics: Social and Cognitive Aspects" session.