WIAMIS 2012: The 13th International Workshop on Image Analysis for Multimedia Interactive Services

23rd - 25th May 2012, Dublin City University, Ireland

Dr. Aljoscha Smolic

Disney Research Zurich, Switzerland

Time and Date, TBC

 

Content Creation for Stereoscopic 3D and Beyond

Stereoscopic 3D is established in cinema, on Blu-ray, TV, PCs, laptops, and mobile devices. Since the technology for stereo 3D is mature and content creation is understood well enough, these developments are expected to be sustainable this time. Most current systems rely on classical approaches to 3D video, i.e. representation as stereo or multiview video, with coding and transmission using simulcast, frame-compatible composition, or MVC. More advanced "next generation" approaches exploit some understanding of the 3D scene geometry, such as depth or disparity, in order to extend functionality and increase efficiency. This includes, for instance, flexible adjustment of the depth impression to viewing conditions and user preferences, or support of autostereoscopic multiview displays. Content creation for classical stereo 3D can also benefit greatly from such 3D-geometry-aware processing. Naturally, such advanced 3D video representation formats require advanced processing algorithms, e.g. to extract 3D geometry and to render virtual views. Such advanced 3D video representation and processing will be the focus of this talk.

Biography

Dr. Aljoscha Smolic joined Disney Research Zurich, Switzerland in 2009 as Senior Research Scientist and Head of the "Advanced Video Technology" group. Before that, he was Scientific Project Manager at the Fraunhofer Institute for Telecommunications, Heinrich-Hertz-Institut (HHI), Berlin, where he also headed a research group. He has been involved in several national and international research projects, conducting research in video processing, video coding, computer vision, and computer graphics, and has published more than 100 refereed papers in these fields. In current projects he is responsible for research in 2D video, 3D video, and free viewpoint video processing and coding. He received the Dipl.-Ing. degree in Electrical Engineering from the Technical University of Berlin, Germany, in 1996, and the Dr.-Ing. degree in Electrical Engineering and Information Technology from Aachen University of Technology (RWTH), Germany, in 2001. Dr. Smolic received the "Rudolf-Urtel-Award" of the German Society for Technology in TV and Cinema (FKTG) for his dissertation in 2002. He is Area Editor for Signal Processing: Image Communication and has served as Guest Editor for the Proceedings of the IEEE, IEEE Transactions on CSVT, IEEE Signal Processing Magazine, and other scientific journals. He is a committee member of several conferences, including ICIP, ICME, and EUSIPCO, and has served in several chair positions at conferences. He chaired the MPEG ad hoc group on 3DAV, pioneering standards for 3D video. In this context he also served as one of the Editors of the Multi-view Video Coding (MVC) standard. For many years he has taught full lecture courses on Multimedia Communications and other topics, now at ETH Zurich.


Dr. Tat-Seng Chua

National University of Singapore

Time and Date, TBC

 

Learning the Social Pulses of a City from User-Generated Contents

Social networks have become an indispensable part of social activities for many people. Most such people share their views, delights, and life moments, and make online purchases, on various types of social networks. In particular, a large number of images and videos are shared in the process, forming a rich part of such social engagement and content. Mining such visual media, along with the associated text and metadata (collectively known as user-generated contents, or UGCs), enables us to study various aspects of users' social habits and preferences, and to work towards generating a graph of social pulses. This talk describes our research on analyzing UGCs to generate the social graph of a city. It discusses our research on analyzing and structuring huge amounts of UGCs, ranging from user comments, tweets, shared media, and check-in venues to social activities on mobile devices. In particular, we will describe our research towards mining the collective interests and concerns of people, social activities, various types of popular locations, trails and their photos, as well as local fashion trends.

Biography

Tat-Seng Chua is the KITHCT Chair Professor at the School of Computing, National University of Singapore (NUS). He was the Acting and Founding Dean of the School of Computing during 1998-2000. He joined NUS in 1983, and spent three years as a research staff member at the Institute of Systems Science (now I2R) in the late 1980s. Dr Chua's main research interests are in multimedia information retrieval, multimedia question-answering, and the analysis and structuring of user-generated contents. He works on several multi-million-dollar projects: interactive media search, local contextual search, and real-time live media search.

Dr Chua has organized and served as program committee member of numerous international conferences in the areas of computer graphics, multimedia, and text processing. He was the conference co-chair of ACM Multimedia 2005, CIVR (Conference on Image and Video Retrieval) 2005, and ACM SIGIR 2008, and the Technical PC Co-Chair of SIGIR 2010. He serves on the editorial boards of ACM Transactions on Information Systems (ACM), Foundations and Trends in Information Retrieval (NOW), The Visual Computer (Springer Verlag), and Multimedia Tools and Applications (Kluwer). He sits on the steering committees of the ICMR (International Conference on Multimedia Retrieval), Computer Graphics International, and Multimedia Modeling conference series, and serves as a member of the International Review Panels of two large-scale research projects in Europe. He is an independent director of two listed companies in Singapore.


Dr. Mubarak Shah

University of Central Florida

Time and Date, TBC

 

Representing Actions, Events, and Activities as Motion Patterns

Automatic analysis of videos is one of the most challenging problems in computer vision. In this talk I will introduce the problem of action, event, and activity representation and recognition from video sequences. I will begin by giving a brief overview of a few interesting methods for solving this problem, including representations based on trajectories, volumes, and local interest points.

The main part of the talk will focus on a newly developed framework for the discovery and statistical representation of motion patterns in videos, which can act as primitive, atomic actions. These action primitives are employed as a generalizable representation of articulated human actions, gestures, and facial expressions. The motion primitives are learned by hierarchical clustering of observed optical flow in a four-dimensional spatial and motion flow space, and a sequence of these primitives can be represented as a simple string, a histogram, or a hidden Markov model.

I will then describe methods to extend the framework of motion pattern estimation to the problem of multi-agent activity recognition. First, I will talk about transformation-invariant matching of motion patterns in order to recognize simple events in surveillance scenarios. I will end the talk by presenting a framework in which a motion pattern represents the behavior of a single agent, while multi-agent activity takes the form of a graph, which can be compared to other activity graphs by attributed inexact graph matching. This method is applied to the problem of recognizing American football plays.

Biography

Dr. Mubarak Shah, Agere Chair Professor of Computer Science, is the founding director of the Computer Vision Lab at the University of Central Florida (UCF). He is a co-author of three books (Motion-Based Recognition (1997), Video Registration (2003), and Automated Multi-Camera Surveillance: Algorithms and Practice (2008)), all published by Springer. He has published extensively on topics related to visual surveillance, tracking, human activity and action recognition, object detection and categorization, shape from shading, geo registration, visual crowd analysis, etc. Dr. Shah is a fellow of IEEE, IAPR, AAAS, and SPIE. In 2006, he was awarded the Pegasus Professor award, the highest award at UCF, given to a faculty member who has made a significant impact on the university, has made an extraordinary contribution to the university community, and has demonstrated excellence in teaching, research, and service. He is an ACM Distinguished Speaker. He was an IEEE Distinguished Visitor speaker for 1997-2000, and received the IEEE Outstanding Engineering Educator Award in 1997. He received the Harris Corporation's Engineering Achievement Award in 1999; the TOKTEN awards from UNDP in 1995, 1997, and 2000; the SANA award in 2007; and an honorable mention for the ICCV 2005 Where Am I? Challenge Problem, and was nominated for the best paper award at the ACM Multimedia Conference in 2005 and 2010. At UCF he received the Scholarship of Teaching and Learning (SoTL) award in 2011; the College of Engineering and Computer Science Advisory Board award for faculty excellence in 2011; Teaching Incentive Program awards in 1995 and 2003; Research Incentive Awards in 2003 and 2009; Millionaires' Club awards in 2005, 2006, 2009, 2010, and 2011; and the University Distinguished Researcher award in 2007. He is an editor of the international book series on Video Computing, editor in chief of the Machine Vision and Applications journal, and an associate editor of the ACM Computing Surveys journal. He was an associate editor of the IEEE Transactions on PAMI, and a guest editor of the special issue of the International Journal of Computer Vision on Video Computing. He was the program co-chair of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2008.


Prof. Gaël Richard

Télécom ParisTech

Time and Date, TBC

 

Audio and Multimedia Music Signal Processing

The enormous amount of unstructured audio data available nowadays and the spread of its use as a data source in many applications are introducing new challenges to researchers in information and multimedia signal processing. The automatic analysis of audio documents (music, radio broadcast audio streams, ...) gathers several research directions including audio indexing (extraction of informative features leading to the estimation of melody, rhythm, instrumentation, harmony, ...), audio classification (grouping by similarity, by music genre, or by audio event categories), and content-based retrieval (such as query by example or query by humming approaches). The aim of this talk is to provide a general introduction to audio signal processing, to review the state of the art in audio and music signal processing, and to highlight with some examples the potential of audio signal processing for multimedia streams. This talk will also be the occasion to present recent results obtained on a large multi-sensor database built for the Huawei/3Dlife grand challenge of the ACM Multimedia 2011 Conference and to highlight the potential of multiple-sensor processing for multimedia applications.

Biography

Prof. Gaël Richard received the State Engineering degree from Télécom ParisTech in 1990, and the Ph.D. and "Habilitation à Diriger des Recherches" degrees from University of Paris-XI in 1994 and 2001, respectively. He then spent two years at the CAIP Center, Rutgers University (USA), in the speech processing group of Prof. J. Flanagan, where he explored innovative approaches for speech production. Between 1997 and 2001, he successively worked for Matra Nortel Communications and Philips Consumer Communications. In particular, he was the project manager of several large-scale European projects in the field of multimodal speaker verification and audio processing. He then joined Télécom ParisTech, where he is now full Professor and Head of the Audio, Acoustics and Waves research group. Co-author of over 140 papers and inventor on a number of patents, he is also an expert for the European Commission in audio and multimedia signal processing. A former Associate Editor of the IEEE Transactions on Audio, Speech and Language Processing, Prof. Richard is a member of EURASIP, a senior member of the IEEE, and a member of the Audio Acoustics Signal Processing Technical Committee of the IEEE.