Keynote: Deep Visual Understanding from Deep Learning
Deep learning and neural networks, coupled with high-performance computing, have led to remarkable advances in computer vision. For example, we are now good at detecting and localizing people or objects and determining their 3D pose and layout in a scene. But we are still quite short of “visual understanding,” a much larger problem. For instance, vision helps guide manipulation and locomotion, and this requires building dynamic models of the consequences of various actions. Further, we should not just detect people, objects, and actions but also link them together, by what we call “visual semantic role labeling,” essentially identifying subject-verb-object relationships. And finally, we should be able to make predictions: what will happen next in a video stream? In this talk, Professor Malik will review progress in deep visual understanding, give an overview of the state of the art, and show a tantalizing glimpse into what the future holds.
Keynote Speaker. Over the past 30 years, Prof. Malik's research group has worked on many different topics in computer vision. Several well-known concepts and algorithms arose from this research, such as anisotropic diffusion, normalized cuts, high dynamic range imaging, and shape contexts. Prof. Malik received the Distinguished Researcher in Computer Vision Award from IEEE PAMI-TC and the K.S. Fu Prize from the International Association for Pattern Recognition. He has been elected to the National Academy of Sciences, the National Academy of Engineering, and the American Academy of Arts and Sciences. He earned a B.Tech. in Electrical Engineering from the Indian Institute of Technology, Kanpur in 1980 and a PhD in Computer Science from Stanford University in 1985.