Mon 01/31

Seminar @ Cornell Tech: Georgia Gkioxari

Perceiving the World in 2D and 3D

Images are powerful storytellers: they capture events, memorable or mundane, from our everyday lives. Humans perceive images without any difficulty, but for machines to do the same, they need to build an understanding of the world, a world composed of complex objects, humans, and their rich interactions. In this talk, I will present my work toward enabling machines to recognize and localize objects in images and even reason about their interactions. The progress in 2D visual understanding is unprecedented and, thanks to the advances presented in this talk, it powers our smartphones, our home devices, and our self-driving cars. However, the world is 3D, and objects have 3D properties that modern recognition models ignore. Toward 3D perception, I will present our work on inferring 3D object shapes from real-world images by learning object priors, and on understanding 3D scene semantics via multi-view 2D supervision with the help of differentiable geometry. To this end, I will present PyTorch3D, our efficient and modular 3D deep learning library, which allows us to marry advances in deep learning with geometry and is widely adopted within the academic and industry research communities.

Speaker Bio

Georgia Gkioxari is a research scientist at Meta AI. She received her PhD in computer science and electrical engineering from the University of California at Berkeley in 2016, under the supervision of Jitendra Malik. Her research interests lie in computer vision, with a focus on object recognition from images and videos. In 2017, Georgia received the Marr Prize at ICCV for “Mask R-CNN”. In 2019, she was named one of the 30 Influential Women Advancing AI by ReWork and was nominated for the Women in AI awards by VentureBeat. In 2021, Georgia received the PAMI Young Researcher Award and the Mark Everingham Prize for Detectron.