The Impact of VLMs on Semantic Navigation: A Before and After View of Object Search

Abstract: Understanding how humans leverage semantic knowledge to navigate unfamiliar environments and decide where to explore next is pivotal for developing robots capable of human-like search behaviors. For example, when looking for a fork, a person would look near refrigerators and ovens, not beds and sofas. To perform similar reasoning, a robot needs to have and use priors about the expected semantic layout of the environment. In this talk, I will present two of my solutions to this object search problem that leverage semantic priors, developed directly before (ICLR 2022) and directly after (ICRA 2024) the recent rapid improvements in large language models (LLMs) and vision-language models (VLMs). I will discuss how these advances in natural language and computer vision changed our solution to this robotics problem, and I will also discuss the connection between these object search solutions and other unsolved semantic reasoning challenges in robotics.

Bio: Bernadette Bucher is an Assistant Professor in the Robotics Department at the University of Michigan. She leads the Mapping and Motion Lab, which focuses on learning interpretable visual representations and estimating their uncertainty for use in robotics, particularly mobile manipulation. Before joining the University of Michigan this fall, she was a research scientist at the Boston Dynamics AI Institute, a senior software engineer at Lockheed Martin Corporation, and an intern at NVIDIA Research. She earned her PhD from the University of Pennsylvania and her bachelor's and master's degrees from the University of Alabama.