Object-Centric Scene Understanding via Lidar-Based 3D Object Detection
Author(s)
Wilson, Benjamin
Abstract
3D scene understanding is a fundamental challenge in designing autonomous systems that navigate in the real world. With applications ranging from autonomous driving to warehouse robotics, effective 3D understanding has the potential to revolutionize a number of industries. Understanding complex scenes requires large amounts of diverse data together with models that can efficiently determine where objects of interest are located in 3D environments. This dissertation argues that (1) current on-road autonomy datasets fail to capture a diverse set of objects and behaviors, and (2) lidar-based 3D object detection models, particularly in the range view, are the starting point for building a robust understanding of complex 3D scenes. In this dissertation, I introduce three original contributions which focus on developing diverse datasets and lidar-based 3D object detection models for autonomy. I first present Argoverse 2: three large-scale datasets that capture complex behaviors in multiple sensor modalities. Second, I introduce a set of straightforward techniques for 3D object detection in the native lidar representation, the range view, which make this class of methods competitive with bird's-eye view and voxel-based models. Lastly, I extend range view 3D object detection to hybrid architectures which combine multiview representations, multimodal projections, and temporal reasoning.
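The range view mentioned above refers to the native representation produced by a spinning lidar: each laser return is indexed by its azimuth and elevation angles, forming a dense 2D image of ranges rather than a sparse point cloud. The sketch below illustrates this idea with a spherical projection; the function name, image dimensions, and vertical field-of-view values are illustrative assumptions, not parameters from the dissertation.

```python
import numpy as np

def points_to_range_image(points, height=32, width=1024,
                          fov_up_deg=10.0, fov_down_deg=-30.0):
    """Project an (N, 3) lidar point cloud into an (H, W) range image.

    Sensor parameters (beam count, horizontal resolution, vertical FOV)
    are illustrative; real sensors define their own values.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points[:, :3], axis=1)

    yaw = np.arctan2(y, x)  # azimuth in [-pi, pi]
    pitch = np.arcsin(np.clip(z / np.maximum(r, 1e-8), -1.0, 1.0))

    fov_up = np.deg2rad(fov_up_deg)
    fov_down = np.deg2rad(fov_down_deg)
    fov = fov_up - fov_down

    # Map azimuth to columns and elevation to rows.
    u = 0.5 * (1.0 - yaw / np.pi) * width
    v = (fov_up - pitch) / fov * height

    u = np.clip(np.floor(u), 0, width - 1).astype(np.int32)
    v = np.clip(np.floor(v), 0, height - 1).astype(np.int32)

    # Resolve collisions by keeping the nearest return per pixel:
    # write far points first so near points overwrite them.
    order = np.argsort(-r)
    image = np.full((height, width), -1.0, dtype=np.float32)  # -1 = empty
    image[v[order], u[order]] = r[order]
    return image
```

Because the result is an image, range view detectors can apply standard 2D convolutional backbones directly, which is part of what makes the representation attractive relative to voxelization.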
Date
2024-12-03
Resource Type
Text
Resource Subtype
Dissertation