Enabling Ubiquitous 3D Intelligence via Algorithm-Hardware Synergy
Author(s)
Li, Chaojian
Abstract
3D intelligence is emerging as one of the next frontiers of artificial intelligence, bridging the gap between the digital and physical worlds by enabling machines to perceive, understand, and interact with the 3D environment around us. Unlike traditional AI domains such as natural language processing or image recognition, which focus on interpreting 2D or sequential data, 3D intelligence deals with spatial complexity and richer sensory input. This opens the door to transformative applications in autonomous driving, augmented/virtual reality (AR/VR), embodied AI, telepresence, and beyond. However, realizing the vision of ubiquitous 3D intelligence, i.e., enabling every application to run on every device, all at once, presents fundamental challenges in terms of (1) efficiency, e.g., achieving real-time processing speeds for 3D intelligence applications; (2) accessibility, e.g., enabling diverse devices to run the corresponding 3D intelligence pipeline; and (3) adaptability, e.g., supporting multiple 3D intelligence applications within a unified framework.

This thesis addresses these challenges through a unified research agenda: exploring multi-granular algorithm-hardware synergy. That is, instead of treating algorithms and hardware as separate entities, we design them jointly by leveraging hardware-aware algorithms, customizing hardware architectures for specific algorithmic bottlenecks, and co-optimizing both across different system layers:

(1) Tackling the Efficiency Challenge: Instant-3D. We identify that many 3D intelligence workloads are dominated by a small set of bottleneck operators, particularly those with irregular memory access patterns due to the non-local layout of 3D data. To address this, we develop Instant-3D, a hardware-algorithm co-design framework that optimizes both memory usage and access regularity for these operators. This design is further extended to support large-scale scenes via a multi-chip system, culminating in the Fusion-3D prototype chip.

(2) Tackling the Accessibility Challenge: MixRT. Modern edge devices typically feature heterogeneous GPU resources, but existing rendering models often fail to make full use of them. We propose MixRT, one of the first frameworks that combines traditional graphics and neural network operators into a hybrid pipeline optimized for diverse hardware architectures.

(3) Tackling the Adaptability Challenge: Uni-Render. 3D intelligence encompasses a broad range of applications, each with its own preferred rendering paradigm. To support these diverse models, we develop Uni-Render, the first unified accelerator that supports five different rendering pipelines using a reconfigurable architecture, offering a practical path toward general-purpose rendering acceleration on edge platforms.

(4) Research Infrastructure Development: HW-NAS-Bench. In addition to technical innovations, this thesis also contributes to the broader research community through infrastructure development. We release HW-NAS-Bench, a large-scale benchmark providing energy and latency data for over 15,000 neural architectures across multiple hardware platforms. This resource supports fair comparisons and reproducibility in hardware-aware neural architecture search, and has become a widely used tool in the community.

Looking Forward. The approach outlined in this thesis, systematically addressing efficiency, accessibility, and adaptability via algorithm-hardware synergy, offers a scalable pathway toward truly ubiquitous 3D intelligence. We envision a future where any 3D intelligence application, from embodied robotics to scientific modeling, can run instantly and efficiently across heterogeneous platforms. This vision extends not only to human-facing applications like AR/VR and autonomous vehicles, but also to scientific discovery in domains such as atomistic modeling and material science. Ultimately, this work contributes toward building AI systems that are increasingly immersive, responsive, and accessible across a wide range of devices and applications.
Date
2025-05-19
Resource Type
Text
Resource Subtype
Dissertation