Huaiyuan Xu · Junliang Chen · Shiyu Meng · Yi Wang · Lap-Pui Chau*
This work focuses on 3D dense perception in autonomous driving, encompassing LiDAR-Centric Occupancy Perception, Vision-Centric Occupancy Perception, and Multi-Modal Occupancy Perception, and discusses information fusion techniques for this field. We believe this is the most comprehensive survey to date on 3D occupancy perception. Please stay tuned!😉😉😉
This is an active repository; you can watch it to follow the latest advances. If you find it useful, please kindly star this repo.
✨We welcome contributions of your work on any topic related to 3D occupancy for autonomous driving (involving not only perception, but also applications)!
If you discover any missing work or have any suggestions, please feel free to submit a pull request or contact us. We will promptly add the missing papers to this repository.
[1] A systematic survey of the latest research on 3D occupancy perception in the field of autonomous driving.
[2] The survey provides a taxonomy of 3D occupancy perception, and elaborates on core methodological issues, including network pipelines, multi-source information fusion, and effective network training.
[3] The survey presents evaluations for 3D occupancy perception, and offers detailed performance comparisons. Furthermore, current limitations and future research directions are discussed.
- [2024-05-18] More figures have been added to the survey. We have reorganized the occupancy-based applications.
- [2024-05-08] The first version of the survey is available on arXiv. We have created this repository.
3D occupancy perception technology aims to observe and understand dense 3D environments for autonomous vehicles. Owing to its comprehensive perception capability, this technology is emerging as a trend in autonomous driving perception systems, and is attracting significant attention from both industry and academia. Similar to traditional bird's-eye view (BEV) perception, 3D occupancy perception has the nature of multi-source input and the necessity for information fusion. However, the difference is that it captures vertical structures that are ignored by 2D BEV. In this survey, we review the most recent works on 3D occupancy perception, and provide in-depth analyses of methodologies with various input modalities. Specifically, we summarize general network pipelines, highlight information fusion techniques, and discuss effective network training. We evaluate and analyze the occupancy perception performance of the state-of-the-art on the most popular datasets. Furthermore, challenges and future research directions are discussed. We hope this paper will inspire the community and encourage more research work on 3D occupancy perception.
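To make the contrast with BEV concrete, here is a minimal sketch (with hypothetical grid sizes and class labels, not taken from the survey) of a semantic occupancy grid: each voxel stores a class ID, and collapsing the grid to a 2D BEV map discards exactly the vertical structure that 3D occupancy preserves.

```python
import numpy as np

# Hypothetical semantic occupancy grid: each voxel holds a class ID,
# with 0 denoting free space. Sizes and labels are illustrative only.
X, Y, Z = 200, 200, 16            # e.g. 0.5 m voxels over a 100 m x 100 m x 8 m volume
FREE, ROAD, CAR, BARRIER = 0, 1, 2, 3

occ = np.zeros((X, Y, Z), dtype=np.uint8)
occ[:, :, 0] = ROAD               # ground plane at the lowest height slice
occ[95:105, 95:105, 1:4] = CAR    # a car occupying voxels above the road
occ[50, :, 1:8] = BARRIER         # a tall barrier along one row

# Reducing to a 2D BEV map keeps one label per (x, y) cell,
# losing the vertical extent of the car and the barrier.
bev = occ.max(axis=2)             # naive reduction along the height axis
print(occ.shape, "->", bev.shape) # (200, 200, 16) -> (200, 200)
```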
- Introduction
- Summary of Contents
- Methods: A Survey
- 3D Occupancy Datasets
- Occupancy-based Applications
- Cite The Survey
- Contact
Dataset | Year | Venue | Modality | # of Classes | Flow | Link |
---|---|---|---|---|---|---|
OpenScene | 2024 | CVPR 2024 Challenge | Camera | - | ✔️ | Intro. |
Cam4DOcc | 2024 | CVPR | Camera+LiDAR | 2 | ✔️ | Intro. |
Occ3D | 2024 | NeurIPS | Camera | 14 (Occ3D-Waymo), 16 (Occ3D-nuScenes) | ❌ | Intro. |
OpenOcc | 2023 | ICCV | Camera | 16 | ❌ | Intro. |
OpenOccupancy | 2023 | ICCV | Camera+LiDAR | 16 | ❌ | Intro. |
SurroundOcc | 2023 | ICCV | Camera | 16 | ❌ | Intro. |
OCFBench | 2023 | arXiv | LiDAR | - (OCFBench-Lyft), 17 (OCFBench-Argoverse), 25 (OCFBench-ApolloScape), 16 (OCFBench-nuScenes) | ❌ | Intro. |
SSCBench | 2023 | arXiv | Camera | 19 (SSCBench-KITTI-360), 16 (SSCBench-nuScenes), 14 (SSCBench-Waymo) | ❌ | Intro. |
SemanticKITTI | 2019 | ICCV | Camera+LiDAR | 19 (semantic scene completion task) | ❌ | Intro. |
Specific Task | Year | Venue | Paper Title | Link |
---|---|---|---|---|
3D Panoptic Segmentation | 2024 | CVPR | PanoOcc: Unified Occupancy Representation for Camera-based 3D Panoptic Segmentation | Code |
BEV Segmentation | 2024 | arXiv | OccFeat: Self-supervised Occupancy Feature Prediction for Pretraining BEV Segmentation Networks | - |
Specific Task | Year | Venue | Paper Title | Link |
---|---|---|---|---|
3D Object Detection | 2024 | CVPR | Learning Occupancy for Monocular 3D Object Detection | Code |
3D Object Detection | 2024 | AAAI | SOGDet: Semantic-Occupancy Guided Multi-view 3D Object Detection | Code |
Specific Task | Year | Venue | Paper Title | Link |
---|---|---|---|---|
3D Flow Prediction | 2024 | CVPR | Cam4DOcc: Benchmark for Camera-Only 4D Occupancy Forecasting in Autonomous Driving Applications | Code |
Specific Task | Year | Venue | Paper Title | Link |
---|---|---|---|---|
Scene Generation | 2024 | CVPR | SemCity: Semantic Scene Generation with Triplane Diffusion | Code |
Specific Tasks | Year | Venue | Paper Title | Link |
---|---|---|---|---|
Occupancy Prediction, 3D Object Detection, Online Mapping, Multi-object Tracking, Motion Prediction, Motion Planning | 2024 | CVPR | DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving | - |
Occupancy Prediction, 3D Object Detection | 2024 | RA-L | UniScene: Multi-Camera Unified Pre-training via 3D Scene Reconstruction for Autonomous Driving | Code |
Occupancy Prediction, 3D Object Detection, BEV Segmentation, Motion Planning | 2023 | ICCV | Scene as Occupancy | Code |
If you find our survey and repository useful for your research project, please consider citing our paper:
@misc{xu2024survey,
title={A Survey on Occupancy Perception for Autonomous Driving: The Information Fusion Perspective},
author={Huaiyuan Xu and Junliang Chen and Shiyu Meng and Yi Wang and Lap-Pui Chau},
year={2024},
eprint={2405.05173},
archivePrefix={arXiv}
}
If you have any questions, please feel free to get in touch: