[Information Fusion 2025] A Survey on Occupancy Perception for Autonomous Driving: The Information Fusion Perspective

HuaiyuanXu/3D-Occupancy-Perception

Huaiyuan Xu, Junliang Chen, Shiyu Meng, Yi Wang, Lap-Pui Chau*

arXiv PDF

We research 3D Occupancy Perception for Autonomous Driving

This work focuses on 3D dense perception in autonomous driving, encompassing LiDAR-Centric Occupancy Perception, Vision-Centric Occupancy Perception, and Multi-Modal Occupancy Perception. Information fusion techniques for this field are discussed. We believe this will be the most comprehensive survey to date on 3D Occupancy Perception. Please stay tuned!😉😉😉

This is an active repository; watch it to follow the latest advances. If you find it useful, please star this repo.

✨You are welcome to send us your work on topics related to 3D occupancy for autonomous driving (not only perception, but also applications)!

If you discover any missing work or have any suggestions, please feel free to submit a pull request or contact us. We will promptly add the missing papers to this repository.

✨Highlight

[1] A systematic survey of the latest research on 3D occupancy perception in the field of autonomous driving.

[2] The survey provides a taxonomy of 3D occupancy perception and elaborates on core methodological issues, including network pipelines, multi-source information fusion, and effective network training.

[3] The survey presents evaluations for 3D occupancy perception, and offers detailed performance comparisons. Furthermore, current limitations and future research directions are discussed.

🔥 News

  • [2024-09-03] This survey was accepted by Information Fusion (Impact factor: 14.7).
  • [2024-07-21] More representative works and benchmarking comparisons have been incorporated, bringing the total to 192 references.
  • [2024-05-18] More figures have been added to the survey. We reorganized the occupancy-based applications.
  • [2024-05-08] The first version of the survey is available on arXiv. We curate this repository.

Introduction

3D occupancy perception technology aims to observe and understand dense 3D environments for autonomous vehicles. Owing to its comprehensive perception capability, this technology is emerging as a trend in autonomous driving perception systems and is attracting significant attention from both industry and academia. Like traditional bird's-eye view (BEV) perception, 3D occupancy perception takes multi-source input and therefore requires information fusion. Unlike 2D BEV, however, it captures the vertical structures that BEV ignores. In this survey, we review the most recent works on 3D occupancy perception and provide in-depth analyses of methodologies with various input modalities. Specifically, we summarize general network pipelines, highlight information fusion techniques, and discuss effective network training. We evaluate and analyze the occupancy perception performance of state-of-the-art methods on the most popular datasets. Furthermore, challenges and future research directions are discussed. We hope this paper will inspire the community and encourage more research on 3D occupancy perception.
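To make the contrast between a 3D occupancy grid and a 2D BEV map concrete, here is a minimal sketch. It is not taken from the survey or from any listed method: the function names `voxelize` and `to_bev`, the 0.5 m voxel size, and the three-point toy scene are all illustrative assumptions.

```python
import math
from typing import Iterable, Set, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]
Cell = Tuple[int, int]


def voxelize(points: Iterable[Point], voxel_size: float) -> Set[Voxel]:
    """Map 3D points (in metres) to the set of occupied voxel indices."""
    return {
        (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        for x, y, z in points
    }


def to_bev(occupied: Set[Voxel]) -> Set[Cell]:
    """Collapse the height axis: a BEV cell is occupied if any voxel in its column is."""
    return {(i, j) for i, j, _ in occupied}


# Hypothetical toy scene: a low obstacle and an overhanging structure share
# one ground cell, and a third point lies elsewhere.
points = [(2.1, 0.2, 0.3), (2.2, 0.1, 4.6), (5.7, 1.4, 0.4)]

occ = voxelize(points, voxel_size=0.5)
bev = to_bev(occ)

print(sorted(occ))  # three distinct voxels; two differ only in height
print(sorted(bev))  # two cells; the height difference is gone
```

In this toy scene the occupancy grid keeps the low obstacle and the overhang as separate voxels, whereas the BEV projection merges them into a single occupied cell.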

Summary of Contents

Methods: A Survey

LiDAR-Centric Occupancy Perception

Year Venue Paper Title Link
2024 NeurIPS TALoS: Enhancing Semantic Scene Completion via Test-time Adaptation on the Line of Sight Code
2024 CVPR PaSCo: Urban 3D Panoptic Scene Completion with Uncertainty Awareness (Best paper award candidate) Project Page
2024 IROS LiDAR-based 4D Occupancy Completion and Forecasting Project Page
2024 arXiv Towards 3D Semantic Scene Completion for Autonomous Driving: A Meta-Learning Framework Empowered by Deformable Large-Kernel Attention and Mamba Model -
2024 arXiv OccRWKV: Rethinking Efficient 3D Semantic Occupancy Prediction with Linear Complexity Project Page
2024 arXiv DiffSSC: Semantic LiDAR Scan Completion using Denoising Diffusion Probabilistic Models -
2024 arXiv MergeOcc: Bridge the Domain Gap between Different LiDARs for Robust Occupancy Prediction -
2023 T-IV Occupancy-MAE: Self-supervised Pre-training Large-scale LiDAR Point Clouds with Masked Occupancy Autoencoders Code
2023 arXiv PointOcc: Cylindrical Tri-Perspective View for Point-based 3D Semantic Occupancy Prediction Code
2021 T-PAMI Semantic Scene Completion using Local Deep Implicit Functions on LiDAR Data -
2021 AAAI Sparse Single Sweep LiDAR Point Cloud Segmentation via Learning Contextual Shape Priors from Scene Completion Code
2020 CoRL S3CNet: A Sparse Semantic Scene Completion Network for LiDAR Point Clouds -
2020 3DV LMSCNet: Lightweight Multiscale 3D Semantic Completion Code

Vision-Centric Occupancy Perception

Year Venue Paper Title Link
2025 AAAI ViPOcc: Leveraging Visual Priors from Vision Foundation Models for Single-View 3D Occupancy Prediction Project Page
2025 AAAI ProtoOcc: Accurate, Efficient 3D Occupancy Prediction Using Dual Branch Encoder-Prototype Query Decoder Code
2025 AAAI LOMA: Language-assisted Semantic Occupancy Network via Triplane Mamba -
2025 AAAI Semi-supervised 3D Semantic Scene Completion with 2D Vision Foundation Model Guidance -
2024 NeurIPS OctreeOcc: Efficient and Multi-Granularity Occupancy Prediction Using Octree Queries Code
2024 NeurIPS Context and Geometry Aware Voxel Transformer for Semantic Scene Completion (Spotlight paper) Code
2024 NeurIPS OPUS: Occupancy Prediction Using a Sparse Set Code
2024 ECCV ViewFormer: Exploring Spatiotemporal Modeling for Multi-View 3D Occupancy Perception via View-Guided Transformers Code
2024 ECCV CVT-Occ: Cost Volume Temporal Fusion for 3D Occupancy Prediction Code
2024 ECCV VEON: Vocabulary-Enhanced Occupancy Prediction Code
2024 ECCV Fully Sparse 3D Occupancy Prediction Code
2024 ECCV GaussianFormer: Scene as Gaussians for Vision-Based 3D Semantic Occupancy Prediction Project Page
2024 ECCV Occupancy as Set of Points Code
2024 ECCV Hierarchical Temporal Context Learning for Camera-based Semantic Scene Completion Code
2024 CVPR LowRankOcc: Tensor Decomposition and Low-Rank Recovery for Vision-based 3D Semantic Occupancy Prediction -
2024 CVPR Bi-SSC: Geometric-Semantic Bidirectional Fusion for Camera-based 3D Semantic Scene Completion -
2024 CVPR Symphonize 3D Semantic Scene Completion with Contextual Instance Queries Code
2024 CVPR SparseOcc: Rethinking Sparse Latent Representation for Vision-Based Semantic Occupancy Prediction Project Page
2024 CVPR SelfOcc: Self-Supervised Vision-Based 3D Occupancy Prediction Project Page
2024 CVPR PanoOcc: Unified Occupancy Representation for Camera-based 3D Panoptic Segmentation Code
2024 CVPR Not All Voxels Are Equal: Hardness-Aware Semantic Scene Completion with Self-Distillation Code
2024 CVPR COTR: Compact Occupancy TRansformer for Vision-based 3D Occupancy Prediction Code
2024 CVPR Collaborative Semantic Occupancy Prediction with Hybrid Feature Fusion in Connected Automated Vehicles Project Page
2024 CVPR Cam4DOcc: Benchmark for Camera-Only 4D Occupancy Forecasting in Autonomous Driving Applications Code
2024 CVPR Boosting Self-Supervision for Single-View Scene Completion via Knowledge Distillation Project Page
2024 CVPR DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving -
2024 T-IP Camera-based 3D Semantic Scene Completion with Sparse Guidance Network Code
2024 CoRL Let Occ Flow: Self-Supervised 3D Occupancy Flow Prediction Project Page
2024 IJCAI Label-efficient Semantic Scene Completion with Scribble Annotations Code
2024 IJCAI Bridging Stereo Geometry and BEV Representation with Reliable Mutual Interaction for Semantic Scene Completion Code
2024 ICRA The RoboDrive Challenge: Drive Anytime Anywhere in Any Condition Project Page
2024 ICRA RenderOcc: Vision-Centric 3D Occupancy Prediction with 2D Rendering Supervision Code
2024 ICRA MonoOcc: Digging into Monocular Semantic Occupancy Prediction Code
2024 ICRA FastOcc: Accelerating 3D Occupancy Prediction by Fusing the 2D Bird’s-Eye View and Perspective View -
2024 AAAI Regulating Intermediate 3D Features for Vision-Centric Autonomous Driving Code
2024 AAAI One at a Time: Progressive Multi-step Volumetric Probability Learning for Reliable 3D Scene Perception -
2024 RA-L HybridOcc: NeRF Enhanced Transformer-based Multi-Camera 3D Occupancy Prediction -
2024 RA-L UniScene: Multi-Camera Unified Pre-Training via 3D Scene Reconstruction Code
2024 AAIML SOccDPT: Semi-Supervised 3D Semantic Occupancy from Dense Prediction Transformers trained under memory constraints Project Page
2024 3DV PanoSSC: Exploring Monocular Panoptic 3D Scene Reconstruction for Autonomous Driving -
2024 IROS SSCBench: Monocular 3D Semantic Scene Completion Benchmark in Street Views Code
2024 arXiv GSRender: Deduplicated Occupancy Prediction via Weakly Supervised 3D Gaussian Splatting -
2024 arXiv GaussTR: Foundation Model-Aligned Gaussian Transformer for Self-Supervised 3D Spatial Understanding Code
2024 arXiv GaussianWorld: Gaussian World Model for Streaming 3D Occupancy Prediction Code
2024 arXiv GaussianAD: Gaussian-Centric End-to-End Autonomous Driving Project Page
2024 arXiv Hierarchical Context Alignment with Disentangled Geometric and Temporal Modeling for Semantic Occupancy Prediction -
2024 arXiv Fast Occupancy Network -
2024 arXiv Lightweight Spatial Embedding for Vision-based 3D Occupancy Prediction -
2024 arXiv GaussianFormer-2: Probabilistic Gaussian Superposition for Efficient 3D Occupancy Prediction Code
2024 arXiv Language Driven Occupancy Prediction Code
2024 arXiv GaussianPretrain: A Simple Unified 3D Gaussian Representation for Visual Pre-training in Autonomous Driving Code
2024 arXiv ET-Former: Efficient Triplane Deformable Attention for 3D Semantic Scene Completion From Monocular Camera -
2024 arXiv ReliOcc: Towards Reliable Semantic Occupancy Prediction via Uncertainty Learning -
2024 arXiv Deep Height Decoupling for Precise Vision-based 3D Occupancy Prediction Code
2024 arXiv AdaOcc: Adaptive-Resolution Occupancy Prediction -
2024 arXiv GaussianOcc: Fully Self-supervised and Efficient 3D Occupancy Estimation with Gaussian Splatting Project Page
2024 arXiv MambaOcc: Visual State Space Model for BEV-based Occupancy Prediction with Local Adaptive Reordering Code
2024 arXiv VPOcc: Exploiting Vanishing Point for Monocular 3D Semantic Occupancy Prediction -
2024 arXiv UniVision: A Unified Framework for Vision-Centric 3D Perception Code
2024 arXiv LangOcc: Self-Supervised Open Vocabulary Occupancy Estimation via Volume Rendering -
2024 arXiv Real-Time 3D Occupancy Prediction via Geometric-Semantic Disentanglement -
2024 arXiv α-SSC: Uncertainty-Aware Camera-based 3D Semantic Scene Completion -
2024 arXiv Panoptic-FlashOcc: An Efficient Baseline to Marry Semantic Occupancy with Panoptic via Instance Center Code
2024 arXiv BDC-Occ: Binarized Deep Convolution Unit For Binarized Occupancy Network Code
2024 arXiv GEOcc: Geometrically Enhanced 3D Occupancy Network with Implicit-Explicit Depth Fusion and Contextual Self-Supervision -
2024 arXiv OccFlowNet: Towards Self-supervised Occupancy Estimation via Differentiable Rendering and Occupancy Flow -
2024 arXiv OccFiner: Offboard Occupancy Refinement with Hybrid Propagation -
2024 arXiv InverseMatrixVT3D: An Efficient Projection Matrix-Based Approach for 3D Occupancy Prediction Code
2024 arXiv Unified Spatio-Temporal Tri-Perspective View Representation for 3D Semantic Occupancy Prediction Project Page
2023 CVPR VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion Code
2023 CVPR Tri-Perspective View for Vision-Based 3D Semantic Occupancy Prediction Project Page
2023 NeurIPS POP-3D: Open-Vocabulary 3D Occupancy Prediction from Images Project Page
2023 NeurIPS Occ3D: A Large-Scale 3D Occupancy Prediction Benchmark for Autonomous Driving Project Page
2023 ICCV SurroundOcc: Multi-Camera 3D Occupancy Prediction for Autonomous Driving Project Page
2023 ICCV Scene as Occupancy Code
2023 ICCV OccFormer: Dual-path Transformer for Vision-based 3D Semantic Occupancy Prediction Code
2023 ICCV NDC-Scene: Boost Monocular 3D Semantic Scene Completion in Normalized Device Coordinates Space Code
2023 T-IV 3DOPFormer: 3D Occupancy Perception from Multi-Camera Images with Directional and Distance Enhancement Code
2023 arXiv OccupancyDETR: Using DETR for Mixed Dense-sparse 3D Occupancy Prediction -
2023 arXiv OVO: Open-Vocabulary Occupancy Code
2023 arXiv OccNeRF: Advancing 3D Occupancy Prediction in LiDAR-Free Environments Project Page
2023 arXiv OccDepth: A Depth-Aware Method for 3D Semantic Scene Completion Code
2023 arXiv FlashOcc: Fast and Memory-Efficient Occupancy Prediction via Channel-to-Height Plugin Code
2023 arXiv FB-OCC: 3D Occupancy Prediction based on Forward-Backward View Transformation Code
2023 arXiv DepthSSC: Depth-Spatial Alignment and Dynamic Voxel Resolution for Monocular 3D Semantic Scene Completion -
2023 arXiv A Simple Framework for 3D Occupancy Estimation in Autonomous Driving Code
2023 arXiv UniWorld: Autonomous Driving Pre-training via World Models Code
2022 CVPR MonoScene: Monocular 3D Semantic Scene Completion Project Page

Radar-Centric Occupancy Perception

Year Venue Paper Title Link
2024 NeurIPS RadarOcc: Robust 3D Occupancy Prediction with 4D Imaging Radar -

Multi-Modal Occupancy Perception

Year Venue Paper Title Link
2024 ECCV OccGen: Generative Multi-modal 3D Occupancy Prediction for Autonomous Driving Project Page
2024 RA-L Co-Occ: Coupling Explicit Feature Fusion with Volume Rendering Regularization for Multi-Modal 3D Semantic Occupancy Prediction Project Page
2024 arXiv PVP: Polar Representation Boost for 3D Semantic Occupancy Prediction -
2024 arXiv Robust 3D Semantic Occupancy Prediction with Calibration-free Spatial Transformation Code
2024 arXiv OccLoff: Learning Optimized Feature Fusion for 3D Occupancy Prediction -
2024 arXiv DAOcc: 3D Object Detection Assisted Multi-Sensor Fusion for 3D Occupancy Prediction Code
2024 arXiv OccMamba: Semantic Occupancy Prediction with State Space Models -
2024 arXiv LiCROcc: Teach Radar for Accurate Semantic Occupancy Prediction using LiDAR and Camera Project Page
2024 arXiv OccFusion: Depth Estimation Free Multi-sensor Fusion for 3D Occupancy Prediction -
2024 arXiv EFFOcc: A Minimal Baseline for EFficient Fusion-based 3D Occupancy Network Code
2024 arXiv Real-time 3D semantic occupancy prediction for autonomous vehicles using memory-efficient sparse convolution -
2024 arXiv OccFusion: A Straightforward and Effective Multi-Sensor Fusion Framework for 3D Occupancy Prediction -
2024 arXiv Unleashing HyDRa: Hybrid Fusion, Depth Consistency and Radar for Unified 3D Perception -
2023 ICCV OpenOccupancy: A Large Scale Benchmark for Surrounding Semantic Occupancy Perception Code

3D Occupancy Datasets

Dataset Year Venue Modality # of Classes Flow Link
OpenScene 2024 CVPR 2024 Challenge Camera - ✔️ Intro.
Cam4DOcc 2024 CVPR Camera+LiDAR 2 ✔️ Intro.
Occ3D 2024 NeurIPS Camera 14 (Occ3D-Waymo), 16 (Occ3D-nuScenes) Intro.
OpenOcc 2023 ICCV Camera 16 Intro.
OpenOccupancy 2023 ICCV Camera+LiDAR 16 Intro.
SurroundOcc 2023 ICCV Camera 16 Intro.
OCFBench 2023 arXiv LiDAR - (OCFBench-Lyft), 17 (OCFBench-Argoverse), 25 (OCFBench-ApolloScape), 16 (OCFBench-nuScenes) Intro.
SSCBench 2023 arXiv Camera 19 (SSCBench-KITTI-360), 16 (SSCBench-nuScenes), 14 (SSCBench-Waymo) Intro.
SemanticKITTI 2019 ICCV Camera+LiDAR 19 (Semantic Scene Completion task) Intro.

Occupancy-based Applications

Segmentation

Specific Task Year Venue Paper Title Link
3D Panoptic Segmentation 2024 CVPR PanoOcc: Unified Occupancy Representation for Camera-based 3D Panoptic Segmentation Code
BEV Segmentation 2024 CVPRW OccFeat: Self-supervised Occupancy Feature Prediction for Pretraining BEV Segmentation Networks Code

Detection

Specific Task Year Venue Paper Title Link
3D Object Detection 2024 NeurIPS Towards Flexible 3D Perception: Object-Centric Occupancy Completion Augments 3D Object Detection Code
3D Object Detection 2024 CVPR Learning Occupancy for Monocular 3D Object Detection Code
3D Object Detection 2024 AAAI SOGDet: Semantic-Occupancy Guided Multi-view 3D Object Detection Code
3D Object Detection 2024 arXiv UltimateDO: An Efficient Framework to Marry Occupancy Prediction with 3D Object Detection via Channel2height -

Dynamic Perception

Specific Task Year Venue Paper Title Link
3D Flow Prediction 2024 CVPR Cam4DOcc: Benchmark for Camera-Only 4D Occupancy Forecasting in Autonomous Driving Applications Code
3D Flow Prediction 2024 arXiv Let Occ Flow: Self-Supervised 3D Occupancy Flow Prediction Project Page

Generation

Specific Task Year Venue Paper Title Link
Scene Generation 2024 ECCV Pyramid Diffusion for Fine 3D Large Scene Generation (Oral paper) Code
Scene Generation 2024 CVPR SemCity: Semantic Scene Generation with Triplane Diffusion Code
Scene Generation 2024 arXiv OccScene: Semantic Occupancy-based Cross-task Mutual Learning for 3D Scene Generation -
Scene Generation 2024 arXiv UniScene: Unified Occupancy-centric Driving Scene Generation Project Page
Scene Generation 2024 arXiv InfiniCube: Unbounded and Controllable Dynamic 3D Driving Scene Generation with World-Guided Video Models Project Page
Scene Generation 2024 arXiv SyntheOcc: Synthesize Geometric-Controlled Street View Images through 3D Semantic MPIs Project Page

Navigation

Specific Task Year Venue Paper Title Link
Navigation for Air-Ground Robots 2024 RA-L HE-Nav: A High-Performance and Efficient Navigation System for Aerial-Ground Robots in Cluttered Environments Project Page
Navigation for Air-Ground Robots 2024 ICRA AGRNav: Efficient and Energy-Saving Autonomous Navigation for Air-Ground Robots in Occlusion-Prone Environments Code
Navigation for Air-Ground Robots 2024 arXiv OMEGA: Efficient Occlusion-Aware Navigation for Air-Ground Robot in Dynamic Environments via State Space Model Project Page

World Models

Specific Task Year Venue Paper Title Link
4D Occupancy Forecasting and Motion Planning 2024 ECCV OccWorld: Learning a 3D Occupancy World Model for Autonomous Driving Project Page
4D Occupancy Forecasting 2024 CVPR UnO: Unsupervised Occupancy Fields for Perception and Forecasting (Oral paper) Project Page
4D Representation Learning Framework 2024 CVPR DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving -
4D Occupancy Forecasting 2024 CVPR Cam4DOcc: Benchmark for Camera-Only 4D Occupancy Forecasting in Autonomous Driving Applications Code
4D Occupancy Forecasting 2024 AAAI Semantic Complete Scene Forecasting from a 4D Dynamic Point Cloud Sequence Project Page
4D Occupancy Forecasting and Motion Planning 2024 arXiv An Efficient Occupancy World Model via Decoupled Dynamic Flow and Image-assisted Training -
4D Occupancy Forecasting 2024 arXiv Spatiotemporal Decoupling for Efficient Vision-Based Occupancy Forecasting -
4D Occupancy Generation 2024 arXiv DynamicCity: Large-Scale LiDAR Generation from Dynamic Scenes Project Page
4D Occupancy Forecasting and Generation 2024 arXiv DOME: Taming Diffusion Model into High-Fidelity Controllable Occupancy World Model Project Page
4D Occupancy Forecasting 2024 arXiv FSF-Net: Enhance 4D Occupancy Forecasting with Coarse BEV Scene Flow for Autonomous Driving -
4D Occupancy Forecasting and Motion Planning 2024 arXiv RenderWorld: World Model with Self-Supervised 3D Label -
4D Occupancy Forecasting, Motion Planning, and Reasoning 2024 arXiv OccLLaMA: An Occupancy-Language-Action Generative World Model for Autonomous Driving -
4D Occupancy Forecasting and Generation 2024 arXiv Driving in the Occupancy World: Vision-Centric 4D Occupancy Forecasting and Planning via World Models for Autonomous Driving -
4D Occupancy Generation 2024 arXiv OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving Project Page
4D Occupancy Forecasting 2023 CVPR Point Cloud Forecasting as a Proxy for 4D Occupancy Forecasting Project Page

Unified Autonomous Driving Algorithm Framework

Specific Tasks Year Venue Paper Title Link
Occupancy Prediction, 3D Object Detection, Online Mapping, Multi-object Tracking, Motion Prediction, Motion Planning 2024 CVPR DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving -
Occupancy Prediction, 3D Object Detection 2024 RA-L UniScene: Multi-Camera Unified Pre-training via 3D Scene Reconstruction for Autonomous Driving Code
Occupancy Prediction, 3D Object Detection, HD map reconstruction 2024 arXiv GaussianPretrain: A Simple Unified 3D Gaussian Representation for Visual Pre-training in Autonomous Driving Code
Occupancy Forecasting, Motion Planning 2024 arXiv Driving in the Occupancy World: Vision-Centric 4D Occupancy Forecasting and Planning via World Models for Autonomous Driving -
Occupancy Prediction, 3D Object Detection, BEV segmentation, Motion Planning 2023 ICCV Scene as Occupancy Code

Cite The Survey

If you find our survey and repository useful for your research project, please consider citing our paper:

@misc{xu2024survey,
      title={A Survey on Occupancy Perception for Autonomous Driving: The Information Fusion Perspective}, 
      author={Huaiyuan Xu and Junliang Chen and Shiyu Meng and Yi Wang and Lap-Pui Chau},
      year={2024},
      eprint={2405.05173},
      archivePrefix={arXiv}
}

Contact

If you have any questions, please feel free to get in touch:

If you are interested in joining us as a Ph.D. student to research computer vision, machine learning, please feel free to contact Professor Chau:
