[feature] v2x-seq-spd

wenxian-yang committed Oct 30, 2023
1 parent cf92b8d commit 35a74f3

Showing 15 changed files with 667 additions and 68 deletions.
88 changes: 44 additions & 44 deletions README.md
@@ -26,12 +26,14 @@

## Highlights <a name="high"></a>
- DAIR-V2X: The first real-world dataset for research on vehicle-to-everything autonomous driving. It comprises a total of 71,254 frames of image data and 71,254 frames of point cloud data.
- V2X-Seq: The first large-scale, real-world, sequential V2X dataset, which includes data frames, trajectories, vector maps, and traffic lights captured from real-world traffic scenes. V2X-Seq comprises two parts: V2X-Seq-SPD (Sequential Perception Dataset), which includes more than 15,000 frames captured from 95 scenarios, and V2X-Seq-TFD (Trajectory Forecasting Dataset), which contains about 80,000 infrastructure-view scenarios, 80,000 vehicle-view scenarios, and 50,000 cooperative-view scenarios captured from 28 intersection areas, covering 672 hours of data.
- OpenDAIR-V2X: An open-sourced framework for supporting the research on vehicle-to-everything autonomous driving.

## News <a name="news"></a>

* [2023.09] 🔥 We have released the code for [V2X-Seq-SPD](https://github.com/AIR-THU/DAIR-V2X) and [V2X-Seq-TFD](https://github.com/AIR-THU/DAIR-V2X-Seq).
* [2023.05] 🔥 The V2X-Seq dataset is available [here](https://thudair.baai.ac.cn/index). It can be downloaded without restriction within mainland China. An example dataset can be downloaded directly.
* [2023.03] 🔥 Our new dataset "V2X-Seq: A Large-Scale Sequential Dataset for Vehicle-Infrastructure Cooperative Perception and Forecasting" has been accepted by CVPR2023. Congratulations! We will release the dataset sooner. Please follow [DAIR-V2X-Seq](https://github.com/AIR-THU/DAIR-V2X-Seq) for the latest news.
* [2023.03] 🔥 We have released training code for our [FFNET](https://github.com/haibao-yu/FFNet-VIC3D), and our OpenDAIRV2X now supports evaluating [FFNET](https://github.com/haibao-yu/FFNet-VIC3D).
* [2022.11] We have held the first [VIC3D Object Detection challenge](https://aistudio.baidu.com/aistudio/competition/detail/522/0/introduction).
* [2022.07] We have released the OpenDAIRV2X codebase v1.0.0.
The current version facilitates using the DAIR-V2X dataset and reproducing the benchmarks.
@@ -44,51 +44,21 @@
- [DAIR-V2X-I](https://thudair.baai.ac.cn/roadtest)
- [DAIR-V2X-V](https://thudair.baai.ac.cn/cartest)
- [DAIR-V2X-C](https://thudair.baai.ac.cn/coop-forecast)
- [V2X-Seq-SPD](https://thudair.baai.ac.cn/coop-forecast)
- [V2X-Seq-TFD](https://thudair.baai.ac.cn/cooplocus)
- V2X-Seq-SPD-Example: [google_drive_link](https://drive.google.com/file/d/1gjOmGEBMcipvDzu2zOrO9ex_OscUZMYY/view?usp=drive_link)
- V2X-Seq-TFD-Example: [google_drive_link](https://drive.google.com/file/d/1gjOmGEBMcipvDzu2zOrO9ex_OscUZMYY/view?usp=drive_link)

## Getting Started <a name="start"></a>
Please refer to [getting_started.md](docs/get_started.md) for installation, evaluation, benchmarks, and training for VIC3D.

## Major Features <a name="features"></a>

- **Support Train/Evaluation for VIC3D**

It will directly support model training and evaluation for VIC3D.
Currently, model inference and training are mainly based on MMDetection3D, which is not fully convenient for VICAD research.

- [x] Evaluation (Model inference is based on MMDetection3D)
- [x] Training based on MMDetection3D
- [ ] Direct Evaluation with DAIR-V2X Framework
- [ ] Direct Training with DAIR-V2X Framework

Please refer to [getting_started.md](docs/get_started.md) for the usage and benchmarks reproduction of DAIR-V2X dataset.

- **Support different fusion methods for VIC3D**

It will directly support different fusion methods, including early fusion, feature fusion, and late fusion.
Currently it supports early fusion and late fusion; a conceptual late-fusion sketch follows the checklist below.
- [x] Early Fusion
- [ ] Feature Fusion
- [x] Late Fusion
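
For intuition, a minimal conceptual sketch of the late-fusion idea is shown below (illustrative only; `detect`, `transform_boxes`, and `nms` are placeholder functions, not the OpenDAIRV2X API): each side detects independently, the infrastructure boxes are transformed into the vehicle coordinate system, and the merged detections are deduplicated.

```python
# Conceptual late-fusion sketch with placeholder functions (not the framework API).
def late_fusion(inf_points, veh_points, calib_inf2veh, detect, transform_boxes, nms):
    inf_boxes = detect(inf_points)          # detections from the infrastructure point cloud
    veh_boxes = detect(veh_points)          # detections from the vehicle point cloud
    inf_boxes_veh = transform_boxes(inf_boxes, calib_inf2veh)  # infrastructure -> vehicle frame
    return nms(inf_boxes_veh + veh_boxes)   # merge and suppress duplicate boxes
```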

- **Support multi-modality/single-modality detectors for VIC3D**

It will directly support detectors of different modalities, including image-modality detectors, pointcloud-modality detectors, and image-pointcloud fusion detectors.
Currently it supports the image-modality detector ImvoxelNet and the pointcloud-modality detector PointPillars.
- [x] Image-modality
- [x] Pointcloud-modality
- [ ] Multi-modality


- **Support Cooperation-view/single-view detectors for VIC3D**

It directly supports detectors for the different views in VIC3D, including infrastructure-view detectors,
vehicle-view detectors, and vehicle-infrastructure cooperation-view detectors.
- [x] Infrastructure-view
- [x] Vehicle-view
- [x] Cooperation-view
Please refer to [get_started_spd.md](docs/get_started_spd.md) for the usage and benchmarks reproduction of V2X-Seq-SPD dataset.

## Benchmark <a name="benchmark"></a>

You can find more benchmarks in [SV3D-Veh](configs/sv3d-veh), [SV3D-Inf](configs/sv3d-inf), [VIC3D](configs/vic3d) and [VIC3D-SPD](configs/vic3d-spd/).

Part of the VIC3D detection benchmarks based on DAIR-V2X-C dataset:

| Modality | Fusion | Model | Dataset | AP-3D (IoU=0.5) Overall | 0-30m | 30-50m | 50-100m | AP-BEV (IoU=0.5) Overall | 0-30m | 30-50m | 50-100m | AB(Byte) |
| :-------: | :-----: | :--------: | :-------: | :----: | :----: | :----: | :-----: | :-----: | :---: | :----: | :-----: | :----: |
@@ -101,11 +73,37 @@ You can find more benchmark in [SV3D-Veh](configs/sv3d-veh), [SV3D-Inf](configs/
| | Late-Fusion | PointPillars |VIC-Async-2| 52.43 | 51.13 | 67.09 | 49.86 | 58.10 | 57.23 | 70.86 | 55.78 | 478.01 |
| | TCLF | PointPillars |VIC-Async-2| 53.37 | 52.41 | 67.33 | 50.87 | 59.17 | 58.25 | 71.20 | 57.43 | 897.91 |

Part of the VIC3D detection and tracking benchmarks based on V2X-Seq-SPD:

| Modality | Fusion | Model | Dataset | AP 3D (IoU=0.5) | AP BEV (IoU=0.5) | MOTA | MOTP | AMOTA | AMOTP | IDs | AB(Byte) |
|----------|-------------|-------------|--------------|-----------------|------------------|--------|--------|--------|--------|-----|----------|
| Image | Veh Only | ImvoxelNet | VIC-Sync-SPD | 8.55 | 10.32 | 10.19 | 57.83 | 1.36 | 14.75 | 4 | |
| Image | Late Fusion | ImvoxelNet | VIC-Sync-SPD | 17.31 | 22.53 | 21.81 | 56.67 | 6.22 | 25.24 | 47 | 3300 |

---



## TODO List <a name="TODO List"></a>
- [x] Dataset Release
- [x] Dataset API
- [x] Evaluation Code
- [x] All detection benchmarks based on DAIR-V2X dataset
- [x] Benchmarks for detection and tracking tasks with different fusion strategies for the image modality, based on the V2X-Seq-SPD dataset
- [ ] All benchmarks for detection and tracking tasks based on V2X-Seq-SPD dataset


## Citation <a name="citation"></a>
If this project helps your research, please consider citing our papers with the following BibTeX:
```bibtex
@inproceedings{v2x-seq,
title={V2X-Seq: A large-scale sequential dataset for vehicle-infrastructure cooperative perception and forecasting},
author={Yu, Haibao and Yang, Wenxian and Ruan, Hongzhi and Yang, Zhenwei and Tang, Yingjuan and Gao, Xu and Hao, Xin and Shi, Yifeng and Pan, Yifeng and Sun, Ning and Song, Juan and Yuan, Jirui and Luo, Ping and Nie, Zaiqing},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year={2023},
}
```
```bibtex
@inproceedings{dair-v2x,
title={Dair-v2x: A large-scale dataset for vehicle-infrastructure cooperative 3d object detection},
author={Yu, Haibao and Luo, Yizhen and Shu, Mao and Huo, Yiyi and Yang, Zebang and Shi, Yifeng and Guo, Zhenglong and Li, Hanyu and Hu, Xing and Yuan, Jirui and Nie, Zaiqing},
@@ -126,4 +124,6 @@ If you have any questions or suggestions, please email [email protected].
- [DAIR-V2X-Seq](https://github.com/AIR-THU/DAIR-V2X-Seq) (:rocket:Ours!)
- [FFNET](https://github.com/haibao-yu/FFNet-VIC3D) (:rocket:Ours!)
- [mmdet3d](https://github.com/open-mmlab/mmdetection3d)
- [pypcd](https://github.com/dimatura/pypcd)
- [AB3DMOT](https://github.com/xinshuoweng/AB3DMOT)

6 changes: 3 additions & 3 deletions docs/apis/dataloaders.md
@@ -24,7 +24,7 @@ for VICFrame, label, filt in dataset:
You can access the infrastructure frame class or the vehicle frame class as follows; a short usage sketch is given after the snippet:
```
VICFrame.inf_frame # The infrastructure frame, member of InfFrame
VICFrame.veh_frame # The vehicle frame, member of InfFrame
VICFrame.veh_frame # The vehicle frame, member of VehFrame
```
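
For example, a minimal iteration sketch (assuming `VehFrame` exposes the same `point_cloud`/`image` accessors as `InfFrame`, described later on this page):
```
# Usage sketch: iterate cooperative frames and load the raw sensor data.
for VICFrame, label, filt in dataset:
    inf_frame = VICFrame.inf_frame                                # InfFrame
    veh_frame = VICFrame.veh_frame                                # VehFrame
    inf_pointcloud = inf_frame.point_cloud(data_format="array")  # infrastructure point cloud
    veh_image = veh_frame.image(data_format="array")             # vehicle camera image
```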
We provide a `Transform` class that carries out the coordinate transformations you need:
```
@@ -74,9 +74,9 @@ You can also access the frame values by their keys which are listed below:
| `calib_camera_intrisinc_path` | path of the camera intrinsics file |


#### VICFrame.veh_frame
#### VehFrame

`veh_frame` refers to the vehicle frame class. We provide APIs which loads the point cloud (`inf_frame.point_cloud(data_format="array"/"file"/"tensor")`) or image (`inf_frame.image(data_format="array"/"file"/"tensor")`) of this frame.
`VehFrame` refers to the vehicle frame class. We provide APIs that load the point cloud (`veh_frame.point_cloud(data_format="array"/"file"/"tensor")`) or image (`veh_frame.image(data_format="array"/"file"/"tensor")`) of this frame.
You can also access the frame values by their keys listed below:

| Key | Value |
8 changes: 7 additions & 1 deletion docs/visualization.md
@@ -1,5 +1,11 @@
## Visualization tutorial

### visualization for spd

In SPD, we utilize [SUSTechPOINTS](https://github.com/naurril/SUSTechPOINTS) as the visualization tool for point cloud 3D bounding boxes and tracking IDs. SUSTechPOINTS is a portable 3D point cloud interactive annotation platform. We offer a tool for converting labels from the DAIR-V2X format to the SUSTechPOINTS format; for more details, please refer to [gen_SUS_label](../tools/visualize/gen_SUS_label.py).
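
For a rough idea of what the conversion does, here is a hedged sketch (not the actual `gen_SUS_label.py` implementation; the key names on both sides are assumptions and should be checked against the script and the SUSTechPOINTS documentation): each DAIR-V2X 3D label is mapped to a SUSTechPOINTS-style object with a type, a track id, and a position/scale/rotation (`psr`) block.

```python
import json

# Illustrative sketch only: convert one DAIR-V2X label file into a
# SUSTechPOINTS-style label file. Key names are assumptions; see
# tools/visualize/gen_SUS_label.py for the actual conversion.
def dair_label_to_sus(dair_label_path, sus_label_path):
    with open(dair_label_path, "r") as f:
        objects = json.load(f)

    sus_objects = []
    for obj in objects:
        dims = obj["3d_dimensions"]
        sus_objects.append({
            "obj_type": obj["type"],
            "obj_id": obj.get("track_id", ""),
            "psr": {
                "position": obj["3d_location"],  # {"x", "y", "z"}
                "scale": {"x": dims["l"], "y": dims["w"], "z": dims["h"]},
                "rotation": {"x": 0, "y": 0, "z": obj["rotation"]},
            },
        })

    with open(sus_label_path, "w") as f:
        json.dump(sus_objects, f, indent=2)
```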


### visualization for vic3d
We provide tools to visualize the 3D labels in images and point clouds, and to visualize the prediction results.

#### visualize 3d label in image
@@ -58,4 +64,4 @@ python tools/visualize/vis_label_in_3d.py --task fusion --path v2x/cache/vic-lat

- **--task** refers to the type of task you choose to visualize; the optional values are '**fusion**', '**single**', and '**pcd_label**'.
- **--path** refers to the pickle file generated during inference.
- **--id** refers to the 'filename' you want to visualize.
2 changes: 1 addition & 1 deletion tools/dataset_converter/calib_i2v.py
@@ -2,7 +2,7 @@
import numpy as np
import argparse
from pypcd import pypcd
from gen_kitti.utils import read_json, write_json, pcd2bin
from tools.dataset_converter.utils import read_json, write_json, pcd2bin
import json


10 changes: 5 additions & 5 deletions tools/dataset_converter/dair2kitti.py
@@ -1,10 +1,10 @@
import argparse
import os
from gen_kitti.label_lidarcoord_to_cameracoord import gen_lidar2cam
from gen_kitti.label_json2kitti import json2kitti, rewrite_label, label_filter
from gen_kitti.gen_calib2kitti import gen_calib2kitti
from gen_kitti.gen_ImageSets_from_split_data import gen_ImageSet_from_split_data
from gen_kitti.utils import pcd2bin
from tools.dataset_converter.gen_kitti.label_lidarcoord_to_cameracoord import gen_lidar2cam
from tools.dataset_converter.gen_kitti.label_json2kitti import json2kitti, rewrite_label, label_filter
from tools.dataset_converter.gen_kitti.gen_calib2kitti import gen_calib2kitti
from tools.dataset_converter.gen_kitti.gen_ImageSets_from_split_data import gen_ImageSet_from_split_data
from tools.dataset_converter.utils import pcd2bin

parser = argparse.ArgumentParser("Generate the Kitti Format Data")
parser.add_argument("--source-root", type=str, default="data/single-vehicle-side", help="Raw data root about DAIR-V2X.")
2 changes: 1 addition & 1 deletion tools/dataset_converter/gen_kitti/gen_ImageSets_from_split_data.py
@@ -1,5 +1,5 @@
import os
from .utils import read_json, mkdir_p, write_txt
from tools.dataset_converter.utils import read_json, mkdir_p, write_txt


def gen_ImageSet_from_split_data(ImageSets_path, split_data_path, sensor_view="vehicle"):
2 changes: 1 addition & 1 deletion tools/dataset_converter/gen_kitti/gen_calib2kitti.py
@@ -1,6 +1,6 @@
import os
import numpy as np
from .utils import mkdir_p, read_json, get_files_path
from tools.dataset_converter.utils import mkdir_p, read_json, get_files_path


def convert_calib_v2x_to_kitti(cam_D, cam_K, t_velo2cam, r_velo2cam):
2 changes: 1 addition & 1 deletion tools/dataset_converter/gen_kitti/label_json2kitti.py
@@ -1,5 +1,5 @@
import os
from .utils import mkdir_p, read_json, get_files_path
from tools.dataset_converter.utils import mkdir_p, read_json, get_files_path


def write_kitti_in_txt(my_json, path_txt):
2 changes: 1 addition & 1 deletion tools/dataset_converter/label_world2v.py
@@ -3,7 +3,7 @@
from tqdm import tqdm
import math
import argparse
from gen_kitti.utils import read_json, write_json, mkdir_p
from tools.dataset_converter.utils import read_json, write_json, mkdir_p

"""
virtuallidar2world
4 changes: 4 additions & 0 deletions v2x/dataset/__init__.py
@@ -6,4 +6,8 @@
"dair-v2x-i": DAIRV2XI,
"vic-sync": VICSyncDataset,
"vic-async": VICAsyncDataset,
"dair-v2x-v-spd": DAIRV2XVSPD,
"dair-v2x-i-spd": DAIRV2XISPD,
"vic-sync-spd": VICSyncDatasetSPD,
"vic-async-spd": VICAsyncDatasetSPD,
}
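
# Illustrative usage sketch (the registry name below is an assumption; the actual
# variable is the dict defined above in this module): look up a dataset class by its
# key and construct it with the base-class signature from v2x/dataset/base_dataset.py.
#
#     cls = SUPPORTED_DATASETS["vic-sync-spd"]
#     dataset = cls(path, args, split="val", extended_range=None)
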
7 changes: 7 additions & 0 deletions v2x/dataset/base_dataset.py
@@ -63,6 +63,13 @@ def build_path_to_info(prefix, data, sensortype="lidar"):
path2info[path] = elem
return path2info

def build_frame_to_info(data):
    # Index data-info records by frame_id so frames can be looked up directly;
    # records with an empty frame_id are skipped.
    frame2info = {}
    for elem in data:
        if elem["frame_id"] == "":
            continue
        frame2info[elem["frame_id"]] = elem
    return frame2info
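
# Illustration of what build_frame_to_info produces: a frame_id -> record mapping,
# with records whose frame_id is empty skipped.
#
#     data = [{"frame_id": "000009"}, {"frame_id": ""}]
#     build_frame_to_info(data)   # -> {"000009": {"frame_id": "000009"}}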

class DAIRV2XDataset(Dataset):
def __init__(self, path, args, split="train", extended_range=None):
