Support adjusting the line thickness of visualized boxes and update readme
OpenRobotLab · Dec 26, 2023 · commit 3ba4828
1 parent 99ab536

Showing 6 changed files with 268 additions and 45 deletions.
diff --git a/README.md b/README.md
@@ -44,6 +44,19 @@
 
 [![demo](assets/demo_fig.png "demo")](https://tai-wang.github.io/embodiedscan)
 
+<!-- contents with emoji -->
+
+## 📋 Contents
+
+1. [About](#-about)
+2. [News](#-news)
+3. [Getting Started](#-getting-started)
+4. [Model and Benchmark](#-model-and-benchmark)
+5. [TODO List](#-todo-list)
+6. [Citation](#-citation)
+7. [License](#-license)
+8. [Acknowledgements](#-acknowledgements)
+
 ## 🏠 About
 
 <!-- ![Teaser](assets/teaser.jpg) -->
@@ -62,9 +75,63 @@ Building upon this database, we introduce a baseline framework named <b>Embodied
 
 - \[2023-12\] We release the [paper](./assets/EmbodiedScan.pdf) of EmbodiedScan. Please check the [webpage](https://tai-wang.github.io/embodiedscan) and view our demos!
 
-## 🔍 Overview
+## 📚 Getting Started
+
+### Installation
 
-### Model
+We test our code under the following environment:
+
+- Ubuntu 20.04
+- NVIDIA Driver: 525.147.05
+- CUDA 12.0
+- Python 3.8.18
+- PyTorch 1.11.0+cu113
+- PyTorch3D 0.7.2
+
+1. Clone this repository.
+
+```bash
+git clone https://github.com/OpenRobotLab/EmbodiedScan.git
+cd EmbodiedScan
+```
+
+2. Install [PyTorch3D](https://github.com/facebookresearch/pytorch3d/blob/main/INSTALL.md)
+
+```bash
+conda create -n embodiedscan python=3.8 -y  # pytorch3d needs python>3.7
+conda activate embodiedscan
+# We recommend installing pytorch3d with pre-compiled packages
+# For example, to install for Python 3.8, PyTorch 1.11.0 and CUDA 11.3
+# For more information, please refer to https://github.com/facebookresearch/pytorch3d/blob/main/INSTALL.md#2-install-wheels-for-linux
+pip install --no-index --no-cache-dir pytorch3d -f https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/py38_cu113_pyt1110/download.html
+```
+
+3. Install EmbodiedScan
+
+```bash
+# We plan to make EmbodiedScan easier to install by "pip install EmbodiedScan".
+# Please stay tuned for the future official release.
+# Make sure you are under ./EmbodiedScan/
+pip install -e .
+```
+
+### Data Preparation
+
+Please download ScanNet, 3RScan and Matterport3D from their official websites.
+
+We will release the demo data, re-organized file structure, post-processing script and annotation files in the near future.
+
+Please stay tuned.
+
+### Tutorial
+
+We provide a simple [tutorial](https://github.com/OpenRobotLab/EmbodiedScan/blob/main/embodiedscan/tutorial.ipynb) here as a guideline for the basic analysis and visualization of our dataset. Feel free to try it and post your suggestions!
+
+## 📦 Model and Benchmark
+
+We will release the code for model training and benchmarking, together with pretrained checkpoints, in 2024 Q1.
+
+### Model Overview
 
 <p align="center">
   <img src="assets/framework.png" align="center" width="100%">
@@ -91,6 +158,10 @@ Embodied Perceptron accepts RGB-D sequence with any number of views along with t
 <video src="assets/scannet_two_bed_demo.mp4" controls>
 </video> -->
 
+### Benchmark
+
+Please see the [paper](./assets/EmbodiedScan.pdf) for details of our two benchmarks: the fundamental 3D perception benchmark and the language-grounded benchmark. This dataset is still scaling up and the benchmark is being polished and extended. Please stay tuned for upcoming updates.
+
 ## 📝 TODO List
 
 - \[x\] Paper and partial code release.
@@ -153,7 +224,7 @@ This work is under the <a rel="license" href="http://creativecommons.org/license
 
 ## 👏 Acknowledgements
 
-- [OpenMMLab](https://github.com/open-mmlab): Our dataset code uses [MMEngine](https://github.com/open-mmlab/mmengine) and our model is built upon of [MMDetection3D](https://github.com/open-mmlab/mmdetection3d).
+- [OpenMMLab](https://github.com/open-mmlab): Our dataset code uses [MMEngine](https://github.com/open-mmlab/mmengine) and our model is built upon [MMDetection3D](https://github.com/open-mmlab/mmdetection3d).
 - [PyTorch3D](https://github.com/facebookresearch/pytorch3d): We use some functions supported in PyTorch3D for efficient computations on fundamental 3D data structures.
 - [ScanNet](https://github.com/ScanNet/ScanNet), [3RScan](https://github.com/WaldJohannaU/3RScan), [Matterport3D](https://github.com/niessner/Matterport): Our dataset uses the raw data from these datasets.
 - [ReferIt3D](https://github.com/referit3d/referit3d): We refer to the SR3D's approach to obtaining the language prompt annotations.

diff --git a/embodiedscan/explorer.py b/embodiedscan/explorer.py
@@ -9,6 +9,7 @@
 from embodiedscan.utils.continuous_drawer import (ContinuousDrawer,
                                                   ContinuousOccupancyDrawer)
 from embodiedscan.utils.img_drawer import ImageDrawer
+from embodiedscan.utils.utils import _9dof_to_box, _box_add_thickness
 
 DATASETS = ['scannet', '3rscan', 'matterport3d']
 
@@ -27,13 +28,12 @@ class EmbodiedScanExplorer:
             Defaults to None.
     """
 
-    def __init__(
-        self,
-        data_root: Union[dict, List],
-        ann_file: Union[dict, List, str],
-        verbose: bool = False,
-        color_setting: str = None,
-    ):
+    def __init__(self,
+                 data_root: Union[dict, List],
+                 ann_file: Union[dict, List, str],
+                 verbose: bool = False,
+                 color_setting: str = None,
+                 thickness: float = 0.01):
 
         if isinstance(ann_file, dict):
             ann_file = list(ann_file.values())
@@ -56,6 +56,7 @@ def __init__(
             self.data_root = data_root
 
         self.verbose = verbose
+        self.thickness = thickness
 
         if self.verbose:
             print('Dataset root')
@@ -239,9 +240,10 @@ def render_scene(self, scene_name, render_box=False):
             if self.verbose:
                 print('Rendering box')
             for instance in select['instances']:
-                box = self._9dof_to_box(instance['bbox_3d'],
-                                        instance['bbox_label_3d'])
-                boxes.append(box)
+                box = _9dof_to_box(instance['bbox_3d'],
+                                   self.classes[instance['bbox_label_3d'] - 1],
+                                   self.color_selector)
+                boxes += _box_add_thickness(box, self.thickness)
             if self.verbose:
                 print('Rendering complete')
         o3d.visualization.draw_geometries([mesh, frame] + boxes)
@@ -297,7 +299,7 @@ def render_continuous_scene(self,
         drawer = ContinuousDrawer(dataset, self.data_root[dataset],
                                   selected_scene, self.classes,
                                   self.color_selector, start_idx,
-                                  pcd_downsample)
+                                  pcd_downsample, self.thickness)
         drawer.begin()
 
     def render_continuous_occupancy(self, scene_name, start_cam=None):
@@ -443,8 +445,10 @@ def show_image(self, scene_name, camera_name, render_box=False):
                         print('Rendering box')
                     for i in camera['visible_instance_ids']:
                         instance = select['instances'][i]
-                        box = self._9dof_to_box(instance['bbox_3d'],
-                                                instance['bbox_label_3d'])
+                        box = _9dof_to_box(
+                            instance['bbox_3d'],
+                            self.classes[instance['bbox_label_3d'] - 1],
+                            self.color_selector)
                         label = self.classes[instance['bbox_label_3d'] - 1]
                         color = self.color_selector.get_color(label)
                         img_drawer.draw_box3d(box,
@@ -461,32 +465,6 @@ def show_image(self, scene_name, camera_name, render_box=False):
         print('No such camera')
         return
 
-    def _9dof_to_box(self, box, label_id):
-        """Convert 9-DoF box annotations to open3d oriented boxes.
-
-        Args:
-            box (list | np.ndarray): Original box annotations.
-            label_id (int): Category ID.
-
-        Returns:
-            open3d.geometry.OrientedBoundingBox: Converted boxes for the
-            subsequent processing and visualization.
-        """
-        if isinstance(box, list):
-            box = np.array(box)
-        center = box[:3].reshape(3, 1)
-        scale = box[3:6].reshape(3, 1)
-        rot = box[6:].reshape(3, 1)
-        rot_mat = \
-            o3d.geometry.OrientedBoundingBox.get_rotation_matrix_from_zxy(rot)
-        geo = o3d.geometry.OrientedBoundingBox(center, rot_mat, scale)
-
-        label = self.classes[label_id - 1]
-        color = self.color_selector.get_color(label)
-        color = [x / 255.0 for x in color]
-        geo.color = color
-        return geo
-
 
 if __name__ == '__main__':
     explorer = EmbodiedScanExplorer(
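
Note on the hunks above: `_box_add_thickness(box, self.thickness)` is used as if it returns a list of geometries (`boxes += ...` here, and `for item in box: vis.add_geometry(item)` in `continuous_drawer.py`). Its body is not part of this diff, so the following is only a minimal sketch of one way a box wireframe could be expanded into thick edges. The helper names `box_edges` and `edges_to_cylinders` are hypothetical, and treating `thickness` as a cylinder radius is an assumption.

```python
import math


def box_edges(center, extent, rot):
    """Return the 12 edges of an oriented box as (start, end) point pairs.

    Hypothetical helper: `rot` is a 3x3 rotation matrix given as nested lists.
    """
    half = [e / 2.0 for e in extent]
    corners = []
    for sx in (-1, 1):
        for sy in (-1, 1):
            for sz in (-1, 1):
                local = (sx * half[0], sy * half[1], sz * half[2])
                # Rotate the local corner offset, then translate to the center.
                world = tuple(
                    center[i] + sum(rot[i][j] * local[j] for j in range(3))
                    for i in range(3))
                corners.append(world)
    edges = []
    for a in range(8):
        for b in range(a + 1, 8):
            # Corners whose indices differ in exactly one bit share an edge.
            if bin(a ^ b).count("1") == 1:
                edges.append((corners[a], corners[b]))
    return edges


def edges_to_cylinders(edges, thickness):
    """Describe each edge as a cylinder (assumes thickness is the radius)."""
    cylinders = []
    for start, end in edges:
        mid = tuple((s + e) / 2.0 for s, e in zip(start, end))
        cylinders.append({
            "center": mid,
            "length": math.dist(start, end),
            "radius": thickness,
        })
    return cylinders
```

With the new constructor argument from this commit, callers would pass the value once, for example `thickness=0.02` when constructing `EmbodiedScanExplorer`, and both `render_scene` and `render_continuous_scene` would pick it up via `self.thickness`.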

diff --git a/embodiedscan/tutorial.ipynb b/embodiedscan/tutorial.ipynb
@@ -1105,6 +1105,14 @@
       "Rendering box\n",
       "Rendering complete\n"
      ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "/data/EmbodiedScan/release/EmbodiedScan/embodiedscan/utils/line_mesh.py:25: RuntimeWarning: invalid value encountered in divide\n",
+      "  axis_ = axis_ / np.linalg.norm(axis_)\n"
+     ]
     }
    ],
    "source": [

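Note on the new stderr output: `invalid value encountered in divide` is the warning NumPy emits for a 0/0 division, and the quoted line normalizes `axis_` by its norm. `line_mesh.py` is not shown in this diff, so the cause is an assumption, but one plausible source is a zero-length rotation axis, which occurs when a box edge is parallel to the reference axis and their cross product vanishes. A minimal sketch of a guarded alignment, using a hypothetical helper `align_z_to` rather than the repository's code:

```python
import math


def align_z_to(direction, eps=1e-8):
    """Return (axis, angle) of a rotation taking the +z axis onto `direction`.

    Hypothetical helper for illustration; not taken from line_mesh.py.
    """
    dx, dy, dz = direction
    length = math.sqrt(dx * dx + dy * dy + dz * dz)
    if length < eps:
        raise ValueError("direction must be non-zero")
    dx, dy, dz = dx / length, dy / length, dz / length
    # cross((0, 0, 1), d) = (-dy, dx, 0)
    ax, ay = -dy, dx
    norm = math.hypot(ax, ay)
    # Clamp to guard acos against rounding slightly outside [-1, 1].
    angle = math.acos(max(-1.0, min(1.0, dz)))
    if norm < eps:
        # d is parallel or anti-parallel to z, so the cross product is zero.
        # Any axis perpendicular to z is valid; dividing by norm would be 0/0.
        return (1.0, 0.0, 0.0), angle
    return (ax / norm, ay / norm, 0.0), angle
```

If this is indeed the cause, a guard of this kind would avoid the warning in the notebook output without changing the rendered result for non-degenerate edges.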
diff --git a/embodiedscan/utils/continuous_drawer.py b/embodiedscan/utils/continuous_drawer.py
@@ -5,7 +5,8 @@
 import numpy as np
 import open3d as o3d
 
-from .utils import _9dof_to_box, draw_camera, from_depth_to_point
+from .utils import (_9dof_to_box, _box_add_thickness, draw_camera,
+                    from_depth_to_point)
 
 
 class ContinuousDrawer:
@@ -26,14 +27,15 @@ class ContinuousDrawer:
     """
 
     def __init__(self, dataset, dir, scene, classes, color_selector, start_idx,
-                 pcd_downsample):
+                 pcd_downsample, thickness):
         self.dir = dir
         self.dataset = dataset
         self.scene = scene
         self.classes = classes
         self.color_selector = color_selector
         self.idx = start_idx
         self.downsample = pcd_downsample
+        self.thickness = thickness
         self.camera = None
         self.occupied = np.zeros((len(self.scene['instances']), ), dtype=bool)
         self.vis = o3d.visualization.VisualizerWithKeyCallback()
@@ -134,7 +136,9 @@ def draw_next(self, vis):
             box = _9dof_to_box(instance['bbox_3d'],
                                self.classes[instance['bbox_label_3d'] - 1],
                                self.color_selector)
-            vis.add_geometry(box)
+            box = _box_add_thickness(box, self.thickness)
+            for item in box:
+                vis.add_geometry(item)
 
         self.idx += 1
         ctr = vis.get_view_control()