feat: update rain-gs (warmup phase) with various options
ONground-Korea committed May 29, 2024
1 parent 12bacf6 commit 192384c
Showing 8 changed files with 197 additions and 46 deletions.
12 changes: 12 additions & 0 deletions .gitignore
@@ -0,0 +1,12 @@
*.pkl
*.txt

*.pyc
.vscode
output
demo_logs
build
diff_rasterization/diff_rast.egg-info
diff_rasterization/dist
tensorboard_3d

59 changes: 55 additions & 4 deletions README.md
@@ -13,24 +13,75 @@ by [Jaewoo Jung](https://crepejung00.github.io)<sup>:umbrella:</sup>, [Jisang Ha
![](assets/teaser.png)<br>
We introduce a novel optimization strategy (**RAIN-GS**) for 3D Gaussian Splatting!

We show that our simple yet effective strategy consisted of **sparse-large-variance (SLV) random initialization** and **progressive Gaussian low-pass filter control** robustly guides 3D Gaussians to model the scene even when starting from random point clouds.
We show that our simple yet effective strategy consisting of **sparse-large-variance (SLV) random initialization**, **progressive Gaussian low-pass filter control**, and the **Adaptive Bound-Expanding Split (ABE-Split) algorithm** robustly guides 3D Gaussians to model the scene even when starting from a random point cloud.

For further details and visualization results, please check out our [paper](https://arxiv.org/abs/2403.09413) and our [project page](https://ku-cvlab.github.io/RAIN-GS/).
**❗️Update (2024/05/29):** We have updated our paper and code, which significantly improves our previous results! <br>
**😴 TL;DR** for our update is as follows:
- We modify the original split algorithm of 3DGS so that Gaussians can also model regions of the scene farther from the training viewpoints. We name this new splitting algorithm the Adaptive Bound-Expanding Split (**ABE-Split**) algorithm (a simplified sketch follows this list).
- With our three key components (SLV initialization, progressive Gaussian low-pass filtering, and ABE-Split), we now perform **on par with or even better than** 3DGS trained with an SfM-initialized point cloud.

- As RAIN-GS only requires the initial point cloud to be sparse (SLV initialization), we now additionally apply our strategy to **SfM / noisy SfM point clouds** by choosing a sparse subset of the points.
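
Below is a minimal, illustrative sketch of the ABE-Split pre-pass, condensed from the `densify_and_split` change in `scene/gaussian_model.py` later in this commit. The function name and signature are hypothetical; only the selection rule and the `0.3 * scene_extent` re-positioning mirror the committed code.

```python
import torch

def abe_split_prepass(xyz, scaling, grads, grad_threshold,
                      percent_dense, scene_extent, n_back=1):
    """Sketch of the Adaptive Bound-Expanding Split (ABE-Split) pre-pass."""
    # Select large, high-gradient Gaussians (same rule as the regular split).
    mask = (grads.squeeze(-1) >= grad_threshold) & \
           (scaling.max(dim=1).values > percent_dense * scene_extent)

    # Duplicate the selected Gaussians n_back times.
    new_xyz = xyz[mask].repeat(n_back, 1)

    # Bound-expanding step: re-position the copies proportionally to the
    # scene extent so they can model regions farther from the viewpoints.
    new_xyz = new_xyz * 0.3 * scene_extent
    return mask, new_xyz
```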

For further details and visualization results, please check out our updated [paper](https://arxiv.org/abs/2403.09413) and our new [project page](https://ku-cvlab.github.io/RAIN-GS/).

## Installation
We implement **RAIN-GS** on top of the official implementation of 3D Gaussian Splatting. <br> For environment setup, please follow the original requirements of [3DGS](https://github.com/graphdeco-inria/gaussian-splatting).

## Training

To train 3D Gaussian Splatting with our novel strategy (**RAIN-GS**), all you need to do is:
To train 3D Gaussian Splatting with our **updated RAIN-GS** strategy, all you need to do is:

```bash
python train.py -s {dataset_path} --exp_name {exp_name} --eval --ours_new
```
You can train from various initializations by adding `--train_from ['random', 'reprojection', 'cluster', 'noisy_sfm']` (`random` is the default).
<details>
<summary>Toggle for more details on training from the various initializations.</summary>

- **Random Initialization** (Default)
```bash
python train.py -s {dataset_path} --exp_name {exp_name} --eval --ours_new --train_from 'random'
```
- SfM (Structure-from-Motion) Initialization <br>
To apply RAIN-GS to an SfM initialization, we need to start from a sparse set of points (SLV initialization). <br>
To obtain this sparse set, you can choose from several options (a condensed sketch of both options appears after this toggle):
- **Clustering** : Apply clustering to the initial point cloud using the [HDBSCAN](https://github.com/scikit-learn-contrib/hdbscan) algorithm.
```bash
python train.py -s {dataset_path} --exp_name {exp_name} --eval --ours_new --train_from 'cluster'
```

- **Top 10%** : Each SfM point comes with a reprojection error, which we treat as its confidence. We select the 10% of points with the lowest reprojection error (the most confident points).
```bash
python train.py -s {dataset_path} --exp_name {exp_name} --eval --ours_new --train_from 'reprojection'
```

- **Noisy SfM Initialization** <br>
In real-world scenarios, the point cloud from SfM can contain noise. To simulate this, we add random noise sampled from a normal distribution to the SfM point cloud. When run with this option, the clustering algorithm is applied to the noisy SfM point cloud.
```bash
python train.py -s {dataset_path} --exp_name {exp_name} --eval --ours_new --train_from 'noisy_sfm'
```

</details>
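
Below is a condensed sketch of the two sparse-selection options described above, mirroring the logic this commit adds to `scene/dataset_readers.py`. The function name and signature are illustrative; `error` is assumed to be the per-point reprojection error read from the COLMAP reconstruction.

```python
import numpy as np
from sklearn.cluster import HDBSCAN  # requires scikit-learn >= 1.3

def select_sparse_points(xyz, error, mode="cluster", top_percent=10):
    """Reduce an SfM point cloud to a sparse set for SLV initialization."""
    if mode == "reprojection":
        # Keep only the most confident points: reprojection error below
        # the top_percent percentile.
        threshold = np.percentile(error[:, 0], top_percent)
        return xyz[error[:, 0] < threshold]
    if mode == "cluster":
        # Replace the point cloud with HDBSCAN cluster centroids.
        return HDBSCAN(min_cluster_size=5, store_centers="both").fit(xyz).centroids_
    raise ValueError(f"unknown mode: {mode}")
```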

To train 3D Gaussian Splatting with our original **RAIN-GS**, all you need to do is:

```bash
python train.py -s {dataset_path} --exp_name {exp_name} --eval --ours
```

For dense-small-variance (DSV) random initialization (used in the original 3D Gaussian Splatting), you can simply run with the following command:
```bash
python train.py -s {dataset_path} --exp_name {exp_name} --eval --DSV
python train.py -s {dataset_path} --exp_name {exp_name} --eval --paper_random
```
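
For reference, here is a minimal sketch of how the two random initializations differ, following the point-cloud generation added in `scene/dataset_readers.py`. `radius` and `translate` are assumed to come from the camera normalization, `cam_pos` from the camera translation vectors, and the point counts reflect the defaults in this commit (1M for DSV, 10 for the SLV initialization used by RAIN-GS).

```python
import numpy as np

def random_point_cloud(mode, radius, translate, cam_pos,
                       num_dsv=1_000_000, num_slv=10):
    """Sketch of DSV vs. SLV (RAIN-GS) random point-cloud generation."""
    if mode == "DSV":
        # Dense-small-variance: many points inside a cube of ~3x the
        # normalization radius, centred on the scene.
        xyz = np.random.random((num_dsv, 3)) * radius * 3 - radius * 1.5
        return xyz + translate
    # Sparse-large-variance: very few points spread over a cube derived from
    # the camera-position bounds; the sparse spacing makes the initial
    # per-Gaussian scales (variances) large.
    lo, hi = cam_pos.min(), cam_pos.max()
    cube = (hi - lo) * 1.5
    center = (lo + hi) / 2.0
    return np.random.random((num_slv, 3)) * (hi - lo) * 3 - (cube - center)
```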

For SfM (Structure-from-Motion) initialization (used in the original 3D Gaussian Splatting), you can simply run with the following command:
```bash
python train.py -s {dataset_path} --exp_name {exp_name} --eval
```

For Noisy SfM initialization (used in the original 3D Gaussian Splatting), you can simply run with the following command:
```bash
python train.py -s {dataset_path} --exp_name {exp_name} --eval --train_from 'noisy_sfm'
```

To train with the Mip-NeRF 360 dataset, you can add the argument `--images images_4` for outdoor scenes or `--images images_2` for indoor scenes to modify the resolution of the input images.
Binary file modified assets/teaser.png
100755 → 100644
Empty file modified paper.pdf
100644 → 100755
Binary file added paper_v2.pdf
71 changes: 50 additions & 21 deletions scene/dataset_readers.py
@@ -143,7 +143,7 @@ def readColmapSceneInfo(path, images, eval, llffhold=8, args_dict=None):
ply_path = os.path.join(path, "sparse/0/points3D.ply")
bin_path = os.path.join(path, "sparse/0/points3D.bin")
txt_path = os.path.join(path, "sparse/0/points3D.txt")

if not os.path.exists(ply_path):
print("Converting point3d.bin to .ply, will happen only the first time you open the scene.")
try:
@@ -155,30 +155,59 @@ def readColmapSceneInfo(path, images, eval, llffhold=8, args_dict=None):
pcd = fetchPly(ply_path)
except:
pcd = None

if (args_dict is not None) and (args_dict['DSV'] or args_dict['ours']):
num_pts = args_dict["num_gaussians"]

cam_pos = []
for k in cam_extrinsics.keys():
cam_pos.append(cam_extrinsics[k].tvec)
cam_pos = np.array(cam_pos)
min_cam_pos = np.min(cam_pos)
max_cam_pos = np.max(cam_pos)
mean_cam_pos = (min_cam_pos + max_cam_pos) / 2.0
cube_mean = (max_cam_pos - min_cam_pos) * 1.5
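# Simulate a noisy SfM initialization: perturb point positions and colors with unit Gaussian noise, clipping colors to [0, 255].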
if args_dict['train_from'] == "noisy_sfm":
print(f"Adding noise to the point cloud (1.0)...")
xyz += np.random.normal(0, 1.0, xyz.shape)
rgb += np.random.normal(0, 1.0, rgb.shape)
rgb = np.clip(rgb, 0, 255)

if (args_dict is not None) and (args_dict['paper_random'] or args_dict['ours'] or args_dict['ours_new']):
if not args_dict['ours'] and args_dict['train_from'] == "reprojection":
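# SLV selection from SfM: keep roughly the 10% of points with the lowest reprojection error (the most confident points).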
try:
xyz, rgb, error = read_points3D_binary(bin_path)
except:
xyz, rgb, error = read_points3D_text(txt_path)

error_rate = 10
err_thr = np.percentile(error[:,0], error_rate)
xyz = xyz[(error[:,0]<err_thr),:]
rgb = rgb[(error[:,0]<err_thr),:]
print(f"Train with {len(xyz)} sparse SfM points... (Sparse Type: Reprojection Error Top {error_rate}%)")
storePly(ply_path, xyz, rgb)

elif not args_dict['ours'] and ((args_dict['train_from'] == "cluster") or (args_dict['train_from'] == "noisy_sfm")):
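# SLV selection via clustering: replace the (possibly noisy) point cloud with HDBSCAN cluster centroids and assign random colors.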
from sklearn.cluster import HDBSCAN
hdbscan = HDBSCAN(min_cluster_size=5, store_centers='both').fit(xyz)
xyz = hdbscan.centroids_
shs = np.random.random((len(xyz), 3))
rgb = SH2RGB(shs) * 255
print(f"Train with {len(xyz)} sparse SfM points... (Sparse Type: cluster)")
storePly(ply_path, xyz, rgb)

if args_dict['DSV']:
xyz = np.random.random((num_pts, 3)) * nerf_normalization["radius"] * 3 - nerf_normalization["radius"] * 1.5
xyz = xyz + nerf_normalization["translate"]
print(f"Generating DSV point cloud ({num_pts})...")
else:
xyz = np.random.random((num_pts, 3)) * (max_cam_pos - min_cam_pos) * 3 - (cube_mean - mean_cam_pos)
print(f"Generating OUR point cloud ({num_pts})...")
num_pts = args_dict["num_gaussians"]

cam_pos = []
for k in cam_extrinsics.keys():
cam_pos.append(cam_extrinsics[k].tvec)
cam_pos = np.array(cam_pos)
min_cam_pos = np.min(cam_pos)
max_cam_pos = np.max(cam_pos)
mean_cam_pos = (min_cam_pos + max_cam_pos) / 2.0
cube_mean = (max_cam_pos - min_cam_pos) * 1.5

if args_dict['paper_random']:
xyz = np.random.random((num_pts, 3)) * nerf_normalization["radius"] * 3 - nerf_normalization["radius"] * 1.5
xyz = xyz + nerf_normalization["translate"]
print(f"Generating random point cloud ({num_pts})...")
else:
xyz = np.random.random((num_pts, 3)) * (max_cam_pos - min_cam_pos) * 3 - (cube_mean - mean_cam_pos)
print(f"Generating OUR point cloud ({num_pts})...")

shs = np.random.random((num_pts, 3))
pcd = BasicPointCloud(points=xyz, colors=shs, normals=np.zeros((num_pts, 3)))
storePly(ply_path, xyz, SH2RGB(shs) * 255)
shs = np.random.random((num_pts, 3))
pcd = BasicPointCloud(points=xyz, colors=shs, normals=np.zeros((num_pts, 3)))
storePly(ply_path, xyz, SH2RGB(shs) * 255)
try:
pcd = fetchPly(ply_path)
except:
32 changes: 28 additions & 4 deletions scene/gaussian_model.py
@@ -336,9 +336,33 @@ def densification_postfix(self, new_xyz, new_features_dc, new_features_rest, new
self.denom = torch.zeros((self.get_xyz.shape[0], 1), device="cuda")
self.max_radii2D = torch.zeros((self.get_xyz.shape[0]), device="cuda")

def densify_and_split(self, grads, grad_threshold, scene_extent, N=2):
def densify_and_split(self, grads, grad_threshold, scene_extent, N=2, abe_split=False):
n_init_points = self.get_xyz.shape[0]


if abe_split:
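# ABE-Split pre-pass: duplicate large, high-gradient Gaussians and re-position
# the copies proportionally to the scene extent (0.3 * scene_extent) so they
# can model regions beyond the current bound, before the regular split below.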
BACK_N = N - 1
padded_grad = torch.zeros((n_init_points), device="cuda")
padded_grad[:grads.shape[0]] = grads.squeeze()
selected_pts_mask = torch.where(padded_grad >= grad_threshold, True, False)
selected_pts_mask = torch.logical_and(selected_pts_mask,
torch.max(self.get_scaling, dim=1).values > self.percent_dense*scene_extent)

stds = self.get_scaling[selected_pts_mask].repeat(BACK_N,1)
means = torch.zeros((stds.size(0), 3), device="cuda")
samples = torch.normal(mean=means, std=stds)
rots = build_rotation(self._rotation[selected_pts_mask]).repeat(BACK_N,1,1)
new_xyz = self.get_xyz[selected_pts_mask].repeat(BACK_N, 1)
new_scaling = self.scaling_inverse_activation(self.get_scaling[selected_pts_mask].repeat(BACK_N,1))
new_rotation = self._rotation[selected_pts_mask].repeat(BACK_N,1)
new_features_dc = self._features_dc[selected_pts_mask].repeat(BACK_N,1,1)
new_features_rest = self._features_rest[selected_pts_mask].repeat(BACK_N,1,1)
new_opacity = self._opacity[selected_pts_mask].repeat(BACK_N,1)

new_xyz = new_xyz*0.3*scene_extent

self.densification_postfix(new_xyz, new_features_dc, new_features_rest, new_opacity, new_scaling, new_rotation)
n_init_points = self.get_xyz.shape[0]

padded_grad = torch.zeros((n_init_points), device="cuda")
padded_grad[:grads.shape[0]] = grads.squeeze()
selected_pts_mask = torch.where(padded_grad >= grad_threshold, True, False)
@@ -376,12 +400,12 @@ def densify_and_clone(self, grads, grad_threshold, scene_extent):

self.densification_postfix(new_xyz, new_features_dc, new_features_rest, new_opacities, new_scaling, new_rotation)

def densify_and_prune(self, max_grad, min_opacity, extent, max_screen_size):
def densify_and_prune(self, max_grad, min_opacity, extent, max_screen_size, N=2, abe_split=False):
grads = self.xyz_gradient_accum / self.denom
grads[grads.isnan()] = 0.0

self.densify_and_clone(grads, max_grad, extent)
self.densify_and_split(grads, max_grad, extent)
self.densify_and_split(grads, max_grad, extent, N=N, abe_split=abe_split)

prune_mask = (self.get_opacity < min_opacity).squeeze()
if max_screen_size:
69 changes: 52 additions & 17 deletions train.py
@@ -25,7 +25,7 @@ def training(dataset, opt, pipe, testing_iterations ,saving_iterations, checkpoi
first_iter = 0
tb_writer = prepare_output_and_logger(dataset, args_dict['output_path'], args_dict['exp_name'], args_dict['project_name'])

if args_dict['ours']:
if args_dict['ours'] or args_dict['ours_new']:
divide_ratio = 0.7
else:
divide_ratio = 0.8
@@ -35,6 +35,9 @@ def training(dataset, opt, pipe, testing_iterations ,saving_iterations, checkpoi
scene = Scene(dataset, gaussians, args_dict=args_dict)
gaussians.training_setup(opt)

if args_dict["warmup_iter"] > 0:
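# The warmup phase runs with ABE-Split active; extend the densification window so the regular schedule still covers its full length afterwards.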
opt.densify_until_iter += args_dict["warmup_iter"]

if checkpoint:
(model_params, first_iter) = torch.load(checkpoint)
gaussians.restore(model_params, opt)
@@ -67,16 +70,20 @@ def training(dataset, opt, pipe, testing_iterations ,saving_iterations, checkpoi

iter_start.record()

gaussians.update_learning_rate(iteration)

if args_dict['DSV']:
if iteration % 1000 == 0:
gaussians.oneupSHdegree()
elif args_dict['ours']:
if args_dict['ours_new']:
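# With the updated RAIN-GS, the learning-rate schedule is offset by the warmup iterations.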
if iteration >= args_dict["warmup_iter"]:
gaussians.update_learning_rate(iteration-args_dict["warmup_iter"])
else:
gaussians.update_learning_rate(iteration)

if args_dict['ours'] or args_dict['ours_new']:
if iteration >= 5000:
if iteration % 1000 == 0:
gaussians.oneupSHdegree()

else:
if iteration % 1000 == 0:
gaussians.oneupSHdegree()

if not viewpoint_stack:
viewpoint_stack = scene.getTrainCameras().copy()
viewpoint_cam = viewpoint_stack.pop(randint(0, len(viewpoint_stack)-1))
@@ -128,7 +135,9 @@ def training(dataset, opt, pipe, testing_iterations ,saving_iterations, checkpoi

if iteration > opt.densify_from_iter and iteration % opt.densification_interval == 0:
size_threshold = 20 if iteration > opt.opacity_reset_interval else None
gaussians.densify_and_prune(opt.densify_grad_threshold, 0.005, scene.cameras_extent, size_threshold)
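# ABE-Split is only enabled during the warmup phase.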
abe_split = True if iteration <= args_dict['warmup_iter'] else False

gaussians.densify_and_prune(opt.densify_grad_threshold, 0.005, scene.cameras_extent, size_threshold, N=2, abe_split=abe_split)

if iteration % opt.opacity_reset_interval == 0 or (dataset.white_background and iteration == opt.densify_from_iter):
gaussians.reset_opacity()
@@ -224,7 +233,7 @@ def training_report(tb_writer, iteration, Ll1, loss, l1_loss, elapsed, testing_i
parser.add_argument('--debug_from', type=int, default=-1)
parser.add_argument('--detect_anomaly', action='store_true', default=False)
parser.add_argument("--test_iterations", nargs="+", type=int, default=[7000, 30000])
parser.add_argument("--save_iterations", nargs="+", type=int, default=[7000, 30000])
parser.add_argument("--save_iterations", nargs="+", type=int, default=[30000])
parser.add_argument("--quiet", action="store_true")
parser.add_argument("--checkpoint_iterations", nargs="+", type=int, default=[])
parser.add_argument("--start_checkpoint", type=str, default = None)
@@ -235,26 +244,52 @@ def training_report(tb_writer, iteration, Ll1, loss, l1_loss, elapsed, testing_i
parser.add_argument("--c2f", action="store_true", default=False)
parser.add_argument("--c2f_every_step", type=float, default=1000, help="Recompute low pass filter size for every c2f_every_step iterations")
parser.add_argument("--c2f_max_lowpass", type=float, default= 300, help="Maximum low pass filter size")
parser.add_argument("--num_gaussians", type=int, default=1000000, help="Number of random initial gaussians to start with (default=1M for DSV)")
parser.add_argument('--DSV', action='store_true', help="Use the initialisation from the paper")
parser.add_argument("--num_gaussians", type=int, default=1000000, help="Number of random initial gaussians to start with (default=1M for random)")
parser.add_argument('--paper_random', action='store_true', help="Use the initialisation from the paper")
parser.add_argument("--ours", action="store_true", help="Use our initialisation")
parser.add_argument("--ours_new", action="store_true", help="Use our initialisation version 2")
parser.add_argument("--warmup_iter", type=int, default=0)
parser.add_argument("--train_from", type=str, default="random", choices=["random", "reprojection", "cluster", "noisy_sfm"])
args = parser.parse_args(sys.argv[1:])
args.save_iterations.append(args.iterations)
args.white_background = args.white_bg
print("Optimizing " + args.model_path)

safe_state(args.quiet)

if args.ours:
print("========= USING OUR INITIALISATION =========")
args.eval = True
outdoor_scenes=['bicycle', 'flowers', 'garden', 'stump', 'treehill']
indoor_scenes=['room', 'counter', 'kitchen', 'bonsai']
for scene in outdoor_scenes:
if scene in args.source_path:
args.images = "images_4"
print("Using images_4 for outdoor scenes")
for scene in indoor_scenes:
if scene in args.source_path:
args.images = "images_2"
print("Using images_2 for indoor scenes")

if args.ours or args.ours_new:
print("========= USING OUR METHOD =========")
args.c2f = True
args.c2f_every_step = 1000
args.c2f_max_lowpass = 300
args.eval = True
args.num_gaussians = 10
if args.ours_new:
args.warmup_iter = 10000

if not args.DSV and not args.ours:
parser.error("Please specify either --DSV or --ours")
if args.ours and (args.train_from != "random"):
parser.error("Our initialization version 1 can only be used with --train_from random")

# if args.sparse_sfm and args.cluster:
# parser.error("Please specify either --sparse_sfm or --cluster")
# if args.random and (args.sparse_sfm or args.cluster):
# parser.error("Random initialization cannot be used with --sparse_sfm or --cluster")
# if args.random and args.noisy_sfm:
# parser.error("Random initialization cannot be used with --noisy_sfm")
# if args.ours and (args.sparse_sfm or args.cluster):
# parser.error("Our initialization version 1 cannot be used with --sparse_sfm or --cluster")

print(f"args: {args}")

while True :
