There are many solutions to the 3D reconstruction problem, but a major breakthrough was the NeRF method, which does not describe the surfaces of shapes explicitly but treats space as a colorful, dense, luminous volume called a radiance field. This approach is extremely flexible: unlike previous methods, it can learn view-dependent lighting without any knowledge of the light sources.
A radiance field models space as a collection of light-emitting particles, much as in reality. When the radiance field is represented by a neural network, we get the NeRF method. According to another approach, however, these particles can also have spatial extent. Such extended particles are Gaussian splats, which are geometrically equivalent to ellipsoids.
*Figure: rendered images of the reconstruction (left) and the ellipsoids of the reconstruction (right).*
Each ellipsoid is shaped by a 3D Gaussian distribution stretched along three mutually perpendicular axes, and it carries the attributes needed for rendering, such as view-dependent color and opacity. The method based on this concept is called 3D Gaussian Splatting.
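Formally, each splat is an (unnormalized) 3D Gaussian whose covariance is factored into a rotation and three per-axis scales, which is exactly the ellipsoid picture above:

$$
G(\mathbf{x}) = \exp\left(-\tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^{\top}\Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu})\right),
\qquad
\Sigma = R\,S\,S^{\top}R^{\top}
$$

where $\boldsymbol{\mu}$ is the splat's center, $R$ is the rotation matrix built from its quaternion, and $S = \mathrm{diag}(s_x, s_y, s_z)$ holds the semi-axis scales.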
Gaussian Splatting is a method for representing 3D scenes that allows radiance fields to be rendered in real time. Using machine learning, it can reconstruct a scene in 3D from nothing more than images and the corresponding camera poses: the method optimizes the attributes of the ellipsoids by comparing rendered and real images taken from the given camera poses.
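For reference, the original 3D Gaussian Splatting paper measures the difference between the rendered and the real image with a combination of an $\mathcal{L}_1$ term and a structural (D-SSIM) term:

$$
\mathcal{L} = (1-\lambda)\,\mathcal{L}_1 + \lambda\,\mathcal{L}_{\text{D-SSIM}}
$$

with $\lambda = 0.2$ in the paper; the gradients of this loss drive the optimization of the ellipsoid attributes.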
The output of the algorithm is a Polygon (PLY) file that contains the attributes of the ellipsoids building up the reconstruction:
- center
- rotation
- color (diffuse and view dependent components)
- scale
- opacity
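As a sketch of how these attributes can be accessed, here is a minimal example using the `plyfile` package. The property names (`x`, `rot_0`, `f_dc_0`, `scale_0`, `opacity`, ...) follow the layout written by the reference 3D Gaussian Splatting implementation and may differ in other exporters:

```python
import numpy as np
from plyfile import PlyData

# Read the Polygon (PLY) file produced by the reconstruction.
ply = PlyData.read("point_cloud.ply")
v = ply["vertex"]

# Each ellipsoid is one vertex record; stack its attributes into arrays.
centers   = np.stack([v["x"], v["y"], v["z"]], axis=-1)             # (N, 3) centers
rotations = np.stack([v[f"rot_{i}"] for i in range(4)], axis=-1)    # (N, 4) quaternions
scales    = np.stack([v[f"scale_{i}"] for i in range(3)], axis=-1)  # (N, 3) scales (stored in log space)
colors    = np.stack([v[f"f_dc_{i}"] for i in range(3)], axis=-1)   # (N, 3) diffuse color component
opacity   = np.asarray(v["opacity"])                                # (N,) opacity (pre-sigmoid)

print(f"{len(centers)} ellipsoids loaded")
```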
Segmentation is a commonly used procedure. One way to segment the 3D scene is to project segmented images onto the points (ellipsoids) of the reconstructed scene. This method requires semantically segmented versions of the training images the reconstruction is based on. The segmented images must have the same file names as their original RGB pairs, so that each one can be matched to the camera pose it is projected from in the world coordinate system.
*Figure: an RGB training image (left) and its semantic segmentation (right).*
For the projection I implemented object-oriented Python code based on ray casting. Here you can see the partial class diagram of the implementation.
The code takes the camera poses and the corresponding images, then casts a ray through each pixel coordinate, intersects it with the ellipsoids, and labels the closest intersected ellipsoid with the color of that pixel.
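The ray-ellipsoid test itself can be reduced to a ray-unit-sphere test by mapping the ray into the ellipsoid's local frame. The following is a minimal sketch of that idea; the function and variable names are illustrative, not the project's actual API:

```python
import numpy as np

def intersect_ellipsoid(origin, direction, center, rotation, scale):
    """Return the ray parameter t of the nearest hit, or None.

    rotation: (3, 3) matrix of the ellipsoid axes; scale: (3,) semi-axes.
    The ray x(t) = origin + t * direction is mapped into the frame where
    the ellipsoid becomes the unit sphere, then a quadratic is solved.
    Since the map is affine, t means the same thing in both frames.
    """
    # World -> local: undo the rotation, then divide by the semi-axis lengths.
    o = (rotation.T @ (origin - center)) / scale
    d = (rotation.T @ direction) / scale

    # Unit-sphere intersection: |o + t d|^2 = 1  ->  a t^2 + b t + c = 0.
    a = d @ d
    b = 2.0 * (o @ d)
    c = o @ o - 1.0
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None  # the ray misses this ellipsoid
    t = (-b - np.sqrt(disc)) / (2.0 * a)
    return t if t > 0.0 else None
```

Taking the smallest positive `t` over all ellipsoids gives the closest splat, which then receives the pixel's label.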
The implementation contains functions that allow you to
- label the ellipsoids
- smooth the segmentation (a sketch follows this list)
- reduce the noise of the segmentation
- filter by labels
- and save the results in the original format
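As an illustration of what the smoothing and noise-reduction steps can look like, here is a sketch of majority voting over the k nearest ellipsoid centers; this is one common choice, and the project's actual implementation may differ:

```python
import numpy as np
from scipy.spatial import cKDTree

def smooth_labels(centers, labels, k=8):
    """Replace each ellipsoid's integer label with the majority label of
    its k nearest neighbors, which also suppresses isolated mislabels."""
    tree = cKDTree(centers)
    _, idx = tree.query(centers, k=k + 1)  # +1 because each point returns itself
    smoothed = np.empty_like(labels)
    for i, neighbors in enumerate(idx):
        votes = np.bincount(labels[neighbors])
        smoothed[i] = np.argmax(votes)
    return smoothed
```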
For the visualization of the partial results I used the Open3D library.
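For example, the labeled centers can be inspected with a few lines of Open3D (a minimal sketch; `centers` and the per-label `colors` array are assumed to come from the steps above):

```python
import open3d as o3d

pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(centers)  # ellipsoid centers, (N, 3)
pcd.colors = o3d.utility.Vector3dVector(colors)   # per-label RGB values in [0, 1]
o3d.visualization.draw_geometries([pcd])
```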
This project can save a properly formatted Polygon (PLY) file which can be read by the SIBR viewer. This way you can eliminate unnecessary segments of the scene after labeling it. You can see the results below.