Awesome

GaussFusion: Seamless 3D Object Integration using Gaussian Splatting

Computer Vision Final Project on 3D Gaussian Splatting

<a href="https://github.com/EGalahad/GaussFusion/stargazers"> <img alt="GitHub stars" src="https://img.shields.io/github/stars/EGalahad/GaussFusion?style=social"/> </a>  <a href="https://github.com/EGalahad/GaussFusion/pulls"> <img alt="Pull Requests" src="https://img.shields.io/github/issues-pr/EGalahad/GaussFusion"/> </a> <img src="assets/bonsai_on_desktop.png" alt="bonsai on desktop" width="45%" /> <img src="assets/lamborghini_in_garage.png" alt="lamborghini in garage" width="45%" /> Figure 1a: Bonsai on Desktop                                           Figure 1b: Lamborghini in Garage Our plug-and-play merging visualizer supports versatile object integration across different scenes. <img src="assets/multi_res_384_fps_60.png" alt="multi_res_384_fps_60" width="32%" /> <img src="assets/multi_res_1024_fps_35.png" alt="multi_res_1024_fps_35" width="32%" /> <img src="assets/multi_res_3072_fps_8.png" alt="multi_res_3072_fps_8" width="32%" /> Figure 2a: 384p @ 60 FPS              Figure 2b: 1024p @ 35 FPS              Figure 2c: 3072p @ 8 FPS Multi-resolution rendering enables trade-off between visual quality and performance.

This project provides a streamlined pipeline to create a Gaussian Splatting model from two types of data sources: a 3D model of an object (downloaded from the internet) and a video scan of your environment. Our toolkit facilitates easy preprocessing of data into a format compatible with the Gaussian Splatting algorithm. Additionally, we offer an intuitive and interactive visualizer to merge and view the object in a new background, allowing for easy navigation and control over the object's relative position and scale.

For faster training, use the Grendel-GS distributed Gaussian Splatting training repository.

🌟 Visualizer Features

🔮 Format Compatibility: Supports trained Gaussian Splatting models in ply format, compatible with various 3DGS training repositories. Instead of custom checkpoint files expected by some visualizer.
🎨 Multi-Model Support: Visualize one or two Gaussian Splatting models simultaneously.
🕹️ Interactive Object Manipulation: Easily adjust position, scale, and orientation of objects in real-time for perfect scene composition.
🖥️ Adaptive Rendering: Toggle between different resolutions on-the-fly, balancing visual quality and performance to suit your hardware capabilities.

GaussFusion: Seamless 3D Object Integration using Gaussian Splatting

Data Collection

3D Models

We recommend using Sketchfab to download 3D models of objects. The website offers a wide variety of models, and most of them are free to download.

Video Scans

To create a video scan of your environment, you can use your smartphone or a camera to capture a video of the object from different angles.

Ensure smooth and consistent motion while capturing the video, do not move too fast or too slow.
Capture the object from different angles.
Make sure in the first few seconds, the camera is vertical. This is crucial for consistent coordinate system between different data sources.

Data Preprocessing

For each scene, the final output will be organized in the following directory structure:

scene/
    bonsai/
        database.db     --- COLMAP database file
        cameras.json    --- Camera parameters (only available for 3D model)
        images/         --- Images from the 3D model or video scan
            0000.jpg
            0001.jpg
            ...
        sparse/         --- Sparse reconstruction by COLMAP
            0/
                cameras.bin
                points3D.bin
                images.bin
    ...

Set the environment variable $PROJECT_PATH to the path of your scene folder, e.g. export PROJECT_PATH=scene/bonsai.

Image Generation

Generate Images from 3D Models

To generate images from different views of a downloaded 3D model, use the following script:

python data_processing/model_view_renderer.py \
    -i /path/to/your/3d/model.gltf \
    -o $PROJECT_PATH

This script also generates a cameras.json file containing the camera parameters for each view.

Generate Images from Video Scans

To generate images from a video scan, run:

python data_processing/video_frame_extractor.py \
    -i /path/to/your/video.mp4 \
    -o $PROJECT_PATH

COLMAP Feature Matching and Reconstruction

To create the sparse model with 3D keypoints and camera parameters, use the COLMAP library. The initial steps are the same for both 3D models and video scans:

colmap feature_extractor \
    --database_path $PROJECT_PATH/database.db \
    --image_path $PROJECT_PATH/images \
    --ImageReader.camera_model=SIMPLE_PINHOLE \
    --ImageReader.single_camera=true

colmap sequential_matcher \
    --database_path $PROJECT_PATH/database.db

If your colmap installation does not support GPU, add --SiftExtraction.use_gpu=false to the feature_extractor command and --SiftMatching.use_gpu=false to the sequential_matcher command.

If your 3D model images are not captured at contiguous locations, use exhaustive_matcher instead of sequential_matcher.

3D Model

Ensure the cameras.json file is generated before proceeding:

python data_processing/model_to_sparse_colmap.py \
    -i $PROJECT_PATH \
    -o $PROJECT_PATH/sparse/predefined

mkdir -p $PROJECT_PATH/sparse/0
colmap point_triangulator \
    --database_path $PROJECT_PATH/database.db \
    --image_path $PROJECT_PATH/images \
    --input_path $PROJECT_PATH/sparse/predefined \
    --output_path $PROJECT_PATH/sparse/0

The first command creates a COLMAP-compatible sparse model from the cameras.json file and reorders the images to align with the COLMAP database. The second command triangulates the 3D keypoints using the known camera parameters.

Video Scan

mkdir -p $PROJECT_PATH/sparse/raw

colmap mapper \
    --database_path $PROJECT_PATH/database.db \
    --image_path $PROJECT_PATH/images \
    --output_path $PROJECT_PATH/sparse/raw

This command creates a sparse model from the video scan images. We need a further step to make the coordinate of sparse model align with the actual world coordinate, e.g. gravity points to the negative z-axis. We can use the following script to rotate the sparse model:

mkdir -p $PROJECT_PATH/sparse/0
python data_processing/video_rotate_sparse_model.py \
    -i $PROJECT_PATH/sparse/raw \
    -o $PROJECT_PATH/sparse/0

Visualize Sparse Model

Now the sparse model will be created under the sparse_transformed/0,1,2,... directory. You can visualize the sparse model using colmap gui (File > import model under Ubuntu) to check the validity of the sparse model.

<div style="text-align:center"> <img src="assets/bonsai_colmap.png" alt="bonsai colmap" width="45%" /> <img src="assets/desktop_colmap.png" alt="desktop colmap" width="45%" /> </div> Figure 3a: Bonsai Sparse Model (3D synthetic model)         Figure 3b: Desktop Sparse Model (Video scan) Example visualizations of the sparse model in COLMAP GUI.

Gaussian Splatting Training

For training the Gaussian Splatting model, please refer to Grendel-GS.

Since the distributed system saves one .ply file for each GPU, you can use the following script to merge them in to a single .ply file:

python visualizer/merge_ply.py \
    -i /path/to/the/folder/that/contains/your/ply/files \
    -o /path/to/your/output/ply/file.ply

Visualization

To visualize the Gaussian Splatting model, we offer a simple visualizer. Ensure you have a CUDA-compatible GPU and sufficient memory:

python visualizer/visualize.py \
    --background /path/to/background/gaussian/splatting/model.ply \
    --object /path/to/object/gaussian/splatting/model.ply \
    --port 8080

Open your browser and navigate to http://localhost:8080 to view the visualization.