# PyTorch mip-NeRF
A reimplementation of mip-NeRF in PyTorch.
It's not exactly 1-to-1 with the official repo: we organized the code to our own liking (mostly how the datasets are structured, plus hyperparameter changes so the code runs on a consumer-level graphics card), made it more modular, and removed some repetitive code, but it achieves the same results.
## Features

- Can use spherical or spiral poses to generate videos for all 3 datasets (see the pose sketch below)
  - Spherical:
  - Spiral:
- Depth and normals video renderings:
  - Depth:
  - Normals:
- Can extract meshes
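For reference, spherical camera paths in NeRF-style codebases are usually built by composing an elevation rotation, an azimuth rotation, and a translation along the view axis. The sketch below follows that standard recipe; the helper names are illustrative, not the actual `pose_utils.py` API, and depending on the dataset convention a final axis flip may be needed.

```python
import numpy as np

def translate_z(t):
    """4x4 transform translating the camera back by t along +z."""
    m = np.eye(4)
    m[2, 3] = t
    return m

def rotate_x(phi):
    """4x4 rotation about the x axis (elevation), phi in radians."""
    m = np.eye(4)
    m[1, 1], m[1, 2] = np.cos(phi), -np.sin(phi)
    m[2, 1], m[2, 2] = np.sin(phi), np.cos(phi)
    return m

def rotate_y(theta):
    """4x4 rotation about the y axis (azimuth), theta in radians."""
    m = np.eye(4)
    m[0, 0], m[0, 2] = np.cos(theta), np.sin(theta)
    m[2, 0], m[2, 2] = -np.sin(theta), np.cos(theta)
    return m

def pose_spherical(theta_deg, phi_deg, radius):
    """Camera-to-world pose on a sphere of the given radius, looking at the origin."""
    c2w = translate_z(radius)
    c2w = rotate_x(np.deg2rad(phi_deg)) @ c2w
    c2w = rotate_y(np.deg2rad(theta_deg)) @ c2w
    return c2w

# e.g. 40 poses orbiting at -30 degrees elevation for a turntable video
poses = [pose_spherical(t, -30.0, 4.0) for t in np.linspace(-180, 180, 40, endpoint=False)]
```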
## Future Plans

In the future we plan on implementing/changing:

- Factoring out more repetitive/redundant code, and optimizing GPU memory usage and rays per second (rps)
- Cleaning up and expanding the mesh extraction code
- Zoomed poses for the multicam dataset
- Mip-NeRF 360: Unbounded Anti-Aliased Neural Radiance Fields support
- NeRV: Neural Reflectance and Visibility Fields for Relighting and View Synthesis support
## Installation/Running

- Create a conda environment using `mipNeRF.yml`.
- Get the training data:
  - Run `bash scripts/download_data.sh` to download all 3 datasets: LLFF, Blender, and Multicam.
  - Or run the bash script corresponding to an individual dataset:
    - `bash scripts/download_llff.sh` to download LLFF
    - `bash scripts/download_blender.sh` to download Blender
    - `bash scripts/download_multicam.sh` to download Multicam (note this will also download the Blender dataset, since Multicam is derived from it)
- Optionally change config parameters: you can change the default parameters in `config.py` or specify them with command line arguments.
  - The default config is set up to run on a high-end consumer-level graphics card (~8-12 GB).
- Run `python train.py` to train.
  - Run `python -m tensorboard.main --logdir=log` to start the tensorboard.
- Run `python visualize.py` to render a video from the trained model.
- Run `python extract_mesh.py` to render a mesh from the trained model (see the marching cubes sketch below).
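A common recipe for mesh extraction from a trained NeRF (not necessarily what `extract_mesh.py` does internally) is to sample the model's density on a regular 3D grid and run marching cubes on it. Below is a minimal sketch; `query_density`, the scene bounds, and the iso level are all assumptions for illustration.

```python
import numpy as np
import torch
from skimage import measure  # pip install scikit-image

@torch.no_grad()
def density_grid(query_density, bound=1.2, resolution=256, chunk=65536, device="cuda"):
    """Sample a scalar density field on a regular 3D grid.

    query_density: callable mapping (N, 3) xyz points -> (N,) densities.
    This callable is a hypothetical stand-in for however the trained
    model exposes sigma.
    """
    t = np.linspace(-bound, bound, resolution, dtype=np.float32)
    pts = np.stack(np.meshgrid(t, t, t, indexing="ij"), axis=-1).reshape(-1, 3)
    sigmas = []
    for i in range(0, pts.shape[0], chunk):  # chunked to bound GPU memory
        p = torch.from_numpy(pts[i:i + chunk]).to(device)
        sigmas.append(query_density(p).float().cpu().numpy())
    return np.concatenate(sigmas).reshape(resolution, resolution, resolution)

def extract_mesh(sigma_grid, iso_level=50.0):
    """Run marching cubes at a chosen density threshold (iso level is scene dependent)."""
    verts, faces, normals, _ = measure.marching_cubes(sigma_grid, level=iso_level)
    return verts, faces, normals
```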
## Code Structure
I explain the specifics of the code in more detail here, but here's a basic rundown.
- `config.py`: Specifies hyperparameters.
- `datasets.py`: Base generic `Dataset` class + 3 default dataset implementations.
  - `NeRFDataset`: Base class that all datasets should inherit from.
  - `Multicam`: Used for multicam data as in the original mip-NeRF paper.
  - `Blender`: Used for the synthetic dataset as in the original NeRF.
  - `LLFF`: Used for the LLFF dataset as in the original NeRF.
- `loss.py`: mip-NeRF loss; pretty much just MSE, but it also calculates PSNR (see the sketch after this list).
- `model.py`: mip-NeRF model; not as modular as the original authors' version, but easier to understand its structure when laid out verbatim like this.
- `pose_utils.py`: Various functions used to generate poses.
- `ray_utils.py`: Various functions involving the rays that the model uses as input; most are used within the model's forward function.
- `scheduler.py`: mip-NeRF learning rate scheduler.
- `train.py`: Trains a mip-NeRF model.
- `visualize.py`: Creates the videos using a trained mip-NeRF.
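To make the `loss.py` note above concrete: the mip-NeRF objective is just MSE on the rendered colors at the coarse and fine levels, with the coarse term downweighted (0.1 in the paper), and PSNR falls out of the MSE for free. A minimal sketch, where the variable names are mine rather than the repo's:

```python
import torch

def mse_to_psnr(mse: torch.Tensor) -> torch.Tensor:
    """PSNR for colors in [0, 1]: -10 * log10(MSE)."""
    return -10.0 * torch.log10(mse)

def mip_nerf_loss(rgb_coarse, rgb_fine, target, coarse_weight=0.1):
    """MSE on both levels; mip-NeRF downweights the coarse level (0.1 in the paper)."""
    mse_coarse = torch.mean((rgb_coarse - target) ** 2)
    mse_fine = torch.mean((rgb_fine - target) ** 2)
    loss = coarse_weight * mse_coarse + mse_fine
    return loss, mse_to_psnr(mse_coarse), mse_to_psnr(mse_fine)
```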
## mip-NeRF Summary
Here's a summary of how NeRF and mip-NeRF work that I wrote when originally writing this.
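The key idea separating mip-NeRF from NeRF is integrated positional encoding (IPE): each sample is a Gaussian over a conical frustum rather than a point, and the expected encoding has a closed form, E[sin(x)] = sin(μ)·exp(−σ²/2) for x ~ N(μ, σ²), and analogously for cos. A minimal sketch with a diagonal covariance, as in the paper (the function name and arguments are illustrative):

```python
import torch

def integrated_pos_enc(means, diag_cov, min_deg=0, max_deg=16):
    """Integrated positional encoding from mip-NeRF.

    means:    (..., 3) Gaussian means for each conical frustum sample
    diag_cov: (..., 3) diagonal of the Gaussian covariance
    Uses E[sin(x)] = sin(mu) * exp(-var / 2) for x ~ N(mu, var).
    """
    scales = 2.0 ** torch.arange(min_deg, max_deg, dtype=torch.float32,
                                 device=means.device)
    # Scale means by 2^l and variances by 4^l for each frequency l
    scaled_mean = (means[..., None, :] * scales[:, None]).reshape(*means.shape[:-1], -1)
    scaled_var = (diag_cov[..., None, :] * scales[:, None] ** 2).reshape(*means.shape[:-1], -1)
    damping = torch.exp(-0.5 * scaled_var)  # high-frequency terms shrink toward zero
    return torch.cat([torch.sin(scaled_mean) * damping,
                      torch.cos(scaled_mean) * damping], dim=-1)
```

The damping term is what gives mip-NeRF its anti-aliasing: for wide frustums (large variance), high-frequency components are smoothly attenuated instead of aliasing.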
## Results
<sub><sup>All PSNRs are average PSNR (coarse + fine).</sup></sub>
### LLFF - Trex
<div> <img src="https://user-images.githubusercontent.com/42706447/173477393-8b93a3f8-3624-4826-a67c-82923d03ea34.png" alt="pic0" width="49%"> <img src="https://user-images.githubusercontent.com/42706447/173477391-1f932ca3-6456-4af5-b041-bf63dbbed68a.png" alt="pic1" width="49%"> </div> <div> <img src="https://user-images.githubusercontent.com/42706447/173477394-9ab07f60-58b9-4311-8aba-c052412b4f68.png" alt="pic2" width="49%"> <img src="https://user-images.githubusercontent.com/42706447/173477395-d69bdb34-ea6e-43de-8315-88c6f5e251e7.png" alt="pic3" width="49%"> </div> <br> Video: <br> <br> Depth: <br> <br> Normals: <br>Blender - Lego
<div> <img src="https://user-images.githubusercontent.com/42706447/173477588-a4d0034d-b8e5-4ea2-9459-5fff3e6b1cde.png" alt="pic0" width="49%"> <img src="https://user-images.githubusercontent.com/42706447/173477593-d23a9603-b6b5-4d4f-9a2b-dcfd0d646dbc.png" alt="pic1" width="49%"> </div>
<div> <img src="https://user-images.githubusercontent.com/42706447/173477594-ee6e5dda-b704-4403-9433-ee93bf2a8154.png" alt="pic2" width="49%"> <img src="https://user-images.githubusercontent.com/42706447/173477595-2f0e2d88-e241-4ddc-809d-927c6e01c881.png" alt="pic3" width="49%"> </div>

Video:

Depth:

Normals:

### Multicam - Mic
<div> <img src="https://user-images.githubusercontent.com/42706447/173477781-2c48d8e0-b0e5-4cd4-9599-cc0336333b30.png" alt="pic0" width="49%"> <img src="https://user-images.githubusercontent.com/42706447/173477778-9fd4c802-e0b2-4e0b-bc31-6f27abc92c87.png" alt="pic1" width="49%"> </div>
<div> <img src="https://user-images.githubusercontent.com/42706447/173477782-ec40bc91-1da7-49d2-b65b-b3250f34a8fc.png" alt="pic2" width="49%"> <img src="https://user-images.githubusercontent.com/42706447/173477784-8dfa7bc7-7122-40ed-855a-0081a593f1ce.png" alt="pic3" width="49%"> </div>

Video:

Depth:

Normals:

## References/Contributions
- Thanks to Nina for helping with the code
- Original NeRF Code in Tensorflow
- NeRF Project Page
- NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
- Original mip-NeRF Code in JAX
- mip-NeRF Project Page
- Mip-NeRF: A Multiscale Representation for Anti-Aliasing Neural Radiance Fields
- nerf_pl