UniMVSNet is a learning-based multi-view stereo model with a unified depth representation that both achieves sub-pixel depth estimation and constrains the cost volume directly. To exploit the potential of this representation, we designed a Unified Focal Loss that handles sample imbalance more reasonably and uniformly. Details are described in our paper:
<p align="center"> <img src="./.github/images/sample.png" width="100%"/> </p>
Rethinking Depth Estimation for Multi-View Stereo: A Unified Representation
Rui Peng, Rongjie Wang, Zhenyu Wang, Yawen Lai, Ronggang Wang
CVPR 2022 (arxiv)
UniMVSNet is more robust in challenging regions and generates more accurate depth maps. The resulting point clouds are more complete, with finer details.
If you find any errors in our code, please feel free to raise them with us.
⚙ Setup
1. Recommended environment
- PyTorch 1.2
- Python 3.6
2. DTU Dataset
Training Data. We adopt the full-resolution ground-truth depth provided in CasMVSNet or MVSNet. Download DTU training data and Depth raw.
Unzip them and put the Depth_raw folder into the dtu_training folder. The structure is just like:
├── Cameras
├── Depths
├── Depths_raw
└── Rectified
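Before training, it can help to confirm that the folders ended up where the tree above shows them. The following is a minimal sketch and not part of the repository: the helper name `missing_subfolders` is hypothetical, and the expected names are simply those listed in the tree.

```python
from pathlib import Path

# Sub-folders listed in the tree above for the DTU training set.
DTU_TRAINING_SUBFOLDERS = ("Cameras", "Depths", "Depths_raw", "Rectified")


def missing_subfolders(root, expected=DTU_TRAINING_SUBFOLDERS):
    """Return the expected sub-folder names that are absent under `root`."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_dir()]


# Hypothetical usage:
#   missing = missing_subfolders("<your dtu_training path>")
#   print("Missing folders:", missing or "none")
```

The same helper can be pointed at another dataset by passing a different `expected` tuple.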
Testing Data. Download DTU testing data and unzip it. The structure is just like:
├── Cameras
├── scan1
├── scan2
├── ...
3. BlendedMVS Dataset
Training Data and Validation Data. Download BlendedMVS and unzip it. We only use BlendedMVS for fine-tuning and do not test on it. The structure is just like:
├── 5a0271884e62597cdee0d0eb
├── 5a3ca9cb270f0e3f14d0eddb
├── ...
├── training_list.txt
├── ...
4. Tanks and Temples Dataset
Testing Data. Download Tanks and Temples and unzip it. Here, we adopt the camera parameters of the short depth range version (included in your download), so you should manually replace the cams folder in the intermediate folder with the short depth range version. The structure is just like:
├── advanced
│ ├── Auditorium
│ ├── ...
└── intermediate
   ├── Family
   ├── ...
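The manual replacement of the cams folder can be scripted. The sketch below is only an illustration under stated assumptions: `replace_cams` is a hypothetical helper, and you must pass it the scene folder and the location of the short depth range camera files as they appear in your own download. It renames the existing cams folder instead of deleting it, so the step can be undone.

```python
import shutil
from pathlib import Path


def replace_cams(scene_dir, short_range_cams_dir, backup_suffix="_original"):
    """Swap `<scene_dir>/cams` for a copy of `short_range_cams_dir`.

    The existing folder is renamed rather than deleted so it can be restored.
    """
    scene_dir = Path(scene_dir)
    source = Path(short_range_cams_dir)
    if not source.is_dir():
        raise FileNotFoundError(f"short depth range cameras not found: {source}")

    cams = scene_dir / "cams"
    if cams.exists():
        backup = scene_dir / f"cams{backup_suffix}"
        if backup.exists():
            raise FileExistsError(f"backup already exists: {backup}")
        cams.rename(backup)  # keep the original camera files
    shutil.copytree(source, cams)
    return cams
```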
📊 Testing
1. Download models
Download our pretrained model and put it in <your model path>.
Note: unimvsnet_blendedmvs is the model trained on DTU and then fine-tuned on BlendedMVS, and we only used it to test on the Tanks and Temples dataset. More details can be found in our paper.
2. DTU testing
Fusibile installation. Since we adopt Gipuma to filter and fuse the points on the DTU dataset, you need to install Fusibile first. Download fusibile to <your fusibile path> and execute the following commands:
cd <your fusibile path>
cmake .
make
If nothing goes wrong, you will get an executable named fusibile. Most errors are caused by a mismatch between the GPU compute capability the build targets and the one your GPU supports.
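When the build targets the wrong architecture, it helps to know which compute capability your GPU reports. The sketch below formats the nvcc `-gencode` option for a given capability and, if PyTorch is installed, reads the capability of GPU 0. Both helper names are hypothetical, and where this option is set in the Fusibile build files is an assumption you should verify in your own checkout.

```python
def gencode_flag(major, minor):
    """Format the nvcc -gencode option for compute capability `major.minor`."""
    arch = f"{major}{minor}"
    return f"-gencode arch=compute_{arch},code=sm_{arch}"


def detect_capability():
    """Return (major, minor) of GPU 0 via PyTorch, or None if unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)


capability = detect_capability()
if capability is not None:
    # Compare this with the architecture named in the build configuration.
    print(gencode_flag(*capability))
```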
Point generation. To reproduce the results from our paper, you need to set datapath to <your dtu_testing path>, outdir to <your output save path>, resume to <your model path>, and fusibile_exe_path to <your fusibile path>/fusibile in the shell file ./scripts/ first, and then run:
bash ./scripts/
Note that we use the unimvsnet_dtu checkpoint when testing on DTU.
Point testing. You need to move the point clouds generated for each scene into a single folder named dtu_points. You also need to rename each point cloud to the mvsnet001_l3.ply format (the middle three digits are the scene number). Then specify dataPath, plyPath and resultsPath in ./dtu_eval/BaseEvalMain_web.m and ./dtu_eval/ComputeStat_web.m. Finally, run ./dtu_eval/BaseEvalMain_web.m in MATLAB to evaluate the DTU point clouds scene by scene, then run ./dtu_eval/ComputeStat_web.m to get the average metrics for the entire dataset.
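The move-and-rename step can be automated. The following is a sketch under an explicit assumption about the output layout: each scene folder inside the output directory is named scan<N> and contains exactly one .ply file somewhere below it. If your output looks different, adapt the pattern. The helper names `dtu_point_name` and `collect_points` are hypothetical.

```python
import re
import shutil
from pathlib import Path


def dtu_point_name(scan_id):
    """Build the file name expected by the evaluation, e.g. 1 -> mvsnet001_l3.ply."""
    return f"mvsnet{scan_id:03d}_l3.ply"


def collect_points(outdir, dest="dtu_points"):
    """Copy one .ply per `scan<N>` folder in `outdir` into `dest`, renamed.

    Assumption: each scene folder is named scan<N> and holds exactly one
    .ply file somewhere below it.
    """
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for scene in sorted(Path(outdir).iterdir()):
        match = re.fullmatch(r"scan(\d+)", scene.name)
        if not (scene.is_dir() and match):
            continue  # skip anything that is not a scene folder
        plys = sorted(scene.rglob("*.ply"))
        if len(plys) != 1:
            raise RuntimeError(f"expected one .ply in {scene}, found {len(plys)}")
        target = dest / dtu_point_name(int(match.group(1)))
        shutil.copyfile(plys[0], target)
        copied.append(target)
    return copied
```

Copying rather than moving leaves the original outputs untouched, at the cost of extra disk space.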
3. Tanks and Temples testing
Point generation. Similarly, you need to specify datapath, outdir and resume in the shell file, and then run:
bash ./scripts/
Note that we use the unimvsnet_blendedmvs checkpoint when testing on Tanks and Temples.
Point testing. You need to upload the generated points to the Tanks and Temples benchmark, which will return the test results within a few hours.
We adopt dynamic geometric consistency checking strategies to filter and fuse the Tanks and Temples point clouds. We also consider the photometric constraints of all stages, as in VisMVSNet. The configuration of scenes in ./filter/ is the closest we get to reproducing our baseline.
📦 DTU points
You can download our precomputed DTU point clouds from the following link:
<table align="center">
  <tr align="center">
    <td>Points</td>
    <td>Confidence Threshold</td>
    <td>Consistent View</td>
    <td>Accuracy↓</td>
    <td>Completeness↓</td>
    <td>Overall↓</td>
  </tr>
  <tr align="center">
    <td><a href="">dtu_points</a></td>
    <td>0.3</td>
    <td>3</td>
    <td>0.352</td>
    <td>0.278</td>
    <td>0.315</td>
  </tr>
</table>
🖼 Visualization
To visualize the depth map in pfm format, run:
python --vis --depth_path <your depth path> --depth_img_save_dir <your depth image save directory>
The visualized depth map will be saved as <your depth image save directory>/depth.png. For visualization of point clouds, existing software such as MeshLab can be used.
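If you want to inspect a pfm depth map in your own code, the format is simple enough to read with the standard library. The sketch below is not the repository's visualization code. It assumes a single-channel PFM file (magic `Pf`), and the helper names `read_pfm` and `to_uint8` are hypothetical. Writing the rescaled values to an image file is left to whichever image library you prefer.

```python
import re
import struct


def read_pfm(path):
    """Read a single-channel PFM file into a list of rows (top row first)."""
    with open(path, "rb") as f:
        magic = f.readline().decode("ascii").strip()
        if magic != "Pf":
            raise ValueError("expected a single-channel PFM (magic 'Pf')")
        dims = re.fullmatch(r"(\d+)\s+(\d+)", f.readline().decode("ascii").strip())
        if dims is None:
            raise ValueError("malformed PFM header")
        width, height = int(dims.group(1)), int(dims.group(2))
        scale = float(f.readline().decode("ascii").strip())
        endian = "<" if scale < 0 else ">"  # a negative scale marks little-endian data
        values = struct.unpack(f"{endian}{width * height}f", f.read(4 * width * height))
    rows = [list(values[r * width:(r + 1) * width]) for r in range(height)]
    rows.reverse()  # PFM stores the bottom row first
    return rows


def to_uint8(rows):
    """Linearly rescale depth values to 0..255 for display."""
    flat = [v for row in rows for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid dividing by zero on a constant map
    return [[round(255 * (v - lo) / span) for v in row] for row in rows]
```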
⏳ Training
1. DTU training
To train the model from scratch on DTU, specify datapath and log_dir in ./scripts/ and then run:
bash ./scripts/
By default, we train our model in DistributedDataParallel mode; you can also train it on a single GPU.
2. BlendedMVS fine-tuning
To fine-tune the model on BlendedMVS, you need to specify datapath and log_dir in ./scripts/ first, and then run:
bash ./scripts/
You can also train the model on BlendedMVS from scratch, as some other methods do, by removing the resume option.
⚖ Citation
If you find our work useful in your research please consider citing our paper:
title = {Rethinking Depth Estimation for Multi-View Stereo: A Unified Representation},
author = {Peng, Rui and Wang, Rongjie and Wang, Zhenyu and Lai, Yawen and Wang, Ronggang},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2022}
👩 Acknowledgements
Thanks to MVSNet, MVSNet_pytorch and CasMVSNet.