BSP-NET-pytorch

PyTorch 1.2 implementation of BSP-Net: Generating Compact Meshes via Binary Space Partitioning, by Zhiqin Chen, Andrea Tagliasacchi, and Hao (Richard) Zhang.

Inference notebook by Axel Sparr: Open In Colab

Paper | Oral video | Project page

<img src='img/teaser.png' />

Other Implementations

The differences between the original and the other implementations are:

Citation

If you find our work useful in your research, please consider citing:

@article{chen2020bspnet,
  title={BSP-Net: Generating Compact Meshes via Binary Space Partitioning},
  author={Zhiqin Chen and Andrea Tagliasacchi and Hao Zhang},
  journal={Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2020}
}

Dependencies

Requirements:

Datasets and pre-trained weights

The original voxel models are from HSP. The rendered views are from 3D-R2N2.

Since our network takes point-value pairs, the voxel models require further sampling.

For data preparation, please see point_sampling in IM-NET.

We provide the ready-to-use datasets in hdf5 format.

Backup links:

We also provide the pre-trained network weights.

Backup links:

Usage

First, make sure you have Cython installed, then build the bspt module with:

python setup.py build_ext --inplace

The bspt module recovers meshes from BSP-trees. If you fail to build the module, you can replace "from bspt import ..." with "from bspt_slow import ..." in all source files. bspt_slow.py is written in pure Python and is slower than the Cython implementation.
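Instead of editing every import by hand, the fallback can be expressed once. The following is a minimal sketch, not part of the repository: the helper name `import_first_available` is hypothetical, and it assumes only that both modules are importable under the names `bspt` and `bspt_slow` and expose the same functions.

```python
import importlib


def import_first_available(module_names):
    """Return the first module in module_names that imports successfully."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Module is missing or failed to build; try the next candidate.
            continue
    raise ImportError(
        "none of these modules could be imported: " + ", ".join(module_names)
    )


if __name__ == "__main__":
    try:
        # Prefer the compiled Cython module, fall back to the pure-Python one.
        bspt = import_first_available(["bspt", "bspt_slow"])
        print("using", bspt.__name__)
    except ImportError as err:
        # Reached when the script is not run from the repository root.
        print(err)
```

With this in place, the functions would be accessed as attributes of the returned module rather than through a "from ... import ..." statement.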

Please use the provided scripts train_ae.sh, train_svr.sh, test_ae.sh, test_svr.sh to train the network on the training set and get output meshes for the testing set.

To train an autoencoder, use the following commands for progressive training.

python main.py --ae --train --phase 0 --iteration 8000000 --sample_dir samples/all_vox256_img0_16 --sample_vox_size 16
python main.py --ae --train --phase 0 --iteration 8000000 --sample_dir samples/all_vox256_img0_32 --sample_vox_size 32
python main.py --ae --train --phase 0 --iteration 16000000 --sample_dir samples/all_vox256_img0_64 --sample_vox_size 64
python main.py --ae --train --phase 1 --iteration 16000000 --sample_dir samples/all_vox256_img1 --sample_vox_size 64

The above commands train the AE model for 8000000 iterations at 16<sup>3</sup> resolution, 8000000 iterations at 32<sup>3</sup> resolution, and 16000000 iterations at 64<sup>3</sup> resolution in phase 0 (continuous phase), and then for 16000000 iterations at 64<sup>3</sup> resolution in phase 1 (discrete phase).
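The progressive schedule can also be kept as data and turned into commands, which makes it easier to change a single stage. This is a sketch under the assumption that it is run from the repository root; `SCHEDULE`, `build_train_commands`, and `run_schedule` are hypothetical names, and the flag values are copied from the commands above.

```python
import subprocess

# (phase, iterations, sample_dir, sample_vox_size), copied from the commands above.
SCHEDULE = [
    (0, 8000000, "samples/all_vox256_img0_16", 16),
    (0, 8000000, "samples/all_vox256_img0_32", 32),
    (0, 16000000, "samples/all_vox256_img0_64", 64),
    (1, 16000000, "samples/all_vox256_img1", 64),
]


def build_train_commands(schedule):
    """Build one argument list per training stage."""
    commands = []
    for phase, iteration, sample_dir, vox_size in schedule:
        commands.append([
            "python", "main.py", "--ae", "--train",
            "--phase", str(phase),
            "--iteration", str(iteration),
            "--sample_dir", sample_dir,
            "--sample_vox_size", str(vox_size),
        ])
    return commands


def run_schedule(schedule, dry_run=True):
    """Print each command; launch it only when dry_run is False."""
    for cmd in build_train_commands(schedule):
        print(" ".join(cmd))
        if not dry_run:
            # Stop at the first failing stage instead of training on top of it.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_schedule(SCHEDULE, dry_run=True)
```

The default is a dry run, so the script only prints the four commands unless `dry_run=False` is passed.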

After training on each resolution, you may visualize some results from the testing set.

python main.py --ae --phase 0 --sample_dir samples/all_vox256_img0_16 --start 0 --end 16
python main.py --ae --phase 0 --sample_dir samples/all_vox256_img0_32 --start 0 --end 16
python main.py --ae --phase 0 --sample_dir samples/all_vox256_img0_64 --start 0 --end 16
python main.py --ae --phase 1 --sample_dir samples/all_vox256_img1 --start 0 --end 16

You can specify the start and end indices of the shapes with --start and --end. Note that --phase must match the phase the model was trained on.

To train the network for single-view reconstruction, after training the autoencoder, use the following command to extract the latent codes:

python main.py --ae --getz

Then use the following commands to train the SVR model and get some samples:

python main.py --svr --train --epoch 1000 --sample_dir samples/all_vox256_img2
python main.py --svr --sample_dir samples/all_vox256_img2 --start 0 --end 16

Training an AE model and then an SVR model on the 13 categories takes about 5 days on one GeForce RTX 2080 Ti GPU.

Training options

You can use --phase N to specify which phase the network will be trained on.

You can train the network on phase 0 → phase 1 or phase 0 → phase 2 or phase 0 → phase 3 or phase 0 → phase 4.

Testing options

You can use different testing functions provided in main.py:

There is an optional post-processing step that removes convexes lying inside the shape. Removing those "inside" convexes has very little impact on the visual appearance of the shapes (because they are not visible), which is reflected by the Light Field Distance (2939.15 → 2938.31). The impact on Chamfer Distance is more significant (0.001432 → 0.001455). In the function test_mesh_point those "inside" convexes are kept; in the other testing functions they are removed. Check the code for implementation details.
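To put the two changes on a common scale, the numbers quoted above can be converted to relative changes. The short calculation below uses only those four values; the function name `relative_change_percent` is illustrative.

```python
def relative_change_percent(before, after):
    """Signed change from before to after, as a percentage of before."""
    return (after - before) / before * 100.0


# Values quoted in the paragraph above (with "inside" convexes -> without).
lfd_change = relative_change_percent(2939.15, 2938.31)
cd_change = relative_change_percent(0.001432, 0.001455)

print(f"Light Field Distance: {lfd_change:+.3f}%")
print(f"Chamfer Distance:     {cd_change:+.3f}%")
```

The Light Field Distance moves by roughly -0.03%, while the Chamfer Distance moves by roughly +1.6%, which is why the latter is described as the more significant effect.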

Code for 2D experiments

Please find the code in bsp_2d.

Evaluation

Our code for computing Chamfer Distance and Normal Consistency can be found at evaluation.
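For readers unfamiliar with the metric, here is a minimal brute-force sketch of one common Chamfer Distance definition: the mean squared nearest-neighbor distance from each set to the other, summed over both directions. This is an assumption for illustration only. The evaluation code may differ in whether distances are squared, how they are normalized, and how points are sampled from the meshes, so use that code for any reported numbers.

```python
def _nearest_sq_dist(point, others):
    """Squared Euclidean distance from point to its nearest neighbor in others."""
    return min(
        sum((p - q) ** 2 for p, q in zip(point, other))
        for other in others
    )


def chamfer_distance(points_a, points_b):
    """Symmetric Chamfer Distance between two non-empty point sets.

    Assumed definition: mean squared nearest-neighbor distance from A to B
    plus the same quantity from B to A. Runs in O(len(A) * len(B)) time.
    """
    a_to_b = sum(_nearest_sq_dist(a, points_b) for a in points_a) / len(points_a)
    b_to_a = sum(_nearest_sq_dist(b, points_a) for b in points_b) / len(points_b)
    return a_to_b + b_to_a


if __name__ == "__main__":
    set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    set_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]
    print(chamfer_distance(set_a, set_b))
```

The pure-Python loops are only practical for small point sets; a vectorized or KD-tree based nearest-neighbor search would be the usual choice for thousands of sampled points.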

The Light Field Distance (LFD) is produced by LightField descriptor.

Note that the code for LightField descriptor is written in C and the executable only does closest shape retrieval according to the Light Field Distance. If you want to use it in your own experiments, you might need to change some lines to get the actual distance and recompile the code.

License

This project is licensed under the terms of the MIT license (see LICENSE for details).