# IRS: A Large Naturalistic Indoor Robotics Stereo Dataset to Train Deep Models for Disparity and Surface Normal Estimation
## Introduction
IRS is an open dataset for indoor robotics vision tasks, especially disparity and surface normal estimation. It contains 103,316 samples in total, covering a wide range of indoor scenes such as homes, offices, stores, and restaurants.
<img src="/imgs/left.png" width="100%" > | <img src="/imgs/right.png" width="100%" > |
---|---|
Left image | Right image |
<img src="/imgs/disparity.png" width="100%" > | <img src="/imgs/normal.png" width="100%" > |
Disparity map | Surface normal map |
## Overview of IRS
Rendering Characteristic | Options |
---|---|
indoor scene class | home(31145), office(43417), restaurant(22058), store(6696) |
object class | desk, chair, sofa, glass, mirror, bed, bedside table, lamp, wardrobe, etc. |
brightness | over-exposure(>1300), darkness(>1700) |
light behavior | bloom(>1700), lens flare(>1700), glass transmission(>3600), mirror reflection(>3600) |
Below we give samples of different indoor scene characteristics. The virtual stereo camera in UE4 uses the following parameters:
- Baseline: 0.1 meter
- Focal Length: 480 for both the x-axis and y-axis.
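With this baseline and focal length, metric depth follows from disparity via the standard pinhole relation Z = f·B/d. A minimal sketch (the helper name is ours, not part of the dataset tools):

```python
def disparity_to_depth(disparity_px, focal_px=480.0, baseline_m=0.1):
    """Depth in meters from disparity in pixels: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px
```

For example, with these camera parameters a disparity of 48 pixels corresponds to a depth of 1 meter.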
<img src="/imgs/home.png" width="100%" > | <img src="/imgs/office.png" width="100%"> | <img src="/imgs/restaurant.png" width="100%" > |
---|---|---|
Home | Office | Restaurant |
<img src="/imgs/normal_light.png" width="100%" > | <img src="/imgs/over_exposure.png" width="100%" > | <img src="/imgs/dark.png" width="100%" > |
Normal light | Over exposure | Darkness |
<img src="/imgs/glass.png" width="100%" > | <img src="/imgs/mirror.png" width="100%" > | <img src="/imgs/metal.png" width="100%" > |
Glass | Mirror | Metal |
## Network Structure of DTN-Net
We designed a novel deep model, DTN-Net, to predict the surface normal map by refining an initial estimate transformed from the predicted disparity. DTN-Net (Disparity To Normal Network) comprises two modules, RD-Net and NormNetS. First, RD-Net predicts the disparity map for the input stereo images. Then we apply the disparity-to-normal transformation from GeoNet, denoted D2N Transform, to produce the initial coarse normal map. Finally, NormNetS takes the stereo images, the disparity map predicted by RD-Net, and the initial normal map as input and predicts the final normal map. The structure of NormNetS is similar to DispNetS, except that the final convolution layer outputs three channels instead of one, as each pixel's normal has three components (x, y, z).
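The D2N Transform itself is defined in GeoNet; purely as an illustration, a simplified disparity-to-normal sketch (our own approximation, not the paper's implementation) can derive a coarse normal map from depth gradients:

```python
import numpy as np

def coarse_normals_from_disparity(disp, focal_px=480.0, baseline_m=0.1):
    # Depth from disparity: Z = f * B / d (clamped to avoid division by zero)
    depth = focal_px * baseline_m / np.maximum(disp, 1e-6)
    # np.gradient returns derivatives along rows (y) and columns (x)
    dz_dy, dz_dx = np.gradient(depth)
    # Stack (-dZ/dx, -dZ/dy, 1) per pixel, then normalize to unit length
    normals = np.dstack((-dz_dx, -dz_dy, np.ones_like(depth)))
    return normals / np.linalg.norm(normals, axis=2, keepdims=True)
```

A fronto-parallel surface (constant disparity) yields normals pointing along the camera axis, (0, 0, 1).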
<div align="center"> <img src="/imgs/DTN-Net.png" width="95%" > </div>

## Paper
Q. Wang<sup>*,1</sup>, S. Zheng<sup>*,1</sup>, Q. Yan<sup>*,2</sup>, F. Deng<sup>2</sup>, K. Zhao<sup>†,1</sup>, X. Chu<sup>†,1</sup>.
IRS: A Large Naturalistic Indoor Robotics Stereo Dataset to Train Deep Models for Disparity and Surface Normal Estimation. [preprint]
<font size=2> * indicates equal contribution. † indicates corresponding authors.<br> <sup>1</sup>Department of Computer Science, Hong Kong Baptist University. <sup>2</sup>School of Geodesy and Geomatics, Wuhan University. </font>

## Download
You can use the following OneDrive link to download our dataset.
OneDrive: <a href="https://1drv.ms/f/s!AmN7U9URpGVGem0coY8PJMHYg0g?e=nvH5oB" target="_blank">https://1drv.ms/f/s!AmN7U9URpGVGem0coY8PJMHYg0g?e=nvH5oB</a>
## Video Demonstration
<!-- <iframe width="560" height="315" src="https://www.youtube.com/embed/jThNQFHNU_s" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe> -->

## Usage
### Dependencies
- Python 3.7
- PyTorch 1.6.0+
- torchvision 0.5.0+
- CUDA 10.1 (https://developer.nvidia.com/cuda-downloads)
### Install
We recommend using conda for installation:

```shell
conda env create -f environment.yml
```
Install dependencies:

```shell
cd layers_package
./install.sh

# install OpenEXR (https://www.openexr.com/)
sudo apt-get update
sudo apt-get install openexr
```
### Dataset
Download IRS dataset from https://1drv.ms/f/s!AmN7U9URpGVGem0coY8PJMHYg0g?e=nvH5oB (OneDrive).
Extract the zip files and place them in the correct folders:

```
data
└── IRSDataset
    ├── Home
    ├── Office
    ├── Restaurant
    └── Store
```
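A quick sanity check of this layout (a hypothetical helper, not part of the repository):

```python
import os

def missing_irs_folders(root="data/IRSDataset"):
    # Report which of the four scene folders are absent under root
    expected = ["Home", "Office", "Restaurant", "Store"]
    return [d for d in expected if not os.path.isdir(os.path.join(root, d))]
```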
### Pretrained Model
"FT3D" denotes FlyingThings3D.
Model | FT3D | IRS | FT3D+IRS
---|---|---|---
FADNet | fadnet-ft3d.pth | fadnet-irs.pth | fadnet-ft3d-irs.pth
GwcNet | gwcnet-ft3d.pth | gwcnet-irs.pth | gwcnet-ft3d-irs.pth
Dataset | DTN-Net | DNF-Net | NormNetS
---|---|---|---
IRS | dtonnet-irs.pth | dnfusionnet-irs.pth | normnets-irs.pth
### Train
Training configurations are provided in the "exp_configs" folder; you can create your own configuration file following these samples.
For example, the following configuration trains a DispNormNet on the IRS dataset:
/exp_configs/dtonnet.conf

```
net=dispnormnet
loss=loss_configs/dispnetcres_irs.json
outf_model=models/${net}-irs
logf=logs/${net}-irs.log
lr=1e-4
devices=0,1,2,3
dataset=irs #sceneflow, irs, sintel
trainlist=lists/IRSDataset_TRAIN.list
vallist=lists/IRSDataset_TEST.list
startR=0
startE=0
endE=10
batchSize=16
maxdisp=-1
model=none
```
Then specify the configuration in "train.sh":
/train.sh

```shell
dnn="${dnn:-dispnormnet}"
source exp_configs/$dnn.conf
python main.py --cuda --net $net --loss $loss --lr $lr \
    --outf $outf_model --logFile $logf \
    --devices $devices --batch_size $batchSize \
    --dataset $dataset --trainlist $trainlist --vallist $vallist \
    --startRound $startR --startEpoch $startE --endEpoch $endE \
    --model $model \
    --maxdisp $maxdisp \
    --manualSeed 1024
```
Lastly, use the following command to start training:

```shell
./train.sh
```
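Because train.sh reads the config name with bash default expansion (`${dnn:-dispnormnet}`), a different config can be selected from the command line without editing the script, e.g. `dnn=dtonnet ./train.sh`. The expansion behaves as follows:

```shell
# ${dnn:-dispnormnet} uses $dnn when set, otherwise falls back to "dispnormnet"
dnn=dtonnet
echo "${dnn:-dispnormnet}"   # prints: dtonnet
unset dnn
echo "${dnn:-dispnormnet}"   # prints: dispnormnet
```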
### Evaluation

The following script evaluates a trained model:
/detech.sh

```shell
dataset=irs
net=dispnormnet
model=models/dispnormnet-irs/model_best.pth
outf=detect_results/${net}-${dataset}/
filelist=lists/IRSDataset_TEST.list
filepath=data
CUDA_VISIBLE_DEVICES=0 python detecter.py --model $model --rp $outf --filelist $filelist --filepath $filepath --devices 0 --net ${net} --disp-on --norm-on
```
Adapt the script to your configuration; results are written to the detect_results folder.
Disparity results are saved in PNG format by default, and normal results are saved in EXR format by default.
To change the output format, modify "detecter.py" to use one of the following save functions:
```python
# png
skimage.io.imsave(filepath, image)
# pfm
save_pfm(filepath, data)
# exr
save_exr(data, filepath)
```
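`save_pfm` and `save_exr` are helpers inside the repository. For reference, the grayscale PFM format used for disparity can be round-tripped with a standalone sketch like this (the `_simple` helper names are ours, not the repo's):

```python
import numpy as np

def save_pfm_simple(path, data):
    # Grayscale PFM: 'Pf' header, "width height", negative scale = little-endian
    with open(path, "wb") as f:
        f.write(b"Pf\n")
        f.write(f"{data.shape[1]} {data.shape[0]}\n".encode())
        f.write(b"-1.0\n")
        # PFM stores scanlines bottom-to-top
        np.flipud(data).astype("<f4").tofile(f)

def load_pfm_simple(path):
    with open(path, "rb") as f:
        assert f.readline().strip() == b"Pf"
        w, h = map(int, f.readline().split())
        scale = float(f.readline().decode().strip())
        dtype = "<f4" if scale < 0 else ">f4"
        data = np.fromfile(f, dtype, w * h).reshape(h, w)
        return np.flipud(data)
```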
### EXR Viewer

For viewing files in EXR format, we recommend a free viewer.
## Contact

Please contact us at qiangwang@comp.hkbu.edu.hk if you have any questions.