
EvisonNet

Simultaneous camera calibration, motion estimation, and depth estimation with an unsupervised method.

FBI WARNING!!

Warning: reference and use with caution. This is unfinished work with many bugs; careless use may blow up.

1. Directory layout<br>

2. Environment<br>

3. Comparison experiments

3.1. depth_from_video_in_the_wild

3.2. SfMLearner

3.3. struct2depth

4. References<br>

[1]. Pyramid Stereo Matching Network (PSMNet). <br>
[2]. TILDE: A Temporally Invariant Learned DEtector. <br>
[3]. Deep Ordinal Regression Network for Monocular Depth Estimation. <br>
[4]. Occlusion-Aware Unsupervised Learning of Monocular Depth, Optical Flow and Camera Pose with Geometric Constraints. Future Internet 10.10 (2018): 92. <br>
[5]. Liu, Qiang, et al. Using Unsupervised Deep Learning Technique for Monocular Visual Odometry. <br>
[6]. DeepCalib: A Deep Learning Approach for Automatic Intrinsic Calibration of Wide Field-of-View Cameras. [keywords: camera calibration, deep learning] <br>
[7]. Depth from Videos in the Wild: Unsupervised Monocular Depth Learning from Unknown Cameras. <br>
[8]. A Large Dataset to Train Convolutional Networks for Disparity, Optical Flow, and Scene Flow Estimation. <br>

5. Other links

[1]. Middlebury dataset. <br>
[2]. KITTI dataset. <br>
[3]. Vision SceneFlow datasets. <br>
[4]. PSMNet explained. <br>
[5]. CASIA (Institute of Automation, Chinese Academy of Sciences) 3D reconstruction dataset. <br>
[6]. SfMLearner (Depth and Ego-Motion) explained. <br>
[7]. OpenMVS. <br>
[8]. OpenMVG. <br>
[9]. CVonline, collection of image datasets. <br>
[10]. VisualData dataset search. <br>
[11]. 360D-zenodo Dataset. <br>
[12]. RGB-D Panorama Dataset. <br>
[13]. Deep Depth Completion of a Single RGB-D Image explained. <br>
[14]. Unsupervised Learning of Depth and Ego-Motion explained. <br>
[15]. Visual Odometry Part II: Matching, Robustness, Optimization, and Applications. <br>
[16]. How to obtain high-quality 3D models from photos. <br>
[17]. tqdm.postfix. <br>
[18]. KITTI_odometry_evaluation_tool. <br>

6. Performance records

Table 1: Performance metrics

| ATE in seq 09 | ATE in seq 10 | Abs Rel | Sq Rel | rms | log_rms | A1 | A2 | A3 | Notes |
|---|---|---|---|---|---|---|---|---|---|
| 0.0160 ± 0.0090 | 0.0130 ± 0.0090 | 0.183 | 1.595 | 6.700 | 0.270 | 0.734 | 0.902 | 0.959 | SfMLearner GitHub<sup>1</sup> |
| 0.0210 ± 0.0170 | 0.0200 ± 0.0150 | 0.208 | 1.768 | 6.856 | 0.283 | 0.678 | 0.885 | 0.957 | SfMLearner paper<sup>2</sup> |
| 0.0179 ± 0.0110 | 0.0141 ± 0.0115 | 0.181 | 1.341 | 6.236 | 0.262 | 0.733 | 0.901 | 0.964 | SfMLearner third-party GitHub<sup>3</sup> |
| 0.0107 ± 0.0062 | 0.0096 ± 0.0072 | 0.2260 | 2.310 | 6.827 | 0.301 | 0.677 | 0.878 | 0.947 | Ours, SfmLearner-Pytorch<sup>4</sup> |
| 0.0312 ± 0.0217 | 0.0237 ± 0.0208 | 0.2330 | 2.4643 | 6.830 | 0.314 | 0.6704 | 0.869 | 0.940 | intri_pred<sup>5</sup> |
| -- | -- | 0.1417 | 1.1385 | 5.5205 | 0.2186 | 0.8203 | 0.9415 | 0.9762 | struct2depth baseline<sup>6</sup> |
| 0.0110 ± 0.0060 | 0.0110 ± 0.0100 | 0.1087 | 0.8250 | 4.7503 | 0.1866 | 0.8738 | 0.9577 | 0.9825 | struct2depth M+R<sup>7</sup> |
| 0.0090 ± 0.0150 | 0.0080 ± 0.0110 | 0.129 | 0.982 | 5.23 | 0.213 | 0.840 | 0.945 | 0.976 | DFV, given intrinsics<sup>8</sup> |
| 0.0120 ± 0.0160 | 0.0100 ± 0.0100 | 0.128 | 0.959 | 5.23 | 0.212 | 0.845 | 0.947 | 0.976 | DFV, learned intrinsics<sup>9</sup> |

Notes for Table 1

  1. The best results given in the README of the GitHub repo accompanying the SfMLearner paper (reference [5]). The authors describe their changes as: added data augmentation, removed BN, some fine-tuning, trained on KITTI data only, and no explainability regularization. These results are partly slightly better than those in the paper.<br>
  2. The best KITTI results reported in the SfMLearner paper (reference [5]).<br>
  3. The best results given on the SfmLearner-Pytorch GitHub page. Differences from the original authors: the smoothness loss is applied to depth instead of disparity, and the loss is divided by 2.3 instead of 2.<br>
  4. Our SfmLearner-Pytorch, trained with -b 4 -m 0.6 -s 0.1 --epoch-size 3000 --sequence-length 3.<br>
  5. Intrinsics are not provided; a simple intrinsics-prediction scheme is used instead, trained with -b 4 -m 0.6 -s 0.1 --epoch-size 3000 --sequence-length 3.<br>
  6. From Table 1 in the struct2depth paper.<br>
  7. From Table 1 and Table 3 in the struct2depth paper.<br>
  8. From Table 1 and Table 6 in the Depth from Videos in the Wild paper.<br>
  9. From Table 1 and Table 6 in the Depth from Videos in the Wild paper.<br>
  10. In addition to training datasets such as KITTI, struct2depth and Depth from Videos in the Wild also use an object detection model to generate an "object mask", whose role is to bound the regions in which the motion mask is generated (a minimal sketch of this masking idea follows this list).<br>
  11. struct2depth provides pretrained models that can be tested; all download links for the Depth from Videos in the Wild models have been removed.<br>
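
The sketch below illustrates the masking idea from note 10, assuming a precomputed per-pixel object mask from an off-the-shelf detector or segmenter; the function name and shapes are illustrative and are not taken from the struct2depth or Depth from Videos in the Wild code.

```python
import numpy as np


def bound_motion_mask(motion_mask: np.ndarray, object_mask: np.ndarray) -> np.ndarray:
    """Keep predicted motion only where the detector found an object.

    motion_mask: (H, W) float map in [0, 1] predicted by the network.
    object_mask: (H, W) binary map from an off-the-shelf detector/segmenter.
    """
    return motion_mask * object_mask
```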

7. Evaluation metrics

  1. Depth metrics:<br>

$$
\begin{aligned}
abs\_rel &= \mathrm{Mean}\left(\left|\frac{gt-pred}{gt}\right|\right) &
sq\_rel &= \mathrm{Mean}\left(\frac{(gt-pred)^{2}}{gt}\right) \\
rms &= \sqrt{\mathrm{Mean}\left((gt-pred)^{2}\right)} &
log\_rms &= \sqrt{\mathrm{Mean}\left((\log(gt)-\log(pred))^{2}\right)} \\
a1 &= \mathrm{Mean}(thresh<1.25) &
a2 &= \mathrm{Mean}(thresh<1.25^{2}) \\
a3 &= \mathrm{Mean}(thresh<1.25^{3}) &
thresh &= \max\left(\frac{gt}{pred},\ \frac{pred}{gt}\right)
\end{aligned}
$$
  2. Ego-motion metrics:<br> ATE (Absolute Trajectory Error): mean and standard deviation over the test set. RE (Rotation Error) between R1 and R2 is defined as the angle of R1·R2^-1 when converted to axis/angle form, i.e. RE = arccos((trace(R1 @ R2^-1) - 1) / 2). While ATE is often considered sufficient for evaluating trajectory estimation, RE matters here because each test sequence is only seq_length frames long. A NumPy sketch of both groups of metrics follows this list.<br>
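
The following is a minimal NumPy sketch of the metrics above, assuming `gt` and `pred` are already masked to valid pixels and scale-aligned; the function names are illustrative and are not the evaluation scripts used in this repo.

```python
import numpy as np


def compute_depth_errors(gt: np.ndarray, pred: np.ndarray):
    """Depth metrics from item 1. gt and pred are 1-D arrays of valid,
    scale-aligned depths (masking and median scaling happen beforehand)."""
    thresh = np.maximum(gt / pred, pred / gt)
    a1 = (thresh < 1.25).mean()
    a2 = (thresh < 1.25 ** 2).mean()
    a3 = (thresh < 1.25 ** 3).mean()

    abs_rel = np.mean(np.abs(gt - pred) / gt)
    sq_rel = np.mean((gt - pred) ** 2 / gt)
    rms = np.sqrt(np.mean((gt - pred) ** 2))
    log_rms = np.sqrt(np.mean((np.log(gt) - np.log(pred)) ** 2))
    return abs_rel, sq_rel, rms, log_rms, a1, a2, a3


def rotation_error(R1: np.ndarray, R2: np.ndarray) -> float:
    """RE from item 2: the axis/angle magnitude of R1 @ R2^-1, i.e.
    arccos((trace(R1 @ R2^-1) - 1) / 2), in radians."""
    cos_angle = (np.trace(R1 @ R2.T) - 1.0) / 2.0  # R2^-1 == R2.T for rotations
    return float(np.arccos(np.clip(cos_angle, -1.0, 1.0)))  # clip for numeric safety
```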

8. Notes and caveats

  1. On Windows, Anaconda requires four PATH entries: Anaconda3, Anaconda3/Library/bin, Anaconda3/Scripts, and Anaconda3/condabin. <br>
  2. DFV mentions a "Randomized Layer Normalization". It is hard to reproduce the effect described in the paper in PyTorch, so I wrote a rough approximation in evision_model/_Deprecated.py (a minimal sketch of one possible reading follows this list). In fact, if this method really works as well as the paper describes, the crux must lie somewhere else. <br>
  3. evision_model/_PlayGround.py is used to test some functions during development; its code is not depended on by any other file and can be freely modified or even deleted. <br>
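
For note 2, the following is a minimal PyTorch sketch of one possible reading of "Randomized Layer Normalization"; the class name, the noise scale, and where the noise is injected are assumptions, and the actual attempt lives in evision_model/_Deprecated.py.

```python
import torch
import torch.nn as nn


class RandomizedLayerNorm(nn.Module):
    """Rough sketch: layer-normalize each sample over (C, H, W), but perturb
    the statistics with multiplicative Gaussian noise during training only.
    The sigma value and the noise placement are guesses, not the paper's
    exact formulation."""

    def __init__(self, num_features: int, sigma: float = 0.5, eps: float = 1e-5):
        super().__init__()
        self.sigma = sigma
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(num_features))  # per-channel scale
        self.bias = nn.Parameter(torch.zeros(num_features))   # per-channel shift

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (N, C, H, W)
        mean = x.mean(dim=(1, 2, 3), keepdim=True)
        var = x.var(dim=(1, 2, 3), keepdim=True)
        if self.training:
            # multiplicative noise ~ N(1, sigma^2), applied only at train time
            mean = mean * (1.0 + self.sigma * torch.randn_like(mean))
            var = var * (1.0 + self.sigma * torch.randn_like(var))
        x = (x - mean) / torch.sqrt(var + self.eps)
        return x * self.weight.view(1, -1, 1, 1) + self.bias.view(1, -1, 1, 1)
```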