# PixelFormer: Attention Attention Everywhere: Monocular Depth Prediction with Skip Attention
This is the official PyTorch implementation of the WACV 2023 paper 'Attention Attention Everywhere: Monocular Depth Prediction with Skip Attention'.
Paper
## Installation
```
conda create -n pixelformer python=3.8
conda activate pixelformer
conda install pytorch=1.10.0 torchvision cudatoolkit=11.1
pip install matplotlib tqdm tensorboardX timm mmcv
```
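After installation, a quick sanity check along these lines (not part of the repository) confirms that the dependencies import and that PyTorch can see a GPU:

```python
# Quick environment sanity check (illustrative; not part of the PixelFormer codebase).
import torch
import torchvision
import timm
import mmcv

print("torch:", torch.__version__)              # expected around 1.10.0
print("torchvision:", torchvision.__version__)
print("timm:", timm.__version__)
print("mmcv:", mmcv.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```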
## Datasets
You can prepare the KITTI and NYUv2 datasets as described here, then point the data paths in the config files to your dataset locations.
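The config files under configs/ are plain-text lists of command-line arguments. As a rough sketch only (the argument names and paths below are assumptions; check the shipped config files for the exact entries), the dataset-related lines take roughly this form:

```
--data_path       /path/to/dataset/rgb
--gt_path         /path/to/dataset/depth_gt
--filenames_file  /path/to/train_split.txt
```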
## Training
First, download the pretrained encoder backbone from here, then set the pretrained-backbone path in the config files to the downloaded checkpoint.
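A minimal sketch for verifying the download, assuming the backbone is an ordinary checkpoint saved with torch.save; the filename below is a placeholder, not the actual file name:

```python
# Inspect a downloaded backbone checkpoint (illustrative; the path is a placeholder).
import torch

ckpt = torch.load("pretrained/swin_backbone.pth", map_location="cpu")
# Officially released Swin Transformer checkpoints usually keep the weights under a "model" key.
state_dict = ckpt["model"] if isinstance(ckpt, dict) and "model" in ckpt else ckpt
print("parameter tensors:", len(state_dict))
for name, tensor in list(state_dict.items())[:5]:
    print(name, tuple(tensor.shape))
```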
Training the NYUv2 model:
```
python pixelformer/train.py configs/arguments_train_nyu.txt
```
Training the KITTI model:
```
python pixelformer/train.py configs/arguments_train_kittieigen.txt
```
## Evaluation
Evaluate the NYUv2 model:
```
python pixelformer/eval.py configs/arguments_eval_nyu.txt
```
Evaluate the KITTI model:
```
python pixelformer/eval.py configs/arguments_eval_kittieigen.txt
```
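For reference, the standard monocular-depth metrics reported in this line of work (δ thresholds, abs rel, RMSE, and so on) can be computed as in the sketch below; this is an illustrative re-implementation of the usual definitions, not the repository's eval code:

```python
# Standard monocular-depth error metrics (illustrative; not pixelformer/eval.py).
import numpy as np

def depth_metrics(gt, pred):
    """gt, pred: 1-D arrays of valid ground-truth and predicted depths (metres)."""
    thresh = np.maximum(gt / pred, pred / gt)
    d1 = (thresh < 1.25).mean()
    d2 = (thresh < 1.25 ** 2).mean()
    d3 = (thresh < 1.25 ** 3).mean()
    abs_rel = np.mean(np.abs(gt - pred) / gt)
    sq_rel = np.mean((gt - pred) ** 2 / gt)
    rmse = np.sqrt(np.mean((gt - pred) ** 2))
    rmse_log = np.sqrt(np.mean((np.log(gt) - np.log(pred)) ** 2))
    return {"d1": d1, "d2": d2, "d3": d3, "abs_rel": abs_rel,
            "sq_rel": sq_rel, "rmse": rmse, "rmse_log": rmse_log}
```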
## Pretrained Models
- You can download the pretrained models "nyu.pt" and "kitti.pt" from here.
## Citation
If you find our work useful in your research, please cite the following:
```
@InProceedings{Agarwal_2023_WACV,
    author    = {Agarwal, Ashutosh and Arora, Chetan},
    title     = {Attention Attention Everywhere: Monocular Depth Prediction With Skip Attention},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2023},
    pages     = {5861-5870}
}
```
## Contact
For questions about our paper or code, please contact Ashutosh Agarwal (@ashutosh1807) or raise an issue on GitHub.
## Acknowledgements
Most of the code has been adapted from the CVPR 2022 paper NewCRFs; we thank Weihao Yuan for releasing the source code.
Thanks also to Microsoft Research Asia for open-sourcing the excellent Swin Transformer.