ScanDMM: A Deep Markov Model of Scanpath Prediction for 360° Images [CVPR2023]

:sparkles:Paper    :sparkles:Poster    :sparkles:Presentation (YouTube)    :sparkles:Slide    :sparkles:OpenReview    

https://user-images.githubusercontent.com/65707367/223019204-6948e71f-1f30-4659-9498-353ef74ed1c9.mp4

Implementation version

PyTorch 1.8.1 & CUDA 10.1.
Please refer to requirements.txt for details.
If your CUDA version is 10.1, you can set up the environment directly with the following commands:

conda create -n scandmm python==3.7  
conda activate scandmm
pip install -r requirements.txt

Training

  1. To reproduce the training and validation datasets, please refer to data_process.py. Alternatively, use the ready-to-use data.
  2. Execute:
python train.py --seed=1234 --dataset='./Datasets/Sitzmann.pkl' --lr=0.0003 --bs=64 --epochs=500 --save_root='./model/'
  3. Check the training logs and checkpoints in Log (created automatically) and ./model, respectively.

Test

  1. Prepare the test images and put them in a folder (e.g., ./demo/input).
  2. Create a folder to store the results (e.g., ./demo/output).
  3. Prepare pre-trained weights (e.g., './model/model_lr-0.0003_bs-64_epoch-435.pkl').
  4. Execute:
python inference.py --model='./model/model_lr-0.0003_bs-64_epoch-435.pkl' --inDir='./demo/input' --outDir='./demo/output' --n_scanpaths=200 --length=20 --if_plot=True
  5. Check the results:
sp_P48_5376x2688.png (scene: Snow)

import numpy as np

scanpaths = np.load('P48_5376x2688.npy')
print(scanpaths.shape)
# (200, 20, 2)
# (n_scanpaths, length, (y, x)): (y, x) are normalized coordinates in the range [0, 1] (y/x = 0 indicates the top/left edge).
sp_P8_7500x3750.png (scene: Mu)

scanpaths = np.load('P8_7500x3750.npy')
print(scanpaths.shape)
# (200, 20, 2)
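As a small illustration of the coordinate convention above, the normalized (y, x) pairs can be mapped to pixel positions. This is a minimal sketch only: the array below is random stand-in data of the same shape as the predicted scanpaths, not actual model output, and the image size is taken from the example filename.

```python
import numpy as np

# Stand-in for scanpaths loaded from a .npy file produced by inference.py:
# shape (n_scanpaths, length, 2), with normalized (y, x) in [0, 1].
scanpaths = np.random.rand(200, 20, 2)

# Map normalized coordinates to pixel indices for a 5376x2688 image
# (y = 0 is the top edge, x = 0 is the left edge).
H, W = 2688, 5376
rows = np.round(scanpaths[..., 0] * (H - 1)).astype(int)
cols = np.round(scanpaths[..., 1] * (W - 1)).astype(int)
pixels = np.stack([rows, cols], axis=-1)
print(pixels.shape)  # (200, 20, 2)
```

The same mapping applies to any image size; only H and W change.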

Bibtex

@InProceedings{scandmm2023,
  title={ScanDMM: A Deep Markov Model of Scanpath Prediction for 360° Images},
  author={Xiangjie Sui and Yuming Fang and Hanwei Zhu and Shiqi Wang and Zhou Wang},
  booktitle = {IEEE Conference on Computer Vision and Pattern Recognition}, 
  year={2023}
}

Acknowledgment

The authors would like to thank Kede Ma for his inspiration, Daniel Martin et al. for publishing the ScanGAN model and visualization functions, and Eli Bingham et al. for the implementation of Pyro. We sincerely appreciate their contributions.