pyTOFlow

This repository is based on the paper 'TOFlow: Video Enhancement with Task-Oriented Flow'. It contains pre-trained models and all the code we used, including the training code and the network definitions.

It can be regarded as the Python version of the original TOFlow implementation (toflow).

Note: Some TODOs are still open. Pull requests that improve this repository are welcome. Let's make it better, for fun and for real applications!

Evaluation Result

Vimeo interp.

| Methods | PSNR (dB) | SSIM |
| --- | --- | --- |
| TOFlow | 33.53 | 0.9668 |
| TOFlow + Mask | 33.73 | 0.9682 |
| pyTOFlow | 33.10 | 0.9631 |

Vimeo-Gaussian

| Methods | PSNR (dB) | SSIM |
| --- | --- | --- |
| TOFlow | 29.10 | 0.9544 |
| pyTOFlow | 34.73 | 0.9518 |

Vimeo SR

| Methods | PSNR (dB) | SSIM |
| --- | --- | --- |
| TOFlow | 33.08 | 0.9417 |
| pyTOFlow | 31.46 | 0.9230 |
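
For readers unfamiliar with the metric, PSNR follows the standard definition PSNR = 10 * log10(MAX^2 / MSE). The snippet below is a minimal sketch of that formula, not the evaluation code of this repository. The function name `psnr` and the assumption that pixel values lie in [0, 1] are illustrative only.

```python
import math

def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# Toy example: four pixels, two of them off by 0.1 (roughly 23 dB).
print(round(psnr([0.0, 0.5, 1.0, 0.25], [0.1, 0.5, 0.9, 0.25]), 2))
```

A higher PSNR means the estimate is closer to the reference. Results on a dataset are usually reported as the average over all test sequences.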

Prerequisites

PyTorch

Our implementation is based on PyTorch 0.4.1 (https://pytorch.org/).

PIL and matplotlib

For loading images. Tested with matplotlib v2.0.2; it does not work with newer versions such as v3.1.0. (Note that v2.0.2 is not necessarily the highest version that works.) See Issue#6 and Issue#7 for details.
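
A quick way to see which case an installed matplotlib falls into is to compare its version string against the two data points above. This is only a sketch: the helper names `parse_version` and `matplotlib_status` are hypothetical, and treating every version from v3.1.0 upwards as broken is an assumption extrapolated from the note above.

```python
def parse_version(text):
    """Turn '2.0.2' into (2, 0, 2); suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if digits == "":
            break
        parts.append(int(digits))
    return tuple(parts)

KNOWN_GOOD = (2, 0, 2)  # version reported to work
KNOWN_BAD = (3, 1, 0)   # version reported to fail

def matplotlib_status(version_text):
    version = parse_version(version_text)
    if version == KNOWN_GOOD:
        return "tested"
    if version >= KNOWN_BAD:
        return "reported broken"  # assumption: later releases behave like v3.1.0
    return "untested"

print(matplotlib_status("2.0.2"))
print(matplotlib_status("3.1.0"))
```

With matplotlib installed, the string to check is `matplotlib.__version__`.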

opencv-python(cv2)

For processing videos.

CUDA [optional]

CUDA (https://developer.nvidia.com/cuda-toolkit) is recommended for fast inference. The demo code still runs without CUDA, but much more slowly.
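
The CPU fallback can be sketched as follows. This is an illustration of the usual PyTorch pattern, not the code used in this repository. The helper `pick_device` is hypothetical, and the `try`/`except` around the import is only there so the sketch also runs on a machine without PyTorch.

```python
def pick_device(gpu_id, cuda_available):
    """Return 'cuda:<id>' when CUDA is usable, otherwise 'cpu'."""
    if cuda_available and gpu_id is not None and gpu_id >= 0:
        return "cuda:{}".format(gpu_id)
    return "cpu"

try:
    import torch
    cuda_available = torch.cuda.is_available()
except ImportError:
    # PyTorch is missing: only the CPU branch can be demonstrated.
    cuda_available = False

device_name = pick_device(0, cuda_available)
print(device_name)
# With PyTorch installed, a model would then be moved with:
#   model.to(torch.device(device_name))
```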

FFmpeg [optional]

We use FFmpeg (http://ffmpeg.org/) for processing videos. It is fine if FFmpeg is not installed, but processing videos may then take much longer.
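
As an illustration of the kind of FFmpeg call involved, the sketch below builds a command that splits a video into numbered PNG frames and checks whether `ffmpeg` is on the PATH. The file names, the `%05d.png` pattern, and both helper functions are assumptions for this example; the repository's own scripts may invoke FFmpeg differently.

```python
import os
import shutil

def ffmpeg_extract_cmd(video_path, out_dir, pattern="%05d.png"):
    """Build an FFmpeg command that writes every frame of a video as a PNG."""
    return ["ffmpeg", "-i", video_path, os.path.join(out_dir, pattern)]

def have_ffmpeg():
    """True if an ffmpeg executable can be found on the PATH."""
    return shutil.which("ffmpeg") is not None

cmd = ffmpeg_extract_cmd("input.mp4", "frames")
print(" ".join(cmd))
print("ffmpeg found:", have_ffmpeg())
# To actually run it:
#   os.makedirs("frames", exist_ok=True)
#   subprocess.run(cmd, check=True)
```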

Installation

Our current release has been tested on Ubuntu 16.04 LTS.

Clone the repository

git clone https://github.com/Coldog2333/pytoflow.git

Install some required packages

Download tiny Vimeo dataset (1‰ of Vimeo-90K)

For a quick start that exercises all the features of pytoflow, download the tiny Vimeo dataset and give it a try. It is sampled randomly from the original Vimeo dataset and from the processed Vimeo-90K (with mixed noise, and blurred).

cd .
sh download_tiny_dataset.sh
unzip tiny.zip
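
After unzipping, each line of a path list (for example `tri_trainlist.txt`) names one sequence folder under `sequences`. The sketch below shows how such a list could be expanded into frame paths. It assumes the usual Vimeo-90K layout, in which a sequence folder holds `im1.png`, `im2.png`, and so on; the helper `sequence_frame_paths` is hypothetical and is not the loader used by `train.py`.

```python
import os

def sequence_frame_paths(data_dir, pathlist_lines, frames_per_sequence):
    """Expand path-list entries such as '00001/0001' into lists of frame paths."""
    samples = []
    for line in pathlist_lines:
        entry = line.strip()
        if not entry:
            continue  # skip blank lines
        folder = os.path.join(data_dir, entry)
        samples.append([os.path.join(folder, "im{}.png".format(i))
                        for i in range(1, frames_per_sequence + 1)])
    return samples

# In practice the lines would come from open(pathlist).readlines().
lines = ["00001/0001\n", "\n", "00001/0002\n"]
samples = sequence_frame_paths("./tiny/vimeo_triplet/sequences", lines, 3)
print(len(samples))
print(samples[0])
```

For the septuplet data, the same helper would be called with `frames_per_sequence=7`.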

Train

python3 train.py [[option] [value]]...

Options

Examples

python3 train.py --task interp --dataDir ./tiny/vimeo_triplet/sequences --pathlist ./tiny/vimeo_triplet/tri_trainlist.txt --gpuID 1
python3 train.py --task denoising --dataDir ./tiny/vimeo_septuplet/sequences --ex_dataDir ./tiny/vimeo_septuplet/sequences_with_noise --pathlist ./tiny/vimeo_septuplet/sep_trainlist.txt --gpuID 1
python3 train.py --task super-resolution --dataDir ./tiny/vimeo_septuplet/sequences --ex_dataDir ./tiny/vimeo_septuplet/sequences_blur --pathlist ./tiny/vimeo_septuplet/sep_trainlist.txt --gpuID 1
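
To make the shape of these command lines explicit, here is a sketch of a parser that accepts the flags used in the examples above. It is an assumption-laden illustration: the real `train.py` may parse its arguments differently, may accept more options or task names, and may use other defaults.

```python
import argparse

def build_parser():
    """Parser covering only the flags that appear in the training examples."""
    parser = argparse.ArgumentParser(prog="train.py")
    parser.add_argument("--task", required=True,
                        choices=["interp", "denoising", "super-resolution"])
    parser.add_argument("--dataDir", required=True,
                        help="directory with the target sequences")
    parser.add_argument("--ex_dataDir", default=None,
                        help="directory with the degraded (noisy or blurred) sequences")
    parser.add_argument("--pathlist", required=True,
                        help="text file listing the sequences to use")
    parser.add_argument("--gpuID", type=int, default=None)
    return parser

args = build_parser().parse_args([
    "--task", "interp",
    "--dataDir", "./tiny/vimeo_triplet/sequences",
    "--pathlist", "./tiny/vimeo_triplet/tri_trainlist.txt",
    "--gpuID", "1",
])
print(args.task, args.gpuID, args.ex_dataDir)
```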

Evaluate

python3 evaluate.py [[option] [value]]...

Options

Examples

python3 evaluate.py --task interp --dataDir ./tiny/vimeo_triplet/sequences --pathlist ./tiny/vimeo_triplet/tri_testlist.txt --model ./toflow_models/interp.pkl --gpuID 1
python3 evaluate.py --task denoising --dataDir ./tiny/vimeo_septuplet/sequences_with_noise --ex_dataDir ./tiny/vimeo_septuplet/sequences_with_noise --pathlist ./tiny/vimeo_septuplet/sep_testlist.txt --model ./toflow_models/denoise.pkl --gpuID 1
python3 evaluate.py --task super-resolution --dataDir ./tiny/vimeo_septuplet/sequences_blur --ex_dataDir ./tiny/vimeo_septuplet/sequences_blur --pathlist ./tiny/vimeo_septuplet/sep_testlist.txt --model ./toflow_models/sr.pkl --gpuID 1
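
To run all three evaluations in one go, the example commands can be generated from a small table. The sketch below only rebuilds the command lines shown above. The names `EVAL_RUNS` and `evaluate_cmd` and the `RUN` switch are hypothetical, and nothing is executed unless `RUN` is set to `True`.

```python
import subprocess

RUN = False  # set to True to actually launch evaluate.py

SEP = "./tiny/vimeo_septuplet"
# (task, dataDir, ex_dataDir, pathlist, model), copied from the examples above.
EVAL_RUNS = [
    ("interp", "./tiny/vimeo_triplet/sequences", None,
     "./tiny/vimeo_triplet/tri_testlist.txt", "./toflow_models/interp.pkl"),
    ("denoising", SEP + "/sequences_with_noise", SEP + "/sequences_with_noise",
     SEP + "/sep_testlist.txt", "./toflow_models/denoise.pkl"),
    ("super-resolution", SEP + "/sequences_blur", SEP + "/sequences_blur",
     SEP + "/sep_testlist.txt", "./toflow_models/sr.pkl"),
]

def evaluate_cmd(task, data_dir, ex_data_dir, pathlist, model, gpu_id=1):
    """Build one evaluate.py command line as an argument list."""
    cmd = ["python3", "evaluate.py", "--task", task, "--dataDir", data_dir]
    if ex_data_dir is not None:
        cmd += ["--ex_dataDir", ex_data_dir]
    cmd += ["--pathlist", pathlist, "--model", model, "--gpuID", str(gpu_id)]
    return cmd

commands = [evaluate_cmd(*run) for run in EVAL_RUNS]
for cmd in commands:
    print(" ".join(cmd))
    if RUN:
        subprocess.run(cmd, check=True)
```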

Usage

python3 ./unstable/run.py --f1 ./unstable/example/im1.png --f2 ./unstable/example/im3.png --o ./unstable/example/out.png --gpuID 0
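
The command above interpolates a single frame between two inputs. To process a whole folder of frames, one call per consecutive pair is needed. The sketch below generates those calls. The helper names and the `mid_00001.png` output naming are assumptions for illustration; only the `--f1`, `--f2`, `--o` and `--gpuID` flags come from the command above.

```python
import os

def interpolation_jobs(frame_paths, out_dir):
    """Pair consecutive frames and name the in-between output for each pair."""
    jobs = []
    for index in range(len(frame_paths) - 1):
        out_path = os.path.join(out_dir, "mid_{:05d}.png".format(index + 1))
        jobs.append((frame_paths[index], frame_paths[index + 1], out_path))
    return jobs

def run_py_cmd(f1, f2, out, gpu_id=0):
    """Build the run.py command line for one frame pair."""
    return ["python3", "./unstable/run.py",
            "--f1", f1, "--f2", f2, "--o", out, "--gpuID", str(gpu_id)]

frames = ["frames/00001.png", "frames/00002.png", "frames/00003.png"]
for f1, f2, out in interpolation_jobs(frames, "interp"):
    print(" ".join(run_py_cmd(f1, f2, out)))
```

Each job loads the model again, so this is convenient rather than fast.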

Options

References

  1. Xue T, Chen B, Wu J, et al. Video Enhancement with Task-Oriented Flow. 2017. (http://arxiv.org/abs/1711.09078)
  2. Our SpyNet is based on sniklaus/pytorch-spynet.

Acknowledgments

Thanks to the author of the original paper, @anchen1011, who gave me a lot of advice while I was reproducing the paper and taught me a lot. Thanks also to the School of Mathematics, Sun Yat-Sen University, for providing the computing server; I could have done nothing without this powerful server. Finally, thanks to my teammates Qian and Junjie for their company.

TODO

For example, we could omit the last ResNet layer, which requires extra system resources but brings only a small improvement. After a comprehensive ablation analysis, we can determine with confidence which structures can be omitted without losing too much performance.

Maybe we can accelerate it by running more than one sequence of pictures at a time (using matrix multiplication, etc.).
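
One possible shape for this TODO is to group sequences into fixed-size batches before they reach the network. The sketch below shows only the grouping step in plain Python. The helper `batches` is hypothetical, and combining each group into one tensor (for example with `torch.stack`) is an assumption about how the speed-up would be realized, not something this repository does yet.

```python
def batches(sequences, batch_size):
    """Yield consecutive groups of at most batch_size sequences."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(sequences), batch_size):
        yield sequences[start:start + batch_size]

# Five sequence identifiers grouped two at a time; the last group is smaller.
sequence_ids = ["00001/0001", "00001/0002", "00001/0003", "00001/0004", "00001/0005"]
for group in batches(sequence_ids, 2):
    # With PyTorch, the frame tensors of one group could be stacked along a
    # new leading dimension so the network handles them in one forward pass.
    print(group)
```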