F16 Video Dataset

OVERVIEW

Code to generate a dataset made up of segments of 16 consecutive frames extracted from videos. The videos are downloaded from the web using the command-line program youtube-dl. The code consists of some simple C image manipulation functions (borrowed from imscript by Enric Meinhardt-Llopis) and a bash script, url-to-bursts.sh. It has been tested on Ubuntu 16.04.

The script takes as inputs the url of a video and a destination folder and performs the following steps:

  1. Creates the destination folder.
  2. Downloads the full video using youtube-dl.
  3. Extracts bursts of 16 frames from the video using ffmpeg. The bursts are evenly spaced throughout the video. The minimum time between consecutive bursts is 10 s and the maximum number of bursts is 20. These parameters can be configured by editing the script.
  4. Downscales the extracted frames to 540 rows. To avoid aliasing, a Gaussian filter is applied before downscaling. The scale of the filter is sigma = sigma0*sqrt(factor^2 - 1), where factor is the downscaling factor. To avoid introducing a bias in the dataset, sigma0 is drawn at random for each burst (within a certain range).
  5. Removes the video file.
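
The burst spacing of step 3 and the filter scale of step 4 can be sketched as follows. This is only an illustration, not the code in url-to-bursts.sh: the helper names burst_starts, random_sigma0 and blur_sigma are hypothetical, and so are the rounding of the burst count, the reading of factor as input rows divided by 540, and the sigma0 range, which the caller has to supply.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names and rounding rules are assumptions,
# not taken from url-to-bursts.sh.

# burst_starts DURATION [MIN_GAP] [MAX_BURSTS]
# Prints one start time (in seconds) per line for evenly spaced bursts.
burst_starts() {
    awk -v d="$1" -v gap="${2:-10}" -v max="${3:-20}" 'BEGIN {
        n = int(d / gap)        # as many bursts as the minimum gap allows
        if (n > max) n = max    # capped at the maximum number of bursts
        if (n < 1) n = 1        # always keep at least one burst
        step = d / n            # spacing between starts (>= gap when d >= gap)
        for (i = 0; i < n; i++) printf "%.2f\n", i * step
    }'
}

# random_sigma0 LO HI
# Prints a sigma0 drawn uniformly from [LO, HI] using bash's $RANDOM (0..32767).
random_sigma0() {
    awk -v r="${RANDOM:-0}" -v lo="$1" -v hi="$2" 'BEGIN {
        printf "%.4f\n", lo + (hi - lo) * r / 32767
    }'
}

# blur_sigma SIGMA0 FACTOR
# Prints sigma = sigma0 * sqrt(factor^2 - 1); a factor below 1 is treated as 1.
blur_sigma() {
    awk -v s0="$1" -v f="$2" 'BEGIN {
        if (f < 1) f = 1
        printf "%.4f\n", s0 * sqrt(f * f - 1)
    }'
}
```

For a 600 s video, burst_starts 600 yields 20 start times spaced 30 s apart (0.00, 30.00, ..., 570.00). For a frame with 1080 rows the factor is 1080/540 = 2, so blur_sigma 1 2 gives sigma0*sqrt(3), about 1.7321.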

The bursts are left as sequences of png files in separate folders 01, 02, etc. Two log files are left in the output folder: u2b_err.log and u2b_out.log.
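
Given this layout, a quick sanity check after a download is to confirm that every burst folder holds 16 png frames. The helper below is a hypothetical sketch (check_bursts is not part of the repository); it assumes only the two-digit folder names and the .png extension described above.

```shell
#!/usr/bin/env bash
# check_bursts DEST
# Hypothetical helper: returns non-zero if any two-digit burst folder
# inside DEST does not contain exactly 16 png files.
check_bursts() {
    local dest="$1" status=0 dir count
    for dir in "$dest"/[0-9][0-9]/; do
        [ -d "$dir" ] || continue      # unmatched glob: no burst folders
        count=$(find "$dir" -maxdepth 1 -type f -name '*.png' | wc -l | tr -d ' ')
        if [ "$count" -ne 16 ]; then
            echo "incomplete burst: $dir ($count frames)" >&2
            status=1
        fi
    done
    return "$status"
}
```

Running check_bursts [folder] prints one line per incomplete burst on standard error. A destination with no burst folders at all passes the check, so u2b_err.log is still worth a look when a download produces nothing.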

DEPENDENCIES

To compile the image tools we need

We also need the following command line programs to be installed in the system:

COMPILATION

The code compiles on Unix/Linux (and probably on macOS as well, although we have not tested it). We provide a simple Makefile to compile the C code:

$ cd imscript-lite && make && cd ..

USAGE

To download the bursts from a url:

$ ./url-to-bursts.sh [url] [folder]

To download all the bursts for the videos provided in the url folder:

$ . download-all-bursts

To generate the download-all-bursts file from a folder with many url files:

$ ./generate-command.sh
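
To illustrate the idea, a generator of this kind could look like the sketch below. It is not the content of generate-command.sh: the function emit_commands, the assumption that each url file lists one url per line, and the numbered destination folders are all hypothetical. Only the call ./url-to-bursts.sh [url] [folder] comes from the usage above.

```shell
#!/usr/bin/env bash
# emit_commands URL_DIR OUT_ROOT
# Hypothetical sketch: prints one url-to-bursts.sh command per url found in
# the files of URL_DIR (assumed format: one url per line). Destination
# folders are numbered OUT_ROOT/0001, OUT_ROOT/0002, ... (an assumption).
emit_commands() {
    local url_dir="$1" out_root="$2" n=0 file url
    for file in "$url_dir"/*; do
        [ -f "$file" ] || continue
        # "|| [ -n "$url" ]" keeps a last line that lacks a trailing newline
        while IFS= read -r url || [ -n "$url" ]; do
            [ -n "$url" ] || continue      # skip blank lines
            n=$((n + 1))
            # Single quotes protect characters such as ? and & in the url;
            # urls that themselves contain a single quote are not handled.
            printf "./url-to-bursts.sh '%s' '%s/%04d'\n" "$url" "$out_root" "$n"
        done < "$file"
    done
}
```

Redirecting the output to a file, for example emit_commands urls data > download-all-bursts, would give a list of commands that can be sourced as shown above.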

URLS

We provide a set of urls of YouTube videos that were used to train the video denoising network described in this paper. These urls were selected by Jessie Levillain and Raymond Zhang while they were interns at CMLA. The urls were collected by searching YouTube with 64 queries. Only videos with a Creative Commons license and at least HD resolution were downloaded. Videos were selected according to the following criteria: