KinetiX-Downloader

A Python3 script for downloading the Kinetics 400, 600, and 700 datasets using PyTube3.

Kinetics is a large-scale, high-quality dataset of URL links to approximately 650,000 video clips that cover 700 human action classes, including human-object interactions such as playing instruments and human-human interactions such as shaking hands and hugging. Each action class has at least 600 video clips. Each clip is human-annotated with a single action class and lasts around 10 seconds.

Available versions of Kinetics:

Prerequisites

Download dataset

The script download.py downloads the videos of a given Kinetics version (400, 600, or 700) and a given dataset subset (train, test, validate, or all of them):

usage: Kinetics dataset downloader [-h] [-v {400,600,700}] [-s {train,test,validate,all}] [-w WORKERS]

optional arguments:
  -h, --help            show this help message and exit
  -v {400,600,700}, --version {400,600,700}
                        choose dataset version ('400', '600', '700')
  -s {train,test,validate,all}, --subset {train,test,validate,all}
                        choose dataset subset ('train', 'test', 'validate', or 'all')
  -w WORKERS, --workers WORKERS
                        Set number of multiprocessing workers
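
For readers who want to see how such a command line maps onto code, here is a minimal argparse sketch that reproduces the usage text above. The program name, option names, choices, and help strings are copied from that text. The default values and the function name `build_parser` are assumptions made for illustration and may differ from what download.py actually does.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser matching the documented options (defaults are assumed)."""
    parser = argparse.ArgumentParser(prog="Kinetics dataset downloader")
    parser.add_argument(
        "-v", "--version",
        choices=["400", "600", "700"],
        default="700",  # assumption: the real default is not documented here
        help="choose dataset version ('400', '600', '700')",
    )
    parser.add_argument(
        "-s", "--subset",
        choices=["train", "test", "validate", "all"],
        default="all",  # assumption
        help="choose dataset subset ('train', 'test', 'validate', or 'all')",
    )
    parser.add_argument(
        "-w", "--workers",
        type=int,
        default=1,  # assumption
        help="Set number of multiprocessing workers",
    )
    return parser


if __name__ == "__main__":
    # Parse an explicit argument list instead of sys.argv for demonstration.
    args = build_parser().parse_args(["-v", "400", "-s", "train", "-w", "4"])
    print(args.version, args.subset, args.workers)
```

Because `choices` is set, argparse rejects any version or subset outside the listed values before the download starts.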

Dataset videos are saved under the directory Kinetics<version>/. Some videos are expected to be unavailable for download, for various reasons (e.g., an unexpected PyTube error or a YouTube video that is no longer available). An error log file (Kinetics<version>_<subset>_errors.log) is created as soon as the first download error occurs and stores the YouTube IDs of all videos that could not be downloaded. After the download process is complete, you may re-run the script to retry the failed videos; videos that have already been downloaded successfully are skipped.
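
The retry behaviour described above can be pictured with a short sketch. It is not the script's actual implementation. It assumes the error log holds one YouTube ID per line and that each downloaded video is stored in a file named after its YouTube ID somewhere under Kinetics<version>/. The helper names `read_failed_ids` and `pending_ids` are hypothetical.

```python
from pathlib import Path


def read_failed_ids(log_path):
    """Return the unique YouTube IDs in an error log, in file order.

    Assumes one ID per line; blank lines and duplicates are ignored.
    A missing log file yields an empty list.
    """
    path = Path(log_path)
    if not path.exists():
        return []
    seen = set()
    ids = []
    for line in path.read_text().splitlines():
        video_id = line.strip()
        if video_id and video_id not in seen:
            seen.add(video_id)
            ids.append(video_id)
    return ids


def pending_ids(candidate_ids, video_dir):
    """Keep only the IDs that have no file named <id>.<ext> under video_dir.

    The <id>.<ext> naming scheme is an assumption of this sketch.
    """
    video_dir = Path(video_dir)
    if video_dir.exists():
        done = {p.stem for p in video_dir.rglob("*") if p.is_file()}
    else:
        done = set()
    return [vid for vid in candidate_ids if vid not in done]
```

A retry pass would then attempt only `pending_ids(read_failed_ids("Kinetics700_train_errors.log"), "Kinetics700")`, so videos already on disk are not downloaded again.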

TODO