VRDP (NeurIPS 2021)
Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language
Mingyu Ding, Zhenfang Chen, Tao Du, Ping Luo, Joshua B. Tenenbaum, and Chuang Gan
More details can be found at the Project Page.
If you find our work useful in your research, please consider citing our paper:
@inproceedings{ding2021dynamic,
author = {Ding, Mingyu and Chen, Zhenfang and Du, Tao and Luo, Ping and Tenenbaum, Joshua B and Gan, Chuang},
title = {Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language},
booktitle = {Advances In Neural Information Processing Systems},
year = {2021}
}
Prerequisites
- Python 3
- PyTorch 1.3 or higher
- All related packages are available through Miniconda
- Both CPUs and GPUs are supported
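The prerequisites above can be checked with a short script. This is a minimal sketch, not part of the repository: the helper names `version_tuple` and `meets_minimum` are hypothetical, and the check only compares the major and minor version numbers against 1.3.

```python
import re
import sys


def version_tuple(version):
    """Return the leading (major, minor) integers of a version string."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version, minimum=(1, 3)):
    """True if `version` is at least `minimum` (PyTorch 1.3 by default)."""
    return version_tuple(version) >= tuple(minimum)


if __name__ == "__main__":
    print("Python 3:", sys.version_info.major == 3)
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        print("PyTorch", torch.__version__, "ok:", meets_minimum(torch.__version__))
        # Both CPUs and GPUs are supported, so this line is informational only.
        print("CUDA available:", torch.cuda.is_available())
```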
Dataset preparation
- Download the videos, video annotations, questions and answers, and object proposals from the official website.
- Transform the videos into ".png" frames with ffmpeg.
- Organize the data as shown below.

    clevrer
    ├── annotation_00000-01000
    │   ├── annotation_00000.json
    │   ├── annotation_00001.json
    │   └── ...
    ├── ...
    ├── image_00000-01000
    │   │   ├── 1.png
    │   │   ├── 2.png
    │   │   └── ...
    │   └── ...
    ├── ...
    ├── questions
    │   ├── train.json
    │   ├── validation.json
    │   └── test.json
    ├── proposals
    │   ├── proposal_00000.json
    │   ├── proposal_00001.json
    │   └── ...
- We also provide data for physics learning and program execution in Google Drive. You can optionally download them and put them in the ./data/ folder.
- Download the processed data executor_data.zip for the executor. Put it in ./executor/data/ and unzip it there.
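The frame extraction step in the list above could be scripted roughly as follows. This is a sketch under assumptions: the video and output paths are hypothetical placeholders, and the `%d.png` output pattern is chosen only because the directory layout shows frames named 1.png, 2.png, and so on. The repository does not specify the exact ffmpeg options, so adjust them as needed.

```python
import subprocess
from pathlib import Path


def ffmpeg_frame_command(video_path, frames_dir):
    """Build an ffmpeg command that writes 1.png, 2.png, ... into frames_dir."""
    pattern = (Path(frames_dir) / "%d.png").as_posix()
    return ["ffmpeg", "-i", str(video_path), pattern]


def extract_frames(video_path, frames_dir):
    """Create frames_dir if needed and run ffmpeg (requires ffmpeg on PATH)."""
    Path(frames_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(ffmpeg_frame_command(video_path, frames_dir), check=True)


if __name__ == "__main__":
    # Hypothetical paths, used only to show the resulting command line.
    print(" ".join(ffmpeg_frame_command("example_video.mp4", "frames/example")))
```

Calling `extract_frames` once per downloaded video would produce one frame folder per video; how those folders are named inside the image directories should follow the layout shown above.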
Get Object Dictionaries (Concepts and Trajectories)
Download the object proposals from the region proposal network and follow the Step-by-step Training in DCL to get object concepts and trajectories.
The above process includes:
- trajectory extraction
- concept learning
- trajectory refinement
Or you can download our extracted object dictionaries object_dicts.zip directly from Google Drive.
Learning
1. Differentiable Physics Learning
After we get the above object dictionaries, we learn physical parameters from object properties and trajectories.
cd dynamics/
python3 learn_dynamics.py 10000 15000
# Here argv[1] and argv[2] represent the start and end processing index respectively.
The output object physical parameters object_dicts_with_physics.zip can be downloaded from Google Drive.
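Because the script takes a start and an end index, a large range can be split into smaller jobs, for example to run them on several machines. The sketch below is illustrative only: `split_range` and `learn_dynamics_commands` are hypothetical helpers, and the code assumes the end index is exclusive. If the script treats the end index as inclusive, adjacent chunks would overlap by one video, so check the script before relying on this.

```python
def split_range(start, end, chunk):
    """Split [start, end) into consecutive (start, end) pairs of at most `chunk` indices."""
    if chunk <= 0:
        raise ValueError("chunk must be positive")
    return [(s, min(s + chunk, end)) for s in range(start, end, chunk)]


def learn_dynamics_commands(start, end, chunk):
    """Build one learn_dynamics.py command line per chunk (to be run from dynamics/)."""
    return [
        ["python3", "learn_dynamics.py", str(s), str(e)]
        for s, e in split_range(start, end, chunk)
    ]


if __name__ == "__main__":
    for cmd in learn_dynamics_commands(10000, 15000, 2000):
        print(" ".join(cmd))
```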
2. Physics Simulation (counterfactual)
Run the physical simulation using the learned physical parameters.
cd dynamics/
python3 physics_simulation.py 10000 15000
# Here argv[1] and argv[2] represent the start and end processing index respectively.
The output simulated trajectories/events object_simulated.zip can be downloaded from Google Drive.
3. Physics Simulation (predictive)
Correct the long-range predictions according to the video observations.
cd dynamics/
python3 refine_prediction.py 10000 15000
# Here argv[1] and argv[2] represent the start and end processing index respectively.
The output refined trajectories/events object_updated_results.zip can be downloaded from Google Drive.
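The three learning steps share the same command-line form, so they can be driven from one small script. This is a sketch that assumes the stages are run in the order listed in this section, each over the same index range, from inside the dynamics/ folder. The `stage_commands` and `run_stages` helpers and the `dry_run` flag are hypothetical and not part of the repository.

```python
import subprocess

# Scripts named in this section, in the order they are listed.
STAGES = ("learn_dynamics.py", "physics_simulation.py", "refine_prediction.py")


def stage_commands(start, end):
    """Build the command line for each stage over the same index range."""
    return [["python3", script, str(start), str(end)] for script in STAGES]


def run_stages(start, end, workdir="dynamics", dry_run=True):
    """Print the commands, or run them one after another when dry_run is False."""
    for cmd in stage_commands(start, end):
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the pipeline if a stage exits with an error.
            subprocess.run(cmd, cwd=workdir, check=True)


if __name__ == "__main__":
    run_stages(10000, 15000)
```

Keeping `dry_run=True` as the default means the script only prints the commands until the paths and index range have been confirmed.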
Evaluation
After we get the final trajectories/events, we perform the neuro-symbolic execution and evaluate the performance on the validation set.
cd executor/
python3 evaluation.py
The test JSON file for evaluation on EvalAI can be generated by:
cd executor/
python3 get_results.py
The Generalized CLEVRER Dataset (counterfactual_mass)
- Download causal_mass.zip and counterfactual_mass.zip from Google Drive.
- Generate counterfactual data on the collision event by
python3 counterfactual_mass/generate_data.py
Examples
- Predictive question
- Counterfactual question
Acknowledgements
For questions regarding VRDP, feel free to post here or directly contact the author (mingyuding@hku.hk).