World Models

This repo reproduces the original implementation of World Models. This implementation uses TensorFlow 2.2.

Docker

The easiest way to handle dependencies is with Nvidia-Docker. Follow the instructions below to generate and attach to the container.

docker image build -t wm:1.0 -f docker/Dockerfile.wm .
docker container run -p 8888:8888 --gpus '"device=0"' --detach -it --name wm wm:1.0
docker attach wm

Visualizations

To visualize the environment from the agent's perspective or to generate synthetic observations, use the visualizations Jupyter notebook. It can be launched from inside your container with the following command:

jupyter notebook --no-browser --port=8888 --ip=0.0.0.0 --allow-root

| Real Frame Sample | Reconstructed Real Frame | Imagined Frame |
| --- | --- | --- |
| alt-text-1 | alt-text-2 | alt-text-3 |

| Ground Truth (CarRacing) | Reconstructed |
| --- | --- |
| <img src="imgs/true_traj.gif" alt="drawing" width="500"/> | <img src="imgs/reconstruct_traj.gif" alt="drawing" width="500"/> |

| Ground Truth Environment (DoomTakeCover) | Dream Environment |
| --- | --- |
| <img src="imgs/doom_real_traj.gif" alt="drawing" width="500"/> | <img src="imgs/doom_dream_traj.gif" alt="drawing" width="500"/> |

Reproducing Results From Scratch

These instructions assume a machine with a 64-core CPU and a GPU. If you are running in the cloud, it will likely be more cost-effective to run the extraction and controller processes on a CPU machine and the VAE, preprocessing, and RNN tasks on a GPU machine.

DoomTakeCover-v0

CAUTION: The Doom environment leaves some processes hanging around. In addition to running the Doom experiments, the script kills any process with 'vizdoom' in its name, so be careful with this if you are not running in a container. To reproduce results for DoomTakeCover-v0, run the following bash script:

bash launch_scripts/wm_doom.bash
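
Because the launch script kills processes with 'vizdoom' in the name, it may be worth checking which processes would match before running it outside a container. The sketch below is not part of this repo: the helper names `matching_processes` and `list_running` are hypothetical, and it assumes a plain substring match on the full command line, which may differ from the rule the launch script actually uses.

```python
import subprocess


def matching_processes(ps_lines, needle="vizdoom"):
    """Return (pid, command) pairs whose command line contains `needle`.

    `ps_lines` is expected to look like the output of `ps -eo pid,args`:
    one process per line, PID first, then the full command line.
    """
    matches = []
    for line in ps_lines:
        parts = line.strip().split(None, 1)
        # Skip the header row and any line that does not start with a PID.
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        pid, command = int(parts[0]), parts[1]
        if needle in command:
            matches.append((pid, command))
    return matches


def list_running(needle="vizdoom"):
    """List currently running processes that match `needle` (read-only)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,args"],
        capture_output=True,
        text=True,
        check=True,
    )
    return matching_processes(result.stdout.splitlines(), needle)
```

Calling `list_running()` only reads the process table and kills nothing, so it can serve as a dry run. An empty list suggests that no unrelated process would be affected under the substring assumption above.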

CarRacing-v0

To reproduce results for CarRacing-v0, run the following bash script:

bash launch_scripts/carracing.bash

Disclaimer

I have not run this for long enough (~45 days of wall-clock time) to verify that it produces the same results on CarRacing-v0 as the original implementation.

Average return curves comparing the original implementation and ours. The shaded area represents a standard deviation above and below the mean.

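
As a rough illustration of how a shaded band of one standard deviation around the mean can be computed, here is a minimal sketch. It is not the plotting code used for the figure: the function name `return_band` and the input layout (a list of equal-length return curves) are assumptions, and it uses the population standard deviation, which may not match the figure.

```python
from statistics import mean, pstdev


def return_band(curves):
    """Compute (lower, mean, upper) per step across several return curves.

    `curves` is a list of equal-length lists of returns (hypothetical
    layout). The band is the mean plus or minus one population standard
    deviation at each step.
    """
    band = []
    for values in zip(*curves):
        m = mean(values)
        s = pstdev(values)
        band.append((m - s, m, m + s))
    return band
```

The lower and upper values at each step are what a plotting library would fill between to draw the shaded area.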

For simplicity, the Doom experiment implementation differs slightly from the original.

| | $\tau$ | Returns Dream Environment | Returns Actual Environment |
| --- | --- | --- | --- |
| D. Ha Original | 1.0 | 1145 +/- 690 | 868 +/- 511 |
| Eager | 1.0 | 1465 +/- 633 | 849 +/- 499 |