VIN: Value Iteration Networks

This is a TensorFlow implementation of Value Iteration Networks (VIN), written to reproduce the results of the paper. (A PyTorch version is also available.)

Architecture of Value Iteration Network

Key idea

Learned Reward Image and Its Value Images for each VI Iteration

| Visualization | Grid world | Reward Image | Value Images |
| --- | --- | --- | --- |
| 8x8 | <img src="imgs/grid_8x8.jpeg" width="150"> | <img src="imgs/reward_8x8.png" width="300"> | <img src="imgs/value_function_8x8.gif" width="300"> |
| 16x16 | <img src="imgs/grid_16x16.jpeg" width="150"> | <img src="imgs/reward_16x16.png" width="300"> | <img src="imgs/value_function_16x16.gif" width="300"> |
| 28x28 | <img src="imgs/grid_28x28.jpeg" width="150"> | <img src="imgs/reward_28x28.png" width="300"> | <img src="imgs/value_function_28x28.gif" width="300"> |
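To make the value images easier to read, here is a minimal NumPy sketch of plain (non-learned) value iteration on a grid. It is not the VIN module from this repository: the function name `value_iteration_frames`, the 4-connected moves plus "stay", and the discount `gamma` are all assumptions chosen for illustration. It only shows how a value image can change from one iteration to the next.

```python
import numpy as np


def value_iteration_frames(reward, k, gamma=0.99):
    """Run k sweeps of plain value iteration and return one value image per sweep.

    Hypothetical helper, not part of the repository.
    Update rule (assumed): V_new(s) = R(s) + gamma * max over {s and its 4 neighbours} of V(s').
    """
    reward = np.asarray(reward, dtype=float)
    value = np.zeros_like(reward)
    frames = []
    for _ in range(k):
        # Pad with -inf so that moves leaving the grid are never selected by the max.
        padded = np.pad(value, 1, mode="constant", constant_values=-np.inf)
        candidates = np.stack([
            value,               # stay in place
            padded[:-2, 1:-1],   # value of the cell above
            padded[2:, 1:-1],    # value of the cell below
            padded[1:-1, :-2],   # value of the cell to the left
            padded[1:-1, 2:],    # value of the cell to the right
        ])
        value = reward + gamma * candidates.max(axis=0)
        frames.append(value)
    return frames


# Tiny example: a 3x3 grid with a single rewarding cell in the top-left corner.
reward = np.zeros((3, 3))
reward[0, 0] = 1.0
frames = value_iteration_frames(reward, k=3, gamma=0.5)
```

In this sketch, value spreads outward from the rewarding cell by one cell per sweep, so a cell several steps away from the goal only receives a non-zero value after that many iterations.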

Dependencies

This repository requires the following packages:

Datasets

Each data sample consists of the (x, y) coordinates of the current state in the grid world, followed by an obstacle image and a goal image.

| Dataset size | 8x8 | 16x16 | 28x28 |
| --- | --- | --- | --- |
| Train set | 77760 | 776440 | 4510695 |
| Test set | 12960 | 129440 | 751905 |
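The sample layout described above can be sketched as follows. This is an illustration only: it assumes each sample is stored as one flat vector `[x, y, obstacle pixels, goal pixels]`, which may not match how the `.npz` files in `data/` are actually organized, and `split_sample` is a hypothetical helper rather than a function from this repository.

```python
import numpy as np


def split_sample(flat, imsize):
    """Split one flat sample into ((x, y), obstacle image, goal image).

    Assumed layout (not confirmed by the repository):
    [x, y, imsize*imsize obstacle pixels, imsize*imsize goal pixels]
    """
    flat = np.asarray(flat)
    n = imsize * imsize
    if flat.size != 2 + 2 * n:
        raise ValueError(f"expected {2 + 2 * n} values, got {flat.size}")
    x, y = int(flat[0]), int(flat[1])
    obstacle = flat[2:2 + n].reshape(imsize, imsize)
    goal = flat[2 + n:].reshape(imsize, imsize)
    return (x, y), obstacle, goal


# Example with a 2x2 grid: state (1, 0), one obstacle pixel, one goal pixel.
sample = [1, 0,  0, 1, 0, 0,  0, 0, 0, 1]
coords, obstacle, goal = split_sample(sample, imsize=2)
```

Under this assumed layout, a sample for an `imsize` of 8 would hold 2 + 2 * 64 = 130 values.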

Running Experiment: Training

Grid world 8x8

python run.py --datafile data/gridworld_8x8.npz --imsize 8 --lr 0.005 --epochs 30 --k 10 --batch_size 128

Grid world 16x16

python run.py --datafile data/gridworld_16x16.npz --imsize 16 --lr 0.008 --epochs 30 --k 20 --batch_size 128

Grid world 28x28

python run.py --datafile data/gridworld_28x28.npz --imsize 28 --lr 0.003 --epochs 30 --k 36 --batch_size 128

Flags:

Benchmarks

GPU: TITAN X

Performance: Test Accuracy

NOTE: This is the accuracy on the test set. It differs from the table in the paper, which reports the success rate from rollouts of the learned policy in the environment.

| Test Accuracy | 8x8 | 16x16 | 28x28 |
| --- | --- | --- | --- |
| TensorFlow | 99.03% | 90.2% | 82% |
| PyTorch | 99.16% | 92.44% | 88.20% |
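The difference between the two metrics mentioned in the note can be sketched as below. Both functions are hypothetical and not taken from this repository; the 4-action move set and the `free` mask (True where a cell is not an obstacle) are assumptions made for the example. Test accuracy scores each predicted action independently, while a rollout only succeeds if every step along the trajectory is good enough to reach the goal.

```python
import numpy as np

# Assumed action set as (row, col) offsets: up, down, left, right.
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def action_accuracy(predicted, labels):
    """Fraction of samples whose predicted action matches the label."""
    return float(np.mean(np.asarray(predicted) == np.asarray(labels)))


def rollout_succeeds(policy, start, goal, free, max_steps):
    """Follow policy from start; succeed only if goal is reached without leaving free cells."""
    pos = start
    for _ in range(max_steps):
        if pos == goal:
            return True
        dr, dc = MOVES[policy(pos)]
        nxt = (pos[0] + dr, pos[1] + dc)
        in_bounds = 0 <= nxt[0] < free.shape[0] and 0 <= nxt[1] < free.shape[1]
        if not in_bounds or not free[nxt]:
            return False  # left the grid or hit an obstacle
        pos = nxt
    return pos == goal


# Example: three of four actions match the labels.
accuracy = action_accuracy([0, 1, 2, 3], [0, 1, 2, 0])

# Example: a policy that always moves right on an obstacle-free 3x3 grid.
free = np.ones((3, 3), dtype=bool)
always_right = lambda pos: 3
reached_same_row = rollout_succeeds(always_right, (0, 0), (0, 2), free, max_steps=5)
reached_other_row = rollout_succeeds(always_right, (0, 0), (2, 2), free, max_steps=5)
```

Because a single wrong action can end a rollout, the rollout success rate can be lower than per-sample accuracy for the same model.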

Speed with GPU

| Speed per epoch | 8x8 | 16x16 | 28x28 |
| --- | --- | --- | --- |
| TensorFlow | 4s | 25s | 165s |
| PyTorch | 3s | 15s | 100s |

Frequently Asked Questions

References

Further Readings