PPO PyTorch C++

This is an implementation of the Proximal Policy Optimization (PPO) algorithm for the C++ API of PyTorch (libtorch). It uses a simple TestEnvironment to test the algorithm. Below is a small visualization of the environment the algorithm is tested in. <br>

<figure> <p align="center"><img src="img/test_mode.gif" width="50%" height="50%" hspace="0"></p> <figcaption>Fig. 1: The agent in testing mode. </figcaption> </figure> <br><br>
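At the heart of PPO is the clipped surrogate objective. The snippet below is a minimal, self-contained sketch of how that loss can be written with the libtorch tensor API; the variable names and random placeholder inputs are purely illustrative and are not taken from this repository.

```cpp
#include <torch/torch.h>
#include <iostream>

int main() {
    // Illustrative placeholders: log-probabilities of the taken actions
    // under the new and the old policy, and the estimated advantages.
    torch::Tensor new_log_probs = torch::randn({8});
    torch::Tensor old_log_probs = torch::randn({8});
    torch::Tensor advantages    = torch::randn({8});

    double clip_eps = 0.2;  // common default for the PPO clip range

    // Probability ratio r_t = pi_new(a|s) / pi_old(a|s).
    torch::Tensor ratio = torch::exp(new_log_probs - old_log_probs);

    // Clipped surrogate objective: elementwise minimum of the unclipped
    // and the clipped term, negated so it can be minimized.
    torch::Tensor surr1 = ratio * advantages;
    torch::Tensor surr2 = torch::clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages;
    torch::Tensor loss  = -torch::min(surr1, surr2).mean();

    std::cout << loss << std::endl;
}
```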

Build

You first need to install PyTorch. For a clean installation from Anaconda, check out this short tutorial, or this tutorial to install only the binaries.

Do

```shell
mkdir build
cd build
cmake -DCMAKE_PREFIX_PATH=/absolute/path/to/libtorch ..
make
```
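If CMake fails to find libtorch, double-check the path passed to CMAKE_PREFIX_PATH. As a quick sanity check that libtorch itself is installed and links correctly, a minimal program like the following (not part of this repository) can be built against it:

```cpp
#include <torch/torch.h>
#include <iostream>

int main() {
    // If this builds and prints a 2x3 tensor, libtorch is set up correctly.
    torch::Tensor t = torch::rand({2, 3});
    std::cout << t << std::endl;
}
```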

Run

Run the executable with

```shell
cd build
./train_ppo
```

To plot the results, run

```shell
cd ..
python plot.py --online_view --csv_file data/data.csv --epochs 1 10
```

It should produce something like what is shown below. <br>

<figure> <p align="center"><img src="img/epoch_1.gif" width="50%" height="50%" hspace="0"><img src="img/epoch_10.gif" width="50%" height="50%" hspace="0"></p> <figcaption>Fig. 2: From left to right: the agent in training mode at successive epochs, taking actions in the environment to reach the goal. </figcaption> </figure> <br><br>

Once trained, the algorithm can also be used in test mode. To do so, run

```shell
cd build
./test_ppo
```
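Test mode presumably reloads the weights saved during training before rolling out the policy. The following is a rough sketch of how such a reload can look with libtorch's serialization API; the network architecture and the file name models/actor.pt are assumptions for illustration, not taken from this repository.

```cpp
#include <torch/torch.h>
#include <iostream>

int main() {
    // Hypothetical policy network; the real architecture lives in the repo.
    torch::nn::Sequential policy(
        torch::nn::Linear(4, 64),
        torch::nn::Tanh(),
        torch::nn::Linear(64, 2));

    // Restore the parameters saved by the training run (assumed file name).
    torch::load(policy, "models/actor.pt");
    policy->eval();  // switch modules such as dropout to evaluation behavior

    // Greedy action for a dummy observation, without tracking gradients.
    torch::NoGradGuard no_grad;
    torch::Tensor obs = torch::zeros({1, 4});
    torch::Tensor action = policy->forward(obs).argmax(1);
    std::cout << action << std::endl;
}
```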

To plot the results, run

```shell
cd ..
python plot.py --online_view --csv_file data/data_test.csv --epochs 1
```

Visualization

The results are saved to data/data.csv (data/data_test.csv in test mode) and can be visualized by running python plot.py. Run

```shell
python plot.py --help
```

for help.