PPO-PyTorch

UPDATE [April 2021] :

Open PPO_colab.ipynb in Google Colab

Introduction

This repository provides a minimal PyTorch implementation of Proximal Policy Optimization (PPO) with a clipped objective for OpenAI Gym environments. It is intended primarily for beginners in reinforcement learning who want to understand the PPO algorithm. It can also be used for more complex environments, though that may require some hyperparameter tuning or changes to the code. A concise explanation of the PPO algorithm can be found here, and a thorough explanation of all the details needed for a best-performing PPO implementation can be found here (not all of them are implemented in this repo yet).
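The core of the clipped objective mentioned above can be sketched in a few lines of PyTorch. This is a simplified illustration, not the repository's exact code: the function name `ppo_clipped_loss` and its arguments are hypothetical, and it omits the value-function and entropy terms a full PPO update also uses.

```python
import torch

def ppo_clipped_loss(new_logprobs, old_logprobs, advantages, eps_clip=0.2):
    # Probability ratio r_t(theta) = pi_theta(a|s) / pi_theta_old(a|s),
    # computed in log space for numerical stability.
    ratios = torch.exp(new_logprobs - old_logprobs.detach())

    # Unclipped and clipped surrogate objectives.
    surr1 = ratios * advantages
    surr2 = torch.clamp(ratios, 1 - eps_clip, 1 + eps_clip) * advantages

    # PPO maximizes the minimum of the two surrogates; negate it to get
    # a loss suitable for gradient descent.
    return -torch.min(surr1, surr2).mean()
```

Taking the elementwise minimum of the clipped and unclipped surrogates removes the incentive to move the new policy far from the old one in a single update, which is what keeps PPO updates stable.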

To keep the training procedure simple:

Usage

Note:

Citing

Please use this BibTeX entry if you want to cite this repository in your publications:

@misc{pytorch_minimal_ppo,
    author = {Barhate, Nikhil},
    title = {Minimal PyTorch Implementation of Proximal Policy Optimization},
    year = {2021},
    publisher = {GitHub},
    journal = {GitHub repository},
    howpublished = {\url{https://github.com/nikhilbarhate99/PPO-PyTorch}},
}

Results

PPO Continuous RoboschoolHalfCheetah-v1
PPO Continuous RoboschoolHopper-v1
PPO Continuous RoboschoolWalker2d-v1
PPO Continuous BipedalWalker-v2
PPO Discrete CartPole-v1
PPO Discrete LunarLander-v2

Dependencies

Trained and Tested on:

Python 3
PyTorch
NumPy
gym

Training Environments

Box-2d
Roboschool
pybullet

Graphs and gifs

pandas
matplotlib
Pillow
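The dependencies listed above can typically be installed with pip. This is a sketch, not an official install guide: exact versions are not pinned in this README, the pip package names for the Box2D and PyBullet environments (`box2d-py`, `pybullet`) are assumptions, and Roboschool is deprecated upstream, so it may not install cleanly on recent Python versions.

```shell
# Core dependencies for training and testing (names assumed from the list above)
pip install torch numpy gym

# Optional: environment backends (assumed package names; Roboschool is deprecated)
pip install box2d-py pybullet

# Optional: plotting graphs and making gifs
pip install pandas matplotlib Pillow
```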

References