Dyna Gym

This pip package implements Reinforcement Learning algorithms for non-stationary environments supported by the <a href="https://gym.openai.com/">OpenAI Gym</a> toolkit. It contains both the dynamic environments, i.e. environments whose transition and reward functions depend on time, and implementations of several algorithms.
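To make the notion of a non-stationary environment concrete, here is a minimal, self-contained sketch in plain Python. The class `ToyNSChain` and all of its methods are hypothetical and are not part of this package; they only illustrate a transition function and a reward function that both take the time step as an argument.

```python
import random


class ToyNSChain:
    """Hypothetical two-state environment (not part of dyna_gym).

    Both the transition function and the reward function depend on time t.
    """

    def __init__(self, horizon=10, seed=0):
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.t = 0
        self.state = 0

    def slip_probability(self, t):
        # Time-dependent transition: the chance that an action fails grows with t.
        return min(1.0, t / self.horizon)

    def reward(self, state, t):
        # Time-dependent reward: state 1 is worth less as time passes.
        return max(0.0, 1.0 - t / self.horizon) if state == 1 else 0.0

    def reset(self):
        self.t = 0
        self.state = 0
        return self.state

    def step(self, action):
        # The action is the intended next state (0 or 1).
        if self.rng.random() < self.slip_probability(self.t):
            next_state = self.state  # slip: the agent stays where it is
        else:
            next_state = action
        self.t += 1
        self.state = next_state
        done = self.t >= self.horizon
        return next_state, self.reward(next_state, self.t), done
```

The same action taken in the same state can therefore lead to a different outcome distribution and a different reward depending on when it is taken, which is the property the environments of this package share.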

Environments

The implemented environments are listed below and can be found at dyna-gym/dyna_gym/envs. For each environment, the id to pass to the gym.make function is written in bold.

<p align="center"> <img height="100" width="auto" src="img/nsfrozenlake.gif"> </p>
<p align="center"> <b>NSFrozenLakeEnv-v0</b> environment. The probability distribution of a next state's transition depends on time. </p>
<p align="center"> <img height="250" width="auto" src="img/cartpole_nstransition.gif"> </p>
<p align="center"> Cart pole in the <b>NSCartPole-v0</b> environment. The red bar indicates the direction of the gravitational force. </p>
<p align="center"> <img height="250" width="auto" src="img/cartpole_nsreward1.gif"> </p>
<p align="center"> Cart pole in the <b>NSCartPole-v1</b> environment. The two red dots correspond to the limiting interval. </p>
<p align="center"> <img height="250" width="auto" src="img/cartpole_nsreward2.gif"> </p>
<p align="center"> Cart pole in the <b>NSCartPole-v2</b> environment. The two black lines correspond to the limiting angle interval. </p>
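As a rough sketch of how these ids might be used, the helper below wraps gym.make. The ids are the ones written in bold in the captions above; the helper `make_env` is hypothetical, and it assumes that importing dyna_gym registers the ids with gym, which should be checked against the scripts in the example/ folder.

```python
# Ids taken from the captions above.
ENV_IDS = [
    "NSFrozenLakeEnv-v0",
    "NSCartPole-v0",
    "NSCartPole-v1",
    "NSCartPole-v2",
]


def make_env(env_id):
    """Hypothetical helper: create one of the environments by id.

    Requires gym and dyna_gym to be installed.
    """
    if env_id not in ENV_IDS:
        raise ValueError("Unknown environment id: " + env_id)
    # Imported lazily so the id list can be inspected without the packages.
    import gym
    import dyna_gym  # assumption: this import registers the ids with gym
    return gym.make(env_id)
```

With the package installed, `env = make_env("NSCartPole-v0")` would then return the cart pole environment with the time-dependent gravitational force.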

Algorithms

The implemented algorithms can be found at dyna-gym/dyna_gym/agents.

Installation

Type the following commands in order to install the package:

cd dyna-gym
pip install -e .

Examples are provided in the example/ directory. You can run them using your installed version of Python.

Dependencies

Edited 12 June 2018.

The package depends on several classic Python libraries. The up-to-date list is: copy; csv; gym; itertools; logging; math; matplotlib; numpy; random; setuptools; statistics.

Less common libraries are also used by some algorithms: scikit-learn (see <a href="http://scikit-learn.org/stable/index.html">website</a>); LWPR (see <a href="https://github.com/lhlmgr/lwpr">git repository</a> for a Python 3 binding).
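One possible way to check which of these dependencies are available in the current Python environment is sketched below. The function `missing_modules` is a hypothetical helper, not part of the package. The import name `sklearn` is the module provided by scikit-learn; the import name `lwpr` for the LWPR binding is an assumption.

```python
import importlib.util

# Libraries listed above.
REQUIRED = [
    "copy", "csv", "gym", "itertools", "logging", "math",
    "matplotlib", "numpy", "random", "setuptools", "statistics",
]
# Only needed by some algorithms; "lwpr" is an assumed import name.
OPTIONAL = ["sklearn", "lwpr"]


def missing_modules(names):
    """Return the names for which no importable module can be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Missing required libraries:", missing_modules(REQUIRED))
    print("Missing optional libraries:", missing_modules(OPTIONAL))
```

An empty list for the required libraries means the examples should at least be able to import everything they need.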