AgentNet
A lightweight library to build and train deep reinforcement learning and custom recurrent networks using Theano+Lasagne
What is AgentNet?
<img src='http://s33.postimg.org/ytx63kwcv/whatis_agentnet_png.png' alt='agentnet structure' title='agentnet structure' width=600 />
No time to play games? Let machines do this for you!
AgentNet is a deep reinforcement learning framework, which is designed for ease of research and prototyping of Deep Learning models for Markov Decision Processes.
All techno-babble set aside, you can use it to train your pet neural network to play games! [e.g. OpenAI Gym]
AgentNet fully supports the Lasagne deep learning library, giving you access to all of its convolutions, maxouts, poolings, dropouts, etc.
AgentNet handles both discrete and continuous control problems and supports hierarchical reinforcement learning [experimental].
List of already implemented reinforcement techniques:
- Q-learning (or deep Q-learning, since we support arbitrary complexity of network)
- N-step Q-learning
- SARSA
- N-step Advantage Actor-Critic (A2C)
- N-step Deterministic Policy Gradient (DPG)
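To make the value-based entries in the list above more concrete, here is a minimal pure-Python sketch of how N-step targets are usually computed. It does not use AgentNet's API; the function names (`n_step_targets`, `greedy_values`, `taken_action_values`, `n_step_advantages`) are hypothetical and chosen for illustration only. The only difference between Q-learning, SARSA and advantage actor-critic in this sketch is which value estimate is used to bootstrap the return.

```python
def n_step_targets(rewards, bootstrap_values, gamma=0.99, n_steps=1):
    """N-step return targets for one session of length T.

    rewards[t]          -- reward received after acting at tick t (length T)
    bootstrap_values[t] -- value estimate for the state at tick t (length T + 1);
                           the last entry should be 0 if the session terminated.
    Near the end of the session the horizon is truncated to the ticks available.
    """
    T = len(rewards)
    if len(bootstrap_values) != T + 1:
        raise ValueError("bootstrap_values must have one more entry than rewards")
    targets = []
    for t in range(T):
        horizon = min(n_steps, T - t)
        # discounted sum of the next `horizon` rewards
        ret = sum(gamma ** k * rewards[t + k] for k in range(horizon))
        # bootstrap from the value estimate `horizon` ticks ahead
        ret += gamma ** horizon * bootstrap_values[t + horizon]
        targets.append(ret)
    return targets


def greedy_values(q_values):
    """Q-learning bootstrap: value of the best action in each state."""
    return [max(row) for row in q_values]


def taken_action_values(q_values, actions):
    """SARSA bootstrap: value of the action that was actually taken."""
    return [row[a] for row, a in zip(q_values, actions)]


def n_step_advantages(rewards, state_values, gamma=0.99, n_steps=1):
    """Actor-critic advantage: N-step target minus the critic's V(s_t)."""
    targets = n_step_targets(rewards, state_values, gamma, n_steps)
    # zip stops after T items, so the extra final state value is ignored
    return [g - v for g, v in zip(targets, state_values)]
```

With `n_steps=1` and `greedy_values` as the bootstrap this reduces to the ordinary one-step Q-learning target `r + gamma * max_a Q(s', a)`.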
As a side-quest, we also provide boilerplate for custom long-term memory network architectures (see examples).
Installation
Try without installing
- If you use other similar tools, see Repo with a dockerfile
Quick install
- Install the bleeding-edge version of Lasagne
- [sudo] pip install --upgrade https://github.com/yandexdataschool/AgentNet/archive/master.zip
Full install (with examples)
- Clone this repository:
git clone https://github.com/yandexdataschool/AgentNet.git && cd AgentNet
- Install dependencies:
pip install -r requirements.txt
- Install library itself:
pip install -e .
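After either install route, a quick way to confirm that the packages are visible to your Python interpreter is to look them up without importing them. The sketch below uses only the standard library; the helper name `missing_modules` is hypothetical, and the import names `theano`, `lasagne` and `agentnet` are assumed to match the packages mentioned above.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found on this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names for the packages this README relies on.
    missing = missing_modules(["theano", "lasagne", "agentnet"])
    print("missing:", missing or "nothing")
```

An empty result only means the modules can be located; it does not check that their versions are compatible with each other.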
Docker container
On Windows/OSX install Docker Kitematic, then simply run justheuristic/agentnet container and click on 'web preview'.
On other linux/unix systems:
- install Docker,
- make sure docker daemon is running (sudo service docker start)
- make sure no application is using port 1234 (this is the default port that can be changed)
[sudo] docker run -d -p 1234:8888 justheuristic/agentnet
- Access from browser via localhost:1234
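If you are unsure whether port 1234 is already taken, the check can be scripted before starting the container. This is a minimal standard-library sketch, not part of AgentNet; the helper name `port_is_free` is hypothetical, and it only tests whether something accepts TCP connections on the local loopback address.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 when the connection succeeds, i.e. the port is in use
        return sock.connect_ex((host, port)) != 0


if __name__ == "__main__":
    print("port 1234 is free:", port_is_free(1234))
```

If the port is taken, change the first number in the `-p 1234:8888` mapping to another free port and open that port in the browser instead.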
Documentation and tutorials
A quick dive-in can be found here:
- Click
- classwork.ipynb = your tutorial
- classwork_solution.ipynb = a fully implemented version with a simple CNN for reference
(incomplete) Documentation pages can be found here.
AgentNet also has full embedded documentation, so calling help(some_function_or_object) or pressing shift+tab in IPython yields a description of object/function.
The standard pipeline of an AgentNet experiment is shown in the following examples:
- Simple Deep Recurrent Reinforcement Learning setup
- Most basic demo, if a bit boring. Covers the problem of learning "If X1 then Y1 else Y2".
- Uses a single RNN memory and Q-learning algorithm
- Playing Atari SpaceInvaders with Convolutional NN via OpenAI Gym
- Step-by-step explanation of what you need to do to recreate DeepMind Atari DQN
- Written in a generic way, so that adding recurrent memory or changing learning algorithm could be done in a couple of lines
Advanced examples
If you wish to get acquainted with the current library state, view some of the ./examples
- Playing Atari with Convolutional NN via OpenAI Gym
- Can switch to any visual game thanks to awesome Gym interface
- Very simplistic and non-recurrent, so it suffers from Atari flickering, etc.
- Deep Recurrent Kung-Fu training with GRUs and actor-critic
- Uses the "Playing atari" example with minor changes
- Trains via Advantage actor-critic (value+policy-based)
- Simple Deep Recurrent Reinforcement Learning setup
- Trying to guess the interconnected hidden factors on a synthetic problem setup
- Stack-augmented GRU generator
- Reproducing http://arxiv.org/abs/1503.01007 with less code
- MOAR deep recurrent value-based RL for Wikipedia facts guessing
- Trying to figure out a policy for guessing musician attributes (genres, decades active, instruments, etc.)
- Using several hidden layers and 3-step Q-learning
- More to come
AgentNet is under active construction, so expect things to change. If you wish to join the development, we'd be happy to accept your help.