
xpag ("exploring agents") is a modular reinforcement learning library with JAX agents, currently in beta version.


## Install

<details><summary>Option 1: conda (preferred option)</summary> <p>

This option is preferred because it relies mainly on conda-forge packages, which among other things simplifies the installation of JAX.

```bash
git clone https://github.com/perrin-isir/xpag.git
cd xpag
conda update conda
```

Install micromamba if you don't already have it. (You can also simply use conda, replacing the `micromamba create`, `micromamba update` and `micromamba activate` commands below with `conda env create`, `conda env update` and `conda activate` respectively, but the installation will be significantly slower.)

```bash
conda install -c conda-forge micromamba
```

Choose an environment name, for instance `xpagenv`.
The following command creates the `xpagenv` environment with the requirements listed in `environment.yaml`:

```bash
micromamba create --name xpagenv --file environment.yaml
```

If you prefer to update an existing environment (`existing_env`):

```bash
micromamba update --name existing_env --file environment.yaml
```

Then, activate the `xpagenv` environment:

```bash
micromamba activate xpagenv
```
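
If you opted for plain conda rather than micromamba, the conda equivalents of the three commands above (following the substitutions noted earlier) are:

```bash
conda env create --name xpagenv --file environment.yaml
conda env update --name existing_env --file environment.yaml
conda activate xpagenv
```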

Finally, install the xpag library in the activated environment:

```bash
pip install -e .
```
</p> </details> <details><summary>Option 2: pip</summary> <p>

For the pip install, you need to install JAX properly yourself. If JAX is instead installed automatically as a pip dependency of xpag, it will probably not work as desired (e.g. it will not be GPU-compatible). So you should install it beforehand, following these guidelines:

https://github.com/google/jax#installation
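
For example, at the time of writing, commands like the following install JAX; the exact package names evolve, so check the page above for the current ones:

```bash
pip install -U jax            # CPU-only version
pip install -U "jax[cuda12]"  # version with NVIDIA GPU support (CUDA 12)
```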

Then, install xpag with:

```bash
pip install xpag
```
</p> </details> <details><summary>JAX</summary> <p>

To verify that the JAX installation went well, check the backend used by JAX with the following command:

```bash
python -c "import jax; print(jax.lib.xla_bridge.get_backend().platform)"
```

It will print "cpu", "gpu" or "tpu" depending on the platform JAX is using.

</p> </details> <details><summary>Tutorials</summary> <p>

The tutorials require some additional libraries that are not required by xpag itself (see the xpag-tutorials repository below).

</p> </details>

## Tutorials

The xpag-tutorials repository contains a list of tutorials (Colab notebooks) for xpag:
https://github.com/perrin-isir/xpag-tutorials


## Short documentation

<details><summary><B><I>xpag</I>: a platform for RL, goal-conditioned RL, and more.</B></summary>

xpag allows standard reinforcement learning, but it has been designed with goal-conditioned reinforcement learning (GCRL) in mind (check out the `train_gmazes.ipynb` tutorial for a simple example of GCRL).

In GCRL, agents take a goal as part of their input, and the reward depends mainly on the degree of achievement of that goal.

Beyond the usual modules of RL platforms (environment, agent, buffer/sampler), xpag introduces a module called the "setter" which, among other things, can help to set and manage goals (for example, modifying the goal several times in a single episode). Although the setter is largely similar to an environment wrapper, it is kept separate from the environment because in some cases it should be considered an independent entity (e.g. a teacher), or a part of the agent itself.
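
As an illustration only, here is a minimal sketch of the kind of goal management a setter can perform. The class and method names below are hypothetical, invented for this example, and do not reflect xpag's actual setter API (see the code in `xpag/setters/` for the real interface):

```python
import numpy as np

class RandomGoalResampler:
    """Hypothetical setter-like object: at each step, with probability
    resample_prob, it replaces the current desired goal with a new goal
    drawn uniformly from the goal space."""

    def __init__(self, goal_low, goal_high, resample_prob=0.05, seed=0):
        self.goal_low = np.asarray(goal_low)
        self.goal_high = np.asarray(goal_high)
        self.resample_prob = resample_prob
        self.rng = np.random.default_rng(seed)

    def process_observation(self, observation):
        # observation is assumed to be a goal-conditioned dict with a
        # "desired_goal" entry (as in the GoalEnv convention).
        if self.rng.random() < self.resample_prob:
            observation["desired_goal"] = self.rng.uniform(
                self.goal_low, self.goal_high
            )
        return observation
```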

xpag relies on a single reinforcement learning loop (the `learn()` function in `xpag/tools/learn.py`) in which the environment, the agent, the buffer and the setter interact (see below). The first 3 arguments of `learn()` are the values returned by `gym_vec_env()` and `brax_vec_env()`.

`learn()` also takes as input the agent, the buffer and the setter, as well as various parameters. Detailed information about the arguments of `learn()` can be found in the code documentation (check `xpag/tools/learn.py`).
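
To give a rough idea of how these pieces fit together, here is a sketch of a training script in the style of the xpag tutorials. The module paths, class names and `learn()` keyword arguments below are assumptions based on the tutorials, so check them against the actual code and notebooks:

```python
from xpag.agents import SAC
from xpag.buffers import DefaultEpisodicBuffer
from xpag.samplers import DefaultEpisodicSampler
from xpag.setters import DefaultSetter
from xpag.tools import learn
from xpag.wrappers import gym_vec_env

# Create 10 parallel training environments, matching evaluation
# environments, and a dictionary of information about them:
env, eval_env, env_info = gym_vec_env("HalfCheetah-v4", 10)

# An off-policy agent, an episodic buffer with its sampler, and the
# default setter:
agent = SAC(env_info["observation_dim"], env_info["action_dim"], {})
buffer = DefaultEpisodicBuffer(
    max_episode_steps=env_info["max_episode_steps"],
    buffer_size=1_000_000,
    sampler=DefaultEpisodicSampler(),
)
setter = DefaultSetter()

# Run the single RL loop; the keyword arguments shown are examples,
# not an exhaustive or authoritative list:
learn(
    env, eval_env, env_info,
    agent, buffer, setter,
    batch_size=256,
    max_steps=1_000_000,
)
```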

The components that interact during learning are:

<details><summary><B>the environment (env)</B></summary>

In xpag, environments must allow parallel rollouts, and xpag keeps the same API even in the case of a single rollout, i.e. when the number of "parallel environments" is 1. Basically, all environments are "vector environments".
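
Concretely, reusing the (assumed) `gym_vec_env()` wrapper from the sketch above, a single rollout is simply a vector environment of size 1, and observations keep their batch dimension:

```python
from xpag.wrappers import gym_vec_env

# A "vector environment" with a single parallel rollout:
env, eval_env, env_info = gym_vec_env("HalfCheetah-v4", 1)
obs, info = env.reset()
# obs is expected to be batched, with shape (1, observation_dim)
```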

</details> <details><summary><B>the agent (agent)</B></summary>

xpag only considers off-policy agents. (TODO)

</details> <details><summary><B>the buffer (buffer)</B></summary> TODO </details> <details><summary><B>the sampler (sampler)</B></summary> TODO </details> <details><summary><B>the setter (setter)</B></summary> TODO </details>

The figure below summarizes the RL loop and the interactions between the components: (TODO)

</details>

## Acknowledgements


## Citing the project

To cite this repository in publications:

```bibtex
@misc{xpag,
  author = {Perrin-Gilbert, Nicolas},
  title = {xpag: a modular reinforcement learning library with JAX agents},
  year = {2022},
  url = {https://github.com/perrin-isir/xpag}
}
```