<div id="top"></div>

# Awesome Exploration Methods in Reinforcement Learning

Updated on 2024.11.29

<p align="center"> <img src="./assets/minigrid_hard_exploration.png" alt="minigrid_hard_exploration" width="40%" height="40%" /><br> <em style="display: inline-block;">A typical hard-exploration environment: MiniGrid-ObstructedMaze-Full-v0.</em> </p>

## Table of Contents

## A Taxonomy of Exploration RL Methods

<details open> <summary>(Click to Collapse)</summary>

In general, the reinforcement learning process can be divided into two phases: a collect phase and a train phase. In the collect phase, the agent chooses actions according to its current policy and interacts with the environment to gather useful experience. In the train phase, the agent uses the collected experience to update the current policy and obtain a better-performing one.
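The alternation between the two phases can be sketched with a toy tabular agent. Everything below (the chain environment, the epsilon value, the Q-learning update) is an illustrative assumption, not a prescription:

```python
import random

ACTIONS = (1, -1)  # move right / move left on a short chain

# Toy chain environment: states 0..3; reaching state 3 gives reward 1 and ends the episode.
def env_reset():
    return 0

def env_step(state, action):
    next_state = min(max(state + action, 0), 3)
    reward = 1.0 if next_state == 3 else 0.0
    return next_state, reward, next_state == 3

def collect_phase(q_table, num_episodes=5, epsilon=0.5, max_steps=30):
    """Collect phase: act with the current (epsilon-greedy) policy and store transitions."""
    buffer = []
    for _ in range(num_episodes):
        state = env_reset()
        for _ in range(max_steps):
            if random.random() < epsilon:
                action = random.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q_table[(state, a)])
            next_state, reward, done = env_step(state, action)
            buffer.append((state, action, reward, next_state, done))
            state = next_state
            if done:
                break
    return buffer

def train_phase(q_table, buffer, alpha=0.5, gamma=0.9):
    """Train phase: improve the policy (here, tabular Q-learning) from collected experience."""
    for s, a, r, s2, done in buffer:
        target = r if done else r + gamma * max(q_table[(s2, b)] for b in ACTIONS)
        q_table[(s, a)] += alpha * (target - q_table[(s, a)])
    return q_table

q = {(s, a): 0.0 for s in range(4) for a in ACTIONS}
for _ in range(20):  # alternate the two phases
    q = train_phase(q, collect_phase(q))
```

Exploration methods differ in which of these two phases they modify, which is the basis of the taxonomy below.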

According to the phase in which the exploration component is explicitly applied, we divide exploration RL methods into two main categories: Augmented Collecting Strategy and Augmented Training Strategy.

Note that these categories may overlap, and an algorithm may belong to several of them. For other detailed surveys on exploration methods in RL, you can refer to Tianpei Yang et al. and Susan Amin et al.
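A minimal sketch of the distinction between the two categories (both helper functions, the count-based bonus form, and the beta coefficient are illustrative assumptions):

```python
import math
import random

# Augmented collecting strategy: change how actions are chosen at interaction
# time, e.g. an epsilon-greedy perturbation of the greedy action.
def eps_greedy_action(action_values, epsilon=0.1):
    if random.random() < epsilon:
        return random.randrange(len(action_values))
    return max(range(len(action_values)), key=lambda a: action_values[a])

# Augmented training strategy: leave action selection alone and reshape the
# learning signal instead, e.g. add a count-based intrinsic bonus
# beta / sqrt(N(s)) to the extrinsic reward before the policy update.
def augmented_reward(extrinsic_reward, visit_count, beta=0.1):
    return extrinsic_reward + beta / math.sqrt(max(visit_count, 1))
```

The first function only affects what experience is collected; the second only affects what the agent learns from it, which is why a single algorithm can combine both.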

<center> <figure> <img style="border-radius: 0.3125em; box-shadow: 0 2px 4px 0 rgba(34,36,38,.12),0 2px 10px 0 rgba(34,36,38,.08);" src="./assets/erl_taxonomy.png" width=100% height=100%> <br> <figcaption align = "center"><b>A non-exhaustive, but useful taxonomy of methods in Exploration RL. We provide some example methods for each of the different categories, shown in blue area above. </b></figcaption> </figure> </center>

Here are the links to the papers that appeared in the taxonomy:

[1] Go-Explore: Adrien Ecoffet et al., 2021
[2] NoisyNet: Meire Fortunato et al., 2018
[3] DQN-PixelCNN: Marc G. Bellemare et al., 2016
[4] #Exploration: Haoran Tang et al., 2017
[5] EX2: Justin Fu et al., 2017
[6] ICM: Deepak Pathak et al., 2018
[7] RND: Yuri Burda et al., 2018
[8] NGU: Adrià Puigdomènech Badia et al., 2020
[9] Agent57: Adrià Puigdomènech Badia et al., 2020
[10] VIME: Rein Houthooft et al., 2016
[11] EMI: Wang et al., 2019
[12] DIAYN: Benjamin Eysenbach et al., 2019
[13] SAC: Tuomas Haarnoja et al., 2018
[14] Bootstrapped DQN: Ian Osband et al., 2016
[15] PSRL: Ian Osband et al., 2013
[16] HER: Marcin Andrychowicz et al., 2017
[17] DQfD: Todd Hester et al., 2018
[18] R2D3: Caglar Gulcehre et al., 2019
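As a concrete example from the taxonomy, curiosity-style methods such as RND [7] derive an intrinsic reward from the prediction error of a trained network against a fixed, randomly initialised target. A minimal NumPy sketch, in which both "networks" are simply linear maps and all dimensions and the learning rate are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
obs_dim, feat_dim = 8, 16

# Frozen, randomly initialised target network (never trained) and a
# trainable predictor network; both are linear maps here for brevity.
W_target = rng.normal(size=(obs_dim, feat_dim))
W_pred = np.zeros((obs_dim, feat_dim))

def intrinsic_reward(obs):
    """RND-style bonus: squared prediction error of predictor vs. frozen target."""
    err = obs @ W_pred - obs @ W_target
    return float(np.mean(err ** 2))

def train_predictor(obs, lr=0.05):
    """One gradient step shrinking the prediction error on a visited observation."""
    global W_pred
    err = obs @ W_pred - obs @ W_target
    W_pred -= lr * np.outer(obs, err)  # gradient of the squared error, up to a constant

familiar = rng.normal(size=obs_dim)
for _ in range(300):          # the agent sees this observation often...
    train_predictor(familiar)
novel = rng.normal(size=obs_dim)  # ...and this one never
```

After training, a frequently visited observation yields a near-zero bonus while a never-seen one still produces a large prediction error, so the bonus steers the agent toward novel states.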

</details>

## Papers

```
format:
- [title](paper link) (presentation type, openreview score [if the score is public])
  - author1, author2, author3, ...
  - Key: key problems and insights
  - ExpEnv: experiment environments
```

### NeurIPS 2024

<details open> <summary>(Click to Collapse)</summary> </details>

### ICML 2024

<details open> <summary>(Click to Collapse)</summary> </details>

### ICLR 2024

<details open> <summary>(Click to Collapse)</summary> </details>

### NeurIPS 2023

<details open> <summary>(Click to Collapse)</summary> </details>

### ICML 2023

<details open> <summary>(Click to Collapse)</summary> </details>

### ICLR 2023

<details open> <summary>(Click to Collapse)</summary> </details>

### NeurIPS 2022

<details open> <summary>(Click to Collapse)</summary> </details>

### ICML 2022

<details open> <summary>(Click to Collapse)</summary> </details>

### ICLR 2022

<details open> <summary>(Click to Collapse)</summary> </details>

### NeurIPS 2021

<details open> <summary>(Click to Collapse)</summary> </details>

### Classic Exploration RL Papers

<details open> <summary>(Click to Collapse)</summary> <!-- - [How can we define intrinsic motivation?](https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.567.6524&rep=rep1&type=pdf) *Conf. on Epigenetic Robotics, 2008* - Pierre-Yves Oudeyer, Frederic Kaplan. - Key: intrinsic motivation - ExpEnv: None --> </details>

## Contributing

Our purpose is to provide a starting paper guide for those who are interested in exploration methods in RL. If you would like to contribute, please refer to HERE for contribution instructions.

## License

Awesome Exploration RL is released under the Apache 2.0 license.

<p align="right">(<a href="#top">Back to top</a>)</p>