# Awesome Transfer and Multi-task Reinforcement Learning

## Introduction

Reinforcement Learning (RL) is a learning paradigm for sequential decision-making problems, which are usually formalized as Markov Decision Processes (MDPs). Recently, Deep Reinforcement Learning (DRL) has achieved great success on human-level control problems such as video games, robot control, autonomous vehicles, and smart grids. However, DRL still suffers from sample inefficiency, especially when the state-action space is large, which makes learning from scratch difficult: the agent must collect a large number of samples to learn a good policy. This problem is even more severe in Multi-agent Reinforcement Learning (MARL), where the joint state-action space grows exponentially with the number of agents.
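To make the sample-inefficiency point concrete, here is a minimal tabular Q-learning sketch on a toy 5-state chain MDP (the environment, hyperparameters, and episode count are illustrative assumptions, not code from this repository). Even this tiny problem takes hundreds of episodes of trial-and-error before the greedy policy becomes optimal, and the sample cost grows sharply as the state-action space gets larger.

```python
import random

# Toy chain MDP (illustrative assumption, not from this repo):
# states 0..4, start at state 0, reward 1 for reaching the goal state 4.
# Actions: 0 = move left, 1 = move right.
N_STATES, GOAL = 5, 4
ACTIONS = (0, 1)

def step(state, action):
    nxt = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL

def q_learning(episodes=300, alpha=0.5, gamma=0.9, eps=0.3, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q-table: q[state][action]
    for _ in range(episodes):
        s, done = 0, False
        for _ in range(10_000):  # step cap per episode
            # Epsilon-greedy action selection
            a = rng.choice(ACTIONS) if rng.random() < eps else max(ACTIONS, key=lambda x: q[s][x])
            s2, r, done = step(s, a)
            # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a')
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
            if done:
                break
    return q

q = q_learning()
# Greedy policy per state; the optimal policy moves right (action 1) in every state before the goal.
greedy = [max(ACTIONS, key=lambda x: q[s][x]) for s in range(N_STATES)]
```

Transfer and multi-task methods attack exactly this cost: knowledge reused from related tasks or agents can replace much of the blind exploration that tabular learning from scratch requires.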

This repository contains the released code of representative benchmarks and algorithms from TJU-RL-Lab on the topic of Transfer and Multi-task Reinforcement Learning. It covers both the single-agent and multi-agent domains, addressing the sample-inefficiency problem in different ways.

This repository will be constantly updated to include new research works.

<p align="center"><img align="center" src="./assets/overview.png" alt="overview" style="zoom:60%;" /></p>

## Challenges

Sample-inefficiency problem: The main challenge that transfer and multi-task RL aim to solve is the sample-inefficiency problem, which forces the agent to collect a huge amount of training data before it learns the optimal policy. For example, Rainbow DQN requires around 18 million frames of training data to exceed the average level of human players, equivalent to roughly 60 hours of play by a human. By contrast, a human player can usually pick up an Atari game within a few minutes and reach that average level after about an hour of practice.

## Solutions

## Directory Structure of this Repo

This repository consists of

An overview of research works in this repository:

| Category | Sub-Categories | Method | Is Contained | Publication | Link |
| --- | --- | --- | --- | --- | --- |
| Single-agent Transfer RL | Same-domain Transfer | PTF | :white_check_mark: | IJCAI 2020 | https://dl.acm.org/doi/abs/10.5555/3491440.3491868 |
| Single-agent Transfer RL | Cross-domain Transfer | CAT | :white_check_mark: | UAI 2022 | https://openreview.net/forum?id=ShN3hPUsce5 |
| Multi-agent Transfer RL | Same task, transfer across agents | MAPTF | :white_check_mark: | NeurIPS 2021 | https://proceedings.neurips.cc/paper/2021/hash/8d9a6e908ed2b731fb96151d9bb94d49-Abstract.html |
| Multi-agent Transfer RL | Policy reuse across tasks | Bayes-ToMoP | :white_check_mark: | IJCAI 2019 | https://dl.acm.org/doi/abs/10.5555/3367032.3367121 |

<!--| Single-agent Transfer RL | Cross-domain Transfer | MIKT | :x: | UAI 2020 | https://dl.acm.org/doi/abs/10.5555/3306127.3331795 |-->
<!--| Single-agent Transfer RL | Same-domain Transfer | CAPS | :white_check_mark: | AAMAS 2019 | https://dl.acm.org/doi/abs/10.5555/3306127.3331795 |-->
<!--| Multi-agent Transfer RL | Same task, transfer across agents | DVM | :white_check_mark: | IROS 2019 | https://dl.acm.org/doi/abs/10.5555/3306127.3331795 |-->

## License

This repo is released under the MIT License.

## Acknowledgements

[To add some acknowledgements]

## Update Log

2022-03-18: