# UPDeT

Official implementation of **UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers** (ICLR 2021 spotlight).

The framework is inherited from PyMARL. UPDeT is written in PyTorch and uses SMAC as its environment.
## Installation instructions

Install dependencies:

```shell
pip install -r requirements.txt
```

Download StarCraft II into the `3rdparty/` folder and copy the maps needed for the experiments:

```shell
bash install_sc2.sh
```
## Run an experiment

Before training your own transformer-based multi-agent model, note the following:

- Currently, this repository supports marine-based battle scenarios only, e.g. `3m`, `8m`, `5m_vs_6m`.
- If you are interested in training on a different unit type, carefully modify the `Transformer Parameters` block in `src/config/default.yaml` and revise the `_build_input_transformer` function in `basic_controller.py`.
- Before running an experiment, check the agent type in the `Agent Parameters` block in `src/config/default.yaml`.
- This repository contains two transformer-based agents from the UPDeT paper:
  - Standard UPDeT
  - Aggregation Transformer
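For orientation only, a hypothetical sketch of what the two configuration blocks mentioned above might contain. All key names and values here are illustrative placeholders, not the repository's actual keys; consult `src/config/default.yaml` itself before editing:

```yaml
# --- Transformer Parameters (illustrative keys only) ---
token_dim: 5        # per-entity feature length for marine units
emb_dim: 32         # transformer embedding size
num_heads: 3        # attention heads
num_blocks: 2       # stacked transformer layers

# --- Agent Parameters (illustrative keys only) ---
agent: updet        # or the aggregation-transformer variant
```

Changing the unit type changes the per-entity feature length, which is why the transformer block and `_build_input_transformer` must be revised together.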
## Training script

```shell
python3 src/main.py --config=vdn --env-config=sc2 with env_args.map_name=5m_vs_6m
```

All results will be stored in the `Results/` folder.
## Performance

### Single battle scenario

UPDeT surpasses the GRU baseline on the hard `5m_vs_6m` map when combined with:

- QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
- VDN: Value-Decomposition Networks For Cooperative Multi-Agent Learning
- QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning
### Multiple battle scenarios

UPDeT generalizes zero-shot to different tasks:

- Result of `7m-5m-3m` transfer learning.

Note: only UPDeT can be deployed to other scenarios without changing the model's architecture. For more details, please refer to the UPDeT paper.
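The zero-shot property is easiest to see in code. Below is a minimal, unofficial sketch (the class name, dimensions, and head sizes are arbitrary illustrative choices, not the paper's hyperparameters) of the policy-decoupling idea: each observed entity becomes one transformer token, and each attack-action value is read from the matching enemy token, so the same weights work for any number of entities:

```python
import torch
import torch.nn as nn

class MiniUPDeT(nn.Module):
    """Illustrative sketch (not the official model) of UPDeT-style
    policy decoupling over per-entity tokens."""

    def __init__(self, feat_dim=8, emb_dim=32, n_heads=4, n_base_actions=6):
        super().__init__()
        self.embed = nn.Linear(feat_dim, emb_dim)
        layer = nn.TransformerEncoderLayer(
            d_model=emb_dim, nhead=n_heads, dim_feedforward=64, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.base_head = nn.Linear(emb_dim, n_base_actions)  # move / no-op actions
        self.attack_head = nn.Linear(emb_dim, 1)             # one Q-value per enemy token

    def forward(self, own_feats, enemy_feats):
        # own_feats: (B, feat_dim); enemy_feats: (B, n_enemies, feat_dim),
        # where n_enemies may differ between tasks without retraining.
        tokens = torch.cat([own_feats.unsqueeze(1), enemy_feats], dim=1)
        h = self.encoder(self.embed(tokens))
        base_q = self.base_head(h[:, 0])                     # (B, n_base_actions)
        attack_q = self.attack_head(h[:, 1:]).squeeze(-1)    # (B, n_enemies)
        return torch.cat([base_q, attack_q], dim=-1)

model = MiniUPDeT()
q3 = model(torch.randn(2, 8), torch.randn(2, 3, 8))  # 3 enemies -> 6 + 3 = 9 actions
q5 = model(torch.randn(2, 8), torch.randn(2, 5, 8))  # 5 enemies -> 11 actions, same weights
```

Because the action-space size follows the number of input tokens, transferring between `3m`, `5m`, and `7m` changes only the input, not the architecture. A fixed-size GRU policy, by contrast, bakes the entity count into its input and output layers.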
## Bibtex

```
@article{hu2021updet,
  title={UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers},
  author={Hu, Siyi and Zhu, Fengda and Chang, Xiaojun and Liang, Xiaodan},
  journal={arXiv preprint arXiv:2101.08001},
  year={2021}
}
```
## License

The MIT License