EfficientZero (NeurIPS 2021)

Open-source codebase for EfficientZero, from "Mastering Atari Games with Limited Data" at NeurIPS 2021.

Environments

EfficientZero requires Python 3 (>=3.6) and PyTorch (>=1.8.0), with the development headers installed.
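
The version requirements above can be checked before building anything. The following is a minimal sketch, not part of the codebase: the helper names (`parse_version`, `torch_is_new_enough`, `python_is_new_enough`) are hypothetical, and only the two minimum versions come from the text. Versions are compared as integer tuples because a plain string comparison would wrongly rank "1.10" below "1.8".

```python
import re
import sys

# Minimum versions stated in this README.
MIN_PYTHON = (3, 6)
MIN_TORCH = (1, 8, 0)


def parse_version(text):
    """Turn a string such as '1.8.0+cu111' into the tuple (1, 8, 0).

    Only the leading major.minor[.patch] part is used; build or
    pre-release suffixes are ignored.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part or 0) for part in match.groups())


def torch_is_new_enough(version_text, minimum=MIN_TORCH):
    """Compare a PyTorch version string against the minimum."""
    return parse_version(version_text) >= minimum


def python_is_new_enough(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Compare the (major, minor) part of the interpreter version."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    print("python ok:", python_is_new_enough())
    try:
        import torch
        print("torch ok:", torch_is_new_enough(torch.__version__))
    except ImportError:
        print("torch is not installed")
```

This check covers only the Python and PyTorch versions; it does not verify that the development headers or GCC are present.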

We recommend using torch amp (--amp_type torch_amp) to accelerate training.

Prerequisites

Before starting training, you need to build the external C++/Cython packages. (GCC 7.5 or newer is required.)

cd core/ctree
bash make.sh

The distributed framework of this codebase is built on ray.

Installation

To install the other packages required by this codebase, run pip install -r requirements.txt.

Usage

Quick start

Bash file

We provide train.sh and test.sh for training and evaluation.

| Required Arguments | Description |
| --- | --- |
| --env | Name of the environment |
| --case {atari} | Switches between different domains (default: atari) |
| --opr {train,test} | Selects the operation to perform |
| --amp_type {torch_amp,none} | Use torch amp for acceleration |

| Other Arguments | Description |
| --- | --- |
| --force | Overwrite the result directory |
| --num_gpus 4 | How many GPUs are available |
| --num_cpus 96 | How many CPUs are available |
| --cpu_actor 14 | How many CPU workers |
| --gpu_actor 20 | How many GPU workers |
| --seed 0 | The random seed |
| --use_priority | Use priority in replay buffer sampling |
| --use_max_priority | Use the max priority for newly collected data |
| --amp_type 'torch_amp' | Use torch amp for acceleration |
| --info 'EZ-V0' | Tags for your experiments |
| --p_mcts_num 8 | Number of parallel environments in self-play |
| --revisit_policy_search_rate 0.99 | The rate of reanalyzing policies |
| --use_root_value | Use root values in value targets (requires more GPU actors) |
| --render | Render during evaluation |
| --save_video | Save videos during evaluation |
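
To make the argument list concrete, here is a rough sketch of how these flags could be declared with `argparse`. It is an illustration only, not the parser used by the codebase. It assumes the values shown in the table (4, 96, 14, and so on) are defaults, and it treats `--case` as optional with the documented default. The environment id in the usage lines is an example and does not come from the text above.

```python
import argparse


def build_parser():
    """Illustrative parser mirroring the argument table (assumed defaults)."""
    parser = argparse.ArgumentParser(description="EfficientZero arguments (sketch)")

    # Required arguments
    parser.add_argument("--env", required=True, help="name of the environment")
    parser.add_argument("--case", choices=["atari"], default="atari",
                        help="switch between different domains")
    parser.add_argument("--opr", required=True, choices=["train", "test"],
                        help="operation to perform")
    parser.add_argument("--amp_type", required=True, choices=["torch_amp", "none"],
                        help="use torch amp for acceleration")

    # Other arguments: resources
    parser.add_argument("--num_gpus", type=int, default=4)
    parser.add_argument("--num_cpus", type=int, default=96)
    parser.add_argument("--cpu_actor", type=int, default=14)
    parser.add_argument("--gpu_actor", type=int, default=20)

    # Other arguments: experiment settings
    parser.add_argument("--force", action="store_true",
                        help="overwrite the result directory")
    parser.add_argument("--seed", type=int, default=0)
    parser.add_argument("--use_priority", action="store_true")
    parser.add_argument("--use_max_priority", action="store_true")
    parser.add_argument("--info", type=str, default="EZ-V0")
    parser.add_argument("--p_mcts_num", type=int, default=8)
    parser.add_argument("--revisit_policy_search_rate", type=float, default=0.99)
    parser.add_argument("--use_root_value", action="store_true")

    # Other arguments: evaluation
    parser.add_argument("--render", action="store_true")
    parser.add_argument("--save_video", action="store_true")
    return parser


# Example: parse a training invocation (the environment id is illustrative).
example_args = build_parser().parse_args(
    ["--env", "BreakoutNoFrameskip-v4", "--opr", "train", "--amp_type", "torch_amp"]
)
```

Boolean switches such as `--force` and `--use_priority` are off unless passed, which matches how they appear without a value in the table.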

Architecture Designs

The architecture of the training pipeline is shown as follows:

Some suggestions

New environment registration

If you want to apply EfficientZero to a new environment such as MuJoCo, here are the steps for registration:

  1. Following the layout of config/atari, create a directory for the environment at config/mujoco.
  2. Implement your MujocoConfig(BaseConfig) class, along with the models and your environment wrapper.
  3. Register the case in main.py.
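
The registration steps can be sketched roughly as follows. This is a self-contained illustration under stated assumptions, not the repository's code: `BaseConfig` here is a stand-in for the real class, and the method names `make_env` and `make_model`, the `CASES` registry, and the helpers `register_case` and `get_config` are all hypothetical. Only the names `MujocoConfig` and `BaseConfig` and the idea of registering a case come from the steps above.

```python
class BaseConfig:
    """Stand-in for the codebase's BaseConfig (the real interface differs)."""

    def __init__(self, env_name=None):
        self.env_name = env_name

    def make_env(self, seed=None):  # hypothetical hook
        raise NotImplementedError

    def make_model(self):  # hypothetical hook
        raise NotImplementedError


class MujocoConfig(BaseConfig):
    """Step 2: a config for the new domain, with env wrapper and model."""

    def make_env(self, seed=None):
        # Placeholder: a real implementation would return the wrapped environment.
        return {"env": self.env_name, "seed": seed}

    def make_model(self):
        # Placeholder: a real implementation would build the networks.
        return "mujoco-model-placeholder"


# Step 3: a hypothetical registry mapping a --case value to a config class.
CASES = {}


def register_case(name, config_cls):
    """Add a new case; refuse to silently replace an existing one."""
    if name in CASES:
        raise ValueError(f"case {name!r} is already registered")
    CASES[name] = config_cls


def get_config(case, env_name):
    """Look up the config class for a case and instantiate it."""
    try:
        config_cls = CASES[case]
    except KeyError:
        raise ValueError(
            f"unknown case {case!r}; registered: {sorted(CASES)}"
        ) from None
    return config_cls(env_name)


register_case("mujoco", MujocoConfig)
```

With a registry of this kind, selecting a domain reduces to one lookup, for example `get_config("mujoco", "HalfCheetah-v2")`, where the environment id is again only an example.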

Results

Evaluation with 32 seeds for 3 different runs (different seeds).

Citation

If you find this repo useful, please cite our paper:

@inproceedings{ye2021mastering,
  title={Mastering Atari Games with Limited Data},
  author={Weirui Ye and Shaohuai Liu and Thanard Kurutach and Pieter Abbeel and Yang Gao},
  booktitle={NeurIPS},
  year={2021}
}

Contact

If you have any questions or want to use the code, please contact ywr20@mails.tsinghua.edu.cn.

Acknowledgement

We are grateful to the following GitHub repos for their valuable codebase implementations:

https://github.com/koulanurag/muzero-pytorch

https://github.com/werner-duvaud/muzero-general

https://github.com/pytorch/ELF