Model-based Offline Policy Optimization (MOPO)
Code to reproduce the experiments in MOPO: Model-based Offline Policy Optimization.
Installation
- Install MuJoCo 2.0 at ~/.mujoco/mujoco200 and copy your license key to ~/.mujoco/mjkey.txt
- Create a conda environment and install mopo
cd mopo
conda env create -f environment/gpu-env.yml
conda activate mopo
# Install viskit
git clone https://github.com/vitchyr/viskit.git
pip install -e viskit
pip install -e .
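To verify the installation, a quick sanity check along these lines can be run inside the activated environment. This is a sketch, not part of the repo; it assumes the conda environment provides mujoco-py and gym and that the HalfCheetah-v2 task is available.

# Sanity-check sketch (not part of the repo): verifies that the MuJoCo
# bindings and license key are picked up correctly.
import mujoco_py  # fails here if mujoco200 or mjkey.txt is misplaced
import gym

env = gym.make('HalfCheetah-v2')
print('MuJoCo OK, observation shape:', env.reset().shape)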
Usage
Configuration files can be found in examples/config/. For example, the following command runs the HalfCheetah-mixed benchmark from D4RL.
mopo run_local examples.development --config=examples.config.d4rl.halfcheetah_mixed --gpus=1 --trial-gpus=1
Currently only running locally is supported.
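Each --config argument is a dotted path to a Python module that specifies the dataset and the MOPO hyperparameters (the rollout length h and the uncertainty penalty coefficient lambda from the paper). The sketch below is illustrative only; the field names are assumptions, so consult the actual files under examples/config/d4rl/ for the exact schema.

# Illustrative sketch of what a benchmark config module might define.
# Field names are assumptions, not the repo's actual schema; see
# examples/config/d4rl/ for the real files.
params = {
    'domain': 'halfcheetah',   # D4RL environment family
    'task': 'mixed',           # dataset variant (random / medium / mixed / med-expert)
    'rollout_length': 5,       # h: number of steps rolled out under the learned model
    'penalty_coeff': 1.0,      # lambda: weight of the model-uncertainty reward penalty
}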
Logging
This codebase contains viskit as a submodule. You can view saved runs with:
viskit ~/ray_mopo --port 6008
assuming you used the default log_dir.
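If you prefer to inspect results without viskit, the sketch below shows one way to pull the latest evaluation numbers from saved runs. It assumes each run directory contains a progress.csv in the rllab/softlearning logging style and that pandas is installed; the column name used is a guess and may differ in practice.

# Sketch for reading run metrics directly from disk (assumes rllab/softlearning-
# style progress.csv files under ~/ray_mopo; column names are assumptions).
import glob
import os
import pandas as pd

log_root = os.path.expanduser('~/ray_mopo')
for csv_path in glob.glob(os.path.join(log_root, '**', 'progress.csv'), recursive=True):
    df = pd.read_csv(csv_path)
    # 'evaluation/return-average' is a guessed column name; adjust to what the logger writes.
    if 'evaluation/return-average' in df.columns:
        print(csv_path, df['evaluation/return-average'].iloc[-1])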
Citing MOPO
If you use MOPO in academic research, please cite our paper using the following BibTeX entry.
@article{yu2020mopo,
title={MOPO: Model-based Offline Policy Optimization},
author={Yu, Tianhe and Thomas, Garrett and Yu, Lantao and Ermon, Stefano and Zou, James and Levine, Sergey and Finn, Chelsea and Ma, Tengyu},
journal={arXiv preprint arXiv:2005.13239},
year={2020}
}