Crafter

Status: Stable release

Open world survival game for evaluating a wide range of agent abilities within a single environment.

*(Image: Crafter terrain)*

Overview

Crafter features randomly generated 2D worlds where the player needs to forage for food and water, find shelter to sleep, defend against monsters, collect materials, and build tools. Crafter aims to be a fruitful benchmark for reinforcement learning by focusing on research challenges, meaningful evaluation, and fast iteration speed.

See the research paper to find out more: Benchmarking the Spectrum of Agent Capabilities

```bibtex
@article{hafner2021crafter,
  title={Benchmarking the Spectrum of Agent Capabilities},
  author={Danijar Hafner},
  year={2021},
  journal={arXiv preprint arXiv:2109.06780},
}
```

Play Yourself

```sh
python3 -m pip install crafter  # Install Crafter
python3 -m pip install pygame   # Needed for human interface
python3 -m crafter.run_gui      # Start the game
```
<details>
<summary>Keyboard mapping (click to expand)</summary>

| Key | Action |
| :-- | :-- |
| WASD | Move around |
| SPACE | Collect material, drink from lake, hit creature |
| TAB | Sleep |
| T | Place a table |
| R | Place a rock |
| F | Place a furnace |
| P | Place a plant |
| 1 | Craft a wood pickaxe |
| 2 | Craft a stone pickaxe |
| 3 | Craft an iron pickaxe |
| 4 | Craft a wood sword |
| 5 | Craft a stone sword |
| 6 | Craft an iron sword |

</details>

*(Video: Crafter gameplay)*

Interface

To install Crafter, run `pip3 install crafter`. The environment follows the OpenAI Gym interface. Observations are images of size (64, 64, 3) and the action space consists of 17 categorical actions.

```python
import gym
import crafter

env = gym.make('CrafterReward-v1')  # Or CrafterNoReward-v1
env = crafter.Recorder(
  env, './path/to/logdir',
  save_stats=True,
  save_video=False,
  save_episode=False,
)

obs = env.reset()
done = False
while not done:
  action = env.action_space.sample()
  obs, reward, done, info = env.step(action)
```
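With `save_stats=True`, the Recorder logs one JSON record per episode containing the episode length, reward, and per-achievement counts. As a minimal sketch of turning those records into per-achievement success rates — the `achievement_<name>` key names and the two example records below are assumptions based on that format, not output copied from the environment:

```python
import json

def success_rates(lines):
  """Compute per-achievement success rates (in %) from JSON episode records.

  Each line is assumed to be a dict with `achievement_<name>` counts,
  as written by crafter.Recorder with save_stats=True. An achievement
  counts as a success if it was unlocked at least once in the episode."""
  episodes = [json.loads(line) for line in lines if line.strip()]
  keys = sorted(k for k in episodes[0] if k.startswith('achievement_'))
  return {
      k: 100.0 * sum(ep[k] >= 1 for ep in episodes) / len(episodes)
      for k in keys}

# Two hypothetical episode records for illustration:
records = [
    '{"length": 100, "reward": 2.0,'
    ' "achievement_collect_wood": 3, "achievement_place_table": 0}',
    '{"length": 150, "reward": 3.0,'
    ' "achievement_collect_wood": 1, "achievement_place_table": 1}',
]
rates = success_rates(records)
# collect_wood unlocked in 2/2 episodes, place_table in 1/2
```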

Evaluation

Agents are allowed a budget of 1M environment steps and are evaluated by their success rates on the 22 achievements and by their geometric mean score. Example scripts for computing these metrics are included in the analysis directory of the repository.
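Following the paper, the aggregate score is a geometric mean of the 22 success rates computed in log space, S = exp(mean(ln(1 + s_i))) − 1 with s_i in percent, so that progress on rare achievements moves the score more than progress on common ones. A small sketch of that computation (the function name is ours; the official scripts live in the analysis directory):

```python
import math

def crafter_score(success_rates):
  """Aggregate Crafter score from achievement success rates in percent.

  S = exp(mean(ln(1 + s_i))) - 1, which lies in [0, 100] and rewards
  unlocking rare achievements more than improving common ones."""
  assert all(0 <= s <= 100 for s in success_rates)
  logs = [math.log(1 + s) for s in success_rates]
  return math.exp(sum(logs) / len(logs)) - 1

crafter_score([100.0] * 22)  # all achievements always unlocked: score 100
crafter_score([0.0] * 22)    # no achievements ever unlocked: score 0
```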

Scoreboards

Please create a pull request if you would like to add your own or another algorithm to the scoreboards. For the reinforcement learning and unsupervised agent categories, the interaction budget is 1M environment steps. The external knowledge category is defined more broadly.

Reinforcement Learning

| Algorithm | Score (%) | Reward | Open Source |
| :-- | :-- | :-- | :-- |
| Curious Replay | 19.4±1.6 | - | AutonomousAgentsLab/cr-dv3 |
| PPO (ResNet) | 15.6±1.6 | 10.3±0.5 | snu-mllab/Achievement-Distillation |
| DreamerV3 | 14.5±1.6 | 11.7±1.9 | danijar/dreamerv3 |
| LSTM-SPCNN | 12.1±0.8 | - | astanic/crafter-ood |
| EDE | 11.7±1.0 | - | yidingjiang/ede |
| OC-SA | 11.1±0.7 | - | astanic/crafter-ood |
| DreamerV2 | 10.0±1.2 | 9.0±1.7 | danijar/dreamerv2 |
| PPO | 4.6±0.3 | 4.2±1.2 | DLR-RM/stable-baselines3 |
| Rainbow | 4.3±0.2 | 6.0±1.3 | Kaixhin/Rainbow |

Unsupervised Agents

| Algorithm | Score (%) | Reward | Open Source |
| :-- | :-- | :-- | :-- |
| Plan2Explore | 2.1±0.1 | 2.1±1.5 | danijar/dreamerv2 |
| RND | 2.0±0.1 | 0.7±1.3 | alirezakazemipour/PPO-RND |
| Random | 1.6±0.0 | 2.1±1.3 | - |

External Knowledge

| Algorithm | Score (%) | Reward | Uses | Interaction | Open Source |
| :-- | :-- | :-- | :-- | :-- | :-- |
| Human | 50.5±6.8 | 14.3±2.3 | Life experience | 0 | crafter_human_dataset |
| SPRING | 27.3±1.2 | 12.3±0.7 | LLM, scene description, Crafter paper | 0 | - |
| Achievement Distillation | 21.8±1.4 | 12.6±0.3 | Reward structure | 1M | snu-mllab/Achievement-Distillation |
| ELLM | 6.0±0.4 | - | LLM, scene description | 5M | - |

Baselines

Baseline scores of various agents are available for Crafter, both with and without rewards. The scores are available in JSON format in the scores directory of the repository. For comparison, human expert players achieve a score of 50.5%. The baseline implementations are available in a separate repository.

<img src="https://github.com/danijar/crafter/raw/main/media/scores.png" width="400"/>

Questions

Please open an issue on GitHub.