
<img src="figs/steveonllm.png" alt="steveonllm" width="45px" /> LLaMA-Rider: Spurring Large Language Models to Explore the Open World

[arXiv Paper](https://arxiv.org/abs/2310.08922)


<div align=center> <img src="figs/introfig.png" alt="llama-rider" width="400px" /> </div>

LLaMA-Rider is a two-stage learning framework that spurs Large Language Models (LLMs) to explore the open world and learn to accomplish multiple tasks. This repository contains the implementation of LLaMA-Rider in the sandbox game Minecraft, and the code is largely based on the Plan4MC repository.

Installation

MineDojo and Plan4MC are installed in the same way as in the Plan4MC repository; please follow the installation instructions there.

Method overview

<img src="figs/llama-rider.png" alt="llama-rider" style="zoom:100%;" />

LLaMA-Rider is a two-stage framework consisting of an exploration stage and a learning stage.

Exploration stage

In the exploration stage, for tasks based on logs/stones/mobs, run

    python collect_feedback.py

For tasks based on iron ore, run

    python collect_feedback_iron.py
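As a rough mental model of what these exploration scripts do, the feedback-revision loop can be sketched as below. This is a conceptual sketch only: `explore_task`, `propose_skill`, and `execute_skill` are hypothetical names, not the actual interfaces in `collect_feedback.py`.

```python
# Conceptual sketch of the feedback-revision exploration loop.
# All names below (explore_task, propose_skill, execute_skill) are
# hypothetical placeholders, not the actual interfaces of this repo.

def explore_task(env, llm, task, max_steps=30):
    """Roll out one task: the LLM proposes skills, the env returns feedback."""
    trajectory, success = [], False
    obs = env.reset(task)
    for _ in range(max_steps):
        # The LLM decides the next skill from the task description,
        # the current observation, and the feedback gathered so far.
        skill = llm.propose_skill(task, obs, trajectory)
        obs, success, feedback = env.execute_skill(skill)
        # Environment feedback lets the LLM revise infeasible decisions
        # and is logged as exploration experience for the learning stage.
        trajectory.append({"skill": skill, "feedback": feedback})
        if success:
            break
    return trajectory, success
```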

Available tasks are listed in envs/hard_task_conf.yaml. One can modify the file to change task settings.
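Assuming the file is standard YAML keyed by task name (the actual schema may differ), the configured tasks can be inspected programmatically:

```python
import yaml  # pip install pyyaml

# Load the task configuration shipped with the repo; the mapping of
# task name -> settings is an assumption about the file's layout.
with open("envs/hard_task_conf.yaml") as f:
    task_conf = yaml.safe_load(f)

for name, settings in task_conf.items():
    print(name, settings)
```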

Learning stage

One can process the explored experiences into a supervised dataset by running:

    python process_data.py
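Conceptually, this step turns the logged exploration experience into (prompt, response) pairs for supervised fine-tuning. The sketch below assumes a hypothetical record layout; the actual fields produced by `collect_feedback.py` and expected by `process_data.py` may differ.

```python
import json

# Hypothetical conversion of exploration trajectories into an SFT dataset;
# the record layout (task / steps / skill / feedback) is assumed.
def build_sft_dataset(trajectories, out_path="sft_data.json"):
    samples = []
    for traj in trajectories:
        for step in traj["steps"]:
            samples.append({
                # Prompt: what the LLM saw (task plus feedback at this step).
                "prompt": f"Task: {traj['task']}\nFeedback: {step['feedback']}",
                # Response: the decision the LLM actually made.
                "response": step["skill"],
            })
    with open(out_path, "w") as f:
        json.dump(samples, f, indent=2)
    return samples
```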

For the learning stage, we use QLoRA to fine-tune the LLM. Run

    sh train/scripts/sft_70B.sh
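The script holds the actual training hyperparameters; for orientation, the core of a QLoRA setup with Hugging Face `transformers`/`peft`/`bitsandbytes` looks roughly like the sketch below. The model name and LoRA values here are illustrative, not those used in `sft_70B.sh`.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Quantize the frozen base model to 4-bit NF4 (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-70b-chat-hf",  # base model used in the paper
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Train only low-rank adapter weights on top of the quantized base model.
lora_config = LoraConfig(
    r=64, lora_alpha=16, lora_dropout=0.05,  # illustrative values
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    bias="none", task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```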

Evaluation

To evaluate the LLM after SFT, run

    python collect_feedback.py --adapter /path/to/adapter
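Under the hood, evaluating with an adapter amounts to loading the trained LoRA weights on top of the frozen base model via `peft`'s standard API; a minimal sketch (the base-model name is illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-70b-chat-hf", device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-70b-chat-hf")
# Attach the SFT adapter produced by the learning stage.
model = PeftModel.from_pretrained(base, "/path/to/adapter")
```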

Main results

Built on LLaMA-2-70B-chat, LLaMA-Rider outperforms the ChatGPT planner on average across 30 tasks in Minecraft.

Moreover, LLaMA-Rider accomplishes 56.25% more tasks after the learning stage using only 1.3k supervised samples, showing the efficiency and effectiveness of the framework.

<img src="figs/mresult.png" alt="mresult" />

We also find that, after exploration and learning on the 30 log/stone/mob-based tasks, LLaMA-Rider achieves better performance on the unseen and more difficult iron-based tasks, showing that the learned decision-making capabilities generalize.

<img src="figs/ironresult.png" alt="ironresult" />

Citation

If you use our method or code in your research, please consider citing the paper as follows:

    @article{feng2023llama,
      title={LLaMA Rider: Spurring Large Language Models to Explore the Open World},
      author={Yicheng Feng and Yuxuan Wang and Jiazheng Liu and Sipeng Zheng and Zongqing Lu},
      journal={arXiv preprint arXiv:2310.08922},
      year={2023}
    }