Home

Awesome

LLM-Personalize: Aligning LLM Planners with Human Preferences via Reinforced Self-Training for Housekeeping Robots

This repo contains the implementation of our paper LLM-Personalize: Aligning LLM Planners with Human Preferences via Reinforced Self-Training for Housekeeping Robots.

Demo Animation

Installation

Conda Installation

Clone this repo, and follow the installation instruction from the Housekeep Benchmark. Note that the Housekeep code is within the root of this repo, so please follow the Housekeep installation instruction and install within this repo, no need to clone the original Housekeep repo again.

The above installation is sufficient to run a human demonstration agent. In addition, this work uses OpenAI GPT-3.5-Turbo. Therefore, to train and run an LLM-based agent using OpenAI GPT's, please setup an OpenAI account, and additionally install the python module:

pip install openai==0.27.9

Docker

Alternatively, you can pull and use the docker:

docker pull apple90gege/housekeepdocker:latest

And follow installation guide from this repo.

Troubleshooting

Installation

  1. [habitat-lab setup] You may need to install additionally the following packages in the conda environment before habitat-lab setup if the setup code reports errors.
pip install cython
pip install pkgconfig
  1. [libcublas and libcdnn] If you encounter the following errors:
OSError: libcublas.so.11: cannot open shared object file: No such file or directory
OSError: libcudnn.so.8: cannot open shared object file: No such file or directory

You may need to install cuda and cudnn by running the following commands.

conda install -c conda-forge cudatoolkit=11.0 -y
conda install -c conda-forge cudnn=8.0 -y
  1. [protobuf] If you encounter the TypeError: Descriptors cannot not be created directly. Please downgrade protobuf by running the following command:
pip install protobuf==3.20.0

Headless rendering

  1. This work runs the 3D habitat simulator on a headless GPU server. Here is an issue and its solution related to the installation. For more issues and troubleshoot guides please refer to the official Habitat Simulator Repo.

Running the code

Code Overview

To create and run an agent, create the run configuration, then run a simulation as follows:

Step 1: Generating a configuration file

Step 2: Running a simulation

A list of environment ID <ENV_ID> are included in the file cos_eor/configs/local/envs.yaml. For example, to run the agent in the environment pomaria_1_int for 1 episode, you can run the command as ./run_cli.sh test pomaria_1_int 1.

The simulation logs are then written to logs/test/demo/<ENV_ID>/data_<datetime>.json. In addition, to record a video, you may change the config file L143 VIDEO_INTERVAL to 1 and find the video inside the log folder.

Fine-tuning guide

Demonstration Bootstrapping

Iterative Self-training

Citing

Our paper is available on Arxiv. If you find our code useful, please consider citing us!

@misc{han2024llmpersonalize,
      title={LLM-Personalize: Aligning LLM Planners with Human Preferences via Reinforced Self-Training for Housekeeping Robots}, 
      author={Dongge Han and Trevor McInroe and Adam Jelley and Stefano V. Albrecht and Peter Bell and Amos Storkey},
      year={2024},
      eprint={2404.14285},
      archivePrefix={arXiv},
      primaryClass={cs.RO}
}

Acknowledgement

The housekeep benchmark and the housekeep simulator code used in this work (included in this repo) is developed by Kant et. al. for their paper: Housekeep: Tidying Virtual Households using Commonsense Reasoning.. The simulator is based on the Habitat simulator introduced in the paper Habitat 2.0: Training Home Assistants to Rearrange their Habitat.