tidybot

This code release accompanies the following project:

TidyBot: Personalized Robot Assistance with Large Language Models

Jimmy Wu, Rika Antonova, Adam Kan, Marion Lepert, Andy Zeng, Shuran Song, Jeannette Bohg, Szymon Rusinkiewicz, Thomas Funkhouser

Autonomous Robots (AuRo) - Special Issue: Large Language Models in Robotics, 2023

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023

Project Page | PDF | arXiv | Video

Abstract: For a robot to personalize physical assistance effectively, it must learn user preferences that can be generally reapplied to future scenarios. In this work, we investigate personalization of household cleanup with robots that can tidy up rooms by picking up objects and putting them away. A key challenge is determining the proper place to put each object, as people's preferences can vary greatly depending on personal taste or cultural background. For instance, one person may prefer storing shirts in the drawer, while another may prefer them on the shelf. We aim to build systems that can learn such preferences from just a handful of examples via prior interactions with a particular person. We show that robots can combine language-based planning and perception with the few-shot summarization capabilities of large language models (LLMs) to infer generalized user preferences that are broadly applicable to future interactions. This approach enables fast adaptation and achieves 91.2% accuracy on unseen objects in our benchmark dataset. We also demonstrate our approach on a real-world mobile manipulator called TidyBot, which successfully puts away 85.0% of objects in real-world test scenarios.

Overview

Here is an overview of how this codebase is organized:

Setup

We recommend using Conda environments. Our setup (tested on Ubuntu 20.04.6 LTS) uses the following 3 environments:

  1. tidybot env on the server for general use
  2. tidybot env on the robot for general use
  3. vild env on the server for object detection only

See the respective READMEs inside the server and robot directories for detailed setup instructions.

TidyBot Quickstart

Unless otherwise specified, the tidybot Conda env should always be used:

conda activate tidybot

Teleoperation Mode

We provide a teleoperation interface (teleop.py) to operate the robot using primitives such as pick, place, or toss.

First, run this command to start the teleop interface on the server (workstation), where <robot-num> is 1, 2, or 3, depending on the robot to be controlled:

python teleop.py --robot-num <robot-num>

On the robot (mobile base computer), make sure that the convenience stop and mobile base driver are both running. Then, run this command to start the controller:

python controller.py

Once the server and robot both show that they have successfully connected to each other, use these controls to teleop the robot:

Notes:


To generate paths with an occupancy map rather than manually clicking waypoints, use the --shortest-path flag.

python teleop.py --robot-num <robot-num> --shortest-path

This will load the receptacles specified in scenarios/test.yml as obstacles and build an occupancy map to avoid running into them.
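To illustrate the idea of planning over an occupancy map, here is a minimal sketch that runs breadth-first search on a small grid. It is not the planner used in teleop.py: the grid layout, the 4-connected neighborhood, and the function name `shortest_path` are all assumptions made for this example.

```python
from collections import deque


def shortest_path(grid, start, goal):
    """Breadth-first search on a 4-connected occupancy grid.

    grid[r][c] is True where a cell is occupied. Returns the list of
    (row, col) cells from start to goal, or None if no free path exists.
    """
    rows, cols = len(grid), len(grid[0])
    if grid[start[0]][start[1]] or grid[goal[0]][goal[1]]:
        return None
    parents = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the parent links back to the start, then reverse.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parents[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and not grid[nr][nc] and (nr, nc) not in parents:
                parents[(nr, nc)] = cell
                queue.append((nr, nc))
    return None


# Hypothetical 4x4 map with one obstacle column (e.g., a receptacle footprint).
grid = [[False] * 4 for _ in range(4)]
for r in range(3):
    grid[r][1] = True

path = shortest_path(grid, (0, 0), (0, 2))
print(path)
```

On a real map, the obstacle cells would typically be inflated by the robot's radius so that the returned path keeps the base clear of the receptacles.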


For additional debugging visualization, the --debug flag can be used.

Server:

python teleop.py --robot-num <robot-num> --debug

Robot:

python controller.py --debug

Fully Autonomous Mode

To operate the robot in fully autonomous mode, we use the demo interface in demo.py. By default, the demo will load the test scenario in scenarios/test.yml along with the corresponding LLM-summarized user preferences in preferences/test.yml.

To start the demo on the server, first start the object detector server with the vild Conda env:

conda activate vild
python object_detector_server.py

Then, in a separate terminal, start the demo interface (with the tidybot env):

python demo.py --robot-num <robot-num>

On the robot, make sure that the convenience stop and mobile base driver are both running. Then, run this command to start the controller:

python controller.py

These are the controls used to run the demo:

Note: If keypresses are not registering, make sure that the demo interface is the active window.


To load a different scenario (default is test), use the --scenario-name argument:

python demo.py --robot-num <robot-num> --scenario-name <scenario-name>

For example, to load scenario-08 and use robot #1, you can run:

python demo.py --robot-num 1 --scenario-name scenario-08
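Before starting the demo, it can help to confirm that both files for a scenario exist. The sketch below assumes that every scenario follows the same naming pattern as the default (scenarios/test.yml and preferences/test.yml); the helper `scenario_files` is hypothetical and not part of this codebase.

```python
from pathlib import Path


def scenario_files(scenario_name="test", root="."):
    """Return the (scenario, preferences) paths for a scenario name.

    Assumes the scenarios/<name>.yml and preferences/<name>.yml layout
    suggested by the default test scenario.
    """
    root = Path(root)
    return (
        root / "scenarios" / f"{scenario_name}.yml",
        root / "preferences" / f"{scenario_name}.yml",
    )


# Report whether each file for scenario-08 is present in the current directory.
for path in scenario_files("scenario-08"):
    print(path, "found" if path.exists() else "missing")
```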

For additional debugging visualization, the --debug flag can be used.

Server:

python demo.py --robot-num <robot-num> --debug

Robot:

python controller.py --debug

Troubleshooting

Mobile Base Accuracy

The marker detection setup should output 2D robot pose estimates with centimeter-level accuracy. For instance, our setup can reliably pick up small Lego Duplo blocks (32 mm x 32 mm) from the floor. Inaccurate marker detection can have many causes, such as misaligned cameras or suboptimal camera settings (e.g., exposure and gain; see get_video_cap in utils.py). Also note that the mobile base motors should be calibrated (.motor_cal.txt) for more accurate movement.

Arm Accuracy

The 3 Kinova arms are repeatable but have slightly different zero heading positions, so they require some compensation to be consistent with each other. See the arm-dependent heading compensation in controller.py.
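As a rough illustration of per-arm compensation, the sketch below adds a fixed offset to a commanded heading and wraps the result. It does not reproduce the code in controller.py: the offset values are placeholders rather than measured calibration data, and the names `HEADING_OFFSETS` and `compensate_heading` are hypothetical.

```python
import math

# Placeholder offsets in radians, keyed by robot number. These are NOT
# measured values; each arm would need its own calibration.
HEADING_OFFSETS = {
    1: 0.0,
    2: math.radians(1.5),
    3: math.radians(-2.0),
}


def compensate_heading(robot_num, heading):
    """Add the arm-specific offset and wrap the result to [-pi, pi)."""
    corrected = heading + HEADING_OFFSETS[robot_num]
    return (corrected + math.pi) % (2 * math.pi) - math.pi


print(compensate_heading(2, 0.0))
```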

Server Ports

If multiple people have been using the server, you may run into this error:

OSError: [Errno 98] Address already in use

To kill all processes using the occupied ports, you can use the clear-ports.sh script (requires sudo):

./clear-ports.sh
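To check whether a specific port is occupied before starting the server, a small probe like the one below can help. This is a sketch and not part of the codebase: the function name `port_in_use` is hypothetical, and the port numbers in the loop are placeholders, not the ports this codebase actually uses.

```python
import errno
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if binding to (host, port) fails with EADDRINUSE."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            raise  # e.g., permission errors are not "in use"
    return False


# Placeholder port numbers; substitute the ports your setup uses.
for port in (6000, 6001):
    print(port, "in use" if port_in_use(port) else "free")
```

The probe only reports the state at the moment of the check, so another process could still claim the port before the server binds to it.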

For reference, here are all of the ports used by this codebase:

Camera Errors

The overhead cameras may occasionally output errors such as this:

[ WARN:16@1367.080] global /io/opencv/modules/videoio/src/cap_v4l.cpp (1013) tryIoctl VIDEOIO(V4L2:/dev/v4l/by-id/usb-046d_Logitech_Webcam_C930e_E4298F4E-video-index0): select() timeout.
[ WARN:16@2049.229] global /io/opencv/modules/videoio/src/cap_v4l.cpp (1013) tryIoctl VIDEOIO(V4L2:/dev/v4l/by-id/usb-046d_Logitech_Webcam_C930e_099A11EE-video-index0): select() timeout.
Corrupt JPEG data: 36 extraneous bytes before marker 0xd9
Corrupt JPEG data: premature end of data segment

Typically, these errors can be resolved by unplugging the camera and plugging it back in.

Also check the quality and length of the USB extension cable, as USB 2.0 does not support cable lengths longer than 5 meters.

Citation

If you find this work useful for your research, please consider citing:

@article{wu2023tidybot,
  title = {TidyBot: Personalized Robot Assistance with Large Language Models},
  author = {Wu, Jimmy and Antonova, Rika and Kan, Adam and Lepert, Marion and Zeng, Andy and Song, Shuran and Bohg, Jeannette and Rusinkiewicz, Szymon and Funkhouser, Thomas},
  journal = {Autonomous Robots},
  year = {2023}
}