Learning to Walk by Steering: Perceptive Quadrupedal Locomotion in Dynamic Environments
Mingyo Seo, Ryan Gupta, Yifeng Zhu, Alexy Skoutnev, Luis Sentis, Yuke Zhu
Abstract
We tackle the problem of perceptive locomotion in dynamic environments. In this problem, a quadruped robot must exhibit robust and agile walking behaviors in response to environmental clutter and moving obstacles. We present a hierarchical learning framework, named PRELUDE, which decomposes the problem of perceptive locomotion into high-level decision making to predict navigation commands and low-level gait generation to realize the target commands. In this framework, we train the high-level navigation controller with imitation learning on human demonstrations collected on a steerable cart and the low-level gait controller with reinforcement learning (RL). Our method is, therefore, able to acquire complex navigation behaviors from human supervision and discover versatile gaits from trial and error. We demonstrate the effectiveness of our approach in simulation and with hardware experiments. Compared to state-of-the-art RL baselines, our method outperforms them by 38.6% in average distance traversed.
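The snippet below is a conceptual sketch of this two-level decomposition. The class and method names (`PreludeSketch`, `nav_policy`, `gait_policy`) are hypothetical placeholders for illustration only, not the actual interfaces in this repository.

```python
import numpy as np

# Conceptual sketch of PRELUDE's hierarchy (placeholder names, not repo code):
# a high-level navigation policy maps onboard images to a steering command at a
# low rate, and a low-level gait policy tracks that command at every control step.
class PreludeSketch:
    def __init__(self, nav_policy, gait_policy, nav_every=10):
        self.nav_policy = nav_policy    # images -> navigation command
        self.gait_policy = gait_policy  # command + proprioception -> joint targets
        self.nav_every = nav_every      # navigation runs at a lower frequency
        self.command = np.zeros(2)      # e.g., linear velocity and yaw rate

    def step(self, t, rgb, depth, proprio):
        if t % self.nav_every == 0:
            self.command = self.nav_policy(rgb, depth)
        return self.gait_policy(self.command, proprio)
```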
If you find our work useful in your research, please consider citing.
Dependencies
Installation
Install the environments and dependencies by running the following commands.
pip3 install -e .
You need to place the asset files under ./data/. These asset files can be found here.
Creating a demo dataset for Navigation Controller
To collect human demonstration data for the Navigation Controller, use the following command. You may need a SpaceMouse for teleoperation.
python3 scripts/demo_nav.py --env_type=ENV_TYPE --demo_name=DEMO_NAME
You can specify the difficulty of the environments by changing ENV_TYPE. Collected data will be saved in ./save/data_sim/DEMO_NAME as pickle files. Rendered videos and extra logs will be saved in ./save/raw_sim/DEMO_NAME.
To convert the collected data into an hdf5 dataset file, use the following command. The converted dataset will be saved at PATH_TO_TARGET_FILE.
python3 scripts/utils/convert_dataset.py --folder=PATH_TO_DATA_FOLDER --demo_path=PATH_TO_TARGET_FILE
Then, run the following command to split the dataset into training and evaluation sets. The script modifies the original dataset file in place with the split.
python3 scripts/utils/split_train_val.py --dataset=PATH_TO_TARGET_FILE
Each dataset file consists of sequences with the following data structure.
hdf5 dataset
├── agentview_rgb: 212x120x3 array
├── agentview_depth: 212x120x1 array
├── yaw: 2D value
├── actions: 2D value
├── dones: 1D value
└── rewards: 1D value (not used)
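To inspect a converted file, a minimal h5py sketch is shown below. The field names come from the structure listed above; the exact group hierarchy (e.g., whether sequences are stored under per-demo groups) is an assumption, so print the tree first.

```python
import h5py

# Minimal inspection sketch for a converted dataset (layout assumptions noted above).
with h5py.File("PATH_TO_TARGET_FILE", "r") as f:
    f.visit(print)  # print every group/dataset name to see the actual layout

    # Hypothetical access pattern, assuming one group per demonstration:
    # demo = f["data/demo_0"]
    # rgb = demo["agentview_rgb"][:]   # (T, 212, 120, 3)
    # actions = demo["actions"][:]     # (T, 2)
```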
Training
To train the Gait Controller, use the following command. The configuration at ./config/gait/GAIT_CONFIG will be used for training. Trained checkpoints will be saved in ./save/rl_checkpoint/gait/GAIT_POLICY.
python3 scripts/train_gait.py --config=GAIT_CONFIG --gait_policy=GAIT_POLICY
To train the Navigation Controller, use the following command. You need to create or download (link) an hdf5-format dataset file for training. The configuration at ./config/nav/NAV_CONFIG.json will be used for training. Trained checkpoints will be saved in ./save/bc_checkpoint.
python3 scripts/train_nav.py --config=NAV_CONFIG
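As context for what scripts/train_nav.py does, the Navigation Controller is trained with imitation learning on the hdf5 demonstrations. The snippet below is a generic behavior-cloning objective for intuition only; the policy signature and the use of PyTorch here are assumptions, not the repository's actual training code.

```python
import torch.nn as nn
import torch.nn.functional as F

def bc_loss(policy: nn.Module, rgb, depth, yaw, expert_actions):
    """Generic behavior-cloning objective: regress the demonstrated navigation commands."""
    pred_actions = policy(rgb, depth, yaw)  # hypothetical policy signature
    return F.mse_loss(pred_actions, expert_actions)
```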
Evaluation
Place the pre-trained models under ./save/. These pre-trained models will be released later.
To evaluate the Gait Controller alone, use the following command. The checkpoint of the Gait Controller at ./save/rl_checkpoint/gait/GAIT_POLICY will be loaded.
python3 scripts/eval_gait.py --gait_policy=GAIT_POLICY
To evaluate PRELUDE with both the Gait Controller and the Navigation Controller, use the following command. The checkpoint of the Navigation Controller at ./save/bc_checkpoint/NAV_POLICY will be loaded.
python3 scripts/eval_nav.py --gait_policy=GAIT_POLICY --nav_policy=NAV_POLICY
Dataset and pre-trained models
We provide our demonstration dataset collected in simulation environments (link), as well as the trained models of the Navigation Controller (link) and the Gait Controller (link).
Implementation Details
Please see this page for more information about our implementation details, including the model architecture and training procedures.
Citing
@inproceedings{seo2022prelude,
title={Learning to Walk by Steering: Perceptive Quadrupedal Locomotion in Dynamic Environments},
author={Seo, Mingyo and Gupta, Ryan and Zhu, Yifeng and Skoutnev, Alexy and Sentis, Luis and Zhu, Yuke},
booktitle={IEEE International Conference on Robotics and Automation (ICRA)},
year={2023}
}