Awesome
ViPlanner: Visual Semantic Imperative Learning for Local Navigation
<p align="center"> <a href="https://leggedrobotics.github.io/viplanner.github.io/">Project Page</a> • <a href="https://arxiv.org/abs/2310.00982">arXiv</a> • <a href="https://youtu.be/8KO4NoDw6CM">Video</a> • <a href="#citing-viplanner">BibTeX</a>Click on the image for the demo video!
</p>ViPlanner is a robust learning-based local path planner based on semantic and depth images. Fully trained in simulation, the planner can be applied in dynamic indoor as well as outdoor environments. We provide it as an extension for NVIDIA Isaac-Sim within the IsaacLab project (details here). Furthermore, a ready-to-use ROS Noetic package is available within this repo for direct integration on any robot (tested and developed on ANYmal C and D).
Keywords: Visual Navigation, Local Planning, Imperative Learning
Install
- Install
pyproject.toml
with pip by running:
orpip install .
if you want to edit the code. To apply the planner in the ROS-Node, install it with the inference setting:pip install -e .[standard]
Make sure the CUDA toolkit is of the same version as used to compile torch. We assume 11.7. If you are using a different version, adjust the string for the mmcv install as given . If the toolkit is not found, set thepip install -e .[standard,inference]
CUDA_HOME
environment variable, as follows:
On the Jetson, please useexport CUDA_HOME=/usr/local/cuda
aspip install -e .[inference,jetson]
mmdet
requires torch.distributed which is only build until version 1.11 and not compatible with pypose. See the Dockerfile for a workaround.
Known Issue
- mmcv build wheel does not finish:
- fix by installing with defined CUDA version, as detailed here. For CUDA Version 11.7 and torch==2.0.x use
pip install mmcv==2.0.0 -f https://download.openmmlab.com/mmcv/dist/cu117/torch2.0/index.html
Extension
This work includes the switch from semantic to direct RGB input for the training pipeline to facilitate further research. For RGB input, an option exists to employ a backbone with mask2former pre-trained weights. For this option, include the GitHub submodule, install the requirements included there, and build the necessary Cuda operators. These steps are not necessary for the published planner!
pip install git+https://github.com/facebookresearch/detectron2.git
git submodule update --init
pip install -r third_party/mask2former/requirements.txt
cd third_party/mask2former/mask2former/modeling/pixel_decoder/ops \
sh make.sh
Remark
Note that for an editable installation of packages without setup.py, PEP660 has to be fulfilled. This requires the following versions (as described here in detail)
- pip >= 21.3
python3 -m pip install --upgrade pip
- setuptools >= 64.0.0
python3 -m pip install --upgrade setuptools
Inference and Model Demo
-
Real-World <br>
ROS-Node is provided to run the planner on the LeggedRobot ANYmal; for details, please see ROS-Node-README.
-
NVIDIA Isaac-Sim <br>
The planner can be executed within Nvidia Isaac Sim. It is implemented as part of the IsaacLab Framework with an own extension. For details, please see Omniverse Extension. This includes a planner demo in different environments with the trained model.
Training
Here is an overview of the steps involved in training the policy. For more detailed instructions, please refer to TRAINING.md.
-
Training Data Generation <br> Training data is generated from the Matterport 3D, Carla and NVIDIA Warehouse using IsaacLab. For detailed instructions on how to install the extension and run the data collection script, please see here
-
Build Cost-Map <br> The first step in training the policy is to build a cost-map from the available depth and semantic data. A cost-map is a representation of the environment where each cell is assigned a cost value indicating its traversability. The cost-map guides the optimization, therefore, it is required to be differentiable. Cost-maps are built using the cost-builder with configs here, given a pointcloud of the environment with semantic information (either from simulation or real-world information). The point-cloud of the simulated environments can be generated with the reconstruction-script with config here.
-
Training <br> Once the cost-map is constructed, the next step is to train the policy. The policy is a machine learning model that learns to make decisions based on depth and semantic measurements. An example training script can be found here with configs here
-
Evaluation <br> Performance assessment can be performed on simulation and real-world data. The policy will be evaluated regarding multiple metrics such as distance to the goal, average and maximum cost, and path length. In order to let the policy be executed on anymal in simulation, please refer to Omniverse Extension
Model Download
The latest model is available to download: [checkpoint] [config]
<a name="CitingViPlanner"></a>Citing ViPlanner
@inproceedings{roth2024viplanner,
title={Viplanner: Visual semantic imperative learning for local navigation},
author={Roth, Pascal and Nubert, Julian and Yang, Fan and Mittal, Mayank and Hutter, Marco},
booktitle={2024 IEEE International Conference on Robotics and Automation (ICRA)},
pages={5243--5249},
year={2024},
organization={IEEE}
}
License
This code belongs to Robotic Systems Lab, ETH Zurich. All right reserved
Authors: Pascal Roth, Julian Nubert, Fan Yang, Mayank Mittal, Ziqi Fan, and Marco Hutter<br /> Maintainer: Pascal Roth, rothpa@ethz.ch
The ViPlanner package has been tested under ROS Noetic on Ubuntu 20.04. This is research code, expect that it changes often and any fitness for a particular purpose is disclaimed.