<div align="center"> <img src="docs/assets/v-evaluation-banner.png" alt="Voltron Evaluation Logo"/> </div>

Evaluation Suite for Robotic Representation Learning

Repository for Voltron Evaluation: Diverse Evaluation Tasks for Robotic Representation Learning, spanning Grasp Affordance Prediction, Referring Expression Grounding, Visuomotor Control, and beyond!


Quickstart

This repository is built on PyTorch; while it is specified as a dependency of the package, we highly recommend that you install the desired version (e.g., with accelerator support) for your given hardware and environment manager (e.g., conda).

PyTorch installation instructions can be found here. This repository should work with PyTorch >= 1.12, but has only been thoroughly tested with PyTorch 1.12.0, Torchvision 0.13.0, Torchaudio 0.12.0.
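As a quick sanity check after installing, you can verify the installed versions and accelerator support from Python (a minimal sketch; adjust for your setup):

```python
import torch
import torchvision

# Confirm the installed versions match the tested configuration (1.12.0 / 0.13.0)
print(torch.__version__, torchvision.__version__)

# Check that accelerator support is available, if you installed a CUDA build
print(torch.cuda.is_available())
```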

Once PyTorch has been properly installed, you can install this package locally via an editable installation (this will install voltron-robotics via PyPI if not already on your path):

```bash
git clone https://github.com/siddk/evaluation-dev
cd evaluation-dev
pip install -e .
```

Note: Once finalized (and data permissions are cleared) we hope to upload this directly to PyPI for easier setup.

Usage

V-Evaluation is structured as a series of evaluation applications, each with a semi-unified "harness"; for example, running the referring expression grounding task (on OCID-Ref) is as simple as:

```python
from voltron import instantiate_extractor, load
import voltron_evaluation as vet

# Load a frozen Voltron (V-Cond) model & configure a MAP extractor
backbone, preprocess = load("v-cond", device="cuda", freeze=True)
map_extractor_fn = instantiate_extractor(backbone)

# Instantiate the OCID-Ref harness, fit the task adapter, then report metrics
refer_evaluator = vet.ReferDetectionHarness("v-cond", backbone, preprocess, map_extractor_fn)
refer_evaluator.fit()
refer_evaluator.test()
```

"Vetting" a representation is a modular and straightforward process. Each harness takes a backbone, a callable that defines a new nn.Module for representation extraction, and (optionally) a callable defining a task-specific adapter (if you do not want to use the ones described in the paper). Calling harness.fit() will follow the same preprocessing and training protocols described in the paper, while harness.test() will give you the final metrics.

See examples/ for the other evaluation applications, and voltron_evaluation/<task> for what a task configuration looks like.
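For orientation, a task configuration is essentially a bundle of dataset paths and training hyperparameters. The dataclass below is purely illustrative; the actual fields live in voltron_evaluation/<task> and will differ.

```python
from dataclasses import dataclass

# Purely illustrative sketch; real configurations live in voltron_evaluation/<task>.
@dataclass
class ReferDetectionConfig:
    dataset_path: str = "data/ocid-ref"  # hypothetical dataset location
    batch_size: int = 128                # hypothetical hyperparameters
    epochs: int = 10
    lr: float = 1e-3
```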


Contributing

Before committing to the repository, make sure to set up your dev environment!

Here are the basic development environment setup guidelines:

Additional Contribution Notes:


Repository Structure

High-level overview of the repository file-tree:


Citation

Please cite our paper if you use any of the Voltron models, the evaluation suite, or other parts of our framework in your work.

```bibtex
@inproceedings{karamcheti2023voltron,
  title={Language-Driven Representation Learning for Robotics},
  author={Siddharth Karamcheti and Suraj Nair and Annie S. Chen and Thomas Kollar and Chelsea Finn and Dorsa Sadigh and Percy Liang},
  booktitle={Robotics: Science and Systems (RSS)},
  year={2023}
}
```