Awesome
ReALFRED
ReALFRED: An Embodied Instruction Following Benchmark in Photo-Realistic Environments,
Taewoong Kim*,
Cheolhong Min*,
Byeonghwi Kim,
Jinyeon kim,
Wonje Jeung,
Jonghyun Choi
ECCV 2024
Abstract: Simulated virtual environments have been widely used to learn robotic agents that perform daily household tasks. These environments encourage research progress by far, but often provide limited object interactability, visual appearance different from real-world environments, or relatively smaller environment sizes. This prevents the learned models in the virtual scenes from being readily deployable. To bridge the gap between these learning environments and deploying (i.e., real) environments, we propose the ReALFRED benchmark that employs real-world scenes, objects, and room layouts to learn agents to complete household tasks by understanding free-form language instructions and interacting with objects in large, multi-room and 3D-captured scenes. Specifically, we extend the ALFRED benchmark with updates for larger environmental spaces with smaller visual domain gaps. With ReALFRED, we analyze previously crafted methods for the ALFRED benchmark and observe that they consistently yield lower performance in all metrics, encouraging the community to develop methods in more realistic environments. Our code and data are publicly available.
Installation
Download builds.zip
here.
$ unzip builds.zip
# remove redundant file
$ rm builds.zip
$ conda create -n realfred python=3.6
$ conda activate realfred
$ pip install ai2thor==4.3.0
$ export LOCAL_BUILDS_PATH=builds/thor-Linux64-local/thor-Linux64-local
Play around
import os
from ai2thor.controller import Controller
controller = Controller(local_executable_path=os.environ['LOCAL_BUILDS_PATH'])
event = controller.step("MoveAhead")
Download
Download the annotation files from <a href="https://huggingface.co/datasets/SNUMPR/realfred_json">the Hugging Face repo</a>.
git clone https://huggingface.co/datasets/SNUMPR/realfred_json data
To train seq2seq, moca, and abp, download the ResNet-18 features and annotation files from <a href="https://huggingface.co/datasets/SNUMPR/realfred_feat">the Hugging Face repo</a>. <br> Note: It takes quite a large space (~2.3TB).
git clone https://huggingface.co/datasets/SNUMPR/realfred_feat data
Baseline code
This repository provides code for several baseline models implemented in the ReALFRED benchmark, including:
Please refer to each README.md
for detailed instructions to reproduce results in the paper.
Hardware
Tested on:
- GPU - RTX A6000
- CPU - Intel(R) Core(TM) i7-12700K CPU @ 3.60GHz
- RAM - 64GB
- OS - Ubuntu 20.04
Citation
@inproceedings{kim2024realfred,
author = {Kim, Taewoong and Min, Cheolhong and Kim, Byeonghwi and Kim, Jinyeon and Jeung, Wonje and Choi, Jonghyun},
title = {ReALFRED: An Embodied Instruction Following Benchmark in Photo-Realistic Environment},
booktitle = {ECCV},
year = {2024}
}
Acknowledgements
This work was partly supported by the NRF grant (No.2022R1A2C400230012, 5%)
and IITP grants (No.RS-2022-II220077 (5%), No.RS-2022-II220113 (5%),
No.RS-2022-II220959 (5%), No.RS-2022-II220871 (15%), No.RS-2020-II201361 (5%, Yonsei AI),
No.RS-2021-II211343 (5%, SNU AI), No.RS-2021-II212068 (5%, AI Innov. Hub), No.RS-2022-II220951(50%))
funded by the Korea government(MSIT).