Home



NEWS!!

See our Change Log for details. The documents will be revised soon. A new tutorial on episode event management is planned.

For more details, please see our Change Log, the Usage Document, and the Task Definition Document.

Mobile-Env: Building Qualified Evaluation Benchmarks for GUI Interaction

Mobile-Env is an interaction platform for building evaluation benchmarks for GUI interaction and for evaluating and training GUI agents. Our paper is available at arXiv.

Mobile-Env is developed on the basis of AndroidEnv. The agent takes the screenshot and the view hierarchy (disabled by default because of its long latency) as the observation, and touches the screen or types a token as the action to interact with the Android apps. Several episode signals, such as step instructions, rewards, or the episode end, are delivered during interaction at crucial steps. A so-called crucial step may be opening a target page, scrolling to the correct area, etc., and depends on the specific task definition.
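To make the interaction loop concrete, here is a brief sketch of a single agent step. It assumes an AndroidEnv-style dm_env interface (a `TimeStep` carrying an observation dict and a reward); the observation key `"pixels"`, the action fields `"action_type"` and `"touch_position"`, and the integer encoding of TOUCH are illustrative assumptions rather than the confirmed Mobile-Env API.

```python
import numpy as np

def take_one_step(env):
    """Illustrative single interaction step.

    The observation keys and action fields below are assumptions for
    illustration, not the confirmed Mobile-Env API.
    """
    timestep = env.reset()
    screenshot = timestep.observation["pixels"]  # assumed key for the screen image
    # When enabled, the view hierarchy would appear as another observation entry.
    print("Screenshot shape:", screenshot.shape)

    # A touch action: an action type plus a normalized screen coordinate.
    action = {
        "action_type": np.array(0, dtype=np.int32),                 # e.g. TOUCH (assumed encoding)
        "touch_position": np.array([0.5, 0.8], dtype=np.float32),   # (x, y) in [0, 1]
    }
    timestep = env.step(action)

    # Episode signals (step instruction, reward, episode end) arrive at crucial steps.
    if timestep.reward:
        print("Reward received at a crucial step:", timestep.reward)
    if timestep.last():
        print("Episode finished.")
    return timestep
```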

The proposed WikiHow task set is available at the Hugging Face Platform.

Index

Platform Features

Mobile-Env is a flexible, adaptable, and easily extensible platform for InfoUI interaction with the following features:

Getting Started

Installation

Install from PyPI:

pip install mobile-env-rl

or clone the repository and install it locally:

git clone https://github.com/X-LANCE/Mobile-Env
cd Mobile-Env
pip install .

Several Docker images with a well-configured Android AVD are also available.

Load and Run Mobile-Env for Evaluation or Training

Before loading the Mobile-Env environment, you need to set up an Android Emulator device. Then you can load the environment with some existing task definitions and start your experiments. Detailed guidance is provided in Evaluating and Training Agents on Mobile-Env. Several examples with a random agent or a human agent are also provided under examples.
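As a minimal sketch of such an experiment, the snippet below loads the environment and drives it with a random agent. It assumes an AndroidEnv-style `android_env.load` entry point and a dm_env-compatible interface; the argument names (`avd_name`, `task_path`) and the AVD/task values are placeholders to be checked against the usage document and the examples.

```python
import numpy as np
from dm_env import specs

import android_env  # Mobile-Env builds on AndroidEnv; the actual import may differ

# Hypothetical loader arguments; consult the usage document for the real signature.
env = android_env.load(
    avd_name="pixel_2_api_30",              # placeholder AVD name
    task_path="demos/some_task.textproto",  # placeholder task definition
)

def random_action(spec):
    """Sample a random action from a dict of dm_env specs."""
    action = {}
    for name, s in spec.items():
        if isinstance(s, specs.DiscreteArray):
            action[name] = np.array(np.random.randint(s.num_values), dtype=s.dtype)
        else:  # assume a BoundedArray for continuous fields such as the touch position
            action[name] = np.random.uniform(
                s.minimum, s.maximum, size=s.shape).astype(s.dtype)
    return action

timestep = env.reset()
while not timestep.last():
    timestep = env.step(random_action(env.action_spec()))
env.close()
```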

Extend a New Environment or a New Task

To extend a new environment for Mobile-Env, the environment designer needs to prepare the app package and ensure that it launches and runs on some version of the Android Emulator. If the app requires varying online data, the necessary data should be crawled, dumped, and then replayed for a consistent evaluation. In that case, the designer should also validate that the certificate-unpinning approach is effective for the package. To extend new tasks, only task definition files are required. Detailed instructions can be found in Extending a New Environment (App) or a New Task Based on Mobile-Env.

Several demo task definitions are provided under demos. Three of them are migrated from AndroidEnv:

Another one, openmoneybox.add_billings.textproto, is defined upon an open-source billing app, OpenMoneyBox. Refer to the task definition files for details.

Miscellaneous Auxiliary Tools

We also developed an annotation tool for human demonstrations, and a suite of template tools to auto-generate task definitions from templates and to combine multiple task definitions into a multi-step task. Details can be found in Miscellaneous Auxiliary Tools.

Reference Time Consumption and Memory Usage of Mobile-Env

The following data were measured on our test configuration:

| Item | Avg Time | Time Std Dev |
| --- | --- | --- |
| TOUCH action | 410.50 µs | 64.71 µs |
| LIFT action | 412.30 µs | 84.18 µs |
| TEXT action | 1.30 s / 0.58 s | 0.28 s / 0.03 s |
| Screenshot capturing | 19.94 ms | 21.47 ms |
| Invocation of Sentence Transformer (all-MiniLM-L12-v2) | 8.51 ms | 0.17 ms |
| VH capturing | 2.53 s | 1.90 s |
| Invocation of EasyOCR | 0.44 s | 0.08 s |

When only the WikiHow app (version 2.9.6) is running, the Android emulator occupies 6,031 MiB of virtual memory and 3,444 MiB of resident memory.

About

This library is developed and maintained by SJTU X-Lance. The corresponding paper is available at https://arxiv.org/abs/2305.08144.

If you find Mobile-Env useful in your research, you can cite the project using the following BibTeX:

@article{DanyangZhang2023_MobileEnv,
  title     = {{Mobile-Env}: Building Qualified Evaluation Benchmarks for LLM-GUI Interaction},
  author    = {Danyang Zhang and
               Zhennan Shen and
               Rui Xie and
               Situo Zhang and
               Tianbao Xie and
               Zihan Zhao and
               Siyuan Chen and
               Lu Chen and
               Hongshen Xu and
               Ruisheng Cao and
               Kai Yu},
  journal   = {CoRR},
  volume    = {abs/2305.08144},
  year      = {2023},
  url       = {https://arxiv.org/abs/2305.08144},
  eprinttype = {arXiv},
  eprint    = {2305.08144},
}