<div align="center"> <h1>ShortcutsBench</h1> </div>
Read this in Chinese.
What are Shortcuts?
Shortcuts are workflows built by developers in the Shortcuts app using a user-friendly graphical interface with the provided basic actions. Apple describes them as "a quick way to get one or more tasks done with your apps."
Project Task List (Continuously Updated)
All data, data acquisition processes, data generated during cleaning, cleaning scripts, experiment scripts, results, and related files can be found in the following documents: deves_dataset/dataset_src/README.md (English) or Chinese, deves_dataset/dataset_src_valid_apis/README.md (English) or Chinese, and experiments/README.md (English) or Chinese.
- ShortcutsBench Paper Main Text
- ShortcutsBench Paper Appendix
- Scripts for Data Acquisition, Data Cleaning and Processing, Experiment Code, and Experiment Results
- We provide shortcuts with bilingual explanations for regular users, listed in users_dataset/${website name}/${category name}/README.md (English) or users_dataset/${website name}/${category name}/README_ZH.md (Chinese). Regular users can find suitable shortcuts for their work or life in our repository and import them into the Shortcuts app on Apple devices. Each shortcut includes:
  - The iCloud link for the shortcut
  - A description of the shortcut's functionality
  - The source of the shortcut
- For Shortcut Researchers: ShortcutsBench provides: (1) Shortcuts (i.e., sequences of actions in golden); (2) Queries (i.e., tasks assigned to the agent); (3) APIs (i.e., tools available to the agent).
  - Shortcuts
    - Raw Shortcut Dataset, i.e., the file 1_final_detailed_records_remove_repeat.json, can be downloaded as described in deves_dataset/dataset_src/README.md (English) or deves_dataset/dataset_src/README_ZH.md (Chinese), or directly from Google Drive or Baidu Cloud (password: shortcutsbench). The APIs involved in the shortcuts in this file may not have corresponding API definition files.
    - Filtered Shortcut Dataset, i.e., the file 1_final_detailed_records_filter_apis.json, can be downloaded as described in deves_dataset/dataset_src/README.md (English) or deves_dataset/dataset_src/README_ZH.md (Chinese), or directly from Google Drive or Baidu Cloud (password: shortcutsbench). The APIs involved in the shortcuts in this file all have corresponding API definition files. This file is a cleaned version of 1_final_detailed_records_remove_repeat.json. If a shortcut contains APIs without definition files, the shortcut is removed.
    - Shortcuts Dataset <=30, i.e., the file 1_final_detailed_records_filter_apis_leq_30.json, can be downloaded as described in experiments/README.md (English) or experiments/README_ZH.md (Chinese), or directly from Google Drive or Baidu Cloud (password: shortcutsbench). Considering the context length limitation of language models, we only evaluated shortcuts with lengths <=30 in the ShortcutsBench paper.
  - Queries. The generated queries are shown in generated_success_queries.json, which can be obtained from Google Drive or Baidu Cloud (password: shortcutsbench). The queries are generated based on 1_final_detailed_records_filter_apis_leq_30.json.
  - APIs. The obtained APIs are shown in 4_api_json_filter.json, which can be obtained from Google Drive or Baidu Cloud (password: shortcutsbench). 4_api_json_filter.json has been manually deduplicated, but a few duplicates remain. The raw unprocessed files extracted directly from the app are in 4_api_json.json, which can be obtained from Google Drive or Baidu Cloud (password: shortcutsbench).
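The two filtering rules described above (drop any shortcut that uses an API without a definition file, then keep only shortcuts with lengths <=30) can be sketched as follows. This is a minimal illustration, not the project's cleaning script. It assumes that "length" means the number of actions, and that each record stores its actions under `record["shortcut"]["WFWorkflowActions"]` with a `WFWorkflowActionIdentifier` per action; the record layout, the function names, and the toy records are all hypothetical.

```python
def action_ids(record):
    """Return the action (API) identifiers of one shortcut record.

    Assumed layout (hypothetical): record["shortcut"]["WFWorkflowActions"] is a
    list of actions, each carrying a "WFWorkflowActionIdentifier" string.
    """
    actions = record.get("shortcut", {}).get("WFWorkflowActions", [])
    return [action.get("WFWorkflowActionIdentifier", "") for action in actions]


def keep_defined_apis(records, defined_apis):
    """Drop every shortcut that uses at least one API without a definition."""
    return [r for r in records if all(i in defined_apis for i in action_ids(r))]


def keep_short(records, max_actions=30):
    """Keep shortcuts with at most max_actions actions."""
    return [r for r in records if len(action_ids(r)) <= max_actions]


# Toy records for illustration only, not taken from the dataset.
records = [
    {"name": "a", "shortcut": {"WFWorkflowActions": [
        {"WFWorkflowActionIdentifier": "is.workflow.actions.gettext"}]}},
    {"name": "b", "shortcut": {"WFWorkflowActions": [
        {"WFWorkflowActionIdentifier": "com.example.app.UndefinedIntent"}]}},
    {"name": "c", "shortcut": {"WFWorkflowActions": [
        {"WFWorkflowActionIdentifier": "is.workflow.actions.gettext"}] * 31}},
]
defined = {"is.workflow.actions.gettext"}

filtered = keep_defined_apis(records, defined)
short = keep_short(filtered)
print([r["name"] for r in filtered])  # ['a', 'c']
print([r["name"] for r in short])     # ['a']
```

Applying the API filter before the length filter mirrors the order of the three dataset files listed above.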
How can this project help you?
Apple's developer conference WWDC'24 introduced many AI features for Apple devices. We are very interested in how Apple combines large language models like ChatGPT with devices to provide users with a smarter experience. In this process, shortcuts will play a significant role!
As a Shortcut User and Enthusiast
You can find your favorite shortcuts in this dataset to help you complete various complex tasks with one click! For example:
- Daily Life
- Shopping Enthusiasts
- Students
  - Calculator
  - Relax Your Mind
  - ......
- Writers
  - Translator
  - Create PDF
  - ......
- Researchers
  - Get arXiv BibTeX Entry
  - ......
- ......
As a Researcher
- Research on building automated workflows: Shortcuts are essentially workflows composed of a series of API calls (actions) provided by Apple and third-party apps.
- Research on low-code programming: Shortcuts include features like branches, loops, and variable assignments, while having a user-friendly graphical interface.
- Research on API-based agents: Enabling large language models to autonomously decide whether, when, and how to use APIs based on user queries (tasks).
- Research on fine-tuning large language models using shortcuts to closely integrate language models with phones, computers, and smartwatches, achieving the vision of an "operating system based on large language models".
- ......
Advantages of ShortcutsBench Over Existing API-Based Agent Datasets
ShortcutsBench has significant advantages in terms of the authenticity, richness, and complexity of APIs, the validity of queries and corresponding action sequences, the accurate filling of parameter values, the awareness of obtaining information from the system or users, and the overall scale.
To our knowledge, ShortcutsBench is the first large-scale agent benchmark based on real APIs, considering APIs, queries, and corresponding action sequences. ShortcutsBench provides a rich set of real APIs, queries of varying difficulty and task types, high-quality human-annotated action sequences (provided by shortcut developers), and queries from real user needs. Additionally, it offers precise parameter value filling, including raw data types, enumeration types, and using outputs from previous actions as parameter values, and evaluates the agent's awareness of requesting necessary information from the system or users. Moreover, the scale of APIs, queries, and corresponding action sequences in ShortcutsBench rivals or even surpasses benchmarks and datasets created by LLMs or modified from existing datasets. A comprehensive comparison between ShortcutsBench and existing benchmarks/datasets is shown in the table below.
If you find this project helpful, please give us a Star! Thank you for your support!
Keywords: Shortcuts, Apple, WWDC'24, Siri, iOS, macOS, watchOS, Workflow, API Calls, Low-Code Programming, Agent, Large Language Model
User Guide for Shortcuts (For Users)
Search for the Shortcut You Want
In this repository, the users_dataset/${website name}/${category name}/README.md file records the metadata of all shortcuts in the category, including name, description, iCloud download link, etc. Each README.md file follows this structure:
### Name: Wine Shops # Shortcut Name
- URL: https://www.icloud.com/shortcuts/78ffd18288fd4da286bfd570993ea46e # iCloud Link
- Source: https://shortcutsgallery.com # Source
- Description: Look for Wine shops near you # Description
Use your browser's find function (Ctrl + F) to search the page by keyword in the shortcut name. You can also visit Shortcut Collection Sites to search for the shortcuts you want.
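For searching many category files at once, the README.md structure shown above can also be parsed programmatically. The sketch below is one possible approach and not part of the repository: the function names `parse_readme` and `search` are hypothetical, and it assumes the real files contain the four fields exactly as in the sample, without the trailing `# ...` annotations used there for explanation.

```python
import re

# One entry starts with "### Name: ..." and is followed by "- Field: value" lines.
ENTRY_HEADER = re.compile(r"^### Name:\s*(.+?)\s*$")
FIELD = re.compile(r"^- (URL|Source|Description):\s*(.*?)\s*$")


def parse_readme(text):
    """Parse a category README.md into a list of dicts, one per shortcut."""
    entries = []
    current = None
    for line in text.splitlines():
        header = ENTRY_HEADER.match(line)
        if header:
            current = {"Name": header.group(1)}
            entries.append(current)
            continue
        field = FIELD.match(line)
        if field and current is not None:
            current[field.group(1)] = field.group(2)
    return entries


def search(entries, keyword):
    """Return the entries whose name contains keyword (case-insensitive)."""
    keyword = keyword.lower()
    return [e for e in entries if keyword in e.get("Name", "").lower()]


sample = """### Name: Wine Shops
- URL: https://www.icloud.com/shortcuts/78ffd18288fd4da286bfd570993ea46e
- Source: https://shortcutsgallery.com
- Description: Look for Wine shops near you
"""

entries = parse_readme(sample)
for entry in search(entries, "wine"):
    print(entry["Name"], entry["URL"])
```

Lines that match neither pattern are ignored, so blank lines and other prose in the file do not break the parse.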
Import the Found Shortcut
On your Apple device, click the iCloud link in the URL field, and the shortcut will automatically open and be imported into your Shortcuts app.
Download Shortcut Source Files
Besides downloading shortcuts one by one using the iCloud links, you can directly get the complete data from the following links:
Data Sources and Links
Introduction to Shortcut Source Files
The shortcut source data in the cloud drive is organized in the following directory structure:
users_dataset/
├── matthewcassinelli.com_sirishortcuts_library_free # Website Name
│   ├── file1
│   ├── file2
│   ├── file3
or
users_dataset/
├── jiejingku.net # Website Name
│   ├── category1 # Category
│   │   ├── file1 # Each specific shortcut
│   │   ├── file2
│   ├── category2
│   │   ├── file3
Each file represents a shortcut. The file name is derived from the shortcut name by replacing every character that is not an ASCII letter or digit with an underscore, using the following code (which requires import re):
file_name = re.sub(r'[^a-zA-Z0-9]', '_', name)
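As a quick illustration of what that substitution produces, the sketch below wraps the same regular expression in a helper; the function name `shortcut_file_name` is hypothetical and only used here. It also shows two consequences of the rule: names that differ only in punctuation map to the same file name, and every non-ASCII character becomes an underscore.

```python
import re


def shortcut_file_name(name):
    """Replace every character that is not an ASCII letter or digit with '_'."""
    return re.sub(r'[^a-zA-Z0-9]', '_', name)


print(shortcut_file_name("Wine Shops"))              # Wine_Shops
print(shortcut_file_name("Get arXiv BibTeX Entry"))  # Get_arXiv_BibTeX_Entry

# Names that differ only in punctuation collapse to the same file name.
print(shortcut_file_name("Create PDF") == shortcut_file_name("Create-PDF"))  # True

# Non-ASCII characters are each replaced by one underscore.
print(shortcut_file_name("快捷指令"))  # ____
```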
The shortcut source files we provide are in JSON format, whereas shortcuts exported from Apple devices are either iCloud links (shared as links) or encrypted shortcut files with the .shortcut extension.
To import a shortcut source file into the Shortcuts app on macOS, follow these steps:
- Convert the JSON file format to PLIST format. Note that the snippet below, as written, goes in the PLIST-to-JSON direction: it parses an XML PLIST file and writes the result as JSON.

  ```python
  import json
  import xml.etree.ElementTree as ET

  def parse_element(element):
      """Recursively parse XML plist elements and return dictionaries and lists."""
      if element.tag == 'dict':
          # Children alternate: <key>, value, <key>, value, ...
          return {element[i].text: parse_element(element[i + 1]) for i in range(0, len(element), 2)}
      elif element.tag == 'array':
          return [parse_element(child) for child in element]
      elif element.tag == 'true':
          return True
      elif element.tag == 'false':
          return False
      elif element.tag == 'integer':
          return int(element.text)
      elif element.tag == 'string':
          return element.text
      elif element.tag == 'real':
          return float(element.text)
      else:
          raise ValueError("Unsupported tag: " + element.tag)

  file_path = "input.plist"  # placeholder: path of the XML plist file to read
  save_path = "output.json"  # placeholder: path of the JSON file to write

  tree = ET.parse(file_path)
  root_element = tree.getroot()
  data = parse_element(root_element[0])  # first child of <plist> is the top-level value

  with open(save_path, 'w') as f:
      json.dump(data, f, indent=4)
  ```
- Sign the PLIST file using shortcuts sign --mode anyone --input $input_file --output $output_file, replacing $input_file and $output_file with the actual file paths.
- Import the signed file into the Shortcuts app.
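The JSON-to-PLIST conversion named in the first step, followed by the sign command from the second, could be sketched with Python's standard `plistlib` module as below. This is an assumption-laden sketch rather than the project's own tooling: the helper names `json_to_plist` and `sign_command` are hypothetical, and the conversion only works when the JSON contains no `null` values, because `plistlib` raises an error for `None`.

```python
import json
import plistlib
from pathlib import Path


def json_to_plist(json_path, plist_path):
    """Read a JSON shortcut source file and write it as an XML plist."""
    data = json.loads(Path(json_path).read_text(encoding="utf-8"))
    with open(plist_path, "wb") as f:
        # plistlib cannot serialize None, so JSON nulls must be removed beforehand.
        plistlib.dump(data, f, fmt=plistlib.FMT_XML)


def sign_command(input_file, output_file):
    """Build the argument list for the `shortcuts sign` command quoted above."""
    return ["shortcuts", "sign", "--mode", "anyone",
            "--input", str(input_file), "--output", str(output_file)]
```

On macOS, the list returned by `sign_command` can be passed to `subprocess.run(..., check=True)`; building it as a list avoids shell quoting problems with file names that contain spaces.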
ShortcutsBench Dataset Construction Guide
The main text of our paper details the construction process of ShortcutsBench. The notes below add some practical details.
How to use shortcuts? How to share shortcuts? How to view the source files of shortcuts?
- Import shortcuts into the Shortcuts app.
  You can import a shortcut into the Shortcuts app on an Apple device by clicking its iCloud link, and then use it as a regular user.
- Share shortcuts.
  - You can share the shortcut as an iCloud link using the Share option in the Shortcuts app on macOS or iOS.
  - You can share the shortcut as a source file using the Share option in the Shortcuts app on macOS, resulting in a shortcut file with the .shortcut extension. Note: The shared source file is encrypted by Apple and cannot be directly parsed using the plist package in Python.
- Decrypt single or multiple shortcuts. To decrypt shortcuts, you can use the following helper shortcuts. The decrypted files will be in plist format.
  - Get Plist - Parse a single shortcut to a plist file
  - Get Plist Loop - Parse all shortcuts in the Shortcuts app to plist files and save them

  To make them easier to read, you can choose to convert the plist files to json format. The shortcut source files we provide are all in json format.
- How to acquire shortcut source files on a large scale?
  Instead of using Get Plist and Get Plist Loop to parse shortcuts, we follow these three steps for quicker and more efficient mass acquisition of shortcut source files:
  - Obtain iCloud links in the format https://www.icloud.com/shortcuts/${unique_id}.
  - Request partial metadata of the shortcut from https://www.icloud.com/shortcuts/api/records/${unique_id}, including the shortcut name and download link for the source file.
  - Use the download link cur_dict["fields"]["shortcut"]["value"]["downloadURL"] obtained in the previous step to request the source file of the shortcut. Note: The download link expires quickly, so you need to use it promptly.

  The directly downloaded source file is in plist format. You can choose to convert the plist format to json format.

  The following code (simplified) demonstrates the entire process, with the final response_json being the json format shortcut source file:

  ```python
  import biplist
  import requests

  # unique_id is the last path component of the shortcut's iCloud link
  response = requests.get(f"https://www.icloud.com/shortcuts/api/records/{unique_id}")
  cur_dict = response.json()
  downloadURL = cur_dict["fields"]["shortcut"]["value"]["downloadURL"]
  new_response = requests.get(downloadURL)
  # Parse the downloaded plist bytes with the biplist package and store the result in response_json
  response_json = biplist.readPlistFromString(new_response.content)
  ```
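The offline parts of the acquisition steps above (deriving the records URL from an iCloud link, and turning downloaded plist bytes into JSON text) can be sketched without any network access. This sketch swaps in the standard-library `plistlib` where the simplified code above uses `biplist`; the helper names are hypothetical, and encoding binary values as base64 strings is an assumption made here so that the result is valid JSON, not something the project specifies.

```python
import base64
import datetime
import json
import plistlib
from urllib.parse import urlparse

RECORDS_API = "https://www.icloud.com/shortcuts/api/records/{unique_id}"


def extract_unique_id(icloud_link):
    """Return the ${unique_id} part of https://www.icloud.com/shortcuts/${unique_id}."""
    parts = urlparse(icloud_link).path.strip("/").split("/")
    if len(parts) != 2 or parts[0] != "shortcuts" or not parts[1]:
        raise ValueError(f"not a shortcut iCloud link: {icloud_link!r}")
    return parts[1]


def records_url(icloud_link):
    """Build the metadata URL for a shortcut from its iCloud link."""
    return RECORDS_API.format(unique_id=extract_unique_id(icloud_link))


def to_jsonable(value):
    """Recursively replace plist-only types with JSON-friendly ones."""
    if isinstance(value, dict):
        return {str(k): to_jsonable(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_jsonable(v) for v in value]
    if isinstance(value, bytes):
        # Assumption: binary data is kept as a base64 string.
        return base64.b64encode(value).decode("ascii")
    if isinstance(value, datetime.datetime):
        return value.isoformat()
    return value


def plist_bytes_to_json(raw):
    """Parse XML or binary plist bytes and return indented JSON text."""
    return json.dumps(to_jsonable(plistlib.loads(raw)), indent=4)


link = "https://www.icloud.com/shortcuts/78ffd18288fd4da286bfd570993ea46e"
print(records_url(link))
```

`plistlib.loads` detects XML and binary plists automatically, so the same helper works for either encoding of the downloaded file.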
License Statement
All code and datasets in this project are licensed under the Apache License 2.0. This means you are free to use, copy, modify, and distribute the content of this project, but must comply with the following conditions:
- Copyright Notice: The original copyright notice and license statement must be included in all copies of the project.
- State Changes: If you modify the code, you must indicate the changes in any modified files.
- Trademark Use: This license does not grant the right to use project trademarks, service marks, or trade names.
For the full text of the license, please see LICENSE.
Additionally, you must comply with the license agreements of the shortcut sharing sites that provided the data sources for this project.
Citation
If you find this project helpful, please consider citing our work:
@misc{
shen2024shortcutsbenchlargescalerealworldbenchmark,
title={ShortcutsBench: A Large-Scale Real-world Benchmark for API-based Agents},
author={Haiyang Shen and Yue Li and Desong Meng and Dongqi Cai and Sheng Qi and Li Zhang and Mengwei Xu and Yun Ma},
year={2024},
eprint={2407.00132},
archivePrefix={arXiv},
primaryClass={cs.SE},
url={https://arxiv.org/abs/2407.00132},
}