Home

Awesome

🛒 WebShop

Python version License PyPI version Pytest workflow

Implementation of the WebShop environment and search agents for the paper:

WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents
Shunyu Yao*, Howard Chen*, John Yang, Karthik Narasimhan

<p float="left"> <img src="assets/diagram.gif"> </p>

This repository contains code for reproducing results. If you find this work useful in your research, please cite:

@inproceedings{yao2022webshop,
  bibtex_show = {true},
  title = {WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents},
  author = {Yao, Shunyu and Chen, Howard and Yang, John and Narasimhan, Karthik},
  booktitle = {ArXiv},
  year = {preprint},
  html = {https://arxiv.org/abs/2207.01206},
  tag = {NLP}
}

📖 Table of Contents <!-- omit in toc -->

👋 Overview

WebShop is a simulated e-commerce website environment with 1.18 million real-world products and 12,087 crowd-sourced text instructions. In this environment, an agent needs to navigate multiple types of webpages and issue diverse actions to find, customize, and purchase a product given an instruction. WebShop provides several challenges including understanding compositional instructions, query (re-)formulation, dealing with noisy text in webpages, and performing strategic exploration.

Hugging Face Demo: Devise your own natural language query for a product and ask for an agent trained with WebShop to find it on Amazon or eBay, deployed as a 🤗 Hugging Face space here!

🚀 Setup

Our code is implemented in Python. To setup, do the following:

  1. Install Python 3.8.13
  2. Install Java
  3. Download the source code:
> git clone https://github.com/princeton-nlp/webshop.git webshop
  1. Create a virtual environment using Anaconda and activate it
> conda create -n webshop python=3.8.13
> conda activate webshop
  1. Install requirements into the webshop virtual environment via the setup.sh script
> ./setup.sh [-d small|all]

The setup script performs several actions in the following order:

  1. By default the WebShop only loads 1,000 products for a faster environment preview. To load all products, change web_agent_site/utils.py:
# DEFAULT_ATTR_PATH = join(BASE_DIR, '../data/items_ins_v2_1000.json')
# DEFAULT_FILE_PATH = join(BASE_DIR, '../data/items_shuffle_1000.json')
DEFAULT_ATTR_PATH = join(BASE_DIR, '../data/items_ins_v2.json')
DEFAULT_FILE_PATH = join(BASE_DIR, '../data/items_shuffle.json')
  1. (Optional) Download ResNet image feature files here and put into data/ for running models that require image features.

  2. (Optional) Human demonstration data and be downloaded here.

🛠️ Usage

The WebShop environment can be rendered in two modes - html and simple - each of which offer a different observation space. The simple mode strips away the extraneous meta-data that the html mode includes to make model training and evaluation easier.

Webpage Environment (html mode)

Launch the WebShop webpage:

> ./run_dev.sh

The site should then be viewable in the browser. Go to http://localhost:3000/ABC, where you should land on the search home page with a random instruction.

Navigating the website will automatically generate a corresponding trajectory file in the user_session_logs/mturk folder. Each file corresponds to a single instruction/web session, and each step of the file corresponds to a single action (i.e. search[...], click[...]).

The current WebShop build comes with two flags:

Text Environment (simple mode)

The simple mode of the WebShop environment is packaged and readily available as an OpenAI environment. The OpenAI gym definitions of the text environment can be found in the web_agent_site/envs folder.

To start using the gym and building agents that interact with the WebShop environment, include the following statements in your Python file:

import gym
from web_agent_site.envs import WebAgentTextEnv

env = gym.make('WebAgentTextEnv-v0', observation_mode='text', num_products=...)

Now, you can write your own agent that interacts with the environment via the standard OpenAI gym interface.

Examples of a RandomPolicy agent interacting with the WebShop environment in both html and simple mode can be found in the run_envs folder. To run these examples locally, run the run_web_agent_text_env.sh or run_web_agent_site_env.sh script:

> ./run_web_agent_text_env.sh
Products loaded.
Keys Cleaned.
Attributes Loaded.
100%|██████████████████| 1000/1000
Loaded 6910 goals.
Amazon Shopping Game [SEP] Instruction: [SEP] Find me slim f...
Available actions: {'has_search_bar': True, 'clickables': ['search']}
Taking action "search[shoes]" -> Reward = 0.0
...

In order to run the run_web_agent_site_env.sh script, you must download a version of ChromeDriver compatible with your Chrome browser version. Once you have downloaded and unzipped the executable, rename it chromedriver and place it in the webshop/envs folder.

Baseline Models

To run baseline models (rule, IL, RL, IL+RL) from the paper, please refer to the README.md in the baseline_models folder.

Sim-to-real Transfer

To read more about how the sim-to-real transfer of agents trained on WebShop to other environments works, please refer to the README.md in the transfer folder.

💫 Contributions

We would love to hear from the broader NLP and Machine Learning community, and we welcome any contributions, pull requests, or issues! To do so, please either file a new pull request or issue and fill in the corresponding templates accordingly. We'll be sure to follow up shortly!

🪪 License

Check LICENSE.md