Mapping natural language commands to web elements
Data
Due to its large size, the data is hosted outside GitHub: https://nlp.stanford.edu/projects/phrasenode/
You can download the dataset by running the script as follows.
bash download_dataset.sh
Setup
The code was developed in the following environment:
- Python 2.7
- PyTorch 0.4.1
- CUDA 9
To install dependencies:
- (Optional) Create a virtualenv / conda environment
  virtualenv.py -p python2.7 env
  source env/bin/activate
- Python dependencies
  sudo apt-get install python-dev
  pip install -r requirements.txt
Alternatively, use the Docker image ppasupat/phrasenode
For the latest image: docker pull ppasupat/phrasenode:1.06
Quick start
If you just want to see something happen:
export WEBREP_DATA=./data
./main.py configs/base.txt configs/model/encoding.txt configs/node-embedder/allan.txt -n testrun
- This executes the main entrypoint script main.py with three config files.
- The config files are specified in HOCON format. If multiple files are specified, they will be merged together (with later configs overwriting values from previous ones).
- You can also add ad-hoc config strings using the -c option. These are applied last.
- -n specifies the experiment directory name.
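The merge order described above (later configs override earlier ones, with ad-hoc strings applied last) can be illustrated with a minimal sketch on plain nested dictionaries. This is not the project's config loader, and it does not parse HOCON; the function names and all config keys below are hypothetical and only show the override semantics.

```python
import copy


def merge_configs(*configs):
    """Merge nested dicts left to right; later values override earlier ones."""
    merged = {}
    for config in configs:
        _merge_into(merged, config)
    return merged


def _merge_into(target, source):
    """Recursively copy `source` into `target`, overriding existing leaves."""
    for key, value in source.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            # Both sides are objects: merge them key by key.
            _merge_into(target[key], value)
        else:
            # Otherwise the later value replaces the earlier one.
            target[key] = copy.deepcopy(value)


# Hypothetical stand-ins for parsed config files and a -c string.
base = {"model": {"type": "encoding", "dim": 128}, "train": {"batch_size": 32}}
override = {"model": {"dim": 256}}
adhoc = {"train": {"batch_size": 16}}

merged = merge_configs(base, override, adhoc)
print(merged)
```

Keys that appear only in an earlier config (here `model.type`) survive the merge, while keys repeated in a later config take the later value.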
Configurations
Here are the configurations used in the final experiments:
- base.txt: Used in all experiments
- models/encoding.txt: The embedding-based method
- models/alignment.txt: The alignment-based method
- node-embedder/allan.txt: The node embedder as described in the paper
- ablation/*.txt: Ablation
Note that the visual neighbor is off by default.
To turn it on, use general/neighbors.txt.
Experiment management
All training runs are managed by the PhraseNodeTrainingRuns object. For example, to get training run #141, do this:
runs = PhraseNodeTrainingRuns() # Note the final "s"
run = runs[141] # a PhraseNodeTrainingRun object
A TrainingRun is responsible for constructing a model, training it, saving it and reloading it (see superclasses gtd.ml.TrainingRun and gtd.ml.TorchTrainingRun for details).
The most important methods on PhraseNodeTrainingRun are:
- __init__: the model, data storage, etc., are initialized
- train: actual training of the model happens here
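The experiment directory name 0_testrun used in this README suggests that each run is stored under data/experiments as `<number>_<name>`. Assuming that layout holds, a run directory can be located by its number with a small standalone helper. This is only a sketch: `find_run_dir` is a hypothetical function, not part of gtd or PhraseNodeTrainingRuns, and it does not load a model.

```python
import os


def find_run_dir(experiments_dir, run_number):
    """Return the path of the run directory named '<run_number>_<name>'.

    Assumes (not confirmed by the source) that every run lives in a
    directory whose name starts with the run number and an underscore.
    """
    prefix = "{}_".format(run_number)
    matches = [
        name for name in sorted(os.listdir(experiments_dir))
        if name.startswith(prefix)
        and os.path.isdir(os.path.join(experiments_dir, name))
    ]
    if len(matches) != 1:
        raise KeyError(
            "expected exactly one directory for run {}, found {}".format(
                run_number, matches))
    return os.path.join(experiments_dir, matches[0])
```

For example, `find_run_dir("data/experiments", 0)` would return `data/experiments/0_testrun` if that directory exists and is the only one with the prefix `0_`.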
TensorBoard
Statistics are logged to TensorBoard. To view:
tensorboard --logdir=data/experiments
Demo Chrome extension
- Start the server with
  export WEBREP_DATA=./data
  ./server.py data/experiments/0_testrun/config.txt -m data/experiments/0_testrun/checkpoints/20000.checkpoint/model
  where 0_testrun should be changed to the model's directory, and 20000 should be changed to the checkpoint number you want.
- Install the unpacked Chrome extension in demo/phrasenode-demo
  - Follow Chrome's instructions for loading an unpacked extension. (The manifest file should already be in demo/phrasenode-demo)
  - An extension button should now show up on the toolbar.
  - On any web page, click on the extension button and enter a phrase in the prompt that pops up.
  - If the server does not throw an error, the selected element should be highlighted with a red border. Details can be viewed in the developer console.
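The two substitutions in the server command (the run directory and the checkpoint number) can be made explicit with a short helper that assembles the argument list. This is a minimal sketch: `server_args` is a hypothetical name, and the path layout is copied from the command shown above rather than from the code of server.py.

```python
import posixpath


def server_args(run_dir_name, checkpoint_number,
                experiments_dir="data/experiments"):
    """Build the argument list for ./server.py for one run and checkpoint.

    The layout (config.txt and checkpoints/<N>.checkpoint/model inside the
    run directory) mirrors the example command in this README.
    """
    run_dir = posixpath.join(experiments_dir, run_dir_name)
    config_path = posixpath.join(run_dir, "config.txt")
    model_path = posixpath.join(
        run_dir, "checkpoints",
        "{}.checkpoint".format(checkpoint_number), "model")
    return ["./server.py", config_path, "-m", model_path]


# Reproduces the example command for run 0_testrun at checkpoint 20000.
print(" ".join(server_args("0_testrun", 20000)))
```

The returned list can be passed to `subprocess.call` with WEBREP_DATA set in the environment, or joined into a single shell command as shown.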
Reference
Panupong Pasupat, Tian-Shun Jiang, Evan Liu, Kelvin Guu, Percy Liang.
Mapping natural language commands to web elements.
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2018.
CodaLab: https://worksheets.codalab.org/worksheets/0x0097f249cd944284a81af331093c3579/