Hybrid Ranking Network for Text-to-SQL

Code for our paper "Hybrid Ranking Network for Text-to-SQL".

Environment Setup

Python 3.8
PyTorch 1.7.1 or higher
pip install -r requirements.txt
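
To confirm that an environment meets the versions listed above, a small check like the following may help. It is only a sketch: version_tuple, meets_minimum, and check_environment are hypothetical helper names, not part of this repository, and treating Python 3.8 as a minimum rather than an exact requirement is an assumption. The parser reads only the leading numeric part of a version string, so a local suffix such as +cu110 is ignored.

```python
import re
import sys

# Minimum versions taken from the list above.
# Assumption: Python 3.8 is treated as a lower bound, not an exact pin.
MIN_PYTHON = (3, 8, 0)
MIN_TORCH = (1, 7, 1)


def version_tuple(version):
    """Parse the leading 'major.minor[.patch]' part of a version string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def meets_minimum(version, minimum):
    """Return True if the parsed version is at least the given tuple."""
    return version_tuple(version) >= minimum


def check_environment():
    """Return (python_ok, torch_ok) for the running interpreter."""
    python_ok = sys.version_info[:3] >= MIN_PYTHON
    try:
        import torch  # imported lazily so the check works without PyTorch
    except ImportError:
        torch_ok = False
    else:
        torch_ok = meets_minimum(torch.__version__, MIN_TORCH)
    return python_ok, torch_ok
```

Calling check_environment() after pip install -r requirements.txt returns a pair of booleans, one for Python and one for PyTorch.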

To run experiments in Docker instead, build the image: docker build -t hydranet -f Dockerfile .

The resulting image contains the processed data and is ready for training and evaluation.

Data Preprocessing

Create the data and output folders first: mkdir data && mkdir output
Clone the WikiSQL repo and extract its data: git clone https://github.com/salesforce/WikiSQL && tar xvjf WikiSQL/data.tar.bz2 -C WikiSQL
Preprocess the data: python wikisql_gendata.py
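
After the archive is extracted, it can be worth checking that the split files are readable before preprocessing. The sketch below counts the examples in a .jsonl file and returns the first record for inspection. summarize_split is a hypothetical helper, not part of this repository, and it assumes one JSON object per line.

```python
import json


def summarize_split(path):
    """Count the records in a .jsonl file and return (count, first_record)."""
    count = 0
    first = None
    with open(path, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue  # tolerate blank lines
            record = json.loads(line)
            if first is None:
                first = record
            count += 1
    return count, first
```

For example, summarize_split("WikiSQL/data/test.jsonl") returns the number of test examples together with the first one, which makes the fields of each record easy to inspect.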

Training

Run: python main.py train --conf conf/wikisql.conf --gpu 0,1,2,3 --note "some note"
The model is saved to the output folder, named after the training start datetime.
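
Because saved models are named by start time, finding the most recent one after several runs means looking through the output folder. The sketch below lists its entries, newest first, by modification time. list_runs is a hypothetical helper, not part of this repository. It makes no assumption about the exact datetime format or about whether a saved model is a file or a folder, so anything else stored in the output folder, such as prediction files, appears in the list too.

```python
from pathlib import Path


def list_runs(output_dir="output"):
    """Return the entries of output_dir, most recently modified first."""
    entries = Path(output_dir).iterdir()
    return sorted(entries, key=lambda p: p.stat().st_mtime, reverse=True)
```

The first element of list_runs("output") is the most recently modified entry.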

Evaluation

Modify the model, input, and output settings in wikisql_prediction.py, then run it.
Run the WikiSQL evaluation script to get the official numbers: cd WikiSQL && python evaluate.py data/test.jsonl data/test.db ../output/test_out.jsonl

Note: the WikiSQL evaluation script fails with an error on Windows, so we include a fixed version for Windows users (run it from the root folder): python wikisql_evaluate.py WikiSQL/data/test.jsonl WikiSQL/data/test.db output/test_out.jsonl
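
Before running either evaluation command, a quick check that the prediction file lines up with the gold file can save a confusing score. The sketch below assumes the evaluation pairs the two files line by line and that each line is one JSON object. load_jsonl and check_alignment are hypothetical helpers, not part of this repository.

```python
import json


def load_jsonl(path):
    """Parse every non-blank line of a .jsonl file; raises on malformed JSON."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def check_alignment(gold_path, pred_path):
    """Return (aligned, n_gold, n_pred) for a gold file and a prediction file."""
    gold = load_jsonl(gold_path)
    pred = load_jsonl(pred_path)
    return len(gold) == len(pred), len(gold), len(pred)
```

For example, check_alignment("WikiSQL/data/test.jsonl", "output/test_out.jsonl") should report equal counts. A mismatch suggests the prediction run was interrupted or used a different input file.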

Trained Model

A trained model that reproduces the numbers reported on the WikiSQL leaderboard is attached to the releases (see "Releases" in the right column). The model's prediction outputs are attached as well.