Introduction
This paper introduces UniSAr, which makes existing autoregressive language models structure-aware through three non-invasive extensions: (1) structure marks that encode the database schema, the conversation context, and their relationships; (2) constrained decoding that produces well-structured SQL for a given database schema; and (3) SQL completion that fills in potentially missing JOIN relationships in the SQL based on the database schema.
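To make extension (3) more concrete, the following is a minimal sketch of how a missing JOIN path might be recovered from foreign-key links. It is not the repository's implementation: the function name `find_join_path`, the toy schema, and the choice of breadth-first search are all assumptions made for illustration.

```python
from collections import deque


def find_join_path(foreign_keys, start, goal):
    """Return the shortest list of tables linking start to goal, or None.

    foreign_keys: iterable of (table_a, table_b) pairs, treated as undirected.
    (Hypothetical helper; not part of the UniSAr code base.)
    """
    # Build an undirected adjacency map from the foreign-key pairs.
    graph = {}
    for a, b in foreign_keys:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    # Breadth-first search guarantees the first path found is a shortest one.
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(graph.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


# Hypothetical schema: student and course are linked only through enrollment.
fks = [("student", "enrollment"), ("enrollment", "course")]
path = find_join_path(fks, "student", "course")
print(" JOIN ".join(path))
```

If a predicted query mentions only `student` and `course`, a search like this would suggest `enrollment` as the bridging table to add to the FROM clause. A real completion step would also need to emit the matching ON conditions.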
Dataset and Model
Spider -> ./data/spider
Fine-tuned BART model -> ./models/spider_sl
(Please download this model with git-lfs to avoid the issue.)
sudo apt-get install git-lfs
git lfs install
git clone https://huggingface.co/dreamerdeo/mark-bart
Main dependencies
- Python version >= 3.6
- PyTorch version >= 1.5.0
pip install -r requirements.txt
- fairseq is going through changes without backward compatibility. Install fairseq from source and use this commit for reproducibility. See here for the current PR that should fix fairseq/master.
Evaluation Pipeline
Step 1: Preprocess by adding schema-linking and value-linking tags.
python step1_schema_linking.py
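As a rough illustration of what schema-linking tags are, the sketch below labels each question token by exact match against table and column names. The function `tag_schema_links`, the tag labels, and the example question are hypothetical. The actual script may use partial matching, multi-word spans, and value lookup in the database, none of which is shown here.

```python
def tag_schema_links(question, tables, columns):
    """Label each question token as 'table', 'column', or 'O' by exact match.

    Hypothetical helper for illustration; not the logic of step1_schema_linking.py.
    """
    table_names = {t.lower() for t in tables}
    column_names = {c.lower() for c in columns}
    tags = []
    for token in question.lower().split():
        word = token.strip("?.,")  # drop trailing punctuation before matching
        if word in table_names:
            tags.append((word, "table"))
        elif word in column_names:
            tags.append((word, "column"))
        else:
            tags.append((word, "O"))
    return tags


tags = tag_schema_links(
    "What is the age of each student?",
    tables=["student"],
    columns=["name", "age"],
)
for word, tag in tags:
    print(word, tag)
```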
Step 2: Build the input and output for BART.
python step2_serialization.py
Step 3: Run the evaluation script with or without constrained decoding.
python step3_evaluate.py --constrain
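The idea behind constrained decoding can be sketched as a filter on the decoder's candidates at each step: tokens that are neither SQL keywords nor names from the target database are ruled out. The code below is a simplified illustration under that assumption. The names `allowed_tokens` and `constrained_pick`, the keyword set, and the scores are invented for the example; the actual implementation would apply such a constraint inside beam search over subword tokens.

```python
# A deliberately small keyword set; a real grammar would be much larger.
SQL_KEYWORDS = {"select", "from", "where", "join", "on", "=", "*", ","}


def allowed_tokens(schema):
    """Tokens the decoder may emit: SQL keywords plus names from this schema."""
    names = set()
    for table, cols in schema.items():
        names.add(table)
        names.update(cols)
    return SQL_KEYWORDS | names


def constrained_pick(scores, allowed):
    """Pick the highest-scoring token among those permitted by the schema."""
    valid = {tok: s for tok, s in scores.items() if tok in allowed}
    if not valid:
        raise ValueError("no candidate token is valid for this schema")
    return max(valid, key=valid.get)


# Hypothetical decoding step: the model prefers "teacher", which is not
# a table or column of this database, so it is filtered out.
schema = {"student": ["name", "age"]}
step_scores = {"teacher": 0.6, "student": 0.3, "from": 0.1}
choice = constrained_pick(step_scores, allowed_tokens(schema))
print(choice)
```

Without the filter, the highest-scoring candidate would be a name that does not exist in the database, which is the kind of error the `--constrain` option is meant to prevent.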
Results
Prediction: 69.34
Prediction with Constrained Decoding: 70.02
Interactive
python interactive.py --logdir ./models/spider-sl --db_id student_1 --db-path ./data/spider/database --schema-path ./data/spider/tables.json
Reference Code
https://github.com/ryanzhumich/editsql
https://github.com/benbogin/spider-schema-gnn-global
https://github.com/ElementAI/duorat
https://github.com/facebookresearch/GENRE