# GRTr: Generative-Retrieval Transformers

Code for the paper "Hybrid Generative-Retrieval Transformers for Dialogue Domain Adaptation" [slides].

By Igor Shalyminov, Alessandro Sordoni, Adam Atkinson, Hannes Schulz.

*Model diagram*

## Installation

```shell
$ cd code-directory
$ git submodule init
$ git submodule update
$ conda create -n hybrid_retgen python=3.7
$ conda activate hybrid_retgen
$ conda install cython
$ pip install -e ./dstc8-metalearn-baseline
$ pip install -e .
```

For mixed-precision training:

```shell
$ pip install git+https://github.com/nvidia/apex
```

## Datasets and experimental setup

## Training a base GPT-2 model on MetaLWOz

```shell
$ python scripts/train <MetaLWOz zipfile> metalwoz_dataspec.json --dataset_cache cache exp/grtr --train_batch_size 4 --valid_batch_size 4 --early_stopping_after -1 --n_epochs 25
```

Add `--fp16 O1` to use mixed-precision training.
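The `--early_stopping_after` flag above suggests patience-based early stopping, with a negative value disabling it. The sketch below illustrates that common semantics in isolation; `epochs_run` is a hypothetical helper for exposition, not part of `scripts/train`, and the disabling-on-negative behaviour is an assumption rather than something confirmed from the code.

```python
# Hedged sketch of patience-based early stopping (illustrative only; the real
# logic lives in scripts/train). Training stops once validation loss has failed
# to improve for more than `patience` consecutive epochs; here we assume a
# negative patience disables early stopping entirely.
from typing import List


def epochs_run(val_losses: List[float], patience: int) -> int:
    """Return how many epochs would run under simple patience-based early
    stopping, given a validation-loss value per epoch."""
    best = float("inf")
    since_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            since_improvement = 0
        else:
            since_improvement += 1
        if patience >= 0 and since_improvement > patience:
            return epoch  # stop early at this epoch
    return len(val_losses)  # ran to completion


losses = [1.0, 0.8, 0.9, 0.95, 0.97]
print(epochs_run(losses, patience=1))   # → 4 (two epochs without improvement)
print(epochs_run(losses, patience=-1))  # → 5 (early stopping disabled)
```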

## Predictions

### Generate-and-rank

```shell
$ python scripts/predict_generate_and_rank <MetaLWOz/MultiWoz zipfile> <testspec json> <output dir> <base GPT-2 model dir> --fine-tune --dataset_cache cache exp/grtr --train_batch_size 4 --valid_batch_size 4
```
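Conceptually, generate-and-rank samples several candidate responses from the fine-tuned generative model and then reranks them with a retrieval-style score, returning the top candidate. The sketch below is illustrative only: `generate_and_rank`, `toy_generate`, and `toy_score` are hypothetical stand-ins, not the repository's actual API (which lives in `scripts/predict_generate_and_rank`).

```python
# Illustrative sketch of the generate-and-rank idea: sample N candidates from
# a generator, score each against the dialogue context, return the best.
from typing import Callable, List, Tuple


def generate_and_rank(
    generate: Callable[[str, int], List[str]],  # hypothetical sampler: (context, n) -> candidates
    score: Callable[[str, str], float],         # hypothetical scorer: (context, candidate) -> score
    context: str,
    num_candidates: int = 5,
) -> str:
    """Generate candidate responses, then return the highest-scoring one."""
    candidates = generate(context, num_candidates)
    ranked: List[Tuple[float, str]] = sorted(
        ((score(context, c), c) for c in candidates), reverse=True
    )
    return ranked[0][1]


# Toy stand-ins for demonstration only (no real model involved).
def toy_generate(context: str, n: int) -> List[str]:
    return [f"response {i}" for i in range(n)]


def toy_score(context: str, candidate: str) -> float:
    # Pretend token overlap with the context is a retrieval score.
    return float(len(set(context.split()) & set(candidate.split())))


best = generate_and_rank(toy_generate, toy_score, "please give me response 2")
print(best)  # → "response 2" (shares the most tokens with the context)
```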

### Generate only

```shell
$ python scripts/predict <MetaLWOz/MultiWoz zipfile> <testspec json> <output dir> <base GPT-2 model dir> --fine-tune --dataset_cache cache exp/grtr --train_batch_size 4 --valid_batch_size 4
```

Convenience bash scripts are provided in `scripts/` to produce predictions for each of the three test specs.

Evaluation can be done with the `evaluate` script in the competition baseline repository.

## Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.