Towards Improving Neural Named Entity Recognition with Gazetteers

This repository contains the open-sourced official implementation of our soft dictionary-enhanced NER model paper:

Towards Improving Neural Named Entity Recognition with Gazetteers (ACL 2019).
Tianyu Liu, Jin-Ge Yao, and Chin-Yew Lin

If you find this repo helpful, please cite the paper:

@inproceedings{liu-etal-2019-towards,
    title = {Towards Improving Neural Named Entity Recognition with Gazetteers},
    author = {Tianyu Liu and Jin-Ge Yao and Chin-Yew Lin},
    booktitle = {Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics},
    year = 2019,
    address = {Florence, Italy},
    publisher = {Association for Computational Linguistics},
    url = {https://aclanthology.org/P19-1524},
    doi = {10.18653/v1/P19-1524},
    pages = {5301--5307},
}

Overall architecture

Installation

First, clone the repository and enter the paper directory:

    git clone https://github.com/microsoft/vert-papers.git && cd vert-papers/papers/SubTagger

  1. Create a virtual environment with Conda
    conda create -n softdict --file requirements_conda.txt -c conda-forge/label/broken -c conda-forge
  2. Activate the new environment
    conda activate softdict
  3. Install the pip requirements
    pip install -r requirements_pip.txt
  4. Prepare the configurations by replacing the INSTALLATION_DIR placeholder in the config files with the current directory
    sed -i 's@INSTALLATION_DIR@'"$PWD"'@' configs/*.config
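
The `sed -i` invocation in the last step uses GNU sed syntax; BSD/macOS sed expects a backup-suffix argument after `-i`. Where that is a problem, the same substitution can be sketched in Python. This is a minimal sketch, not part of the repository: the function name `fill_installation_dir` is hypothetical, and it assumes the config files are UTF-8 text containing the literal placeholder `INSTALLATION_DIR`. Unlike the `sed` command, it replaces every occurrence on a line, not only the first.

```python
from pathlib import Path

# Placeholder string taken from the sed command in the installation steps.
PLACEHOLDER = "INSTALLATION_DIR"


def fill_installation_dir(config_dir, install_dir):
    """Replace PLACEHOLDER with install_dir in every *.config file in config_dir.

    Hypothetical helper (not shipped with the repo). Returns the names of
    the files that were rewritten, in sorted order.
    """
    changed = []
    for path in sorted(Path(config_dir).glob("*.config")):
        text = path.read_text(encoding="utf-8")
        if PLACEHOLDER in text:
            # Rewrite the file in place, like `sed -i`.
            path.write_text(text.replace(PLACEHOLDER, str(install_dir)),
                            encoding="utf-8")
            changed.append(path.name)
    return changed
```

Run from `papers/SubTagger`, the call `fill_installation_dir("configs", Path.cwd())` mirrors the use of `"$PWD"` in the shell command.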

Training

    allennlp train configs/HSCRF_softDictionary.conll2003.config -s dump_directory/ --include-package models 

Evaluating

    allennlp evaluate dump_directory/model.tar.gz https://www.jeffliu.page/files/DATA/conll2003/test.txt --include-package models    

The Gazetteer

The gazetteers and the pretrained subtagger module can be found here