Natural Language Processing Best Practices & Examples

azure-ml best-practices deep-learning machine-learning mlflow natural-language natural-language-inference natural-language-processing natural-language-understanding nli nlp nlu pretrained-models sota text text-classification transfomer

Liqun Shao a7e0555235 fix the path		2019-06-17 16:57:09 -04:00
.ci	source activate	2019-06-11 13:56:57 -04:00
.github	issue template	2019-05-14 12:21:40 +01:00
benchmarks	Intial commit to put the receipe template in	2019-04-05 13:55:58 -04:00
docs	Intial commit to put the receipe template in	2019-04-05 13:55:58 -04:00
scenarios	add outputs, fix typos and shorten some texts, move all the imports to the top	2019-06-17 16:57:09 -04:00
tests	Added fixtures to ner test code.	2019-06-14 23:03:50 +00:00
tools	Resolved conflict and merged staging.	2019-06-17 12:01:10 -04:00
utils_nlp	fix the path	2019-06-17 16:57:09 -04:00
.amlignore	Added AML Ignore	2019-06-17 16:57:08 -04:00
.flake8	Rijai reposetup (#1 )	2019-04-05 19:01:56 -04:00
.gitignore	gitignore and conda file	2019-06-04 14:51:35 +01:00
.pre-commit-config.yaml	Changed python version in pre-commit-config back to 3.6	2019-06-13 14:46:57 -04:00
AUTHORS.md	Intial commit to put the receipe template in	2019-04-05 13:55:58 -04:00
CONTRIBUTING.md	Rijai reposetup (#1 )	2019-04-05 19:01:56 -04:00
LICENSE	Intial commit to put the receipe template in	2019-04-05 13:55:58 -04:00
README.md	documentation additions	2019-06-14 16:29:32 -04:00
SETUP.md	documentation additions	2019-06-14 16:29:32 -04:00
pyproject.toml	Rijai reposetup (#1 )	2019-04-05 19:01:56 -04:00

README.md

NLP Best Practices

This repository contains examples and best practices for building NLP systems, provided as Jupyter notebooks and utility functions. The focus of the repository is on state-of-the-art methods and common scenarios that are popular among researchers and practitioners working on problems involving text and language.

The following section includes a list of the available scenarios. Each scenario is demonstrated in one or more Jupyter notebook examples that make use of the core code base of models and utilities.

Scenarios

Scenario	Applications	Languages	Models
Text Classification	Topic Classification	en, zh, ar	BERT
Named Entity Recognition	Wikipedia NER	en, zh	BERT
Sentence Similarity	STS Benchmark	en	Representation: TF-IDF, Word Embeddings, Doc Embeddings; Metrics: Cosine Similarity, Word Mover's Distance
Embeddings	Custom Embeddings Training	en	Word2Vec, fastText, GloVe
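
As a rough illustration of the Sentence Similarity row above, the following sketch pairs a TF-IDF representation with cosine similarity using only the Python standard library. It is not taken from this repository's notebooks or from `utils_nlp`: the function names (`tokenize`, `tfidf_vectors`, `cosine_similarity`), the smoothed IDF formula, and the three example sentences are all assumptions chosen for illustration, and the sentences are made up rather than drawn from the STS Benchmark.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase whitespace tokenization; a real pipeline would use a proper tokenizer.
    return text.lower().split()


def tfidf_vectors(sentences):
    """Return one sparse TF-IDF vector (dict of term -> weight) per sentence."""
    tokenized = [tokenize(s) for s in sentences]
    n_docs = len(tokenized)
    # Document frequency: number of sentences that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Smoothed IDF (an assumed variant) keeps weights positive
            # even for terms that occur in every sentence.
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Made-up example sentences, not STS Benchmark data.
sentences = [
    "a man is playing a guitar",
    "a man is playing a violin",
    "the stock market fell sharply today",
]
vecs = tfidf_vectors(sentences)
sim_close = cosine_similarity(vecs[0], vecs[1])
sim_far = cosine_similarity(vecs[0], vecs[2])
print(f"guitar vs violin: {sim_close:.3f}")
print(f"guitar vs stocks: {sim_far:.3f}")
```

The first pair shares most of its words and scores well above the second pair, which shares none and therefore scores zero. TF-IDF only captures lexical overlap, which is one reason the table also lists embedding-based representations and Word Mover's Distance.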

Planning

All feature planning is done via projects, milestones, and issues in this repository.

Getting Started

To get started, navigate to the Setup Guide, where you'll find instructions on how to set up your environment and dependencies.

Contributing

This project welcomes contributions and suggestions. Before contributing, please see our contribution guidelines.