NLP Best Practices

This repository contains examples and best practices for building NLP systems, provided as Jupyter notebooks and utility functions. It focuses on state-of-the-art methods and on scenarios that researchers and practitioners commonly encounter when working with text and language.

The following section lists the available scenarios. Each scenario is demonstrated in one or more Jupyter notebooks that use the repository's core code base of models and utilities.

Scenarios

Scenario	Applications	Languages	Models
Text Classification	Topic Classification	en, zh, ar	BERT
Named Entity Recognition	Wikipedia NER	en, zh	BERT
Sentence Similarity	STS Benchmark	en	Representations: TF-IDF, Word Embeddings, Doc Embeddings; Metrics: Cosine Similarity, Word Mover's Distance
Embeddings	Custom Embeddings Training	en	Word2Vec, fastText, GloVe
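
As a rough illustration of the Sentence Similarity row, the sketch below pairs a TF-IDF representation with cosine similarity using only the Python standard library. It is not taken from this repository's notebooks or utilities: the function names (`tokenize`, `tfidf_vectors`, `cosine_similarity`), the whitespace tokenizer, the smoothed IDF formula, and the sample sentences are all assumptions chosen for brevity.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase whitespace tokenizer (an assumption; real pipelines do more).
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF keeps a nonzero weight for terms found in every document.
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample sentences, not drawn from the STS Benchmark.
sentences = [
    "a man is playing a guitar",
    "a man is playing a flute",
    "the stock market fell sharply today",
]
vecs = tfidf_vectors(sentences)
sim_close = cosine_similarity(vecs[0], vecs[1])  # sentences sharing most words
sim_far = cosine_similarity(vecs[0], vecs[2])    # sentences sharing no words
print(sim_close, sim_far)
```

The first pair shares most of its tokens and should score well above the second pair, which has no tokens in common. Embedding-based representations and Word Mover's Distance, also listed in the table, need pretrained vectors and are outside the scope of this sketch.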

Planning

All feature planning is done via projects, milestones, and issues in this repository.

Getting Started

To get started, navigate to the Setup Guide, where you'll find instructions on how to set up your environment and dependencies.

Contributing

This project welcomes contributions and suggestions. Before contributing, please see our contribution guidelines.