Ramp up your custom natural language processing (NLP) task: bring your own data, use your preferred frameworks, and bring models into production.




NLP Toolkit

See the wiki for detailed documentation.

Supported Use cases

  • Binary, multi-class & multi-label classification
  • Named entity recognition
  • Question answering

Live Demo

https://aka.ms/nlp-demo

Naming

Assets

<project name>(-<task>)-<step>(-<environment>)

  • step is one of source, train, or deploy (for data assets).
  • task is an integer referring to the parameters (for models).
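
As a rough illustration of the naming pattern above, the following minimal sketch assembles an asset name from its parts. The function name `asset_name`, its arguments, and the example values are hypothetical and not part of the toolkit. The sketch also assumes that the optional parts are simply omitted when not given, and that the same three steps are valid for every asset type.

```python
from typing import Optional

# Steps listed in the naming convention above.
VALID_STEPS = ("source", "train", "deploy")


def asset_name(
    project: str,
    step: str,
    task: Optional[int] = None,
    environment: Optional[str] = None,
) -> str:
    """Build '<project name>(-<task>)-<step>(-<environment>)'.

    Hypothetical helper: optional parts are left out when they are None.
    """
    if step not in VALID_STEPS:
        raise ValueError(f"step must be one of {VALID_STEPS}, got {step!r}")

    parts = [project]
    if task is not None:
        # The task is an integer referring to the parameters.
        parts.append(str(task))
    parts.append(step)
    if environment is not None:
        parts.append(environment)
    return "-".join(parts)


if __name__ == "__main__":
    # Project and environment names here are made up for illustration.
    print(asset_name("support", "source"))                         # support-source
    print(asset_name("support", "train", task=1, environment="dev"))  # support-1-train-dev
```

Validating the step up front keeps misspelled names from reaching the workspace, where they would be harder to spot.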

TODO

Classification

  • (IP) multi label support
  • integrate handling for larger documents vs short documents
  • dictionaries for business logic
  • integrate handling for unbalanced datasets
  • upload best model to AML Model

NER

  • improve duplicate handling
  • basic custom NER

Rank

  • (IP) Improve answer quality

Deployment

  • Param script for deploy
  • Deploy to Azure Function (without AzureML)

Notebooks

  • review prepared data
  • (IP) review model results (auto generate after each training step)
  • review model bias (auto generate after each training step)
  • available models benchmark (incl AutoML)

Tests

  • unit tests (pytest)
  • automated benchmarks

New Features (TBD)

  • Summarization
  • Deployable feedback loop
  • Integration with GitHub Actions

Acknowledgements

Verseagility is built in part using the following:

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information, see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.