Ramp up your custom natural language processing (NLP) task: bring your own data, use your preferred frameworks, and bring models into production.
Martin Kayser 00825c3109 Adding summariration, minor fixes 2020-04-17 15:00:12 +02:00
assets updated end2end finetuned 2020-03-27 13:46:38 +01:00
code Adding summariration, minor fixes 2020-04-17 15:00:12 +02:00
demo Update Verseagility to v0.3 2020-04-02 23:10:13 +02:00
deploy Adding summariration, minor fixes 2020-04-17 15:00:12 +02:00
notebook Update Verseagility to v0.3 2020-04-02 23:10:13 +02:00
pipeline Update README.md 2020-04-01 09:21:46 +02:00
project distilbert for multi-lang 2020-03-08 16:24:41 +01:00
scraper training and deployment pipeline 2020-02-18 14:43:58 +01:00
tests first push 2020-01-24 19:07:28 +01:00
.amlignore updated end2end finetuned 2020-03-27 13:46:38 +01:00
.flake8 first push 2020-01-24 19:07:28 +01:00
.gitignore Update Verseagility to v0.3 2020-04-02 23:10:13 +02:00
CODE_OF_CONDUCT.md Initial CODE_OF_CONDUCT.md commit 2020-01-24 05:53:48 -08:00
LICENSE Initial LICENSE commit 2020-01-24 05:53:46 -08:00
README.md Update Verseagility to v0.3 2020-04-02 23:10:13 +02:00
SECURITY.md Initial SECURITY.md commit 2020-01-24 05:53:50 -08:00
config.sample.ini Update Verseagility to v0.3 2020-04-02 23:10:13 +02:00
environment.yml Update Verseagility to v0.3 2020-04-02 23:10:13 +02:00

README.md





NLP Toolkit

See the wiki for detailed documentation.

Supported Use Cases

  • Binary, multi-class & multi-label classification
  • Named entity recognition
  • Question answering

Live Demo

https://aka.ms/nlp-demo

Naming

Azure

nlp-<component>-<environment>

Assets

<project name>(-<task>)-<step>(-<environment>)

  • where step in [source, train, deploy], for data assets.
  • where task is an int, referring to the parameters, for models.
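
The naming patterns above can be sketched as small helper functions. This is only an illustrative sketch, not code from the repository: the function names `azure_resource_name` and `asset_name`, the example project name `msforum`, and the reading that the parenthesized parts of the pattern are optional are all assumptions.

```python
from typing import Optional

# Steps listed in the naming convention for data assets.
VALID_STEPS = ("source", "train", "deploy")


def azure_resource_name(component: str, environment: str) -> str:
    """Build a name of the form nlp-<component>-<environment> (hypothetical helper)."""
    return f"nlp-{component}-{environment}"


def asset_name(
    project: str,
    step: str,
    task: Optional[int] = None,
    environment: Optional[str] = None,
) -> str:
    """Build <project name>(-<task>)-<step>(-<environment>) (hypothetical helper).

    Assumes the parenthesized parts are optional and are omitted when not given.
    """
    if step not in VALID_STEPS:
        raise ValueError(f"step must be one of {VALID_STEPS}, got {step!r}")
    parts = [project]
    if task is not None:
        # The task is an int referring to the parameters.
        parts.append(str(int(task)))
    parts.append(step)
    if environment:
        parts.append(environment)
    return "-".join(parts)


# Example usage with made-up values.
print(azure_resource_name("storage", "dev"))
print(asset_name("msforum", "source", environment="dev"))
print(asset_name("msforum", "train", task=1))
```

Validating `step` up front keeps misnamed assets from being created silently. Whether the project itself enforces the convention this way is not stated here.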

TODO

Project

  • Overview architecture
  • Detailed documentation

Prepare

  • connect to CosmosDB (pipeline ready)
  • document cracking to standardized format

Classification

  • multi-label support
  • integrate handling for larger documents
  • dictionaries for business logic
  • integrate handling for unbalanced datasets
  • upload best model to AML Model

NER

  • Improve duplicate handling
  • basic custom NER

Rank

  • (IP) Improve answer quality

Deployment

  • Param script for deploy
  • Deploy to Azure Function (without AzureML)

Notebooks

  • review prepared data
  • (IP) review model results (auto generate after each training step)
  • review model bias (auto generate after each training step)
  • benchmark of available models (incl. AutoML)

Tests

  • unit tests (pytest)
  • automated benchmarks

New Features (TBD)

  • Summarization
  • Deployable feedback loop
  • Integration with GitHub Actions

Acknowledgements

Verseagility is built in part using the following:

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.