Hunyang/oc orchestrator nlr version document entity and multilingual models (#6194)

* nlr version 0.2 with additional base models

* new line at the end of a json file

* no newline at the end of file

* add newline at the end of files

* preview to experimental

* editing
This commit is contained in:
Hung-chih Yang 2021-02-08 10:18:16 -08:00 committed by GitHub
Parent b3eb60aa67
Commit 985b056829
No known key found for this signature
GPG key ID: 4AEE18F83AFDEB23
5 changed files: 222 additions and 22 deletions

View file

@ -14,7 +14,7 @@ Orchestrator is an LU solution optimized for conversational AI applications. It
**Intent recognition**: You can use Orchestrator as an intent recognizer with [adaptive dialogs][6] to route user input to an appropriate skill or sub-component.
**Entity extraction** is not yet supported.
**Entity extraction** is currently experimental and not yet for production use.
## Authoring experience

View file

@ -1,19 +1,74 @@
# Prebuilt Language Models
Prebuilt language models have been trained towards more sophisticated tasks for both monolingual as well as multilingual scenarios. In public preview only English models are made available.
Prebuilt language models have been trained for more sophisticated tasks in both monolingual and multilingual scenarios, including intent prediction and entity extraction.
Entity extraction is currently experimental and not yet for production use.
## Models Description
The public preview of Orchestrator includes the following prebuilt language models available in [versions repository][2].
### pretrained.20200924.microsoft.dte.00.03.en.onnx
This is a fast and small base model with sufficient accuracy but if the accuracy and not speed and memory size is critical then consider other options. It is a 3-layer pretrained BERT model optimized for conversation for example-based use ([KNN][3]).
## Default Models
### pretrained.20200924.microsoft.dte.00.06.en.onnx
This is a high quality base model that strikes the balance between size, speed and accuracy. It is a 6-layer pretrained BERT model optimized for conversation for example-based use ([KNN][3]). This is the default model used if none explicitly specified.
This is a high-quality EN-only base model for intent detection that strikes a balance between size,
speed and predictive performance.
It is a 6-layer pretrained [Transformer][7] model optimized for conversation.
Its architecture is pretrained for example-based use ([KNN][3]),
so it can be used out of the box. This is the default model used if none is explicitly specified.
### pretrained.20201210.microsoft.dte.00.06.unicoder_multilingual.onnx
This is a high-quality multilingual base model for intent detection. It's smaller and faster than its 12-layer alternative.
It is a 6-layer pretrained [Transformer][7] model optimized for conversation.
Its architecture is pretrained for example-based use ([KNN][3]), so it can be used out of the box.
## Alternate Models
### pretrained.20200924.microsoft.dte.00.03.en.onnx
This is a fast and small EN-only base model for intent detection with sufficient prediction performance.
We suggest using this model if speed and memory size are critical to your deployment environment;
otherwise, consider other options. It is a generic 3-layer pretrained
[Transformer][7] model optimized for conversation.
Its architecture is pretrained for example-based use ([KNN][3]), so it can be used out of the box.
### pretrained.20200924.microsoft.dte.00.12.en.onnx
This is a highest quality base model but is larger and slower than other options. It is a 12-layer pretrained BERT model optimized for conversation for example-based use ([KNN][3]).
This is a high-quality EN-only base model for intent detection, but it is larger and slower than the other options.
It is a 12-layer pretrained [Transformer][7] model optimized for conversation.
Its architecture is pretrained for example-based use ([KNN][3]), so it can be used out of the box.
### pretrained.20201210.microsoft.dte.00.12.unicoder_multilingual.onnx
This is a high-quality multilingual base model for intent detection.
It is a 12-layer pretrained [Transformer][7] model optimized for conversation.
Its architecture is pretrained for example-based use ([KNN][3]), so it can be used out of the box.
## Experimental Models
### pretrained.20210205.microsoft.dte.00.12.bert_example_ner.en.onnx (experimental)
This is a high-quality EN-only base model for entity extraction.
It is a 12-layer pretrained [Transformer][7] model optimized for conversation.
Its architecture is pretrained for example-based use ([KNN][3]), so it can be used out of the box.
### pretrained.20210105.microsoft.dte.00.12.bert_example_ner_multilingual.onnx (experimental)
This is a high-quality multilingual base model for entity extraction.
It is a 12-layer pretrained [Transformer][7] model optimized for conversation.
Its architecture is pretrained for example-based use ([KNN][3]), so it can be used out of the box.
### pretrained.20210105.microsoft.dte.00.12.tulr_example_ner_multilingual.onnx (experimental)
This is a high-quality multilingual base model for entity extraction.
It is a 12-layer pretrained [Transformer][7] model optimized for conversation.
Its architecture is pretrained for example-based use ([KNN][3]), so it can be used out of the box.
### pretrained.20210205.microsoft.dte.00.06.bert_example_ner.en.onnx (experimental)
This is a high-quality EN-only base model for entity extraction. It's smaller and faster than its 12-layer alternative.
It is a 6-layer pretrained [Transformer][7] model optimized for conversation.
Its architecture is pretrained for example-based use ([KNN][3]), so it can be used out of the box.
### pretrained.20210205.microsoft.dte.00.06.bert_example_ner_multilingual.onnx (experimental)
This is a high-quality multilingual base model for entity extraction. It's smaller and faster than its 12-layer alternative.
It is a 6-layer pretrained [Transformer][7] model optimized for conversation.
Its architecture is pretrained for example-based use ([KNN][3]), so it can be used out of the box.
### pretrained.20210205.microsoft.dte.00.06.tulr_example_ner_multilingual.onnx (experimental)
This is a high-quality multilingual base model for entity extraction. It's smaller and faster than its 12-layer alternative.
It is a 6-layer pretrained [Transformer][7] model optimized for conversation.
Its architecture is pretrained for example-based use ([KNN][3]), so it can be used out of the box.
## Models Evaluation
For a more quantitative comparison analysis of the different models see the following performance characteristics.
@ -43,8 +98,6 @@ The following table shows how accurate is each model relative to provided traini
The models are released under the following [License Terms][6].
## References
* [UniLMv2 Paper][1]
@ -57,7 +110,7 @@ The models are released under the following [License Terms][6].
* [Snips NLU Metrics][5]
* [Transformer][7]
[1]: https://arxiv.org/abs/2002.12804 "UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training"
[2]: https://aka.ms/nlrversions
@ -65,6 +118,4 @@ The models are released under the following [License Terms][6].
[4]: https://github.com/snipsco/snips-nlu "Snips NLU"
[5]: https://github.com/snipsco/snips-nlu-metrics "Snips NLU Metrics"
[6]: ./LICENSE.md "License agreement"
[7]: https://en.wikipedia.org/wiki/Transformer_(machine_learning_model)
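
The model descriptions above repeatedly mention example-based ([KNN][3]) use. As a rough illustration of that idea only, the following sketch labels a query by majority vote among its nearest labeled examples. It is not Orchestrator's implementation: the 3-dimensional vectors stand in for embeddings that a base model would produce, and the intent labels `BookFlight` and `CheckWeather`, the function names, and the choice of cosine similarity are all assumptions made for this example.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def knn_intent(query, examples, k=3):
    """Return the majority label among the k examples most similar to the query."""
    ranked = sorted(examples, key=lambda ex: cosine(query, ex[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy vectors standing in for model embeddings (hypothetical data).
EXAMPLES = [
    ([0.9, 0.1, 0.0], "BookFlight"),
    ([0.8, 0.2, 0.1], "BookFlight"),
    ([0.1, 0.9, 0.0], "CheckWeather"),
    ([0.0, 0.8, 0.2], "CheckWeather"),
]

print(knn_intent([0.85, 0.15, 0.05], EXAMPLES))
```

Because the labels come from stored examples rather than a trained classification layer, adding a new intent in this scheme only requires adding examples, which is one plausible reading of "can be used out of the box".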

View file

@ -145,4 +145,4 @@
}
},
"additionalProperties": true
}
}

View file

@ -2,7 +2,7 @@
"version": "0.2",
"default": {
"en_intent": "pretrained.20200924.microsoft.dte.00.06.en.onnx",
"multi_intent": "pretrained.20201210.microsoft.dte.00.12.unicoder_multilingual.onnx"
"multilingual_intent": "pretrained.20201210.microsoft.dte.00.12.unicoder_multilingual.onnx"
},
"models": {
"pretrained.20200924.microsoft.dte.00.03.en.onnx": {
@ -23,10 +23,10 @@
"description": "Bot Framework SDK release 4.10 - English ONNX V1.4 12-layer per-token intent base model",
"minSDKVersion": "4.10.0"
},
"pretrained.20201012.microsoft.dte.00.12.bert_example_ner.en.onnx": {
"releaseDate": "10/12/2020",
"modelUri": "https://models.botframework.com/models/dte/onnx/pretrained.20201012.microsoft.dte.00.12.bert_example_ner.en.onnx.zip",
"description": "Bot Framework SDK release 4.10 - English ONNX V1.4 12-layer per-token entity base model",
"pretrained.20210205.microsoft.dte.00.12.bert_example_ner.en.onnx": {
"releaseDate": "02/05/2021",
"modelUri": "https://models.botframework.com/models/dte/onnx/pretrained.20210205.microsoft.dte.00.12.bert_example_ner.en.onnx.zip",
"description": "(experimental) Bot Framework SDK release 4.10 - English ONNX V1.4 12-layer per-token entity base model",
"minSDKVersion": "4.10.0"
},
"pretrained.20201210.microsoft.dte.00.12.unicoder_multilingual.onnx": {
@ -38,13 +38,38 @@
"pretrained.20210105.microsoft.dte.00.12.bert_example_ner_multilingual.onnx": {
"releaseDate": "01/05/2021",
"modelUri": "https://models.botframework.com/models/dte/onnx/pretrained.20210105.microsoft.dte.00.12.bert_example_ner_multilingual.onnx.zip",
"description": "Bot Framework SDK release 4.10 - Multilingual ONNX V1.4 12-layer per-token entity base model",
"description": "(experimental) Bot Framework SDK release 4.10 - Multilingual ONNX V1.4 12-layer per-token entity base model",
"minSDKVersion": "4.10.0"
},
"pretrained.20210105.microsoft.dte.00.12.tulr_example_ner_multilingual.onnx": {
"releaseDate": "01/05/2021",
"modelUri": "https://models.botframework.com/models/dte/onnx/pretrained.20210105.microsoft.dte.00.12.tulr_example_ner_multilingual.onnx.zip",
"description": "Bot Framework SDK release 4.10 - Multilingual ONNX V1.4 12-layer per-token entity base model",
"description": "(experimental) Bot Framework SDK release 4.10 - Multilingual ONNX V1.4 12-layer per-token entity base model",
"minSDKVersion": "4.10.0"
},
"pretrained.20210205.microsoft.dte.00.06.bert_example_ner.en.onnx": {
"releaseDate": "02/05/2021",
"modelUri": "https://models.botframework.com/models/dte/onnx/pretrained.20210205.microsoft.dte.00.06.bert_example_ner.en.onnx.zip",
"description": "(experimental) Bot Framework SDK release 4.10 - English ONNX V1.4 6-layer per-token entity base model",
"minSDKVersion": "4.10.0"
},
"pretrained.20201210.microsoft.dte.00.06.unicoder_multilingual.onnx": {
"releaseDate": "02/05/2021",
"modelUri": "https://models.botframework.com/models/dte/onnx/pretrained.20210205.microsoft.dte.00.06.unicoder_multilingual.onnx.zip",
"description": "Bot Framework SDK release 4.10 - Multilingual ONNX V1.4 6-layer per-token intent base model",
"minSDKVersion": "4.10.0"
},
"pretrained.20210205.microsoft.dte.00.06.bert_example_ner_multilingual.onnx": {
"releaseDate": "02/05/2021",
"modelUri": "https://models.botframework.com/models/dte/onnx/pretrained.20210205.microsoft.dte.00.06.bert_example_ner_multilingual.onnx.zip",
"description": "(experimental) Bot Framework SDK release 4.10 - Multilingual ONNX V1.4 6-layer per-token entity base model",
"minSDKVersion": "4.10.0"
},
"pretrained.20210205.microsoft.dte.00.06.tulr_example_ner_multilingual.onnx": {
"releaseDate": "02/05/2021",
"modelUri": "https://models.botframework.com/models/dte/onnx/pretrained.20210205.microsoft.dte.00.06.tulr_example_ner_multilingual.onnx.zip",
"description": "(experimental) Bot Framework SDK release 4.10 - Multilingual ONNX V1.4 6-layer per-token entity base model",
"minSDKVersion": "4.10.0"
}
}
}
}
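
A consumer of this versions file would presumably look up a default model name under `default` and then read its download location from `models`. The sketch below shows one way to do that with the Python standard library. The helper name `resolve_model` is hypothetical, and the embedded document is a shortened excerpt that keeps only one model entry, so it is not the full file from this commit.

```python
import json

# Shortened excerpt of the versions document; only one model entry is kept.
VERSIONS = json.loads("""
{
  "version": "0.2",
  "default": {
    "en_intent": "pretrained.20200924.microsoft.dte.00.06.en.onnx",
    "multilingual_intent": "pretrained.20201210.microsoft.dte.00.12.unicoder_multilingual.onnx"
  },
  "models": {
    "pretrained.20200924.microsoft.dte.00.06.en.onnx": {
      "releaseDate": "09/24/2020",
      "modelUri": "https://models.botframework.com/models/dte/onnx/pretrained.20200924.microsoft.dte.00.06.en.onnx.zip",
      "description": "Bot Framework SDK release 4.10 - English Onnx V1.4 6-layer per-token intent base model",
      "minSDKVersion": "4.10.0"
    }
  }
}
""")


def resolve_model(versions, kind="en_intent"):
    """Return (model name, download URI) for the default model of the given kind."""
    name = versions["default"][kind]
    entry = versions["models"].get(name)
    if entry is None:
        # The default points at a model that the document does not describe.
        raise KeyError(f"default model {name!r} has no entry under 'models'")
    return name, entry["modelUri"]


name, uri = resolve_model(VERSIONS)
print(name)
print(uri)
```

Raising when a default has no matching entry makes a mismatch between the `default` and `models` sections visible early instead of failing later at download time.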

View file

@ -0,0 +1,124 @@
{
"$schema": "http://json-schema.org/draft-07/schema",
"$id": "https://schemas.botframework.com/schemas/orchestrator/v0.2/nlr_versions.schema",
"type": "object",
"title": "Orchestrator base model versions schema",
"description": "Orchestrator base model versions information",
"default": {},
"examples": [
{
"version": "0.2",
"models": {
"pretrained.20200924.microsoft.dte.00.06.en.onnx": {
"releaseDate": "09/24/2020",
"modelUri": "https://models.botframework.com/models/dte/onnx/pretrained.20200924.microsoft.dte.00.06.en.onnx.zip",
"description": "Bot Framework SDK release 4.10 - English Onnx V1.4 6-layer per-token intent base model",
"minSDKVersion": "4.10.0"
}
}
}
],
"required": [
"version",
"models"
],
"properties": {
"version": {
"$id": "#/properties/version",
"type": "string",
"title": "The version schema",
"description": "Orchestrator base model schema version",
"default": "",
"examples": [
"0.2"
]
},
"models": {
"$id": "#/properties/models",
"type": "object",
"title": "Orchestrator models schema",
"description": "All Orchestrator base model",
"default": {},
"examples": [
{
"pretrained.20200924.microsoft.dte.00.06.en.onnx": {
"releaseDate": "09/24/2020",
"modelUri": "https://models.botframework.com/models/dte/onnx/pretrained.20200924.microsoft.dte.00.06.en.onnx.zip",
"description": "Bot Framework SDK release 4.10 - English Onnx V1.4 6-layer per-token intent base model",
"minSDKVersion": "4.10.0"
}
}
],
"required": [
"1.0.0-pretrained.20200729.microsoft.dte.en.onnx"
],
"properties": {
"1.0.0-pretrained.20200729.microsoft.dte.en.onnx": {
"$id": "#/properties/models/properties/1.0.0-pretrained.20200729.microsoft.dte.en.onnx",
"type": "object",
"title": "The 1.0.0-pretrained.20200729.microsoft.dte.en.onnx schema",
"description": "An explanation about the purpose of this instance.",
"default": {},
"examples": [
{
"releaseDate": "09/24/2020",
"modelUri": "https://models.botframework.com/models/dte/onnx/pretrained.20200924.microsoft.dte.00.06.en.onnx.zip",
"description": "Bot Framework SDK release 4.10 - English Onnx V1.4 6-layer per-token intent base model",
"minSDKVersion": "4.10.0"
}
],
"required": [
"releaseDate",
"modelUri",
"description",
"minSDKVersion"
],
"properties": {
"releaseDate": {
"$id": "#/properties/models/properties/pretrained.20200924.microsoft.dte.00.06.en.onnx/properties/releaseDate",
"type": "string",
"title": "Model release date",
"description": "Model release date",
"default": "",
"examples": [
"09/24/2020"
]
},
"modelUri": {
"$id": "#/properties/models/properties/pretrained.20200924.microsoft.dte.00.06.en.onnx/properties/modelUri",
"type": "string",
"title": "Orchestrator model URL",
"description": "Orchestrator model URL",
"default": "",
"examples": [
"https://models.botframework.com/models/dte/onnx/pretrained.20200924.microsoft.dte.00.06.en.onnx.zip"
]
},
"description": {
"$id": "#/properties/models/properties/pretrained.20200924.microsoft.dte.00.06.en.onnx/properties/description",
"type": "string",
"title": "Orchestrator model description",
"description": "Orchestrator model description",
"default": "",
"examples": [
"Bot Framework SDK release 4.10 - English Onnx V1.4 6-layer per-token intent base model"
]
},
"minSDKVersion": {
"$id": "#/properties/models/properties/pretrained.20200924.microsoft.dte.00.06.en.onnx/properties/minSDKVersion",
"type": "string",
"title": "Bot Framework minimum SDK version",
"description": "Minimum SDK version required to work with this model",
"default": "",
"examples": [
"4.10.0"
]
}
},
"additionalProperties": true
}
},
"additionalProperties": true
}
},
"additionalProperties": true
}
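
The required fields named in this schema can be checked without a JSON Schema library. The following is a minimal sketch using only plain Python: the function name `validate_versions` is hypothetical, and it assumes that every entry under `models` should carry the four string fields the schema lists for its single example model. That is a stricter reading than the schema itself, which allows additional properties.

```python
REQUIRED_TOP = ("version", "models")
REQUIRED_MODEL = ("releaseDate", "modelUri", "description", "minSDKVersion")


def validate_versions(doc):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    for key in REQUIRED_TOP:
        if key not in doc:
            problems.append(f"missing top-level field: {key}")
    for name, entry in doc.get("models", {}).items():
        for key in REQUIRED_MODEL:
            # The schema types each of these fields as a string.
            if not isinstance(entry.get(key), str):
                problems.append(f"{name}: missing or non-string field: {key}")
    return problems


good = {
    "version": "0.2",
    "models": {
        "pretrained.20200924.microsoft.dte.00.06.en.onnx": {
            "releaseDate": "09/24/2020",
            "modelUri": "https://models.botframework.com/models/dte/onnx/pretrained.20200924.microsoft.dte.00.06.en.onnx.zip",
            "description": "Bot Framework SDK release 4.10 - English Onnx V1.4 6-layer per-token intent base model",
            "minSDKVersion": "4.10.0",
        }
    },
}

print(validate_versions(good))
```

A full validator would instead load the schema and apply a draft-07 implementation; this sketch only covers presence and string type of the listed fields.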