Microsoft.Recognizers.Text for Python
Getting Started
Recognizers are organized into groups and designed to be used in C#, Node.js, Python and Java to help you build great applications! To use the samples, clone our GitHub repository using Git.
Cloning and building the Repository
git clone https://github.com/Microsoft/Recognizers-Text.git
cd Recognizers-Text
Manual Build
Open a terminal and run the following commands:
cd python/libraries/resource-generator
pip install -r .\requirements.txt
python index.py ..\recognizers-number\resource-definitions.json
python index.py ..\recognizers-number-with-unit\resource-definitions.json
python index.py ..\recognizers-date-time\resource-definitions.json
You can then install each of the local packages:
pip install -e .\libraries\recognizers-text\
pip install -e .\libraries\recognizers-number\
pip install -e .\libraries\recognizers-number-with-unit\
pip install -e .\libraries\recognizers-date-time\
To run tests:
pytest --tb=line
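The manual steps above use Windows-style paths. As a rough, platform-neutral sketch, the same sequence can be assembled with `pathlib` and `subprocess`. The helper names (`build_steps`, `run_build`) are hypothetical, and the sketch assumes it is pointed at the repository's `python` directory, with the directory names copied from the commands above.

```python
import subprocess
import sys
from pathlib import Path

# Directory names taken from the manual build commands above.
RESOURCE_PACKAGES = [
    "recognizers-number",
    "recognizers-number-with-unit",
    "recognizers-date-time",
]
LOCAL_PACKAGES = ["recognizers-text"] + RESOURCE_PACKAGES


def build_steps(python_dir):
    """Return (argv, cwd) pairs mirroring the manual build commands."""
    python_dir = Path(python_dir)
    libraries = python_dir / "libraries"
    generator = libraries / "resource-generator"

    # 1. Install the resource generator's requirements.
    steps = [([sys.executable, "-m", "pip", "install", "-r", "requirements.txt"], generator)]
    # 2. Generate resources for each package that ships definitions.
    for name in RESOURCE_PACKAGES:
        definitions = Path("..") / name / "resource-definitions.json"
        steps.append(([sys.executable, "index.py", str(definitions)], generator))
    # 3. Install each local package in editable mode.
    for name in LOCAL_PACKAGES:
        steps.append(([sys.executable, "-m", "pip", "install", "-e", str(libraries / name)], python_dir))
    # 4. Run the tests.
    steps.append(([sys.executable, "-m", "pytest", "--tb=line"], python_dir))
    return steps


def run_build(python_dir):
    """Run every step in order, stopping at the first failure."""
    for argv, cwd in build_steps(python_dir):
        subprocess.run(argv, cwd=cwd, check=True)
```

Calling `run_build("path/to/Recognizers-Text/python")` would execute the steps; `build_steps` alone only lists them, which is handy for a dry run.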
Automated Build
Launch the build.cmd file to install requirements, generate resources, install local packages and run all tests.
Installation from PyPI
Install Recognizers by running the following commands:
- Get the number recognizer features:
pip install recognizers-text-number
- Get the number-with-unit recognizer features:
pip install recognizers-text-number-with-unit
- Get the date-time recognizer features:
pip install recognizers-text-date-time
Or install the full Recognizers suite with the following command:
pip install recognizers-text-suite
API Documentation
Once the proper package is installed, you'll need to reference the package:
from recognizers_text import Culture, ModelResult
from recognizers_number import NumberRecognizer
from recognizers_number_with_unit import NumberWithUnitRecognizer
from recognizers_date_time import DateTimeRecognizer
from recognizers_sequence import SequenceRecognizer
Or, using the suite package:
import recognizers_suite
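Before importing, it can help to check which of these modules are actually importable in the current environment. The following is a minimal sketch using only the standard library; the helper name `installed_recognizers` is hypothetical, and the module names are the ones used in the import statements above.

```python
import importlib.util

# Top-level module names as they appear in the import statements above.
MODULES = [
    "recognizers_text",
    "recognizers_number",
    "recognizers_number_with_unit",
    "recognizers_date_time",
    "recognizers_sequence",
    "recognizers_suite",
]


def installed_recognizers(modules=MODULES):
    """Return {module_name: True/False} without importing anything."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


for name, available in installed_recognizers().items():
    print(f"{name}: {'installed' if available else 'missing'}")
```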
Recognizer Models
This is the preferred way if you need to parse multiple inputs based on the same context (e.g. language and options):
recognizer = NumberRecognizer(Culture.English)
model = recognizer.get_number_model()
result = model.parse('Twelve')
Or, for less verbosity, you can use the helper methods:
from recognizers_number import recognize_number, Culture
result = recognize_number("Twelve", Culture.English)
Internally, both methods will cache the instance models to avoid extra costs.
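To illustrate the "same context, many inputs" case, here is a small sketch that reuses one model instance across a batch. The helper `parse_all` is hypothetical and relies only on the model exposing a `parse(text)` method, as in the example above.

```python
def parse_all(model, inputs):
    """Parse every input with the same model instance.

    Returns a dict mapping each input string to whatever model.parse returns.
    """
    return {text: model.parse(text) for text in inputs}


# With the packages installed, usage would look roughly like this:
#   from recognizers_text import Culture
#   from recognizers_number import NumberRecognizer
#   model = NumberRecognizer(Culture.English).get_number_model()
#   results = parse_all(model, ["Twelve", "I have two apples"])
```

Obtaining the model once and passing it around keeps the culture and options in one place, rather than repeating them on every call.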
Microsoft.Recognizers.Text.Number
- Numbers
This recognizer will find any number in the input. E.g. "I have two apples" will return "2".
recognize_number('I have two apples', Culture.English)
Or you can obtain a model instance using:
NumberRecognizer(Culture.English).get_number_model()
- Ordinal Numbers
This recognizer will find any ordinal number. E.g. "eleventh" will return "11".
recognize_ordinal('eleventh', Culture.English)
Or you can obtain a model instance using:
NumberRecognizer(Culture.English).get_ordinal_model()
- Percentages
This recognizer will find any number presented as a percentage. E.g. "one hundred percents" will return "100%".
recognize_percentage('one hundred percents', Culture.English)
Or you can obtain a model instance using:
NumberRecognizer(Culture.English).get_percentage_model()
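The calls above return a list of result objects rather than a bare string. As a hedged sketch, the helper below pulls the matched text and resolved value out of each result. It assumes each result has a `text` attribute and a `resolution` dict with a `"value"` key; those names are assumptions about the result shape, so verify them against the installed version. The stand-in object only imitates what `recognize_number('I have two apples', Culture.English)` is assumed to return.

```python
from types import SimpleNamespace


def resolved_values(results):
    """Return (matched_text, resolved_value) pairs; value is None if absent."""
    pairs = []
    for result in results:
        resolution = result.resolution or {}
        pairs.append((result.text, resolution.get("value")))
    return pairs


# Stand-in result (assumed shape); replace with a real recognize_number(...)
# call once the package is installed.
sample = [SimpleNamespace(text="two", resolution={"value": "2"})]
print(resolved_values(sample))  # [('two', '2')]
```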
Microsoft.Recognizers.Text.NumberWithUnit
- Ages
This recognizer will find any age in the input. E.g. "After ninety five years of age, perspectives change" will return "95 Year".
recognize_age('After ninety five years of age, perspectives change', Culture.English)
Or you can obtain a model instance using:
NumberWithUnitRecognizer(Culture.English).get_age_model()
- Currencies
This recognizer will find any currency amount in the input. E.g. "Interest expense in the 1988 third quarter was $ 75.3 million" will return "75300000 Dollar".
recognize_currency('Interest expense in the 1988 third quarter was $ 75.3 million', Culture.English)
Or you can obtain a model instance using:
NumberWithUnitRecognizer(Culture.English).get_currency_model()
- Dimensions
This recognizer will find any dimension in the input. E.g. "The six-mile trip to my airport hotel that had taken 20 minutes earlier in the day took more than three hours." will return "6 Mile".
recognize_dimension('The six-mile trip to my airport hotel that had taken 20 minutes earlier in the day took more than three hours.', Culture.English)
Or you can obtain a model instance using:
NumberWithUnitRecognizer(Culture.English).get_dimension_model()
- Temperatures
This recognizer will find any temperature in the input. E.g. "Set the temperature to 30 degrees celsius" will return "30 C".
recognize_temperature('Set the temperature to 30 degrees celsius', Culture.English)
Or you can obtain a model instance using:
NumberWithUnitRecognizer(Culture.English).get_temperature_model()
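The "95 Year" and "30 C" strings quoted above combine a value and a unit. A minimal sketch of that formatting follows, assuming the result's resolution is a dict with `"value"` and `"unit"` keys; that shape is an assumption, and `format_unit_resolution` is a hypothetical helper, not part of the library.

```python
def format_unit_resolution(resolution):
    """Format an assumed {'value': ..., 'unit': ...} resolution as 'value unit'."""
    value = resolution.get("value")
    unit = resolution.get("unit")
    if value is None:
        return None  # nothing was resolved
    return f"{value} {unit}" if unit else str(value)


print(format_unit_resolution({"value": "95", "unit": "Year"}))  # 95 Year
print(format_unit_resolution({"value": "30", "unit": "C"}))     # 30 C
```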
Microsoft.Recognizers.Text.DateTime
- DateTime
This recognizer will find any date, time, duration and date/time range, even if it is written in colloquial language. E.g. "I'll go back 8pm today" will return "2017-10-04 20:00:00" when the reference date is 2017-10-04.
recognize_datetime("I'll go back 8pm today", Culture.English)
Or you can obtain a model instance using:
DateTimeRecognizer(Culture.English).get_datetime_model()
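Because "today" is relative, the resolved value depends on the reference date used. The sketch below makes that explicit by passing the reference through to the recognizer. It takes the recognize function as a parameter and assumes two things that should be checked against the installed package: that the function accepts a `reference` keyword argument, and that each result's resolution holds a `"values"` list of dicts with a `"value"` key. `datetime_values` is a hypothetical helper.

```python
from datetime import datetime


def datetime_values(recognize, text, culture, reference):
    """Collect resolved values from a datetime recognizer for a fixed reference.

    `recognize` is any callable with the assumed signature
    recognize(text, culture, reference=...).
    """
    results = recognize(text, culture, reference=reference)
    return [
        item.get("value")
        for result in results
        for item in (result.resolution or {}).get("values", [])
    ]


# With the packages installed, usage would look roughly like this:
#   from recognizers_text import Culture
#   from recognizers_date_time import recognize_datetime
#   datetime_values(recognize_datetime, "I'll go back 8pm today",
#                   Culture.English, datetime(2017, 10, 4))
```

Pinning the reference date this way also makes tests reproducible, since the output no longer depends on when they run.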
Samples
Integration tips
The Recognizers aim to bridge people's spoken language and machines' programming languages. As such, Recognizers were designed to make it easy to grow the number of supported cultures (i.e. spoken languages) and platforms (i.e. programming languages).
With this goal in mind, they are designed to separate culture-specific logic from the recognizer's core implementation. A shared set of tools is available at the heart of a cross-culture and cross-platform approach that will help with extending the number and range of the recognizers.
To achieve this, the recognizers contain the following folders: