DeepSpeech

Language-Specific Data
======================

This directory contains language-specific data files. The most important ones are:

1. A list of unique characters for the target language (e.g. English) in ``data/alphabet.txt``. After installing the training code, run ``python -m deepspeech_training.util.check_characters --help`` to see the options of a tool that creates an alphabet file from a list of training CSV files.

2. A script used to generate a binary n-gram language model: ``data/lm/generate_lm.py``.

For more information on how to build these resources from scratch, see the ``External scorer scripts`` section on `deepspeech.readthedocs.io <https://deepspeech.readthedocs.io/>`_.
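
To illustrate what an alphabet file contains, the following is a minimal sketch of collecting the unique characters from the transcripts in a set of training CSV files. It is not the ``check_characters`` implementation. The function name ``collect_alphabet``, the in-memory sample data, and the assumption that transcripts live in a column named ``transcript`` are all illustrative.

```python
import csv
import io


def collect_alphabet(csv_files, column="transcript"):
    """Return the sorted unique characters found in `column` of each CSV.

    `csv_files` is an iterable of open text file objects. The column name
    is an assumption made for this sketch.
    """
    chars = set()
    for handle in csv_files:
        for row in csv.DictReader(handle):
            # Adding a string to a set via update() adds each character.
            chars.update(row[column])
    return sorted(chars)


# Hypothetical in-memory CSV standing in for a real training file.
sample = io.StringIO(
    "wav_filename,wav_filesize,transcript\n"
    "a.wav,1234,hello world\n"
    "b.wav,5678,hold on\n"
)

alphabet = collect_alphabet([sample])

# Write one character per line, which is how this sketch lays out the file.
print("\n".join(alphabet))
```

With real data, the open file objects for the training, validation, and test CSVs would be passed in place of ``sample``, and the printed lines would be written to a file instead of standard output.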