зеркало из https://github.com/mozilla/kaldi.git
24 строки
825 B
Plaintext
24 строки
825 B
Plaintext
About the Fisher-English corpus
|
|
|
|
This is conversational telephone speech collected as 2-channel, 8kHz-sampled
|
|
data. The data is similar to Switchboard but the transcription was mostly
|
|
done in a "faster", lower-quality way.
|
|
|
|
Fisher comes in two parts, and the text and speech have separate LDC numbers.
|
|
This recipe uses both parts. The LDC numbers are
|
|
|
|
The speech: LDC2004S13, LDC2005S13
|
|
The text: LDC2004T19, LDC2005T19
|
|
|
|
|
|
Each subdirectory of this directory contains the
|
|
scripts for a sequence of experiments.
|
|
|
|
s5: This recipe is being worked on, it has the initial stages of
|
|
training ready. Note that the data normalization is not compatible
|
|
with our Switchboard setup, we have retained the conventions
|
|
of the Fisher corpus, e.g. lower-case, and acronyms like c._n._n.
|
|
|
|
|
|
|