kaldi/egs/wsj
Daniel Povey 9aee0f66bf script-level support for background-thread reading of nnet training examples 2016-03-16 17:23:41 -04:00
s5 script-level support for background-thread reading of nnet training examples 2016-03-16 17:23:41 -04:00
README.txt trunk: MAJOR COMMIT: Removing deprecated or little-used programs and scripts, which will now reside only in branches/complete 2013-12-16 22:33:32 +00:00

README.txt

About the Wall Street Journal corpus:
    This is a corpus of read sentences from the Wall Street Journal,
    recorded under clean conditions.  The vocabulary is quite large, and
    there are about 80 hours of training data.
    It is available from the LDC as either:
      [ catalog numbers LDC93S6A (WSJ0) and LDC94S13A (WSJ1) ]
    or:
      [ catalog numbers LDC93S6B (WSJ0) and LDC94S13B (WSJ1) ]
    The latter option is cheaper and includes only the Sennheiser
    microphone data (which is all we use in the example scripts).

Each subdirectory of this directory contains the scripts for a
sequence of experiments.  [Note: most of the older example scripts
have been deleted, but they are still available at
^/branches/complete.]

  s5: This is the current recommended recipe.