kaldi/egs/wsj
Dan Povey 4377ba39d1 Changes to READMEs
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@1356 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2012-09-16 17:34:52 +00:00
s1 Added basis-fmllr to RESULTS file in wsj/s1 2012-06-30 19:46:06 +00:00
s2 Merging various small script changes from sandbox 2012-09-09 19:29:17 +00:00
s3 revert some changes 2012-08-23 12:07:41 +00:00
s5 Minor fixes: improve diagnostics of compute-cmvn-stats; fix to run.sh in swbd setup. 2012-09-16 04:48:16 +00:00
README.txt Changes to READMEs 2012-09-16 17:34:52 +00:00

README.txt

About the Wall Street Journal corpus:
    This is a corpus of read sentences from the Wall Street Journal,
    recorded under clean conditions.  The vocabulary is quite large, and
    there are about 80 hours of training data.
    Available from the LDC as either: [ catalog numbers LDC93S6A (WSJ0) and LDC94S13A (WSJ1) ]
    or: [ catalog numbers LDC93S6B (WSJ0) and LDC94S13B (WSJ1) ]
    The latter option is cheaper and includes only the Sennheiser
    microphone data (which is all we use in the example scripts).


Each subdirectory of this directory contains the
scripts for a sequence of experiments.  Note: s3 is the "default" set of
scripts at the moment.

  s1: This setup contains experiments with GMM-based systems using
      various Maximum Likelihood techniques, including global and
      speaker-specific transforms.
      See the parallel setup in ../rm/s1

  s2: This setup contains experiments with a pure hybrid system as well
      as with a Tandem bottleneck-feature system.

  s3: This is the "new-style" recipe, now superseded by s5.

  s5: This is the newest-style recipe.  It should run, although the RESULTS
      file is not finalized.