kaldi

Dan Povey 45e4d2f0a0 Minor fixes to SGMM training. git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@1362 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8		2012-09-17 18:21:40 +00:00
s3	Very small cosmetic changes to some scripts.	2012-05-09 18:14:42 +00:00
s5	Minor fixes to SGMM training.	2012-09-17 18:21:40 +00:00
README.txt	Changes to READMEs	2012-09-16 17:34:52 +00:00

README.txt

About the Switchboard corpus

    This is conversational telephone speech, collected as 2-channel data sampled
    at 8 kHz.  We use only the Switchboard-1 Phase 1 training data.  We believe
    that what we have corresponds to LDC catalog number LDC97S62
    (Switchboard-1 Release 2).  We also use the Mississippi State
    transcriptions, which we download separately from
    http://www.isip.piconepress.com/projects/switchboard/releases/switchboard_word_alignments.tar.gz
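
    As a rough illustration of the separate download step, the sketch below
    fetches and unpacks the transcription tarball.  It is not taken from the
    recipe scripts: the destination directory and the function name
    fetch_transcripts are hypothetical, and it assumes that wget and tar are
    installed and that the URL above is still reachable.

```shell
# Sketch only: "dest" and "fetch_transcripts" are illustrative names,
# not part of the s3 or s5 recipe scripts.
url=http://www.isip.piconepress.com/projects/switchboard/releases/switchboard_word_alignments.tar.gz
dest=data/local/swbd_transcripts   # assumed location; adjust as needed

fetch_transcripts() {
  mkdir -p "$dest" || return 1
  tarball="$dest/$(basename "$url")"
  # Download to a temporary name first, so that an interrupted transfer
  # is not mistaken for a complete tarball on the next run.
  if [ ! -s "$tarball" ]; then
    wget -O "$tarball.part" "$url" && mv "$tarball.part" "$tarball" || return 1
  fi
  # Unpack next to the tarball.
  tar -xzf "$tarball" -C "$dest"
}
```

    Calling fetch_transcripts from the recipe directory would leave the
    unpacked transcriptions under $dest; a second call skips the download
    if the tarball is already present.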


Each subdirectory of this directory contains the
scripts for a sequence of experiments.

  s3: This Switchboard recipe is not fully state-of-the-art, mostly because
    the language model and dictionary do not use any external sources of
    data, only what is in the LDC corpus.  As a result, the language model
    has high perplexity and the dictionary is small.

  s5: This is the "new-new-style" recipe.
    All further work will build on this style of recipe.