kaldi/egs/fisher_english

README.txt

About the Fisher-English corpus

    This is conversational telephone speech collected as 2-channel, 8kHz-sampled
    data.  The data is similar to Switchboard but the transcription was mostly
    done in a "faster", lower-quality way.

    Fisher comes in two parts, and the text and speech have separate LDC numbers.
    This recipe uses both parts.  The LDC numbers are:

    The speech: LDC2004S13, LDC2005S13
    The text: LDC2004T19, LDC2005T19
 

Each subdirectory of this directory contains the
scripts for a sequence of experiments.

  s5: This recipe is still being worked on; the initial stages of
      training are ready.  Note that the data normalization is not
      compatible with our Switchboard setup: we have retained the
      conventions of the Fisher corpus, e.g. lower-case text and
      acronyms written like c._n._n.
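
To make the acronym convention concrete, here is a minimal Python sketch
that recognizes tokens shaped like c._n._n. and splits them into letters.
It is not part of the s5 scripts.  The function names and the exact token
pattern (single lower-case letters, each followed by a period and joined
by underscores, with the final period treated as optional) are assumptions
based only on the example above.

```python
import re
from typing import List

# Assumed shape of a Fisher-style acronym token: single lower-case letters,
# each followed by a period, joined by underscores (e.g. "c._n._n.").
# The final period is optional here, since the README example alone does not
# settle whether it belongs to the token.
ACRONYM_RE = re.compile(r"^(?:[a-z]\._)+[a-z]\.?$")


def is_fisher_acronym(token: str) -> bool:
    """Return True if the token matches the assumed acronym pattern."""
    return ACRONYM_RE.match(token) is not None


def acronym_letters(token: str) -> List[str]:
    """Split an acronym token into its letters (hypothetical helper)."""
    if not is_fisher_acronym(token):
        raise ValueError("not an acronym token: %r" % token)
    # Each underscore-separated piece starts with exactly one letter.
    return [piece[0] for piece in token.split("_")]


if __name__ == "__main__":
    for tok in ["c._n._n.", "hello", "t._v."]:
        if is_fisher_acronym(tok):
            print(tok, "->", acronym_letters(tok))
        else:
            print(tok, "-> ordinary word")
```

A check like this can help when comparing Fisher transcripts against data
normalized under a different convention, since the acronym tokens are one
of the places where the two will not match.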