Commit Graph

2915 Commits

Author SHA1 Message Date
Jan Trmal 7083a00b34 (trunk/babel/s5b) Removing one more old BNF script
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3656 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-03-03 00:38:09 +00:00
Jan Trmal fbc7c13d55 (trunk/babel/s5b) Adding a new semi-supervised training script and removing old BNF scripts
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3655 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-03-03 00:37:09 +00:00
Jan Trmal 5c63969966 (trunk/babel/s5b) Lots of fixes around the files. Removed the old script for semisupervised training, as we already have a new one
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3653 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-03-02 22:36:48 +00:00
Vimal Manohar 1bc6f3821d trunk/egs/babel/s5b: Modified conf file for semi-supervised training of DNN
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3652 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-03-02 16:26:59 +00:00
Vimal Manohar c1ace00d33 trunk/egs/babel/s5b: Minor modifications of commands in script run-6-semisupervised-seg.sh
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3651 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-03-02 15:47:16 +00:00
Jan Trmal 3ec5bf91ca (trunk/babel/s5b) Changing configs to contain link to the RADICAL UEM segmentation
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3648 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-03-02 03:28:18 +00:00
Dan Povey dae25f7041 trunk: BABEL script: removing some old, unused scripts.
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3647 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-03-02 00:09:21 +00:00
Guoguo Chen 390127bbb3 Small fix to proxy keywords generation to avoid blowup.
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3643 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-03-01 05:17:34 +00:00
Jan Trmal 4b2173c845 (trunk/babel/s5b) Fixes in the Haitian eval descriptions
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3640 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-03-01 00:01:36 +00:00
Jan Trmal 6b75ec4806 (trunk/babel/s5b) Adding configs for eval and shadow datasets.
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3637 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-28 22:07:07 +00:00
Chao Weng bf746bd057 fixing chime run.sh after nnet scripts moved
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3636 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-28 21:47:32 +00:00
Chao Weng 004b22be57 fixing aurora run.sh after nnet scripts moved
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3635 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-28 21:38:30 +00:00
Karel Vesely 0a2102e2b9 trunk,nnet: removing svn:mime_type property
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3631 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-28 17:58:59 +00:00
Guoguo Chen 2c50f1d052 Adding minimization to phone determinization, which will speed up the downstream operations on the determinized lattices.
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3630 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-28 17:57:30 +00:00
Karel Vesely 50517c2f19 trunk,nnet: WARNING! MOVING KARELS NN SCRIPTS! THIS CAN BREAK YOUR CUSTOM RECIPES!
* moving scripts related to Karel's DNN recipe from steps/* to steps/nnet/*,
* updating the DNN recipes (rm,wsj,swbd), new features:
** can be resumed by --stage,
** renaming experiments to exp/dnn*,
** using scripts at their new location,

We are syncing the script directory structure to be the same as in Dan's recipe;
if you encounter any difficulties, please let me know.



git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3629 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-28 17:56:03 +00:00
Vimal Manohar c7443c8f26 trunk: Minor fix in script egs/babel/s5b/local/combine_posteriors.sh to handle single decode directory. Minor fix in initialization of vectors in src/bin/vector-sum.cc.
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3628 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-28 17:16:50 +00:00
Vimal Manohar c7a26d90e4 trunk/egs/babel: Adding a script steps/nnet2/update_nnet.sh that updates an existing nnet model without initializing it. Minor fix added in script run-6-semisupervised-seg.sh to use update_nnet.sh
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3616 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-27 21:22:40 +00:00
Vimal Manohar 417778ab25 trunk/egs/babel/steps/nnet2: Modified get_lda.sh and train_pnorm.sh to optionally use FMLLR transforms from a directory different from the alignment directory
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3614 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-27 20:28:07 +00:00
Vassil Panayotov 6701a5ca18 trunk/egs: Small fix to steps/make_denlats_nnet.sh (thanks to Feiteng Li)
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3613 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-27 06:10:56 +00:00
Ho Yin Chan 1b53a9c232 trunk:egs readme update on the removal of s5b
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3612 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-27 03:46:35 +00:00
Ho Yin Chan 19107f6556 trunk:src remove since not all data are released
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3611 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-27 03:34:17 +00:00
Ondrej Platek f55e8d0f42 fix '-ldl' error specific to gcc-4.6.3, ubuntu 12.04 by changing fst rule in tools/Makefile
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3609 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-26 23:10:11 +00:00
Dan Povey a9396adaf6 trunk: remove limitation on split not-shared tree building; remove some stuff that was never finished RE AWS; minor changes to run.sh, cosmetic.
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3600 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-26 03:06:42 +00:00
Guoguo Chen ca07e7b7ee Added determinization after KxL2xE for proxy keywords; helps the speed, but not fast enough for the 1million lexicon
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3594 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-25 17:48:59 +00:00
Guoguo Chen 725f4abd68 WARNING: this change list changed GauPost to GaussPost, and also made it pdf-id indexed instead of transition-id indexed. Details: 1) Modified those posterior related programs that use TransitionIdToPdf to get a pdf-id, and later on only use the pdf-id. We merge the posteriors that correspond to the same pdf-id to avoid redundant computation. 2) Modified phone lattice determinization, added a wrapper for the lattice type determinization to reduce redundant code in the decoding binaries.
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3588 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-25 04:00:06 +00:00
Ho Yin Chan cd36df98b5 trunk:src remove unused file
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3587 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-24 22:41:27 +00:00
Dan Povey 24310794f9 trunk: minor change to some functions manipulating posteriors, for efficiency; efficiency improvements in apply-cmvn-sliding; cosmetic changes.
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3585 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-24 22:16:33 +00:00
Vimal Manohar 0db68a27f5 trunk/egs/babel: Bug fix in script for getting examples for semi-supervised training of DNN - get_egs_semi_supervised.sh
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3579 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-24 16:48:39 +00:00
Vassil Panayotov 0103c446dc trunk/egs/voxforge/s5: Small changes to improve acoustic model training.
Fixes the issues outlined in:
http://sourceforge.net/p/kaldi/discussion/1355348/thread/4db46866/#941f/4f9c/cf07

The test time LM is now built on all available transcripts, including those for
the test utterances, which is not the correct way to evaluate a system, but
for this particular recipe I don't think it matters much.


git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3572 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-23 12:19:49 +00:00
Ho Yin Chan 7ea3dff539 trunk:src not ready, nnet/nnet-cache-tgtmat.h does not exist
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3571 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-23 10:15:28 +00:00
Dan Povey a2a1b3d0c3 trunk: renaming --zero-if-disjoint option to --drop-frames in calling scripts (should have been done a long time ago, when I renamed the option)
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3567 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-21 19:52:26 +00:00
Vimal Manohar c191a76f82 trunk/egs/babel: Minor fix in segmentation script generate_segments.sh
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3566 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-21 18:13:33 +00:00
Vimal Manohar 6a2db5e1b7 trunk/egs/babel: Minor change in segmentation.py to take python location from the environment
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3565 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-21 16:41:43 +00:00
Karel Vesely 6d45c140a6 trunk: disabling some tests, which are not ready yet
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3564 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-21 12:39:36 +00:00
Dan Povey b42b9b733f trunk: BABEL: changing default acoustic scale for decoding bottleneck system (thanks to Pegah Ghahremani); minor fix in train_deltas.sh (thanks to Simon Kluepfel)
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3562 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-21 04:19:14 +00:00
Dan Povey efb56ad7d7 trunk: Minor, mostly cosmetic fixes
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3559 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-21 03:47:30 +00:00
Vimal Manohar 22285c7d2c trunk/egs/babel: Minor change in segmentation script generate_segments.sh
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3557 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-20 23:18:52 +00:00
Jan Trmal ebdb816f84 (trunk/babel/s5b) Adding unsup dataset (dataset for unsupervised/semisupervised training)
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3556 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-20 22:35:14 +00:00
Vimal Manohar 6f388b925d trunk: minor fix to scripts train_pnorm.sh and train_pnorn_ensemble.sh to work with non-GMM input model
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3553 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-20 18:07:31 +00:00
Jan Trmal 9692cf08a5 (trunk/babel/s5b) Fix the bug accidentally introduced when renaming the bnf directories (removing the _app suffix)
Because the param was stored in the plp directory and because of the way the param names
are constructed, this led to overwriting the PLP parametrization of the same
dataset. Now the BNF param is stored in the directory param_bnf (plp_bnf seemed
silly).


git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3552 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-20 17:51:10 +00:00
Karel Vesely a6688e9daa trunk,nnet: updating the codebase
- removing 'Cache' class, replaced by 'Randomizer'
- removing obsolete training tools using 'Cache' class
- adding 2d convolutional components from Harish
- removing scaled inv-prior-weighted cross-entropy Class, not helpful




git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3551 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-20 10:55:18 +00:00
Vimal Manohar 3524adaac2 (trunk)
Updating semi-supervised DNN training scripts and associated programs in trunk.


git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3550 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-20 07:28:08 +00:00
Xiaohui Zhang 7318249685 fixed a typo in run-2a-nnet-ensemble.sh
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3549 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-20 06:44:10 +00:00
Jan Trmal d1eb28997b (trunk/babel/s4b) Added support for data sources specified using multiple locations and filelists
This tremendously simplifies handling of untranscribed data and the shadow set processing.
In the longer term, we will be able to remove the shadow-set specific code paths in decoding.
For an example of how to specify a data source using multiple locations, see the example config
that is part of this commit.


git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3546 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-20 06:07:12 +00:00
Dan Povey 717ad1c436 trunk: Change name of program sum-vecs to vector-sum (was not used, but is now needed; new name is more discoverable). Minor changes in usage messages
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3541 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-19 00:08:45 +00:00
Vimal Manohar d081efc2be (trunk)
Adding scripts and source files for semi-supervised DNN training to trunk 
The scripts affect only the Babel recipe.
Adds additional programs in src/bin and src/nnet2bin to get examples
for semi-supervised training.


git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3540 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-18 23:58:34 +00:00
Jan Trmal f9c8bdc451 (trunk/babel/s5b) Additional fixes in the new decoding pipeline plus improvements for the combination scripts.
The kws_combine.sh now supports supplying individual weights of the systems (before this change, all the systems
were taken with the same weight). The score_combine.sh now properly filters the <hes> tag (which is marked as optionally
deletable in the eval documentation).



git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3538 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-18 21:06:20 +00:00
Jan Trmal 228a61d486 (trunk/babel/s5b) Mostly fixes to the new decoding script. New feature: added OOV search.
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3537 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-18 20:54:57 +00:00
Dan Povey 7ea7b9f2c7 trunk: Script fix in train_quick.sh (thanks to Simon Klupfel); and clarifications about how to turn off CMN in scripts.
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3536 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-18 19:40:09 +00:00
Vimal Manohar 1e2af4f0ef Segmentation scripts
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3535 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
2014-02-18 03:40:48 +00:00