David Snyder
|
1298918014
|
sandbox/language_id: Adding logistic-regression.conf config file, for setting max_steps and normalizer for the classifier.
git-svn-id: https://svn.code.sf.net/p/kaldi/code/sandbox/language_id@3660 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-03-03 06:48:10 +00:00 |
Dan Povey
|
c2397e1d3b
|
sandbox/lid: Extending scripts to run on test data.
git-svn-id: https://svn.code.sf.net/p/kaldi/code/sandbox/language_id@3641 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-03-01 00:19:06 +00:00 |
Dan Povey
|
c437252251
|
sandbox/language_id: code and script updates for evaluating the model.
git-svn-id: https://svn.code.sf.net/p/kaldi/code/sandbox/language_id@3639 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-28 23:33:06 +00:00 |
Dan Povey
|
45bb9d0168
|
sandbox/language_id: Various improvements to the scripts; more logging in the code; Makefile fix.
git-svn-id: https://svn.code.sf.net/p/kaldi/code/sandbox/language_id@3638 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-28 22:43:17 +00:00 |
David Snyder
|
8095c577db
|
sandbox/language_id: Adding util for creating a language to integer id table. This is needed for the logistic regression.
git-svn-id: https://svn.code.sf.net/p/kaldi/code/sandbox/language_id@3627 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-28 16:44:31 +00:00 |
David Snyder
|
36c22072f6
|
sandbox/language_id: Adding util for converting utt2lang to a file in which the lang is an integer label. Also adding a script for running logistic regression binaries, which is currently incomplete.
git-svn-id: https://svn.code.sf.net/p/kaldi/code/sandbox/language_id@3626 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-28 16:43:04 +00:00 |
David Snyder
|
5eb0118b72
|
sandbox/language_id: Removing erroneous changes to run.sh
git-svn-id: https://svn.code.sf.net/p/kaldi/code/sandbox/language_id@3624 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-28 09:00:05 +00:00 |
David Snyder
|
a87acef8ad
|
sandbox/language_id: Adding util script for generating a map between language names and an id.
git-svn-id: https://svn.code.sf.net/p/kaldi/code/sandbox/language_id@3623 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-28 08:55:47 +00:00 |
David Snyder
|
86142f09d6
|
sandbox/language_id: Fixing previous commit: Adding logistic-regression-eval.cc and removing the corresponding binary
git-svn-id: https://svn.code.sf.net/p/kaldi/code/sandbox/language_id@3622 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-28 08:53:01 +00:00 |
David Snyder
|
52954e97fc
|
sandbox/language_id: Adding logistic-regression-eval for reading in a model trained by LogisticRegression and writing posteriors and scores.
git-svn-id: https://svn.code.sf.net/p/kaldi/code/sandbox/language_id@3621 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-28 08:50:28 +00:00 |
David Snyder
|
684d6f63fb
|
sandbox/language_id: Adding binary logistic-regression-train for training a model on generic training data. However, at the moment this will be primarily used for language-id. Also adding a LogisticRegressionConfig for options.
git-svn-id: https://svn.code.sf.net/p/kaldi/code/sandbox/language_id@3619 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-28 03:24:33 +00:00 |
David Snyder
|
bb9a9e5a6a
|
sandbox/language_id: Adding LogisticRegression class which uses L-BFGS internally. Logistic regression is the standard classifier for language-id.
git-svn-id: https://svn.code.sf.net/p/kaldi/code/sandbox/language_id@3618 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-28 00:17:27 +00:00 |
Dan Povey
|
d4d8d230e9
|
sandbox/language_id: merging trunk, and also committing change to utils/subset_data_dir.sh which I had previously forgotten to commit (necessary for example scripts to work)
git-svn-id: https://svn.code.sf.net/p/kaldi/code/sandbox/language_id@3617 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-27 23:02:24 +00:00 |
Vimal Manohar
|
c7a26d90e4
|
trunk/egs/babel: Adding a script steps/nnet2/update_nnet.sh that updates an existing nnet model without initializing it. Minor fix added in script run-6-semisupervised-seg.sh to use update_nnet.sh
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3616 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-27 21:22:40 +00:00 |
Vimal Manohar
|
417778ab25
|
trunk/egs/babel/steps/nnet2: Modified get_lda.sh and train_pnorm.sh to optionally use FMLLR transforms from a directory different from the alignment directory
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3614 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-27 20:28:07 +00:00 |
Vassil Panayotov
|
6701a5ca18
|
trunk/egs: Small fix to steps/make_denlats_nnet.sh (thanks to Feiteng Li)
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3613 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-27 06:10:56 +00:00 |
Ho Yin Chan
|
1b53a9c232
|
trunk:egs readme update on the removal of s5b
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3612 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-27 03:46:35 +00:00 |
Ho Yin Chan
|
19107f6556
|
trunk:src remove since not all data are released
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3611 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-27 03:34:17 +00:00 |
Ondrej Platek
|
f55e8d0f42
|
Fix '-ldl' error specific to gcc-4.6.3 on Ubuntu 12.04 by changing the fst rule in tools/Makefile
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3609 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-26 23:10:11 +00:00 |
Dan Povey
|
5e4f0372ad
|
sandbox/language_id: extension to subset_data.sh
git-svn-id: https://svn.code.sf.net/p/kaldi/code/sandbox/language_id@3608 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-26 22:30:47 +00:00 |
Dan Povey
|
a9396adaf6
|
trunk: remove limitation on split not-shared tree building; remove some stuff that was never finished RE AWS; minor changes to run.sh, cosmetic.
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3600 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-26 03:06:42 +00:00 |
Dan Povey
|
cd23ee8ec5
|
sandbox/language_id: modified script to set up train and test subsets.
git-svn-id: https://svn.code.sf.net/p/kaldi/code/sandbox/language_id@3599 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-25 23:29:25 +00:00 |
Dan Povey
|
c62c4464a0
|
sandbox/language_id: script bug fix.
git-svn-id: https://svn.code.sf.net/p/kaldi/code/sandbox/language_id@3598 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-25 22:34:00 +00:00 |
Guoguo Chen
|
ca07e7b7ee
|
Added determinization after KxL2xE for proxy keywords; helps the speed, but not fast enough for the 1million lexicon
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3594 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-25 17:48:59 +00:00 |
Dan Povey
|
f686098e30
|
Changing training to use subsets initially.
git-svn-id: https://svn.code.sf.net/p/kaldi/code/sandbox/language_id@3589 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-25 05:58:48 +00:00 |
Guoguo Chen
|
725f4abd68
|
WARNING: this change list changed GauPost to GaussPost, and also made it pdf-id indexed instead of transition-id indexed. Details: 1) Modified the posterior-related programs that use TransitionIdToPdf to get a pdf-id and afterwards use only the pdf-id. We merge the posteriors that correspond to the same pdf-id to avoid redundant computation. 2) Modified phone lattice determinization and added a wrapper for the lattice-type determinization to reduce redundant code in the decoding binaries.
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3588 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-25 04:00:06 +00:00 |
Ho Yin Chan
|
cd36df98b5
|
trunk:src remove unused file
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3587 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-24 22:41:27 +00:00 |
Dan Povey
|
d529ef5112
|
sandbox/language_id: merging changes from trunk
git-svn-id: https://svn.code.sf.net/p/kaldi/code/sandbox/language_id@3586 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-24 22:27:50 +00:00 |
Dan Povey
|
24310794f9
|
trunk: minor change to some functions manipulating posteriors, for efficiency; efficiency improvements in apply-cmvn-sliding; cosmetic changes.
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3585 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-24 22:16:33 +00:00 |
David Snyder
|
90132a6b22
|
sandbox/language_id: Adding additional LDC training data.
git-svn-id: https://svn.code.sf.net/p/kaldi/code/sandbox/language_id@3584 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-24 20:53:42 +00:00 |
Dan Povey
|
e7c733ad70
|
sandbox/language_id: further reversion to data-preparation scripts
git-svn-id: https://svn.code.sf.net/p/kaldi/code/sandbox/language_id@3583 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-24 20:47:18 +00:00 |
Dan Povey
|
d6ad2beb33
|
sandbox/language_id: fix data-prep script to account for last check-in.
git-svn-id: https://svn.code.sf.net/p/kaldi/code/sandbox/language_id@3582 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-24 20:44:45 +00:00 |
Dan Povey
|
677a1d1677
|
sandbox/language_id: rationalize the way the utt2lang is treated.
git-svn-id: https://svn.code.sf.net/p/kaldi/code/sandbox/language_id@3581 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-24 20:35:22 +00:00 |
Vimal Manohar
|
0db68a27f5
|
trunk/egs/babel: Bug fix in script for getting examples for semi-supervised training of DNN - get_egs_semi_supervised.sh
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3579 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-24 16:48:39 +00:00 |
Nagendra Goel
|
8cd6ba8928
|
sort order fix
git-svn-id: https://svn.code.sf.net/p/kaldi/code/sandbox/language_id@3577 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-24 04:47:22 +00:00 |
David Snyder
|
4ed9fc8c17
|
sandbox/language_id: In run.sh now combines and uses all sre08 training data
git-svn-id: https://svn.code.sf.net/p/kaldi/code/sandbox/language_id@3573 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-23 17:39:58 +00:00 |
Vassil Panayotov
|
0103c446dc
|
trunk/egs/voxforge/s5: Small changes to improve acoustic model training.
Fixes the issues outlined in:
http://sourceforge.net/p/kaldi/discussion/1355348/thread/4db46866/#941f/4f9c/cf07
The test time LM is now built on all available transcripts, including those for
the test utterances, which is not the correct way to evaluate a system, but
for this particular recipe I don't think it matters much.
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3572 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-23 12:19:49 +00:00 |
Ho Yin Chan
|
7ea3dff539
|
trunk:src not ready, nnet/nnet-cache-tgtmat.h does not exist
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3571 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-23 10:15:28 +00:00 |
Dan Povey
|
a2a1b3d0c3
|
trunk: renaming --zero-if-disjoint option to --drop-frames in calling scripts (should have been done a long time ago, when I renamed the option)
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3567 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-21 19:52:26 +00:00 |
Vimal Manohar
|
c191a76f82
|
trunk/egs/babel: Minor fix in segmentation script generate_segments.sh
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3566 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-21 18:13:33 +00:00 |
Vimal Manohar
|
6a2db5e1b7
|
trunk/egs/babel: Minor change in segmentation.py to take python location from the environment
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3565 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-21 16:41:43 +00:00 |
Karel Vesely
|
6d45c140a6
|
trunk: disabling some tests, which are not ready yet
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3564 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-21 12:39:36 +00:00 |
Dan Povey
|
b42b9b733f
|
trunk: BABEL: changing default acoustic scale for decoding bottleneck system (thanks to Pegah Ghahremani); minor fix in train_deltas.sh (thanks to Simon Kluepfel)
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3562 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-21 04:19:14 +00:00 |
David Snyder
|
dff8b5b00f
|
sandbox/language_id: Adding needed sort checks in utils/fix_data_dirs.sh. Previous commit 3560 should've instead read 'adding VAD config file'.
git-svn-id: https://svn.code.sf.net/p/kaldi/code/sandbox/language_id@3561 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-21 04:17:24 +00:00 |
David Snyder
|
5e2f176bdc
|
sandbox/language_id: Adding needed sort checks in utils/fix_data_dirs.sh
git-svn-id: https://svn.code.sf.net/p/kaldi/code/sandbox/language_id@3560 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-21 04:15:56 +00:00 |
Dan Povey
|
efb56ad7d7
|
trunk: Minor, mostly cosmetic fixes
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3559 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-21 03:47:30 +00:00 |
Vimal Manohar
|
22285c7d2c
|
trunk/egs/babel: Minor change in segmentation script generate_segments.sh
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3557 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-20 23:18:52 +00:00 |
Jan Trmal
|
ebdb816f84
|
(trunk/babel/s5b) Adding unsup dataset (dataset for unsupervised/semisupervised training)
git-svn-id: https://svn.code.sf.net/p/kaldi/code/trunk@3556 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-20 22:35:14 +00:00 |
Nagendra Goel
|
cd4482adb0
|
add missing file
git-svn-id: https://svn.code.sf.net/p/kaldi/code/sandbox/language_id@3555 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-20 19:55:31 +00:00 |
Nagendra Goel
|
e4debea29c
|
fix filename
git-svn-id: https://svn.code.sf.net/p/kaldi/code/sandbox/language_id@3554 5e6a8d80-dfce-4ca6-a32a-6e07a63d50c8
|
2014-02-20 18:16:01 +00:00 |