Commit graph

96 Commits

Author SHA1 Message Date
Tilman Kamp b4a0e3bd85 Catalog tool 2020-02-17 18:27:41 +01:00
Tilman Kamp 06ab3a2d17 Meta data output independent of sample output format 2020-02-17 13:47:11 +01:00
Tilman Kamp 1383951191 Fix --no-progress option; better info logs 2020-02-17 12:34:56 +01:00
Tilman Kamp aacbda1676 Ability to drop samples with unknown and/or multi-instance meta data 2020-02-17 11:41:55 +01:00
Tilman Kamp 3342067a0b Meta data output on SDB export 2020-02-14 18:29:50 +01:00
Tilman Kamp 42556e1988 Late binding of opuslib, nicer progress bars on export 2020-02-13 12:21:25 +01:00
Tilman Kamp dfc565190d Progress indication during SDB finalization 2020-02-12 17:30:27 +01:00
Tilman Kamp 836b87f4a5 Removed debugging artifact 2020-02-12 14:46:13 +01:00
Tilman Kamp bf9fcd15ad Wave and dry-run support for SDB export 2020-02-12 13:55:49 +01:00
Tilman Kamp bb51dbf193 Updated SDB support 2020-02-12 11:48:08 +01:00
Tilman Kamp a8aa0ae2d5 Fix: Initializing Opus frame remainders with zeros 2020-02-10 11:43:32 +01:00
Tilman Kamp c3e879a642 SDB support 2020-02-07 18:29:30 +01:00
Tilman Kamp fba9c8971d Light refactoring to integrate DeepSpeech audio helpers. Additional VAD parameters. 2020-02-06 17:06:44 +01:00
Ryan Hileman d17489c29e Fix KenLM model generation on smaller texts. Add --discount-fallback argument to lmplz, which was necessary when generating a language model for a smaller transcription. This option enables a fallback, and shouldn't affect anything except KenLM models that would've otherwise errored out. 2020-01-31 16:24:11 +01:00
Tilman Kamp 9016b2b5a9 Merge pull request #23 from lunixbochs/patch-3: Skip inputs with empty clean transcripts 2020-01-31 16:21:13 +01:00
Ryan Hileman 86779514ec Skip inputs with empty clean transcripts. Fix #22 2020-01-31 07:12:16 -08:00
Tilman Kamp a5f2808853 Merge pull request #17 from lunixbochs/patch-1: Fix align.sh example in README 2020-01-22 10:50:16 +01:00
Ryan Hileman c53411ead5 Fix align.sh example in README. Looks like the existing example didn't work anymore, this modified command seems to be working for me. 2020-01-22 00:36:52 -08:00
Tilman Kamp 6dcfd4dd4a Updating LM dependencies to DS 0.6.0 2019-12-20 17:33:16 +01:00
Tilman Kamp 699cacd286 Merge pull request #15 from BoneGoat/feature/deepspeech-0.6.0: Feature/deepspeech 0.6.0 2019-12-20 11:13:44 +01:00
Tobias 1963789ce8 Downloading alphabet.txt for DeepSpeech 2019-12-19 19:17:56 +01:00
Bias 420c68e58a Set requirements for DeepSpeech 0.6.0 2019-12-19 16:58:12 +01:00
Bias 0cfee09964 Changes needed for DeepSpeech 0.6.0 2019-12-19 16:50:26 +01:00
Tilman Kamp 91f39a944a Fix: Using file-size instead of sample number on list export 2019-12-02 11:26:33 +01:00
Tilman Kamp ac9ca29507 Closing temp file system-handles; fewer conversion workers 2019-11-28 18:13:24 +01:00
Tilman Kamp 665c44cb18 Export using SoX 2019-11-28 15:08:19 +01:00
Tilman Kamp 08b14658fb Reducing memory consumption during tar export 2019-11-28 12:07:44 +01:00
Tilman Kamp 1e0762feac Tar file export 2019-11-27 17:40:09 +01:00
Tilman Kamp 7365e3fc2f Alternative test catalog 2019-11-27 17:38:48 +01:00
Tilman Kamp a7a69a09bd Fix #4 2019-11-26 17:43:26 +01:00
Tilman Kamp cc45c0d97a Support for meta data in transcription logs 2019-11-26 17:39:17 +01:00
Tilman Kamp b7e13b1896 Log to stdout by default 2019-11-25 15:43:54 +01:00
Tilman Kamp 213e954caa Using imap_unordered for parallel alignment 2019-11-25 15:29:03 +01:00
Tilman Kamp d95b90b2c2 Better logging without progress bar 2019-11-25 14:13:38 +01:00
Tilman Kamp 9ba45ca4e6 Lazy loading models 2019-11-25 13:35:03 +01:00
Tilman Kamp f335419bc5 README update 2019-09-26 18:44:46 +02:00
Tilman Kamp a138d55d9f Always adding text fragments to .aligned file entries 2019-09-24 18:27:17 +02:00
Tilman Kamp 9070b867ad Support for catalogs, multiple audio formats and parallel processing 2019-09-24 16:57:51 +02:00
Tilman Kamp 5b6057af61 CSV headers, assert on sample times 2019-09-23 15:25:07 +02:00
Tilman Kamp db8e577e57 Print set durations 2019-09-18 15:52:33 +02:00
Tilman Kamp bec42d6c14 Better command-line help 2019-09-18 15:10:02 +02:00
Tilman Kamp ab3c66a6ba Added statistics tool 2019-09-18 14:10:37 +02:00
Tilman Kamp 36d3672a19 Better set splitting 2019-09-17 15:38:36 +02:00
Tilman Kamp 72b8cc45f4 Fix for split field case 2019-09-17 10:30:19 +02:00
Tilman Kamp eba1385523 Pretty printing JSON output 2019-09-16 18:30:27 +02:00
Tilman Kamp cb695226fc Exporter to DeepSpeech CSV files 2019-09-16 18:12:54 +02:00
Tilman Kamp a08b1e85d9 Fixed re-definition of skip 2019-09-12 10:21:46 +02:00
Tilman Kamp 4efdc7b365 Dealing with empty matches 2019-09-11 18:12:10 +02:00
Tilman Kamp ae584a901b No phrase snapping by default 2019-09-11 17:43:40 +02:00
Tilman Kamp d2530ed745 Shrinking support during fine-alignment 2019-09-11 16:21:42 +02:00