Commit graph

129 commits

Author SHA1 Message Date
Bias 5cfa401ff1 Making sure files are closed 2020-07-03 16:21:11 +02:00
Bias 9c09da471a Fixed unicode in output 2020-07-03 16:21:11 +02:00
Bias 05c96731c4 Removing temp lm files 2020-07-03 16:21:11 +02:00
Bias f6cf144382 Initial support for DeepSpeech 0.7.1 2020-07-03 16:21:11 +02:00
Tilman Kamp 39a633a434 Updated documentation and minor tool fixes 2020-07-01 17:59:46 +02:00
Tilman Kamp f6a16d92a0 Fix empty set-assignments 2020-05-04 18:18:52 +02:00
Tilman Kamp f3e594f566 Better progress logging in SDB tool 2020-03-09 12:08:53 +01:00
Tilman Kamp e06a702756 Fix missing utf-8 decoding on SDB meta data reading 2020-03-05 15:37:33 +01:00
Tilman Kamp ebb2e9721f Simplified main routine 2020-02-27 16:05:19 +01:00
Tilman Kamp 3dac0f8db3 Progress reporting on meta file writing; filenames in log messages 2020-02-27 16:03:59 +01:00
Tilman Kamp 149805990e Closing od SDB reader at end of SortingSDBWriter finalization 2020-02-27 16:02:26 +01:00
Tilman Kamp d41c0b1ff7 Exporter: Second chance conversion and ability to skip samples on audio errors 2020-02-26 11:50:52 +01:00
Tilman Kamp f8cd176b8d Post-refactoring fix of exporter's de-biasing 2020-02-25 18:53:40 +01:00
Tilman Kamp 0c2ea1b983 Export plan as a cache for export preparation steps 2020-02-25 16:54:53 +01:00
Tilman Kamp e03e830685 Better split-field checking 2020-02-25 16:10:41 +01:00
Tilman Kamp ecd2a74906 Better checking for existing target paths 2020-02-25 16:09:41 +01:00
Tilman Kamp 8897bb6cc7 Refactored exporter for better maintenance and lower memory footprint; --tmp-dir option; CSV meta files 2020-02-25 15:44:37 +01:00
Tilman Kamp 536dc6b006 Exporter argument parsing as own function 2020-02-24 11:19:59 +01:00
Tilman Kamp 23a4569ba5 Using heapq.merge for interleaving 2020-02-21 19:08:37 +01:00
Tilman Kamp 061f77bb62 Updated custom mime-types 2020-02-21 19:07:45 +01:00
Tilman Kamp 3a05b79285 Removed tqdm from stats.py 2020-02-21 13:10:28 +01:00
Tilman Kamp e788cee6dc Fix #25 2020-02-21 13:02:35 +01:00
Tilman Kamp e8fb5895ca Additional parameter for SDB finalization sample buffer size 2020-02-20 17:16:55 +01:00
Tilman Kamp 0d45123b29 Progress logging: Prevent division by 0 2020-02-20 17:16:03 +01:00
Tilman Kamp 842d50a950 Fix for meta fields that are single values instead of lists 2020-02-20 13:57:05 +01:00
Tilman Kamp 4e296d4011 Changed some debug log-messages to info ones 2020-02-20 11:44:09 +01:00
Tilman Kamp 7993f214fd Progress logging to stderr 2020-02-19 17:55:43 +01:00
Tilman Kamp 82d40d82f4 Better progress logging; Output of fragment meta data on sample cutting problem 2020-02-19 17:29:10 +01:00
Tilman Kamp 02305706d0 Remove incompatible stty sane call 2020-02-18 17:28:11 +01:00
Tilman Kamp f7e2f7f0ab Refactored CollectionSample to LabeledSample 2020-02-18 15:53:28 +01:00
Tilman Kamp fe8588565f Better progress logging in catalog tool 2020-02-18 14:08:27 +01:00
Tilman Kamp 7fa6773bc4 Fixes for logging and argument parsing 2020-02-18 14:07:08 +01:00
Tilman Kamp 28806dfff6 Fix #16 2020-02-18 14:05:53 +01:00
Tilman Kamp b4a0e3bd85 Catalog tool 2020-02-17 18:27:41 +01:00
Tilman Kamp 06ab3a2d17 Meta data output independent of sample output format 2020-02-17 13:47:11 +01:00
Tilman Kamp 1383951191 Fix --no-progress option; better info logs 2020-02-17 12:34:56 +01:00
Tilman Kamp aacbda1676 Ability to drop samples with unknown and/or multi-instance meta data 2020-02-17 11:41:55 +01:00
Tilman Kamp 3342067a0b Meta data output on SDB export 2020-02-14 18:29:50 +01:00
Tilman Kamp 42556e1988 Late binding of opuslib, nicer progress bars on export 2020-02-13 12:21:25 +01:00
Tilman Kamp dfc565190d Progress indication during SDB finalization 2020-02-12 17:30:27 +01:00
Tilman Kamp 836b87f4a5 Removed debugging artifact 2020-02-12 14:46:13 +01:00
Tilman Kamp bf9fcd15ad Wave and dry-run support for SDB export 2020-02-12 13:55:49 +01:00
Tilman Kamp bb51dbf193 Updated SDB support 2020-02-12 11:48:08 +01:00
Tilman Kamp a8aa0ae2d5 Fix: Initializing Opus frame remainders with zeros 2020-02-10 11:43:32 +01:00
Tilman Kamp c3e879a642 SDB support 2020-02-07 18:29:30 +01:00
Tilman Kamp fba9c8971d Light refactoring to integrate DeepSpeech audio helpers. Additional VAD parameters. 2020-02-06 17:06:44 +01:00
Ryan Hileman d17489c29e Fix KenLM model generation on smaller texts 2020-01-31 16:24:11 +01:00
    Add --discount-fallback argument to lmplz, which was necessary when generating a language model for a smaller transcription. This option enables a fallback, and shouldn't affect anything except KenLM models that would've otherwise errored out.
Tilman Kamp 9016b2b5a9 Merge pull request #23 from lunixbochs/patch-3 2020-01-31 16:21:13 +01:00
    Skip inputs with empty clean transcripts
Ryan Hileman 86779514ec Skip inputs with empty clean transcripts 2020-01-31 07:12:16 -08:00
    Fix #22
Tilman Kamp a5f2808853 Merge pull request #17 from lunixbochs/patch-1 2020-01-22 10:50:16 +01:00
    Fix align.sh example in README