Commit graph

135 commits

Author SHA1 Message Date
Tilman Kamp c3e879a642 SDB support 2020-02-07 18:29:30 +01:00
Tilman Kamp fba9c8971d Light refactoring to integrate DeepSpeech audio helpers. Additional VAD parameters. 2020-02-06 17:06:44 +01:00
Ryan Hileman d17489c29e Fix KenLM model generation on smaller texts
Add --discount-fallback argument to lmplz, which was necessary when generating a language model for a smaller transcription. This option enables a fallback, and shouldn't affect anything except KenLM models that would've otherwise errored out.
2020-01-31 16:24:11 +01:00
Tilman Kamp 9016b2b5a9
Merge pull request #23 from lunixbochs/patch-3
Skip inputs with empty clean transcripts
2020-01-31 16:21:13 +01:00
Ryan Hileman 86779514ec
Skip inputs with empty clean transcripts
Fix #22
2020-01-31 07:12:16 -08:00
Tilman Kamp a5f2808853
Merge pull request #17 from lunixbochs/patch-1
Fix align.sh example in README
2020-01-22 10:50:16 +01:00
Ryan Hileman c53411ead5
Fix align.sh example in README
Looks like the existing example didn't work anymore, this modified command seems to be working for me.
2020-01-22 00:36:52 -08:00
Tilman Kamp 6dcfd4dd4a Updating LM dependencies to DS 0.6.0 2019-12-20 17:33:16 +01:00
Tilman Kamp 699cacd286
Merge pull request #15 from BoneGoat/feature/deepspeech-0.6.0
Feature/deepspeech 0.6.0
2019-12-20 11:13:44 +01:00
Tobias 1963789ce8 Downloading alphabet.txt for DeepSpeech 2019-12-19 19:17:56 +01:00
Bias 420c68e58a Set requirements for DeepSpeech 0.6.0 2019-12-19 16:58:12 +01:00
Bias 0cfee09964 Changes needed for DeepSpeech 0.6.0 2019-12-19 16:50:26 +01:00
Tilman Kamp 91f39a944a Fix: Using file-size instead of sample number on list export 2019-12-02 11:26:33 +01:00
Tilman Kamp ac9ca29507 Closing temp file system-handles; fewer conversion workers 2019-11-28 18:13:24 +01:00
Tilman Kamp 665c44cb18 Export using SoX 2019-11-28 15:08:19 +01:00
Tilman Kamp 08b14658fb Reducing memory consumption during tar export 2019-11-28 12:07:44 +01:00
Tilman Kamp 1e0762feac Tar file export 2019-11-27 17:40:09 +01:00
Tilman Kamp 7365e3fc2f Alternative test catalog 2019-11-27 17:38:48 +01:00
Tilman Kamp a7a69a09bd Fix #4 2019-11-26 17:43:26 +01:00
Tilman Kamp cc45c0d97a Support for meta data in transcription logs 2019-11-26 17:39:17 +01:00
Tilman Kamp b7e13b1896 Log to stdout by default 2019-11-25 15:43:54 +01:00
Tilman Kamp 213e954caa Using imap_unordered for parallel alignment 2019-11-25 15:29:03 +01:00
Tilman Kamp d95b90b2c2 Better logging without progress bar 2019-11-25 14:13:38 +01:00
Tilman Kamp 9ba45ca4e6 Lazy loading models 2019-11-25 13:35:03 +01:00
Tilman Kamp f335419bc5 README update 2019-09-26 18:44:46 +02:00
Tilman Kamp a138d55d9f Always adding text fragments to .aligned file entries 2019-09-24 18:27:17 +02:00
Tilman Kamp 9070b867ad Support for catalogs, multiple audio formats and parallel processing 2019-09-24 16:57:51 +02:00
Tilman Kamp 5b6057af61 CSV headers, assert on sample times 2019-09-23 15:25:07 +02:00
Tilman Kamp db8e577e57 Print set durations 2019-09-18 15:52:33 +02:00
Tilman Kamp bec42d6c14 Better command-line help 2019-09-18 15:10:02 +02:00
Tilman Kamp ab3c66a6ba Added statistics tool 2019-09-18 14:10:37 +02:00
Tilman Kamp 36d3672a19 Better set splitting 2019-09-17 15:38:36 +02:00
Tilman Kamp 72b8cc45f4 Fix for split field case 2019-09-17 10:30:19 +02:00
Tilman Kamp eba1385523 Pretty printing JSON output 2019-09-16 18:30:27 +02:00
Tilman Kamp cb695226fc Exporter to DeepSpeech CSV files 2019-09-16 18:12:54 +02:00
Tilman Kamp a08b1e85d9 Fixed re-definition of skip 2019-09-12 10:21:46 +02:00
Tilman Kamp 4efdc7b365 Dealing with empty matches 2019-09-11 18:12:10 +02:00
Tilman Kamp ae584a901b No phrase snapping by default 2019-09-11 17:43:40 +02:00
Tilman Kamp d2530ed745 Shrinking support during fine-alignment 2019-09-11 16:21:42 +02:00
Tilman Kamp 92f23253bb Support for different text similarity metrics 2019-09-10 18:39:03 +02:00
Tilman Kamp 11c61c7970 Default minimum of 1 for tlen and mlen 2019-09-06 17:52:29 +02:00
Tilman Kamp 0aa5bc7422 Meta data pass through 2019-09-05 16:27:24 +02:00
Tilman Kamp fab9ec601f Extended example data for new script support 2019-09-05 15:33:10 +02:00
Tilman Kamp 138026d875 Fix for multi-phrase support, direct loading of .tlog files 2019-09-04 18:45:34 +02:00
Tilman Kamp 718a430e59 Bug fixes; progress-bars; multi phrase text cleaning preserving assigned meta-data 2019-09-04 15:45:58 +02:00
Tilman Kamp b7c1e9ea7b Removed debug leftover 2019-09-02 15:23:29 +02:00
Tilman Kamp b912fef392 README updates 2019-09-02 13:42:04 +02:00
Tilman Kamp 03f5b39d37 Multi-process transcription, .tlog files for transcripts, some light refactoring and reduced logging 2019-09-02 12:26:10 +02:00
Tilman Kamp 3031f35561 Fix wrong index name 2019-08-20 10:25:07 +02:00
Tilman Kamp cc6507ef97 Fix for calling removed function 2019-08-20 10:19:09 +02:00