Add batch processing example (#58)
Parent: c2cc8c3112
Commit: 25be62ab21
@ -0,0 +1,157 @@
# Setup for Windows

Setting this up on Windows was quite a challenge, mainly because the versions of every component have to match.

See `setup.ps` for the commands that add the required directories to the path.

```
Windows 10 Pro
Version: 2004
OS Build: 19041.329
```

Here are the versions I have installed via Chocolatey:

```
Chocolatey v0.10.15
7zip v19.0
7zip.install v19.0
anaconda3 v2020.02
audacity v2.4.1
az.powershell v4.2.0
azshell v0.2.2
azure-cli v2.7.0
azurepowershell v6.9.0
bazel v3.2.0
blender v2.83.0
chocolatey v0.10.15
chocolatey-core.extension v1.3.5.1
chocolatey-dotnetfx.extension v1.0.1
chocolatey-fastanswers.extension v0.0.2
chocolatey-visualstudio.extension v1.8.1
chocolatey-windowsupdate.extension v1.0.4
docker-desktop v2.3.0.3
DotNet4.5.1 v4.5.1.20140606
DotNet4.5.2 v4.5.2.20140902
dotnetfx v4.8.0.20190930
ffmpeg v4.2.3
git v2.26.2
git.install
google-chrome-x64 v47.0.2526.81
GoogleChrome v83.0.4103.97
grep v2.1032
KB2919355 v1.0.20160915
KB2919442 v1.0.20160915
KB2999226 v1.0.20181019
KB3033929 v1.0.5
KB3035131 v1.0.3
KB3118401 v1.0.4
microsoft-windows-terminal v1.0.1401.0
msys2 v20200602.0.0
NTop.Portable v0.3.4
powershell-core v7.0.1
procexp v16.32
python v3.8.3
python3 v3.8.3
sox.portable v14.4.1
vcredist140 v14.26.28720.3
vcredist2008 v9.0.30729.6163
vcredist2015 v14.0.24215.20170201
vcredist2017 v14.16.27033
visualstudio-installer v2.0.1
VisualStudio2013ExpressWeb v12.0.21005.20150920
visualstudio2019community v16.6.1.0
vscode v1.45.1
vscode.install v1.45.1
Wget v1.20.3.20190531
windows-sdk-10-version-2004-windbg v10.0.19041.0
wsl v1.0.1
Xming v6.9.0.31
```

The versions of the NVIDIA software installed are:

* CUDA v10.0
  from https://developer.nvidia.com/cuda-10.0-download-archive?target_os=Windows&target_arch=x86_64&target_version=10&target_type=exelocal
  get the file https://developer.download.nvidia.com/compute/cuda/10.0/secure/Prod/local_installers/cuda_10.0.130_411.31_win10.exe

* cudnn-10.0-windows10-x64-v7.5.1.10
  from https://developer.nvidia.com/rdp/cudnn-archive
  `Download cuDNN v7.5.1 (April 22, 2019), for CUDA 10.0`
  via https://developer.nvidia.com/rdp/cudnn-archive#a-collapse751-10
  get the file https://developer.download.nvidia.com/compute/machine-learning/cudnn/secure/7.6.5.32/Production/10.0_20191031/cudnn-10.0-windows10-x64-v7.6.5.32.zip
  (note that this URL names v7.6.5.32, while the package above and the path in `setup.ps` are v7.5.1.10)

* TensorRT-5.1.5.0.Windows10.x86_64.cuda-10.0.cudnn7.5
  from https://developer.nvidia.com/nvidia-tensorrt-5x-download
  https://developer.nvidia.com/nvidia-tensorrt-5x-download#trt51ga
  via `Windows10 and CUDA 10.0 zip package`
  get the file https://developer.nvidia.com/compute/machine-learning/tensorrt/5.1/ga/zips/TensorRT-5.1.5.0.Windows10.x86_64.cuda-10.0.cudnn7.5.zip
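
Because a missing or mismatched DLL is the typical failure with these packages, it can help to check that the libraries are actually reachable through the path before running anything. The following is a minimal sketch and not part of this commit: `find_on_path` is a hypothetical helper, and the DLL names are copied from the `Successfully opened dynamic library` lines in the output log in this README.

```python
import os


def find_on_path(filename, search_path=None):
    """Return the first directory on the search path that contains filename, or None."""
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    for directory in search_path.split(os.pathsep):
        # Skip empty entries, which appear when PATH has a stray separator.
        if directory and os.path.isfile(os.path.join(directory, filename)):
            return directory
    return None


if __name__ == "__main__":
    # DLL names taken from the TensorFlow log lines in this README.
    for dll in ("nvcuda.dll", "cudart64_100.dll", "cublas64_100.dll", "cudnn64_7.dll"):
        print(dll, "->", find_on_path(dll) or "NOT FOUND")
```

Any entry reported as `NOT FOUND` suggests that the corresponding directory has not been added to the path in the current session.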

I am using exactly the Python package versions pinned in `requirements.txt`:
`pip3 install -r requirements.txt`
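
To confirm that the environment really matches the pins after installing, something like the sketch below could be used. It is an illustration only: `parse_pins` and `find_mismatches` are hypothetical names, the comparison is a plain string comparison rather than a full version-specifier check, and direct references such as the `delegator.py @ git+https://...` line are simply skipped. It relies on `importlib.metadata`, which is in the standard library from Python 3.8.

```python
from importlib import metadata  # standard library in Python 3.8+


def parse_pins(lines):
    """Map package name to pinned version for simple `name==version` lines."""
    pins = {}
    for line in lines:
        line = line.strip()
        # Skip blanks, comments, and direct references (`pkg @ git+https://...`).
        if not line or line.startswith("#") or "==" not in line or " @ " in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {name: (pinned, installed_or_None)} for packages that differ."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches
```

Calling `find_mismatches(parse_pins(open("requirements.txt")))` returns an empty dictionary when every pinned package is installed at the pinned version.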

Here is the output of running `test.ps1`:

```
PS C:\Users\jmike\Documents\GitHub\DeepSpeech-examples\batch_processing> . .\test.ps1
2020-06-14 11:05:01.015450: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
Loading model from file C:\Users\jmike\Documents\GitHub\DeepSpeech\deepspeech-0.7.3-models.pbmm
TensorFlow: v1.15.0-24-gceb46aae58
DeepSpeech: v0.7.3-0-g88584941
2020-06-14 11:05:01.237478: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-06-14 11:05:01.244057: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-06-14 11:05:01.466608: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce MX250 major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:01:00.0
2020-06-14 11:05:01.466806: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2020-06-14 11:05:01.473468: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2020-06-14 11:05:01.476879: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll
2020-06-14 11:05:01.478672: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll
2020-06-14 11:05:01.482925: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll
2020-06-14 11:05:01.485963: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll
2020-06-14 11:05:01.498053: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-06-14 11:05:01.498710: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-06-14 11:05:02.066853: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-06-14 11:05:02.067030: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
2020-06-14 11:05:02.068133: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
2020-06-14 11:05:02.073298: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1410 MB memory) -> physical GPU (device: 0, name: GeForce MX250, pci bus id: 0000:01:00.0, compute capability: 6.1)
Loaded model in 0.941s.
Loading scorer from files C:\Users\jmike\Documents\GitHub\DeepSpeech\deepspeech-0.7.3-models.scorer
Loaded scorer in 0.0143s.
Warning: original sample rate (44100) is different than 16000hz. Resampling might produce erratic speech recognition.
Running inference.
2020-06-14 11:05:02.382781: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
```
Running inference on the GPU takes half the time of running it on the CPU and gives good results.
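
The sample-rate warning in the output above appears because the input clip was recorded at 44100 Hz rather than 16000 Hz. A file can be checked before it is passed to `deepspeech` with the standard-library `wave` module. This is a minimal sketch under the assumption that the input is an uncompressed PCM WAV file; `wav_sample_rate` and `needs_resampling` are hypothetical helper names, not part of this commit.

```python
import wave


def wav_sample_rate(path):
    """Return the sample rate in Hz stored in a PCM WAV file header."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate()


def needs_resampling(path, expected_rate=16000):
    """Return True if the file's sample rate differs from the expected rate."""
    return wav_sample_rate(path) != expected_rate
```

A file for which `needs_resampling` returns `True` would trigger the same warning, so it is a candidate for conversion with `ffmpeg -ar 16000` first.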

# Driver command line

`./driver.py --model c:/Users/jmike/Documents/GitHub/DeepSpeech/deepspeech-0.7.3-models.pbmm --scorer c:/Users/jmike/Documents/GitHub/DeepSpeech/deepspeech-0.7.3-models.scorer --dirname c:/Users/jmike/Downloads/podcast/`

# Example

For each audio file in the directory, the driver then runs an individual command such as:

`deepspeech --model C:\Users\jmike\Documents\GitHub\DeepSpeech\deepspeech-0.7.3-models.pbmm --scorer C:\Users\jmike\Documents\GitHub\DeepSpeech\deepspeech-0.7.3-models.scorer --audio 'C:\Users\jmike\Downloads\podcast\45374977-48000-2-24d9a365625bb.mp3.wav' --json`
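
The quoting around the audio path matters because file names may contain spaces. One alternative, shown here only as a sketch, is to pass the command as an argument list through the standard-library `subprocess` module instead of joining it into one string for `delegator`, so that no shell quoting is needed. The function names `build_deepspeech_args` and `transcribe` are hypothetical; the flags are the ones used in the command above.

```python
import subprocess


def build_deepspeech_args(model, scorer, audio):
    """Build the deepspeech command as an argument list (no shell quoting needed)."""
    return ["deepspeech", "--model", model, "--scorer", scorer, "--audio", audio, "--json"]


def transcribe(model, scorer, audio):
    """Run deepspeech and return its standard output (the JSON transcript) as text."""
    result = subprocess.run(
        build_deepspeech_args(model, scorer, audio),
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,  # raise CalledProcessError if deepspeech exits with an error
    )
    return result.stdout
```

With this approach each path is a single list element, so a path containing spaces reaches `deepspeech` unchanged.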


Websites referenced:

https://chocolatey.org/packages/cuda
https://deepspeech.readthedocs.io/en/v0.7.3/?badge=latest
https://developer.nvidia.com/cuda-10.0-download-archive?target_os=Windows&target_arch=x86_64&target_version=10
https://discourse.mozilla.org/t/query-regarding-speed-of-training-and-issues-with-convergence/41874
https://discourse.mozilla.org/t/right-cuda-version-for-using-deepspeech-gpu/41927/12
https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html#download-windows
https://github.com/MichalMazurek/python-poetry/blob/d3f6df6a6c2587d7a6034719716de257917c4b0f/dockerfiles.py
https://github.com/amitt001/delegator.py
https://github.com/tensorflow/tensorflow/issues/25807
https://github.com/tensorflow/tensorflow/issues/28223
https://github.com/tensorflow/tensorflow/issues/5968
https://hacks.mozilla.org/2019/12/deepspeech-0-6-mozillas-speech-to-text-engine/
https://palletsprojects.com/p/click/
https://www.howtoforge.com/tutorial/ffmpeg-audio-conversion/
https://www.joe0.com/2019/10/19/how-resolve-tensorflow-2-0-error-could-not-load-dynamic-library-cudart64_100-dll-dlerror-cudart64_100-dll-not-found/
https://www.programcreek.com/python/example/88033/click.Path
@ -0,0 +1,83 @@
import glob
import json
import os
from os.path import expanduser

import click

import delegator

# first loop over the files
# and convert them to wave

# record audio at 16000 Hz in the future, or you get this:
# Warning: original sample rate (44100) is different than 16000hz. Resampling might produce erratic speech recognition.


@click.command()
@click.option("--dirname", required=True, type=click.Path(exists=True, resolve_path=True))
@click.option("--ext", default=".mp3")
@click.option(
    "--model",
    default="deepspeech-0.7.3-models.pbmm",
    type=click.Path(exists=True, resolve_path=True),
)
@click.option(
    "--scorer",
    default="deepspeech-0.7.3-models.scorer",
    type=click.Path(exists=True, resolve_path=True),
)

# manage my library of podcasts
def main(dirname, ext, model, scorer):
    print("main")
    model = expanduser(model)
    scorer = expanduser(scorer)
    pattern = dirname + "/" + "*" + ext
    audiorate = "16000"

    print(pattern)
    for filename in glob.glob(pattern):
        print(filename)

        wavefile = filename + ".wav"

        convert_command = " ".join(
            [
                "ffmpeg",
                "-i",
                "'{}'".format(filename),
                "-ar",
                audiorate,
                "'{}'".format(wavefile),
            ]
        )
        if not os.path.isfile(wavefile):
            print(convert_command)
            r = delegator.run(convert_command)
            print(r.out)
        else:
            print("skipping wave conversion that exists")

        command = " ".join(
            [
                "deepspeech",
                "--model",
                model,
                "--scorer",
                scorer,
                "--audio",
                "'{}'".format(wavefile),
                # "--extended",
                "--json",
            ]
        )
        print(command)
        r = delegator.run(command)
        with open(filename + ".json", "w") as fo:
            print(r.out)
            fo.write(r.out)


if __name__ == "__main__":
    main()
@ -0,0 +1,62 @@
absl-py==0.9.0
addignore==1.2.7
appdirs==1.4.4
astor==0.8.1
astunparse==1.6.3
attrs==19.3.0
black==19.10b0
bokeh==1.4.0
cachetools==4.1.0
certifi==2020.4.5.2
chardet==3.0.4
click==7.1.2
deepspeech==0.7.3
deepspeech-gpu==0.7.3
delegator.py @ git+https://github.com/amitt001/delegator.py.git@194aa92543fbdbfbae0bcc24ca217819a7805da2
flask==1.1.2
gast==0.2.2
google-auth==1.16.1
google-auth-oauthlib==0.4.1
google-pasta==0.2.0
grpcio==1.29.0
h5py==2.10.0
idna==2.9
isort==4.3.21
Jinja2==2.11.2
Keras-Applications==1.0.8
Keras-Preprocessing==1.1.2
Markdown==3.2.2
MarkupSafe==1.1.1
numpy==1.17.3
oauthlib==3.1.0
opt-einsum==3.2.1
packaging==20.4
pathspec==0.8.0
pexpect==4.8.0
phonemizer==2.2
protobuf==3.12.2
ptyprocess==0.6.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pyparsing==2.4.7
PyYAML==5.3.1
regex==2020.6.7
requests==2.23.0
requests-oauthlib==1.3.0
rsa==4.0
scipy==1.4.1
six==1.15.0
soundfile==0.10.3.post1
tensorboard==2.1.1
tensorboard-plugin-wit==1.6.0.post3
tensorflow-estimator==2.1.0
tensorflow-gpu==2.2.0
tensorflow-gpu-estimator==2.2.0
termcolor==1.1.0
toml==0.10.1
tqdm==4.46.1
tts==0.0.2+f320992
typed-ast==1.4.1
urllib3==1.25.9
Werkzeug==1.0.1
wrapt==1.12.1
@ -0,0 +1,5 @@
$env:Path += ";C:\Users\jmike\Downloads\cudnn-10.0-windows10-x64-v7.5.1.10\cuda\bin"
$env:Path += ";$env:userprofile\Downloads\TensorRT-5.1.5.0.Windows10.x86_64.cuda-10.0.cudnn7.5\TensorRT-5.1.5.0\lib"
$env:Path += ";C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\bin"
$env:Path += ";c:\tools\msys64\usr\bin\"
$env:Path += ";C:\Program Files (x86)\Dr. Memory\bin\"
@ -0,0 +1 @@
deepspeech --model C:\Users\jmike\Documents\GitHub\DeepSpeech\deepspeech-0.7.3-models.pbmm --scorer C:\Users\jmike\Documents\GitHub\DeepSpeech\deepspeech-0.7.3-models.scorer --audio C:\Users\jmike\Documents\Audacity\clip.wav --json
@ -0,0 +1,10 @@
import tensorflow as tf

print("hello")
av = tf.test.is_gpu_available()
print(av)

av2 = tf.config.list_physical_devices('GPU')
print(av2)

# [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]