This commit is contained in:
jmikedupont2 2020-06-17 06:52:51 -04:00 committed by GitHub
Parent c2cc8c3112
Commit 25be62ab21
No known key found for this signature
GPG key ID: 4AEE18F83AFDEB23
6 changed files with 318 additions and 0 deletions

batch_processing/Readme.md Normal file

@ -0,0 +1,157 @@
# Setup for Windows
Setting up on Windows was quite a challenge, mainly getting the versions right.
See `setup.ps` for the loading of the paths.
```
Windows 10 Pro
Version: 2004
Os Build : 19041.329
```
Here are the versions I have installed via Chocolatey:
```
Chocolatey v0.10.15
7zip v19.0
7zip.install v19.0
anaconda3 v2020.02
audacity v2.4.1
az.powershell v4.2.0
azshell v0.2.2
azure-cli v2.7.0
azurepowershell v6.9.0
bazel v3.2.0
blender v2.83.0
chocolatey v0.10.15
chocolatey-core.extension v1.3.5.1
chocolatey-dotnetfx.extension v1.0.1
chocolatey-fastanswers.extension v0.0.2
chocolatey-visualstudio.extension v1.8.1
chocolatey-windowsupdate.extension v1.0.4
docker-desktop v2.3.0.3
DotNet4.5.1 v4.5.1.20140606
DotNet4.5.2 v4.5.2.20140902
dotnetfx v4.8.0.20190930
ffmpeg v4.2.3
git v2.26.2
git.install
google-chrome-x64 v47.0.2526.81
GoogleChrome v83.0.4103.97
grep v2.1032
KB2919355 v1.0.20160915
KB2919442 v1.0.20160915
KB2999226 v1.0.20181019
KB3033929 v1.0.5
KB3035131 v1.0.3
KB3118401 v1.0.4
microsoft-windows-terminal v1.0.1401.0
msys2 v20200602.0.0
NTop.Portable v0.3.4
powershell-core v7.0.1
procexp v16.32
python v3.8.3
python3 v3.8.3
sox.portable v14.4.1
vcredist140 v14.26.28720.3
vcredist2008 v9.0.30729.6163
vcredist2015 v14.0.24215.20170201
vcredist2017 v14.16.27033
visualstudio-installer v2.0.1
VisualStudio2013ExpressWeb v12.0.21005.20150920
visualstudio2019community v16.6.1.0
vscode v1.45.1
vscode.install v1.45.1
Wget v1.20.3.20190531
windows-sdk-10-version-2004-windbg v10.0.19041.0
wsl v1.0.1
Xming v6.9.0.31
```
The versions of software installed are:
* CUDA v10.0
  * from https://developer.nvidia.com/cuda-10.0-download-archive?target_os=Windows&target_arch=x86_64&target_version=10&target_type=exelocal
  * get the file https://developer.download.nvidia.com/compute/cuda/10.0/secure/Prod/local_installers/cuda_10.0.130_411.31_win10.exe
* cudnn-10.0-windows10-x64-v7.5.1.10
  * from https://developer.nvidia.com/rdp/cudnn-archive
  * `Download cuDNN v7.5.1 (April 22, 2019), for CUDA 10.0`
  * via https://developer.nvidia.com/rdp/cudnn-archive#a-collapse751-10
  * get the file https://developer.download.nvidia.com/compute/machine-learning/cudnn/secure/7.6.5.32/Production/10.0_20191031/cudnn-10.0-windows10-x64-v7.6.5.32.zip
* TensorRT-5.1.5.0.Windows10.x86_64.cuda-10.0.cudnn7.5
  * from https://developer.nvidia.com/nvidia-tensorrt-5x-download
  * https://developer.nvidia.com/nvidia-tensorrt-5x-download#trt51ga
  * via `Windows10 and CUDA 10.0 zip package`
  * get the file https://developer.nvidia.com/compute/machine-learning/tensorrt/5.1/ga/zips/TensorRT-5.1.5.0.Windows10.x86_64.cuda-10.0.cudnn7.5.zip
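
Before running DeepSpeech, it can help to confirm that the CUDA and cuDNN DLLs are actually reachable through `PATH`. The following is a minimal sketch and not part of this repository: the helper name `find_on_path` is hypothetical, and the DLL names are the ones TensorFlow reports loading in the log output further down in this README.

```python
import os

# DLL names as reported by TensorFlow in the log output in this README.
REQUIRED_DLLS = [
    "nvcuda.dll",
    "cudart64_100.dll",
    "cublas64_100.dll",
    "cufft64_100.dll",
    "curand64_100.dll",
    "cusolver64_100.dll",
    "cusparse64_100.dll",
    "cudnn64_7.dll",
]


def find_on_path(filename, path=None):
    """Return the first directory on PATH that contains filename, or None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        directory = directory.strip('"')  # tolerate quoted PATH entries
        if directory and os.path.isfile(os.path.join(directory, filename)):
            return directory
    return None


if __name__ == "__main__":
    for dll in REQUIRED_DLLS:
        location = find_on_path(dll)
        print("{:22} {}".format(dll, location or "NOT FOUND on PATH"))
```

Any DLL reported as not found suggests that the matching directory is still missing from `PATH`.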
I am using the exact package versions pinned in `requirements.txt`:
`pip3 install -r requirements.txt`
Here is the output of running `test.ps1`:
```
PS C:\Users\jmike\Documents\GitHub\DeepSpeech-examples\batch_processing> . .\test.ps1
2020-06-14 11:05:01.015450: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
Loading model from file C:\Users\jmike\Documents\GitHub\DeepSpeech\deepspeech-0.7.3-models.pbmm
TensorFlow: v1.15.0-24-gceb46aae58
DeepSpeech: v0.7.3-0-g88584941
2020-06-14 11:05:01.237478: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-06-14 11:05:01.244057: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-06-14 11:05:01.466608: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce MX250 major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:01:00.0
2020-06-14 11:05:01.466806: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2020-06-14 11:05:01.473468: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2020-06-14 11:05:01.476879: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll
2020-06-14 11:05:01.478672: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll
2020-06-14 11:05:01.482925: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll
2020-06-14 11:05:01.485963: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll
2020-06-14 11:05:01.498053: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-06-14 11:05:01.498710: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-06-14 11:05:02.066853: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-06-14 11:05:02.067030: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
2020-06-14 11:05:02.068133: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
2020-06-14 11:05:02.073298: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1410 MB memory) -> physical GPU (device: 0, name: GeForce MX250, pci bus id: 0000:01:00.0, compute capability: 6.1)
Loaded model in 0.941s.
Loading scorer from files C:\Users\jmike\Documents\GitHub\DeepSpeech\deepspeech-0.7.3-models.scorer
Loaded scorer in 0.0143s.
Warning: original sample rate (44100) is different than 16000hz. Resampling might produce erratic speech recognition.
Running inference.
2020-06-14 11:05:02.382781: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
```
Running on the GPU takes half the time of running on the CPU and gives good results.
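
The log above includes a warning that the input was sampled at 44100 Hz rather than 16000 Hz. A file can be checked for this before it is passed to DeepSpeech. This is a minimal sketch, assuming uncompressed PCM WAV input that Python's standard `wave` module can read; the function names are hypothetical and not part of this repository.

```python
import wave

TARGET_RATE = 16000  # the rate named in the DeepSpeech warning


def wav_sample_rate(path):
    """Return the sample rate in Hz stored in the WAV file header."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate()


def needs_resampling(path, target_rate=TARGET_RATE):
    """Return True if the file's sample rate differs from target_rate."""
    return wav_sample_rate(path) != target_rate
```

Files for which `needs_resampling` returns `True` would trigger the warning shown in the log.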
# Driver command line
`./driver.py --model c:/Users/jmike/Documents/GitHub/DeepSpeech/deepspeech-0.7.3-models.pbmm --scorer c:/Users/jmike/Documents/GitHub/DeepSpeech/deepspeech-0.7.3-models.scorer --dirname c:/Users/jmike/Downloads/podcast/`
# Example
The driver then runs an individual command for each file, like:
`deepspeech --model C:\Users\jmike\Documents\GitHub\DeepSpeech\deepspeech-0.7.3-models.pbmm --scorer C:\Users\jmike\Documents\GitHub\DeepSpeech\deepspeech-0.7.3-models.scorer --audio 'C:\Users\jmike\Downloads\podcast\45374977-48000-2-24d9a365625bb.mp3.wav' --json`
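
The command above relies on quoting the audio path inside one command string. An alternative that avoids shell quoting altogether is to pass the arguments as a list to `subprocess.run`. This is a minimal sketch of that alternative, not what `driver.py` does (it uses `delegator` with a joined string); the function names are hypothetical, and only the flags shown in the command above are used.

```python
import subprocess


def deepspeech_command(model, scorer, audio):
    """Build the deepspeech invocation as an argument list."""
    return [
        "deepspeech",
        "--model", model,
        "--scorer", scorer,
        "--audio", audio,
        "--json",
    ]


def transcribe(model, scorer, audio):
    """Run deepspeech on one file and return its standard output as text."""
    result = subprocess.run(
        deepspeech_command(model, scorer, audio),
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Because each argument is its own list element, paths containing spaces need no extra quotes.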
Websites referenced:
https://chocolatey.org/packages/cuda
https://deepspeech.readthedocs.io/en/v0.7.3/?badge=latest
https://developer.nvidia.com/cuda-10.0-download-archive?target_os=Windows&target_arch=x86_64&target_version=10
https://discourse.mozilla.org/t/query-regarding-speed-of-training-and-issues-with-convergence/41874
https://discourse.mozilla.org/t/right-cuda-version-for-using-deepspeech-gpu/41927/12
https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html#download-windows
https://github.com/MichalMazurek/python-poetry/blob/d3f6df6a6c2587d7a6034719716de257917c4b0f/dockerfiles.py
https://github.com/amitt001/delegator.py
https://github.com/tensorflow/tensorflow/issues/25807
https://github.com/tensorflow/tensorflow/issues/28223
https://github.com/tensorflow/tensorflow/issues/5968
https://hacks.mozilla.org/2019/12/deepspeech-0-6-mozillas-speech-to-text-engine/
https://palletsprojects.com/p/click/
https://www.howtoforge.com/tutorial/ffmpeg-audio-conversion/
https://www.joe0.com/2019/10/19/how-resolve-tensorflow-2-0-error-could-not-load-dynamic-library-cudart64_100-dll-dlerror-cudart64_100-dll-not-found/
https://www.programcreek.com/python/example/88033/click.Path


@ -0,0 +1,83 @@
import glob
import json
import os
from os.path import expanduser

import click
import delegator


# First loop over the files and convert them to WAV.
# Record things at 16000 Hz in the future, or you get this:
# Warning: original sample rate (44100) is different than 16000hz. Resampling might produce erratic speech recognition.
@click.command()
@click.option(
    "--dirname", required=True, type=click.Path(exists=True, resolve_path=True)
)
@click.option("--ext", default=".mp3")
@click.option(
    "--model",
    default="deepspeech-0.7.3-models.pbmm",
    type=click.Path(exists=True, resolve_path=True),
)
@click.option(
    "--scorer",
    default="deepspeech-0.7.3-models.scorer",
    type=click.Path(exists=True, resolve_path=True),
)
# manage my library of podcasts
def main(dirname, ext, model, scorer):
    print("main")
    model = expanduser(model)
    scorer = expanduser(scorer)
    # Match every file in dirname that has the requested extension
    pattern = dirname + "/" + "*" + ext
    audiorate = "16000"
    print(pattern)
    for filename in glob.glob(pattern):
        print(filename)
        wavefile = filename + ".wav"
        # Resample to 16000 Hz while converting to WAV
        convert_command = " ".join(
            [
                "ffmpeg",
                "-i",
                "'{}'".format(filename),
                "-ar",
                audiorate,
                "'{}'".format(wavefile),
            ]
        )
        if not os.path.isfile(wavefile):
            print(convert_command)
            r = delegator.run(convert_command)
            print(r.out)
        else:
            print("skipping wave conversion that exists")
        command = " ".join(
            [
                "deepspeech",
                "--model",
                model,
                "--scorer",
                scorer,
                "--audio",
                "'{}'".format(wavefile),
                # "--extended",
                "--json",
            ]
        )
        print(command)
        r = delegator.run(command)
        # Save the JSON transcript next to the source file
        with open(filename + ".json", "w") as fo:
            print(r.out)
            fo.write(r.out)


if __name__ == "__main__":
    main()


@ -0,0 +1,62 @@
absl-py==0.9.0
addignore==1.2.7
appdirs==1.4.4
astor==0.8.1
astunparse==1.6.3
attrs==19.3.0
black==19.10b0
bokeh==1.4.0
cachetools==4.1.0
certifi==2020.4.5.2
chardet==3.0.4
click==7.1.2
deepspeech==0.7.3
deepspeech-gpu==0.7.3
delegator.py @ git+https://github.com/amitt001/delegator.py.git@194aa92543fbdbfbae0bcc24ca217819a7805da2
flask==1.1.2
gast==0.2.2
google-auth==1.16.1
google-auth-oauthlib==0.4.1
google-pasta==0.2.0
grpcio==1.29.0
h5py==2.10.0
idna==2.9
isort==4.3.21
Jinja2==2.11.2
Keras-Applications==1.0.8
Keras-Preprocessing==1.1.2
Markdown==3.2.2
MarkupSafe==1.1.1
numpy==1.17.3
oauthlib==3.1.0
opt-einsum==3.2.1
packaging==20.4
pathspec==0.8.0
pexpect==4.8.0
phonemizer==2.2
protobuf==3.12.2
ptyprocess==0.6.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pyparsing==2.4.7
PyYAML==5.3.1
regex==2020.6.7
requests==2.23.0
requests-oauthlib==1.3.0
rsa==4.0
scipy==1.4.1
six==1.15.0
soundfile==0.10.3.post1
tensorboard==2.1.1
tensorboard-plugin-wit==1.6.0.post3
tensorflow-estimator==2.1.0
tensorflow-gpu==2.2.0
tensorflow-gpu-estimator==2.2.0
termcolor==1.1.0
toml==0.10.1
tqdm==4.46.1
tts==0.0.2+f320992
typed-ast==1.4.1
urllib3==1.25.9
Werkzeug==1.0.1
wrapt==1.12.1


@ -0,0 +1,5 @@
$env:Path += ";$env:userprofile\Downloads\cudnn-10.0-windows10-x64-v7.5.1.10\cuda\bin"
$env:Path += ";$env:userprofile\Downloads\TensorRT-5.1.5.0.Windows10.x86_64.cuda-10.0.cudnn7.5\TensorRT-5.1.5.0\lib"
$env:Path += ";C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\bin"
$env:Path += ";c:\tools\msys64\usr\bin\"
$env:Path += ";C:\Program Files (x86)\Dr. Memory\bin\"


@ -0,0 +1 @@
deepspeech --model C:\Users\jmike\Documents\GitHub\DeepSpeech\deepspeech-0.7.3-models.pbmm --scorer C:\Users\jmike\Documents\GitHub\DeepSpeech\deepspeech-0.7.3-models.scorer --audio C:\Users\jmike\Documents\Audacity\clip.wav --json


@ -0,0 +1,10 @@
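import tensorflow as tf

print("hello")
# True if TensorFlow can use a GPU (deprecated in TF 2.x in favor of list_physical_devices)
av = tf.test.is_gpu_available()
print(av)
# List the GPUs visible to TensorFlow
av2 = tf.config.list_physical_devices('GPU')
print(av2)
# [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]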