DeepSpeech-examples/ffmpeg_vad_streaming
Hadran Ayoub 49b9832187 Bump version from 0.8.0 to 0.9.3 in branch r0.9
2021-01-06 18:48:06 +01:00
README.MD Revert "Merge pull request #73 from lissyx/master-stt-rename" 2020-08-25 16:49:29 +02:00
index.js Revert "Merge pull request #73 from lissyx/master-stt-rename" 2020-08-25 16:49:29 +02:00
package.json Bump version from 0.8.0 to 0.9.3 in branch r0.9 2021-01-06 18:48:06 +01:00
test.sh Bump version from 0.8.0 to 0.9.3 in branch r0.9 2021-01-06 18:48:06 +01:00

README.MD

FFmpeg VAD Streaming

Streaming inference from an arbitrary source (FFmpeg input) to DeepSpeech, using VAD (voice activity detection). This is a fairly simple example demonstrating the DeepSpeech streaming API in Node.js.

This example was successfully tested with a mobile phone streaming a live feed to an RTMP server (nginx-rtmp), which this script then used as its input for near-real-time speech recognition.
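To make the "FFmpeg input" idea concrete, here is a minimal sketch of how a Node.js script could ask FFmpeg to decode any supported source (a local file or a stream URL) into raw 16-bit mono PCM on stdout. It is not copied from index.js: the helper names `buildFfmpegArgs` and `startDecoder` are hypothetical, and the 16 kHz sample rate is an assumption based on what the released DeepSpeech English models expect.

```javascript
const { spawn } = require('child_process');

// Assumed sample rate; adjust it if your model was trained on a different rate.
const AUDIO_SAMPLE_RATE = 16000;

// Hypothetical helper: build FFmpeg arguments that decode `source` (a file path
// or stream URL) into raw signed 16-bit little-endian mono PCM on stdout.
function buildFfmpegArgs(source, sampleRate = AUDIO_SAMPLE_RATE) {
  return [
    '-hide_banner',
    '-loglevel', 'error',      // only report errors
    '-i', source,              // input: anything FFmpeg can open
    '-vn',                     // drop any video stream
    '-acodec', 'pcm_s16le',    // 16-bit little-endian PCM samples
    '-ac', '1',                // downmix to mono
    '-ar', String(sampleRate), // resample to the target rate
    '-f', 's16le',             // raw output, no container header
    'pipe:1',                  // write to stdout
  ];
}

// Hypothetical helper: start FFmpeg. The child's stdout is a readable stream
// of PCM bytes that can be piped into a VAD stage and then into DeepSpeech.
function startDecoder(source) {
  return spawn('ffmpeg', buildFfmpegArgs(source), {
    stdio: ['ignore', 'pipe', 'inherit'],
  });
}

// Example (requires FFmpeg on the PATH):
//   const ffmpeg = startDecoder('rtmp://<IP>:1935/live/teststream');
//   ffmpeg.stdout.on('data', (chunk) => { /* feed chunk to the VAD stage */ });
```

Because FFmpeg does the decoding and resampling, the same code path can serve both local files and live streams; only the `source` string changes.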

Installation

npm install

FFmpeg must also be installed, for example on Debian or Ubuntu:

sudo apt-get install ffmpeg

Usage

Here is an example for a local audio file:

node ./index.js --audio <AUDIO_FILE> \
                --model $HOME/models/output_graph.pbmm

Here is an example for a remote RTMP stream:

node ./index.js --audio rtmp://<IP>:1935/live/teststream \
                --model $HOME/models/output_graph.pbmm

Examples

Real-time streaming inference with DeepSpeech's example audio (audio-0.4.1.tar.gz):

node ./index.js --audio $HOME/audio/2830-3980-0043.wav \
                --scorer $HOME/models/kenlm.scorer \
                --model $HOME/models/output_graph.pbmm
node ./index.js --audio $HOME/audio/4507-16021-0012.wav \
                --scorer $HOME/models/kenlm.scorer \
                --model $HOME/models/output_graph.pbmm
node ./index.js --audio $HOME/audio/8455-210777-0068.wav \
                --scorer $HOME/models/kenlm.scorer \
                --model $HOME/models/output_graph.pbmm

Real-time streaming inference in combination with an RTMP server:

node ./index.js --audio rtmp://<HOST>/<APP>/<KEY> \
                --scorer $HOME/models/kenlm.scorer \
                --model $HOME/models/output_graph.pbmm

Notes

To get the best results for your own scenario, it may help to adjust the VAD_MODE and DEBOUNCE_TIME parameters.
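As a rough intuition for why the debounce setting matters: a VAD labels short audio frames as voice or silence, and a debounce window keeps the "speech active" state alive across short pauses so that one utterance is not split into many. The sketch below only illustrates that idea; it is not the code in index.js or in the VAD library it uses. The function `segmentSpeech`, the 20 ms frame length, and the sample frames are all hypothetical.

```javascript
// Illustrative only: group per-frame voice/silence decisions into segments.
// A segment is closed once the silence since the last voiced frame reaches
// `debounceMs`. Returns [startMs, endMs] pairs.
function segmentSpeech(frames, frameMs, debounceMs) {
  const segments = [];
  let start = null;     // start time of the open segment, or null if none
  let lastVoiceEnd = 0; // end time of the most recent voiced frame

  frames.forEach((isVoice, i) => {
    const t = i * frameMs; // start time of this frame
    if (isVoice) {
      if (start === null) start = t;
      lastVoiceEnd = t + frameMs;
    } else if (start !== null && t + frameMs - lastVoiceEnd >= debounceMs) {
      // Silence has lasted long enough: close the segment.
      segments.push([start, lastVoiceEnd]);
      start = null;
    }
  });

  // Close a segment that is still open when the input ends.
  if (start !== null) segments.push([start, lastVoiceEnd]);
  return segments;
}

// Hypothetical input: two voiced frames, a one-frame pause, one voiced frame,
// then trailing silence (20 ms per frame).
const frames = [true, true, false, true, false, false, false];

console.log(segmentSpeech(frames, 20, 20)); // [ [ 0, 40 ], [ 60, 80 ] ]
console.log(segmentSpeech(frames, 20, 60)); // [ [ 0, 80 ] ]
```

With the short window, the one-frame pause splits the audio into two utterances; with the longer window, it is bridged into one. In practice, a shorter debounce time tends to give faster but more fragmented results, while a longer one yields fewer, longer utterances at the cost of latency. The VAD mode is the complementary knob, affecting how readily a frame is classified as voice in the first place.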