Alexandre Lissy 621f69674e Rename DeepSpeech to Mozilla Voice STT		2020-08-10 17:16:40 +02:00
README.MD	Rename DeepSpeech to Mozilla Voice STT	2020-08-10 17:16:40 +02:00
index.js	Rename DeepSpeech to Mozilla Voice STT	2020-08-10 17:16:40 +02:00
package.json	Rename DeepSpeech to Mozilla Voice STT	2020-08-10 17:16:40 +02:00
test.sh	Update model file name in tests	2020-07-30 23:17:30 +02:00

README.MD

FFmpeg VAD Streaming

Streaming inference from an arbitrary source (FFmpeg input) to Mozilla Voice STT, using VAD (voice activity detection). This is a fairly simple example demonstrating the Mozilla Voice STT streaming API in Node.js.

This example was tested successfully with a mobile phone streaming a live feed to an RTMP server (nginx-rtmp); this script then consumed that stream for near-real-time speech recognition.
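The FFmpeg step described above can be sketched as follows. This is a minimal illustration, not the code from index.js: the helper names (`buildFfmpegArgs`, `frameBytes`, `openAudioStream`) are hypothetical, and the 16 kHz mono 16-bit PCM format is an assumption about what the model expects. The idea is that FFmpeg decodes any input it understands (a local file or an RTMP URL) to raw PCM on stdout, which can then be cut into fixed-size frames for voice activity detection.

```javascript
// Sketch only: helper names are hypothetical, not taken from index.js.
const { spawn } = require('child_process');

const SAMPLE_RATE = 16000;   // assumed model sample rate in Hz
const BYTES_PER_SAMPLE = 2;  // signed 16-bit PCM
const CHANNELS = 1;          // mono

// Build FFmpeg arguments that decode `input` (file path or RTMP URL)
// to raw little-endian 16-bit PCM on stdout.
function buildFfmpegArgs(input) {
  return [
    '-hide_banner',
    '-nostats',
    '-loglevel', 'fatal',
    '-i', input,
    '-vn',                      // ignore any video stream
    '-acodec', 'pcm_s16le',
    '-ac', String(CHANNELS),
    '-ar', String(SAMPLE_RATE),
    '-f', 's16le',
    'pipe:1',                   // write to stdout
  ];
}

// Number of bytes in one audio frame lasting `ms` milliseconds.
function frameBytes(ms) {
  return (SAMPLE_RATE * ms / 1000) * BYTES_PER_SAMPLE * CHANNELS;
}

// Start FFmpeg; the caller reads PCM chunks from the returned stdout stream.
function openAudioStream(input) {
  const ffmpeg = spawn('ffmpeg', buildFfmpegArgs(input));
  return ffmpeg.stdout;
}
```

Under these assumptions, a 20 ms frame is `frameBytes(20)` bytes long, so a reader of `openAudioStream(...)` would buffer incoming chunks and hand them on in slices of that size.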

Installation

npm install

FFmpeg must also be installed, for example on Debian or Ubuntu:

sudo apt-get install ffmpeg

Usage

Here is an example for a local audio file:

node ./index.js --audio <AUDIO_FILE> \
                --model $HOME/models/output_graph.pbmm

Here is an example for a remote RTMP stream:

node ./index.js --audio rtmp://<IP>:1935/live/teststream \
                --model $HOME/models/output_graph.pbmm

Examples

Real-time streaming inference with Mozilla Voice STT's example audio (audio-0.4.1.tar.gz):

node ./index.js --audio $HOME/audio/2830-3980-0043.wav \
                --scorer $HOME/models/kenlm.scorer \
                --model $HOME/models/output_graph.pbmm

node ./index.js --audio $HOME/audio/4507-16021-0012.wav \
                --scorer $HOME/models/kenlm.scorer \
                --model $HOME/models/output_graph.pbmm

node ./index.js --audio $HOME/audio/8455-210777-0068.wav \
                --scorer $HOME/models/kenlm.scorer \
                --model $HOME/models/output_graph.pbmm

Real-time streaming inference in combination with an RTMP server:

node ./index.js --audio rtmp://<HOST>/<APP>/<KEY> \
                --scorer $HOME/models/kenlm.scorer \
                --model $HOME/models/output_graph.pbmm

Notes

To get the best results for your own scenario, it may help to adjust the VAD_MODE and DEBOUNCE_TIME parameters.
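To illustrate why a debounce time matters, here is a hypothetical sketch of debouncing per-frame VAD decisions. It is not the implementation in index.js: the function `segmentUtterances`, the frame objects, and the default value of 20 ms are all assumptions made for the example. The point is that a short run of silent frames inside speech should not end the utterance; only silence longer than the debounce time does.

```javascript
// Hypothetical sketch; not the code from index.js.
const DEBOUNCE_TIME = 20; // assumed value: ms of silence tolerated inside an utterance

// frames: array of { voice: boolean, ms: number } in stream order.
// Returns [firstFrameIndex, lastVoicedFrameIndex] pairs, one per utterance.
function segmentUtterances(frames, debounceMs = DEBOUNCE_TIME) {
  const segments = [];
  let start = -1;     // first frame of the current utterance, -1 while idle
  let lastVoice = -1; // most recent voiced frame
  let silenceMs = 0;  // silence accumulated since lastVoice

  frames.forEach((frame, i) => {
    if (frame.voice) {
      if (start === -1) start = i;
      lastVoice = i;
      silenceMs = 0;
    } else if (start !== -1) {
      silenceMs += frame.ms;
      if (silenceMs > debounceMs) {
        // Silence exceeded the debounce window: close the utterance.
        segments.push([start, lastVoice]);
        start = -1;
      }
    }
  });

  // Close an utterance that is still open when the stream ends.
  if (start !== -1) segments.push([start, lastVoice]);
  return segments;
}
```

In this sketch, a larger debounce value merges speech separated by short pauses into one utterance, while a smaller value splits more eagerly. Which setting works best depends on the speaker and the noise level of the stream.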