Microphone VAD Streaming
========================

This example streams audio from a microphone to DeepSpeech, using voice activity detection (VAD). It is a fairly simple demonstration of the DeepSpeech streaming API in Python, and is also useful for quick, real-time testing of models and decoding parameters.
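
The VAD gating idea can be illustrated with a minimal, standard-library-only sketch. It is not the code in ``mic_vad_streaming.py``: the function ``collect_utterances``, its parameters, and the ``is_speech`` predicate are hypothetical names introduced here, and the predicate stands in for a real voice activity detector. Frames are buffered until most of the recent ones are voiced, then passed through until most of the recent ones are unvoiced, at which point ``None`` marks the end of the utterance.

.. code-block:: python

   from collections import deque


   def collect_utterances(frames, is_speech, padding_frames=10, ratio=0.75):
       """Yield voiced frames, then None at the end of each utterance.

       frames: iterable of audio frames (any objects).
       is_speech: hypothetical predicate standing in for a real VAD.
       """
       ring = deque(maxlen=padding_frames)  # recent (frame, voiced) pairs
       triggered = False
       for frame in frames:
           voiced = is_speech(frame)
           ring.append((frame, voiced))
           if not triggered:
               num_voiced = sum(1 for _, v in ring if v)
               if num_voiced > ratio * ring.maxlen:
                   triggered = True
                   # Flush buffered frames so the utterance start is not clipped.
                   for buffered, _ in ring:
                       yield buffered
                   ring.clear()
           else:
               yield frame
               num_unvoiced = sum(1 for _, v in ring if not v)
               if num_unvoiced > ratio * ring.maxlen:
                   triggered = False
                   yield None  # utterance boundary
                   ring.clear()

In a real pipeline, each yielded frame would presumably be fed to a streaming decoder and ``None`` would trigger finalizing the transcript. The padding window trades latency for robustness: a larger ``padding_frames`` tolerates longer pauses before an utterance is closed.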

Installation
------------

.. code-block:: bash

   pip install -r requirements.txt

The example uses PortAudio for microphone access, so on Linux you may need to install its header files to compile the ``pyaudio`` package:

.. code-block:: bash

   sudo apt install portaudio19-dev

Installation on macOS may fail because PortAudio is missing; install it with Homebrew:

.. code-block:: bash

   brew install portaudio
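
After installing, a quick check along the following lines can confirm that the key packages are importable. This is a sketch under assumptions: the helper ``missing_modules`` is hypothetical, and the module names ``deepspeech`` and ``pyaudio`` are assumed import names, so adjust the list to match ``requirements.txt``.

.. code-block:: python

   import importlib.util


   def missing_modules(names):
       """Return the top-level module names that cannot be found."""
       return [n for n in names if importlib.util.find_spec(n) is None]


   # Assumed import names; adjust to match requirements.txt.
   REQUIRED = ["deepspeech", "pyaudio"]

   if __name__ == "__main__":
       missing = missing_modules(REQUIRED)
       print("missing:", ", ".join(missing) if missing else "none")

If ``pyaudio`` is reported missing after a failed ``pip`` run, installing the PortAudio headers as described above and re-running ``pip install -r requirements.txt`` is the likely fix.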

Usage
-----

.. code-block::

   usage: mic_vad_streaming.py [-h] [-v VAD_AGGRESSIVENESS] [--nospinner]
                               [-w SAVEWAV] [-f FILE] -m MODEL [-s SCORER]
                               [-d DEVICE] [-r RATE]
   
   Stream from microphone to DeepSpeech using VAD
   
   optional arguments:
     -h, --help            show this help message and exit
     -v VAD_AGGRESSIVENESS, --vad_aggressiveness VAD_AGGRESSIVENESS
                           Set aggressiveness of VAD: an integer between 0 and 3,
                           0 being the least aggressive about filtering out non-
                           speech, 3 the most aggressive. Default: 3
     --nospinner           Disable spinner
     -w SAVEWAV, --savewav SAVEWAV
                           Save .wav files of utterances to given directory
     -f FILE, --file FILE  Read from .wav file instead of microphone
     -m MODEL, --model MODEL
                           Path to the model (protocol buffer binary file, or
                           entire directory containing all standard-named files
                           for model)
     -s SCORER, --scorer SCORER
                           Path to the external scorer file. Default:
                           kenlm.scorer
     -d DEVICE, --device DEVICE
                           Device input index (Int) as listed by
                           pyaudio.PyAudio.get_device_info_by_index(). If not
                           provided, falls back to PyAudio.get_default_device().
     -r RATE, --rate RATE  Input device sample rate. Default: 16000. Your device