2019-10-02 17:37:47 +03:00
Project DeepSpeech
==================
.. image :: https://readthedocs.org/projects/deepspeech/badge/?version=latest
:target: http://deepspeech.readthedocs.io/?badge=latest
:alt: Documentation
2019-11-03 06:54:18 +03:00
.. image :: https://community-tc.services.mozilla.com/api/github/v1/repository/mozilla/DeepSpeech/master/badge.svg
:target: https://community-tc.services.mozilla.com/api/github/v1/repository/mozilla/DeepSpeech/master/latest
2019-10-02 17:37:47 +03:00
:alt: Task Status
DeepSpeech is an open source Speech-To-Text engine, using a model trained by machine learning techniques based on `Baidu's Deep Speech research paper <https://arxiv.org/abs/1412.5567> `_ . Project DeepSpeech uses Google's `TensorFlow <https://www.tensorflow.org/> `_ to make the implementation easier.
2020-03-12 19:31:42 +03:00
**NOTE:** This documentation applies to the **MASTER version** of DeepSpeech only. **Documentation for the latest stable version** is published on `deepspeech.readthedocs.io <http://deepspeech.readthedocs.io/?badge=latest> `_ .
2019-12-02 19:34:42 +03:00
2019-10-02 17:37:47 +03:00
To install and use deepspeech all you have to do is:
.. code-block :: bash
# Create and activate a virtualenv
virtualenv -p python3 $HOME/tmp/deepspeech-venv/
source $HOME/tmp/deepspeech-venv/bin/activate
# Install DeepSpeech
pip3 install deepspeech
# Download pre-trained English model and extract
2020-01-10 17:02:47 +03:00
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.6.1/deepspeech-0.6.1-models.tar.gz
tar xvf deepspeech-0.6.1-models.tar.gz
2019-10-02 17:37:47 +03:00
# Download example audio files
2020-01-10 17:02:47 +03:00
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.6.1/audio-0.6.1.tar.gz
tar xvf audio-0.6.1.tar.gz
2019-10-02 17:37:47 +03:00
# Transcribe an audio file
2020-01-22 17:18:17 +03:00
deepspeech --model deepspeech-0.6.1-models/output_graph.pbmm --scorer deepspeech-0.6.1-models/kenlm.scorer --audio audio/2830-3980-0043.wav
2019-10-02 17:37:47 +03:00
2020-01-08 12:02:46 +03:00
A pre-trained English model is available for use and can be downloaded using `the instructions below <doc/USING.rst#using-a-pre-trained-model> `_ . A package with some example audio files is available for download in our `release notes <https://github.com/mozilla/DeepSpeech/releases/latest> `_ .
2019-10-02 17:37:47 +03:00
Quicker inference can be performed using a supported NVIDIA GPU on Linux. See the `release notes <https://github.com/mozilla/DeepSpeech/releases/latest> `_ to find which GPUs are supported. To run `` deepspeech `` on a GPU, install the GPU specific package:
.. code-block :: bash
# Create and activate a virtualenv
virtualenv -p python3 $HOME/tmp/deepspeech-gpu-venv/
source $HOME/tmp/deepspeech-gpu-venv/bin/activate
# Install DeepSpeech CUDA enabled package
pip3 install deepspeech-gpu
# Transcribe an audio file.
2020-01-22 17:18:17 +03:00
deepspeech --model deepspeech-0.6.1-models/output_graph.pbmm --scorer deepspeech-0.6.1-models/kenlm.scorer --audio audio/2830-3980-0043.wav
2019-10-02 17:37:47 +03:00
2020-01-08 12:02:46 +03:00
Please ensure you have the required `CUDA dependencies <doc/USING.rst#cuda-dependency> `_ .
2019-10-02 17:37:47 +03:00
2019-10-10 23:04:16 +03:00
See the output of `` deepspeech -h `` for more information on the use of `` deepspeech `` . (If you experience problems running `` deepspeech ` ` \ , please check ` required runtime dependencies <native_client/README.rst#required-dependencies> ` _\ ).
2019-10-02 17:37:47 +03:00
----
**Table of Contents**
2019-11-07 21:40:00 +03:00
2020-01-08 12:02:46 +03:00
* `Using a Pre-trained Model <doc/USING.rst#using-a-pre-trained-model> `_
2019-11-07 21:40:00 +03:00
2020-01-08 12:02:46 +03:00
* `CUDA dependency <doc/USING.rst#cuda-dependency> `_
* `Getting the pre-trained model <doc/USING.rst#getting-the-pre-trained-model> `_
* `Model compatibility <doc/USING.rst#model-compatibility> `_
* `Using the Python package <doc/USING.rst#using-the-python-package> `_
* `Using the Node.JS package <doc/USING.rst#using-the-nodejs-package> `_
* `Using the Command Line client <doc/USING.rst#using-the-command-line-client> `_
* `Installing bindings from source <doc/USING.rst#installing-bindings-from-source> `_
* `Third party bindings <doc/USING.rst#third-party-bindings> `_
2019-11-07 21:40:00 +03:00
2019-10-02 17:37:47 +03:00
2019-11-27 15:42:19 +03:00
* `Trying out DeepSpeech with examples <examples/README.rst> `_
2019-11-07 18:34:21 +03:00
2020-01-08 12:02:46 +03:00
* `Training your own Model <doc/TRAINING.rst#training-your-own-model> `_
* `Prerequisites for training a model <doc/TRAINING.rst#prerequisites-for-training-a-model> `_
* `Getting the training code <doc/TRAINING.rst#getting-the-training-code> `_
* `Installing Python dependencies <doc/TRAINING.rst#installing-python-dependencies> `_
* `Recommendations <doc/TRAINING.rst#recommendations> `_
* `Common Voice training data <doc/TRAINING.rst#common-voice-training-data> `_
* `Training a model <doc/TRAINING.rst#training-a-model> `_
* `Checkpointing <doc/TRAINING.rst#checkpointing> `_
* `Exporting a model for inference <doc/TRAINING.rst#exporting-a-model-for-inference> `_
* `Exporting a model for TFLite <doc/TRAINING.rst#exporting-a-model-for-tflite> `_
* `Making a mmap-able model for inference <doc/TRAINING.rst#making-a-mmap-able-model-for-inference> `_
* `Continuing training from a release model <doc/TRAINING.rst#continuing-training-from-a-release-model> `_
* `Training with Augmentation <doc/TRAINING.rst#training-with-augmentation> `_
2019-10-02 17:37:47 +03:00
* `Contribution guidelines <CONTRIBUTING.rst> `_
* `Contact/Getting Help <SUPPORT.rst> `_