Add instruction to install `sox` for CV data, add section on custom docker image

2021-02-28 14:17:07 +11:00 · 2021-02-28 14:17:07 +11:00 · c8f7cf8521
--- a/DATA_FORMATTING.md
+++ b/DATA_FORMATTING.md
@ -43,18 +43,26 @@ If you are using data from Common Voice for training a model, you will need to p

 In this example we will prepare the Indonesian dataset for training, but you can use any language from Common Voice that you prefer. We've chosen Indonesian as it has the same [orthographic alphabet](ALPHABET.md) as English, which means we don't have to use a different `alphabet.txt` file for training; we can use the default.

-This example assumes you have already [set up a Docker environment with an attached volume for training](TRAINING.md).
+---
+This example assumes you have already set up a Docker [environment](ENVIRONMENT.md) for [training](TRAINING.md). If you have not yet set up your Docker environment, we suggest you pause here and do this first.
+---

-First, [download the dataset from Common Voice](https://commonvoice.mozilla.org/en/datasets). Extract the archive into your `deepspeech-data` Docker volume. Make sure that you place the files into the `_data` directory, otherwise they will not be available from within your Docker container.
+First, [download the dataset from Common Voice](https://commonvoice.mozilla.org/en/datasets) and extract the archive into your `deepspeech-data` directory. Then start your DeepSpeech Docker container with that directory attached as a _bind mount_ (this is covered in the [environment](ENVIRONMENT.md) section), so that the container can read the extracted files.

-Start your DeepSpeech Docker container with the `deepspeech-data` volume attached. Your CV corpus data should be available from within the Docker container.
+Your CV corpus data should be available from within the Docker container.

 ```
- root@3de3afbe5d6f:/DeepSpeech# ls  persistent-data/cv-corpus-6.1-2020-12-11/id/
+ root@3de3afbe5d6f:/DeepSpeech# ls  deepspeech-data/cv-corpus-6.1-2020-12-11/id/
 clips    invalidated.tsv  reported.tsv  train.tsv
 dev.tsv  other.tsv        test.tsv      validated.tsv
 ```

+The `mozilla/deepspeech-train:v0.9.3` Docker image _does not_ come with `sox`, a package used for processing Common Voice data, so we need to install it first.
+
+```
+root@4b39be3b0ffc:/DeepSpeech# apt-get -y update && apt-get install -y sox
+```
+
 Next, we will run the Common Voice importer that ships with DeepSpeech.

 ```
--- a/ENVIRONMENT.md
+++ b/ENVIRONMENT.md
@ -15,7 +15,8 @@
  * [Pulling down a pre-built DeepSpeech Docker image](#pulling-down-a-pre-built-deepspeech-docker-image)
    + [Testing the image by creating a container and running a script](#testing-the-image-by-creating-a-container-and-running-a-script)
  * [Setting up a bind mount to store persistent data](#setting-up-a-bind-mount-to-store-persistent-data)
-
+  * [Extending the base `deepspeech-training` Docker image for your needs](#extending-the-base-deepspeech-training-docker-image-for-your-needs)
+  
 This section of the Playbook assumes you are comfortable installing DeepSpeech and using it with a pre-trained model, and that you are comfortable setting up a Python _virtual environment_.

 Here, we provide information on setting up a Docker environment for training your own speech recognition model using DeepSpeech. We also cover dependencies Docker has for NVIDIA GPUs, so that you can use your GPU(s) for training a model.
@ -298,7 +299,27 @@ root@e964b1e5a60c:/DeepSpeech# ls | grep deepspeech-data
 deepspeech-data
 ```

-You are now ready to begin training your model.
+You are now ready to begin [training](TRAINING.md) your model.
+
+## Extending the base `deepspeech-training` Docker image for your needs
+
+As you become more comfortable training speech recognition models with DeepSpeech, you may wish to extend the base Docker image. You can do this using the `FROM` instruction in a `Dockerfile`, for example:
+
+```
+# Custom Dockerfile for training models using DeepSpeech
+
+# Start from the DeepSpeech v0.9.3 training image
+FROM mozilla/deepspeech-train:v0.9.3
+
+# Install nano editor
+RUN apt-get -y update && apt-get install -y nano
+
+# Install sox for inference and for processing Common Voice data
+RUN apt-get -y update && apt-get install -y sox
+
+```
+
+You can then use `docker build` with this `Dockerfile` to build your own custom Docker image.

 ---