diff --git a/examples/pytorch/FastCells/KWS-training/README.md b/examples/pytorch/FastCells/KWS-training/README.md
new file mode 100644
index 00000000..8cb52930
--- /dev/null
+++ b/examples/pytorch/FastCells/KWS-training/README.md
@@ -0,0 +1,99 @@
+## Training a Keyword-Spotting Model
+
+This example demonstrates how to train a FastGRNN-based keyword-spotting model on the Google speech commands dataset,
+compile it with the ELL compiler, and deploy it on an [STM BlueCoin](https://www.st.com/en/evaluation-tools/steval-bcnkt01v1.html).
+Follow the steps below to featurize data using ELL, train and export an ONNX model using the EdgeML library,
+and prepare a binary that provides prediction capability using the ELL library.
+
+### Install ELL
+Follow the installation instructions in the [ELL repository](https://github.com/microsoft/ELL).
+
+### Download Google speech commands dataset
+Download the [dataset](https://storage.cloud.google.com/download.tensorflow.org/data/speech_commands_v0.02.tar.gz) and extract it:
+```
+mkdir data_speech_commands_v2
+tar xvzf speech_commands_v0.02.tar.gz -C data_speech_commands_v2
+```
+
+### Export path to dataset and ELL
+```
+export ELL_ROOT=
+# e.g. export ELL_ROOT=/home/user/ELL
+export DATASET_PATH=
+# e.g. export DATASET_PATH=/mnt/../../data_speech_data_v2
+```
+
+### Make training list
+Use `-max n` to override the default limit on the number of samples taken from each category, including `background`. For a low false-positive rate, train with a large number of negative `background` examples, for example 50000 or 250000.
+```
+python3 $ELL_ROOT/tools/utilities/pythonlibs/audio/training/make_training_list.py -max 50000 --wav_files $DATASET_PATH
+```
+
+### Create an ELL featurizer
+```
+python $ELL_ROOT/tools/utilities/pythonlibs/audio/training/make_featurizer.py -ws 400 --nfft 512 --iir --log --log_delta 2.220446049250313e-16 --power_spec -fs 32
+```
+
+### Compile the ELL featurizer
+```
+python $ELL_ROOT/tools/wrap/wrap.py --model_file featurizer.ell --outdir compiled_featurizer --module_name mfcc
+cd compiled_featurizer && mkdir build && cd build && cmake .. && make && cd ../..
+```
+
+### Pre-process the dataset
+```
+python $ELL_ROOT/tools/utilities/pythonlibs/audio/training/make_dataset.py --list_file $DATASET_PATH/training_list.txt --featurizer compiled_featurizer/mfcc --window_size 98 --shift 98 --multicore
+python $ELL_ROOT/tools/utilities/pythonlibs/audio/training/make_dataset.py --list_file $DATASET_PATH/validation_list.txt --featurizer compiled_featurizer/mfcc --window_size 98 --shift 98 --multicore
+python $ELL_ROOT/tools/utilities/pythonlibs/audio/training/make_dataset.py --list_file $DATASET_PATH/testing_list.txt --featurizer compiled_featurizer/mfcc --window_size 98 --shift 98 --multicore
+```
+
+If you have background noise clips that contain no keywords and want to mix them into your dataset,
+place them in the folder `$DATASET_PATH/backgroundNoise` and run the following commands instead of the ones above.
+```
+python $ELL_ROOT/tools/utilities/pythonlibs/audio/training/make_dataset.py --list_file $DATASET_PATH/training_list.txt --featurizer compiled_featurizer/mfcc --window_size 98 --shift 98 --multicore --noise_path $DATASET_PATH/backgroundNoise --max_noise_ratio 0.1 --noise_selection 1
+python $ELL_ROOT/tools/utilities/pythonlibs/audio/training/make_dataset.py --list_file $DATASET_PATH/validation_list.txt --featurizer compiled_featurizer/mfcc --window_size 98 --shift 98 --multicore --noise_path $DATASET_PATH/backgroundNoise --max_noise_ratio 0.1 --noise_selection 1
+python $ELL_ROOT/tools/utilities/pythonlibs/audio/training/make_dataset.py --list_file $DATASET_PATH/testing_list.txt --featurizer compiled_featurizer/mfcc --window_size 98 --shift 98 --multicore --noise_path $DATASET_PATH/backgroundNoise --max_noise_ratio 0.1 --noise_selection 1
+```
+
+### Run model training
+```
+python examples/pytorch/FastCells/train_classifier.py \
+    --use_gpu --normalize --rolling --max_rolling_length 235 \
+    -a $DATASET_PATH -c $DATASET_PATH/categories.txt --outdir $MODEL_DIR \
+    --architecture FastGRNNCUDA --num_layers 2 \
+    --epochs 250 --learning_rate 0.005 -bs 128 -hu 128 \
+    --lr_min 0.0005 --lr_scheduler CosineAnnealingLR --lr_peaks 0
+```
+Drop the `--rolling` and `--max_rolling_length` options if you plan to run inference on 1-second clips
+and do not plan to stream data through the model without resetting it.
+
+### Convert .onnx model to .ell IR
+```
+pip install onnx  # if you haven't already
+python $ELL_ROOT/tools/importers/onnx/onnx_import.py output_model/model.onnx
+```
+
+
+### Compiling model and featurizer header and assembly files for ARM Cortex-M4 class devices
+
+These commands use the ELL compiler to generate several files, of which four are required: `featurizer.h`, `featurizer.S`, `model.h` and `model.S`.
+
+#### For devices with a hard FPU, e.g., STM BlueCoin
+```
+$ELL_ROOT/build/bin/compile -imap model.ell -cfn Predict -cmn completemodel --bitcode -od . --fuseLinearOps True --header --blas false --optimize true --target custom --numBits 32 --cpu cortex-m4 --triple armv6m-gnueabi --features +vfp4,+d16
+/usr/lib/llvm-8/bin/opt model.bc -o model.opt.bc -O3
+/usr/lib/llvm-8/bin/llc model.opt.bc -o model.S -O3 -filetype=asm -mtriple=armv6m-gnueabi -mcpu=cortex-m4 -relocation-model=pic -float-abi=hard -mattr=+vfp4,+d16
+$ELL_ROOT/build/bin/compile -imap featurizer.ell -cfn Filter -cmn mfcc --bitcode -od . --fuseLinearOps True --header --blas false --optimize true --target custom --numBits 32 --cpu cortex-m4 --triple armv6m-gnueabi --features +vfp4,+d16
+/usr/lib/llvm-8/bin/opt featurizer.bc -o featurizer.opt.bc -O3
+/usr/lib/llvm-8/bin/llc featurizer.opt.bc -o featurizer.S -O3 -filetype=asm -mtriple=armv6m-gnueabi -mcpu=cortex-m4 -relocation-model=pic -float-abi=hard -mattr=+vfp4,+d16
+```
+
+#### For M4 class devices without a hard FPU, e.g., MXChip
+```
+$ELL_ROOT/build/bin/compile -imap model.ell -cfn Predict -cmn completemodel --bitcode -od . --fuseLinearOps True --header --blas false --optimize true --target custom --numBits 32 --cpu cortex-m4 --triple armv6m-gnueabi --features +vfp4,+d16,+soft-float
+/usr/lib/llvm-8/bin/opt model.bc -o model.opt.bc -O3
+/usr/lib/llvm-8/bin/llc model.opt.bc -o model.S -O3 -filetype=asm -mtriple=armv6m-gnueabi -mcpu=cortex-m4 -relocation-model=pic -float-abi=soft -mattr=+vfp4,+d16
+$ELL_ROOT/build/bin/compile -imap featurizer.ell -cfn Filter -cmn mfcc --bitcode -od . --fuseLinearOps True --header --blas false --optimize true --target custom --numBits 32 --cpu cortex-m4 --triple armv6m-gnueabi --features +vfp4,+d16,+soft-float
+/usr/lib/llvm-8/bin/opt featurizer.bc -o featurizer.opt.bc -O3
+/usr/lib/llvm-8/bin/llc featurizer.opt.bc -o featurizer.S -O3 -filetype=asm -mtriple=armv6m-gnueabi -mcpu=cortex-m4 -relocation-model=pic -float-abi=soft -mattr=+vfp4,+d16
+```
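Note on the two command sets above: the `opt` and `llc` invocations differ between the hard-FPU and soft-float variants only in the `-float-abi` value. A minimal Python sketch of that pattern follows, assuming a hypothetical helper named `lowering_commands` that is not part of ELL or EdgeML. It only assembles the argument lists already shown in the README and does not run anything.

```python
# Sketch only: assembles the opt/llc command lines from the README.
# `lowering_commands` is a hypothetical helper, not an ELL or EdgeML API.
LLVM_BIN = "/usr/lib/llvm-8/bin"  # LLVM location used in the README commands


def lowering_commands(name, float_abi, llvm_bin=LLVM_BIN):
    """Return [opt_cmd, llc_cmd] argument lists for `<name>.bc`."""
    if float_abi not in ("hard", "soft"):
        raise ValueError("float_abi must be 'hard' or 'soft'")
    # Optimize the bitcode emitted by the ELL compile step.
    opt_cmd = [f"{llvm_bin}/opt", f"{name}.bc", "-o", f"{name}.opt.bc", "-O3"]
    # Lower the optimized bitcode to Cortex-M4 assembly.
    llc_cmd = [
        f"{llvm_bin}/llc", f"{name}.opt.bc", "-o", f"{name}.S", "-O3",
        "-filetype=asm", "-mtriple=armv6m-gnueabi", "-mcpu=cortex-m4",
        "-relocation-model=pic", f"-float-abi={float_abi}",
        "-mattr=+vfp4,+d16",
    ]
    return [opt_cmd, llc_cmd]


if __name__ == "__main__":
    # Print the hard-FPU commands for both modules.
    for module in ("model", "featurizer"):
        for cmd in lowering_commands(module, "hard"):
            print(" ".join(cmd))
```

To execute the commands instead of printing them, each list could be passed to `subprocess.run(cmd, check=True)`. The ELL `compile` step is left out of the sketch because its `--features` string also changes between the two variants.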