## Describe your changes

**Engine**:

- Improve the output folder structure of a workflow run. The current structure was meant for multi-EP, multi-pass-flow workflows, but that is not the common usage for Olive.
  - Unnecessary nesting for accelerator spec and pass flows is removed for the single-EP, single-pass-flow scenario.
- `output_name` is removed from both pass config and engine config.
  - The behavior of `output_name` was arbitrary. Users can get the output in a specific folder by directly providing an `output_dir` such as `parent-dir/specific-dir`.
  - `output_name` was allowed in pass config to save intermediate models, but the same can be achieved by providing multiple pass flows such as `[[A, B], [A, B, C]]`. This is cleaner than the former approach.
- Refer to `Engine.run` for more details on the new output structure.

**CLI**:

- `add_model_options` is made configurable so that only the desired model-type-related options are added.
- `save_output_model` uses the new engine output directory structure to copy the output model into the final output directory.
- The `finetune` command is separated into `finetune` and `generate-adapter` commands. These commands can be chained as shown in the llama2 multi-LoRA notebook.

## Checklist before requesting a review

- [x] Add unit tests for this change.
- [x] Make sure all tests can pass.
- [x] Update documents if necessary.
- [x] Lint and apply fixes to your code by running `lintrunner -a`
- [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes.
- [ ] Is this PR including examples changes? If yes, please remember to update [example documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md) in a follow-up PR.

## (Optional) Issue link
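For illustration, the `output_dir` plus `pass_flows` usage described above could look like the sketch below. The pass names `A`/`B`/`C` and the pass types are placeholders standing in for real passes, not the exact Olive schema:

```python
import json

# Hypothetical engine config sketch. Running pass flows [[A, B], [A, B, C]]
# saves the intermediate model after B as well as the final model after C,
# replacing the removed per-pass `output_name` mechanism.
config = {
    "passes": {
        "A": {"type": "Conversion"},
        "B": {"type": "TransformersOptimization"},
        "C": {"type": "Quantization"},
    },
    "pass_flows": [["A", "B"], ["A", "B", "C"]],
    # Output lands directly under this folder; a nested path such as
    # "parent-dir/specific-dir" replaces the old `output_name` knob.
    "output_dir": "parent-dir/specific-dir",
}
print(json.dumps(config, indent=2))
```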
# VGG model optimization on Qualcomm NPU
This folder contains a sample use case of Olive that converts an ONNX model to SNPE DLC, quantizes it, and converts it back to ONNX.

It performs the following optimization pipeline:

- *ONNX Model -> SNPE Model -> Quantized SNPE Model*
## Prerequisites
### Download and unzip SNPE SDK

1. Download the SNPE SDK zip following the instructions from Qualcomm. This example was tested with SNPE v2.18.0.240101.
2. Unzip the file and set the unzipped directory path as the environment variable `SNPE_ROOT`.
### Configure SNPE

```bash
# in general, python 3.8 is recommended
olive configure-qualcomm-sdk --py_version 3.8 --sdk snpe

# only when tensorflow 1.15.0 is needed, use python 3.6
olive configure-qualcomm-sdk --py_version 3.6 --sdk snpe
```
### Pip requirements

Install the necessary python packages:

```bash
python -m pip install -r requirements.txt
```
### Download data and model

To download the necessary data and model files:

```bash
python download_files.py
```
## Prepare the configuration file

The configuration file `vgg_config.json` contains the parameters for the conversion and quantization. Note that quantization is only supported on the x64 Linux platform. To run the optimization on Windows, remove the quantization section from the configuration file.

Alternatively, run the following command to generate the configuration file:

```bash
python prepare_config.py
```
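As a rough sketch of what such a script might do, the helper below drops the quantization pass on platforms other than x64 Linux. The key name `"snpe_quantization"` and the overall structure are assumptions for illustration, not the actual contents of `prepare_config.py` or the Olive config schema:

```python
import json
import platform


def is_x64_linux():
    """Return True on the only platform where SNPE quantization is supported."""
    return platform.system() == "Linux" and platform.machine() == "x86_64"


def strip_quantization(config):
    """Return a copy of the config without the (assumed) quantization pass."""
    config = dict(config)
    passes = dict(config.get("passes", {}))
    passes.pop("snpe_quantization", None)  # assumed key name
    config["passes"] = passes
    return config


def prepare_config(src="vgg_config.json", dst="vgg_config.json"):
    with open(src) as f:
        config = json.load(f)
    if not is_x64_linux():
        config = strip_quantization(config)
    with open(dst, "w") as f:
        json.dump(config, f, indent=4)
```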
## Run sample

Run the conversion and quantization locally (only supported on x64 Linux):

```bash
olive run --config vgg_config.json
```
## Issues

- `"Module 'qti.aisw.converters' has no attribute 'onnx'"`: refer to https://developer.qualcomm.com/comment/21810#comment-21810 and change the import statement at `{SNPE_ROOT}/lib/python/qti/aisw/converters/onnx/onnx_to_ir.py:L30` to:

  ```python
  from qti.aisw.converters.onnx import composable_custom_op_utils as ComposableCustomOp
  ```
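If you prefer to apply the fix programmatically, a throwaway helper like the one below replaces the offending line. `patch_import` is a hypothetical utility written for this note, not part of Olive or the SNPE SDK; the line number comes from the issue text above and may differ in other SDK versions:

```python
FIXED_IMPORT = (
    "from qti.aisw.converters.onnx "
    "import composable_custom_op_utils as ComposableCustomOp"
)


def patch_import(text, lineno=30):
    """Replace the given 1-indexed line of onnx_to_ir.py with the working import."""
    lines = text.splitlines()
    lines[lineno - 1] = FIXED_IMPORT
    return "\n".join(lines) + "\n"
```

You would read `{SNPE_ROOT}/lib/python/qti/aisw/converters/onnx/onnx_to_ir.py`, pass its contents through `patch_import`, and write the result back.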