Merge branch 'main' of github.com:microsoft/nn-Meter into main
This commit is contained in:
Commit
3d0fc0f545
39
README.md
@@ -1,7 +1,17 @@
# Project
# Introduction
> This repo has been populated by an initial template to help get you started. Please
> nn-Meter is a novel and efficient system that accurately predicts the inference latency of DNN models on diverse edge devices. The key idea is to divide a whole-model inference into kernels, i.e., the execution units of fused operators on a device, and to conduct kernel-level prediction.
> make sure to update the content to build a great experience for community-building.
nn-Meter contains two key techniques: (i) kernel detection, which automatically detects the execution units of model inference via a set of well-designed test cases; and (ii) adaptive sampling, which efficiently samples the most beneficial configurations from a large space to build accurate kernel-level latency predictors.
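As a rough illustration of kernel-level prediction, the sketch below assumes that a model's latency is estimated by summing the predicted latencies of its detected kernels. The kernel names, the toy latency formulas, and the `predict_model_latency` helper are hypothetical placeholders, not nn-Meter's actual API or predictors.

```python
from typing import Callable, Dict, List, Tuple

# A kernel configuration, e.g. input/output channel counts (hypothetical fields).
KernelConfig = Dict[str, int]
# A per-kernel predictor maps a configuration to a latency in milliseconds.
Predictor = Callable[[KernelConfig], float]

# Toy stand-ins for trained kernel-level predictors (assumption: made-up formulas).
toy_predictors: Dict[str, Predictor] = {
    "conv-bn-relu": lambda cfg: 0.002 * cfg["cin"] * cfg["cout"] / 64,
    "fc": lambda cfg: 0.001 * cfg["cin"] * cfg["cout"] / 1024,
}


def predict_model_latency(
    kernels: List[Tuple[str, KernelConfig]],
    predictors: Dict[str, Predictor],
) -> float:
    """Sum per-kernel latency predictions (assumed aggregation rule)."""
    return sum(predictors[name](cfg) for name, cfg in kernels)


# A model that has already been split into two kernels.
detected_kernels = [
    ("conv-bn-relu", {"cin": 64, "cout": 64}),
    ("fc", {"cin": 1024, "cout": 1000}),
]
total_ms = predict_model_latency(detected_kernels, toy_predictors)
```

In this sketch, supporting a new device would amount to swapping in a different `predictors` dictionary while the detected kernel list stays in the same format.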
nn-Meter is currently evaluated on four popular platforms using a large dataset of 26k models. It achieves 99.0% (mobile CPU), 99.1% (mobile Adreno 640 GPU), 99.0% (mobile Adreno 630 GPU), and 83.4% (Intel VPU) prediction accuracy.

The currently supported hardware and inference frameworks:

| Name | Device              | Framework      | Processor      | ±10% Accuracy |
|:----:|:-------------------:|:--------------:|:--------------:|:-------------:|
| CPU  | Pixel4              | TFLite v2.1    | CortexA76 CPU  | 99.0%         |
| GPU  | Mi9                 | TFLite v2.1    | Adreno 640 GPU | 99.1%         |
| GPU1 | Pixel3XL            | TFLite v2.1    | Adreno 630 GPU | 99.0%         |
| VPU  | Intel Movidius NCS2 | OpenVINO2019R2 | Myriad VPU     | 83.4%         |

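The README does not define the ±10% accuracy metric. A minimal sketch, assuming it means the share of models whose predicted latency falls within 10% of the measured latency, is shown below; the function name `accuracy_within` and the sample numbers are illustrative only.

```python
from typing import Sequence


def accuracy_within(
    predicted: Sequence[float],
    measured: Sequence[float],
    tolerance: float = 0.10,
) -> float:
    """Fraction of predictions whose relative error is at most `tolerance`.

    Assumption: the error is taken relative to the measured latency.
    """
    if len(predicted) != len(measured) or len(measured) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        1 for p, m in zip(predicted, measured) if abs(p - m) <= tolerance * m
    )
    return hits / len(measured)


# Made-up latencies in milliseconds: two of the three predictions are within 10%.
example_accuracy = accuracy_within([10.5, 20.0, 31.0], [10.0, 25.0, 30.0])
```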
As the maintainer of this project, please make a few updates:
@@ -10,6 +20,29 @@ As the maintainer of this project, please make a few updates:
- Understanding the security reporting process in SECURITY.MD
- Remove this section from the README
## Installation
To install nn-Meter, first install Python 3. The test environment uses Anaconda Python 3.6.10. Install the dependencies via:

`pip3 install -r requirements.txt`

Please also check the installed versions of numpy and scikit_learn. Different versions may change the prediction accuracy of the kernel predictors.

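One way to check the installed versions is sketched below. It assumes scikit_learn is importable under its usual module name `sklearn`; the helper `installed_versions` is illustrative and not part of nn-Meter. Compare the printed versions against the ones pinned in `requirements.txt`.

```python
import importlib
from typing import Dict, Iterable, Optional


def installed_versions(module_names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Return each module's __version__, or None if it cannot be imported."""
    versions: Dict[str, Optional[str]] = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None  # package is not installed
        else:
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


# Report the two packages the README singles out.
for name, version in installed_versions(["numpy", "sklearn"]).items():
    print(name, version if version is not None else "not installed")
```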
## Usage
The latency predictor supports two input formats. Popular CNN models are included in `data/testmodels`.

#### 1. Input model: xx.onnx or xx.pb

`python demo_with_converter.py --input_models data/testmodels/alexnet.onnx --mf alexnet`
`python demo_with_converter.py --input_models data/testmodels/alexnet.pb --mf alexnet`
It first converts the ONNX or pb model into our own IR JSON format. We then conduct kernel detection on the IR graph and predict kernel latency on the four measured edge devices.

#### 2. Input model: the converted IR JSON

`python demo.py --input_models data/testmodels/alexnet_0.json --mf alexnet`
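
The two invocations differ only in the script name, which depends on the model file's extension. The sketch below builds the matching command line from the scripts and flags shown in this section; the helper name `build_demo_command` is hypothetical, and the value passed to `--mf` is forwarded unchanged because the README does not explain its meaning.

```python
from pathlib import Path
from typing import List


def build_demo_command(model_path: str, mf_value: str) -> List[str]:
    """Pick the demo script by file extension and assemble its arguments."""
    suffix = Path(model_path).suffix.lower()
    if suffix in {".onnx", ".pb"}:
        script = "demo_with_converter.py"  # converts to IR JSON first
    elif suffix == ".json":
        script = "demo.py"  # input is already converted IR JSON
    else:
        raise ValueError(f"unsupported model format: {suffix!r}")
    return ["python", script, "--input_models", model_path, "--mf", mf_value]


onnx_cmd = build_demo_command("data/testmodels/alexnet.onnx", "alexnet")
json_cmd = build_demo_command("data/testmodels/alexnet_0.json", "alexnet")
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root.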
## Contributing
This project welcomes contributions and suggestions. Most contributions require you to agree to a