Update README.md
Parent: e1aecc1f42
Commit: eb1dbb0f94
31
README.md
@@ -1,7 +1,17 @@
# Project
# Introduction
> This repo has been populated by an initial template to help get you started. Please
> make sure to update the content to build a great experience for community-building.

> nn-Meter is a novel and efficient system that accurately predicts the inference latency of DNN models on diverse edge devices. The key idea is to divide a whole model inference into kernels, i.e., the execution units of fused operators on a device, and to conduct kernel-level prediction.

nn-Meter contains two key techniques: (i) kernel detection to automatically detect the execution unit of model inference via a set of well-designed test cases; (ii) adaptive sampling to efficiently sample the most beneficial configurations from a large space to build accurate kernel-level latency predictors.
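
The kernel-level idea described above can be sketched roughly as follows. This is only an illustration, not nn-Meter's actual code: it assumes the model latency is estimated as the sum of per-kernel predictions, and the names `predict_model_latency`, `toy_predictors`, and `toy_kernels`, as well as the coefficients, are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical kernel description: (kernel type, configuration parameters).
Kernel = Tuple[str, Dict[str, int]]
# Hypothetical predictor: maps a kernel configuration to a latency in ms.
Predictor = Callable[[Dict[str, int]], float]


def predict_model_latency(kernels: List[Kernel],
                          predictors: Dict[str, Predictor]) -> float:
    """Add up per-kernel predictions (assumed aggregation rule)."""
    total = 0.0
    for kernel_type, config in kernels:
        # Look up the predictor trained for this kernel type.
        total += predictors[kernel_type](config)
    return total


# Toy predictors with made-up coefficients, for illustration only.
toy_predictors: Dict[str, Predictor] = {
    "conv-bn-relu": lambda cfg: 0.5 * cfg["cout"],
    "fc": lambda cfg: 0.25 * cfg["cout"],
}
toy_kernels: List[Kernel] = [
    ("conv-bn-relu", {"cout": 4}),
    ("fc", {"cout": 8}),
]

print(predict_model_latency(toy_kernels, toy_predictors))
```

In a real setting each predictor would be a regression model trained on measured kernel latencies for one device, which is where adaptive sampling would supply the training configurations.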

nn-Meter has been evaluated on four popular platforms using a large dataset of 26k models. It achieves 99.0% (mobile CPU), 99.1% (mobile Adreno 640 GPU), 99.0% (mobile Adreno 630 GPU), and 83.4% (Intel VPU) prediction accuracy.

Currently supported hardware and inference frameworks:

| Name | Device | Framework | Processor | ±10% Accuracy |
|:----:|:-------------------:|:--------------:|:--------------:|:------------------:|
| CPU | Pixel4 | TFLite v2.1 | CortexA76 CPU | 99.0% |
| GPU | Mi9 | TFLite v2.1 | Adreno 640 GPU | 99.1% |
| GPU1 | Pixel3XL | TFLite v2.1 | Adreno 630 GPU | 99.0% |
| VPU | Intel Movidius NCS2 | OpenVINO2019R2 | Myriad VPU | 83.4% |
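
The accuracy column in the table above is not defined in this README. A plausible reading, assumed here, is the fraction of models whose predicted latency falls within 10% of the measured latency. The sketch below shows that computation; the function name `accuracy_within` and the sample numbers are hypothetical.

```python
from typing import Sequence


def accuracy_within(predicted: Sequence[float],
                    measured: Sequence[float],
                    tolerance: float = 0.10) -> float:
    """Fraction of predictions whose relative error is within `tolerance`."""
    if len(predicted) != len(measured) or len(measured) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        1 for p, m in zip(predicted, measured)
        if abs(p - m) <= tolerance * m  # relative to the measured latency
    )
    return hits / len(measured)


# Made-up latencies in ms, for illustration only.
measured_ms = [10.0, 20.0, 40.0, 80.0]
predicted_ms = [10.5, 23.0, 39.0, 80.0]

print(accuracy_within(predicted_ms, measured_ms))
```

With these sample values, three of the four predictions are within 10% of the measurement, so the function returns 0.75.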
As the maintainer of this project, please make a few updates:
@@ -18,7 +28,20 @@ Please also check the versions of numpy, scikit_learn. The different versions ma
## Usage
To run the latency predictor, we support two input formats.
To run the latency predictor, we support two input formats. We include popular CNN models in `data/testmodels`.
#### 1. input model: xx.onnx or xx.pb:
`python demo_with_converter.py --input_models data/testmodels/alexnet.onnx --mf alexnet`

`python demo_with_converter.py --input_models data/testmodels/alexnet.pb --mf alexnet`

The script first converts the ONNX or pb model into our IR JSON format. It then runs kernel detection on the IR graph and predicts kernel latency on the four measured edge devices.
#### 2. input model: the converted IR json:
`python demo.py --input_models data/testmodels/alexnet_0.json --mf alexnet`
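
The two input formats above map to two different scripts. As a convenience, a small wrapper could choose the script from the file extension. The sketch below is not part of the repository: the helper name `build_predict_command` is hypothetical, and it only uses the script names and the `--input_models` and `--mf` flags shown in the commands above.

```python
from pathlib import Path
from typing import List


def build_predict_command(model_path: str, mf: str) -> List[str]:
    """Build the command line for a model file, chosen by its extension."""
    suffix = Path(model_path).suffix.lower()
    if suffix in {".onnx", ".pb"}:
        # ONNX and pb models need conversion to the IR JSON first.
        script = "demo_with_converter.py"
    elif suffix == ".json":
        # Already-converted IR JSON goes straight to the predictor.
        script = "demo.py"
    else:
        raise ValueError(f"unsupported model format: {suffix!r}")
    return ["python", script, "--input_models", model_path, "--mf", mf]


print(build_predict_command("data/testmodels/alexnet.onnx", "alexnet"))
print(build_predict_command("data/testmodels/alexnet_0.json", "alexnet"))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root, assuming the scripts live there.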
## Contributing