Feature: add tutorial for presentation
Parent
7bf50f83bc
Commit
39408a8b40
|
@@ -0,0 +1,51 @@
|
|||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Quick Start of nn-Meter\n",
"nn-Meter is a novel and efficient toolkit for accurately predicting the inference latency of DNN models on diverse edge devices. The nn-Meter paper received the **MobiSys 2021 Best Paper Award**: [nn-Meter: towards accurate latency prediction of deep-learning model inference on diverse edge devices](https://dl.acm.org/doi/10.1145/3458864.3467882). nn-Meter is open source on [GitHub](https://github.com/microsoft/nn-Meter) and is also released as a Python package. This notebook walks through the first steps of using the nn-Meter predictors, the benchmark dataset, and the nn-Meter building tools. Let's start our journey!\n",
|
||||
"\n",
|
||||
"nn-Meter supports and is tested on Ubuntu >= 16.04, macOS >= 10.14.1, and Windows 10/11. Simply run the following `pip install` in an environment that has `python 64-bit >= 3.6`.\n",
|
||||
"```bash\n",
|
||||
"pip install nn-meter\n",
|
||||
"```\n",
|
||||
"\n",
"After installation, the `nn-meter` command becomes available. Run `nn-meter --help` to verify the installation. The expected output is:\n",
|
||||
"\n",
|
||||
"```text\n",
|
||||
"usage: nn-meter [-h] [-v] [--list-predictors] {predict,lat_pred,get_ir} ...\n",
|
||||
"\n",
|
||||
"please run \"nn-meter {positional argument} --help\" to see nn-meter guidance\n",
|
||||
"\n",
|
||||
"positional arguments:\n",
|
||||
" {predict,lat_pred,get_ir}\n",
|
||||
" predict (lat_pred) apply latency predictor for testing model\n",
|
||||
" get_ir specify a model type to convert to nn-meter ir graph\n",
|
||||
"\n",
|
||||
"optional arguments:\n",
|
||||
" -h, --help show this help message and exit\n",
|
||||
" -v, --verbose increase output verbosity\n",
|
||||
" --list-predictors list all supported predictors\n",
|
||||
"```\n",
|
||||
"\n",
"This tutorial covers three main uses of nn-Meter:\n",
|
||||
"\n",
|
||||
"- Use nn-Meter for latency prediction\n",
|
||||
"\n",
|
||||
"- Use nn-Meter benchmark dataset\n",
|
||||
"\n",
|
||||
"- Use nn-Meter building tools"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"language_info": {
|
||||
"name": "python"
|
||||
},
|
||||
"orig_nbformat": 4
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 2
|
||||
}
|
|
@@ -0,0 +1,709 @@
|
|||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Use nn-Meter for latency prediction\n",
|
||||
"\n",
"## Use nn_meter as a Python package\n",
"After installing nn-Meter, import the `nn_meter` package in Python:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"nn_meter version: 1.1a2\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"import nn_meter\n",
|
||||
"print(f\"nn_meter version: {nn_meter.__version__}\")\n",
|
||||
"\n",
|
||||
"project_path = \"/home/jiahang/nnmeter-demo/\""
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
"When nn-Meter is used, the predictor models are downloaded automatically to the user's local machine. We currently provide four predictors for four popular platforms: mobile CPU (`\"cortexA76cpu_tflite21\"`), mobile Adreno 640 GPU (`\"adreno640gpu_tflite21\"`), mobile Adreno 630 GPU (`\"adreno630gpu_tflite21\"`), and Intel VPU (`\"myriadvpu_openvino2019r2\"`). Together, the four predictors take up about 6.33 GB. They are stored in `~/.nn_meter/data/` by default. To change the target directory, run:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"nn_meter.change_user_data_folder(new_folder=project_path) # path to the new folder"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
"To list all supported latency predictors, run:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"[Predictor] cortexA76cpu_tflite21: version=1.0\n",
|
||||
"[Predictor] adreno640gpu_tflite21: version=1.0\n",
|
||||
"[Predictor] adreno630gpu_tflite21: version=1.0\n",
|
||||
"[Predictor] myriadvpu_openvino2019r2: version=1.0\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
"# list all supported latency predictors\n",
|
||||
"predictors = nn_meter.list_latency_predictors()\n",
|
||||
"for p in predictors:\n",
|
||||
" print(f\"[Predictor] {p['name']}: version={p['version']}\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
"nn-Meter can predict latency for TensorFlow models (`.pb` files), ONNX models (`.onnx` files), and PyTorch models (`nn.Module` objects). We provide some example files so that users can quickly try nn-Meter. The data can be downloaded from [this link]().\n",
|
||||
"\n",
|
||||
"The first step is to load a predictor by specifying its name."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"/home/jiahang/anaconda3/envs/nn-meter1.1-test/lib/python3.6/site-packages/sklearn/base.py:315: UserWarning: Trying to unpickle estimator DecisionTreeRegressor from version 0.23.1 when using version 0.24.2. This might lead to breaking code or invalid results. Use at your own risk.\n",
|
||||
" UserWarning)\n",
|
||||
"/home/jiahang/anaconda3/envs/nn-meter1.1-test/lib/python3.6/site-packages/sklearn/base.py:315: UserWarning: Trying to unpickle estimator RandomForestRegressor from version 0.23.1 when using version 0.24.2. This might lead to breaking code or invalid results. Use at your own risk.\n",
|
||||
" UserWarning)\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"predictor_name = \"adreno640gpu_tflite21\" # user can change text here to test other predictors\n",
|
||||
"\n",
|
||||
"# load predictor\n",
|
||||
"predictor = nn_meter.load_latency_predictor(predictor_name)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
"The first time nn-Meter is used, it takes a while to download and unzip the required predictor model.\n",
|
||||
"\n",
"Once the predictor is loaded, latency prediction only requires a call to `predictor.predict()`. Each model type also requires its corresponding packages to be installed. The well-tested versions are listed below:\n",
|
||||
"\n",
|
||||
"| Testing Model Type | Requirements |\n",
|
||||
"| :----------------: | :-----------------------------------------------------------------------------------------------------------------------: |\n",
"| TensorFlow | `tensorflow==2.6.0` |\n",
|
||||
"| Torch | `torch==1.9.0`, `torchvision==0.10.0`, (alternative)[`onnx==1.9.0`, `onnx-simplifier==0.3.6`] or [`nni>=2.4`][1] |\n",
"| ONNX | `onnx==1.9.0` |\n",
|
||||
"\n",
"For a TensorFlow `.pb` file:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"[RESULT] predict latency for /home/jiahang/nnmeter-demo/testmodel/mobilenetv3small_0.pb: 4.489849402954042 ms\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"test_model = project_path + \"testmodel/mobilenetv3small_0.pb\"\n",
|
||||
"\n",
|
||||
"# predict latency\n",
|
||||
"latency = predictor.predict(model=test_model, model_type=\"pb\") # result is in unit of ms\n",
|
||||
"print(f'[RESULT] predict latency for {test_model}: {latency} ms')"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
"For an ONNX `.onnx` file:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"[RESULT] predict latency for /home/jiahang/nnmeter-demo/testmodel/mobilenetv3small_0.onnx: 6.705541180860482 ms\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"test_model = project_path + \"testmodel/mobilenetv3small_0.onnx\"\n",
|
||||
"\n",
|
||||
"# predict latency\n",
|
||||
"latency = predictor.predict(model=test_model, model_type=\"onnx\") # result is in unit of ms\n",
|
||||
"print(f'[RESULT] predict latency for {test_model}: {latency} ms')"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
"PyTorch models are handled slightly differently in nn-Meter. Prediction requires a torch model in `nn.Module` format, and the input shape has to be specified. Here we build a simple torch model for the demo. Users can choose one of two groups of dependencies: [`onnx==1.9.0`, `onnx-simplifier==0.3.6`], which we call the \"onnx_based way\", or [`nni>=2.4`], which we call the \"nni_based way\". The \"onnx_based way\" is applied by default."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 8,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import torch.nn as nn\n",
|
||||
"\n",
|
||||
"class VGG(nn.Module):\n",
|
||||
"\n",
|
||||
" def __init__(self, features, num_classes=1000):\n",
|
||||
" super(VGG, self).__init__()\n",
|
||||
" self.features = features\n",
|
||||
" self.classifier = nn.Sequential(\n",
|
||||
" nn.Linear(512 * 7 * 7, 4096),\n",
|
||||
" nn.ReLU(True),\n",
|
||||
" nn.Dropout(),\n",
|
||||
" nn.Linear(4096, 4096),\n",
|
||||
" nn.ReLU(True),\n",
|
||||
" nn.Dropout(),\n",
|
||||
" nn.Linear(4096, num_classes),\n",
|
||||
" )\n",
|
||||
"\n",
|
||||
" def forward(self, x):\n",
|
||||
" x = self.features(x)\n",
|
||||
" x = x.view(x.size(0), -1)\n",
|
||||
" x = self.classifier(x)\n",
|
||||
" return x\n",
|
||||
"\n",
|
||||
"def make_layers(cfg, batch_norm=False):\n",
|
||||
" layers = []\n",
|
||||
" in_channels = 3\n",
|
||||
" for v in cfg:\n",
|
||||
" if v == 'M':\n",
|
||||
" layers += [nn.MaxPool2d(kernel_size=2, stride=2)]\n",
|
||||
" else:\n",
|
||||
" conv2d = nn.Conv2d(in_channels, v, kernel_size=3, padding=1)\n",
|
||||
" if batch_norm:\n",
|
||||
" layers += [conv2d, nn.BatchNorm2d(v), nn.ReLU(inplace=True)]\n",
|
||||
" else:\n",
|
||||
" layers += [conv2d, nn.ReLU(inplace=True)]\n",
|
||||
" in_channels = v\n",
|
||||
" return nn.Sequential(*layers)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
"An input shape must also be specified, because it cannot be inferred from the `nn.Module` alone. The prediction code is:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 9,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"/home/jiahang/anaconda3/envs/nn-meter1.1-test/lib/python3.6/site-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at /pytorch/c10/core/TensorImpl.h:1156.)\n",
|
||||
" return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"[RESULT] predict latency for vgg11: 109.77864175998361 ms\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"vgg11 = VGG(make_layers([64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'])) # VGG 11-layer model\n",
|
||||
"\n",
|
||||
"# predict latency\n",
|
||||
"latency = predictor.predict(vgg11, model_type=\"torch\", input_shape=(1, 3, 224, 224)) \n",
|
||||
"print(f'[RESULT] predict latency for vgg11: {latency} ms')"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
"For the \"nni_based way\", the PyTorch modules must be defined with the `nn` interface from NNI, `import nni.retiarii.nn.pytorch as nn` (see the [NNI doc](https://nni.readthedocs.io/en/stable/NAS/QuickStart.html#define-base-model) for more information), and the parameter `apply_nni` must be set to `True` in `predictor.predict()`."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 10,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"INFO:root:Start latency prediction ...\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"[2021-11-16 13:22:29] INFO (root/MainThread) Start latency prediction ...\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"INFO:root:NNI-based Torch Converter is applied for model conversion\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"[2021-11-16 13:22:29] INFO (root/MainThread) NNI-based Torch Converter is applied for model conversion\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"INFO:root:{'op': 'fc', 'name': 'fc#0', 'input_tensors': [[1, 25088]], 'cin': 25088, 'cout': 4096, 'inbounds': [], 'outbounds': ['relu#1']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"[2021-11-16 13:22:31] INFO (root/MainThread) {'op': 'fc', 'name': 'fc#0', 'input_tensors': [[1, 25088]], 'cin': 25088, 'cout': 4096, 'inbounds': [], 'outbounds': ['relu#1']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"INFO:root:{'op': 'relu', 'name': 'relu#1', 'input_tensors': [[1, 4096]], 'cin': 4096, 'cout': 4096, 'inbounds': ['fc#0'], 'outbounds': ['__torch__.torch.nn.modules.dropout.Dropout#2']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"[2021-11-16 13:22:31] INFO (root/MainThread) {'op': 'relu', 'name': 'relu#1', 'input_tensors': [[1, 4096]], 'cin': 4096, 'cout': 4096, 'inbounds': ['fc#0'], 'outbounds': ['__torch__.torch.nn.modules.dropout.Dropout#2']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"INFO:root:{'op': '__torch__.torch.nn.modules.dropout.Dropout', 'name': '__torch__.torch.nn.modules.dropout.Dropout#2', 'input_tensors': [[1, 4096]], 'cin': 4096, 'cout': 4096, 'inbounds': ['relu#1'], 'outbounds': ['fc#3']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"[2021-11-16 13:22:31] INFO (root/MainThread) {'op': '__torch__.torch.nn.modules.dropout.Dropout', 'name': '__torch__.torch.nn.modules.dropout.Dropout#2', 'input_tensors': [[1, 4096]], 'cin': 4096, 'cout': 4096, 'inbounds': ['relu#1'], 'outbounds': ['fc#3']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"INFO:root:{'op': 'fc', 'name': 'fc#3', 'input_tensors': [[1, 4096]], 'cin': 4096, 'cout': 4096, 'inbounds': ['__torch__.torch.nn.modules.dropout.Dropout#2'], 'outbounds': ['relu#4']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"[2021-11-16 13:22:31] INFO (root/MainThread) {'op': 'fc', 'name': 'fc#3', 'input_tensors': [[1, 4096]], 'cin': 4096, 'cout': 4096, 'inbounds': ['__torch__.torch.nn.modules.dropout.Dropout#2'], 'outbounds': ['relu#4']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"INFO:root:{'op': 'relu', 'name': 'relu#4', 'input_tensors': [[1, 4096]], 'cin': 4096, 'cout': 4096, 'inbounds': ['fc#3'], 'outbounds': ['__torch__.torch.nn.modules.dropout.Dropout#5']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"[2021-11-16 13:22:31] INFO (root/MainThread) {'op': 'relu', 'name': 'relu#4', 'input_tensors': [[1, 4096]], 'cin': 4096, 'cout': 4096, 'inbounds': ['fc#3'], 'outbounds': ['__torch__.torch.nn.modules.dropout.Dropout#5']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"INFO:root:{'op': '__torch__.torch.nn.modules.dropout.Dropout', 'name': '__torch__.torch.nn.modules.dropout.Dropout#5', 'input_tensors': [[1, 4096]], 'cin': 4096, 'cout': 4096, 'inbounds': ['relu#4'], 'outbounds': ['fc#6']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"[2021-11-16 13:22:31] INFO (root/MainThread) {'op': '__torch__.torch.nn.modules.dropout.Dropout', 'name': '__torch__.torch.nn.modules.dropout.Dropout#5', 'input_tensors': [[1, 4096]], 'cin': 4096, 'cout': 4096, 'inbounds': ['relu#4'], 'outbounds': ['fc#6']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"INFO:root:{'op': 'fc', 'name': 'fc#6', 'input_tensors': [[1, 4096]], 'cin': 4096, 'cout': 1000, 'inbounds': ['__torch__.torch.nn.modules.dropout.Dropout#5'], 'outbounds': []}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"[2021-11-16 13:22:31] INFO (root/MainThread) {'op': 'fc', 'name': 'fc#6', 'input_tensors': [[1, 4096]], 'cin': 4096, 'cout': 1000, 'inbounds': ['__torch__.torch.nn.modules.dropout.Dropout#5'], 'outbounds': []}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"INFO:root:{'op': 'conv-relu', 'name': 'conv-relu#7', 'input_tensors': [[1, 224, 224, 3]], 'ks': [3, 3], 'inputh': 224, 'inputw': 224, 'cin': 3, 'cout': 64, 'inbounds': [], 'outbounds': ['maxpool#8']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"[2021-11-16 13:22:31] INFO (root/MainThread) {'op': 'conv-relu', 'name': 'conv-relu#7', 'input_tensors': [[1, 224, 224, 3]], 'ks': [3, 3], 'inputh': 224, 'inputw': 224, 'cin': 3, 'cout': 64, 'inbounds': [], 'outbounds': ['maxpool#8']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"INFO:root:{'op': 'maxpool', 'name': 'maxpool#8', 'input_tensors': [[1, 224, 224, 64]], 'ks': [2, 2], 'strides': [2, 2], 'inputh': 224, 'inputw': 224, 'cin': 64, 'cout': 64, 'inbounds': ['conv-relu#7'], 'outbounds': ['conv-relu#9']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"[2021-11-16 13:22:31] INFO (root/MainThread) {'op': 'maxpool', 'name': 'maxpool#8', 'input_tensors': [[1, 224, 224, 64]], 'ks': [2, 2], 'strides': [2, 2], 'inputh': 224, 'inputw': 224, 'cin': 64, 'cout': 64, 'inbounds': ['conv-relu#7'], 'outbounds': ['conv-relu#9']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"INFO:root:{'op': 'conv-relu', 'name': 'conv-relu#9', 'input_tensors': [[1, 112, 112, 64]], 'ks': [3, 3], 'inputh': 112, 'inputw': 112, 'cin': 64, 'cout': 128, 'inbounds': ['maxpool#8'], 'outbounds': ['maxpool#10']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"[2021-11-16 13:22:31] INFO (root/MainThread) {'op': 'conv-relu', 'name': 'conv-relu#9', 'input_tensors': [[1, 112, 112, 64]], 'ks': [3, 3], 'inputh': 112, 'inputw': 112, 'cin': 64, 'cout': 128, 'inbounds': ['maxpool#8'], 'outbounds': ['maxpool#10']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"INFO:root:{'op': 'maxpool', 'name': 'maxpool#10', 'input_tensors': [[1, 112, 112, 128]], 'ks': [2, 2], 'strides': [2, 2], 'inputh': 112, 'inputw': 112, 'cin': 128, 'cout': 128, 'inbounds': ['conv-relu#9'], 'outbounds': ['conv-relu#11']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"[2021-11-16 13:22:31] INFO (root/MainThread) {'op': 'maxpool', 'name': 'maxpool#10', 'input_tensors': [[1, 112, 112, 128]], 'ks': [2, 2], 'strides': [2, 2], 'inputh': 112, 'inputw': 112, 'cin': 128, 'cout': 128, 'inbounds': ['conv-relu#9'], 'outbounds': ['conv-relu#11']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"INFO:root:{'op': 'conv-relu', 'name': 'conv-relu#11', 'input_tensors': [[1, 56, 56, 128]], 'ks': [3, 3], 'inputh': 56, 'inputw': 56, 'cin': 128, 'cout': 256, 'inbounds': ['maxpool#10'], 'outbounds': ['conv-relu#12']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"[2021-11-16 13:22:31] INFO (root/MainThread) {'op': 'conv-relu', 'name': 'conv-relu#11', 'input_tensors': [[1, 56, 56, 128]], 'ks': [3, 3], 'inputh': 56, 'inputw': 56, 'cin': 128, 'cout': 256, 'inbounds': ['maxpool#10'], 'outbounds': ['conv-relu#12']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"INFO:root:{'op': 'conv-relu', 'name': 'conv-relu#12', 'input_tensors': [[1, 56, 56, 256]], 'ks': [3, 3], 'inputh': 56, 'inputw': 56, 'cin': 256, 'cout': 256, 'inbounds': ['conv-relu#11'], 'outbounds': ['maxpool#13']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"[2021-11-16 13:22:31] INFO (root/MainThread) {'op': 'conv-relu', 'name': 'conv-relu#12', 'input_tensors': [[1, 56, 56, 256]], 'ks': [3, 3], 'inputh': 56, 'inputw': 56, 'cin': 256, 'cout': 256, 'inbounds': ['conv-relu#11'], 'outbounds': ['maxpool#13']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"INFO:root:{'op': 'maxpool', 'name': 'maxpool#13', 'input_tensors': [[1, 56, 56, 256]], 'ks': [2, 2], 'strides': [2, 2], 'inputh': 56, 'inputw': 56, 'cin': 256, 'cout': 256, 'inbounds': ['conv-relu#12'], 'outbounds': ['conv-relu#14']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"[2021-11-16 13:22:31] INFO (root/MainThread) {'op': 'maxpool', 'name': 'maxpool#13', 'input_tensors': [[1, 56, 56, 256]], 'ks': [2, 2], 'strides': [2, 2], 'inputh': 56, 'inputw': 56, 'cin': 256, 'cout': 256, 'inbounds': ['conv-relu#12'], 'outbounds': ['conv-relu#14']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"INFO:root:{'op': 'conv-relu', 'name': 'conv-relu#14', 'input_tensors': [[1, 28, 28, 256]], 'ks': [3, 3], 'inputh': 28, 'inputw': 28, 'cin': 256, 'cout': 512, 'inbounds': ['maxpool#13'], 'outbounds': ['conv-relu#15']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"[2021-11-16 13:22:31] INFO (root/MainThread) {'op': 'conv-relu', 'name': 'conv-relu#14', 'input_tensors': [[1, 28, 28, 256]], 'ks': [3, 3], 'inputh': 28, 'inputw': 28, 'cin': 256, 'cout': 512, 'inbounds': ['maxpool#13'], 'outbounds': ['conv-relu#15']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"INFO:root:{'op': 'conv-relu', 'name': 'conv-relu#15', 'input_tensors': [[1, 28, 28, 512]], 'ks': [3, 3], 'inputh': 28, 'inputw': 28, 'cin': 512, 'cout': 512, 'inbounds': ['conv-relu#14'], 'outbounds': ['maxpool#16']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"[2021-11-16 13:22:31] INFO (root/MainThread) {'op': 'conv-relu', 'name': 'conv-relu#15', 'input_tensors': [[1, 28, 28, 512]], 'ks': [3, 3], 'inputh': 28, 'inputw': 28, 'cin': 512, 'cout': 512, 'inbounds': ['conv-relu#14'], 'outbounds': ['maxpool#16']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"INFO:root:{'op': 'maxpool', 'name': 'maxpool#16', 'input_tensors': [[1, 28, 28, 512]], 'ks': [2, 2], 'strides': [2, 2], 'inputh': 28, 'inputw': 28, 'cin': 512, 'cout': 512, 'inbounds': ['conv-relu#15'], 'outbounds': ['conv-relu#17']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"[2021-11-16 13:22:31] INFO (root/MainThread) {'op': 'maxpool', 'name': 'maxpool#16', 'input_tensors': [[1, 28, 28, 512]], 'ks': [2, 2], 'strides': [2, 2], 'inputh': 28, 'inputw': 28, 'cin': 512, 'cout': 512, 'inbounds': ['conv-relu#15'], 'outbounds': ['conv-relu#17']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"INFO:root:{'op': 'conv-relu', 'name': 'conv-relu#17', 'input_tensors': [[1, 14, 14, 512]], 'ks': [3, 3], 'inputh': 14, 'inputw': 14, 'cin': 512, 'cout': 512, 'inbounds': ['maxpool#16'], 'outbounds': ['conv-relu#18']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"[2021-11-16 13:22:31] INFO (root/MainThread) {'op': 'conv-relu', 'name': 'conv-relu#17', 'input_tensors': [[1, 14, 14, 512]], 'ks': [3, 3], 'inputh': 14, 'inputw': 14, 'cin': 512, 'cout': 512, 'inbounds': ['maxpool#16'], 'outbounds': ['conv-relu#18']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"INFO:root:{'op': 'conv-relu', 'name': 'conv-relu#18', 'input_tensors': [[1, 14, 14, 512]], 'ks': [3, 3], 'inputh': 14, 'inputw': 14, 'cin': 512, 'cout': 512, 'inbounds': ['conv-relu#17'], 'outbounds': ['maxpool#19']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"[2021-11-16 13:22:31] INFO (root/MainThread) {'op': 'conv-relu', 'name': 'conv-relu#18', 'input_tensors': [[1, 14, 14, 512]], 'ks': [3, 3], 'inputh': 14, 'inputw': 14, 'cin': 512, 'cout': 512, 'inbounds': ['conv-relu#17'], 'outbounds': ['maxpool#19']}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"INFO:root:{'op': 'maxpool', 'name': 'maxpool#19', 'input_tensors': [[1, 14, 14, 512]], 'ks': [2, 2], 'strides': [2, 2], 'inputh': 14, 'inputw': 14, 'cin': 512, 'cout': 512, 'inbounds': ['conv-relu#18'], 'outbounds': []}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"[2021-11-16 13:22:31] INFO (root/MainThread) {'op': 'maxpool', 'name': 'maxpool#19', 'input_tensors': [[1, 14, 14, 512]], 'ks': [2, 2], 'strides': [2, 2], 'inputh': 14, 'inputw': 14, 'cin': 512, 'cout': 512, 'inbounds': ['conv-relu#18'], 'outbounds': []}\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"INFO:root:Predict latency: 109.77864175998361 ms\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"[2021-11-16 13:22:31] INFO (root/MainThread) Predict latency: 109.77864175998361 ms\n",
|
||||
"[RESULT] predict latency for vgg11: 109.77864175998361 ms\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"import nni.retiarii.nn.pytorch as nn # different from \"onnx_based way\"\n",
|
||||
"\n",
|
||||
"class VGG(nn.Module):\n",
|
||||
"\n",
|
||||
" def __init__(self, features, num_classes=1000):\n",
|
||||
" super(VGG, self).__init__()\n",
|
||||
" self.features = features\n",
|
||||
" self.classifier = nn.Sequential(\n",
|
||||
" nn.Linear(512 * 7 * 7, 4096),\n",
|
||||
" nn.ReLU(True),\n",
|
||||
" nn.Dropout(),\n",
|
||||
" nn.Linear(4096, 4096),\n",
|
||||
" nn.ReLU(True),\n",
|
||||
" nn.Dropout(),\n",
|
||||
" nn.Linear(4096, num_classes),\n",
|
||||
" )\n",
|
||||
"\n",
|
||||
" def forward(self, x):\n",
|
||||
" x = self.features(x)\n",
|
||||
" x = x.view(x.size(0), -1)\n",
|
||||
" x = self.classifier(x)\n",
|
||||
" return x\n",
|
||||
"\n",
|
||||
"def make_layers(cfg, batch_norm=False):\n",
|
||||
" layers = []\n",
|
||||
" in_channels = 3\n",
|
||||
" for v in cfg:\n",
|
||||
" if v == 'M':\n",
|
||||
" layers += [nn.MaxPool2d(kernel_size=2, stride=2)]\n",
|
||||
" else:\n",
|
||||
" conv2d = nn.Conv2d(in_channels, v, kernel_size=3, padding=1)\n",
|
||||
" if batch_norm:\n",
|
||||
" layers += [conv2d, nn.BatchNorm2d(v), nn.ReLU(inplace=True)]\n",
|
||||
" else:\n",
|
||||
" layers += [conv2d, nn.ReLU(inplace=True)]\n",
|
||||
" in_channels = v\n",
|
||||
" return nn.Sequential(*layers)\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"vgg11 = VGG(make_layers([64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'])) # VGG 11-layer model\n",
|
||||
"\n",
|
||||
"# predict latency\n",
|
||||
"latency = predictor.predict(\n",
|
||||
" vgg11, model_type=\"torch\", input_shape=(1, 3, 224, 224), \n",
|
||||
" apply_nni=True # different from \"onnx_based way\"\n",
|
||||
" ) \n",
|
||||
"print(f'[RESULT] predict latency for vgg11: {latency} ms')"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
"## Use nn-Meter from the command line\n",
|
||||
"\n",
"Another way to run nn-Meter is from the command line.\n",
|
||||
"\n",
"After nn-Meter is installed, the `nn-meter` command is available. You can predict latency with:\n",
|
||||
"```Bash\n",
|
||||
"# for Tensorflow (.pb) file\n",
|
||||
"nn-meter predict --predictor <hardware> [--predictor-version <version>] --tensorflow <pb-file_or_folder> \n",
|
||||
"\n",
|
||||
"# for ONNX (*.onnx) file\n",
|
||||
"nn-meter predict --predictor <hardware> [--predictor-version <version>] --onnx <onnx-file_or_folder>\n",
|
||||
"\n",
|
||||
"# for torch model from torchvision model zoo (str)\n",
|
||||
"nn-meter predict --predictor <hardware> [--predictor-version <version>] --torchvision <model-name> <model-name>... \n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"Here are some concrete examples:\n",
|
||||
"```Bash\n",
|
||||
"project_path=\"/home/jiahang/nnmeter-demo/testmodel\"\n",
|
||||
"\n",
|
||||
"nn-meter predict --predictor adreno640gpu_tflite21 --tensorflow $project_path\n",
|
||||
"\n",
|
||||
"nn-meter predict --predictor adreno640gpu_tflite21 --onnx $project_path\n",
|
||||
"\n",
|
||||
"nn-meter predict --predictor adreno640gpu_tflite21 --torchvision mobilenet_v2\n",
|
||||
"```\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"interpreter": {
|
||||
"hash": "2602612169f43f91d25fe52816b7763616055f24dc48b1edca6c7b81a282af45"
|
||||
},
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3.6.10 64-bit ('py36': conda)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.6.10"
|
||||
},
|
||||
"orig_nbformat": 4
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 2
|
||||
}
|
|
@@ -0,0 +1,391 @@
|
|||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Use nn-Meter Benchmark Dataset\n",
"nn-Meter collects and generates 26k CNN models as a benchmark dataset. The dataset is released, and the `nn_meter.dataset` interface gives users access to it. This notebook shows how to use the nn-Meter benchmark dataset for nn-Meter latency prediction and, as an extension, for GNN latency prediction.\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Model group: alexnets.jsonl\n",
|
||||
"Model group: densenets.jsonl\n",
|
||||
"Model group: googlenets.jsonl\n",
|
||||
"Model group: mnasnets.jsonl\n",
|
||||
"Model group: mobilenetv1s.jsonl\n",
|
||||
"Model group: mobilenetv2s.jsonl\n",
|
||||
"Model group: mobilenetv3s.jsonl\n",
|
||||
"Model group: nasbench201s.jsonl\n",
|
||||
"Model group: proxylessnass.jsonl\n",
|
||||
"Model group: resnets.jsonl\n",
|
||||
"Model group: shufflenetv2s.jsonl\n",
|
||||
"Model group: squeezenets.jsonl\n",
|
||||
"Model group: vggs.jsonl\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"import os\n",
|
||||
"from nn_meter.dataset import bench_dataset\n",
|
||||
"\n",
|
||||
"datasets = bench_dataset()\n",
|
||||
"for data in datasets:\n",
"    print(f\"Model group: {os.path.basename(data)}\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
"There are 13 groups of models in the benchmark dataset. In each group, about 2000 models with different parameters were sampled.\n",
|
||||
"\n",
|
||||
"Dataset schema: for each model, the dataset stores its: \n",
|
||||
"- model id\n",
|
||||
"- graph in nn-meter IR graph format \n",
|
||||
"- latency numbers on four devices\n",
|
||||
"\n",
"Here we print some information about one model to show the dataset schema."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"dict keys: ['id', 'cortexA76cpu_tflite21', 'adreno640gpu_tflite21', 'adreno630gpu_tflite21', 'myriadvpu_openvino2019r2', 'graph']\n",
|
||||
"model id alexnet_1356\n",
|
||||
"cpu latency: 148.164\n",
|
||||
"adreno640gpu latency: 24.4851\n",
|
||||
"adreno630gpu latency: 31.932404999999996\n",
|
||||
"intelvpu latency: 15.486\n",
|
||||
"model graph is stored in nn-meter IR (shows only one node here): {'inbounds': ['input_im_0'], 'attr': {'name': 'conv1.conv/Conv2D', 'type': 'Conv2D', 'output_shape': [[1, 56, 56, 63]], 'attr': {'dilations': [1, 1], 'strides': [4, 4], 'data_format': 'NHWC', 'padding': 'VALID', 'kernel_shape': [7, 7], 'weight_shape': [7, 7, 3, 63], 'pads': [0, 0, 0, 0]}, 'input_shape': [[1, 224, 224, 3]]}, 'outbounds': ['conv1.relu.relu/Relu']}\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"import jsonlines\n",
|
||||
"test_data = datasets[0]\n",
|
||||
"with jsonlines.open(test_data) as data_reader:\n",
"    # Read only the first record; it is enough to illustrate the schema.\n",
"    for item in data_reader:\n",
"        print('dict keys:', list(item.keys()))\n",
"        print('model id', item['id'])\n",
"        print('cpu latency: ', item['cortexA76cpu_tflite21'])\n",
"        print('adreno640gpu latency: ', item['adreno640gpu_tflite21'])\n",
"        print('adreno630gpu latency: ', item['adreno630gpu_tflite21'])\n",
"        print('intelvpu latency: ', item['myriadvpu_openvino2019r2'])\n",
"        print('model graph is stored in nn-meter IR (shows only one node here):',\n",
"              item['graph']['conv1.conv/Conv2D'])\n",
"        break"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Use nn-Meter predictor with benchmark dataset "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"/home/jiahang/anaconda3/envs/nn-meter1.1-test/lib/python3.6/site-packages/sklearn/base.py:315: UserWarning: Trying to unpickle estimator DecisionTreeRegressor from version 0.23.1 when using version 0.24.2. This might lead to breaking code or invalid results. Use at your own risk.\n",
|
||||
" UserWarning)\n",
|
||||
"/home/jiahang/anaconda3/envs/nn-meter1.1-test/lib/python3.6/site-packages/sklearn/base.py:315: UserWarning: Trying to unpickle estimator RandomForestRegressor from version 0.23.1 when using version 0.24.2. This might lead to breaking code or invalid results. Use at your own risk.\n",
|
||||
" UserWarning)\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"import nn_meter\n",
|
||||
"\n",
|
||||
"predictor_name = 'adreno640gpu_tflite21' # user can change text here to test other predictors\n",
|
||||
"\n",
|
||||
"# load predictor\n",
|
||||
"predictor = nn_meter.load_latency_predictor(predictor_name)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"[RESULT] alexnets.jsonl[0]: predict: 23.447085575244767, real: 24.4851\n",
|
||||
"[RESULT] alexnets.jsonl[1]: predict: 23.88567577635713, real: 23.9185\n",
|
||||
"[RESULT] alexnets.jsonl[2]: predict: 29.586297830632216, real: 30.3052\n",
|
||||
"[RESULT] alexnets.jsonl[3]: predict: 51.12333226388624, real: 52.089\n",
|
||||
"[RESULT] alexnets.jsonl[4]: predict: 4.93716647049407, real: 5.26442\n",
|
||||
"[RESULT] alexnets.jsonl[5]: predict: 14.996201148770355, real: 15.2265\n",
|
||||
"[RESULT] alexnets.jsonl[6]: predict: 9.262593840400983, real: 9.12046\n",
|
||||
"[RESULT] alexnets.jsonl[7]: predict: 13.91285961819858, real: 14.2242\n",
|
||||
"[RESULT] alexnets.jsonl[8]: predict: 15.022936121166751, real: 15.2457\n",
|
||||
"[RESULT] alexnets.jsonl[9]: predict: 12.443609556620192, real: 12.5989\n",
|
||||
"[RESULT] alexnets.jsonl[10]: predict: 15.97123988761122, real: 15.185\n",
|
||||
"[RESULT] alexnets.jsonl[11]: predict: 19.469347190777864, real: 20.1434\n",
|
||||
"[RESULT] alexnets.jsonl[12]: predict: 12.580476335563757, real: 14.4818\n",
|
||||
"[RESULT] alexnets.jsonl[13]: predict: 18.51408123823703, real: 19.0136\n",
|
||||
"[RESULT] alexnets.jsonl[14]: predict: 7.330729281187614, real: 7.7855\n",
|
||||
"[RESULT] alexnets.jsonl[15]: predict: 14.860185617106696, real: 15.7775\n",
|
||||
"[RESULT] alexnets.jsonl[16]: predict: 15.788781165175772, real: 16.0765\n",
|
||||
"[RESULT] alexnets.jsonl[17]: predict: 35.33131516111195, real: 35.7741\n",
|
||||
"[RESULT] alexnets.jsonl[18]: predict: 12.409197810645443, real: 12.4725\n",
|
||||
"[RESULT] alexnets.jsonl[19]: predict: 37.084732595563125, real: 36.4975\n",
|
||||
"[SUMMARY] The first 20 cases from alexnets.jsonl on adreno640gpu_tflite21: rmse: 0.6889098264185185, 5%accuracy: 0.75, 10%accuracy: 0.95\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# view latency prediction demo in one model group of the dataset \n",
|
||||
"test_data = datasets[0]\n",
|
||||
"with jsonlines.open(test_data) as data_reader:\n",
"    True_lat = []\n",
"    Pred_lat = []\n",
"    for i, item in enumerate(data_reader):\n",
"        if i >= 20:  # only show the first 20 results to save space\n",
"            break\n",
"        graph = item[\"graph\"]\n",
"        pred_lat = predictor.predict(graph, model_type=\"nnmeter-ir\")\n",
"        real_lat = item[predictor_name]\n",
"        print(f'[RESULT] {os.path.basename(test_data)}[{i}]: predict: {pred_lat}, real: {real_lat}')\n",
"\n",
"        # only score models that have a recorded latency for this predictor\n",
"        if real_lat is not None:\n",
"            True_lat.append(real_lat)\n",
"            Pred_lat.append(pred_lat)\n",
"\n",
"if len(True_lat) > 0:\n",
"    rmse, rmspe, error, acc5, acc10, _ = nn_meter.latency_metrics(Pred_lat, True_lat)\n",
"    print(\n",
"        f'[SUMMARY] The first 20 cases from {os.path.basename(test_data)} on {predictor_name}: rmse: {rmse}, 5%accuracy: {acc5}, 10%accuracy: {acc10}'\n",
"    )"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
"## Use benchmark dataset for GNN\n",
|
||||
"\n",
"Because the dataset is encoded in a graph format, we also provide two interfaces, `GNNDataset` and `GNNDataloader`, for training a GNN to predict model latency with the benchmark dataset.\n",
"\n",
"`GNNDataset` and `GNNDataloader` in `nn_meter/dataset/gnn_dataloader.py` convert the model structures stored in the `.jsonl` dataset files into the dataset and dataloader that GNN training requires. The output of `GNNDataset` includes the adjacency matrix and attributes of the graph, together with the latency value. The script depends on the `torch` and `dgl` packages.\n",
"\n",
"Here we prepare the dataset for GNN training:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 8,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Processing Training Set.\n",
|
||||
"Processing Testing Set.\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"ERROR:root:You have not install the dgl package, please install dgl and try again.\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"ename": "AttributeError",
|
||||
"evalue": "'NoneType' object has no attribute 'graph'",
|
||||
"output_type": "error",
|
||||
"traceback": [
|
||||
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
|
||||
"\u001b[0;31mAttributeError\u001b[0m Traceback (most recent call last)",
|
||||
"\u001b[0;32m<ipython-input-8-8b8d13c1f2c9>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[1;32m 9\u001b[0m \u001b[0mtest_set\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mgnn_dataloader\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mGNNDataset\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtrain\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mFalse\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mdevice\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mtarget_device\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 10\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 11\u001b[0;31m \u001b[0mtrain_loader\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mgnn_dataloader\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mGNNDataloader\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtrain_set\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mbatchsize\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m1\u001b[0m \u001b[0;34m,\u001b[0m \u001b[0mshuffle\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mTrue\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 12\u001b[0m \u001b[0mtest_loader\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mgnn_dataloader\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mGNNDataloader\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtest_set\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mbatchsize\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mshuffle\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mFalse\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 13\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'Train Dataset Size:'\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mlen\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtrain_set\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
|
||||
"\u001b[0;32m~/anaconda3/envs/nn-meter1.1-test/lib/python3.6/site-packages/nn_meter/dataset/gnn_dataloader.py\u001b[0m in \u001b[0;36m__init__\u001b[0;34m(self, dataset, shuffle, batchsize)\u001b[0m\n\u001b[1;32m 231\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mgraphs\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m{\u001b[0m\u001b[0;34m}\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 232\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlatencies\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m{\u001b[0m\u001b[0;34m}\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 233\u001b[0;31m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mconstruct_graphs\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 234\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 235\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mconstruct_graphs\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
|
||||
"\u001b[0;32m~/anaconda3/envs/nn-meter1.1-test/lib/python3.6/site-packages/nn_meter/dataset/gnn_dataloader.py\u001b[0m in \u001b[0;36mconstruct_graphs\u001b[0;34m(self)\u001b[0m\n\u001b[1;32m 238\u001b[0m \u001b[0;34m(\u001b[0m\u001b[0madj\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mattrs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mlatency\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mop_types\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdataset\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mgid\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 239\u001b[0m \u001b[0mu\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mv\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mnonzero\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0madj\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mas_tuple\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mTrue\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 240\u001b[0;31m \u001b[0mgraph\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mdgl\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mgraph\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mu\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mv\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 241\u001b[0m \u001b[0mMAX_NORM\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtensor\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0mlen\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mop_types\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0;34m[\u001b[0m\u001b[0;36m6963\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m6963\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m224\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m224\u001b[0m\u001b[0;34m,\u001b[0m 
\u001b[0;36m11\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m4\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 242\u001b[0m \u001b[0mattrs\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mattrs\u001b[0m \u001b[0;34m/\u001b[0m \u001b[0mMAX_NORM\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
|
||||
"\u001b[0;31mAttributeError\u001b[0m: 'NoneType' object has no attribute 'graph'"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"import os\n",
|
||||
"from nn_meter.dataset import gnn_dataloader\n",
|
||||
"\n",
|
||||
"target_device = \"cortexA76cpu_tflite21\"\n",
|
||||
"\n",
|
||||
"print(\"Processing Training Set.\")\n",
|
||||
"train_set = gnn_dataloader.GNNDataset(train=True, device=target_device) \n",
|
||||
"print(\"Processing Testing Set.\")\n",
|
||||
"test_set = gnn_dataloader.GNNDataset(train=False, device=target_device)\n",
|
||||
"\n",
|
||||
"train_loader = gnn_dataloader.GNNDataloader(train_set, batchsize=1 , shuffle=True)\n",
|
||||
"test_loader = gnn_dataloader.GNNDataloader(test_set, batchsize=1, shuffle=False)\n",
|
||||
"print('Train Dataset Size:', len(train_set))\n",
|
||||
"print('Testing Dataset Size:', len(test_set))\n",
"# Fetch one batch once and reuse it; each batch is (latencies, batched graph).\n",
"ATTR_COUNT = next(train_loader)[1].ndata['h'].size(1)\n",
"print('Attribute tensor shape:', ATTR_COUNT)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
"Then we build a GNN model based on GraphSAGE, with max pooling selected as our pooling method."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import torch\n",
|
||||
"import torch.nn as nn\n",
|
||||
"from torch.nn.modules.module import Module\n",
|
||||
"import dgl.nn as dglnn\n",
|
||||
"from dgl.nn.pytorch.glob import MaxPooling\n",
|
||||
"\n",
|
||||
"class GNN(Module):\n",
"    def __init__(self,\n",
"                 num_features=0,\n",
"                 num_layers=2,\n",
"                 num_hidden=32,\n",
"                 dropout_ratio=0):\n",
"\n",
"        super(GNN, self).__init__()\n",
"        self.nfeat = num_features\n",
"        self.nlayer = num_layers\n",
"        self.nhid = num_hidden\n",
"        self.dropout_ratio = dropout_ratio\n",
"        # stack of GraphSAGE layers using the 'pool' aggregator\n",
"        self.gc = nn.ModuleList([dglnn.SAGEConv(self.nfeat if i==0 else self.nhid, self.nhid, 'pool') for i in range(self.nlayer)])\n",
"        # layer normalization applied after each graph convolution\n",
"        self.bn = nn.ModuleList([nn.LayerNorm(self.nhid) for i in range(self.nlayer)])\n",
"        self.relu = nn.ModuleList([nn.ReLU() for i in range(self.nlayer)])\n",
"        # graph-level readout: max over node embeddings\n",
"        self.pooling = MaxPooling()\n",
"        self.fc = nn.Linear(self.nhid, 1)\n",
"        self.fc1 = nn.Linear(self.nhid, self.nhid)\n",
"        self.dropout = nn.ModuleList([nn.Dropout(self.dropout_ratio) for i in range(self.nlayer)])\n",
"\n",
"    def forward_single_model(self, g, features):\n",
"        x = self.relu[0](self.bn[0](self.gc[0](g, features)))\n",
"        x = self.dropout[0](x)\n",
"        for i in range(1,self.nlayer):\n",
"            x = self.relu[i](self.bn[i](self.gc[i](g, x)))\n",
"            x = self.dropout[i](x)\n",
"        return x\n",
"\n",
"    def forward(self, g, features):\n",
"        x = self.forward_single_model(g, features)\n",
"        with g.local_scope():\n",
"            g.ndata['h'] = x\n",
"            x = self.pooling(g, x)\n",
"            x = self.fc1(x)\n",
"            return self.fc(x)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Start GNN training:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from torch.optim.lr_scheduler import CosineAnnealingLR\n",
|
||||
"\n",
|
||||
"if torch.cuda.is_available():\n",
"    print(\"Using CUDA.\")\n",
"# device = \"cpu\"\n",
"device = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\")\n",
"\n",
"# Start Training\n",
"model = GNN(ATTR_COUNT, 3, 400, 0.1).to(device)\n",
"opt = torch.optim.AdamW(model.parameters(), lr=4e-4)\n",
"EPOCHS = 20\n",
"loss_func = nn.L1Loss()\n",
"\n",
"lr_scheduler = CosineAnnealingLR(opt, T_max=EPOCHS)\n",
"loss_sum = 0\n",
"for epoch in range(EPOCHS):\n",
"    train_length = len(train_set)\n",
"    train_acc_ten = 0\n",
"    loss_sum = 0\n",
"    # each batch is (latency, graph)\n",
"    for batched_l, batched_g in train_loader:\n",
"        opt.zero_grad()\n",
"        batched_l = batched_l.to(device).float()\n",
"        batched_g = batched_g.to(device)\n",
"        batched_f = batched_g.ndata['h'].float()\n",
"        logits = model(batched_g, batched_f)\n",
"        # count predictions that fall within 10% of the true latency\n",
"        for i in range(len(batched_l)):\n",
"            pred_latency = logits[i].item()\n",
"            true_latency = batched_l[i].item()\n",
"            if (pred_latency >= 0.9 * true_latency) and (pred_latency <= 1.1 * true_latency):\n",
"                train_acc_ten += 1\n",
"        # print(\"true latency: \", batched_l)\n",
"        # print(\"Predict latency: \", logits)\n",
"        batched_l = torch.reshape(batched_l, (-1 ,1))\n",
"        loss = loss_func(logits, batched_l)\n",
"        loss_sum += loss.item()  # accumulate a float so the autograd graph is not kept alive\n",
"        loss.backward()\n",
"        opt.step()\n",
"    # step the scheduler once per epoch, matching T_max=EPOCHS\n",
"    lr_scheduler.step()\n",
"    print(\"[Epoch \", epoch, \"]: \", \"Training accuracy within 10%: \", train_acc_ten / train_length * 100, \" %.\")\n",
"    # print('Learning Rate:', lr_scheduler.get_last_lr())\n",
"    # print('Loss:', loss_sum / train_length)\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"interpreter": {
|
||||
"hash": "2602612169f43f91d25fe52816b7763616055f24dc48b1edca6c7b81a282af45"
|
||||
},
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3.6.10 64-bit ('py36': conda)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.6.10"
|
||||
},
|
||||
"orig_nbformat": 4
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 2
|
||||
}