This commit is contained in:
jiahangxu 2021-11-04 10:36:03 +08:00
Parent ac83a2bffa
Commit 72eee00c08
34 changed files: 110 additions and 115 deletions

View File

@@ -1,37 +1,37 @@
# Hardware-aware DNN Model Design

In many DNN model deployment scenarios, there are strict inference efficiency constraints in addition to the model accuracy requirements. For example, **inference latency** and **energy consumption** are the most frequently used efficiency criteria for deciding whether a DNN model can be deployed on a mobile phone. DNN model designers therefore have to consider model efficiency. A typical methodology is to first train a big model to meet the accuracy requirements, and then apply model compression algorithms to obtain a light-weight model with similar accuracy but a much smaller size. For simplicity, the number of parameters and FLOPs are commonly used as efficiency proxies during compression.

However, as pointed out in our work [[1]](https://openaccess.thecvf.com/content_CVPRW_2020/papers/w40/Zhang_Fast_Hardware-Aware_Neural_Architecture_Search_CVPRW_2020_paper.pdf) and many others, ***neither the number of parameters nor the number of FLOPs is a good metric of real inference efficiency (e.g., latency or energy consumption)***. Operators with similar FLOPs may have very different inference latencies on different hardware platforms (e.g., CPU, GPU, and ASIC), as shown in [[1]](https://openaccess.thecvf.com/content_CVPRW_2020/papers/w40/Zhang_Fast_Hardware-Aware_Neural_Architecture_Search_CVPRW_2020_paper.pdf) and [[3]](https://proceedings.mlsys.org/paper/2021/file/02522a2b2726fb0a03bb19f2d8d9524d-Paper.pdf). This makes designing efficient DNN models for a target hardware platform a bit like opening blind boxes. Recently, many hardware-aware NAS works have been proposed to address this challenge.

Compared with conventional NAS algorithms, these recent works (i.e., hardware-aware NAS, aka HW-NAS) integrate hardware awareness into the search loop and achieve a balanced trade-off between accuracy and hardware efficiency [[4]](http://arxiv.org/abs/2101.09336).

Next, we introduce our hardware-aware NAS framework [[1]](https://openaccess.thecvf.com/content_CVPRW_2020/papers/w40/Zhang_Fast_Hardware-Aware_Neural_Architecture_Search_CVPRW_2020_paper.pdf), which incorporates nn-Meter to search for high-accuracy DNN models under latency constraints on target edge devices.
## Hardware-aware Neural Architecture Search

<img src="imgs/hw-nas.png" alt="drawing" width="800"/>

**Hardware-aware Search Space Generation.** As formulated in many works, the search space is one of the three key aspects of a NAS process (the other two are the search strategy and the evaluation methodology) and matters a lot to the final results.

Our HW-NAS framework first automatically selects hardware-friendly operators (or blocks) by considering both representation capacity and hardware efficiency. The selected operators establish a ***hardware-aware search space*** for most existing NAS algorithms.

**Latency Prediction in the search process by nn-Meter.** Different from other simple predictors (e.g., look-up tables for operators/blocks, or linear regression models), [nn-Meter](overview.md) conducts kernel-level prediction, which captures the complex model-graph optimizations applied on edge devices. nn-Meter is the first accurate latency prediction tool for DNNs on edge devices.
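For example, a candidate architecture's latency can be queried in a few lines. The following is a minimal sketch: `cortexA76cpu_tflite21` is one of the released hardware predictors, and `model_type="torch"` assumes the candidate is a PyTorch module.

```python
import nn_meter
import torchvision.models as models

# Load the kernel-level latency predictor for the target device;
# "cortexA76cpu_tflite21" is one of the released hardware predictors.
predictor = nn_meter.load_latency_predictor("cortexA76cpu_tflite21")

# Predict end-to-end inference latency (in milliseconds) without
# deploying the candidate on a real phone.
candidate = models.mobilenet_v2()
latency_ms = predictor.predict(candidate, model_type="torch")
print(f"Predicted latency: {latency_ms:.2f} ms")
```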
Besides search space specialization, our HW-NAS framework also allows combining nn-Meter with existing NAS algorithms in the optimization objectives and constraints. As described in [[4]](http://arxiv.org/abs/2101.09336), HW-NAS algorithms often treat hardware efficiency metrics as constraints on the existing NAS formulation, or as part of a scalarized loss function (e.g., the loss is a weighted sum of the cross-entropy loss and a hardware-aware penalty). Since the NAS process may sample up to millions of candidate model architectures, obtaining the hardware metrics must be both accurate and efficient.
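As an illustration of the scalarized-loss formulation, the sketch below adds a latency penalty to the task loss; the weight `lam` and the latency budget are hypothetical hyper-parameters, not values fixed by the framework.

```python
import torch.nn.functional as F

def hardware_aware_loss(logits, targets, predicted_latency_ms,
                        budget_ms=30.0, lam=0.1):
    # Task loss: standard cross entropy on the candidate's predictions.
    ce = F.cross_entropy(logits, targets)
    # Hardware-aware penalty: grows linearly once the predicted latency
    # (e.g., from nn-Meter) exceeds the latency budget.
    overshoot = max(predicted_latency_ms - budget_ms, 0.0)
    return ce + lam * overshoot
```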
nn-Meter is now integrated with [NNI](https://github.com/microsoft/nni), the AutoML framework also published by Microsoft, and can be combined with existing NAS algorithms seamlessly. [This doc](https://nni.readthedocs.io/en/stable/NAS/multi_trial_nas.html) shows how to construct a latency constraint filter for the [random search algorithm](https://arxiv.org/abs/1902.07638) on the [SPOS NAS](https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123610528.pdf) search space; a sketch of such a filter follows. Users can apply this filter in multiple phases of the NAS process, e.g., the architecture search phase and the super-net training phase.
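The filter itself is small. Below is a sketch in the spirit of the linked NNI example; the class shape and the `model_type` string are assumptions, so consult the NNI doc for the exact interface.

```python
import nn_meter

class LatencyFilter:
    """Keep only candidates whose predicted latency is under a threshold (a sketch)."""
    def __init__(self, threshold_ms, hardware="cortexA76cpu_tflite21"):
        self.threshold_ms = threshold_ms
        self.predictor = nn_meter.load_latency_predictor(hardware)

    def __call__(self, candidate):
        # "nni-ir" assumes candidates arrive as NNI IR graphs; other
        # model_type values ("torch", "onnx", ...) work the same way.
        latency_ms = self.predictor.predict(candidate, model_type="nni-ir")
        return latency_ms < self.threshold_ms
```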
***Note that the current nn-Meter project is limited to latency prediction. Other hardware metrics, e.g., energy consumption, are also important in edge computing. Collaborations and contributions around nn-Meter are highly welcomed!***

## Other hardware-aware techniques

Besides lightweight NAS, which searches for an efficient architecture directly, there are other techniques for obtaining lightweight DNN models, such as model compression and knowledge distillation (KD). Both try to derive a smaller model with similar performance from a pre-trained big model. The difference is that model compression removes some components of the original model, while knowledge distillation constructs a new student model and lets it learn the behavior of the original model. Hardware awareness can also be combined with these methods.

For example, nn-Meter can help users construct student architectures suitable for the target hardware platform in the KD task.
## References

1. Li Lyna Zhang, Yuqing Yang, Yuhang Jiang, Wenwu Zhu, Yunxin Liu: ["Fast Hardware-Aware Neural Architecture Search."](https://openaccess.thecvf.com/content_CVPRW_2020/papers/w40/Zhang_Fast_Hardware-Aware_Neural_Architecture_Search_CVPRW_2020_paper.pdf) Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2020).
2. Li Lyna Zhang, Shihao Han, Jianyu Wei, Ningxin Zheng, Ting Cao, Yuqing Yang, Yunxin Liu: ["nn-Meter: Towards Accurate Latency Prediction of Deep-Learning Model Inference on Diverse Edge Devices."](https://dl.acm.org/doi/10.1145/3458864.3467882) Proceedings of the 19th ACM International Conference on Mobile Systems, Applications, and Services (MobiSys 2021).
3. Xiaohu Tang, Shihao Han, Li Lyna Zhang, Ting Cao, Yunxin Liu: ["To Bridge Neural Network Design and Real-World Performance: A Behaviour Study for Neural Networks."](https://proceedings.mlsys.org/paper/2021/file/02522a2b2726fb0a03bb19f2d8d9524d-Paper.pdf) Proceedings of the 4th MLSys Conference (MLSys 2021).
4. Hadjer Benmeziane, Kaoutar El Maghraoui, Hamza Ouarnoughi, Smail Niar, Martin Wistuba, Naigang Wang: ["A Comprehensive Survey on Hardware-Aware Neural Architecture Search."](http://arxiv.org/abs/2101.09336) arXiv preprint arXiv:2101.09336 (2021).

View File

@@ -7,20 +7,26 @@ try:
 except ModuleNotFoundError:
     __version__ = 'UNKNOWN'
-from .nn_meter import (
-    nnMeter,
+import logging
+from functools import partial, partialmethod
+from .predictor import (
+    nnMeterPredictor,
     load_latency_predictor,
     list_latency_predictors,
+    latency_metrics
+)
+from .ir_converter import (
     model_file_to_graph,
-    model_to_graph,
+    model_to_graph
+)
+from .utils import (
     create_user_configs,
     change_user_data_folder
 )
-from .utils.utils import download_from_url
-from .predictor import latency_metrics
-from .dataset import bench_dataset
-import logging
-from functools import partial, partialmethod
+from .dataset import bench_dataset
+from .utils import download_from_url
+# TODO: add GNNDataloader and GNNDataset here @wenxuan
 logging.KEYINFO = 22
 logging.addLevelName(logging.KEYINFO, 'KEYINFO')

View File

@@ -1,13 +1,12 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
 import os, sys
-from nn_meter.predictor import latency_metrics
+import logging
+import jsonlines
 from glob import glob
-from nn_meter.nn_meter import list_latency_predictors, load_latency_predictor, get_user_data_folder
-from nn_meter import download_from_url
-import jsonlines
-import logging
+from nn_meter.predictor import latency_metrics, list_latency_predictors, load_latency_predictor
+from nn_meter.utils import download_from_url, get_user_data_folder
 __user_dataset_folder__ = os.path.join(get_user_data_folder(), 'dataset')

View File

@@ -1,16 +1,18 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
-import torch
-import jsonlines
 import os
 import random
+import torch
+import jsonlines
 from .bench_dataset import bench_dataset
-from nn_meter.nn_meter import get_user_data_folder
-from nn_meter.utils.utils import try_import_dgl
+from nn_meter.utils import get_user_data_folder
+from nn_meter.utils.import_package import try_import_dgl
 RAW_DATA_URL = "https://github.com/microsoft/nn-Meter/releases/download/v1.0-data/datasets.zip"
 __user_dataset_folder__ = os.path.join(get_user_data_folder(), 'dataset')
 hws = [
     "cortexA76cpu_tflite21",
     "adreno640gpu_tflite21",

View File

@@ -2,10 +2,10 @@
 # Licensed under the MIT license.
 import numpy as np
-from nn_meter.utils.graph_tool import ModelGraph
 from .frozenpb_parser import FrozenPbParser
 from .shape_inference import ShapeInference
 from .shape_fetcher import ShapeFetcher
+from nn_meter.utils.graph_tool import ModelGraph
 class FrozenPbConverter:
     def __init__(self, file_name):

View File

@@ -1,12 +1,13 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
-from nn_meter.utils.utils import try_import_tensorflow
-from .protobuf_helper import ProtobufHelper
-from .shape_fetcher import ShapeFetcher
-import copy
 import re
+import copy
 import logging
+from .protobuf_helper import ProtobufHelper
+from nn_meter.utils.import_package import try_import_tensorflow
 logging = logging.getLogger(__name__)

View File

@@ -1,9 +1,8 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
-from nn_meter.utils.utils import try_import_tensorflow
 import numpy as np
 from typing import List
+from nn_meter.utils.utils import try_import_tensorflow
 class ShapeFetcher:
     def __init__(self, input_graph):

View File

@@ -1,10 +1,10 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
-from .protobuf_helper import ProtobufHelper as ph
 from functools import reduce
 import copy
 import math
 import logging
+from .protobuf_helper import ProtobufHelper as ph
 logging = logging.getLogger(__name__)

View File

@@ -1,11 +1,11 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
-from nn_meter.utils.utils import try_import_onnx
+import logging
 import networkx as nx
+from itertools import chain
 from .utils import get_tensor_shape
 from .constants import SLICE_TYPE
-from itertools import chain
-import logging
+from nn_meter.utils.import_package import try_import_onnx
 class OnnxConverter:

View File

@@ -1,7 +1,5 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
 def get_tensor_shape(tensor):
     shape = []
     for dim in tensor.type.tensor_type.shape.dim:

View File

@@ -1,11 +1,9 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
-from nn_meter.utils.utils import try_import_onnx, try_import_torch, try_import_onnxsim, try_import_nni
 import tempfile
-from nn_meter.ir_converter.onnx_converter import OnnxConverter
+from ..onnx_converter import OnnxConverter
 from .opset_map import nni_attr_map, nni_type_map
+from nn_meter.utils.import_package import try_import_onnx, try_import_torch, try_import_onnxsim, try_import_nni
 def _nchw_to_nhwc(shapes):

View File

@@ -2,11 +2,12 @@
 # Licensed under the MIT license.
 import json
 import logging
-from nn_meter.utils.utils import try_import_onnx, try_import_torch, try_import_torchvision_models
+from nn_meter.utils.import_package import try_import_onnx, try_import_torch, try_import_torchvision_models
 from .onnx_converter import OnnxConverter
 from .frozenpb_converter import FrozenPbConverter
 from .torch_converter import NNIBasedTorchConverter, OnnxBasedTorchConverter, NNIIRConverter
 def model_file_to_graph(filename: str, model_type: str, input_shape=(1, 3, 224, 224), apply_nni=False):
     """
     read the given file and convert the model in the file content to nn-Meter IR graph object
@@ -106,10 +107,12 @@ def onnx_model_to_graph(model):
     converter = OnnxConverter(model)
     return converter.convert()
 def nni_model_to_graph(model):
     converter = NNIIRConverter(model)
     return converter.convert()
 def torch_model_to_graph(model, input_shape=(1, 3, 224, 224), apply_nni=False):
     torch = try_import_torch()
     args = torch.randn(*input_shape)

View File

@@ -1,3 +1,3 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
-from .detection.detector import KernelDetector
+from .kernel_detector import KernelDetector

View File

@@ -1,2 +0,0 @@
-# Copyright (c) Microsoft Corporation.
-# Licensed under the MIT license.

View File

@@ -3,7 +3,7 @@
 import os
 import json
 from nn_meter.utils.graph_tool import ModelGraph
-from nn_meter.kernel_detector.utils.ir_tools import convert_nodes
+from ..utils.ir_tools import convert_nodes
 BASE_DIR = os.path.dirname(os.path.abspath(__file__))

View File

@@ -1,11 +1,10 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
-from nn_meter.kernel_detector.rulelib.rule_reader import RuleReader
-from nn_meter.kernel_detector.rulelib.rule_splitter import RuleSplitter
 from nn_meter.utils.graph_tool import ModelGraph
-from nn_meter.kernel_detector.utils.constants import DUMMY_TYPES
-from nn_meter.kernel_detector.utils.ir_tools import convert_nodes
-# import logging
+from .utils.constants import DUMMY_TYPES
+from .utils.ir_tools import convert_nodes
+from .rulelib.rule_reader import RuleReader
+from .rulelib.rule_splitter import RuleSplitter
 class KernelDetector:

View File

@@ -1,8 +1,8 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
 import json
+from ..fusionlib import get_fusion_unit
 from nn_meter.utils.graph_tool import ModelGraph
-from nn_meter.kernel_detector.fusionlib import get_fusion_unit
 class RuleReader:

View File

@@ -1,9 +1,9 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
 from .rule_reader import RuleReader
+from ..utils.match_helper import MatchHelper
+from ..utils.fusion_aware_graph import FusionAwareGraph
 from nn_meter.utils.graph_tool import ModelGraph
-from nn_meter.kernel_detector.utils.match_helper import MatchHelper
-from nn_meter.kernel_detector.utils.fusion_aware_graph import FusionAwareGraph
 class RuleSplitter:

View File

@@ -1,8 +1,8 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
-from nn_meter.utils.graph_tool import ModelGraph
-from .union_find import UF
 import networkx as nx
+from .union_find import UF
+from nn_meter.utils.graph_tool import ModelGraph
 class FusionAwareGraph:

View File

@@ -5,12 +5,7 @@ import os
 import sys
 import argparse
 import logging
-from nn_meter.nn_meter import *
-__user_config_folder__ = os.path.expanduser('~/.nn_meter/config')
-__user_data_folder__ = os.path.expanduser('~/.nn_meter/data')
-__predictors_cfg_filename__ = 'predictors.yaml'
+from nn_meter import *
 def list_latency_predictors_cli():
@@ -62,6 +57,7 @@ def apply_latency_predictor_cli(args):
     return result
 def get_nnmeter_ir_cli(args):
     """convert pb file or onnx file to nn-Meter IR graph according to the command line interface arguments
     """

View File

@@ -1,5 +1,4 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
-from .predictors.utils import latency_metrics
+from .prediction.utils import latency_metrics
+from .nn_meter_predictor import nnMeterPredictor, list_latency_predictors, load_latency_predictor

View File

@@ -1,18 +1,18 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
-from glob import glob
-from nn_meter.predictor.predictors.predict_by_kernel import nn_predict
-from nn_meter.kernel_detector import KernelDetector
-from nn_meter.ir_converter import model_file_to_graph, model_to_graph
-from .utils import loading_to_local
-import yaml
 import os
+import yaml
 import pkg_resources
 from shutil import copyfile
 from packaging import version
 import logging
+from .utils import loading_to_local
+from .prediction.predict_by_kernel import nn_predict
+from nn_meter.kernel_detector import KernelDetector
 from nn_meter.utils import load_config_file, get_user_data_folder
+from nn_meter.ir_converter import model_file_to_graph, model_to_graph
 __predictors_cfg_filename__ = 'predictors.yaml'

View File

@@ -1,8 +1,8 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
+import logging
 import numpy as np
 from sklearn.metrics import mean_squared_error
-import logging
 def get_flop(input_channel, output_channel, k, H, W, stride):

View File

@@ -1,9 +1,9 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
 import numpy as np
 from sklearn.metrics import mean_squared_error
 def get_kernel_name(optype):
     """
     for many similar kernels, we use one kernel predictor since their latency difference is negligible,
@@ -34,6 +34,7 @@ def get_kernel_name(optype):
     return optype
 def get_accuracy(y_pred, y_true, threshold=0.01):
     a = (y_true - y_pred) / y_true
     b = np.where(abs(a) <= threshold)

View File

@@ -7,7 +7,7 @@ from zipfile import ZipFile
 from tqdm import tqdm
 import requests
 import logging
-from nn_meter.utils.utils import download_from_url
+from nn_meter.utils import download_from_url
 def loading_to_local(pred_info, dir="data/predictorzoo"):

View File

@@ -5,4 +5,5 @@ from .config_manager import (
     get_user_data_folder,
     change_user_data_folder,
     load_config_file
 )
+from .utils import download_from_url

View File

@@ -1,14 +1,9 @@
-from glob import glob
-from nn_meter.predictor.predictors.predict_by_kernel import nn_predict
-from nn_meter.kernel_detector import KernelDetector
-from nn_meter.ir_converter import model_file_to_graph, model_to_graph
-from nn_meter.predictor.load_predictors import loading_to_local
 import yaml
 import os
+import logging
 import pkg_resources
 from shutil import copyfile
-import logging
 __user_config_folder__ = os.path.expanduser('~/.nn_meter/config')
 __default_user_data_folder__ = os.path.expanduser('~/.nn_meter/data')

View File

@@ -1,7 +1,7 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
-from packaging import version
 import logging
+from packaging import version
 def try_import_onnx(require_version = ["1.9.0"]):

View File

@@ -20,7 +20,7 @@ def download_from_url(urladdr, ppath):
     if not os.path.isdir(ppath):
         os.makedirs(ppath)
-    # logging.keyinfo(f'Download from {urladdr}')
+    logging.keyinfo(f'Download from {urladdr}')
     response = requests.get(urladdr, stream=True)
     total_size_in_bytes = int(response.headers.get("content-length", 0))
     block_size = 2048 # 2 Kibibyte