refine package import
This commit is contained in:
Parent: ac83a2bffa
Commit: 72eee00c08
@@ -1,37 +1,37 @@
# Hardware-aware DNN Model Design

In many DNN model deployment scenarios, there are strict inference efficiency constraints in addition to the model accuracy requirements. For example, **inference latency** and **energy consumption** are the most frequently used efficiency criteria for deciding whether a DNN model can be deployed on a mobile phone. DNN model designers therefore have to take model efficiency into account. A typical methodology is to first train a big model that meets the accuracy requirements, and then apply model compression algorithms to obtain a light-weight model with similar accuracy but a much smaller size. Because they are easy to compute, the number of parameters and FLOPs are commonly used as efficiency proxies in the compression process.

However, as pointed out in our work [[1]](https://openaccess.thecvf.com/content_CVPRW_2020/papers/w40/Zhang_Fast_Hardware-Aware_Neural_Architecture_Search_CVPRW_2020_paper.pdf) and many others, ***neither the number of parameters nor the number of FLOPs is a good metric of real inference efficiency (e.g., latency or energy consumption)***. Operators with similar FLOPs may have very different inference latencies on different hardware platforms (e.g., CPU, GPU, and ASIC), as shown in works [[1]](https://openaccess.thecvf.com/content_CVPRW_2020/papers/w40/Zhang_Fast_Hardware-Aware_Neural_Architecture_Search_CVPRW_2020_paper.pdf) and [[3]](https://proceedings.mlsys.org/paper/2021/file/02522a2b2726fb0a03bb19f2d8d9524d-Paper.pdf). This makes designing efficient DNN models for a target hardware platform something of a blind-box game. Recently, many hardware-aware NAS works have been proposed to address this challenge.

Compared with conventional NAS algorithms, some recent works (i.e., hardware-aware NAS, aka HW-NAS) integrate hardware awareness into the search loop and achieve a balanced trade-off between accuracy and hardware efficiency [[4]](http://arxiv.org/abs/2101.09336).

Next, we introduce our hardware-aware NAS framework [[1]](https://openaccess.thecvf.com/content_CVPRW_2020/papers/w40/Zhang_Fast_Hardware-Aware_Neural_Architecture_Search_CVPRW_2020_paper.pdf), which integrates nn-Meter to search for high-accuracy DNN models under the latency constraints of target edge devices.

## Hardware-aware Neural Architecture Search

<img src="imgs/hw-nas.png" alt="drawing" width="800"/>

**Hardware-aware Search Space Generation.** As formulated in many works, the search space is one of the three key aspects of a NAS process (the other two are the search strategy and the evaluation methodology) and strongly influences the final results.

Our HW-NAS framework first automatically selects hardware-friendly operators (or blocks) by considering both representation capacity and hardware efficiency. The selected operators form a ***hardware-aware search space*** that can be plugged into most existing NAS algorithms.

**Latency Prediction in the search process by nn-Meter.** Unlike simpler predictors (e.g., operator- or block-level look-up tables and linear regression models), [nn-Meter](overview.md) conducts kernel-level prediction, which captures the complex model-graph optimizations performed on edge devices. nn-Meter is the first accurate latency prediction tool for DNNs on edge devices.

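As an illustration of how such a predictor is queried during the search, here is a minimal usage sketch of nn-Meter's public API. The predictor name comes from nn-Meter's predictor zoo, and the exact `predict` signature may vary across versions:

```python
import nn_meter
import torchvision.models as models

# List the latency predictors shipped with nn-Meter (one per hardware/framework pair).
for pred in nn_meter.list_latency_predictors():
    print(pred['name'], pred['version'])

# Load the predictor for a target device, e.g., a Cortex-A76 CPU running TFLite 2.1.
predictor = nn_meter.load_latency_predictor("cortexA76cpu_tflite21")

# Predict end-to-end inference latency (in milliseconds) without running the
# model on the device itself.
resnet18 = models.resnet18()
latency_ms = predictor.predict(resnet18, model_type="torch", input_shape=(1, 3, 224, 224))
print(f"Predicted latency: {latency_ms:.2f} ms")
```
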
Besides search space specialization, our HW-NAS framework also allows combining nn-Meter with existing NAS algorithms through the optimization objectives and constraints. As described in [[4]](http://arxiv.org/abs/2101.09336), HW-NAS algorithms often treat hardware efficiency metrics as constraints on the existing NAS formulation or as part of a scalarized loss function (e.g., a loss that is the weighted sum of the cross-entropy loss and a hardware-aware penalty). Since the NAS process may sample up to millions of candidate model architectures, obtaining the hardware metrics must be both accurate and efficient.

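As a concrete instance of such a scalarized objective, the following sketch combines cross-entropy with a hinge-style latency penalty. The budget and weight are illustrative hyperparameters, and `predicted_latency_ms` would come from a predictor such as nn-Meter:

```python
import torch.nn.functional as F

def hardware_aware_loss(logits, targets, predicted_latency_ms,
                        latency_budget_ms=30.0, penalty_weight=0.1):
    """Cross-entropy plus a latency penalty for HW-NAS (illustrative)."""
    ce = F.cross_entropy(logits, targets)
    # Penalize only the portion of the latency that exceeds the budget. The
    # latency here is a plain scalar from a predictor, which suits
    # sampling-based NAS strategies; gradient-based NAS would instead need a
    # differentiable latency term.
    overshoot = max(predicted_latency_ms / latency_budget_ms - 1.0, 0.0)
    return ce + penalty_weight * overshoot
```
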
nn-Meter is now integrated with [NNI](https://github.com/microsoft/nni), the AutoML framework also published by Microsoft, and can be combined with existing NAS algorithms seamlessly. [This doc](https://nni.readthedocs.io/en/stable/NAS/multi_trial_nas.html) shows how to construct a latency constraint filter for the [random search algorithm](https://arxiv.org/abs/1902.07638) on the [SPOS](https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123610528.pdf) search space. The filter can be used in multiple phases of the NAS process, e.g., the architecture search phase and the super-net training phase, as sketched below.

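A simplified sketch of such a filter follows; the class name, the predictor name, and the assumption that candidates arrive as nn-Meter IR graphs are illustrative, and the linked NNI doc shows the exact integration points:

```python
import nn_meter

class LatencyFilter:
    """Reject sampled architectures whose predicted latency exceeds a threshold.

    An instance of this class can be plugged in wherever the NAS framework
    accepts a candidate filter, e.g., before dispatching a trial for training.
    """
    def __init__(self, threshold_ms, predictor_name="cortexA76cpu_tflite21"):
        self.threshold_ms = threshold_ms
        self.predictor = nn_meter.load_latency_predictor(predictor_name)

    def __call__(self, model_ir):
        # `model_ir` is assumed to be an nn-Meter IR graph exported by the
        # NAS framework for each sampled architecture.
        latency_ms = self.predictor.predict(model_ir, model_type="nnmeter-ir")
        return latency_ms < self.threshold_ms
```
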
***Note that the current nn-Meter project is limited to latency prediction. Other hardware metrics, such as energy consumption, are also important in edge computing. Collaborations and contributions to nn-Meter are highly welcome!***

## Other hardware-aware techniques

Besides lightweight NAS, which searches for an efficient architecture directly, there are other techniques for obtaining lightweight DNN models, such as model compression and knowledge distillation (KD). Both methods try to derive a smaller model with similar performance from a pre-trained big model. The difference is that model compression removes some components of the original model, while knowledge distillation constructs a new student model and trains it to mimic the behavior of the original model. Hardware awareness can also be combined with these methods. For example, nn-Meter can help users construct suitable student architectures for the target hardware platform in the KD task, as sketched below.

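A hypothetical helper for that use case: screen a pool of candidate student models against a latency budget before distillation (the function name, the candidate pool, and the budget are illustrative):

```python
import nn_meter

def screen_student_candidates(candidates, predictor_name, budget_ms):
    """Return the candidate students that fit the latency budget.

    `candidates` maps a model name to a torch model; only models whose
    predicted latency is within `budget_ms` are kept for distillation.
    """
    predictor = nn_meter.load_latency_predictor(predictor_name)
    feasible = {}
    for name, model in candidates.items():
        latency_ms = predictor.predict(model, model_type="torch")
        if latency_ms <= budget_ms:
            feasible[name] = latency_ms
    return feasible
```
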
## References

1. Li Lyna Zhang, Yuqing Yang, Yuhang Jiang, Wenwu Zhu, Yunxin Liu: ["Fast Hardware-Aware Neural Architecture Search."](https://openaccess.thecvf.com/content_CVPRW_2020/papers/w40/Zhang_Fast_Hardware-Aware_Neural_Architecture_Search_CVPRW_2020_paper.pdf) Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2020).
2. Li Lyna Zhang, Shihao Han, Jianyu Wei, Ningxin Zheng, Ting Cao, Yuqing Yang, Yunxin Liu: ["nn-Meter: Towards Accurate Latency Prediction of Deep-Learning Model Inference on Diverse Edge Devices."](https://dl.acm.org/doi/10.1145/3458864.3467882) Proceedings of the 19th ACM International Conference on Mobile Systems, Applications, and Services (MobiSys 2021).
3. Xiaohu Tang, Shihao Han, Li Lyna Zhang, Ting Cao, Yunxin Liu: ["To Bridge Neural Network Design and Real-World Performance: A Behaviour Study for Neural Networks."](https://proceedings.mlsys.org/paper/2021/file/02522a2b2726fb0a03bb19f2d8d9524d-Paper.pdf) Proceedings of the 4th MLSys Conference (MLSys 2021).
4. Benmeziane, H., El Maghraoui, K., Ouarnoughi, H., Niar, S., Wistuba, M., & Wang, N. (2021). ["A Comprehensive Survey on Hardware-Aware Neural Architecture Search."](http://arxiv.org/abs/2101.09336) arXiv preprint arXiv:2101.09336.

@@ -7,20 +7,26 @@ try:
 except ModuleNotFoundError:
     __version__ = 'UNKNOWN'
 
-from .nn_meter import (
-    nnMeter,
+import logging
+from functools import partial, partialmethod
+
+from .predictor import (
+    nnMeterPredictor,
     load_latency_predictor,
     list_latency_predictors,
+    latency_metrics
+)
+from .ir_converter import (
     model_file_to_graph,
-    model_to_graph,
+    model_to_graph
+)
+from .utils import (
     create_user_configs,
     change_user_data_folder
 )
-from .utils.utils import download_from_url
-from .predictor import latency_metrics
-from .dataset import bench_dataset # TODO: add GNNDataloader and GNNDataset here @wenxuan
-import logging
-from functools import partial, partialmethod
+from .dataset import bench_dataset
+from .utils import download_from_url
 
 logging.KEYINFO = 22
 logging.addLevelName(logging.KEYINFO, 'KEYINFO')
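For orientation, a sketch of how the refactored top-level API reads after this change. The names are taken from the re-exports in the new `__init__` above; this is an untested illustration, not part of the commit:

```python
# The user-facing names now resolve from the package root instead of the old
# monolithic nn_meter.nn_meter module.
from nn_meter import (
    load_latency_predictor,
    list_latency_predictors,
    latency_metrics,
    model_file_to_graph,
    bench_dataset,
)

print([p['name'] for p in list_latency_predictors()])
```
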
@@ -1,13 +1,12 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
 import os, sys
-from nn_meter.predictor import latency_metrics
+import logging
+import jsonlines
 from glob import glob
 
-from nn_meter.nn_meter import list_latency_predictors, load_latency_predictor, get_user_data_folder
-from nn_meter import download_from_url
-import jsonlines
-import logging
+from nn_meter.predictor import latency_metrics, list_latency_predictors, load_latency_predictor
+from nn_meter.utils import download_from_url, get_user_data_folder
 
 
 __user_dataset_folder__ = os.path.join(get_user_data_folder(), 'dataset')
@@ -1,16 +1,18 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
-import torch
-import jsonlines
 import os
 import random
+import torch
+import jsonlines
 from .bench_dataset import bench_dataset
-from nn_meter.nn_meter import get_user_data_folder
-from nn_meter.utils.utils import try_import_dgl
+from nn_meter.utils import get_user_data_folder
+from nn_meter.utils.import_package import try_import_dgl
 
 
 RAW_DATA_URL = "https://github.com/microsoft/nn-Meter/releases/download/v1.0-data/datasets.zip"
 __user_dataset_folder__ = os.path.join(get_user_data_folder(), 'dataset')
 
 
 hws = [
     "cortexA76cpu_tflite21",
     "adreno640gpu_tflite21",
@@ -2,10 +2,10 @@
 # Licensed under the MIT license.
 import numpy as np
 
-from nn_meter.utils.graph_tool import ModelGraph
 from .frozenpb_parser import FrozenPbParser
 from .shape_inference import ShapeInference
 from .shape_fetcher import ShapeFetcher
+from nn_meter.utils.graph_tool import ModelGraph
 
 
 class FrozenPbConverter:
     def __init__(self, file_name):
@@ -1,12 +1,13 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
-from nn_meter.utils.utils import try_import_tensorflow
-from .protobuf_helper import ProtobufHelper
-from .shape_fetcher import ShapeFetcher
-import copy
 import re
+import copy
 import logging
 
+from .protobuf_helper import ProtobufHelper
+from nn_meter.utils.import_package import try_import_tensorflow
 
 
 logging = logging.getLogger(__name__)
@@ -1,9 +1,8 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
-from nn_meter.utils.utils import try_import_tensorflow
 import numpy as np
 from typing import List
+from nn_meter.utils.utils import try_import_tensorflow
 
 
 class ShapeFetcher:
     def __init__(self, input_graph):
@@ -1,10 +1,10 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
-from .protobuf_helper import ProtobufHelper as ph
 from functools import reduce
 import copy
 import math
 import logging
+from .protobuf_helper import ProtobufHelper as ph
 
 
 logging = logging.getLogger(__name__)
@@ -1,11 +1,11 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
-from nn_meter.utils.utils import try_import_onnx
+import logging
 import networkx as nx
+from itertools import chain
 from .utils import get_tensor_shape
 from .constants import SLICE_TYPE
-from itertools import chain
-import logging
+from nn_meter.utils.import_package import try_import_onnx
 
 
 class OnnxConverter:
@@ -1,7 +1,5 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
 
 
 def get_tensor_shape(tensor):
     shape = []
     for dim in tensor.type.tensor_type.shape.dim:
@@ -1,11 +1,9 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
-from nn_meter.utils.utils import try_import_onnx, try_import_torch, try_import_onnxsim, try_import_nni
 import tempfile
-from nn_meter.ir_converter.onnx_converter import OnnxConverter
+from ..onnx_converter import OnnxConverter
 from .opset_map import nni_attr_map, nni_type_map
+from nn_meter.utils.import_package import try_import_onnx, try_import_torch, try_import_onnxsim, try_import_nni
 
 
 def _nchw_to_nhwc(shapes):
@@ -2,11 +2,12 @@
 # Licensed under the MIT license.
 import json
 import logging
-from nn_meter.utils.utils import try_import_onnx, try_import_torch, try_import_torchvision_models
+from nn_meter.utils.import_package import try_import_onnx, try_import_torch, try_import_torchvision_models
 from .onnx_converter import OnnxConverter
 from .frozenpb_converter import FrozenPbConverter
 from .torch_converter import NNIBasedTorchConverter, OnnxBasedTorchConverter, NNIIRConverter
 
 
 def model_file_to_graph(filename: str, model_type: str, input_shape=(1, 3, 224, 224), apply_nni=False):
     """
     read the given file and convert the model in the file content to nn-Meter IR graph object

@@ -106,10 +107,12 @@ def onnx_model_to_graph(model):
     converter = OnnxConverter(model)
     return converter.convert()
 
+
 def nni_model_to_graph(model):
     converter = NNIIRConverter(model)
     return converter.convert()
 
+
 def torch_model_to_graph(model, input_shape=(1, 3, 224, 224), apply_nni=False):
     torch = try_import_torch()
     args = torch.randn(*input_shape)
@@ -1,3 +1,3 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
-from .detection.detector import KernelDetector
+from .kernel_detector import KernelDetector
@@ -1,2 +0,0 @@
-# Copyright (c) Microsoft Corporation.
-# Licensed under the MIT license.
@@ -3,7 +3,7 @@
 import os
 import json
 from nn_meter.utils.graph_tool import ModelGraph
-from nn_meter.kernel_detector.utils.ir_tools import convert_nodes
+from ..utils.ir_tools import convert_nodes
 
 
 BASE_DIR = os.path.dirname(os.path.abspath(__file__))
@@ -1,11 +1,10 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
-from nn_meter.kernel_detector.rulelib.rule_reader import RuleReader
-from nn_meter.kernel_detector.rulelib.rule_splitter import RuleSplitter
 from nn_meter.utils.graph_tool import ModelGraph
-from nn_meter.kernel_detector.utils.constants import DUMMY_TYPES
-from nn_meter.kernel_detector.utils.ir_tools import convert_nodes
-# import logging
+from .utils.constants import DUMMY_TYPES
+from .utils.ir_tools import convert_nodes
+from .rulelib.rule_reader import RuleReader
+from .rulelib.rule_splitter import RuleSplitter
 
 
 class KernelDetector:
@@ -1,8 +1,8 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
 import json
+from ..fusionlib import get_fusion_unit
 from nn_meter.utils.graph_tool import ModelGraph
-from nn_meter.kernel_detector.fusionlib import get_fusion_unit
 
 
 class RuleReader:
@@ -1,9 +1,9 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
 from .rule_reader import RuleReader
+from ..utils.match_helper import MatchHelper
+from ..utils.fusion_aware_graph import FusionAwareGraph
 from nn_meter.utils.graph_tool import ModelGraph
-from nn_meter.kernel_detector.utils.match_helper import MatchHelper
-from nn_meter.kernel_detector.utils.fusion_aware_graph import FusionAwareGraph
 
 
 class RuleSplitter:
@@ -1,8 +1,8 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
-from nn_meter.utils.graph_tool import ModelGraph
-from .union_find import UF
 import networkx as nx
+from .union_find import UF
+from nn_meter.utils.graph_tool import ModelGraph
 
 
 class FusionAwareGraph:
@@ -5,12 +5,7 @@ import os
 import sys
 import argparse
 import logging
-from nn_meter.nn_meter import *
+from nn_meter import *
 
-__user_config_folder__ = os.path.expanduser('~/.nn_meter/config')
-__user_data_folder__ = os.path.expanduser('~/.nn_meter/data')
-
-__predictors_cfg_filename__ = 'predictors.yaml'
 
 def list_latency_predictors_cli():

@@ -62,6 +57,7 @@ def apply_latency_predictor_cli(args):
     return result
 
+
 def get_nnmeter_ir_cli(args):
     """convert pb file or onnx file to nn-Meter IR graph according to the command line interface arguments
     """
@@ -1,5 +1,4 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
-from .predictors.utils import latency_metrics
+from .prediction.utils import latency_metrics
+from .nn_meter_predictor import nnMeterPredictor, list_latency_predictors, load_latency_predictor
@@ -1,18 +1,18 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
-from glob import glob
-from nn_meter.predictor.predictors.predict_by_kernel import nn_predict
-from nn_meter.kernel_detector import KernelDetector
-from nn_meter.ir_converter import model_file_to_graph, model_to_graph
-from .utils import loading_to_local
-
-import yaml
 import os
+import yaml
 import pkg_resources
 from shutil import copyfile
 from packaging import version
 import logging
 
+from .utils import loading_to_local
+from .prediction.predict_by_kernel import nn_predict
+from nn_meter.kernel_detector import KernelDetector
 from nn_meter.utils import load_config_file, get_user_data_folder
+from nn_meter.ir_converter import model_file_to_graph, model_to_graph
 
 
 __predictors_cfg_filename__ = 'predictors.yaml'
@@ -1,8 +1,8 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
+import logging
 import numpy as np
 from sklearn.metrics import mean_squared_error
-import logging
 
 
 def get_flop(input_channel, output_channel, k, H, W, stride):
@@ -1,9 +1,9 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
 
 import numpy as np
 from sklearn.metrics import mean_squared_error
 
 
 def get_kernel_name(optype):
     """
     for many similar kernels, we use one kernel predictor since their latency difference is negligible,

@@ -34,6 +34,7 @@ def get_kernel_name(optype):
     return optype
 
+
 def get_accuracy(y_pred, y_true, threshold=0.01):
     a = (y_true - y_pred) / y_true
     b = np.where(abs(a) <= threshold)
@@ -7,7 +7,7 @@ from zipfile import ZipFile
 from tqdm import tqdm
 import requests
 import logging
-from nn_meter.utils.utils import download_from_url
+from nn_meter.utils import download_from_url
 
 
 def loading_to_local(pred_info, dir="data/predictorzoo"):
@@ -5,4 +5,5 @@ from config_manager import (
     get_user_data_folder,
     change_user_data_folder,
     load_config_file
 )
+from utils import download_from_url
@@ -1,14 +1,9 @@
-from glob import glob
-from nn_meter.predictor.predictors.predict_by_kernel import nn_predict
-from nn_meter.kernel_detector import KernelDetector
-from nn_meter.ir_converter import model_file_to_graph, model_to_graph
-from nn_meter.predictor.load_predictors import loading_to_local
-
 import yaml
 import os
+import logging
 import pkg_resources
 from shutil import copyfile
-import logging
 
 
 __user_config_folder__ = os.path.expanduser('~/.nn_meter/config')
 __default_user_data_folder__ = os.path.expanduser('~/.nn_meter/data')
@@ -1,7 +1,7 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT license.
-from packaging import version
 import logging
+from packaging import version
 
 
 def try_import_onnx(require_version = ["1.9.0"]):
@@ -20,7 +20,7 @@ def download_from_url(urladdr, ppath):
     if not os.path.isdir(ppath):
         os.makedirs(ppath)
 
-    # logging.keyinfo(f'Download from {urladdr}')
+    logging.keyinfo(f'Download from {urladdr}')
     response = requests.get(urladdr, stream=True)
     total_size_in_bytes = int(response.headers.get("content-length", 0))
     block_size = 2048 # 2 Kibibyte