add doc for torch converter switch
This commit is contained in:
Parent: cd56e799b1
Commit: 5f8633a4dd

README.md | 39
@@ -35,19 +35,22 @@ Then simply run the following pip install in an environment that has `python >=
pip install .
```

-nn-Meter is a latency predictor of models with type of tensorflow, pytorch, onnx, nn-meter IR graph and [NNI IR graph](https://github.com/microsoft/nni). To use nn-Meter for specific model type, you also need to install corresponding required pacakges. The well tested versions are listed below:
+nn-Meter is a latency predictor for models in Tensorflow, PyTorch, Onnx, nn-meter IR graph and [NNI IR graph](https://github.com/microsoft/nni) formats. To use nn-Meter for a specific model type, you also need to install the corresponding required packages. The well-tested versions are listed below:

-| Testing Model Tpye | Requirments |
+| Testing Model Type | Requirements |
| :-------------------: | :------------------------------------------------: |
| Tensorflow | `tensorflow==1.15.0` |
-| Torch | `torch==1.7.1`, `torchvision==0.8.2`, `onnx==1.9.0`, `onnx-simplifier==0.3.6` |
+| Torch | `torch==1.7.1`, `torchvision==0.8.2`, and either [`onnx==1.9.0`, `onnx-simplifier==0.3.6`] or [`nni==2.4`]$^{[1]}$ |
| Onnx | `onnx==1.9.0` |
| nn-Meter IR graph | --- |
| NNI IR graph | `nni==2.4` |

+$^{[1]}$ Please refer to [nn-Meter Usage](#torch-model-converters) for more information.
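
As a minimal sketch of the two alternative Torch dependency sets listed above (versions as in the table; install one set or the other, not both):

```bash
# Option A: Onnx-based torch converter (the default)
pip install onnx==1.9.0 onnx-simplifier==0.3.6

# Option B: NNI-based torch converter (Python binding only; no command line support)
pip install nni==2.4
```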

Please also check the versions of `numpy` and `scikit_learn`. The different versions may change the prediction accuracy of kernel predictors.

-The stable version of wheel binary pacakge will be released soon.
+The stable version of the wheel binary package will be released soon.
# Usage
@@ -61,13 +64,14 @@ Here is a summary of supported inputs of the two methods.

| Testing Model Type | Command Support | Python Binding |
| :---------------: | :---------------------------------------------------------------------------------: | :-----------------------------------------------------------------------------------------------------------------: |
-| Tensorflow | Checkpoint file dumped by `tf.saved_model()` and endwith `.pb` | Checkpoint file dumped by `tf.saved_model` and endwith `.pb` |
+| Tensorflow | Checkpoint file dumped by `tf.saved_model()` and ending with `.pb` | Checkpoint file dumped by `tf.saved_model` and ending with `.pb` |
| Torch | Models in `torchvision.models` | Object of `torch.nn.Module` |
-| Onnx | Checkpoint file dumped by `torch.onnx.export()` or `onnx.save()` and endwith `.onnx` | Checkpoint file dumped by `onnx.save()` or model loaded by `onnx.load()` |
+| Onnx | Checkpoint file dumped by `torch.onnx.export()` or `onnx.save()` and ending with `.onnx` | Checkpoint file dumped by `onnx.save()` or model loaded by `onnx.load()` |
| nn-Meter IR graph | Json file in the format of [nn-Meter IR Graph](./docs/input_models.md#nnmeter-ir-graph) | `dict` object following the format of [nn-Meter IR Graph](./docs/input_models.md#nnmeter-ir-graph) |
| NNI IR graph | - | NNI IR graph object |

In both methods, users could appoint predictor name and version to target a specific hardware platform (device). Currently, nn-Meter supports prediction on the following four configs:

| Predictor (device_inferenceframework) | Processor Category | Version |
| :-----------------------------------: | :----------------: | :-----: |
| cortexA76cpu_tflite21 | CPU | 1.0 |

@@ -100,11 +104,11 @@ nn-meter --predictor <hardware> [--predictor-version <version>] --torchvision <m
nn-meter --predictor <hardware> [--predictor-version <version>] --nn-meter-ir <json-file_or_folder>
```

-`--predictor-version <version>` arguments is optional. When the predictor version is not specified by users, nn-meter will use the latest verison of the predictor.
+The `--predictor-version <version>` argument is optional. When the predictor version is not specified, nn-meter will use the latest version of the predictor.
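
For illustration, a hedged example of pinning a predictor version (the predictor name and version are taken from the table above; the JSON file name is hypothetical):

```bash
# pin the predictor to version 1.0 instead of the latest
nn-meter --predictor cortexA76cpu_tflite21 --predictor-version 1.0 --nn-meter-ir my_model.json
```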

nn-Meter supports batch mode prediction. To predict latency for multiple models of the same model type at once, users should collect all models in one folder and pass that folder after the `--[model-type]` argument.

-It should also be noted that for PyTorch model, nn-meter can only support existing models in torchvision model zoo. The string followed by `--torchvision` should be exactly one or more string indicating name(s) of some existing torchvision models.
+It should also be noted that for PyTorch models, nn-meter can only support existing models in the torchvision model zoo. The string(s) following `--torchvision` should exactly match the name(s) of existing torchvision models. To apply latency prediction to a torchvision model on the command line, the `onnx` and `onnx-simplifier` packages are required.
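
For example, a hedged command-line sketch (the predictor name comes from the table above; `resnet18` and `mobilenet_v2` are standard torchvision model names):

```bash
# requires the onnx and onnx-simplifier packages in addition to torch and torchvision
nn-meter --predictor cortexA76cpu_tflite21 --torchvision resnet18 mobilenet_v2
```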
### Convert to nn-Meter IR Graph

@@ -137,7 +141,24 @@ lat = predictor.predict(model, model_type) # the resulting latency is in unit of

By calling `load_latency_predictor`, user selects the target hardware and loads the corresponding predictor. nn-Meter will try to find the right predictor file in `~/.nn_meter/data`. If the predictor file doesn't exist, it will download from the Github release.

-In `predictor.predict`, the allowed items of the parameter `model_type` include `["pb", "torch", "onnx", "nnmeter-ir", "nni-ir"]`, representing model types of tensorflow, torch, onnx, nn-meter IR graph and NNI IR graph, respectively.
+In `predictor.predict()`, the allowed items of the parameter `model_type` include `["pb", "torch", "onnx", "nnmeter-ir", "nni-ir"]`, representing model types of tensorflow, torch, onnx, nn-meter IR graph and NNI IR graph, respectively.

+<span id="torch-model-converters"> For Torch models, the shapes of the feature maps cannot be inferred from the network structure alone, yet they are significant parameters in latency prediction. Therefore, a torch model requires the shape of its input tensor as an additional input to `predictor.predict()`; a random tensor of the given shape will be generated and used for inference. Another point for Torch model prediction is that users can either install the `onnx` and `onnx-simplifier` packages (referred to as Onnx-based latency prediction for torch models), or alternatively install the `nni` package (referred to as NNI-based latency prediction for torch models). Note that the `nni` option does not support command line calls. In addition, if users use `nni` for latency prediction, the PyTorch modules should be defined by the `nn` interface from NNI (`import nni.retiarii.nn.pytorch as nn`, view [NNI doc](https://nni.readthedocs.io/en/stable/NAS/QuickStart.html#define-base-model) for more information), and the parameter `apply_nni` should be set to `True` in `predictor.predict()`. Here is an example of NNI-based latency prediction for a Torch model:

+```python
+import nni.retiarii.nn.pytorch as nn
+from nn_meter import load_latency_predictor
+
+predictor = load_latency_predictor(...)
+
+# build your model using nni.retiarii.nn.pytorch as nn
+model = ...
+
+input_shape = (1, 3, 224, 224)
+lat = predictor.predict(model, model_type='torch', input_shape=input_shape, apply_nni=True)
+```

+The Onnx-based latency prediction for torch models is stable but slower, while the NNI-based latency prediction is faster but unstable, as it could fail in some cases. The Onnx-based converter is set as the default for Torch model latency prediction in nn-Meter. Users can choose whichever they prefer according to their needs. </span>
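
For comparison, a minimal sketch of the default Onnx-based path (the predictor name is illustrative; since `apply_nni` defaults to `False`, the argument could also be omitted):

```python
from nn_meter import load_latency_predictor

predictor = load_latency_predictor("cortexA76cpu_tflite21")  # example predictor from the table above

model = ...  # any object of torch.nn.Module

# apply_nni=False selects the Onnx-based converter (the default)
lat = predictor.predict(model, model_type='torch', input_shape=(1, 3, 224, 224), apply_nni=False)
```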

Users could view the information of all built-in predictors by `list_latency_predictors` or view the config file in `nn_meter/configs/predictors.yaml`.
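
A short sketch of that call (assuming it returns an iterable of predictor descriptions, as the table of predictors above suggests):

```python
from nn_meter import list_latency_predictors

for pred in list_latency_predictors():
    print(pred)  # each entry describes one built-in predictor, e.g. cortexA76cpu_tflite21, version 1.0
```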

@@ -1,6 +1,6 @@
# Input Models

-nn-Meter currently supports both a saved model file and the model object in code. In particular, we support saved models in .pb and .onnx formats, and we support to directly predict onnx and pytorch models. Besides, to support the hardware-aware NAS, nn-Meter can also predict the inference latency of models in [NNI graph](https://nni.readthedocs.io/en/stable/nas.html).
+nn-Meter currently supports both a saved model file and the model object in code. In particular, we support saved models in .pb and .onnx formats, and we support directly predicting Onnx and PyTorch models. Besides, to support hardware-aware NAS, nn-Meter can also predict the inference latency of models in [NNI graph](https://nni.readthedocs.io/en/stable/nas.html).

When taking the different model formats as input, nn-Meter converts them into the nn-Meter IR graph. The kernel detection code will split the nn-Meter IR graph into the set of kernel units, and conduct kernel-level prediction.

@@ -13,7 +13,7 @@ You can save tensorflow models into frozen pb formats, and use the following nn-
nn-meter --predictor <hardware> --tensorflow <pb-file>
```

-For the other frameworks (e.g., Pytorch), you can convert the models into onnx models, and use the following nn-meter command to predict the latency:
+For other frameworks (e.g., PyTorch), you can convert the models into onnx models, and use the following nn-meter command to predict the latency:
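
As a hedged sketch of that conversion step for PyTorch (the model and file name are illustrative; `torch.onnx.export` is the standard PyTorch export API):

```python
import torch
import torchvision

# export a torchvision model to an .onnx file that nn-meter can consume
model = torchvision.models.resnet18()
dummy_input = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy_input, "resnet18.onnx")
```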

```bash
# for ONNX (*.onnx) file

@@ -24,7 +24,7 @@ You can download the test [tensorflow models]("https://github.com/Lynazhang/nnme

### Input model as a code object

-You can also directly apply nn-Meter in your python code. In this case, please directly pass the onnx model and Pytorch model objects as the input model. The following is an example in Pytorch code:
+You can also directly apply nn-Meter in your python code. In this case, please directly pass the onnx or PyTorch model object as the input model. The following is an example for PyTorch code:

```python
from nn_meter import load_latency_predictor

@@ -34,9 +34,15 @@ predictor = load_lat_predictor(hardware_name) # case insensitive in backend
# build your model here
model = ... # model is instance of torch.nn.Module

-lat = predictor.predict(model,model_type='torch', input_shape=(3, 224, 224))
+lat = predictor.predict(model, model_type='torch', input_shape=(3, 224, 224), apply_nni=False)
```

+There are two converters (i.e., model processors) for torch models: the Onnx-based torch converter and the NNI-based torch converter. The Onnx-based torch converter exports the torch model to an onnx model, then reloads it and passes it to the onnx converter. The serialization and postprocessing of the Onnx-based torch converter are time-consuming, but the Onnx conversion is more stable.

+The NNI-based torch converter generates an NNI IR graph from the torch model, and uses the NNI converter for the subsequent steps. Note that with the NNI-based converter, the PyTorch modules should be defined by the `nn` interface from NNI (`import nni.retiarii.nn.pytorch as nn`, view [NNI doc](https://nni.readthedocs.io/en/stable/NAS/QuickStart.html#define-base-model) for more information). The NNI-based torch converter has an advantage in speed, but could fail if the model contains operators not supported by NNI.

+One can switch between the two converters by setting the parameter `apply_nni` to `True` or `False` in `predictor.predict()`. The Onnx-based torch converter is the default for torch models. If `apply_nni=True`, the NNI-based torch converter is used instead. Users can choose whichever they prefer according to their needs.
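
A compact sketch of the switch (continuing the snippet above, where `predictor` and `model` are already defined):

```python
# default: Onnx-based converter (apply_nni=False could be omitted)
lat_onnx = predictor.predict(model, model_type='torch', input_shape=(3, 224, 224))

# NNI-based converter: the model must be built with nni.retiarii.nn.pytorch as nn
lat_nni = predictor.predict(model, model_type='torch', input_shape=(3, 224, 224), apply_nni=True)
```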

### <span id="nnmeter-ir-graph"> nn-Meter IR graph </span>

As introduced, nn-Meter will perform a pre-processing step to convert the above model formats into the nn-Meter IR graphs. Now we introduce the defined IR graph.

@@ -14,25 +14,27 @@ Then simply run the following pip install in an environment that has `python >=
pip install .
```

-nn-Meter is a latency predictor of models with type of tensorflow, pytorch, onnx, nn-meter IR graph and [NNI IR graph](https://github.com/microsoft/nni). To use nn-Meter for specific model type, you also need to install corresponding required pacakges. The well tested versions are listed below:
+nn-Meter is a latency predictor for models in Tensorflow, PyTorch, Onnx, nn-meter IR graph and [NNI IR graph](https://github.com/microsoft/nni) formats. To use nn-Meter for a specific model type, you also need to install the corresponding required packages. The well-tested versions are listed below:

-| Testing Model Tpye | Requirments |
+| Testing Model Type | Requirements |
| :-------------------: | :------------------------------------------------: |
| Tensorflow | `tensorflow==1.15.0` |
-| Torch | `torch==1.7.1`, `torchvision==0.8.2`, `onnx==1.9.0`, `onnx-simplifier==0.3.6` |
+| Torch | `torch==1.7.1`, `torchvision==0.8.2`, and either [`onnx==1.9.0`, `onnx-simplifier==0.3.6`] or [`nni==2.4`]$^{[1]}$ |
| Onnx | `onnx==1.9.0` |
| nn-Meter IR graph | --- |
| NNI IR graph | `nni==2.4` |

+$^{[1]}$ Please refer to [nn-Meter Usage](usage.md#torch-model-converters) for more information.

Please also check the versions of `numpy` and `scikit_learn`. The different versions may change the prediction accuracy of kernel predictors.

-The stable version of wheel binary pacakge will be released soon.
+The stable version of the wheel binary package will be released soon.

## "Hello World" example on torch model
nn-Meter is an accurate inference latency predictor for DNN models on diverse edge devices. nn-Meter supports tensorflow pb-file, onnx file, torch model and nni IR model for latency prediction.

-Here is an example script to predict latency for Resnet18 in torch. To run the example, package `torch`, `torchvision` and `onnx` are required. The well tested versions are `torch==1.7.1`, `torchvision==0.8.2`, `onnx==1.9.0` and `onnx-simplifier==0.3.6`.
+Here is an example script to predict latency for Resnet18 in torch. To run the example, the packages `torch`, `torchvision`, `onnx` and `onnx-simplifier` are required. The well-tested versions are `torch==1.7.1`, `torchvision==0.8.2`, `onnx==1.9.0` and `onnx-simplifier==0.3.6`.

```python
from nn_meter import load_latency_predictor

@@ -9,10 +9,10 @@ Here is a summary of supported inputs of the two methods.

| Testing Model Type | Command Support | Python Binding |
| :---------------: | :---------------------------------------------------------------------------------: | :-----------------------------------------------------------------------------------------------------------------: |
-| Tensorflow | Checkpoint file dumped by `tf.saved_model()` and endwith `.pb` | Checkpoint file dumped by `tf.saved_model` and endwith `.pb` |
+| Tensorflow | Checkpoint file dumped by `tf.saved_model()` and ending with `.pb` | Checkpoint file dumped by `tf.saved_model` and ending with `.pb` |
| Torch | Models in `torchvision.models` | Object of `torch.nn.Module` |
-| Onnx | Checkpoint file dumped by `onnx.save()` and endwith `.onnx` | Checkpoint file dumped by `onnx.save()` or model loaded by `onnx.load()` |
-| nn-Meter IR graph | Json file in the format of [nn-Meter IR Graph](./docs/input_models.md#nnmeter-ir-graph) | `dict` object following the format of [nn-Meter IR Graph](./docs/input_models.md#nnmeter-ir-graph) |
+| Onnx | Checkpoint file dumped by `onnx.save()` and ending with `.onnx` | Checkpoint file dumped by `onnx.save()` or model loaded by `onnx.load()` |
+| nn-Meter IR graph | Json file in the format of [nn-Meter IR Graph](input_models.md#nnmeter-ir-graph) | `dict` object following the format of [nn-Meter IR Graph](input_models.md#nnmeter-ir-graph) |
| NNI IR graph | - | NNI IR graph object |

In both methods, users could appoint predictor name and version to target a specific hardware platform (device). Currently, nn-Meter supports prediction on the following four configs:

@@ -48,11 +48,11 @@ nn-meter --predictor <hardware> [--predictor-version <version>] --torchvision <m
nn-meter --predictor <hardware> [--predictor-version <version>] --nn-meter-ir <json-file_or_folder>
```

-`--predictor-version <version>` arguments is optional. When the predictor version is not specified by users, nn-meter will use the latest verison of the predictor.
+The `--predictor-version <version>` argument is optional. When the predictor version is not specified, nn-meter will use the latest version of the predictor.

nn-Meter supports batch mode prediction. To predict latency for multiple models of the same model type at once, users should collect all models in one folder and pass that folder after the `--[model-type]` argument.

-It should also be noted that for PyTorch model, nn-meter can only support existing models in torchvision model zoo. The string followed by `--torchvision` should be exactly one or more string indicating name(s) of some existing torchvision models.
+It should also be noted that for PyTorch models, nn-meter can only support existing models in the torchvision model zoo. The string(s) following `--torchvision` should exactly match the name(s) of existing torchvision models. To apply latency prediction to a torchvision model on the command line, the `onnx` and `onnx-simplifier` packages are required.

### Convert to nn-Meter IR Graph

@@ -85,7 +85,24 @@ lat = predictor.predict(model, model_type) # the resulting latency is in unit of

By calling `load_latency_predictor`, user selects the target hardware and loads the corresponding predictor. nn-Meter will try to find the right predictor file in `~/.nn_meter/data`. If the predictor file doesn't exist, it will download from the Github release.

-In `predictor.predict`, the allowed items of the parameter `model_type` include `["pb", "torch", "onnx", "nnmeter-ir", "nni-ir"]`, representing model types of tensorflow, torch, onnx, nn-meter IR graph and NNI IR graph, respectively.
+In `predictor.predict()`, the allowed items of the parameter `model_type` include `["pb", "torch", "onnx", "nnmeter-ir", "nni-ir"]`, representing model types of tensorflow, torch, onnx, nn-meter IR graph and NNI IR graph, respectively.

+<span id="torch-model-converters"> For Torch models, the shapes of the feature maps cannot be inferred from the network structure alone, yet they are significant parameters in latency prediction. Therefore, a torch model requires the shape of its input tensor as an additional input to `predictor.predict()`; a random tensor of the given shape will be generated and used for inference. Another point for Torch model prediction is that users can either install the `onnx` and `onnx-simplifier` packages (referred to as Onnx-based latency prediction for torch models), or alternatively install the `nni` package (referred to as NNI-based latency prediction for torch models). Note that the `nni` option does not support command line calls. In addition, if users use `nni` for latency prediction, the PyTorch modules should be defined by the `nn` interface from NNI (`import nni.retiarii.nn.pytorch as nn`, view [NNI doc](https://nni.readthedocs.io/en/stable/NAS/QuickStart.html#define-base-model) for more information), and the parameter `apply_nni` should be set to `True` in `predictor.predict()`. Here is an example of NNI-based latency prediction for a Torch model:

+```python
+import nni.retiarii.nn.pytorch as nn
+from nn_meter import load_latency_predictor
+
+predictor = load_latency_predictor(...)
+
+# build your model using nni.retiarii.nn.pytorch as nn
+model = ...
+
+input_shape = (1, 3, 224, 224)
+lat = predictor.predict(model, model_type='torch', input_shape=input_shape, apply_nni=True)
+```

+The Onnx-based latency prediction for torch models is stable but slower, while the NNI-based latency prediction is faster but unstable, as it could fail in some cases. The Onnx-based converter is set as the default for Torch model latency prediction in nn-Meter. Users can choose whichever they prefer according to their needs. </span>

Users could view the information of all built-in predictors by `list_latency_predictors` or view the config file in `nn_meter/configs/predictors.yaml`.