update the docs for imagenet pnp (#186)

2021-11-29 10:53:15 -08:00 · 2021-11-29 10:53:15 -08:00 · 0b2cfe7dd7
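
As a reading aid for the `_imagenet.py` changes in this commit, the resize, center-crop, and normalization arithmetic of `ImageNetPreProcessing.forward` can be sketched in plain Python. This is a minimal sketch, not the library's implementation: the function names are hypothetical, and the per-axis scale definitions (`target / width`, `target / height`) are an assumption, since the diff only shows that the larger of the two scales is selected.

```python
def resize_scale(img_h, img_w, target=256):
    """Scale factor that brings the shorter image side to `target` pixels.

    Assumption: the per-axis scales are target/width and target/height;
    the diff only shows that the larger of the two is returned.
    """
    scale_x = target / img_w
    scale_y = target / img_h
    return scale_x if scale_x > scale_y else scale_y


def center_crop_offsets(img_h, img_w, size=224):
    """Top-left corner of a centered size x size crop.

    int() truncates toward zero, mirroring rounding_mode='trunc'.
    """
    top = int((img_h - size) / 2)
    left = int((img_w - size) / 2)
    return top, left


def normalize_pixel(rgb, mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)):
    """Scale an 8-bit RGB pixel to [0, 1], then normalize each channel."""
    return tuple((c / 255.0 - m) / s for c, m, s in zip(rgb, mean, std))


# A 500 (height) x 400 (width) image, as in the README example.
scale = resize_scale(500, 400)
top, left = center_crop_offsets(320, 256)  # image size after scaling by 0.64
```

For the 500x400 input used in the README, the shorter side (400) determines the scale, and the crop window is then taken from the middle of the resized image.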
--- a/README.md
+++ b/README.md
@@ -4,11 +4,39 @@
 # Introduction
 ONNXRuntime Extensions is a comprehensive package to extend the capability of the ONNX conversion and inference.
 1. The CustomOp C++ library for [ONNX Runtime](http://onnxruntime.ai) on ONNXRuntime CustomOp API.
-2. Support PyOp feature to implement the custom op with a Python function.
-3. Build all-in-one ONNX model from the pre/post processing code, go to [docs/pre_post_processing.md](https://github.com/microsoft/onnxruntime-extensions/blob/main/docs/pre_post_processing.md) for details.
-4. Support Python per operator debugging, checking ```hook_model_op``` in onnxruntime_extensions Python package.
+2. Integrate the pre/post processing steps into the ONNX model so that it can be executed on all platforms that ONNXRuntime supports. Check [ONNXCompose](onnxruntime_extensions/compose.py) for more details.
+3. Support PyOp feature to implement the custom op with a Python function.
+4. Build an all-in-one ONNX model from the pre/post processing code; see [docs/pre_post_processing.md](https://github.com/microsoft/onnxruntime-extensions/blob/main/docs/pre_post_processing.md) for details.
+5. Support Python per-operator debugging; see ```hook_model_op``` in the onnxruntime_extensions Python package.

 # Quick Start
+### **ImageNet Pre/Post Processing**
+Build a full ONNX model with ImageNet pre/post processing
+```Python
+import onnx
+import torch
+from onnxruntime_extensions import pnp, ONNXCompose
+
+
+mnv2 = onnx.load_model('test/data/mobilev2.onnx')
+full_model = ONNXCompose(
+    mnv2,
+    preprocessors=pnp.PreMobileNet(224),
+    postprocessors=pnp.PostMobileNet())
+
+# need a prediction to get some data info for exporting
+# the image size can be arbitrary, which is 400x500 in this example
+fake_image_input = torch.ones(500, 400, 3).to(torch.uint8)
+full_model.predict(fake_image_input)
+full_model.export(opset_version=11, output_file='temp_exmobilev2.onnx')
+```
+The Python code above translates the ImageNet pre/post processing functions into an all-in-one model that can run inference on every platform ONNXRuntime supports, such as Android and iOS, without a Python runtime or any 3rd-party library dependencies.
+
+Note: On mobile platforms, the ONNXRuntime package may not support all kernels required by the model. To ensure that all the required ONNX operator kernels are built into the ONNXRuntime binaries, please use [ONNX Runtime Mobile Custom Build](https://onnxruntime.ai/docs/tutorials/mobile/custom-build.html).
+
+Here is a [tutorial](tutorials/imagenet_processing.ipynb) for pre/post processing details.
+
+### **GPT-2 Pre/Post Processing**
 The following code shows how to run ONNX model and ONNXRuntime customop more straightforwardly.
 ```python
 import numpy
@@ -24,10 +52,9 @@ output, *_ = gpt2_core(input_ids)
 next_id = numpy.argmax(output[:, :, -1, :], axis=-1)
 print(input_text[0] + decode(next_id).item())
 ```
-This is a simplified version of GPT-2 inference for the demonstration only, The comprehensive solution on the GPT-2 model and its deviants are under development, and here is the [link](https://github.com/microsoft/onnxruntime-extensions/blob/main/tutorials/gpt2bs.py) to the experimental.
+This is a simplified version of GPT-2 inference for demonstration only. The comprehensive solution for the GPT-2 model and its variants is under development; here is the [link](tutorials/gpt2bs.py) to the experimental code.
+

-## Android/iOS
-The previous processing python code can be translated into all-in-one model to be run in Android/iOS mobile platform, without any Python runtime and the 3rd-party dependencies requirement. Here is the [tutorial](https://github.com/microsoft/onnxruntime-extensions/blob/main/tutorials/gpt2bs.py)

 ## CustomOp Conversion
 The mainstream ONNX converters support the custom op generation if there is the operation from the original framework cannot be interpreted as ONNX standard operators. Check the following two examples on how to do this.
--- a/cmake/externals/farmhash/ltmain.sh
+++ b/cmake/externals/farmhash/ltmain.sh
@@ -1196,7 +1196,7 @@ func_enable_tag "$optarg"
      func_fatal_configuration "not configured to build any kind of library"
    fi

-    # Darwin sucks
+    # Darwin
    eval std_shrext=\"$shrext_cmds\"

    # Only execute mode is allowed to have -dlopen flags.
--- a/onnxruntime_extensions/pnp/_base.py
+++ b/onnxruntime_extensions/pnp/_base.py
@@ -18,7 +18,6 @@ class ProcessingModule(torch.nn.Module):
        cls.loaded = True
        return True

-    @classmethod
    def export(self, opset_version, *args):
        return None

--- a/onnxruntime_extensions/pnp/_imagenet.py
+++ b/onnxruntime_extensions/pnp/_imagenet.py
@@ -14,56 +14,30 @@ def _resize_param(img, size):
    return onnx_where(onnx_greater(scale_x, scale_y), scale_x, scale_y)


-class ImagenetPreProcessingLite(ProcessingModule):
-    def __init__(self, size):
-        super(ImagenetPreProcessingLite, self).__init__()
+class ImageNetPreProcessing(ProcessingModule):
+    def __init__(self, size, resize_image=True):
+        super(ImageNetPreProcessing, self).__init__()
        self.target_size = size
+        self.resize_image = resize_image

    def forward(self, img):
        if not isinstance(img, torch.Tensor):
            img = torch.tensor(img)
+        assert img.shape[-1] == 3, 'the input image should be in RGB channels'
        img = torch.permute(img, (2, 0, 1))
-        x = img.to(torch.float32).unsqueeze(0)
-        # T.CenterCrop(224),
-        width, height = tuple(self.target_size)
-        img_h, img_w = x.shape[-2:]
-        s_h = torch.div((img_h - height), 2, rounding_mode='trunc')
-        s_w = torch.div((img_w - width), 2, rounding_mode='trunc')
-        x = x[:, :, s_h:s_h + height, s_w:s_w + width]
-        # T.ToTensor(),
-        x /= 255.  # ToTensor
-        # T.Normalize(
-        #     mean=[0.485, 0.456, 0.406],
-        #     std=[0.229, 0.224, 0.225]
-        # )
-        mean = torch.tensor([0.485, 0.456, 0.406])
-        std = torch.tensor([0.229, 0.224, 0.225])
-        x -= torch.reshape(torch.tensor(mean), (3, 1, 1))
-        x /= torch.reshape(torch.tensor(std), (3, 1, 1))
-        return x
-
-
-class ImagenetPreProcessing(ProcessingModule):
-    def __init__(self, size):
-        super(ImagenetPreProcessing, self).__init__()
-        self.target_size = size
-
-    def forward(self, img):
-        if not isinstance(img, torch.Tensor):
-            img = torch.tensor(img)
-        img = torch.permute(img, (2, 0, 1))
-        # T.Resize(256),
        img = img.to(torch.float32).unsqueeze(0)
-        scale = _resize_param(img, torch.tensor(256))
-        x = interpolate(img, scale_factor=scale,
-                        recompute_scale_factor=True,
-                        mode="bilinear", align_corners=False)
+        # T.Resize(256),
+        if self.resize_image:
+            scale = _resize_param(img, torch.tensor(256))
+            img = interpolate(img, scale_factor=scale,
+                            recompute_scale_factor=True,
+                            mode="bilinear", align_corners=False)
        # T.CenterCrop(224),
        width, height = self.target_size, self.target_size
-        img_h, img_w = x.shape[-2:]
+        img_h, img_w = img.shape[-2:]
        s_h = torch.div((img_h - height), 2, rounding_mode='trunc')
        s_w = torch.div((img_w - width), 2, rounding_mode='trunc')
-        x = x[:, :, s_h:s_h + height, s_w:s_w + width]
+        x = img[:, :, s_h:s_h + height, s_w:s_w + width]
        # T.ToTensor(),
        x /= 255.  # ToTensor
        # T.Normalize(
@@ -74,22 +48,8 @@ class ImagenetPreProcessing(ProcessingModule):
        std = [0.229, 0.224, 0.225]
        x -= torch.reshape(torch.tensor(mean), (3, 1, 1))
        x /= torch.reshape(torch.tensor(std), (3, 1, 1))
-        # x[:, 0, :, :] -= mean[0]
-        # x[:, 1, :, :] -= mean[1]
-        # x[:, 2, :, :] -= mean[2]
-        # x[:, 0, :, :] /= std[0]
-        # x[:, 1, :, :] /= std[1]
-        # x[:, 2, :, :] /= std[2]
        return x

-
-class ImagePostProcessing(ProcessingModule):
-    def forward(self, scores):
-        ProcessingModule.register_customops()
-        probabilities = torch.softmax(scores, dim=1)
-        ids = probabilities.argsort(dim=1, descending=True)
-        return ids, probabilities
-
    def export(self, opset_version, *args):
        with io.BytesIO() as f:
            name_i = 'image'
@@ -101,10 +61,18 @@ class ImagePostProcessing(ProcessingModule):
            return onnx.load_model(io.BytesIO(f.getvalue()))


-class PreMobileNet(ImagenetPreProcessing):
+class ImageNetPostProcessing(ProcessingModule):
+    def forward(self, scores):
+        ProcessingModule.register_customops()
+        probabilities = torch.softmax(scores, dim=1)
+        top10_prob, top10_ids = probabilities.topk(k=10, dim=1, largest=True, sorted=True)
+        return top10_ids, top10_prob
+
+
+class PreMobileNet(ImageNetPreProcessing):
    def __init__(self, size=None):
        super(PreMobileNet, self).__init__(224 if size is None else size)


-class PostMobileNet(ImagePostProcessing):
+class PostMobileNet(ImageNetPostProcessing):
    pass
--- a/tutorials/imagenet_processing.ipynb
+++ b/tutorials/imagenet_processing.ipynb
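
For reference, the post-processing introduced in `ImageNetPostProcessing.forward` above (a softmax over the class scores followed by a sorted top-10 selection, returned as ids first and probabilities second) can be sketched without torch. The helper names `softmax` and `top_k` are hypothetical and operate on a flat list of scores for a single image, whereas the commit's code works on a batched tensor along `dim=1`.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a flat list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(scores, k=10):
    """Return (ids, probabilities) of the k most likely classes, best first."""
    probs = softmax(scores)
    # Rank class indices by descending probability and keep the first k.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return ranked, [probs[i] for i in ranked]


# Three classes, keep the two most likely.
ids, probs = top_k([1.0, 3.0, 2.0], k=2)
print(ids)  # [1, 2]
```

Returning only the top entries instead of a full `argsort` keeps the exported model's output small, which matches the switch from `argsort` to `topk` in this commit.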