## Describe your changes

* Add AzureML packaging support:
  * `AzureMLModels`: output models will be registered to the AML workspace Models.
  * `AzureMLData`: output models will be uploaded to the AML workspace Data.
* Support multiple packaging types in one packaging configuration.

## Checklist before requesting a review
- [x] Add unit tests for this change.
- [x] Make sure all tests can pass.
- [x] Update documents if necessary.
- [x] Lint and apply fixes to your code by running `lintrunner -a`
- [x] Is this a user-facing change? If yes, give a description of this
change to be included in the release notes.
* Add 2 new packaging types: `AzureMLModels` and `AzureMLData`. The output models from the Olive pipeline can be automatically registered to AzureML workspace Models or uploaded to AzureML workspace Data.
* Support multiple packaging types in one packaging configuration.

- [ ] Is this PR including examples changes? If yes, please remember to
update [example
documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md)
in a follow-up PR.

## (Optional) Issue link
Xiaoyu 2024-03-21 10:21:47 -07:00, committed by GitHub
Parent 759ee8db84
Commit 257685a1d5
GPG key ID: B5690EEEBB952194
6 changed files: 544 additions and 206 deletions


@ -1,8 +1,7 @@
# Packaging Olive artifacts
## What is Olive Packaging
Olive will output multiple candidate models based on metrics priorities. It also can package output artifacts when the user required. Olive packaging can be used in different scenarios. There is only one packaging type: `Zipfile`.
Olive will output multiple candidate models based on metrics priorities. It can also package output artifacts when the user requires it. Olive packaging can be used in different scenarios. There are 3 packaging types: `Zipfile`, `AzureMLModels` and `AzureMLData`.
### Zipfile
Zipfile packaging will generate a ZIP file which includes 3 folders: `CandidateModels`, `SampleCode` and `ONNXRuntimePackages`, and a `models_rank.json` file in the `output_dir` folder (from Engine Configuration):
@ -18,71 +17,14 @@ Zipfile packaging will generate a ZIP file which includes 3 folders: `CandidateM
* `models_rank.json`: A JSON file containing a list that ranks all output models based on specific metrics across all accelerators.
#### CandidateModels
`CandidateModels` includes k folders where k is the number of output models, with name `BestCandidateModel_1`, `BestCandidateModel_2`, ... and `BestCandidateModel_k`. The order is ranked by metrics priorities. e.g., if you have 3 metrics `metric_1`, `metric_2` and `metric_3` with priority `1`, `2` and `3`. The output models will be sorted firstly by `metric_1`. If the value of `metric_1` of 2 output models are same, they will be sorted by `metric_2`, and followed by next lower priority metric.
`CandidateModels` includes k folders where k is the number of ranked output models, named `BestCandidateModel_1`, `BestCandidateModel_2`, ..., `BestCandidateModel_k`. The order is ranked by metrics priorities, starting from 1. For example, if you have 3 metrics `metric_1`, `metric_2` and `metric_3` with priorities `1`, `2` and `3`, the output models will be sorted first by `metric_1`. If two output models have the same `metric_1` value, they will be sorted by `metric_2`, and then by the next lower-priority metric.
Each `BestCandidateModel` folder will include the model file/folder. The folder also includes a JSON file with the Olive Pass run history configurations applied since the input model, a JSON file with performance metrics, and, if the candidate model is an ONNX model, a JSON file with inference settings for the candidate model.
##### Inference config file
The inference config file is a json file including `execution_provider` and `session_options`. e.g.:
```
{
    "execution_provider": [
        [
            "CPUExecutionProvider",
            {}
        ]
    ],
    "session_options": {
        "execution_mode": 1,
        "graph_optimization_level": 99,
        "extra_session_config": null,
        "inter_op_num_threads": 1,
        "intra_op_num_threads": 64
    }
}
```
#### SampleCode
Olive will only provide sample code for ONNX models. Sample code is provided in 3 programming languages: `C++`, `C#` and `Python`, each with a code snippet showing how to use Olive output artifacts to run inference on the candidate model with the recommended inference configurations.
## How to package Olive artifacts
Olive packaging configuration is configured in `PackagingConfig` in Engine configuration. If not specified, Olive will not package artifacts.
* `PackagingConfig`
* `type [PackagingType]`:
Olive packaging type. Olive will package different artifacts based on `type`.
* `name [str]`:
For `PackagingType.Zipfile` type, Olive will generate a ZIP file with `name` prefix: `<name>.zip`. By default, the output artifacts will be named as `OutputModels.zip`.
* `export_in_mlflow_format [bool]`:
Export model in mlflow format. This is `false` by default.
You can add `PackagingConfig` to Engine configurations. e.g.:
```
"engine": {
    "search_strategy": {
        "execution_order": "joint",
        "search_algorithm": "tpe",
        "search_algorithm_config": {
            "num_samples": 5,
            "seed": 0
        }
    },
    "evaluator": "common_evaluator",
    "host": "local_system",
    "target": "local_system",
    "packaging_config": {
        "type": "Zipfile",
        "name": "OutputModels"
    },
    "clean_cache": true,
    "cache_dir": "cache"
}
```
### Models rank JSON file
#### Models rank JSON file
A file that contains a JSON list for ranked model info across all accelerators, e.g.:
```
[
@ -128,3 +70,150 @@ A file that contains a JSON list for ranked model info across all accelerators,
{"rank": 3, "model_config": <model_config>, "metrics": <metrics>}
]
```
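For illustration, a minimal sketch (assuming the file sits in an `outputs` directory, which is a placeholder path) of picking the top-ranked entry from `models_rank.json`:
```
import json
from pathlib import Path

# hypothetical output directory from the Engine configuration
output_dir = Path("outputs")

with (output_dir / "models_rank.json").open() as f:
    ranked_models = json.load(f)

# entries are ranked starting from 1; pick the best one
best = min(ranked_models, key=lambda entry: entry["rank"])
print(best["model_config"], best["metrics"])
```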
### AzureMLModels
AzureMLModels packaging will register the output models to your Azure Machine Learning workspace. The asset name will be set as `<packaging_config_name>_<accelerator_spec>_<model_rank>`. The order is ranked by metrics priorities, starting from 1. For instance, if the output model is an ONNX model and the packaging config is:
```
{
    "type": "AzureMLModels",
    "name": "olive_output_model",
    "config": {
        "version": "1",
        "description": "description"
    }
}
```
and the accelerator is CPU with `CPUExecutionProvider` as the best execution provider, then the top-ranked model will be registered on AML as `olive_output_model_cpu-cpu_1`.
Olive will also upload the model configuration file, inference config file, metrics file and model info file to Azure ML.
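As a minimal sketch (the workspace details and asset name below are placeholders), the registered model can later be pulled back with the `azure-ai-ml` SDK:
```
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

# placeholder workspace details
ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<workspace-name>",
)

# download the top-ranked registered model, e.g. olive_output_model_cpu-cpu_1
ml_client.models.download(
    name="olive_output_model_cpu-cpu_1",
    version="1",
    download_path="./downloaded_model",
)
```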
### AzureMLData
AzureMLData packaging will upload the output models to your Azure Machine Learning workspace as Data assets. The asset name will be set as `<packaging_config_name>_<accelerator_spec>_<model_rank>`. The order is ranked by metrics priorities, starting from 1. For instance, if the output model is an ONNX model and the packaging config is:
```
{
    "type": "AzureMLData",
    "name": "olive_output_model",
    "config": {
        "version": "1",
        "description": "description"
    }
}
```
and the accelerator is CPU with `CPUExecutionProvider` as the best execution provider, then the top-ranked model Data asset on AML will be named `olive_output_model_cpu-cpu_1`.
Olive will also upload the model configuration file, inference config file, metrics file and model info file to Azure ML.
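Similarly, a minimal sketch (placeholder workspace details and asset name) of looking up the uploaded Data asset:
```
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

# placeholder workspace details
ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<workspace-name>",
)

# look up the top-ranked Data asset, e.g. olive_output_model_cpu-cpu_1
data_asset = ml_client.data.get(name="olive_output_model_cpu-cpu_1", version="1")
print(data_asset.path)  # datastore URI of the uploaded model file/folder
```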
## How to package Olive artifacts
Olive packaging configuration is configured in `PackagingConfig` in the Engine configuration. `PackagingConfig` can be a single packaging configuration. Alternatively, if you want to apply multiple packaging types, you can define a list of packaging configurations.
If not specified, Olive will not package artifacts.
* `PackagingConfig`
* `type [PackagingType]`:
Olive packaging type. Olive will package different artifacts based on `type`.
* `name [str]`:
For `PackagingType.Zipfile` type, Olive will generate a ZIP file with `name` prefix: `<name>.zip`.
For `PackagingType.AzureMLModels` and `PackagingType.AzureMLData`, Olive will use this `name` for Azure ML resource.
The default value is `OutputModels`.
* `config [dict]`:
The packaging config.
* `Zipfile`
* `export_in_mlflow_format [bool]`:
Export model in mlflow format. This is `false` by default.
* `AzureMLModels`
* `export_in_mlflow_format [bool]`:
Export model in mlflow format. This is `false` by default.
* `version [int | str]`:
The version for this model registration. This is `1` by default.
* `description [str]`:
The description for this model registration. This is `None` by default.
* `AzureMLData`
* `export_in_mlflow_format [bool]`:
Export model in mlflow format. This is `false` by default.
* `version [int | str]`:
The version for this data asset. This is `1` by default.
* `description [str]`:
The description for this data asset. This is `None` by default.
You can add `PackagingConfig` to Engine configurations. e.g.:
```
"engine": {
    "search_strategy": {
        "execution_order": "joint",
        "search_algorithm": "tpe",
        "search_algorithm_config": {
            "num_samples": 5,
            "seed": 0
        }
    },
    "evaluator": "common_evaluator",
    "host": "local_system",
    "target": "local_system",
    "packaging_config": [
        {
            "type": "Zipfile",
            "name": "OutputModels"
        },
        {
            "type": "AzureMLModels",
            "name": "OutputModels"
        },
        {
            "type": "AzureMLData",
            "name": "OutputModels"
        }
    ],
    "clean_cache": true,
    "cache_dir": "cache"
}
```
## Packaged files
### Inference config file
The inference config file is a JSON file including `execution_provider` and `session_options`, e.g.:
```
{
    "execution_provider": [
        [
            "CPUExecutionProvider",
            {}
        ]
    ],
    "session_options": {
        "execution_mode": 1,
        "graph_optimization_level": 99,
        "extra_session_config": null,
        "inter_op_num_threads": 1,
        "intra_op_num_threads": 64
    }
}
```
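As a rough sketch (the model path and file names are assumptions; the packaged ONNX file is expected to be `model.onnx` in the candidate model folder), this config can be fed into an ONNX Runtime session like so:
```
import json

import onnxruntime as ort

# assumed location inside a candidate model folder
with open("inference_config.json") as f:
    cfg = json.load(f)

session_options = ort.SessionOptions()
so = cfg.get("session_options", {})
if so.get("execution_mode") is not None:
    session_options.execution_mode = ort.ExecutionMode(so["execution_mode"])
if so.get("graph_optimization_level") is not None:
    session_options.graph_optimization_level = ort.GraphOptimizationLevel(so["graph_optimization_level"])
if so.get("inter_op_num_threads") is not None:
    session_options.inter_op_num_threads = so["inter_op_num_threads"]
if so.get("intra_op_num_threads") is not None:
    session_options.intra_op_num_threads = so["intra_op_num_threads"]

# each execution_provider entry is a [name, options] pair
providers = [ep for ep, _ in cfg.get("execution_provider", [])]
provider_options = [opts for _, opts in cfg.get("execution_provider", [])]

session = ort.InferenceSession(
    "model.onnx",
    sess_options=session_options,
    providers=providers,
    provider_options=provider_options,
)
```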
### Model configuration file
The model configuration file is a JSON file containing the history of Passes applied to produce the output model, e.g.:
```
{
    "53fc6781998a4624b61959bb064622ce": null,
    "0_OnnxConversion-53fc6781998a4624b61959bb064622ce-7a320d6d630bced3548f242238392730": {
        ...
    },
    "1_OrtTransformersOptimization-0-c499e39e42693aaab050820afd31e0c3-cpu-cpu": {
        ...
    },
    "2_OnnxQuantization-1-1431c563dcfda9c9c3bf26c5d61ef58e": {
        ...
    },
    "3_OrtPerfTuning-2-a843d77ae4964c04e145b83567fb5b05-cpu-cpu": {
        ...
    }
}
```
### Metrics file
The metrics file is a JSON file containing both the input model metrics and the output model metrics.
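For example, a small sketch (the candidate model folder path is a hypothetical Zipfile packaging layout) of reading it:
```
import json
from pathlib import Path

# hypothetical candidate model folder from Zipfile packaging
model_dir = Path("CandidateModels/cpu-cpu/BestCandidateModel_1")

metrics = json.loads((model_dir / "metrics.json").read_text())
print("input model metrics:    ", metrics["input_model_metrics"])
print("candidate model metrics:", metrics["candidate_model_metrics"])
```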


@ -226,7 +226,7 @@ class Engine:
input_model_config: ModelConfig,
accelerator_specs: List["AcceleratorSpec"],
data_root: str = None,
packaging_config: Optional["PackagingConfig"] = None,
packaging_config: Optional[Union["PackagingConfig", List["PackagingConfig"]]] = None,
output_dir: str = None,
output_name: str = None,
evaluate_input_model: bool = True,
@ -297,6 +297,7 @@ class Engine:
self.footprints,
outputs,
output_dir,
self.azureml_client_config,
)
else:
logger.info("No packaging config provided, skip packaging artifacts")


@ -3,14 +3,43 @@
# Licensed under the MIT License.
# --------------------------------------------------------------------------
from enum import Enum
from typing import Optional, Union
from olive.common.config_utils import ConfigBase
from olive.common.config_utils import ConfigBase, validate_config
from olive.common.pydantic_v1 import validator
class PackagingType(str, Enum):
"""Output Artifacts type."""
Zipfile = "Zipfile"
AzureMLModels = "AzureMLModels"
AzureMLData = "AzureMLData"
class CommonPackagingConfig(ConfigBase):
export_in_mlflow_format: bool = False
class ZipfilePackagingConfig(CommonPackagingConfig):
pass
class AzureMLDataPackagingConfig(CommonPackagingConfig):
version: Union[int, str] = "1"
description: Optional[str] = None
class AzureMLModelsPackagingConfig(CommonPackagingConfig):
version: Union[int, str] = "1"
description: Optional[str] = None
_type_to_config = {
PackagingType.Zipfile: ZipfilePackagingConfig,
PackagingType.AzureMLModels: AzureMLModelsPackagingConfig,
PackagingType.AzureMLData: AzureMLDataPackagingConfig,
}
class PackagingConfig(ConfigBase):
@ -18,4 +47,10 @@ class PackagingConfig(ConfigBase):
type: PackagingType = PackagingType.Zipfile
name: str = "OutputModels"
export_in_mlflow_format: bool = False
config: CommonPackagingConfig = None
@validator("config", pre=True, always=True)
def _validate_config(cls, v, values):
packaging_type = values.get("type")
config_class = _type_to_config.get(packaging_type)
return validate_config(v, config_class)
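A rough usage sketch of how this validator is expected to behave (my own example, not part of the diff): a raw `config` dict is parsed into the config class matching `type`:
```
from olive.engine.packaging.packaging_config import (
    AzureMLModelsPackagingConfig,
    PackagingConfig,
    PackagingType,
)

# the validator should route the raw dict to AzureMLModelsPackagingConfig
packaging_config = PackagingConfig(
    type=PackagingType.AzureMLModels,
    name="olive_output_model",
    config={"version": "1", "description": "description"},
)
assert isinstance(packaging_config.config, AzureMLModelsPackagingConfig)
```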


@ -2,7 +2,6 @@
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License.
# --------------------------------------------------------------------------
import itertools
import json
import logging
import platform
@ -13,18 +12,24 @@ import urllib.request
from collections import OrderedDict
from pathlib import Path
from string import Template
from typing import TYPE_CHECKING, Dict, List
from typing import TYPE_CHECKING, Dict, List, Union
import pkg_resources
from olive.common.utils import copy_dir, run_subprocess
from olive.engine.packaging.packaging_config import PackagingConfig, PackagingType
from olive.common.utils import copy_dir, retry_func, run_subprocess
from olive.engine.packaging.packaging_config import (
AzureMLDataPackagingConfig,
AzureMLModelsPackagingConfig,
PackagingConfig,
PackagingType,
)
from olive.model import ONNXModelHandler
from olive.resource_path import ResourceType, create_resource_path
from olive.systems.utils import get_package_name_from_ep
if TYPE_CHECKING:
from olive.engine.footprint import Footprint
from olive.azureml.azureml_client import AzureMLClientConfig
from olive.engine.footprint import Footprint, FootprintNode
from olive.hardware import AcceleratorSpec
logger = logging.getLogger(__name__)
@ -33,167 +38,254 @@ logger = logging.getLogger(__name__)
def generate_output_artifacts(
packaging_config: PackagingConfig,
packaging_configs: Union[PackagingConfig, List[PackagingConfig]],
footprints: Dict["AcceleratorSpec", "Footprint"],
pf_footprints: Dict["AcceleratorSpec", "Footprint"],
output_dir: Path,
azureml_client_config: "AzureMLClientConfig" = None,
):
if sum(len(f.nodes) if f.nodes else 0 for f in pf_footprints.values()) == 0:
logger.warning("No model is selected. Skip packaging output artifacts.")
return
if packaging_config.type == PackagingType.Zipfile:
_generate_zipfile_output(packaging_config, footprints, pf_footprints, output_dir)
packaging_config_list = packaging_configs if isinstance(packaging_configs, list) else [packaging_configs]
for packaging_config in packaging_config_list:
_package_candidate_models(packaging_config, output_dir, footprints, pf_footprints, azureml_client_config)
def _generate_zipfile_output(
def _package_candidate_models(
packaging_config: PackagingConfig,
output_dir: Path,
footprints: Dict["AcceleratorSpec", "Footprint"],
pf_footprints: Dict["AcceleratorSpec", "Footprint"],
output_dir: Path,
) -> None:
logger.info("Packaging Zipfile output artifacts")
cur_path = Path(__file__).parent
azureml_client_config: "AzureMLClientConfig" = None,
):
packaging_type = packaging_config.type
output_name = packaging_config.name
config = packaging_config.config
export_in_mlflow_format = config.export_in_mlflow_format
logger.info("Packaging output models to %s", packaging_type)
with tempfile.TemporaryDirectory() as temp_dir:
tempdir = Path(temp_dir)
_package_sample_code(cur_path, tempdir)
_package_models_rank(tempdir, pf_footprints)
if packaging_type == PackagingType.Zipfile:
cur_path = Path(__file__).parent
_package_sample_code(cur_path, tempdir)
_package_onnxruntime_packages(tempdir, next(iter(pf_footprints.values())))
for accelerator_spec, pf_footprint in pf_footprints.items():
if pf_footprint.nodes and footprints[accelerator_spec].nodes:
_package_candidate_models(
tempdir,
footprints[accelerator_spec],
pf_footprint,
accelerator_spec,
packaging_config.export_in_mlflow_format,
)
_package_onnxruntime_packages(tempdir, next(iter(pf_footprints.values())))
shutil.make_archive(packaging_config.name, "zip", tempdir)
package_file = f"{packaging_config.name}.zip"
shutil.move(package_file, output_dir / package_file)
footprint = footprints[accelerator_spec]
if pf_footprint.nodes and footprint.nodes:
model_rank = 1
input_node = footprint.get_input_node()
for model_id, node in pf_footprint.nodes.items():
model_name = f"{output_name}_{accelerator_spec}_{model_rank}"
if packaging_type == PackagingType.Zipfile:
model_dir = (
tempdir / "CandidateModels" / str(accelerator_spec) / f"BestCandidateModel_{model_rank}"
)
else:
model_dir = tempdir / model_name
model_dir.mkdir(parents=True, exist_ok=True)
# Copy inference config
inference_config_path = model_dir / "inference_config.json"
inference_config = pf_footprint.get_model_inference_config(model_id) or {}
_copy_inference_config(inference_config_path, inference_config)
_copy_configurations(model_dir, footprint, model_id)
_copy_metrics(model_dir, input_node, node)
_save_model(pf_footprint, model_id, model_dir, inference_config, export_in_mlflow_format)
model_info_list = []
model_info = _get_model_info(node, model_rank)
model_info_list.append(model_info)
_copy_model_info(model_dir, model_info)
if packaging_type == PackagingType.AzureMLModels:
_upload_to_azureml_models(azureml_client_config, model_dir, model_name, config)
elif packaging_type == PackagingType.AzureMLData:
_upload_to_azureml_data(azureml_client_config, model_dir, model_name, config)
model_rank += 1
if packaging_type == PackagingType.Zipfile:
_copy_models_rank(tempdir, model_info_list)
_package_zipfile_model(output_dir, output_name, tempdir)
def _package_models_rank(tempdir, footprints: Dict["AcceleratorSpec", "Footprint"]):
metrics_dict = next(iter(footprints.values())).objective_dict
sorted_nodes = sorted(
itertools.chain.from_iterable(f.nodes.values() for f in footprints.values()),
key=lambda x: tuple(
x.metrics.value[metric].value if x.metrics.cmp_direction[metric] == 1 else -x.metrics.value[metric].value
for metric in metrics_dict
),
reverse=True,
def _upload_to_azureml_models(
azureml_client_config: "AzureMLClientConfig",
model_path: Path,
model_name: str,
config: AzureMLModelsPackagingConfig,
):
"""Upload model to AzureML workspace Models."""
from azure.ai.ml.constants import AssetTypes
from azure.ai.ml.entities import Model
from azure.core.exceptions import ServiceResponseError
ml_client = azureml_client_config.create_client()
model = Model(
path=model_path,
type=AssetTypes.MLFLOW_MODEL if config.export_in_mlflow_format else AssetTypes.CUSTOM_MODEL,
name=model_name,
version=str(config.version),
description=config.description,
)
rank = 1
model_info_list = []
for node in sorted_nodes:
model_info = {
"rank": rank,
"model_config": node.model_config,
"metrics": node.metrics.value.to_json() if node.metrics else None,
}
model_info_list.append(model_info)
rank += 1
retry_func(
ml_client.models.create_or_update,
[model],
max_tries=azureml_client_config.max_operation_retries,
delay=azureml_client_config.operation_retry_interval,
exceptions=ServiceResponseError,
)
def _upload_to_azureml_data(
azureml_client_config: "AzureMLClientConfig", model_path: Path, model_name: str, config: AzureMLDataPackagingConfig
):
"""Upload model as Data to AzureML workspace Data."""
from azure.ai.ml.constants import AssetTypes
from azure.ai.ml.entities import Data
from azure.core.exceptions import ServiceResponseError
ml_client = azureml_client_config.create_client()
data = Data(
path=str(model_path),
type=AssetTypes.URI_FILE if model_path.is_file() else AssetTypes.URI_FOLDER,
description=config.description,
name=model_name,
version=str(config.version),
)
retry_func(
ml_client.data.create_or_update,
[data],
max_tries=azureml_client_config.max_operation_retries,
delay=azureml_client_config.operation_retry_interval,
exceptions=ServiceResponseError,
)
def _get_model_info(node: "FootprintNode", model_rank: int):
return {
"rank": model_rank,
"model_config": node.model_config,
"metrics": node.metrics.value.to_json() if node.metrics else None,
}
def _copy_models_rank(tempdir: Path, model_info_list: List[Dict]):
with (tempdir / "models_rank.json").open("w") as f:
f.write(json.dumps(model_info_list))
def _package_sample_code(cur_path, tempdir):
def _package_sample_code(cur_path: Path, tempdir: Path):
copy_dir(cur_path / "sample_code", tempdir / "SampleCode")
def _package_candidate_models(
tempdir,
footprint: "Footprint",
def _package_zipfile_model(output_dir: Path, output_name: str, model_dir: Path):
shutil.make_archive(output_name, "zip", model_dir)
package_file = f"{output_name}.zip"
shutil.move(package_file, output_dir / package_file)
def _copy_model_info(model_dir: Path, model_info: Dict):
model_info_path = model_dir / "model_info.json"
with model_info_path.open("w") as f:
json.dump(model_info, f, indent=4)
def _copy_inference_config(path: Path, inference_config: Dict):
with path.open("w") as f:
json.dump(inference_config, f, indent=4)
def _copy_configurations(model_dir: Path, footprint: "Footprint", model_id: str):
configuration_path = model_dir / "configurations.json"
with configuration_path.open("w") as f:
json.dump(OrderedDict(reversed(footprint.trace_back_run_history(model_id).items())), f, indent=4)
# TODO(xiaoyu): Add target info to metrics file
def _copy_metrics(model_dir: Path, input_node: "FootprintNode", node: "FootprintNode"):
metric_path = model_dir / "metrics.json"
if node.metrics:
with metric_path.open("w") as f:
metrics = {
"input_model_metrics": input_node.metrics.value.to_json() if input_node.metrics else None,
"candidate_model_metrics": node.metrics.value.to_json(),
}
json.dump(metrics, f, indent=4)
def _save_model(
pf_footprint: "Footprint",
accelerator_spec: "AcceleratorSpec",
export_in_mlflow_format=False,
) -> None:
candidate_models_dir = tempdir / "CandidateModels"
model_rank = 1
input_node = footprint.get_input_node()
for model_id, node in pf_footprint.nodes.items():
model_dir = candidate_models_dir / str(accelerator_spec) / f"BestCandidateModel_{model_rank}"
model_dir.mkdir(parents=True, exist_ok=True)
model_rank += 1
model_id: str,
saved_model_path: Path,
inference_config: Dict,
export_in_mlflow_format: bool,
):
model_path = pf_footprint.get_model_path(model_id)
model_resource_path = create_resource_path(model_path) if model_path else None
model_type = pf_footprint.get_model_type(model_id)
# Copy inference config
inference_config_path = model_dir / "inference_config.json"
inference_config = pf_footprint.get_model_inference_config(model_id) or {}
# Add use_ort_extensions to inference config if needed
use_ort_extensions = pf_footprint.get_use_ort_extensions(model_id)
if use_ort_extensions:
inference_config["use_ort_extensions"] = True
with inference_config_path.open("w") as f:
json.dump(inference_config, f)
# Copy model file
model_path = pf_footprint.get_model_path(model_id)
model_resource_path = create_resource_path(model_path) if model_path else None
model_type = pf_footprint.get_model_type(model_id)
if model_type.lower() == "onnxmodel":
with tempfile.TemporaryDirectory(dir=model_dir, prefix="olive_tmp") as model_tempdir:
# save to model_tempdir first since model_path may be a folder
temp_resource_path = create_resource_path(model_resource_path.save_to_dir(model_tempdir, "model", True))
# save to model_dir
if temp_resource_path.type == ResourceType.LocalFile:
# if model_path is a file, rename it to model_dir / model.onnx
Path(temp_resource_path.get_path()).rename(model_dir / "model.onnx")
elif temp_resource_path.type == ResourceType.LocalFolder:
# if model_path is a folder, save all files in the folder to model_dir / file_name
# file_name for .onnx file is model.onnx, otherwise keep the original file name
model_config = pf_footprint.get_model_config(model_id)
onnx_file_name = model_config.get("onnx_file_name")
onnx_model = ONNXModelHandler(temp_resource_path, onnx_file_name)
model_name = Path(onnx_model.model_path).name
for file in Path(temp_resource_path.get_path()).iterdir():
if file.name == model_name:
file_name = "model.onnx"
else:
file_name = file.name
Path(file).rename(model_dir / file_name)
if export_in_mlflow_format:
try:
import mlflow
except ImportError:
raise ImportError("Exporting model in MLflow format requires mlflow>=2.4.0") from None
from packaging.version import Version
if Version(mlflow.__version__) < Version("2.4.0"):
logger.warning(
"Exporting model in MLflow format requires mlflow>=2.4.0. "
"Skip exporting model in MLflow format."
)
if model_type.lower() == "onnxmodel":
with tempfile.TemporaryDirectory(dir=saved_model_path, prefix="olive_tmp") as model_tempdir:
# save to model_tempdir first since model_path may be a folder
temp_resource_path = create_resource_path(model_resource_path.save_to_dir(model_tempdir, "model", True))
# save to model_dir
if temp_resource_path.type == ResourceType.LocalFile:
# if model_path is a file, rename it to model_dir / model.onnx
Path(temp_resource_path.get_path()).rename(saved_model_path / "model.onnx")
elif temp_resource_path.type == ResourceType.LocalFolder:
# if model_path is a folder, save all files in the folder to model_dir / file_name
# file_name for .onnx file is model.onnx, otherwise keep the original file name
model_config = pf_footprint.get_model_config(model_id)
onnx_file_name = model_config.get("onnx_file_name")
onnx_model = ONNXModelHandler(temp_resource_path, onnx_file_name)
model_name = Path(onnx_model.model_path).name
for file in Path(temp_resource_path.get_path()).iterdir():
if file.name == model_name:
file_name = "model.onnx"
else:
_generate_onnx_mlflow_model(model_dir, inference_config)
elif model_type.lower() == "openvinomodel":
model_resource_path.save_to_dir(model_dir, "model", True)
else:
raise ValueError(
f"Unsupported model type: {model_type} for packaging,"
" you can set `packaging_config` as None to mitigate this issue."
file_name = file.name
Path(file).rename(saved_model_path / file_name)
if export_in_mlflow_format:
_generate_onnx_mlflow_model(saved_model_path, inference_config)
return saved_model_path / "mlflow_model"
return (
saved_model_path
if model_resource_path.type == ResourceType.LocalFolder
else saved_model_path / "model.onnx"
)
# Copy Passes configurations
configuration_path = model_dir / "configurations.json"
with configuration_path.open("w") as f:
json.dump(OrderedDict(reversed(footprint.trace_back_run_history(model_id).items())), f)
# Copy metrics
# TODO(xiaoyu): Add target info to metrics file
if node.metrics:
metric_path = model_dir / "metrics.json"
with metric_path.open("w") as f:
metrics = {
"input_model_metrics": input_node.metrics.value.to_json() if input_node.metrics else None,
"candidate_model_metrics": node.metrics.value.to_json(),
}
json.dump(metrics, f, indent=4)
elif model_type.lower() == "openvinomodel":
model_resource_path.save_to_dir(saved_model_path, "model", True)
return saved_model_path
else:
raise ValueError(
f"Unsupported model type: {model_type} for packaging,"
" you can set `packaging_config` as None to mitigate this issue."
)
def _generate_onnx_mlflow_model(model_dir, inference_config):
import mlflow
def _generate_onnx_mlflow_model(model_dir: Path, inference_config: Dict):
try:
import mlflow
except ImportError:
raise ImportError("Exporting model in MLflow format requires mlflow>=2.4.0") from None
from packaging.version import Version
if Version(mlflow.__version__) < Version("2.4.0"):
logger.warning(
"Exporting model in MLflow format requires mlflow>=2.4.0. Skip exporting model in MLflow format."
)
return None
import onnx
logger.info("Exporting model in MLflow format")
@ -208,19 +300,21 @@ def _generate_onnx_mlflow_model(model_dir, inference_config):
onnx_model_path = model_dir / "model.onnx"
model_proto = onnx.load(onnx_model_path)
onnx_model_path.unlink()
mlflow_model_path = model_dir / "mlflow_model"
# MLFlow will save models with default config save_as_external_data=True
# https://github.com/mlflow/mlflow/blob/1d6eaaa65dca18688d9d1efa3b8b96e25801b4e9/mlflow/onnx.py#L175
# There will be an alphanumeric file generated in the same folder as the model file
mlflow.onnx.save_model(
model_proto,
model_dir / "mlflow_model",
mlflow_model_path,
onnx_execution_providers=inference_config.get("execution_provider"),
onnx_session_options=session_dict,
)
return mlflow_model_path
def _package_onnxruntime_packages(tempdir, pf_footprint: "Footprint"):
def _package_onnxruntime_packages(tempdir: Path, pf_footprint: "Footprint"):
# pylint: disable=not-an-iterable
installed_packages = pkg_resources.working_set
onnxruntime_pkg = [i for i in installed_packages if i.key.startswith("onnxruntime")]
@ -351,7 +445,7 @@ def _download_c_packages(package_name_list: List[str], ort_version: str, ort_dow
)
def _skip_download_c_package(package_path):
def _skip_download_c_package(package_path: Path):
warning_msg = (
"Found ort-nightly package installed. Please manually download "
"ort-nightly package from https://aiinfra.visualstudio.com/PublicPackages/_artifacts/feed/ORT-Nightly"


@ -35,7 +35,7 @@ class RunEngineConfig(EngineConfig):
evaluate_input_model: bool = True
output_dir: Union[Path, str] = None
output_name: str = None
packaging_config: PackagingConfig = None
packaging_config: Union[PackagingConfig, List[PackagingConfig]] = None
log_severity_level: int = 1
ort_log_severity_level: int = 3
ort_py_log_severity_level: int = 3


@ -7,21 +7,28 @@ import shutil
import zipfile
from pathlib import Path
from test.unit_test.utils import get_accuracy_metric, get_pytorch_model_config
from unittest.mock import patch
from unittest.mock import Mock, patch
import onnx
import pytest
from olive.engine import Engine
from olive.engine.footprint import Footprint
from olive.engine.packaging.packaging_config import PackagingConfig, PackagingType
from olive.engine.footprint import Footprint, FootprintNode
from olive.engine.packaging.packaging_config import (
AzureMLDataPackagingConfig,
AzureMLModelsPackagingConfig,
PackagingConfig,
PackagingType,
)
from olive.engine.packaging.packaging_generator import generate_output_artifacts
from olive.evaluator.metric import AccuracySubType
from olive.evaluator.olive_evaluator import OliveEvaluatorConfig
from olive.hardware import DEFAULT_CPU_ACCELERATOR
from olive.hardware.accelerator import AcceleratorSpec
from olive.passes.onnx.conversion import OnnxConversion
# TODO(team): no engine API involved, use generate_output_artifacts API directly
@patch("onnx.external_data_helper.sys.getsizeof")
@pytest.mark.parametrize(
("save_as_external_data", "mocked_size_value"),
@ -91,6 +98,7 @@ def test_generate_zipfile_artifacts(mock_sys_getsizeof, save_as_external_data, m
shutil.rmtree(output_dir)
# TODO(team): no engine API involved, use generate_output_artifacts API directly
def test_generate_zipfile_artifacts_no_search(tmp_path):
# setup
options = {
@ -131,6 +139,7 @@ def test_generate_zipfile_artifacts_no_search(tmp_path):
shutil.rmtree(output_dir)
# TODO(team): no engine API involved, use generate_output_artifacts API directly
def test_generate_zipfile_artifacts_mlflow(tmp_path):
# setup
options = {
@ -146,7 +155,7 @@ def test_generate_zipfile_artifacts_mlflow(tmp_path):
packaging_config = PackagingConfig()
packaging_config.type = PackagingType.Zipfile
packaging_config.name = "OutputModels"
packaging_config.export_in_mlflow_format = True
packaging_config.config.export_in_mlflow_format = True
output_dir = tmp_path / "outputs"
@ -215,6 +224,116 @@ def test_generate_zipfile_artifacts_zero_len_nodes(tmp_path):
assert not artifacts_path.exists()
@patch("olive.engine.packaging.packaging_generator.retry_func")
@patch("olive.engine.packaging.packaging_generator.create_resource_path")
def test_generate_azureml_models(mock_create_resource_path, mock_retry_func):
from azure.ai.ml.constants import AssetTypes
from azure.ai.ml.entities import Model
from azure.core.exceptions import ServiceResponseError
version = "1.0"
description = "Test description"
name = "OutputModels"
model_id = "model_id"
packaging_config = PackagingConfig(
type=PackagingType.AzureMLModels,
config=AzureMLModelsPackagingConfig(version=version, description=description),
name=name,
)
model_path = "fake_model_file"
footprints = get_footprints(model_id, model_path)
azureml_client_config = Mock(max_operation_retries=3, operation_retry_interval=5)
ml_client_mock = Mock()
azureml_client_config.create_client.return_value = ml_client_mock
resource_path_mock = Mock()
mock_create_resource_path.return_value = resource_path_mock
model = Model(
path=model_path,
type=AssetTypes.CUSTOM_MODEL,
name=name,
version=version,
description=description,
)
# execute
generate_output_artifacts(
packaging_config, footprints, footprints, output_dir=None, azureml_client_config=azureml_client_config
)
# assert
assert mock_retry_func.call_once_with(
ml_client_mock.models.create_or_update,
[model],
max_tries=azureml_client_config.max_operation_retries,
delay=azureml_client_config.operation_retry_interval,
exceptions=ServiceResponseError,
)
@patch("olive.engine.packaging.packaging_generator.retry_func")
@patch("olive.engine.packaging.packaging_generator.create_resource_path")
def test_generate_azureml_data(mock_create_resource_path, mock_retry_func):
from azure.ai.ml.constants import AssetTypes
from azure.ai.ml.entities import Data
from azure.core.exceptions import ServiceResponseError
version = "1.0"
description = "Test description"
name = "OutputModels"
model_id = "model_id"
packaging_config = PackagingConfig(
type=PackagingType.AzureMLData,
config=AzureMLDataPackagingConfig(version=version, description=description),
name=name,
)
model_path = "fake_model_file"
footprints = get_footprints(model_id, model_path)
azureml_client_config = Mock(max_operation_retries=3, operation_retry_interval=5)
ml_client_mock = Mock()
azureml_client_config.create_client.return_value = ml_client_mock
resource_path_mock = Mock()
mock_create_resource_path.return_value = resource_path_mock
data = Data(
path=model_path,
type=AssetTypes.URI_FILE,
name=name,
version=version,
description=description,
)
# execute
generate_output_artifacts(
packaging_config, footprints, footprints, output_dir=None, azureml_client_config=azureml_client_config
)
# assert
assert mock_retry_func.call_once_with(
ml_client_mock.data.create_or_update,
[data],
max_tries=azureml_client_config.max_operation_retries,
delay=azureml_client_config.operation_retry_interval,
exceptions=ServiceResponseError,
)
def get_footprints(model_id, model_path):
acc_spec = AcceleratorSpec(accelerator_type="cpu", execution_provider="CPUExecutionProvider")
model_config = {"config": {"model_path": model_path}, "type": "ONNXModel"}
footprint_node = FootprintNode(model_id=model_id, is_pareto_frontier=True, model_config=model_config)
footprint = Footprint(nodes={model_id: footprint_node}, is_marked_pareto_frontier=True)
return {acc_spec: footprint}
def verify_output_artifacts(output_dir):
assert (output_dir / "SampleCode").exists()
assert (output_dir / "CandidateModels").exists()