Santiagxf/mlflow python deprecation (#2909)

* fix: updating deployments schemas
* fix: python version
* fix: python version
* fix: python version
* fixes
* fixes
* fixes
* model
2023-12-19 15:43:59 -05:00 · 2023-12-19 15:43:59 -05:00 · 36f1cb2c65
--- a/.gitignore
+++ b/.gitignore
@ -1,4 +1,5 @@
 *.amlignore
+*.amlignore.amltmp
 *.azureml
 *pythonenv
 *dask-worker-space
--- a/sdk/python/.gitignore
+++ b/sdk/python/.gitignore
@ -1,5 +1,8 @@
 .ipynb_checkpoints
+.ipynb_aml_checkpoints
+named-outputs
 */.ipynb_checkpoints/*
+*.amltmp

 # config files are required to use Semantic Kernel
 !endpoints/online/llm/src/sk/skills/*/*/config.json
--- a/sdk/python/endpoints/online/mlflow/online-endpoints-deploy-mlflow-model-with-script.ipynb
+++ b/sdk/python/endpoints/online/mlflow/online-endpoints-deploy-mlflow-model-with-script.ipynb
@ -87,6 +87,22 @@
    ")"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Or if you are working in a compute instance in Azure Machine Learning:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "ml_client = MLClient.from_config(DefaultAzureCredential())"
+   ]
+  },
  {
   "cell_type": "markdown",
   "metadata": {},
@ -244,7 +260,7 @@
   "source": [
    "environment = Environment(\n",
    "    conda_file=\"sklearn-diabetes/environment/conda.yaml\",\n",
-    "    image=\"mcr.microsoft.com/azureml/openmpi3.1.2-ubuntu18.04:latest\",\n",
+    "    image=\"mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest\",\n",
    ")"
   ]
  },
@ -388,7 +404,7 @@
   "name": "python310-sdkv2"
  },
  "kernelspec": {
-   "display_name": "Python 3.10 - SDK V2",
+   "display_name": "Python 3.10 - SDK v2",
   "language": "python",
   "name": "python310-sdkv2"
  },
@ -402,7 +418,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.4 (main, Mar 31 2022, 08:41:55) [GCC 7.5.0]"
+   "version": "3.10.11"
  },
  "nteract": {
   "version": "nteract-front-end@1.0.0"
--- a/sdk/python/endpoints/online/mlflow/online-endpoints-deploy-mlflow-model.ipynb
+++ b/sdk/python/endpoints/online/mlflow/online-endpoints-deploy-mlflow-model.ipynb
@ -2,40 +2,43 @@
 "cells": [
  {
   "cell_type": "markdown",
+   "metadata": {},
   "source": [
    "# Deploy MLflow model to online endpoints\n",
    "Learn how to deploy your [MLflow](https://www.mlflow.org/) model to an [online endpoint](https://docs.microsoft.com/azure/machine-learning/concept-endpoints). When you deploy your MLflow model to an online endpoint, it's a no-code-deployment. It doesn't require a scoring script or an environment."
-   ],
-   "metadata": {}
+   ]
  },
  {
   "cell_type": "markdown",
+   "metadata": {},
   "source": [
    "### Requirements - In order to benefit from this tutorial, you will need:\n",
    "- This sample notebook assumes you're using online endpoints; for more information, see [What are Azure Machine Learning endpoints?](https://docs.microsoft.com/azure/machine-learning/concept-endpoints).\n",
    "- An Azure account with an active subscription. [Create an account for free](https://azure.microsoft.com/free/?WT.mc_id=A261C142F)\n",
    "- An Azure ML workspace with a compute cluster - [Configure workspace](../../jobs/configuration.ipynb)\n",
    "- Installed Azure Machine Learning Python SDK v2 - [install instructions](../../README.md) - check the getting started section"
-   ],
-   "metadata": {}
+   ]
  },
  {
   "cell_type": "markdown",
+   "metadata": {},
   "source": [
    "# 1. Connect to Azure Machine Learning Workspace\n",
    "The [workspace](https://docs.microsoft.com/azure/machine-learning/concept-workspace) is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning. In this section we will connect to the workspace in which the job will be run."
-   ],
-   "metadata": {}
+   ]
  },
  {
   "cell_type": "markdown",
+   "metadata": {},
   "source": [
    "## 1.1 Import the required libraries"
-   ],
-   "metadata": {}
+   ]
  },
  {
   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
   "source": [
    "# import required libraries\n",
    "from azure.ai.ml import MLClient\n",
@ -43,51 +46,63 @@
    "    ManagedOnlineEndpoint,\n",
    "    ManagedOnlineDeployment,\n",
    "    Model,\n",
-    "    Environment,\n",
-    "    CodeConfiguration,\n",
    ")\n",
    "from azure.identity import DefaultAzureCredential\n",
    "from azure.ai.ml.constants import AssetTypes"
-   ],
-   "outputs": [],
-   "execution_count": null,
-   "metadata": {}
+   ]
  },
  {
   "cell_type": "markdown",
+   "metadata": {},
   "source": [
    "## 1.2 Configure workspace details and get a handle to the workspace\n",
    "\n",
    "To connect to a workspace, we need identifier parameters: a subscription, resource group and workspace name. We will use these details in the `MLClient` from `azure.ai.ml` to get a handle to the required Azure Machine Learning workspace. We use the [default Azure authentication](https://docs.microsoft.com/en-us/python/api/azure-identity/azure.identity.defaultazurecredential?view=azure-python) for this tutorial. Check the [configuration notebook](../../jobs/configuration.ipynb) for more details on how to configure credentials and connect to a workspace."
-   ],
-   "metadata": {}
+   ]
  },
  {
   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
   "source": [
    "# enter details of your AML workspace\n",
    "subscription_id = \"<SUBSCRIPTION_ID>\"\n",
    "resource_group = \"<RESOURCE_GROUP>\"\n",
    "workspace = \"<AML_WORKSPACE_NAME>\""
-   ],
-   "outputs": [],
-   "execution_count": null,
-   "metadata": {}
+   ]
  },
  {
   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
   "source": [
    "# get a handle to the workspace\n",
    "ml_client = MLClient(\n",
    "    DefaultAzureCredential(), subscription_id, resource_group, workspace\n",
    ")"
-   ],
-   "outputs": [],
-   "execution_count": null,
-   "metadata": {}
+   ]
  },
  {
   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Or if you are working in a compute instance in Azure Machine Learning:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "ml_client = MLClient.from_config(DefaultAzureCredential())"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
   "source": [
    "# 2. Create Online Endpoint\n",
    "\n",
@ -101,18 +116,20 @@
    "    - `type`- The type of managed identity. Azure Machine Learning supports `system_assigned` or `user_assigned` identity.\n",
    "    - `user_assigned_identities` - List (array) of fully qualified resource IDs of the user-assigned identities. This property is required if `identity.type` is `user_assigned`.\n",
    "- `description`- Description of the endpoint."
-   ],
-   "metadata": {}
+   ]
  },
  {
   "cell_type": "markdown",
+   "metadata": {},
   "source": [
    "## 2.1 Configure the endpoint"
-   ],
-   "metadata": {}
+   ]
  },
  {
   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
   "source": [
    "# Creating a unique endpoint name with current datetime to avoid conflicts\n",
    "import datetime\n",
@ -126,30 +143,28 @@
    "    auth_mode=\"key\",\n",
    "    tags={\"foo\": \"bar\"},\n",
    ")"
-   ],
-   "outputs": [],
-   "execution_count": null,
-   "metadata": {}
+   ]
  },
  {
   "cell_type": "markdown",
+   "metadata": {},
   "source": [
    "## 2.2 Create the endpoint\n",
    "Using the `MLClient` created earlier, we will now create the Endpoint in the workspace. This command will start the endpoint creation and return a confirmation response while the endpoint creation continues."
-   ],
-   "metadata": {}
+   ]
  },
  {
   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
   "source": [
    "ml_client.begin_create_or_update(endpoint).result()"
-   ],
-   "outputs": [],
-   "execution_count": null,
-   "metadata": {}
+   ]
  },
  {
   "cell_type": "markdown",
+   "metadata": {},
   "source": [
    "## 3. Create a blue deployment\n",
    "\n",
@ -160,11 +175,11 @@
    "- `model` - The model to use for the deployment. This value can be either a reference to an existing versioned model in the workspace or an inline model specification.\n",
    "- `instance_type` - The VM size to use for the deployment. For the list of supported sizes, see [Managed online endpoints SKU list](https://docs.microsoft.com/azure/machine-learning/reference-managed-online-endpoints-vm-sku-list).\n",
    "- `instance_count` - The number of instances to use for the deployment"
-   ],
-   "metadata": {}
+   ]
  },
  {
   "cell_type": "markdown",
+   "metadata": {},
   "source": [
    "### No code deployment\n",
    "For MLflow no-code-deployment (NCD) to work, setting `type` to `MLFLOW` is mandatory. \n",
@ -179,18 +194,22 @@
    "- `path` - Local path to the model file(s). This can point to either a file or a directory.\n",
    "- `type` - Storage format of the model. Applicable for no-code deployment scenarios. Allowed values are `CUSTOM`, `MLFLOW` and `TRITON`\n",
    "- `description` - Description of the model."
-   ],
-   "metadata": {}
+   ]
  },
  {
   "cell_type": "markdown",
+   "metadata": {},
   "source": [
    "## 3.1 Configure the deployment"
-   ],
-   "metadata": {}
+   ]
  },
  {
   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "name": "blue_deployment"
+   },
+   "outputs": [],
   "source": [
    "# create a blue deployment\n",
    "model = Model(\n",
@ -206,48 +225,44 @@
    "    instance_type=\"Standard_F4s_v2\",\n",
    "    instance_count=1,\n",
    ")"
-   ],
-   "outputs": [],
-   "execution_count": null,
-   "metadata": {
-    "name": "blue_deployment"
-   }
+   ]
  },
  {
   "cell_type": "markdown",
+   "metadata": {},
   "source": [
    "## 3.2 Create the deployment\n",
    "\n",
    "Using the `MLClient` created earlier, we will now create the deployment in the workspace. This command will start the deployment creation and return a confirmation response while the deployment creation continues."
-   ],
-   "metadata": {}
+   ]
  },
  {
   "cell_type": "code",
-   "source": [
-    "ml_client.online_deployments.begin_create_or_update(blue_deployment).result()"
-   ],
-   "outputs": [],
   "execution_count": null,
   "metadata": {
    "name": "ml_client"
-   }
+   },
+   "outputs": [],
+   "source": [
+    "ml_client.online_deployments.begin_create_or_update(blue_deployment).result()"
+   ]
  },
  {
   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "name": "endpoint.traffic"
+   },
+   "outputs": [],
   "source": [
    "# blue deployment takes 100% of the traffic\n",
    "endpoint.traffic = {\"blue\": 100}\n",
    "ml_client.begin_create_or_update(endpoint).result()"
-   ],
-   "outputs": [],
-   "execution_count": null,
-   "metadata": {
-    "name": "endpoint.traffic"
-   }
+   ]
  },
  {
   "cell_type": "markdown",
+   "metadata": {},
   "source": [
    "# 4. Test the deployment\n",
    "\n",
@ -258,11 +273,13 @@
    "- `deployment_name` - Name of the specific deployment to test in an endpoint\n",
    "\n",
    "We will send a sample request using a [sample-request-sklearn.json](sample-request-sklearn.json) file."
-   ],
-   "metadata": {}
+   ]
  },
  {
   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
   "source": [
    "# test the blue deployment with some sample data\n",
    "ml_client.online_endpoints.invoke(\n",
@ -270,20 +287,20 @@
    "    deployment_name=\"blue\",\n",
    "    request_file=\"sample-request-sklearn.json\",\n",
    ")"
-   ],
-   "outputs": [],
-   "execution_count": null,
-   "metadata": {}
+   ]
  },
  {
   "cell_type": "markdown",
+   "metadata": {},
   "source": [
    "# 5. Get endpoint details"
-   ],
-   "metadata": {}
+   ]
  },
  {
   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
   "source": [
    "# Get the details for online endpoint\n",
    "endpoint = ml_client.online_endpoints.get(name=online_endpoint_name)\n",
@ -293,26 +310,23 @@
    "\n",
    "# Get the scoring URI\n",
    "print(endpoint.scoring_uri)"
-   ],
-   "outputs": [],
-   "execution_count": null,
-   "metadata": {}
+   ]
  },
  {
   "cell_type": "markdown",
+   "metadata": {},
   "source": [
    "# 6. Delete the deployment and endpoint"
-   ],
-   "metadata": {}
+   ]
  },
  {
   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
   "source": [
    "ml_client.online_endpoints.begin_delete(name=online_endpoint_name)"
-   ],
-   "outputs": [],
-   "execution_count": null,
-   "metadata": {}
+   ]
  }
 ],
 "metadata": {
@ -322,30 +336,30 @@
  "interpreter": {
   "hash": "f6c16bbccc10ca03d07b4bd30ccd59ce8e25aee703e3c8d834bba9b524ce7685"
  },
+  "kernel_info": {
+   "name": "python310-sdkv2"
+  },
  "kernelspec": {
-   "display_name": "Python 3.10 - SDK V2",
+   "display_name": "Python 3.10 - SDK v2",
   "language": "python",
   "name": "python310-sdkv2"
  },
  "language_info": {
-   "name": "python",
-   "version": "3.10.4",
-   "mimetype": "text/x-python",
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
-   "pygments_lexer": "ipython3",
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
   "nbconvert_exporter": "python",
-   "file_extension": ".py"
-  },
-  "orig_nbformat": 4,
-  "kernel_info": {
-   "name": "python310-sdkv2"
+   "pygments_lexer": "ipython3",
+   "version": "3.10.11"
  },
  "nteract": {
   "version": "nteract-front-end@1.0.0"
-  }
+  },
+  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
--- a/sdk/python/endpoints/online/mlflow/sample-request-sklearn.json
+++ b/sdk/python/endpoints/online/mlflow/sample-request-sklearn.json
@ -1,4 +1,5 @@
-{"input_data": {
+{
+  "input_data": {
    "columns": [
      "age",
      "sex",
@ -12,8 +13,8 @@
      "s6"
    ],
    "data": [
-      [ 1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0 ],
-      [ 10.0,2.0,9.0,8.0,7.0,6.0,5.0,4.0,3.0,2.0]
+      [ 1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0 ]
    ],
-    "index": [0,1]
-  }}
+    "index": [0]
+  }
+}
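The reformatted sample request keeps the same envelope: a top-level `input_data` object holding `columns`, `data`, and `index`. As a rough illustration of that shape, the sketch below builds an equivalent payload with the standard library only. The helper name `build_request` is hypothetical and not part of this PR, and the two-column example is an illustrative subset rather than the model's full feature list.

```python
import json


def build_request(columns, rows):
    """Wrap tabular rows in the {"input_data": {...}} envelope used by the sample file."""
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("each row needs one value per column")
    return {
        "input_data": {
            "columns": list(columns),
            "data": [list(row) for row in rows],
            # One index entry per row, starting at 0 as in the sample file.
            "index": list(range(len(rows))),
        }
    }


# Illustrative two-column subset, not the full feature list of the model.
payload = build_request(["age", "sex"], [[1.0, 2.0]])
print(json.dumps(payload, indent=2))
```

Keeping `index` in step with the number of rows mirrors what the diff does when it drops the second row and shortens `index` to `[0]`.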
--- a/sdk/python/endpoints/online/mlflow/sklearn-diabetes/environment/conda.yaml
+++ b/sdk/python/endpoints/online/mlflow/sklearn-diabetes/environment/conda.yaml
@ -1,13 +1,19 @@
 channels:
 - conda-forge
 dependencies:
- python=3.7.11
- pip
+- python=3.10
+- pip<=23.1.2
 - pip:
-  - mlflow
-  - scikit-learn==0.24.1
-  - cloudpickle==2.0.0
-  - psutil==5.8.0
-  - pandas==1.3.5
+  - mlflow==2.7.1
+  - cloudpickle==1.6.0
+  - dataclasses==0.6
+  - lz4==4.0.0
+  - numpy==1.23.5
+  - packaging==23.0
+  - psutil==5.9.0
+  - pyyaml==6.0
+  - scikit-learn==1.1.2
+  - scipy==1.10.1
+  - uuid==1.30
  - azureml-inference-server-http
 name: mlflow-env
--- a/sdk/python/endpoints/online/mlflow/sklearn-diabetes/model/MLmodel
+++ b/sdk/python/endpoints/online/mlflow/sklearn-diabetes/model/MLmodel
@ -1,15 +1,21 @@
 artifact_path: model
 flavors:
  python_function:
-    env: conda.yaml
+    env:
+      conda: conda.yaml
+      virtualenv: python_env.yaml
    loader_module: mlflow.sklearn
    model_path: model.pkl
-    python_version: 3.7.11
+    predict_fn: predict
+    python_version: 3.10.11
  sklearn:
+    code: null
    pickled_model: model.pkl
    serialization_format: cloudpickle
-    sklearn_version: 0.24.1
-run_id: f1e06708-641d-4a49-8f36-e9dcd8d34346
+    sklearn_version: 1.1.2
+mlflow_version: 2.7.1
+model_uuid: 3f725f3264314c02808dd99d5e5b2781
+run_id: 70f15bab-cf98-48f1-a2ea-9ad2108c28cd
 signature:
  inputs: '[{"name": "age", "type": "double"}, {"name": "sex", "type": "double"},
    {"name": "bmi", "type": "double"}, {"name": "bp", "type": "double"}, {"name":
@ -17,4 +23,3 @@ signature:
    "double"}, {"name": "s4", "type": "double"}, {"name": "s5", "type": "double"},
    {"name": "s6", "type": "double"}]'
  outputs: '[{"type": "double"}]'
-utc_time_created: '2022-03-17 01:56:03.706848'
--- a/sdk/python/endpoints/online/mlflow/sklearn-diabetes/model/conda.yaml
+++ b/sdk/python/endpoints/online/mlflow/sklearn-diabetes/model/conda.yaml
@ -1,11 +1,18 @@
 channels:
 - conda-forge
 dependencies:
- python=3.7.11
- pip
+- python=3.10.11
+- pip<=23.1.2
 - pip:
-  - mlflow
-  - scikit-learn==0.24.1
-  - cloudpickle==2.0.0
-  - psutil==5.8.0
+  - mlflow==2.7.1
+  - cloudpickle==1.6.0
+  - dataclasses==0.6
+  - lz4==4.0.0
+  - numpy==1.23.5
+  - packaging==23.0
+  - psutil==5.9.0
+  - pyyaml==6.0
+  - scikit-learn==1.1.2
+  - scipy==1.10.1
+  - uuid==1.30
 name: mlflow-env
--- a/sdk/python/endpoints/online/mlflow/sklearn-diabetes/model/model.pkl
+++ b/sdk/python/endpoints/online/mlflow/sklearn-diabetes/model/model.pkl
--- a/sdk/python/endpoints/online/mlflow/sklearn-diabetes/model/python_env.yaml
+++ b/sdk/python/endpoints/online/mlflow/sklearn-diabetes/model/python_env.yaml
@ -0,0 +1,7 @@
+python: 3.10.11
+build_dependencies:
+- pip==23.1.2
+- setuptools==67.8.0
+- wheel==0.38.4
+dependencies:
+- -r requirements.txt
--- a/sdk/python/endpoints/online/mlflow/sklearn-diabetes/model/requirements.txt
+++ b/sdk/python/endpoints/online/mlflow/sklearn-diabetes/model/requirements.txt
@ -1,4 +1,11 @@
-mlflow
-cloudpickle==2.0.0
-psutil==5.8.0
-scikit-learn==0.24.1
+mlflow==2.7.1
+cloudpickle==1.6.0
+dataclasses==0.6
+lz4==4.0.0
+numpy==1.23.5
+packaging==23.0
+psutil==5.9.0
+pyyaml==6.0
+scikit-learn==1.1.2
+scipy==1.10.1
+uuid==1.30
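The same pins now appear in `requirements.txt` and in both `conda.yaml` files, so they can drift apart in later edits. The sketch below is one possible way to compare two pin lists; it assumes simple `name==version` lines, the helper `parse_pins` is hypothetical, and the lists are a short excerpt of the pins in this diff rather than a read of the actual files.

```python
def parse_pins(lines):
    """Map package name -> pinned version (None when the line has no '==' pin)."""
    pins = {}
    for line in lines:
        line = line.strip()
        # Skip blanks, comments, and '-r other-file' includes.
        if not line or line.startswith("#") or line.startswith("-r"):
            continue
        name, sep, version = line.partition("==")
        pins[name.lower()] = version if sep else None
    return pins


# Excerpts copied from the diff; a real check would read the files instead.
requirements = ["mlflow==2.7.1", "scikit-learn==1.1.2", "cloudpickle==1.6.0"]
conda_pip = [
    "mlflow==2.7.1",
    "scikit-learn==1.1.2",
    "cloudpickle==1.6.0",
    "azureml-inference-server-http",
]

req, env = parse_pins(requirements), parse_pins(conda_pip)
mismatched = sorted(n for n in req if n in env and req[n] != env[n])
unpinned = sorted(n for n, v in env.items() if v is None)
print("mismatched:", mismatched)
print("unpinned:", unpinned)
```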
--- a/sdk/python/endpoints/online/mlflow/sklearn-diabetes/src/score.py
+++ b/sdk/python/endpoints/online/mlflow/sklearn-diabetes/src/score.py
@ -3,7 +3,7 @@ import os
 import json
 import mlflow
 from io import StringIO
-from mlflow.pyfunc.scoring_server import infer_and_parse_json_input, predictions_to_json
+from mlflow.pyfunc.scoring_server import infer_and_parse_data, predictions_to_json


 def init():
@ -21,8 +21,8 @@ def run(raw_data):
    if "input_data" not in json_data.keys():
        raise Exception("Request must contain a top level key named 'input_data'")

-    serving_input = json.dumps(json_data["input_data"])
-    data = infer_and_parse_json_input(serving_input, input_schema)
+    serving_input = {"dataframe_split": json_data["input_data"]}
+    data = infer_and_parse_data(serving_input, input_schema)
    predictions = model.predict(data)

    result = StringIO()
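The `run` change above stops serializing `input_data` back to a JSON string and instead passes a dictionary whose top-level key, `dataframe_split`, names the payload format. A minimal sketch of just that re-wrapping step, using only the standard library and not calling MLflow, might look like the following. The function name `to_serving_input` is hypothetical, and the single-column body is illustrative.

```python
import json


def to_serving_input(raw_data):
    """Parse a request body and re-wrap its 'input_data' under 'dataframe_split'."""
    json_data = json.loads(raw_data)
    if "input_data" not in json_data:
        raise ValueError("Request must contain a top level key named 'input_data'")
    return {"dataframe_split": json_data["input_data"]}


# Illustrative single-column request body.
body = json.dumps({"input_data": {"columns": ["age"], "data": [[1.0]], "index": [0]}})
serving_input = to_serving_input(body)
print(serving_input)
```

In the scoring script itself, the resulting dictionary is what gets handed to `infer_and_parse_data` together with the model's input schema.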