Santiagxf/mlflow python deprecation (#2909)

* fix: updating deployments schemas

* fix: python version

* fix: python version

* fix: python version

* fixes

* fixes

* fixes

* model
This commit is contained in:
Facundo Santiago 2023-12-19 15:43:59 -05:00 committed by GitHub
Parent 22ad22c4fe
Commit 36f1cb2c65
No key found matching this signature
GPG Key ID: 4AEE18F83AFDEB23
12 changed files with 192 additions and 125 deletions

1
.gitignore vendored
View file

@ -1,4 +1,5 @@
*.amlignore
*.amlignore.amltmp
*.azureml
*pythonenv
*dask-worker-space

3
sdk/python/.gitignore vendored
View file

@ -1,5 +1,8 @@
.ipynb_checkpoints
.ipynb_aml_checkpoints
named-outputs
*/.ipynb_checkpoints/*
*.amltmp
# config files are required to use Semantic Kernel
!endpoints/online/llm/src/sk/skills/*/*/config.json

View file

@ -87,6 +87,22 @@
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Or if you are working in a compute instance in Azure Machine Learning:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ml_client = MLClient.from_config(DefaultAzureCredential())"
]
},
{
"cell_type": "markdown",
"metadata": {},
@ -244,7 +260,7 @@
"source": [
"environment = Environment(\n",
" conda_file=\"sklearn-diabetes/environment/conda.yaml\",\n",
" image=\"mcr.microsoft.com/azureml/openmpi3.1.2-ubuntu18.04:latest\",\n",
" image=\"mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest\",\n",
")"
]
},
@ -388,7 +404,7 @@
"name": "python310-sdkv2"
},
"kernelspec": {
"display_name": "Python 3.10 - SDK V2",
"display_name": "Python 3.10 - SDK v2",
"language": "python",
"name": "python310-sdkv2"
},
@ -402,7 +418,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.4 (main, Mar 31 2022, 08:41:55) [GCC 7.5.0]"
"version": "3.10.11"
},
"nteract": {
"version": "nteract-front-end@1.0.0"

View file

@ -2,40 +2,43 @@
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Deploy MLflow model to online endpoints\n",
"Learn how to deploy your [MLflow](https://www.mlflow.org/) model to an [online endpoint](https://docs.microsoft.com/azure/machine-learning/concept-endpoints). When you deploy your MLflow model to an online endpoint, it's a no-code-deployment. It doesn't require a scoring script or an environment."
],
"metadata": {}
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Requirements - In order to benefit from this tutorial, you will need:\n",
"- This sample notebook assumes you're using online endpoints; for more information, see [What are Azure Machine Learning endpoints?](https://docs.microsoft.com/azure/machine-learning/concept-endpoints).\n",
"- An Azure account with an active subscription. [Create an account for free](https://azure.microsoft.com/free/?WT.mc_id=A261C142F)\n",
"- An Azure ML workspace with a compute cluster - [Configure workspace](../../jobs/configuration.ipynb)\n",
"- Installed Azure Machine Learning Python SDK v2 - [install instructions](../../README.md) - check the getting started section"
],
"metadata": {}
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 1. Connect to Azure Machine Learning Workspace\n",
"The [workspace](https://docs.microsoft.com/azure/machine-learning/concept-workspace) is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning. In this section we will connect to the workspace in which the job will be run."
],
"metadata": {}
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1.1 Import the required libraries"
],
"metadata": {}
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# import required libraries\n",
"from azure.ai.ml import MLClient\n",
@ -43,51 +46,63 @@
" ManagedOnlineEndpoint,\n",
" ManagedOnlineDeployment,\n",
" Model,\n",
" Environment,\n",
" CodeConfiguration,\n",
")\n",
"from azure.identity import DefaultAzureCredential\n",
"from azure.ai.ml.constants import AssetTypes"
],
"outputs": [],
"execution_count": null,
"metadata": {}
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1.2 Configure workspace details and get a handle to the workspace\n",
"\n",
"To connect to a workspace, we need identifier parameters: a subscription, resource group and workspace name. We will use these details in the `MLClient` from `azure.ai.ml` to get a handle to the required Azure Machine Learning workspace. We use the [default Azure authentication](https://docs.microsoft.com/en-us/python/api/azure-identity/azure.identity.defaultazurecredential?view=azure-python) for this tutorial. Check the [configuration notebook](../../jobs/configuration.ipynb) for more details on how to configure credentials and connect to a workspace."
],
"metadata": {}
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# enter details of your AML workspace\n",
"subscription_id = \"<SUBSCRIPTION_ID>\"\n",
"resource_group = \"<RESOURCE_GROUP>\"\n",
"workspace = \"<AML_WORKSPACE_NAME>\""
],
"outputs": [],
"execution_count": null,
"metadata": {}
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# get a handle to the workspace\n",
"ml_client = MLClient(\n",
" DefaultAzureCredential(), subscription_id, resource_group, workspace\n",
")"
],
"outputs": [],
"execution_count": null,
"metadata": {}
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Or if you are working in a compute instance in Azure Machine Learning:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ml_client = MLClient.from_config(DefaultAzureCredential())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 2. Create Online Endpoint\n",
"\n",
@ -101,18 +116,20 @@
" - `type` - The type of managed identity. Azure Machine Learning supports `system_assigned` or `user_assigned`.\n",
" - `user_assigned_identities` - List (array) of fully qualified resource IDs of the user-assigned identities. This property is required if `identity.type` is `user_assigned`.\n",
"- `description`- Description of the endpoint."
],
"metadata": {}
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2.1 Configure the endpoint"
],
"metadata": {}
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Creating a unique endpoint name with current datetime to avoid conflicts\n",
"import datetime\n",
@ -126,30 +143,28 @@
" auth_mode=\"key\",\n",
" tags={\"foo\": \"bar\"},\n",
")"
],
"outputs": [],
"execution_count": null,
"metadata": {}
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2.2 Create the endpoint\n",
"Using the `MLClient` created earlier, we will now create the Endpoint in the workspace. This command will start the endpoint creation and return a confirmation response while the endpoint creation continues."
],
"metadata": {}
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ml_client.begin_create_or_update(endpoint).result()"
],
"outputs": [],
"execution_count": null,
"metadata": {}
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Create a blue deployment\n",
"\n",
@ -160,11 +175,11 @@
"- `model` - The model to use for the deployment. This value can be either a reference to an existing versioned model in the workspace or an inline model specification.\n",
"- `instance_type` - The VM size to use for the deployment. For the list of supported sizes, see [Managed online endpoints SKU list](https://docs.microsoft.com/azure/machine-learning/reference-managed-online-endpoints-vm-sku-list).\n",
"- `instance_count` - The number of instances to use for the deployment"
],
"metadata": {}
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### No code deployment\n",
"For MLflow no-code-deployment (NCD) to work, setting `type` to `MLFLOW` is mandatory. \n",
@ -179,18 +194,22 @@
"- `path` - Local path to the model file(s). This can point to either a file or a directory.\n",
"- `type` - Storage format of the model. Applicable for no-code deployment scenarios. Allowed values are `CUSTOM`, `MLFLOW` and `TRITON`\n",
"- `description` - Description of the model."
],
"metadata": {}
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3.1 Configure the deployment"
],
"metadata": {}
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"name": "blue_deployment"
},
"outputs": [],
"source": [
"# create a blue deployment\n",
"model = Model(\n",
@ -206,48 +225,44 @@
" instance_type=\"Standard_F4s_v2\",\n",
" instance_count=1,\n",
")"
],
"outputs": [],
"execution_count": null,
"metadata": {
"name": "blue_deployment"
}
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3.2 Create the deployment\n",
"\n",
"Using the `MLClient` created earlier, we will now create the deployment in the workspace. This command will start the deployment creation and return a confirmation response while the deployment creation continues."
],
"metadata": {}
]
},
{
"cell_type": "code",
"source": [
"ml_client.online_deployments.begin_create_or_update(blue_deployment).result()"
],
"outputs": [],
"execution_count": null,
"metadata": {
"name": "ml_client"
}
},
"outputs": [],
"source": [
"ml_client.online_deployments.begin_create_or_update(blue_deployment).result()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"name": "endpoint.traffic"
},
"outputs": [],
"source": [
"# blue deployment takes 100% of the traffic\n",
"endpoint.traffic = {\"blue\": 100}\n",
"ml_client.begin_create_or_update(endpoint).result()"
],
"outputs": [],
"execution_count": null,
"metadata": {
"name": "endpoint.traffic"
}
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 4. Test the deployment\n",
"\n",
@ -258,11 +273,13 @@
"- `deployment_name` - Name of the specific deployment to test in an endpoint\n",
"\n",
"We will send a sample request using a [sample-request-sklearn.json](sample-request-sklearn.json) file."
],
"metadata": {}
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# test the blue deployment with some sample data\n",
"ml_client.online_endpoints.invoke(\n",
@ -270,20 +287,20 @@
" deployment_name=\"blue\",\n",
" request_file=\"sample-request-sklearn.json\",\n",
")"
],
"outputs": [],
"execution_count": null,
"metadata": {}
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 5. Get endpoint details"
],
"metadata": {}
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Get the details for online endpoint\n",
"endpoint = ml_client.online_endpoints.get(name=online_endpoint_name)\n",
@ -293,26 +310,23 @@
"\n",
"# Get the scoring URI\n",
"print(endpoint.scoring_uri)"
],
"outputs": [],
"execution_count": null,
"metadata": {}
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 6. Delete the deployment and endpoint"
],
"metadata": {}
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ml_client.online_endpoints.begin_delete(name=online_endpoint_name)"
],
"outputs": [],
"execution_count": null,
"metadata": {}
]
}
],
"metadata": {
@ -322,30 +336,30 @@
"interpreter": {
"hash": "f6c16bbccc10ca03d07b4bd30ccd59ce8e25aee703e3c8d834bba9b524ce7685"
},
"kernel_info": {
"name": "python310-sdkv2"
},
"kernelspec": {
"display_name": "Python 3.10 - SDK V2",
"display_name": "Python 3.10 - SDK v2",
"language": "python",
"name": "python310-sdkv2"
},
"language_info": {
"name": "python",
"version": "3.10.4",
"mimetype": "text/x-python",
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"pygments_lexer": "ipython3",
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"file_extension": ".py"
},
"orig_nbformat": 4,
"kernel_info": {
"name": "python310-sdkv2"
"pygments_lexer": "ipython3",
"version": "3.10.11"
},
"nteract": {
"version": "nteract-front-end@1.0.0"
}
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2

View file

@ -1,4 +1,5 @@
{"input_data": {
{
"input_data": {
"columns": [
"age",
"sex",
@ -12,8 +13,8 @@
"s6"
],
"data": [
[ 1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0 ],
[ 10.0,2.0,9.0,8.0,7.0,6.0,5.0,4.0,3.0,2.0]
[ 1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0 ]
],
"index": [0,1]
}}
"index": [0]
}
}
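
The hunk above removes the second data row and shrinks `index` from `[0,1]` to `[0]` in the same change, which keeps the request body self-consistent. A minimal sketch of that consistency check follows, using only the standard library. The helper name `check_split_payload` and the two-column example payload are hypothetical and are not part of the sample; the real request carries the full column list shown in the diff.

```python
import json


def check_split_payload(payload: dict) -> None:
    """Raise ValueError if a split-style request body is internally inconsistent.

    Hypothetical helper for illustration; it is not part of the sample repository.
    """
    if "input_data" not in payload:
        raise ValueError("Request must contain a top level key named 'input_data'")
    body = payload["input_data"]
    columns, data = body["columns"], body["data"]

    # Every row must supply exactly one value per column.
    for i, row in enumerate(data):
        if len(row) != len(columns):
            raise ValueError(f"row {i} has {len(row)} values, expected {len(columns)}")

    # "index" is optional, but when present it needs one entry per row.
    index = body.get("index")
    if index is not None and len(index) != len(data):
        raise ValueError(f"index has {len(index)} entries, expected {len(data)}")


if __name__ == "__main__":
    # Illustrative two-column payload (assumption: shortened for readability).
    text = '{"input_data": {"columns": ["age", "sex"], "data": [[1.0, 2.0]], "index": [0]}}'
    check_split_payload(json.loads(text))
    print("payload is consistent")
```

Leaving `index` at `[0, 1]` after dropping a row would be flagged by this check, which is why the two edits belong in the same hunk.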

View file

@ -1,13 +1,19 @@
channels:
- conda-forge
dependencies:
- python=3.7.11
- pip
- python=3.10
- pip<=23.1.2
- pip:
- mlflow
- scikit-learn==0.24.1
- cloudpickle==2.0.0
- psutil==5.8.0
- pandas==1.3.5
- mlflow==2.7.1
- cloudpickle==1.6.0
- dataclasses==0.6
- lz4==4.0.0
- numpy==1.23.5
- packaging==23.0
- psutil==5.9.0
- pyyaml==6.0
- scikit-learn==1.1.2
- scipy==1.10.1
- uuid==1.30
- azureml-inference-server-http
name: mlflow-env

View file

@ -1,15 +1,21 @@
artifact_path: model
flavors:
python_function:
env: conda.yaml
env:
conda: conda.yaml
virtualenv: python_env.yaml
loader_module: mlflow.sklearn
model_path: model.pkl
python_version: 3.7.11
predict_fn: predict
python_version: 3.10.11
sklearn:
code: null
pickled_model: model.pkl
serialization_format: cloudpickle
sklearn_version: 0.24.1
run_id: f1e06708-641d-4a49-8f36-e9dcd8d34346
sklearn_version: 1.1.2
mlflow_version: 2.7.1
model_uuid: 3f725f3264314c02808dd99d5e5b2781
run_id: 70f15bab-cf98-48f1-a2ea-9ad2108c28cd
signature:
inputs: '[{"name": "age", "type": "double"}, {"name": "sex", "type": "double"},
{"name": "bmi", "type": "double"}, {"name": "bp", "type": "double"}, {"name":
@ -17,4 +23,3 @@ signature:
"double"}, {"name": "s4", "type": "double"}, {"name": "s5", "type": "double"},
{"name": "s6", "type": "double"}]'
outputs: '[{"type": "double"}]'
utc_time_created: '2022-03-17 01:56:03.706848'

View file

@ -1,11 +1,18 @@
channels:
- conda-forge
dependencies:
- python=3.7.11
- pip
- python=3.10.11
- pip<=23.1.2
- pip:
- mlflow
- scikit-learn==0.24.1
- cloudpickle==2.0.0
- psutil==5.8.0
- mlflow==2.7.1
- cloudpickle==1.6.0
- dataclasses==0.6
- lz4==4.0.0
- numpy==1.23.5
- packaging==23.0
- psutil==5.9.0
- pyyaml==6.0
- scikit-learn==1.1.2
- scipy==1.10.1
- uuid==1.30
name: mlflow-env

Binary file not shown.

View file

@ -0,0 +1,7 @@
python: 3.10.11
build_dependencies:
- pip==23.1.2
- setuptools==67.8.0
- wheel==0.38.4
dependencies:
- -r requirements.txt

View file

@ -1,4 +1,11 @@
mlflow
cloudpickle==2.0.0
psutil==5.8.0
scikit-learn==0.24.1
mlflow==2.7.1
cloudpickle==1.6.0
dataclasses==0.6
lz4==4.0.0
numpy==1.23.5
packaging==23.0
psutil==5.9.0
pyyaml==6.0
scikit-learn==1.1.2
scipy==1.10.1
uuid==1.30

View file

@ -3,7 +3,7 @@ import os
import json
import mlflow
from io import StringIO
from mlflow.pyfunc.scoring_server import infer_and_parse_json_input, predictions_to_json
from mlflow.pyfunc.scoring_server import infer_and_parse_data, predictions_to_json
def init():
@ -21,8 +21,8 @@ def run(raw_data):
if "input_data" not in json_data.keys():
raise Exception("Request must contain a top level key named 'input_data'")
serving_input = json.dumps(json_data["input_data"])
data = infer_and_parse_json_input(serving_input, input_schema)
serving_input = {"dataframe_split": json_data["input_data"]}
data = infer_and_parse_data(serving_input, input_schema)
predictions = model.predict(data)
result = StringIO()
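
The last hunk changes how the request body is handed to MLflow: instead of serializing `input_data` back to a JSON string for `infer_and_parse_json_input`, the new code wraps it in a dict keyed by `dataframe_split` and passes that to `infer_and_parse_data`. The reshaping step alone can be sketched without MLflow installed, as below. The function name `build_serving_input` is hypothetical, and the sketch stops before the MLflow call, so it shows only the payload shape and not the parsing itself.

```python
import json


def build_serving_input(raw_data: str) -> dict:
    """Wrap a request body in the dict shape used by the updated scoring script.

    Hypothetical helper for illustration; the sample does this inline in run().
    """
    json_data = json.loads(raw_data)

    # Same guard as in run(): the request must carry a top-level "input_data" key.
    if "input_data" not in json_data:
        raise ValueError("Request must contain a top level key named 'input_data'")

    # Before this commit: json.dumps(json_data["input_data"]) was passed on as a string.
    # After this commit: a dict keyed by "dataframe_split" is passed on instead.
    return {"dataframe_split": json_data["input_data"]}


if __name__ == "__main__":
    # Illustrative two-column request (assumption: shortened for readability).
    request = json.dumps(
        {"input_data": {"columns": ["age", "sex"], "data": [[1.0, 2.0]], "index": [0]}}
    )
    print(build_serving_input(request))
```

In the scoring script itself, the returned dict is what gets passed to `infer_and_parse_data` together with `input_schema`, as the hunk shows.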