{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# First look at Azure Machine Learning\n",
"\n",
"This tutorial is an introduction to some of the most used features of the Azure Machine Learning service. In it, you will create, register and deploy a model. This tutorial will help you become familiar with the core concepts of Azure Machine Learning and their most common usage. \n",
"\n",
"You'll learn how to run a training job on a scalable compute resource, then deploy it, and finally test the deployment.\n",
"\n",
"You'll create a training script to handle the data preparation, train and register a model. Once you train the model, you'll *deploy* it as an *endpoint*, then call the endpoint for *inferencing*.\n",
"\n",
"The steps you'll take are:\n",
"\n",
"> * Set up a handle to your Azure Machine Learning workspace\n",
"> * Create your training script\n",
 * Create and run a command job">
"> * Create and run a command job that runs the training script on serverless compute, configured with the appropriate job environment\n",
"> * View the output of your training script\n",
"> * Deploy the newly-trained model as an endpoint\n",
"> * Call the Azure Machine Learning endpoint for inferencing"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Prerequisites\n",
"\n",
"* If you opened this notebook from Azure Machine Learning studio, you need a compute instance to run the code. If you don't have a compute instance, select **Create compute** on the toolbar to first create one. You can use all the default settings. \n",
"\n",
" ![Create compute](./media/create-compute.png)\n",
"\n",
"* If you're seeing this notebook elsewhere, complete [Create resources you need to get started](https://docs.microsoft.com/azure/machine-learning/quickstart-create-resources) to create an Azure Machine Learning workspace and a compute instance.\n",
"\n",
"## Set your kernel\n",
"\n",
"* If your compute instance is stopped, start it now. \n",
" \n",
" ![Start compute](./media/start-compute.png)\n",
"\n",
"* Once your compute instance is running, make sure the that the kernel, found on the top right, is `Python 3.10 - SDK v2`. If not, use the dropdown to select this kernel.\n",
"\n",
" ![Set the kernel](./media/set-kernel.png)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create handle to workspace\n",
"\n",
"Before we dive in the code, you need a way to reference your workspace. The workspace is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning.\n",
"\n",
"You'll create `ml_client` for a handle to the workspace. You'll then use `ml_client` to manage resources and jobs."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the next cell, enter your Subscription ID, Resource Group name and Workspace name. To find these values:\n",
"\n",
"1. In the upper right Azure Machine Learning studio toolbar, select your workspace name.\n",
"1. Copy the value for workspace, resource group and subscription ID into the code. \n",
"1. You'll need to copy one value, close the area and paste, then come back for the next one.\n",
"\n",
"![image of workspace credentials](./media/find-credentials.png)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"gather": {
"logged": 1679003731988
},
"name": "ml_client"
},
"outputs": [],
"source": [
"from azure.ai.ml import MLClient\n",
"from azure.identity import DefaultAzureCredential\n",
"\n",
"# authenticate\n",
"credential = DefaultAzureCredential()\n",
"\n",
"SUBSCRIPTION = \"<SUBSCRIPTION_ID>\"\n",
"RESOURCE_GROUP = \"<RESOURCE_GROUP>\"\n",
"WS_NAME = \"<AML_WORKSPACE_NAME>\"\n",
"# Get a handle to the workspace\n",
"ml_client = MLClient(\n",
" credential=credential,\n",
" subscription_id=SUBSCRIPTION,\n",
" resource_group_name=RESOURCE_GROUP,\n",
" workspace_name=WS_NAME,\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"> [!NOTE]\n",
 Creating MLClient will not">
"> Creating MLClient doesn't connect to the workspace. Client initialization is lazy: it waits until the first time it needs to make a call, which happens in the next code cell."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Verify that the handle works correctly.\n",
"# If you ge an error here, modify your SUBSCRIPTION, RESOURCE_GROUP, and WS_NAME in the previous cell.\n",
"ws = ml_client.workspaces.get(WS_NAME)\n",
"print(ws.location, \":\", ws.resource_group)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create training script\n",
"\n",
"Let's start by creating the training script - the *main.py* Python file.\n",
"\n",
"First create a source folder for the script:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"gather": {
"logged": 1679003739848
},
"name": "train_src_dir"
},
"outputs": [],
"source": [
"import os\n",
"\n",
"train_src_dir = \"./src\"\n",
"os.makedirs(train_src_dir, exist_ok=True)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"This script handles the preprocessing of the data, splitting it into test and train data. It then consumes this data to train a tree based model and return the output model. \n",
"\n",
"[MLFlow](https://learn.microsoft.com/azure/machine-learning/how-to-log-mlflow-models) will be used to log the parameters and metrics during our pipeline run. \n",
"\n",
"The cell below uses IPython magic to write the training script into the directory you just created."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"name": "write_main"
},
"outputs": [],
"source": [
"%%writefile {train_src_dir}/main.py\n",
"import os\n",
"import argparse\n",
"import pandas as pd\n",
"import mlflow\n",
"import mlflow.sklearn\n",
"from sklearn.ensemble import GradientBoostingClassifier\n",
"from sklearn.metrics import classification_report\n",
"from sklearn.model_selection import train_test_split\n",
"\n",
"def main():\n",
" \"\"\"Main function of the script.\"\"\"\n",
"\n",
" # input and output arguments\n",
" parser = argparse.ArgumentParser()\n",
" parser.add_argument(\"--data\", type=str, help=\"path to input data\")\n",
" parser.add_argument(\"--test_train_ratio\", type=float, required=False, default=0.25)\n",
" parser.add_argument(\"--n_estimators\", required=False, default=100, type=int)\n",
" parser.add_argument(\"--learning_rate\", required=False, default=0.1, type=float)\n",
" parser.add_argument(\"--registered_model_name\", type=str, help=\"model name\")\n",
" args = parser.parse_args()\n",
" \n",
" # Start Logging\n",
" mlflow.start_run()\n",
"\n",
" # enable autologging\n",
" mlflow.sklearn.autolog()\n",
"\n",
" ###################\n",
" #<prepare the data>\n",
" ###################\n",
" print(\" \".join(f\"{k}={v}\" for k, v in vars(args).items()))\n",
"\n",
" print(\"input data:\", args.data)\n",
" \n",
" credit_df = pd.read_csv(args.data, header=1, index_col=0)\n",
"\n",
" mlflow.log_metric(\"num_samples\", credit_df.shape[0])\n",
" mlflow.log_metric(\"num_features\", credit_df.shape[1] - 1)\n",
"\n",
" train_df, test_df = train_test_split(\n",
" credit_df,\n",
" test_size=args.test_train_ratio,\n",
" )\n",
" ####################\n",
" #</prepare the data>\n",
" ####################\n",
"\n",
" ##################\n",
" #<train the model>\n",
" ##################\n",
" # Extracting the label column\n",
" y_train = train_df.pop(\"default payment next month\")\n",
"\n",
" # convert the dataframe values to array\n",
" X_train = train_df.values\n",
"\n",
" # Extracting the label column\n",
" y_test = test_df.pop(\"default payment next month\")\n",
"\n",
" # convert the dataframe values to array\n",
" X_test = test_df.values\n",
"\n",
" print(f\"Training with data of shape {X_train.shape}\")\n",
"\n",
" clf = GradientBoostingClassifier(\n",
" n_estimators=args.n_estimators, learning_rate=args.learning_rate\n",
" )\n",
" clf.fit(X_train, y_train)\n",
"\n",
" y_pred = clf.predict(X_test)\n",
"\n",
" print(classification_report(y_test, y_pred))\n",
" ###################\n",
" #</train the model>\n",
" ###################\n",
"\n",
" ##########################\n",
" #<save and register model>\n",
" ##########################\n",
" # Registering the model to the workspace\n",
" print(\"Registering the model via MLFlow\")\n",
"\n",
" # pin numpy\n",
" conda_env = {\n",
" 'name': 'mlflow-env',\n",
" 'channels': ['conda-forge'],\n",
" 'dependencies': [\n",
" 'python=3.10.15',\n",
" 'pip<=21.3.1',\n",
" {\n",
" 'pip': [\n",
" 'mlflow==2.17.0',\n",
" 'cloudpickle==2.2.1',\n",
" 'pandas==1.5.3',\n",
" 'psutil==5.8.0',\n",
" 'scikit-learn==1.5.2',\n",
" 'numpy==1.26.4',\n",
" ]\n",
" }\n",
" ],\n",
" }\n",
"\n",
" mlflow.sklearn.log_model(\n",
" sk_model=clf,\n",
" registered_model_name=args.registered_model_name,\n",
" artifact_path=args.registered_model_name,\n",
" conda_env=conda_env,\n",
" )\n",
"\n",
" # Saving the model to a file\n",
" mlflow.sklearn.save_model(\n",
" sk_model=clf,\n",
" path=os.path.join(args.registered_model_name, \"trained_model\"),\n",
" )\n",
" ###########################\n",
" #</save and register model>\n",
" ###########################\n",
" \n",
" # Stop Logging\n",
" mlflow.end_run()\n",
"\n",
"if __name__ == \"__main__\":\n",
" main()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"As you can see in this script, once the model is trained, the model file is saved and registered to the workspace. Now you can use the registered model in inferencing endpoints.\n",
"\n",
"You might need to select **Refresh** to see the new folder and script in your **Files**.\n",
"\n",
"![refresh](./media/refresh.png)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Configure the command\n",
"\n",
"Now that you have a script that can perform the desired tasks, and a compute cluster to run the script, you'll use a general purpose **command** that can run command line actions. This command line action can directly call system commands or run a script. \n",
"\n",
"Here, you'll create input variables to specify the input data, split ratio, learning rate and registered model name. The command script will:\n",
"* Use an *environment* that defines software and runtime libraries needed for the training script. Azure Machine Learning provides many curated or ready-made environments, which are useful for common training and inference scenarios. You'll use one of those environments here. In the [Train a model](train-model.ipynb) tutorial, you'll learn how to create a custom environment. \n",
"* Configure the command line action itself - `python main.py` in this case. The inputs/outputs are accessible in the command via the `${{ ... }}` notation.\n",
"* In this sample, we access the data from a file on the internet. \n",
"* Since a compute resource was not specified, the script will be run on a [serverless compute cluster](https://learn.microsoft.com/azure/machine-learning/how-to-use-serverless-compute?view=azureml-api-2&tabs=python) that is automatically created."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"gather": {
"logged": 1679003747393
},
"name": "registered_model_name"
},
"outputs": [],
"source": [
"from azure.ai.ml import command\n",
"from azure.ai.ml import Input\n",
"\n",
"registered_model_name = \"credit_defaults_model\"\n",
"\n",
"job = command(\n",
" inputs=dict(\n",
" data=Input(\n",
" type=\"uri_file\",\n",
" path=\"https://azuremlexamples.blob.core.windows.net/datasets/credit_card/default_of_credit_card_clients.csv\",\n",
" ),\n",
" test_train_ratio=0.2,\n",
" learning_rate=0.25,\n",
" registered_model_name=registered_model_name,\n",
" ),\n",
" code=\"./src/\", # location of source code\n",
" command=\"python main.py --data ${{inputs.data}} --test_train_ratio ${{inputs.test_train_ratio}} --learning_rate ${{inputs.learning_rate}} --registered_model_name ${{inputs.registered_model_name}}\",\n",
" environment=\"azureml://registries/azureml/environments/sklearn-1.5/labels/latest\",\n",
" display_name=\"credit_default_prediction\",\n",
")"
]
},
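{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `${{inputs...}}` placeholders in `command` are filled in from `inputs` before `main.py` starts, so the script only sees ordinary command-line arguments. As a minimal sketch, the next cell repeats the argument definitions from `main.py` and parses an example argument list locally. The helper name `build_parser`, the variable names, and the data path are hypothetical and only for illustration; when the job runs, Azure Machine Learning supplies the actual value for `--data`.\n",
"\n",
"Because `--n_estimators` isn't passed in the job's command, the parsed value falls back to the default defined in the script."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import argparse\n",
"\n",
"\n",
"def build_parser():\n",
"    \"\"\"Return a parser with the same arguments that main.py defines.\"\"\"\n",
"    parser = argparse.ArgumentParser()\n",
"    parser.add_argument(\"--data\", type=str, help=\"path to input data\")\n",
"    parser.add_argument(\"--test_train_ratio\", type=float, required=False, default=0.25)\n",
"    parser.add_argument(\"--n_estimators\", required=False, default=100, type=int)\n",
"    parser.add_argument(\"--learning_rate\", required=False, default=0.1, type=float)\n",
"    parser.add_argument(\"--registered_model_name\", type=str, help=\"model name\")\n",
"    return parser\n",
"\n",
"\n",
"# Hypothetical argument list standing in for the rendered command line.\n",
"# The data path is a placeholder, not the path the job actually receives.\n",
"example_argv = [\n",
"    \"--data\", \"/tmp/example/default_of_credit_card_clients.csv\",\n",
"    \"--test_train_ratio\", \"0.2\",\n",
"    \"--learning_rate\", \"0.25\",\n",
"    \"--registered_model_name\", \"credit_defaults_model\",\n",
"]\n",
"\n",
"example_args = build_parser().parse_args(example_argv)\n",
"\n",
"# Values passed on the command line are converted to the declared types;\n",
"# n_estimators was not passed, so it keeps its default.\n",
"print(vars(example_args))"
]
},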
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Submit the job \n",
"\n",
"It's now time to submit the job to run in Azure Machine Learning. This time you'll use `create_or_update` on `ml_client`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"gather": {
"logged": 1679003755505
},
"name": "create_job"
},
"outputs": [],
"source": [
"ml_client.create_or_update(job)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## View job output and wait for job completion\n",
"\n",
"View the job in Azure Machine Learning studio by selecting the link in the output of the previous cell. \n",
"\n",
"The output of this job will look like this in the Azure Machine Learning studio. Explore the tabs for various details like metrics, outputs etc. Once completed, the job will register a model in your workspace as a result of training. \n",
"\n",
"![Screenshot that shows the job overview](./media/view-job.gif \"View the job in studio\")\n",
"\n",
"> [!IMPORTANT]\n",
 Wait until the status">
"> Wait until the job status is **Completed** before you return to this notebook to continue. The job takes 2 to 3 minutes to run. It can take longer (up to 10 minutes) if the compute has scaled down to zero nodes or the environment is still being built."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Deploy the model as an online endpoint\n",
"\n",
"Now deploy your machine learning model as a web service in the Azure cloud, an [`online endpoint`](https://docs.microsoft.com/azure/machine-learning/concept-endpoints).\n",
"\n",
"To deploy a machine learning service, you'll use the model you registered."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create a new online endpoint\n",
"\n",
"Now that you have a registered model, it's time to create your online endpoint. The endpoint name needs to be unique in the entire Azure region. For this tutorial, you'll create a unique name using [`UUID`](https://en.wikipedia.org/wiki/Universally_unique_identifier#:~:text=A%20universally%20unique%20identifier%20)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"gather": {
"logged": 1679003781233
},
"name": "online_endpoint_name"
},
"outputs": [],
"source": [
"import uuid\n",
"\n",
"# Creating a unique name for the endpoint\n",
"online_endpoint_name = \"credit-endpoint-\" + str(uuid.uuid4())[:8]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Create the endpoint:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"gather": {
"logged": 1679003878862
},
"name": "endpoint"
},
"outputs": [],
"source": [
"# Expect the endpoint creation to take a few minutes\n",
"from azure.ai.ml.entities import (\n",
" ManagedOnlineEndpoint,\n",
" ManagedOnlineDeployment,\n",
" Model,\n",
" Environment,\n",
")\n",
"\n",
"# create an online endpoint\n",
"endpoint = ManagedOnlineEndpoint(\n",
" name=online_endpoint_name,\n",
" description=\"this is an online endpoint\",\n",
" auth_mode=\"key\",\n",
" tags={\n",
" \"training_dataset\": \"credit_defaults\",\n",
" \"model_type\": \"sklearn.GradientBoostingClassifier\",\n",
" },\n",
")\n",
"\n",
"endpoint = ml_client.online_endpoints.begin_create_or_update(endpoint).result()\n",
"\n",
"print(f\"Endpoint {endpoint.name} provisioning state: {endpoint.provisioning_state}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> [!NOTE]\n",
"> Expect the endpoint creation to take a few minutes.\n",
"\n",
"Once the endpoint has been created, you can retrieve it as below:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"gather": {
"logged": 1679003879481
},
"name": "retrieve_endpoint"
},
"outputs": [],
"source": [
"endpoint = ml_client.online_endpoints.get(name=online_endpoint_name)\n",
"\n",
"print(\n",
" f'Endpoint \"{endpoint.name}\" with provisioning state \"{endpoint.provisioning_state}\" is retrieved'\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Deploy the model to the endpoint\n",
"\n",
"Once the endpoint is created, deploy the model with the entry script. Each endpoint can have multiple deployments. Direct traffic to these deployments can be specified using rules. Here you'll create a single deployment that handles 100% of the incoming traffic. We have chosen a color name for the deployment, for example, *blue*, *green*, *red* deployments, which is arbitrary.\n",
"\n",
"You can check the **Models** page on Azure Machine Learning studio, to identify the latest version of your registered model. Alternatively, the code below will retrieve the latest version number for you to use."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"gather": {
"logged": 1679003879136
},
"name": "latest_model_version"
},
"outputs": [],
"source": [
"# Let's pick the latest version of the model\n",
"latest_model_version = max(\n",
" [int(m.version) for m in ml_client.models.list(name=registered_model_name)]\n",
")\n",
"print(f'Latest model is version \"{latest_model_version}\" ')"
]
},
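{
"cell_type": "markdown",
"metadata": {},
"source": [
"The cell above converts each version with `int` before calling `max`. That conversion matters when versions are stored as text, because text is compared character by character. The sketch below uses a hypothetical list of version strings (not read from your workspace) to show how comparing the text directly would pick the wrong version once there are ten or more."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Hypothetical version strings, used only for illustration\n",
"example_versions = [\"1\", \"2\", \"9\", \"10\"]\n",
"\n",
"# Comparing text is lexicographic, so \"9\" sorts after \"10\"\n",
"latest_as_text = max(example_versions)\n",
"\n",
"# Comparing integers gives the numerically latest version\n",
"latest_as_int = max(int(v) for v in example_versions)\n",
"\n",
"print(f\"text comparison: {latest_as_text}, integer comparison: {latest_as_int}\")"
]
},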
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Deploy the latest version of the model. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"gather": {
"logged": 1679004373833
},
"name": "blue_deployment"
},
"outputs": [],
"source": [
"# picking the model to deploy. Here we use the latest version of our registered model\n",
"model = ml_client.models.get(name=registered_model_name, version=latest_model_version)\n",
"\n",
"# Expect this deployment to take approximately 6 to 8 minutes.\n",
"# create an online deployment.\n",
"# if you run into an out of quota error, change the instance_type to a comparable VM that is available.\n",
"# Learn more on https://azure.microsoft.com/en-us/pricing/details/machine-learning/.\n",
"blue_deployment = ManagedOnlineDeployment(\n",
" name=\"blue\",\n",
" endpoint_name=online_endpoint_name,\n",
" model=model,\n",
" instance_type=\"Standard_DS3_v2\",\n",
" instance_count=1,\n",
")\n",
"\n",
"blue_deployment = ml_client.begin_create_or_update(blue_deployment).result()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> [!NOTE]\n",
"> Expect this deployment to take approximately 6 to 8 minutes.\n",
"\n",
"When the deployment is done, you're ready to test it."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Test with a sample query\n",
"\n",
"Once the model is deployed to the endpoint, you can run inference with it.\n",
"\n",
"Create a sample request file following the design expected in the run method in the score script."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"gather": {
"logged": 1679004374166
},
"name": "deploy_dir"
},
"outputs": [],
"source": [
"deploy_dir = \"./deploy\"\n",
"os.makedirs(deploy_dir, exist_ok=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"name": "write_sample"
},
"outputs": [],
"source": [
"%%writefile {deploy_dir}/sample-request.json\n",
"{\n",
" \"input_data\": {\n",
" \"columns\": [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22],\n",
" \"index\": [0, 1],\n",
" \"data\": [\n",
" [20000,2,2,1,24,2,2,-1,-1,-2,-2,3913,3102,689,0,0,0,0,689,0,0,0,0],\n",
" [10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 10, 9, 8]\n",
" ]\n",
" }\n",
"}"
]
},
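{
"cell_type": "markdown",
"metadata": {},
"source": [
"The request file written above nests three fields under `input_data`: positional `columns`, a row `index`, and the `data` rows themselves. As a sketch, the next cell assembles the same layout from a list of feature rows. The helper name `build_request` and the variable names are hypothetical; the cell only builds the payload in memory and doesn't replace the file written above."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"\n",
"\n",
"def build_request(rows):\n",
"    \"\"\"Wrap feature rows in the columns/index/data layout of the sample request.\"\"\"\n",
"    n_columns = len(rows[0])\n",
"    if any(len(row) != n_columns for row in rows):\n",
"        raise ValueError(\"every row must have the same number of features\")\n",
"    return {\n",
"        \"input_data\": {\n",
"            \"columns\": list(range(n_columns)),\n",
"            \"index\": list(range(len(rows))),\n",
"            \"data\": rows,\n",
"        }\n",
"    }\n",
"\n",
"\n",
"# The same two rows that appear in sample-request.json\n",
"example_rows = [\n",
"    [20000, 2, 2, 1, 24, 2, 2, -1, -1, -2, -2, 3913, 3102, 689, 0, 0, 0, 0, 689, 0, 0, 0, 0],\n",
"    [10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 10, 9, 8],\n",
"]\n",
"\n",
"example_payload = build_request(example_rows)\n",
"\n",
"# Serialize to JSON text, the form a request file would contain\n",
"print(json.dumps(example_payload, indent=2))"
]
},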
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"name": "test"
},
"outputs": [],
"source": [
"# test the blue deployment with some sample data\n",
"ml_client.online_endpoints.invoke(\n",
" endpoint_name=online_endpoint_name,\n",
" request_file=\"./deploy/sample-request.json\",\n",
" deployment_name=\"blue\",\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Clean up resources\n",
"\n",
"If you're not going to use the endpoint, delete it to stop using the resource. Make sure no other deployments are using an endpoint before you delete it.\n",
"\n",
"\n",
"> [!NOTE]\n",
"> Expect the complete deletion to take approximately 20 minutes."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"name": "delete_endpoint"
},
"outputs": [],
"source": [
"ml_client.online_endpoints.begin_delete(name=online_endpoint_name)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Next Steps\n",
"\n",
"You now have an Azure Machine Learning workspace, which contains a compute instance to use for your development environment.\n",
"\n",
"Continue on to learn how to use the compute instance to run notebooks and scripts in the Azure Machine Learning cloud. \n",
"\n",
"|Tutorial |Description |\n",
"|---------|---------|\n",
"| [Tutorial: Upload, access and explore your data in Azure Machine Learning](https://learn.microsoft.com/azure/tutorial-explore-data) | Store large data in the cloud and retrieve it from notebooks and scripts |\n",
"| [Tutorial: Model development on a cloud workstation](https://learn.microsoft.com/azure/tutorial-cloud-workstation) | Start prototyping and developing machine learning models |\n",
"| [Tutorial: Train a model in Azure Machine Learning](https://learn.microsoft.com/azure/tutorial-train-model) | Dive in to the details of training a model |\n",
"| [Tutorial: Deploy a model as an online endpoint](https://learn.microsoft.com/azure/tutorial-deploy-model) | Dive in to the details of deploying a model |\n",
"| [Tutorial: Create production machine learning pipelines](https://learn.microsoft.com/azure/tutorial-pipeline-python-sdk) | Split a complete machine learning task into a multistep workflow. |"
]
}
],
"metadata": {
"description": "Learn how a data scientist uses Azure Machine Learning to train a model, then use the model for prediction. This tutorial will help you become familiar with the core concepts of Azure ML and their most common usage.",
"kernel_info": {
"name": "python310-sdkv2"
},
"kernelspec": {
"display_name": "Python 3.10 - SDK v2",
"language": "python",
"name": "python310-sdkv2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
},
"microsoft": {
"ms_spell_check": {
"ms_spell_check_language": "en"
}
},
"nteract": {
"version": "nteract-front-end@1.0.0"
}
},
"nbformat": 4,
"nbformat_minor": 2
}