Deleted Azure-DevOps lab.ipynb

This commit is contained in:
Wolfgang Pauli 2019-04-07 01:46:31 +00:00
Parent 3ffe0fb527
Commit a81f06a79b
1 changed file with 0 additions and 498 deletions

@@ -1,498 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Azure DevOps\n",
"\n",
"With Azure DevOps data scientists and application developers can work together to create and maintain AI-infused applications. Using a DevOps mindset is not new to software developers, who are used to running applications in production. However, data scientists in the past have often worked in silos and not followed best practices to facilitate the transition from development to production. With Azure DevOps data scientists can now develop with an eye toward production."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Part 1: Getting started\n",
"\n",
"This lab allows you to perform setup for building a **Continuous Integration/Continuous Deployment** pipeline related to Anomoly Detection and Predictive Maintenance.\n",
"\n",
"### Pre-requisites\n",
"\n",
"- Azure account\n",
"- Azure DevOps account\n",
"- Azure Machine Learning Service Workspace\n",
"- Basic knowledge of Python\n",
"\n",
"After you launch your environment, follow the below steps:\n",
"\n",
"### Azure Machine Learning Service Workspace\n",
"\n",
"We will begin the lab by creating a new Machine Learning Service Workspace using Azure portal:\n",
"\n",
"1. Login to Azure portal using the credentials provided with the environment.\n",
"\n",
"2. Select **Create a Resource** and search the marketplace for **Machine Learning Service Workspace**.\n",
"\n",
"![Market Place](../images/marketplace.png)\n",
"\n",
"3. Select **Machine Learning Service Workspace** followed by **Create**:\n",
"\n",
"![Create Workspace](../images/createWorkspace.png)\n",
"\n",
"4. Populate the mandatory fields (Workspace name, Subscription, Resource group and Location):\n",
"\n",
"![Workspace Fields](../images/workspaceFields.png)\n",
"\n",
"### Sign in to Azure DevOps\n",
"\n",
"Go to **https://dev.azure.com** and login using your Azure username and password. You will be asked to provide a name and email. An organization is created for you based on the name you provide. Within the organization, you will be asked to create a project. Name your project \"ADPM\" and click on **Create project**. With private projects, only people you give access to will be able to view this project. After logging in, you should see the below:\n",
"\n",
"![Get Started](../images/getStarted.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create Service connection\n",
"\n",
"The build pipeline for our project will need the proper permission settings so that it can create a remote compute target in Azure. This can be done by setting up a **Service Connection** and authorizing the build pipeline to use this connection.\n",
"\n",
 If we didn't">
"> If we didn't set up this **service connection**, we would have to interactively log into Azure (e.g. az login) every time we run the build pipeline (the sketch at the end of this section illustrates the non-interactive alternative).\n",
"> The steps below assume that we already have a service principal. If this is not the case, you can go to [this link](https://docs.microsoft.com/en-us/azure/active-directory/develop/howto-create-service-principal-portal) for instructions on how to create a service principal using Azure Active Directory. Please name your service principal `servicePrincipal` (with a capital P) to avoid having to change the upcoming lab scripts. Once you have your **directory (tenant) ID**, the **application (client) ID** and the **application secret key**, you can return to run the rest of the instructions.\n",
"\n",
"Setting up a service connection involves the following steps:\n",
"1. Click on **Project settings** in the bottom-left corner of your screen.\n",
"2. On the next page, search for menu section **Pipelines** and select **Service Connection**.\n",
"3. Create a **New service connection**, of type **Azure Resource Manager**.\n",
"\n",
"![Get Started](../images/createServiceConnection.png)\n",
"\n",
"4. On the page you are presented with, scroll down and click on the link saying **use the full version of the service connection dialog**.\n",
"\n",
"![Get Started](../images/changeToFullVersionServiceConnection.png)\n",
"\n",
"5. Begin filling out the full version of the form. All the information you need is provided in the lab setup page. If you closed this page, a link to it was emailed to you. Look for emails from **No Reply (CloudLabs) <noreply@cloudlabs.ai>**.\n",
"\n",
"![Get Started](../images/fullDialogueServiceConnection.png \"width=50\")\n",
"\n",
" - **Important!** Set **connection name** to **serviceConnection** (careful about capitalization).\n",
" - For **Service principal client ID** paste the field called **Application/Client Id** in the lab setup page.\n",
" - Set **Scope level** to **Subscription**.\n",
" - For **Subscription**, select the same which you have been using throughout the course. You may already have a compute target in there (e.g. \"aml-copute\") and a AML workspace.\n",
" - **Important!** Leave **Resource Group** empty.\n",
" - For **Service principal key** paste the filed called **Application Secret Key** in the lab setup page.\n",
" - Allow all pipelines to use this connection.\n",
" - Click on **Verify connection** to make sure the connection is valid and then click on **OK**."
]
},
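{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see why the service principal removes the need for interactive logins, below is a minimal sketch of how a script could authenticate with it. This is for illustration only (it is not one of the lab scripts), and the environment variable names are made up:\n",
"\n",
"```python\n",
"# Sketch only: authenticate non-interactively with the service principal,\n",
"# so no `az login` prompt is needed. Environment variable names are illustrative.\n",
"import os\n",
"from azureml.core import Workspace\n",
"from azureml.core.authentication import ServicePrincipalAuthentication\n",
"\n",
"sp_auth = ServicePrincipalAuthentication(\n",
"    os.environ['TENANT_ID'],      # directory (tenant) ID\n",
"    os.environ['CLIENT_ID'],      # application (client) ID\n",
"    os.environ['CLIENT_SECRET'])  # application secret key\n",
"\n",
"ws = Workspace.from_config(auth=sp_auth)  # reads config.json\n",
"print(ws.name)\n",
"```"
]
},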
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Repository\n",
"\n",
"After you create your project in Azure DevOps, the next step is to clone our repository into your DevOps project. The simplest way is to go to **Repos > Files > Import** as shown below. Provide the clone url (https://github.com/azure/learnai-customai-airlift) in the wizard to import.\n",
"\n",
"![import repository](../images/importGit.png)\n",
"\n",
"You should now be able to see the git repo in your project."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Part 2: Building a pipeline\n",
"\n",
"Tha aim of this lab is to demonstrate how you can build a Continuous Integration/Continuous Deployment pipeline and kick it off when there is a new commit. This scenario is typically very common when a developer has updated the application part of the code repository or when the training script from a data scientist is updated."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Hosted Agents\n",
"\n",
"With Azure Pipelines, you've got a convenient option to build and deploy using a **Microsoft-hosted agent**. Each time you run a pipeline, you get a fresh virtual machine and maintenance/upgrades are taken care of. The virtual machine is discarded after one use. The Microsoft-hosted agent pool provides 5 virtual machine images to choose from:\n",
"\n",
"- Ubuntu 16.04\n",
"- Visual Studio 2017 on Windows Server 2016\n",
"- macOS 10.13\n",
"- Windows Server 1803 (win1803) - for running Windows containers\n",
"- Visual Studio 2015 on Windows Server 2012R2\n",
"\n",
"YAML-based pipelines will default to the Microsoft-hosted agent pool. You simply need to specify which virtual machine image you want to use."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Code Repository\n",
"\n",
"The repo is organized as follows:\n",
"\n",
"```\n",
" code\n",
" code/testing/\n",
" code/scoring/\n",
" code/aml_config/\n",
" data_sample\n",
" azure-pipelines.yml\n",
"```\n",
"\n",
"The `code` folder contains all the python scripts to build the pipeline. The testing and scoring scripts are located in `code/testing/` and `code/scoring/` respectively. The config files created by the scripts are stored in `code/aml_config/`."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## About the scripts\n",
"\n",
"For the purpose of DevOps, it's best not to use a Notebook because it can be error-prone. Instead, we have all the code sitting in individual Python scripts. This means that if we used a Notebook to develop our scripts, like we did throughout this course, we have some work to do to refactor the code and turn it into a series of modular Python scripts. We would also add scripts for running various tests everytime our build is triggered, such as unit tests, integration tests, tests to measure **drift** (a degradation over time of the predictions returned by the model on incoming data), etc."
]
},
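{
"cell_type": "markdown",
"metadata": {},
"source": [
"As an illustration of the kind of test that could run on every build, here is a hypothetical schema check. It is not part of the repo, and the column names and file path are examples only:\n",
"\n",
"```python\n",
"# Hypothetical unit test (not in the repo): verify the telemetry schema\n",
"# before training. Column names and path below are examples only.\n",
"import pandas as pd\n",
"\n",
"EXPECTED_COLUMNS = ['datetime', 'machineID', 'volt', 'rotate', 'pressure', 'vibration']\n",
"\n",
"def test_telemetry_schema():\n",
"    df = pd.read_csv('devops/data_sample/telemetry_incremental.csv')\n",
"    assert list(df.columns) == EXPECTED_COLUMNS, 'unexpected telemetry schema'\n",
"```"
]
},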
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's take a look at a brief overview of what each script does:\n",
"\n",
"| num | script | what it does |\n",
"| --- | ------------------------ | ----------------------------------------------- |\n",
"| 1 | anom_detect.py | detect anomalies in data and output them |\n",
"| 2 | automl_step.py | train a PdM model using automated ML |\n",
"| 3 | pipeline.py | runs 1 and 2 against a remote compute target |\n",
"| 4 | evaluate_model.py | evaluates the result of 2 |\n",
"| 5 | register_model.py | registeres the best model |\n",
"| 6 | scoring/score.py | scoring script |\n",
"| 7 | create_scoring_image.py | creates a scoring image from the scoring script |\n",
"| 8 | deploy_aci.py | deploys scoring image to ACI |\n",
"| 9 | aci_service_test.py | tests the ACI deployment |\n",
"| 10 | testing/data_test.py | used to test the ACI deployment |\n",
"| 11 | deploy_aks.py | deploys the AKS deployment |\n",
"| 12 | aks_service_test.py | tests the AKS deployment |\n"
]
},
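{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the table above more concrete, the sketch below shows roughly what a script like `pipeline.py` does with the Azure ML SDK: provision a remote compute target and submit steps 1 and 2 as a pipeline. It is a simplification rather than the actual script, and the compute target, step and experiment names are illustrative:\n",
"\n",
"```python\n",
"# Simplified sketch of the idea behind pipeline.py (the real script lives in\n",
"# devops/code/pipeline.py). Compute target, step and experiment names are illustrative.\n",
"from azureml.core import Workspace, Experiment\n",
"from azureml.core.compute import AmlCompute, ComputeTarget\n",
"from azureml.pipeline.core import Pipeline\n",
"from azureml.pipeline.steps import PythonScriptStep\n",
"\n",
"ws = Workspace.from_config()\n",
"\n",
"# Provision (or reuse) a remote compute target.\n",
"compute = ComputeTarget.create(\n",
"    ws, 'aml-compute',\n",
"    AmlCompute.provisioning_configuration(vm_size='STANDARD_D2_V2', max_nodes=2))\n",
"compute.wait_for_completion(show_output=True)\n",
"\n",
"# Step 1: anomaly detection; step 2: automated ML training.\n",
"anom_step = PythonScriptStep(name='anom_detect', script_name='anom_detect.py',\n",
"                             compute_target=compute, source_directory='devops/code')\n",
"automl_step = PythonScriptStep(name='automl_step', script_name='automl_step.py',\n",
"                               compute_target=compute, source_directory='devops/code')\n",
"automl_step.run_after(anom_step)\n",
"\n",
"pipeline = Pipeline(workspace=ws, steps=[automl_step])\n",
"Experiment(ws, 'adpm').submit(pipeline).wait_for_completion(show_output=True)\n",
"```"
]
},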
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In addition to the Python scripts. We have another script called `azure-pipeline.yml`, which contains in it the logic for our build. Like a **conda config** file or a **dockerfile**, this file allows us to set in place *infrastructure as code*. Let's take a look at its content:"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Overwriting ./azure-pipelines.yml\n"
]
}
],
"source": [
"# %load ./azure-pipelines.yml\n",
"pool:\n",
" vmImage: 'Ubuntu 16.04'\n",
"steps:\n",
"- task: UsePythonVersion@0\n",
" inputs:\n",
" versionSpec: 3.5\n",
" architecture: 'x64'\n",
"\n",
"- task: DownloadSecureFile@1\n",
" inputs:\n",
" name: configFile\n",
" secureFile: config.json\n",
"- script: echo \"Printing the secure file path\" \n",
"- script: cp $(Agent.TempDirectory)/config.json $(Build.SourcesDirectory)\n",
"\n",
"- task: CondaEnvironment@1\n",
" displayName: 'Create Conda Environment '\n",
" inputs:\n",
" createCustomEnvironment: true\n",
" environmentName: azuremlsdk\n",
" packageSpecs: 'python=3.6'\n",
" updateConda: false\n",
" createOptions: 'cython==0.29 urllib3<1.24'\n",
"- script: |\n",
" pip install --user azureml-sdk==1.0.17 pandas\n",
" displayName: 'Install prerequisites'\n",
"\n",
"- task: AzureCLI@1\n",
" displayName: 'Azure CLI devops/code/pipeline.py'\n",
" inputs:\n",
" azureSubscription: 'serviceConnection'\n",
" scriptLocation: inlineScript\n",
" inlineScript: 'python devops/code/pipeline.py'\n",
"\n",
"- task: AzureCLI@1\n",
" displayName: 'Azure CLI devops/code/evaluate_model.py'\n",
" inputs:\n",
" azureSubscription: 'serviceConnection'\n",
" scriptLocation: inlineScript\n",
" inlineScript: 'python devops/code/evaluate_model.py'\n",
"\n",
"- task: AzureCLI@1\n",
" displayName: 'Azure CLI devops/code/register_model.py'\n",
" inputs:\n",
" azureSubscription: 'serviceConnection'\n",
" scriptLocation: inlineScript\n",
" inlineScript: 'python devops/code/register_model.py'\n",
"\n",
"- task: AzureCLI@1\n",
" displayName: 'Azure CLI devops/code/create_scoring_image.py'\n",
" inputs:\n",
" azureSubscription: 'serviceConnection'\n",
" scriptLocation: inlineScript\n",
" inlineScript: 'python devops/code/create_scoring_image.py'\n",
"\n",
"- task: AzureCLI@1\n",
" displayName: 'Azure CLI devops/code/deploy_aci.py'\n",
" inputs:\n",
" azureSubscription: 'serviceConnection'\n",
" scriptLocation: inlineScript\n",
" inlineScript: 'python devops/code/deploy_aci.py'\n",
" \n",
"- task: AzureCLI@1\n",
" displayName: 'Azure CLI devops/code/aci_service_test.py'\n",
" inputs:\n",
" azureSubscription: 'serviceConnection'\n",
" scriptLocation: inlineScript\n",
" inlineScript: 'python devops/code/aci_service_test.py'\n",
"- script: |\n",
" python devops/code/testing/data_test.py devops/data_sample/predmain_bad_schema.csv\n",
" displayName: 'Test Schema'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Creating a config file and uploading it as a Secure File\n",
"\n",
"On your own labtop, create a file called `config.json` to capture the `subscription_id`, `resource_group`, `workspace_name` and `workspace_region`:\n",
"\n",
"```\n",
"{\n",
" \"subscription_id\": \".......\",\n",
" \"resource_group\": \".......\",\n",
" \"workspace_name\": \".......\",\n",
" \"workspace_region\": \".......\"\n",
"}\n",
"```\n",
"\n",
"You can get all of the info from the Machine Learning Service Workspace created in the portal as shown below. **Attention:** For `workspace_region` use one word and all lowercase, e.g. `westus2`.\n",
"\n",
"![ML Workspace](../images/configFileOnPortal.png)"
]
},
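{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before uploading the file in the next step, you can optionally sanity-check it locally. The following is a minimal sketch, assuming the azureml-sdk is installed on your machine:\n",
"\n",
"```python\n",
"# Optional local check (a sketch): confirm config.json resolves to the\n",
"# intended workspace before uploading it as a Secure File.\n",
"import json\n",
"from azureml.core import Workspace\n",
"\n",
"with open('config.json') as f:\n",
"    cfg = json.load(f)\n",
"\n",
"ws = Workspace.get(name=cfg['workspace_name'],\n",
"                   subscription_id=cfg['subscription_id'],\n",
"                   resource_group=cfg['resource_group'])\n",
"print(ws.name, ws.location)\n",
"```"
]
},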
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It's not best practice to commit the above config information to your source repository. To address this, we can use the Secure Files library to store files such as signing certificates, Apple Provisioning Profiles, Android Keystore files, and SSH keys on the server without having to commit them to your source repository. Secure files are defined and managed in the Library tab in Azure Pipelines.\n",
"\n",
"The contents of the secure files are encrypted and can only be used during the build or release pipeline by referencing them from a task. There's a size limit of 10 MB for each secure file.\n",
"\n",
"#### Upload Secure File\n",
"\n",
"1. Select **Pipelines**, **Library** and **Secure Files**, then **+Secure File** to upload `config.json` file.\n",
"\n",
"![Upload Secure File](../images/uploadSecureFile.png)\n",
"\n",
"2. Select the uploaded file `config.json` and ensure **Authorize for use in all pipelines** is ticked and click on **Save**. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Creating a build\n",
"\n",
"Azure Pipelines allow you to build AI applications without needing to set up any infrastructure of your own. Python is preinstalled on Microsoft-hosted agents in Azure Pipelines. You can use Linux, macOS, or Windows agents to run your builds.\n",
"\n",
"#### New Pipeline\n",
"\n",
"1. To create a new pipeline, select **New pipeline** from the Pipelines blade:\n",
"\n",
" ![New Pipeline](../images/newPipeline.png)\n",
"\n",
"2. You will be prompted with **Where is your code?**. Select **Azure Repos** followed by your repo.\n",
"\n",
"3. Select **Run**. Once the agent is allocated, you'll start seeing the live logs of the build.\n",
"\n",
"#### Notification\n",
"\n",
"The summary and status of the build will be sent to the email registered (i.e. Azure login user). Login using the email registered at `www.office.com` to view the notification."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Azure Pipelines with YAML\n",
"\n",
"You can define your pipeline using a YAML file: `azure-pipelines.yml` alongside the rest of the code for your app. The big advantage of using YAML is that the pipeline is versioned with the code and follows the same branching structure. \n",
"\n",
"The basic steps include:\n",
"\n",
"1. Configure Azure Pipelines to use your Git repo.\n",
"2. Edit your `azure-pipelines.yml` file to define your build.\n",
"3. Push your code to your version control repository which kicks off the default trigger to build and deploy.\n",
"4. Code is now updated, built, tested, and packaged. It can be deployed to any target.\n",
"\n",
"![Pipelines-Image-Yam](../images/pipelines-image-yaml.png)\n",
"\n",
"\n",
"Open the yml file in the repo to understand the build steps."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Creating test scripts\n",
"\n",
"In this workshop, multiple tests are included:\n",
"\n",
"1. A basic test script `code/testing/data_test.py` is provided to test the schema of the json data for prediction using sample data in `data_sample/predmain_bad_schema.csv`.\n",
"\n",
"2. `code/aci_service_test.py` and `code/aks_service_test.py` to test deployment using ACI and AKS respectively.\n",
"\n",
"#### Exercise\n",
"\n",
"- Can you either extend `code/testing/data_test.py` or create a new one to check for the feature types? \n",
"\n",
"- `code/aci_service_test.py` and `code/aks_service_test.py` scripts check if you are getting scores from the deployed service. Can you check if you are getting the desired scores by modifying the scripts?\n",
"\n",
"- Make sure `azure-pipelines.yml` captures the above changes"
]
},
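{
"cell_type": "markdown",
"metadata": {},
"source": [
"One possible direction for the first exercise is to compare the inferred dtypes of the incoming data against an expected mapping. The sketch below is only a starting point; the column names and dtypes are illustrative:\n",
"\n",
"```python\n",
"# Sketch of a feature-type check; column names and dtypes are illustrative.\n",
"import sys\n",
"import pandas as pd\n",
"\n",
"EXPECTED_DTYPES = {'machineID': 'int64', 'volt': 'float64', 'rotate': 'float64'}\n",
"\n",
"def check_feature_types(csv_path):\n",
"    df = pd.read_csv(csv_path)\n",
"    for col, dtype in EXPECTED_DTYPES.items():\n",
"        assert str(df[col].dtype) == dtype, col + ' has an unexpected dtype'\n",
"\n",
"if __name__ == '__main__':\n",
"    check_feature_types(sys.argv[1])\n",
"    print('feature types OK')\n",
"```"
]
},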
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Build trigger (continuous deployment trigger)\n",
"\n",
"Along with the time triggers, we cann can also create a release every time a new build is available.\n",
"\n",
"1. Enable the *Continuous deployment trigger* and ensure *Enabled* is selected in the *Continuous deployment trigger* configuration as shown below:\n",
"\n",
"![Release Build Trigger](../images/releaseBuildTrigger.png)\n",
"\n",
"2. Populate the branch in *Build branch filters*. A release will be triggered only for a build that is from one of the branches populated. For example, selecting \"master\" will trigger a release for every build from the master branch.\n",
"\n",
"#### Approvals\n",
"\n",
"For the QC task, you will recieve an *Azure DevOps Notifaction* email to view approval. On selecting *View Approval*, you will be taken to the following page to approve/reject:\n",
"\n",
"![Pending Approval](../images/pendingApproval.png)\n",
"\n",
"There is also provision to include comments with approval/reject:\n",
"\n",
"![Approval Comments](../images/approvalComments.png)\n",
"\n",
"Once the post-deployment approvals are approved by the users chosen, the pipeline will be listed with a green tick next to QC under the list of release pipelines: \n",
"\n",
"![Release Passed](../images/releasePassed.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Application Insights (optional)\n",
"\n",
"For your convenience, Azure Application Insights is automatically added when you create the Azure Machine Learning workspace. In this section, we will look at how we can investigate the predictions from the service created using `Analytics`. Analytics is the powerful search and query tool of Application Insights. Analytics is a web tool so no setup is required.\n",
"\n",
"Run the below script (after replacing `<scoring_url>` and `<key>`) locally to obtain the predictions. You can also change `input_j` to obtain different predictions.\n",
"\n",
"```python\n",
"import requests\n",
"import json\n",
"\n",
"input_j = [[1.92168882e+02, 5.82427351e+02, 2.09748253e+02, 4.32529303e+01, 1.52377597e+01, 5.37307613e+01, 1.15729573e+01, 4.27624778e+00, 1.68042813e+02, 4.61654301e+02, 1.03138200e+02, 4.08555785e+01, 1.80809993e+01, 4.85402042e+01, 1.09373285e+01, 4.18269355e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 3.07200000e+03, 5.64000000e+02, 2.22900000e+03, 9.84000000e+02, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 3.03000000e+02, 6.63000000e+02, 3.18300000e+03, 3.03000000e+02, 5.34300000e+03, 4.26300000e+03, 6.88200000e+03, 1.02300000e+03, 1.80000000e+01]]\n",
"\n",
"data = json.dumps({'data': input_j})\n",
"test_sample = bytes(data, encoding = 'utf8')\n",
"\n",
"url = '<scoring_url>'\n",
"api_key = '<key>' \n",
"headers = {'Content-Type':'application/json', 'Authorization':('Bearer '+ api_key)}\n",
"\n",
"resp = requests.post(url, test_sample, headers=headers)\n",
"print(resp.text)\n",
"\n",
"```\n",
"\n",
"1. From the Machine Learning Workspace in the portal, Select `Application Insights` in the overview tab:\n",
"\n",
"![ML Workspace](../images/mlworkspace.png)\n",
"\n",
"2. Select Analytics.\n",
"\n",
"3. The predictions will be logged which can be queried in the Log Analytics page in the Azure portal as shown below. For example, to query `requests`, run the following query:\n",
"\n",
"````\n",
" requests\n",
" | where timestamp > ago(3h)\n",
"````\n",
"\n",
"![LogAnalytics Query](../images/logAnalyticsQuery.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Data Changes\n",
"\n",
"A data scientist may want to trigger the pipeline when new data is available. To illustrate this, a small incremental data is made available in `data_sample\\telemetry_incremental.csv` which is picked up in the below code snippet of anom_detect.py:\n",
"\n",
"````python\n",
" print(\"Adding incremental data...\")\n",
" telemetry_incremental = pd.read_csv(os.path.join('data_sample/', 'telemetry_incremental.csv'))\n",
" telemetry = telemetry.append(telemetry_incremental, ignore_index=True)\n",
"````\n",
"\n",
"The data changes would cause a change in the model evaluation and if it's better than the baseline model, it would be propagated for deployment."
]
},
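{
"cell_type": "markdown",
"metadata": {},
"source": [
"The promotion logic can be as simple as comparing a metric of the newly trained model against the currently registered baseline. The sketch below only illustrates the idea (it is not the actual `evaluate_model.py`, and the metric name and numbers are made up):\n",
"\n",
"```python\n",
"# Sketch of the promotion idea (not the actual evaluate_model.py):\n",
"# deploy the new model only if it beats the current baseline on a chosen metric.\n",
"def should_promote(new_metrics, baseline_metrics, metric='accuracy'):\n",
"    # No baseline yet: promote the first model unconditionally.\n",
"    if baseline_metrics is None:\n",
"        return True\n",
"    return new_metrics[metric] > baseline_metrics[metric]\n",
"\n",
"# Example usage with made-up numbers:\n",
"print(should_promote({'accuracy': 0.91}, {'accuracy': 0.88}))  # True\n",
"```"
]
},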
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python [conda env:learnai-adpm]",
"language": "python",
"name": "conda-env-learnai-adpm-py"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.8"
}
},
"nbformat": 4,
"nbformat_minor": 2
}