This commit is contained in:
Seth Mottaghinejad 2019-03-06 16:12:23 -08:00
Parent 918aad527f
Commit 70483a05c4
71 changed files with 10863 additions and 653 deletions

4
.gitignore vendored
View file

@@ -3,4 +3,6 @@ config.json
train.csv
test.csv
sample_submission.csv
mnt_blob_rw.ipynb
data/*
.ipynb_checkpoints

View file

@@ -0,0 +1,497 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Azure DevOps\n",
"\n",
"With Azure DevOps data scientists and application developers can work together to create and maintain AI-infused applications. Using a DevOps mindset is not new to software developers, who are used to running applications in production. However, data scientists in the past have often worked in silos and not followed best practices to facilitate the transition from development to production. With Azure DevOps data scientists can now develop with an eye toward production."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Part 1: Getting started\n",
"\n",
"This lab allows you to perform setup for building a **Continuous Integration/Continuous Deployment** pipeline related to Anomoly Detection and Predictive Maintenance.\n",
"\n",
"### Pre-requisites\n",
"\n",
"- Azure account\n",
"- Azure DevOps account\n",
"- Azure Machine Learning Service Workspace\n",
"- Basic knowledge of Python\n",
"\n",
"After you launch your environment, follow the below steps:\n",
"\n",
"### Azure Machine Learning Service Workspace\n",
"\n",
"We will begin the lab by creating a new Machine Learning Service Workspace using Azure portal:\n",
"\n",
"1. Login to Azure portal using the credentials provided with the environment.\n",
"\n",
"2. Select **Create a Resource** and search the marketplace for **Machine Learning Service Workspace**.\n",
"\n",
"![Market Place](../images/marketplace.png)\n",
"\n",
"3. Select **Machine Learning Service Workspace** followed by **Create**:\n",
"\n",
"![Create Workspace](../images/createWorkspace.png)\n",
"\n",
"4. Populate the mandatory fields (Workspace name, Subscription, Resource group and Location):\n",
"\n",
"![Workspace Fields](../images/workspaceFields.png)\n",
"\n",
"### Sign in to Azure DevOps\n",
"\n",
"Go to **https://dev.azure.com** and login using your Azure username and password. You will be asked to provide a name and email. An organization is created for you based on the name you provide. Within the organization, you will be asked to create a project. Name your project \"ADPM\" and click on **Create project**. With private projects, only people you give access to will be able to view this project. After logging in, you should see the below:\n",
"\n",
"![Get Started](../images/getStarted.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create Service connection\n",
"\n",
"The build pipeline for our project will need the proper permission settings so that it can create a remote compute target in Azure. This can be done by setting up a **Service Connection** and authorizing the build pipeline to use this connection.\n",
"\n",
"> If we didn't set up this **service connection**, we would have to interactively log into Azure (e.g. az login) everytime we run the build pipeline.\n",
"\n",
"Setting up a service connection involves the following steps:\n",
"1. Click on **Project settings** in the bottom-left corner of your screen.\n",
"2. On the next page, search for menu section **Pipelines** and select **Service Connection**.\n",
"3. Create a **New service connection**, of type **Azure Resource Manager**.\n",
"\n",
"![Get Started](../images/createServiceConnection.png)\n",
"\n",
"4. On the page you are presented with, scroll down and click on the link saying **use the full version of the service connection dialog**.\n",
"\n",
"![Get Started](../images/changeToFullVersionServiceConnection.png)\n",
"\n",
"5. Begin filling out the full version of the form. All the information you need is provided in the lab setup page. If you closed this page, a link to it was emailed to you. Look for emails from **No Reply (CloudLabs) <noreply@cloudlabs.ai>**.\n",
"\n",
"![Get Started](../images/fullDialogueServiceConnection.png \"width=50\")\n",
"\n",
" - **Important!** Set **connection name** to **serviceConnection** (careful about capitalization).\n",
" - For **Service principal client ID** paste the field called **Application/Client Id** in the lab setup page.\n",
" - Set **Scope level** to **Subscription**.\n",
" - For **Subscription**, select the same which you have been using throughout the course. You may already have a compute target in there (e.g. \"aml-copute\") and a AML workspace.\n",
" - **Important!** Leave **Resource Group** empty.\n",
" - For **Service principal key** paste the filed called **Application Secret Key** in the lab setup page.\n",
" - Allow all pipelines to use this connection.\n",
" - Click on **Verify connection** to make sure the connection is valid and then click on **OK**."
]
},
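{
"cell_type": "markdown",
"metadata": {},
"source": [
"As an aside, the same service principal can also authenticate the SDK when you run these scripts outside the pipeline. A minimal sketch using the `ServicePrincipalAuthentication` class (the credential values are placeholders for the values from the lab setup page):\n",
"\n",
"```python\n",
"from azureml.core import Workspace\n",
"from azureml.core.authentication import ServicePrincipalAuthentication\n",
"\n",
"# placeholders: tenant id, application/client id and application secret key from the lab setup page\n",
"spa = ServicePrincipalAuthentication(\"<tenant_id>\", \"<client_id>\", \"<client_secret>\")\n",
"ws = Workspace.from_config(auth=spa)\n",
"```"
]
},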
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Repository\n",
"\n",
"After you create your project in Azure DevOps, the next step is to clone our repository into your DevOps project. The simplest way is to go to **Repos > Files > Import** as shown below. Provide the clone url (https://github.com/azure/learnai-customai-airlift) in the wizard to import.\n",
"\n",
"![import repository](../images/importGit.png)\n",
"\n",
"You should now be able to see the git repo in your project."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Part 2: Building a pipeline\n",
"\n",
"Tha aim of this lab is to demonstrate how you can build a Continuous Integration/Continuous Deployment pipeline and kick it off when there is a new commit. This scenario is typically very common when a developer has updated the application part of the code repository or when the training script from a data scientist is updated."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Hosted Agents\n",
"\n",
"With Azure Pipelines, you've got a convenient option to build and deploy using a **Microsoft-hosted agent**. Each time you run a pipeline, you get a fresh virtual machine and maintenance/upgrades are taken care of. The virtual machine is discarded after one use. The Microsoft-hosted agent pool provides 5 virtual machine images to choose from:\n",
"\n",
"- Ubuntu 16.04\n",
"- Visual Studio 2017 on Windows Server 2016\n",
"- macOS 10.13\n",
"- Windows Server 1803 (win1803) - for running Windows containers\n",
"- Visual Studio 2015 on Windows Server 2012R2\n",
"\n",
"YAML-based pipelines will default to the Microsoft-hosted agent pool. You simply need to specify which virtual machine image you want to use."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Code Repository\n",
"\n",
"The repo is organized as follows:\n",
"\n",
"```\n",
" code\n",
" code/testing/\n",
" code/scoring/\n",
" code/aml_config/\n",
" data_sample\n",
" azure-pipelines.yml\n",
"```\n",
"\n",
"The `code` folder contains all the python scripts to build the pipeline. The testing and scoring scripts are located in `code/testing/` and `code/scoring/` respectively. The config files created by the scripts are stored in `code/aml_config/`."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## About the scripts\n",
"\n",
"For the purpose of DevOps, it's best not to use a Notebook because it can be error-prone. Instead, we have all the code sitting in individual Python scripts. This means that if we used a Notebook to develop our scripts, like we did throughout this course, we have some work to do to refactor the code and turn it into a series of modular Python scripts. We would also add scripts for running various tests everytime our build is triggered, such as unit tests, integration tests, tests to measure **drift** (a degradation over time of the predictions returned by the model on incoming data), etc."
]
},
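{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a minimal sketch of what \"modular\" means here (the file name and helpers are hypothetical, not part of this repo), each script keeps its logic in functions behind a `main()` entry point, so the build agent can run it and tests can import it:\n",
"\n",
"```python\n",
"# train.py -- hypothetical sketch of a modular, testable script\n",
"import argparse\n",
"\n",
"import pandas as pd\n",
"\n",
"def load_data(path):\n",
"    \"\"\"Read the telemetry CSV used for training.\"\"\"\n",
"    return pd.read_csv(path)\n",
"\n",
"def main():\n",
"    parser = argparse.ArgumentParser(\"train\")\n",
"    parser.add_argument(\"--data\", type=str, help=\"path to training data\")\n",
"    args = parser.parse_args()\n",
"    df = load_data(args.data)\n",
"    print(\"Loaded %d rows\" % len(df))\n",
"\n",
"if __name__ == \"__main__\":\n",
"    main()\n",
"```"
]
},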
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's take a look at a brief overview of what each script does:\n",
"\n",
"| num | script | what it does |\n",
"| --- | ------------------------ | ----------------------------------------------- |\n",
"| 1 | anom_detect.py | detect anomalies in data and output them |\n",
"| 2 | automl_step.py | train a PdM model using automated ML |\n",
"| 3 | pipeline.py | runs 1 and 2 against a remote compute target |\n",
"| 4 | evaluate_model.py | evaluates the result of 2 |\n",
"| 5 | register_model.py | registeres the best model |\n",
"| 6 | scoring/score.py | scoring script |\n",
"| 7 | create_scoring_image.py | creates a scoring image from the scoring script |\n",
"| 8 | deploy_aci.py | deploys scoring image to ACI |\n",
"| 9 | aci_service_test.py | tests the ACI deployment |\n",
"| 10 | testing/data_test.py | used to test the ACI deployment |\n",
"| 11 | deploy_aks.py | deploys the AKS deployment |\n",
"| 12 | aks_service_test.py | tests the AKS deployment |\n"
]
},
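{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the flow concrete, here is a hedged sketch of what step 5 boils down to (the real `register_model.py` may differ, and the model name below is an assumption): it reads the run info that `automl_step.py` saves to `aml_config/run_id.json` and registers the model file from that run's outputs.\n",
"\n",
"```python\n",
"# hypothetical core of register_model.py\n",
"import json\n",
"\n",
"from azureml.core import Experiment, Workspace\n",
"from azureml.core.run import Run\n",
"\n",
"ws = Workspace.from_config()\n",
"with open(\"aml_config/run_id.json\") as f:\n",
"    cfg = json.load(f)\n",
"\n",
"# fetch the best AutoML run recorded by automl_step.py and register its model\n",
"run = Run(Experiment(ws, cfg[\"experiment_name\"]), cfg[\"run_id\"])\n",
"model = run.register_model(model_name=\"predmaint\", model_path=\"outputs/model.pkl\")  # name is an assumption\n",
"print(\"Registered\", model.name, \"version\", model.version)\n",
"```"
]
},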
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In addition to the Python scripts. We have another script called `azure-pipeline.yml`, which contains in it the logic for our build. Like a **conda config** file or a **dockerfile**, this file allows us to set in place *infrastructure as code*. Let's take a look at its content:"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Overwriting ./azure-pipelines.yml\n"
]
}
],
"source": [
"# %load ./azure-pipelines.yml\n",
"pool:\n",
" vmImage: 'Ubuntu 16.04'\n",
"steps:\n",
"- task: UsePythonVersion@0\n",
" inputs:\n",
" versionSpec: 3.5\n",
" architecture: 'x64'\n",
"\n",
"- task: DownloadSecureFile@1\n",
" inputs:\n",
" name: configFile\n",
" secureFile: config.json\n",
"- script: echo \"Printing the secure file path\" \n",
"- script: cp $(Agent.TempDirectory)/config.json $(Build.SourcesDirectory)\n",
"\n",
"- task: CondaEnvironment@1\n",
" displayName: 'Create Conda Environment '\n",
" inputs:\n",
" createCustomEnvironment: true\n",
" environmentName: azuremlsdk\n",
" packageSpecs: 'python=3.6'\n",
" updateConda: false\n",
" createOptions: 'cython==0.29 urllib3<1.24'\n",
"- script: |\n",
" pip install --user azureml-sdk==1.0.17 pandas\n",
" displayName: 'Install prerequisites'\n",
"\n",
"- task: AzureCLI@1\n",
" displayName: 'Azure CLI devops/code/pipeline.py'\n",
" inputs:\n",
" azureSubscription: 'serviceConnection'\n",
" scriptLocation: inlineScript\n",
" inlineScript: 'python devops/code/pipeline.py'\n",
"\n",
"- task: AzureCLI@1\n",
" displayName: 'Azure CLI devops/code/evaluate_model.py'\n",
" inputs:\n",
" azureSubscription: 'serviceConnection'\n",
" scriptLocation: inlineScript\n",
" inlineScript: 'python devops/code/evaluate_model.py'\n",
"\n",
"- task: AzureCLI@1\n",
" displayName: 'Azure CLI devops/code/register_model.py'\n",
" inputs:\n",
" azureSubscription: 'serviceConnection'\n",
" scriptLocation: inlineScript\n",
" inlineScript: 'python devops/code/register_model.py'\n",
"\n",
"- task: AzureCLI@1\n",
" displayName: 'Azure CLI devops/code/create_scoring_image.py'\n",
" inputs:\n",
" azureSubscription: 'serviceConnection'\n",
" scriptLocation: inlineScript\n",
" inlineScript: 'python devops/code/create_scoring_image.py'\n",
"\n",
"- task: AzureCLI@1\n",
" displayName: 'Azure CLI devops/code/deploy_aci.py'\n",
" inputs:\n",
" azureSubscription: 'serviceConnection'\n",
" scriptLocation: inlineScript\n",
" inlineScript: 'python devops/code/deploy_aci.py'\n",
" \n",
"- task: AzureCLI@1\n",
" displayName: 'Azure CLI devops/code/aci_service_test.py'\n",
" inputs:\n",
" azureSubscription: 'serviceConnection'\n",
" scriptLocation: inlineScript\n",
" inlineScript: 'python devops/code/aci_service_test.py'\n",
"- script: |\n",
" python devops/code/testing/data_test.py devops/data_sample/predmain_bad_schema.csv\n",
" displayName: 'Test Schema'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Creating a config file and uploading it as a Secure File\n",
"\n",
"On your own labtop, create a file called `config.json` to capture the `subscription_id`, `resource_group`, `workspace_name` and `workspace_region`:\n",
"\n",
"```\n",
"{\n",
" \"subscription_id\": \".......\",\n",
" \"resource_group\": \".......\",\n",
" \"workspace_name\": \".......\",\n",
" \"workspace_region\": \".......\"\n",
"}\n",
"```\n",
"\n",
"You can get all of the info from the Machine Learning Service Workspace created in the portal as shown below. **Attention:** For `workspace_region` use one word and all lowercase, e.g. `westus2`.\n",
"\n",
"![ML Workspace](../images/configFileOnPortal.png)"
]
},
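{
"cell_type": "markdown",
"metadata": {},
"source": [
"The scripts in this repo read this file through `Workspace.from_config()`, so a quick local sanity check (assuming you have the `azureml-sdk` installed) is simply:\n",
"\n",
"```python\n",
"# verify config.json is valid by connecting to the workspace\n",
"from azureml.core import Workspace\n",
"\n",
"ws = Workspace.from_config()  # searches for config.json starting in the current directory\n",
"print(ws.name, ws.resource_group, ws.location, sep=\"\\n\")\n",
"```"
]
},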
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It's not best practice to commit the above config information to your source repository. To address this, we can use the Secure Files library to store files such as signing certificates, Apple Provisioning Profiles, Android Keystore files, and SSH keys on the server without having to commit them to your source repository. Secure files are defined and managed in the Library tab in Azure Pipelines.\n",
"\n",
"The contents of the secure files are encrypted and can only be used during the build or release pipeline by referencing them from a task. There's a size limit of 10 MB for each secure file.\n",
"\n",
"#### Upload Secure File\n",
"\n",
"1. Select **Pipelines**, **Library** and **Secure Files**, then **+Secure File** to upload `config.json` file.\n",
"\n",
"![Upload Secure File](../images/uploadSecureFile.png)\n",
"\n",
"2. Select the uploaded file `config.json` and ensure **Authorize for use in all pipelines** is ticked and click on **Save**. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Creating a build\n",
"\n",
"Azure Pipelines allow you to build AI applications without needing to set up any infrastructure of your own. Python is preinstalled on Microsoft-hosted agents in Azure Pipelines. You can use Linux, macOS, or Windows agents to run your builds.\n",
"\n",
"#### New Pipeline\n",
"\n",
"1. To create a new pipeline, select **New pipeline** from the Pipelines blade:\n",
"\n",
" ![New Pipeline](../images/newPipeline.png)\n",
"\n",
"2. You will be prompted with **Where is your code?**. Select **Azure Repos** followed by your repo.\n",
"\n",
"3. Select **Run**. Once the agent is allocated, you'll start seeing the live logs of the build.\n",
"\n",
"#### Notification\n",
"\n",
"The summary and status of the build will be sent to the email registered (i.e. Azure login user). Login using the email registered at `www.office.com` to view the notification."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Azure Pipelines with YAML\n",
"\n",
"You can define your pipeline using a YAML file: `azure-pipelines.yml` alongside the rest of the code for your app. The big advantage of using YAML is that the pipeline is versioned with the code and follows the same branching structure. \n",
"\n",
"The basic steps include:\n",
"\n",
"1. Configure Azure Pipelines to use your Git repo.\n",
"2. Edit your `azure-pipelines.yml` file to define your build.\n",
"3. Push your code to your version control repository which kicks off the default trigger to build and deploy.\n",
"4. Code is now updated, built, tested, and packaged. It can be deployed to any target.\n",
"\n",
"![Pipelines-Image-Yam](../images/pipelines-image-yaml.png)\n",
"\n",
"\n",
"Open the yml file in the repo to understand the build steps."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Creating test scripts\n",
"\n",
"In this workshop, multiple tests are included:\n",
"\n",
"1. A basic test script `code/testing/data_test.py` is provided to test the schema of the json data for prediction using sample data in `data_sample/predmain_bad_schema.csv`.\n",
"\n",
"2. `code/aci_service_test.py` and `code/aks_service_test.py` to test deployment using ACI and AKS respectively.\n",
"\n",
"#### Exercise\n",
"\n",
"- Can you either extend `code/testing/data_test.py` or create a new one to check for the feature types? \n",
"\n",
"- `code/aci_service_test.py` and `code/aks_service_test.py` scripts check if you are getting scores from the deployed service. Can you check if you are getting the desired scores by modifying the scripts?\n",
"\n",
"- Make sure `azure-pipelines.yml` captures the above changes"
]
},
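{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a starting point for the first exercise, a feature-type check could look like the hedged sketch below (the column names and dtypes are assumptions about the telemetry schema, and the real `data_test.py` may be organized differently):\n",
"\n",
"```python\n",
"# hypothetical feature-type test: fail the build if a column has an unexpected dtype\n",
"import sys\n",
"\n",
"import pandas as pd\n",
"\n",
"# assumed schema: adjust to the actual telemetry columns\n",
"EXPECTED_TYPES = {\"datetime\": \"object\", \"machineID\": \"int64\"}\n",
"\n",
"def test_feature_types(csv_path):\n",
"    df = pd.read_csv(csv_path)\n",
"    for col, dtype in EXPECTED_TYPES.items():\n",
"        actual = str(df[col].dtype)\n",
"        assert actual == dtype, \"%s has dtype %s, expected %s\" % (col, actual, dtype)\n",
"\n",
"if __name__ == \"__main__\":\n",
"    test_feature_types(sys.argv[1])\n",
"    print(\"Feature types OK\")\n",
"```"
]
},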
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Build trigger (continuous deployment trigger)\n",
"\n",
"Along with the time triggers, we cann can also create a release every time a new build is available.\n",
"\n",
"1. Enable the *Continuous deployment trigger* and ensure *Enabled* is selected in the *Continuous deployment trigger* configuration as shown below:\n",
"\n",
"![Release Build Trigger](../images/releaseBuildTrigger.png)\n",
"\n",
"2. Populate the branch in *Build branch filters*. A release will be triggered only for a build that is from one of the branches populated. For example, selecting \"master\" will trigger a release for every build from the master branch.\n",
"\n",
"#### Approvals\n",
"\n",
"For the QC task, you will recieve an *Azure DevOps Notifaction* email to view approval. On selecting *View Approval*, you will be taken to the following page to approve/reject:\n",
"\n",
"![Pending Approval](../images/pendingApproval.png)\n",
"\n",
"There is also provision to include comments with approval/reject:\n",
"\n",
"![Approval Comments](../images/approvalComments.png)\n",
"\n",
"Once the post-deployment approvals are approved by the users chosen, the pipeline will be listed with a green tick next to QC under the list of release pipelines: \n",
"\n",
"![Release Passed](../images/releasePassed.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Application Insights (optional)\n",
"\n",
"For your convenience, Azure Application Insights is automatically added when you create the Azure Machine Learning workspace. In this section, we will look at how we can investigate the predictions from the service created using `Analytics`. Analytics is the powerful search and query tool of Application Insights. Analytics is a web tool so no setup is required.\n",
"\n",
"Run the below script (after replacing `<scoring_url>` and `<key>`) locally to obtain the predictions. You can also change `input_j` to obtain different predictions.\n",
"\n",
"```python\n",
"import requests\n",
"import json\n",
"\n",
"input_j = [[1.92168882e+02, 5.82427351e+02, 2.09748253e+02, 4.32529303e+01, 1.52377597e+01, 5.37307613e+01, 1.15729573e+01, 4.27624778e+00, 1.68042813e+02, 4.61654301e+02, 1.03138200e+02, 4.08555785e+01, 1.80809993e+01, 4.85402042e+01, 1.09373285e+01, 4.18269355e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 3.07200000e+03, 5.64000000e+02, 2.22900000e+03, 9.84000000e+02, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 3.03000000e+02, 6.63000000e+02, 3.18300000e+03, 3.03000000e+02, 5.34300000e+03, 4.26300000e+03, 6.88200000e+03, 1.02300000e+03, 1.80000000e+01]]\n",
"\n",
"data = json.dumps({'data': input_j})\n",
"test_sample = bytes(data, encoding = 'utf8')\n",
"\n",
"url = '<scoring_url>'\n",
"api_key = '<key>' \n",
"headers = {'Content-Type':'application/json', 'Authorization':('Bearer '+ api_key)}\n",
"\n",
"resp = requests.post(url, test_sample, headers=headers)\n",
"print(resp.text)\n",
"\n",
"```\n",
"\n",
"1. From the Machine Learning Workspace in the portal, Select `Application Insights` in the overview tab:\n",
"\n",
"![ML Workspace](../images/mlworkspace.png)\n",
"\n",
"2. Select Analytics.\n",
"\n",
"3. The predictions will be logged which can be queried in the Log Analytics page in the Azure portal as shown below. For example, to query `requests`, run the following query:\n",
"\n",
"````\n",
" requests\n",
" | where timestamp > ago(3h)\n",
"````\n",
"\n",
"![LogAnalytics Query](../images/logAnalyticsQuery.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Data Changes\n",
"\n",
"A data scientist may want to trigger the pipeline when new data is available. To illustrate this, a small incremental data is made available in `data_sample\\telemetry_incremental.csv` which is picked up in the below code snippet of anom_detect.py:\n",
"\n",
"````python\n",
" print(\"Adding incremental data...\")\n",
" telemetry_incremental = pd.read_csv(os.path.join('data_sample/', 'telemetry_incremental.csv'))\n",
" telemetry = telemetry.append(telemetry_incremental, ignore_index=True)\n",
"````\n",
"\n",
"The data changes would cause a change in the model evaluation and if it's better than the baseline model, it would be propagated for deployment."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python [conda env:learnai-adpm]",
"language": "python",
"name": "conda-env-learnai-adpm-py"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.8"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View file

@@ -1,8 +1,8 @@
# Introduction
In this course, we will implement a Continuous Integration (CI)/Continuous Delivery (CD) pipeline for Anomaly Detection and Predictive Maintenance applications. For developing an AI application, there are frequently two streams of work:
1. Data Scientists building machine learning models
2. App developers building the application and exposing it to end users to consume
In short, the pipeline is designed to kick off for each new commit, run the test suite, and, if the tests pass, take the latest build, package it in a Docker container, and deploy it to create a scoring service as shown below.
@@ -17,26 +17,3 @@ The goal of this course is to cover the following modules:
* Create a CI/CD pipeline using Azure
* Customize a CI/CD pipeline using Azure
* Learn how to develop a Machine Learning pipeline to update models and create service
## How to Use this Site
*This site is intended to be the main resource for an instructor-led course, but anyone is welcome to learn here. The intent is to make this site self-guided, and it is getting there.*
We recommend cloning this repository onto your local computer with a git-based program (like GitHub Desktop for Windows), or you may download the site contents as a zip file by going to "Clone or Download" at the upper right of this repository.
It is recommended that you do the labs in the below order:
1. lab00.0_Setup
2. lab01.1_BuildPipeline
**For Instructor-Led:**
* We recommend downloading the site contents or cloning it to your local computer.
* Follow along with the classroom instructions and training sessions.
* When there is a lab indicated, you may find the lab instructions in the Labs folder.
**For Self-Study:**
* We recommend downloading the site contents or cloning the repository to your local computer if you can.
* Go to Decks folder and follow along with the slides.
* When there is a lab indicated, you may find the lab instructions in the Labs folder.

View file

@@ -22,50 +22,50 @@ steps:
updateConda: false
createOptions: 'cython==0.29 urllib3<1.24'
- script: |
pip install --user azureml-sdk pandas
pip install --user azureml-sdk==1.0.17 pandas
displayName: 'Install prerequisites'
- task: AzureCLI@1
displayName: 'Azure CLI CICD/code/pipeline.py'
displayName: 'Azure CLI devops/code/pipeline.py'
inputs:
azureSubscription: 'serviceConnection'
scriptLocation: inlineScript
inlineScript: 'python CICD/code/pipeline.py'
inlineScript: 'python devops/code/pipeline.py'
- task: AzureCLI@1
displayName: 'Azure CLI CICD/code/evaluate_model.py'
displayName: 'Azure CLI devops/code/evaluate_model.py'
inputs:
azureSubscription: 'serviceConnection'
scriptLocation: inlineScript
inlineScript: 'python CICD/code/evaluate_model.py'
inlineScript: 'python devops/code/evaluate_model.py'
- task: AzureCLI@1
displayName: 'Azure CLI CICD/code/register_model.py'
displayName: 'Azure CLI devops/code/register_model.py'
inputs:
azureSubscription: 'serviceConnection'
scriptLocation: inlineScript
inlineScript: 'python CICD/code/register_model.py'
inlineScript: 'python devops/code/register_model.py'
- task: AzureCLI@1
displayName: 'Azure CLI CICD/code/create_scoring_image.py'
displayName: 'Azure CLI devops/code/create_scoring_image.py'
inputs:
azureSubscription: 'serviceConnection'
scriptLocation: inlineScript
inlineScript: 'python CICD/code/create_scoring_image.py'
inlineScript: 'python devops/code/create_scoring_image.py'
- task: AzureCLI@1
displayName: 'Azure CLI CICD/code/deploy_aci.py'
displayName: 'Azure CLI devops/code/deploy_aci.py'
inputs:
azureSubscription: 'serviceConnection'
scriptLocation: inlineScript
inlineScript: 'python CICD/code/deploy_aci.py'
inlineScript: 'python devops/code/deploy_aci.py'
- task: AzureCLI@1
displayName: 'Azure CLI CICD/code/aci_service_test.py'
displayName: 'Azure CLI devops/code/aci_service_test.py'
inputs:
azureSubscription: 'serviceConnection'
scriptLocation: inlineScript
inlineScript: 'python CICD/code/aci_service_test.py'
inlineScript: 'python devops/code/aci_service_test.py'
- script: |
python CICD/code/testing/data_test.py CICD/data_sample/predmain_bad_schema.csv
displayName: 'Test Schema'
python devops/code/testing/data_test.py devops/data_sample/predmain_bad_schema.csv
displayName: 'Test Schema'

View file

@@ -11,7 +11,7 @@ from azureml.core.webservice import Webservice
ws = Workspace.from_config()
# Get the AKS Details
os.chdir('./CICD')
os.chdir('./devops')
try:
with open("aml_config/aks_webservice.json") as f:
config = json.load(f)

View file

@@ -44,8 +44,6 @@ def do_ad(df, alpha=0.005, max_anoms=0.1, only_last=None, longterm=False, e_valu
:param direction:
:return: a pd.Series containing anomalies. If not an anomaly, entry will be NaN, otherwise the sensor reading
"""
results = detect_ts(df,
max_anoms=max_anoms,
alpha=alpha,
@@ -56,6 +54,7 @@ def do_ad(df, alpha=0.005, max_anoms=0.1, only_last=None, longterm=False, e_valu
return results['anoms']['timestamp'].values
parser = argparse.ArgumentParser("anom_detect")
parser.add_argument("--output_directory", type=str, help="output directory")
@@ -67,13 +66,12 @@ os.makedirs(args.output_directory, exist_ok=True)
# public store of telemetry data
data_dir = 'https://sethmottstore.blob.core.windows.net/predmaint/'
print("Reading data ... ", end="")
telemetry = pd.read_csv(os.path.join(data_dir, 'telemetry.csv'))
print("Done.")
print("Adding incremental data...")
telemetry_incremental = pd.read_csv(os.path.join('CICD/data_sample/', 'telemetry_incremental.csv'))
telemetry_incremental = pd.read_csv(os.path.join('devops/data_sample/', 'telemetry_incremental.csv'))
telemetry = telemetry.append(telemetry_incremental, ignore_index=True)
print("Done.")
@@ -81,7 +79,6 @@ print("Parsing datetime...", end="")
telemetry['datetime'] = pd.to_datetime(telemetry['datetime'], format="%m/%d/%Y %I:%M:%S %p")
print("Done.")
window_size = 12 # how many measures to include in rolling average
sensors = telemetry.columns[2:] # sensors are stored in column 2 on
window_sizes = [window_size] * len(sensors) # this can be changed to have individual window_sizes for each sensor
@@ -116,5 +113,3 @@ for machine_id in machine_ids[:1]: # TODO: make sure to remove the [:2], this is
pickle.dump(obj, fp)
t.toc("Processing machine %s took" % machine_id)

View file

@@ -46,6 +46,7 @@ import os
def download_data():
os.makedirs('../data', exist_ok = True)
container = 'https://sethmottstore.blob.core.windows.net/predmaint/'
urllib.request.urlretrieve(container + 'telemetry.csv', filename='../data/telemetry.csv')
urllib.request.urlretrieve(container + 'maintenance.csv', filename='../data/maintenance.csv')
urllib.request.urlretrieve(container + 'machines.csv', filename='../data/machines.csv')
@@ -53,6 +54,7 @@ def download_data():
# we replace errors.csv with anoms.csv (results from running anomaly detection)
# urllib.request.urlretrieve(container + 'errors.csv', filename='../data/errors.csv')
urllib.request.urlretrieve(container + 'anoms.csv', filename='../data/anoms.csv')
df_telemetry = pd.read_csv('../data/telemetry.csv', header=0)
df_telemetry['datetime'] = pd.to_datetime(df_telemetry['datetime'], format="%m/%d/%Y %I:%M:%S %p")
df_errors = pd.read_csv('../data/anoms.csv', header=0)
@@ -69,8 +71,10 @@ def download_data():
df_errors['errorID'] = df_errors['errorID'].apply(lambda x: int(x[-1]))
df_maint['comp'] = df_maint['comp'].apply(lambda x: int(x[-1]))
df_fails['failure'] = df_fails['failure'].apply(lambda x: int(x[-1]))
return df_telemetry, df_errors, df_subset, df_fails, df_maint, df_machines
def get_datetime_diffs(df_left, df_right, catvar, prefix, window, on, lagon = None, diff_type = 'timedelta64[h]', validate = 'one_to_one', show_example = True):
keys = ['machineID', 'datetime']
df_dummies = pd.get_dummies(df_right[catvar], prefix=prefix)
@@ -104,6 +108,7 @@ def get_datetime_diffs(df_left, df_right, catvar, prefix, window, on, lagon = No
print(df.loc[df.index.isin(range(idx-3, idx+5)), ['datetime', col, 'd' + col]])
return df
def get_rolling_aggregates(df, colnames, suffixes, window, on, groupby, lagon = None):
"""
calculates rolling averages and standard deviations
@@ -137,7 +142,6 @@ def get_rolling_aggregates(df, colnames, suffixes, window, on, groupby, lagon =
return df_res
parser = argparse.ArgumentParser("automl_train")
parser.add_argument("--input_directory", type=str, help="input directory")
@@ -150,14 +154,9 @@ run = Run.get_context()
ws = run.experiment.workspace
def_data_store = ws.get_default_datastore()
# Choose a name for the experiment and specify the project folder.
experiment_name = 'automl-local-classification'
project_folder = '.'
experiment = Experiment(ws, experiment_name)
print("Location:", ws.location)
output = {}
@@ -191,9 +190,9 @@ df_join.head()
df_left = df_telemetry.loc[:, ['datetime', 'machineID']] # we set this table aside to join all our results with
# this will make it easier to automatically create features with the right column names
#df_errors['errorID'] = df_errors['errorID'].apply(lambda x: int(x[-1]))
#df_maint['comp'] = df_maint['comp'].apply(lambda x: int(x[-1]))
#df_fails['failure'] = df_fails['failure'].apply(lambda x: int(x[-1]))
# df_errors['errorID'] = df_errors['errorID'].apply(lambda x: int(x[-1]))
# df_maint['comp'] = df_maint['comp'].apply(lambda x: int(x[-1]))
# df_fails['failure'] = df_fails['failure'].apply(lambda x: int(x[-1]))
cols_to_average = df_telemetry.columns[-4:]
@@ -266,23 +265,20 @@ X_test = df_all.loc[df_all['datetime'] > '2015-10-15', ].drop(X_drop, axis=1)
y_test = df_all.loc[df_all['datetime'] > '2015-10-15', Y_keep]
azureml.train.automl.constants.Metric.CLASSIFICATION_PRIMARY_SET
primary_metric = 'AUC_weighted'
automl_config = AutoMLConfig(task='classification',
preprocess=False,
name=experiment_name,
debug_log='automl_errors.log',
primary_metric=primary_metric,
max_time_sec=1200,
iterations=2,
n_cross_validations=2,
verbosity=logging.INFO,
automl_config = AutoMLConfig(task = 'classification',
preprocess = False,
name = experiment_name,
debug_log = 'automl_errors.log',
primary_metric = primary_metric,
max_time_sec = 1200,
iterations = 2,
n_cross_validations = 2,
verbosity = logging.INFO,
X = X_train.values, # we convert from pandas to numpy arrays using .values
y = y_train.values[:, 0], # we convert from pandas to numpy arrays using .values
path=project_folder, )
path = project_folder, )
local_run = experiment.submit(automl_config, show_output = True)
@@ -331,11 +327,11 @@ run_id['run_id'] = best_run.id
run_id['experiment_name'] = best_run.experiment.name
# save run info
os.makedirs('aml_config', exist_ok=True)
os.makedirs('aml_config', exist_ok = True)
with open('aml_config/run_id.json', 'w') as outfile:
json.dump(run_id, outfile)
# upload run info and model (pkl) to def_data_store, so that the pipeline master can access it
def_data_store.upload(src_dir='aml_config', target_path='aml_config', overwrite=True)
def_data_store.upload(src_dir = 'aml_config', target_path = 'aml_config', overwrite = True)
def_data_store.upload(src_dir='outputs', target_path='outputs', overwrite=True)
def_data_store.upload(src_dir = 'outputs', target_path = 'outputs', overwrite = True)

View file

@@ -24,7 +24,7 @@ model_list = Model.list(workspace=ws)
model, = (m for m in model_list if m.version==model_version and m.name==model_name)
print('Model picked: {} \nModel Description: {} \nModel Version: {}'.format(model.name, model.description, model.version))
os.chdir('./CICD/code/scoring')
os.chdir('./devops/code/scoring')
image_name = "predmaintenance-model-score"
image_config = ContainerImage.image_configuration(execution_script = "score.py",

View file

@@ -1,31 +1,33 @@
############################### load required libraries
import os
import pandas as pd
import json
import azureml.core
from azureml.core import Workspace, Run, Experiment, Datastore
from azureml.core.compute import AmlCompute
from azureml.core.compute import ComputeTarget
from azureml.core.runconfig import CondaDependencies, RunConfiguration
from azureml.core.runconfig import DEFAULT_CPU_IMAGE
from azureml.telemetry import set_diagnostics_collection
from azureml.pipeline.steps import PythonScriptStep
from azureml.pipeline.core import Pipeline, PipelineData, StepSequence
import pandas as pd
import json
print("SDK Version:", azureml.core.VERSION)
############################### load workspace and create experiment
ws = Workspace.from_config()
print('Workspace name: ' + ws.name,
'Subscription id: ' + ws.subscription_id,
'Resource group: ' + ws.resource_group, sep = '\n')
experiment_name = 'aml-pipeline_cicd' # choose a name for experiment
experiment_name = 'aml-pipeline-cicd' # choose a name for experiment
project_folder = '.' # project folder
experiment=Experiment(ws, experiment_name)
experiment = Experiment(ws, experiment_name)
print("Location:", ws.location)
output = {}
output['SDK version'] = azureml.core.VERSION
@@ -36,23 +38,22 @@ output['Location'] = ws.location
output['Project Directory'] = project_folder
output['Experiment Name'] = experiment.name
pd.set_option('display.max_colwidth', -1)
pd.DataFrame(data=output, index=['']).T
pd.DataFrame(data = output, index = ['']).T
set_diagnostics_collection(send_diagnostics=True)
print("SDK Version:", azureml.core.VERSION)
############################### create a run config
cd = CondaDependencies.create(pip_packages=["azureml-sdk==1.0.17", "azureml-train-automl==1.0.17", "pyculiarity", "pytictoc", "cryptography==2.5", "pandas"])
# Runconfig
amlcompute_run_config = RunConfiguration(framework="python", conda_dependencies=cd)
amlcompute_run_config = RunConfiguration(framework = "python", conda_dependencies = cd)
amlcompute_run_config.environment.docker.enabled = False
amlcompute_run_config.environment.docker.gpu_support = False
amlcompute_run_config.environment.docker.base_image = DEFAULT_CPU_IMAGE
amlcompute_run_config.environment.spark.precache_packages = False
############################### create AML compute
# create AML compute
aml_compute_target = "aml-compute"
try:
aml_compute = AmlCompute(ws, aml_compute_target)
@@ -60,15 +61,17 @@ try:
except:
print("creating new compute target")
provisioning_config = AmlCompute.provisioning_configuration(vm_size = "STANDARD_D2_V2",
idle_seconds_before_scaledown=1800,
provisioning_config = AmlCompute.provisioning_configuration(vm_size = "STANDARD_D2_V2",
idle_seconds_before_scaledown=1800,
min_nodes = 0,
max_nodes = 4)
max_nodes = 4)
aml_compute = ComputeTarget.create(ws, aml_compute_target, provisioning_config)
aml_compute.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)
print("Azure Machine Learning Compute attached")
############################### point to data and scripts
# we use this for exchanging data between pipeline steps
def_data_store = ws.get_default_datastore()
@@ -77,43 +80,42 @@ def_blob_store = Datastore(ws, "workspaceblobstore")
print("Blobstore's name: {}".format(def_blob_store.name))
# Naming the intermediate data as anomaly data and assigning it to a variable
anomaly_data = PipelineData("anomaly_data", datastore=def_blob_store)
anomaly_data = PipelineData("anomaly_data", datastore = def_blob_store)
print("Anomaly data object created")
# model = PipelineData("model", datastore=def_data_store)
# model = PipelineData("model", datastore = def_data_store)
# print("Model data object created")
anom_detect = PythonScriptStep(name="anomaly_detection",
anom_detect = PythonScriptStep(name = "anomaly_detection",
# script_name="anom_detect.py",
script_name="CICD/code/anom_detect.py",
arguments=["--output_directory", anomaly_data],
outputs=[anomaly_data],
compute_target=aml_compute,
source_directory=project_folder,
allow_reuse=True,
runconfig=amlcompute_run_config)
script_name = "devops/code/anom_detect.py",
arguments = ["--output_directory", anomaly_data],
outputs = [anomaly_data],
compute_target = aml_compute,
source_directory = project_folder,
allow_reuse = True,
runconfig = amlcompute_run_config)
print("Anomaly Detection Step created.")
automl_step = PythonScriptStep(name="automl_step",
# script_name="automl_step.py",
script_name="CICD/code/automl_step.py",
arguments=["--input_directory", anomaly_data],
inputs=[anomaly_data],
# outputs=[model],
compute_target=aml_compute,
source_directory=project_folder,
allow_reuse=True,
runconfig=amlcompute_run_config)
automl_step = PythonScriptStep(name = "automl_step",
# script_name = "automl_step.py",
script_name = "devops/code/automl_step.py",
arguments = ["--input_directory", anomaly_data],
inputs = [anomaly_data],
# outputs = [model],
compute_target = aml_compute,
source_directory = project_folder,
allow_reuse = True,
runconfig = amlcompute_run_config)
print("AutoML Training Step created.")
############################### set up, validate and run pipeline
steps = [anom_detect, automl_step]
print("Step lists created")
pipeline = Pipeline(workspace=ws, steps=steps)
pipeline = Pipeline(workspace = ws, steps = steps)
print ("Pipeline is built")
pipeline.validate()
@@ -126,16 +128,18 @@ print("Pipeline is submitted for execution")
pipeline_run.wait_for_completion(show_output = False)
print("Pipeline run completed")
# Download aml_config info and output of automl_step
def_data_store.download(target_path='.',
prefix='aml_config',
show_progress=True,
overwrite=True)
############################### upload artifacts to AML Workspace
def_data_store.download(target_path='.',
prefix='outputs',
show_progress=True,
overwrite=True)
# Download aml_config info and output of automl_step
def_data_store.download(target_path = '.',
prefix = 'aml_config',
show_progress = True,
overwrite = True)
def_data_store.download(target_path = '.',
prefix = 'outputs',
show_progress = True,
overwrite = True)
print("Updated aml_config and outputs folder")
model_fname = 'model.pkl'

View file

@@ -1,71 +0,0 @@
# Setup
This lab allows you to perform setup for building a Continuous Integration/Continuous Deployment pipeline related to Anomaly Detection and Predictive Maintenance.
### Pre-requisites
- Azure account
- Azure DevOps Account
- Azure Machine Learning Service Workspace
- Basic knowledge of Python
After you launch your environment, follow the below steps:
### Azure Machine Learning Service Workspace
We will begin the lab by creating a new Machine Learning Service Workspace using Azure portal:
1. Login to Azure portal using the credentials provided with the environment.
2. Select `Create a Resource` and search the marketplace for `Machine Learning Service Workspace`.
![Market Place](../../images/marketplace.png)
3. Select `Machine Learning Service Workspace` followed by `Create`:
![Create Workspace](../../images/createWorkspace.png)
4. Populate the mandatory fields (Workspace name, Subscription, Resource group and Location):
![Workspace Fields](../../images/workspaceFields.png)
### Sign in to Azure DevOps
Go to https://dev.azure.com and log in using the username and password provided. After logging in, you should see the below:
![Get Started](../../images/getStarted.png)
### Create a Project
Create a Private project by providing a `Project name`. With private projects, only people you give access to will be able to view this project. Select `Create` to create the project.
### Create Service connection
The build pipeline for our project will need the proper permission settings so that it can create a remote compute target in Azure. This can be done by setting up a `Service Connection` and authorizing the build pipeline to use this connection.
> If we didn't set up this `service connection`, we would have to interactively log into Azure (e.g. az login) every time we run the build pipeline.
Setting up a service connection involves the following steps:
1. Click on `Project settings` in the bottom-left corner of your screen.
1. On the next page, search for menu section `Pipelines` and select `Service Connection`.
1. Create a `New service connection`, of type `Azure Resource Manager`.
1. Properties of connection:
1. `Service Principal Authentication`
1. **Important!** Set `connection name` to "serviceConnection" (careful about capitalization).
1. `Scope level`: Subscription
1. `Subscription`: Select the same subscription you have been using throughout the course. You may already have a compute target (e.g. "aml-compute") and an AML workspace in it.
1. **Important!** Leave `Resource Group` empty.
1. Allow all pipelines to use this connection.
### Repository
After you create your project in Azure DevOps, the next step is to clone our repository into your DevOps project. The simplest way is to import using the `import` wizard found in Repos -> Files -> Import as shown below. Provide the clone url (https://github.com/azure/learnai-customai-airlift) in the wizard to import.
![import repository](../../images/importGit.png)
After running the above steps, your repo should now be populated and look like the below:
![Git Repo](../../images/gitRepo.png)

View file

@@ -1,232 +0,0 @@
# Building the pipeline
The aim of this lab is to demonstrate how you can build a Continuous Integration/Continuous Deployment pipeline and kick it off when there is a new commit. This scenario is very common when a developer updates the application part of the code repository or when a data scientist updates the training script.
### A. Hosted Agents
With Azure Pipelines, you've got a convenient option to build and deploy using a **Microsoft-hosted agent**. Each time you run a pipeline, you get a fresh virtual machine, and maintenance/upgrades are taken care of for you. The virtual machine is discarded after one use. The Microsoft-hosted agent pool provides 5 virtual machine images to choose from:
- Ubuntu 16.04
- Visual Studio 2017 on Windows Server 2016
- macOS 10.13
- Windows Server 1803 (win1803) - for running Windows containers
- Visual Studio 2015 on Windows Server 2012R2
YAML-based pipelines will default to the Microsoft-hosted agent pool. You simply need to specify which virtual machine image you want to use.
### B. Code Repository
The repo is organized as follows:
```
code
code/testing/
code/scoring/
code/aml_config/
data_sample
azure-pipelines.yml
```
The `code` folder contains all the python scripts to build the pipeline. The testing and scoring scripts are located in `code/testing/` and `code/scoring/` respectively. The config files created by the scripts are stored in `code/aml_config/`.
Sample data used for testing is created in `data_sample`. The `azure-pipelines.yml` file at the root of your repository contains the instructions for the pipeline.
### C. Config
Create a file called `config.json` to capture the `subscription_id`, `resource_group`, `workspace_name` and `workspace_region`:
```
{
"subscription_id": ".......",
"resource_group": ".......",
"workspace_name": ".......",
"workspace_region": "......."
}
```
You can get all of the info from the Machine Learning service workspace created in the portal as shown below:
![ML Workspace](../../images/mlworkspace.png)
### D. Secure Files
It's not best practice to commit the above config information to your source repository. To address this, we can use the Secure Files library to store files such as signing certificates, Apple Provisioning Profiles, Android Keystore files, and SSH keys on the server without having to commit them to your source repository. Secure files are defined and managed in the Library tab in Azure Pipelines.
The contents of the secure files are encrypted and can only be used during the build or release pipeline by referencing them from a task. There's a size limit of 10 MB for each secure file.
#### Upload Secure File
1. Select Pipelines, Library and Secure Files as shown below:
![Upload Secure File](../../images/uploadSecureFile.png)
2. Select `+Secure File` to upload config.json file.
3. Select the uploaded file `config.json` and ensure `Authorize for use in all pipelines` is ticked. Select `Save`:
![Authorize Pipeline](../../images/authorizePipeline.png)
### E. Build
Azure Pipelines allow you to build AI applications without needing to set up any infrastructure of your own. Python is preinstalled on Microsoft-hosted agents in Azure Pipelines. You can use Linux, macOS, or Windows agents to run your builds.
#### New Pipeline
1. To create a new pipeline, select `New pipeline` from the Pipelines blade:
![New Pipeline](../../images/newPipeline.png)
2. You will be prompted with "Where is your code?". Select `Azure Repos` followed by your repo.
3. Select `Run`. Once the agent is allocated, you'll start seeing the live logs of the build.
#### Interactive Authentication
At the train step, you will receive a message for interactive authentication as shown below. Open a web browser, go to https://microsoft.com/devicelogin, and enter the code to authenticate so the build can resume.
![Interactive Auth](../../images/interactiveAuth.png)
Eventually on success, the build status would appear as follows:
![Job](../../images/job.png)
#### Notification
The summary and status of the build will be sent to the registered email (i.e. the Azure login user). Log in at `www.office.com` with that email to view the notification.
### F. Azure Pipelines with YAML
You can define your pipeline using a YAML file: `azure-pipelines.yml` alongside the rest of the code for your app. The big advantage of using YAML is that the pipeline is versioned with the code and follows the same branching structure.
The basic steps include:
1. Configure Azure Pipelines to use your Git repo.
2. Edit your `azure-pipelines.yml` file to define your build.
3. Push your code to your version control repository which kicks off the default trigger to build and deploy.
4. Code is now updated, built, tested, and packaged. It can be deployed to any target.
![Pipelines-Image-Yam](../../images/pipelines-image-yaml.png)
Open the yml file in the repo to understand the build steps.
### G. Test
In this workshop, multiple tests are included:
1. A basic test script `code/testing/data_test.py` is provided to test the schema of the json data for prediction using sample data in `data_sample/predmain_bad_schema.csv`.
2. `code/aci_service_test.py` and `code/aks_service_test.py` to test deployment using ACI and AKS respectively.
#### Exercise
- Can you either extend `code/testing/data_test.py` or create a new one to check for the feature types?
- `code/aci_service_test.py` and `code/aks_service_test.py` scripts check if you are getting scores from the deployed service. Can you check if you are getting the desired scores by modifying the scripts?
- Make sure `azure-pipelines.yml` captures the above changes
### H. Release
In this section, you will learn how to schedule releases at specific times by defining one or more scheduled release triggers.
#### Create Release Pipeline
#### Time Trigger
1. Choose the schedule icon in the Artifacts section of your pipeline and enable scheduled release triggers. Note: you can configure multiple schedules.
![Release Time Trigger](../../images/releaseTimeTrigger.png)
2. Select a time to schedule release trigger. For viewing the trigger execution, you can choose a trigger time that's about 10 mins from now.
#### Build Trigger (Continuous deployment trigger)
Along with the time triggers, we can also create a release every time a new build is available.
1. Enable the *Continuous deployment trigger* and ensure *Enabled* is selected in the *Continuous deployment trigger* configuration as shown below:
![Release Build Trigger](../../images/releaseBuildTrigger.png)
2. Populate the branch in *Build branch filters*. A release will be triggered only for a build that is from one of the branches populated. For example, selecting "master" will trigger a release for every build from the master branch.
#### Approvals
For the QC task, you will receive an *Azure DevOps Notification* email to view the approval. On selecting *View Approval*, you will be taken to the following page to approve/reject:
![Pending Approval](../../images/pendingApproval.png)
You can also include comments with the approval or rejection:
![Approval Comments](../../images/approvalComments.png)
Once the post-deployment approvals are given by the chosen users, the pipeline will be listed with a green tick next to QC under the list of release pipelines:
![Release Passed](../../images/releasePassed.png)
#### I. Application Insights (Optional)
For your convenience, Azure Application Insights is automatically added when you create the Azure Machine Learning workspace. In this section, we will look at how we can investigate the predictions from the service using `Analytics`, the search and query tool of Application Insights. Analytics is a web tool, so no setup is required.
Run the below script (after replacing `<scoring_url>` and `<key>`) locally to obtain the predictions. You can also change `input_j` to obtain different predictions.
```python
import requests
import json
input_j = [[1.92168882e+02, 5.82427351e+02, 2.09748253e+02, 4.32529303e+01, 1.52377597e+01, 5.37307613e+01, 1.15729573e+01, 4.27624778e+00, 1.68042813e+02, 4.61654301e+02, 1.03138200e+02, 4.08555785e+01, 1.80809993e+01, 4.85402042e+01, 1.09373285e+01, 4.18269355e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 3.07200000e+03, 5.64000000e+02, 2.22900000e+03, 9.84000000e+02, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 3.03000000e+02, 6.63000000e+02, 3.18300000e+03, 3.03000000e+02, 5.34300000e+03, 4.26300000e+03, 6.88200000e+03, 1.02300000e+03, 1.80000000e+01]]
data = json.dumps({'data': input_j})
test_sample = bytes(data, encoding = 'utf8')
url = '<scoring_url>'
api_key = '<key>'
headers = {'Content-Type':'application/json', 'Authorization':('Bearer '+ api_key)}
resp = requests.post(url, test_sample, headers=headers)
print(resp.text)
```
1. From the Machine Learning Workspace in the portal, select `Application Insights` in the overview tab:
![ML Workspace](../../images/mlworkspace.png)
2. Select Analytics.
3. The predictions are logged and can be queried from the Log Analytics page in the Azure portal as shown below. For example, to query `requests`, run the following query:
````
requests
| where timestamp > ago(3h)
````
![LogAnalytics Query](../../images/logAnalyticsQuery.png)
#### J. Service Principal Authentication (Optional)
The ServicePrincipalAuthentication class allows for authentication using a service principal instead of the user's own identity. This class is ideal for automation and CI/CD scenarios where interactive authentication is not desired. The below snippet shows how you can create a workspace using ServicePrincipalAuthentication.
````python
from azureml.core.authentication import ServicePrincipalAuthentication
spa = ServicePrincipalAuthentication(<tenant_id>, <username>, <password>)
# Example: spa = ServicePrincipalAuthentication('0e4cb6d6-25c5-4c27-a1ca-42a112c18b71', '59d48937-6e62-4223-b07a-711b11ad24b6', 'zcnv77SNT*Vu')
ws = Workspace.from_config(auth=spa)
````
#### Exercise
1. Replace `tenant_id`, `username` and `password` with the values generated during the lab creation by reading these values from a secure file. Modify `ws = Workspace.from_config()` in the scripts to use `ServicePrincipalAuthentication`. Perform build again to avoid interactive authentication.
#### K. Data Changes
A data scientist may want to trigger the pipeline when new data is available. To illustrate this, a small incremental dataset is made available in `data_sample/telemetry_incremental.csv`, which is picked up in the below code snippet of anom_detect.py:
````python
print("Adding incremental data...")
telemetry_incremental = pd.read_csv(os.path.join('data_sample/', 'telemetry_incremental.csv'))
telemetry = telemetry.append(telemetry_incremental, ignore_index=True)
````
The data changes cause a change in the model evaluation; if the new model does better than the baseline model, it is propagated for deployment.

Binary data
images/changeToFullVersionServiceConnection.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 201 KiB

Binary data
images/configFileOnPortal.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 210 KiB

Binary data
images/createServiceConnection.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 137 KiB

Binary data
images/fullDialogueServiceConnection.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 171 KiB

Binary data
images/importGit.png

Binary file not shown.

Before

Width:  |  Height:  |  Size: 128 KiB

After

Width:  |  Height:  |  Size: 157 KiB

Binary data
images/uploadSecureFile.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 107 KiB

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View file

@@ -0,0 +1,367 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##![LearnAI Header](https://coursematerial.blob.core.windows.net/assets/LearnAI_header.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Applying a pipeline to structured streaming data\n",
"\n",
"## Overview (see also [Programming Guide](https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html))\n",
"\n",
"Structured Streaming is a scalable and fault-tolerant stream processing engine built on the Spark SQL engine. You can express your streaming computation the same way you would express a batch computation on static data. The Spark SQL engine will take care of running it incrementally and continuously and updating the final result as streaming data continues to arrive. You can use the Dataset/DataFrame API in Scala, Java, Python or R to express streaming aggregations, event-time windows, stream-to-batch joins, etc. The computation is executed on the same optimized Spark SQL engine. Finally, the system ensures end-to-end exactly-once fault-tolerance guarantees through checkpointing and Write-Ahead Logs. In short, Structured Streaming provides fast, scalable, fault-tolerant, end-to-end exactly-once stream processing without the user having to reason about streaming.\n",
"\n",
"Internally, by default, Structured Streaming queries are processed using a micro-batch processing engine, which processes data streams as a series of small batch jobs thereby achieving end-to-end latencies as low as 100 milliseconds and exactly-once fault-tolerance guarantees. However, since Spark 2.3, we have introduced a new low-latency processing mode called Continuous Processing, which can achieve end-to-end latencies as low as 1 millisecond with at-least-once guarantees. Without changing the Dataset/DataFrame operations in your queries, you will be able to choose the mode based on your application requirements."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Load previously saved model\n",
"\n",
"Let's take in the model we saved earlier, and apply it to some streaming data!"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<style scoped>\n",
" .ansiout {\n",
" display: block;\n",
" unicode-bidi: embed;\n",
" white-space: pre-wrap;\n",
" word-wrap: break-word;\n",
" word-break: break-all;\n",
" font-family: \"Source Code Pro\", \"Menlo\", monospace;;\n",
" font-size: 13px;\n",
" color: #555;\n",
" margin-left: 4px;\n",
" line-height: 19px;\n",
" }\n",
"</style>\n",
"<div class=\"ansiout\"></div>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from pyspark.ml.pipeline import PipelineModel\n",
"\n",
"fileName = \"my_pipeline\"\n",
"pipelineModel = PipelineModel.load(fileName)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Initiate Data Stream\n",
"\n",
"Here, we are going to simulate streaming data, by reading in the DataFrame from the previous lab, but serving it as a stream to our pipeline.\n",
"\n",
"**Note**: You must specify a schema when creating a streaming source DataFrame. Why!?"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<style scoped>\n",
" .ansiout {\n",
" display: block;\n",
" unicode-bidi: embed;\n",
" white-space: pre-wrap;\n",
" word-wrap: break-word;\n",
" word-break: break-all;\n",
" font-family: \"Source Code Pro\", \"Menlo\", monospace;;\n",
" font-size: 13px;\n",
" color: #555;\n",
" margin-left: 4px;\n",
" line-height: 19px;\n",
" }\n",
"</style>\n",
"<div class=\"ansiout\"></div>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from pyspark.sql.types import *\n",
"\n",
"schema = StructType([\n",
" StructField(\"tweet\",StringType()), \n",
" StructField(\"existence\",IntegerType()),\n",
" StructField(\"confidence\",FloatType())])\n",
"\n",
"streamingData = (spark\n",
" .readStream\n",
" .schema(schema)\n",
" .option(\"maxFilesPerTrigger\", 1)\n",
" .parquet(\"dbfs:/gwDF\"))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we are going to use our `pipelineModel` to transform the `streamingData`. The output will be called `stream`: a confusion matrix for evaluating the performance of the model."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<style scoped>\n",
" .table-result-container {\n",
" max-height: 300px;\n",
" overflow: auto;\n",
" }\n",
" table, th, td {\n",
" border: 1px solid black;\n",
" border-collapse: collapse;\n",
" }\n",
" th, td {\n",
" padding: 5px;\n",
" }\n",
" th {\n",
" text-align: left;\n",
" }\n",
"</style><div class='table-result-container'><table class='table-result'><thead style='background-color: white'><tr><th>existence</th><th>prediction</th><th>count</th></tr></thead><tbody><tr><td>0</td><td>0.0</td><td>890</td></tr><tr><td>0</td><td>1.0</td><td>185</td></tr><tr><td>1</td><td>0.0</td><td>58</td></tr><tr><td>1</td><td>1.0</td><td>2997</td></tr></tbody></table></div>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"stream = (pipelineModel\n",
" .transform(streamingData)\n",
" .groupBy(\"existence\", \"prediction\")\n",
" .count()\n",
" .sort(\"existence\", \"prediction\"))\n",
"\n",
"display(stream)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Optimization\n",
"\n",
"Why is this stream taking so long? What configuration should we set?"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<style scoped>\n",
" .ansiout {\n",
" display: block;\n",
" unicode-bidi: embed;\n",
" white-space: pre-wrap;\n",
" word-wrap: break-word;\n",
" word-break: break-all;\n",
" font-family: \"Source Code Pro\", \"Menlo\", monospace;;\n",
" font-size: 13px;\n",
" color: #555;\n",
" margin-left: 4px;\n",
" line-height: 19px;\n",
" }\n",
"</style>\n",
"<div class=\"ansiout\"><span class=\"ansired\">Out[</span><span class=\"ansired\">4</span><span class=\"ansired\">]: </span>&apos;200&apos;\n",
"</div>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"spark.conf.get(\"spark.sql.shuffle.partitions\")"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<style scoped>\n",
" .ansiout {\n",
" display: block;\n",
" unicode-bidi: embed;\n",
" white-space: pre-wrap;\n",
" word-wrap: break-word;\n",
" word-break: break-all;\n",
" font-family: \"Source Code Pro\", \"Menlo\", monospace;;\n",
" font-size: 13px;\n",
" color: #555;\n",
" margin-left: 4px;\n",
" line-height: 19px;\n",
" }\n",
"</style>\n",
"<div class=\"ansiout\"></div>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"spark.conf.set(\"spark.sql.shuffle.partitions\", \"8\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> See this [post](https://umbertogriffo.gitbooks.io/apache-spark-best-practices-and-tuning/content/sparksqlshufflepartitions_draft.html) for a detailed look into how to estimate the size of your data and choosing the right number of partitions. \n",
"\n",
"Let's try this again"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<style scoped>\n",
" .table-result-container {\n",
" max-height: 300px;\n",
" overflow: auto;\n",
" }\n",
" table, th, td {\n",
" border: 1px solid black;\n",
" border-collapse: collapse;\n",
" }\n",
" th, td {\n",
" padding: 5px;\n",
" }\n",
" th {\n",
" text-align: left;\n",
" }\n",
"</style><div class='table-result-container'><table class='table-result'><thead style='background-color: white'><tr><th>existence</th><th>prediction</th><th>count</th></tr></thead><tbody><tr><td>0</td><td>0.0</td><td>890</td></tr><tr><td>0</td><td>1.0</td><td>185</td></tr><tr><td>1</td><td>0.0</td><td>58</td></tr><tr><td>1</td><td>1.0</td><td>2997</td></tr></tbody></table></div>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"stream = (pipelineModel\n",
" .transform(streamingData)\n",
" .groupBy(\"existence\", \"prediction\")\n",
" .count()\n",
" .sort(\"existence\", \"prediction\"))\n",
"\n",
"display(stream)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Save the output\n",
"\n",
"We can save the output of the processed stream to a file."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import re\n",
"\n",
"streamingView = \"username\"\n",
"checkpointFile = \"checkPoint\"\n",
"dbutils.fs.rm(checkpointFile, True) # clear out the checkpointing directory\n",
"\n",
"(stream\n",
" .writeStream\n",
" .format(\"memory\")\n",
" .option(\"checkpointLocation\", checkpointFile)\n",
" .outputMode(\"complete\")\n",
" .queryName(streamingView)\n",
" .start())"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"display(sql(\"select * from \" + streamingView))"
]
},
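{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `memory` sink above is ephemeral and meant for debugging. One way to persist the counts durably: the file (e.g. Parquet) sink only supports `append` mode, so an aggregation running in `complete` mode has to be written one micro-batch at a time instead. A minimal sketch using `foreachBatch` (requires Spark 2.4+; the output path is an assumption):\n",
"\n",
"```python\n",
"def write_counts(batch_df, batch_id):\n",
"    # overwrite the snapshot of the confusion-matrix counts on every trigger\n",
"    batch_df.write.mode(\"overwrite\").parquet(\"dbfs:/confusion_counts\")\n",
"\n",
"(stream\n",
"  .writeStream\n",
"  .foreachBatch(write_counts)\n",
"  .outputMode(\"complete\")\n",
"  .option(\"checkpointLocation\", \"checkPoint_parquet\")\n",
"  .start())\n",
"```"
]
},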
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
"\n",
"Licensed under the MIT License."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.1"
},
"name": "05_structured_streaming",
"notebookId": 4057188818416178
},
"nbformat": 4,
"nbformat_minor": 1
}

View File

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View File

@ -1,220 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Exporting notebooks from ADB\n",
"\n",
"This notebook does two things:\n",
"1. It recursively exports a folder recursively as a dbc archive.\n",
"1. It recursively exports all notebooks in a folder as jupyter notebooks.\n",
"\n",
"We start with the setup"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"path = 'notebooks' # the path to folder in your ADB workspace\n",
"\n",
"region = 'westus'\n",
"username = 'wopauli@microsoft.com' "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We configure the personal access token we configured in ADB. We are reading it in here to reduce the odds of accidentally exposing it."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"with open('token.txt', 'r') as f:\n",
" token = f.read().strip()\n",
" \n",
"headers = {\n",
" 'Authorization': 'Bearer %s' % token\n",
"}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next we download the entire DBC archive. THis serves multiple purposes:\n",
"1. We have it exported.\n",
"1. We will list its contents so that we can export jupyter notebooks one by one."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Starting export of DBC archive. This might take a while, depending on your connection.\n",
"Done.\n",
"Writing to file.\n"
]
}
],
"source": [
"import requests\n",
"\n",
"url = 'https://%s.azuredatabricks.net/api/2.0/workspace/export?path=/Users/%s/%s&direct_download=true&format=DBC' % (region, username, path)\n",
"\n",
"print(\"Starting export of DBC archive. This might take a while, depending on your connection.\")\n",
"r = requests.get(url=url, headers=headers)\n",
"print(\"Done.\")\n",
"\n",
"if r.ok:\n",
" print(\"Writing to file.\")\n",
" with open(path + '.dbc', 'wb') as f:\n",
" f.write(r.content)\n",
"else:\n",
" print(\"Downloading notebook archive failed\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We list the notebooks contained in the archive."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"import zipfile\n",
"\n",
"path_to_zip_file = './notebooks.dbc'\n",
"zip_ref = zipfile.ZipFile(path_to_zip_file, 'r')\n",
"\n",
"files = zip_ref.namelist()\n",
"\n",
"notebooks = [x for x in files if x.endswith('.python')]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We iterate through the notebooks, and export one by one as a jupyter notebook."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Working on: notebooks/day_2/05_automated_ML\n",
"Working on: notebooks/day_2/03_aml_getting_started\n",
"Working on: notebooks/day_2/04_ml_experimentation\n",
"Working on: notebooks/day_2/01_logistic_regression\n",
"Working on: notebooks/day_2/06_deployment\n",
"Working on: notebooks/day_2/02_random_forests\n",
"Working on: notebooks/day_1/04_hyperparameter_tuning\n",
"Working on: notebooks/day_1/05_structured_streaming\n",
"Working on: notebooks/day_1/03_sentiment_analysis\n",
"Working on: notebooks/day_1/01_introduction\n",
"Working on: notebooks/day_1/02_feature_engineering\n",
"Working on: notebooks/tests/run_notebooks\n",
"Working on: notebooks/includes/mnt_blob_rw\n",
"Working on: notebooks/includes/mnt_blob\n"
]
}
],
"source": [
"import os\n",
"\n",
"for notebook in notebooks:\n",
" notebook = os.path.splitext(notebook)[0]\n",
" print(\"Working on: %s\" % notebook)\n",
" url = 'https://%s.azuredatabricks.net/api/2.0/workspace/export?path=/Users/%s/%s&direct_download=true&format=JUPYTER' % (region, username, notebook)\n",
"\n",
" r = requests.get(url=url, headers=headers)\n",
" if r.ok:\n",
" notebook_path, ipynb_notebook = os.path.split(notebook + \".ipynb\")\n",
" \n",
" if not os.path.exists(notebook_path):\n",
" os.makedirs(notebook_path)\n",
" \n",
" with open(os.path.join(notebook_path, ipynb_notebook), 'wb') as f:\n",
" f.write(r.content)\n",
" else:\n",
" print(\"Failed: %s\" % notebook)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'notebooks/includes'"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"notebook_path"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Consider using the following command to clear the output of all notebooks. \n",
"\n",
"*Note:* this may require `git bash` or `bash`, and may not work in vania\n",
"\n",
"jupyter nbconvert --ClearOutputPreprocessor.enabled=True --inplace Notebook.ipynb"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.1"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@ -1 +1,106 @@
{"cells":[{"cell_type":"code","source":["import os\n\nsource = \"wasbs://data@coursematerial.blob.core.windows.net\"\nmount_point = \"/mnt/data\"\nextra_configs = {\"fs.azure.sas.data.coursematerial.blob.core.windows.net\":\"?sv=2018-03-28&ss=bfqt&srt=sco&sp=rwdlacup&se=2019-07-01T02:17:07Z&st=2019-02-14T19:17:07Z&spr=https&sig=1%2FnXywpfU6%2FYNLl0Zs1t5M8PF5p8ES7SPFX78tPtmYY%3D\"}\n\ntry:\n if len(os.listdir('/dbfs/mnt/data/')) > 0:\n print(\"Already mounted.\")\n else:\n dbutils.fs.mount(\n source = source,\n mount_point = mount_point,\n extra_configs = extra_configs)\n print(\"Mounted: %s at %s\" % (source, mount_point))\nexcept:\n dbutils.fs.mount(\n source = source,\n mount_point = mount_point,\n extra_configs = extra_configs)\n print(\"Mounted: %s at %s\" % (source, mount_point))"],"metadata":{},"outputs":[{"metadata":{},"output_type":"display_data","data":{"text/html":["<style scoped>\n .ansiout {\n display: block;\n unicode-bidi: embed;\n white-space: pre-wrap;\n word-wrap: break-word;\n word-break: break-all;\n font-family: \"Source Code Pro\", \"Menlo\", monospace;;\n font-size: 13px;\n color: #555;\n margin-left: 4px;\n line-height: 19px;\n }\n</style>\n<div class=\"ansiout\">Mounted: wasbs://data@coursematerial.blob.core.windows.net at /mnt/data\n</div>"]}}],"execution_count":1},{"cell_type":"code","source":["# dbutils.fs.unmount('/mnt/data')"],"metadata":{},"outputs":[],"execution_count":2},{"cell_type":"code","source":["# os.listdir('/dbfs/mnt/data/')"],"metadata":{},"outputs":[{"metadata":{},"output_type":"display_data","data":{"text/html":["<style scoped>\n .ansiout {\n display: block;\n unicode-bidi: embed;\n white-space: pre-wrap;\n word-wrap: break-word;\n word-break: break-all;\n font-family: \"Source Code Pro\", \"Menlo\", monospace;;\n font-size: 13px;\n color: #555;\n margin-left: 4px;\n line-height: 19px;\n }\n</style>\n<div class=\"ansiout\"><span class=\"ansired\">---------------------------------------------------------------------------</span>\n<span class=\"ansired\">FileNotFoundError</span> Traceback (most recent call last)\n<span class=\"ansigreen\">&lt;command-1350352043586502&gt;</span> in <span class=\"ansicyan\">&lt;module&gt;</span><span class=\"ansiblue\">()</span>\n<span class=\"ansigreen\">----&gt; 1</span><span class=\"ansiyellow\"> </span>os<span class=\"ansiyellow\">.</span>listdir<span class=\"ansiyellow\">(</span><span class=\"ansiblue\">&apos;/dbfs/mnt/data/&apos;</span><span class=\"ansiyellow\">)</span><span class=\"ansiyellow\"></span>\n\n<span class=\"ansired\">FileNotFoundError</span>: [Errno 2] No such file or directory: &apos;/dbfs/mnt/data/&apos;</div>"]}}],"execution_count":3},{"cell_type":"code","source":[""],"metadata":{},"outputs":[],"execution_count":4}],"metadata":{"name":"mnt_blob","notebookId":4057188818416716},"nbformat":4,"nbformat_minor":0}
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<style scoped>\n",
" .ansiout {\n",
" display: block;\n",
" unicode-bidi: embed;\n",
" white-space: pre-wrap;\n",
" word-wrap: break-word;\n",
" word-break: break-all;\n",
" font-family: \"Source Code Pro\", \"Menlo\", monospace;;\n",
" font-size: 13px;\n",
" color: #555;\n",
" margin-left: 4px;\n",
" line-height: 19px;\n",
" }\n",
"</style>\n",
"<div class=\"ansiout\">Mounted: wasbs://data@coursematerial.blob.core.windows.net at /mnt/data\n",
"</div>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"import os\n",
"\n",
"source = \"wasbs://data@coursematerial.blob.core.windows.net\"\n",
"mount_point = \"/mnt/data\"\n",
"extra_configs = {\"fs.azure.sas.data.coursematerial.blob.core.windows.net\":\"?sv=2018-03-28&ss=bfqt&srt=sco&sp=rwdlacup&se=2019-07-01T02:17:07Z&st=2019-02-14T19:17:07Z&spr=https&sig=1%2FnXywpfU6%2FYNLl0Zs1t5M8PF5p8ES7SPFX78tPtmYY%3D\"}\n",
"\n",
"try:\n",
" if len(os.listdir('/dbfs/mnt/data/')) > 0:\n",
" print(\"Already mounted.\")\n",
" else:\n",
" dbutils.fs.mount(\n",
" source = source,\n",
" mount_point = mount_point,\n",
" extra_configs = extra_configs)\n",
" print(\"Mounted: %s at %s\" % (source, mount_point))\n",
"except:\n",
" dbutils.fs.mount(\n",
" source = source,\n",
" mount_point = mount_point,\n",
" extra_configs = extra_configs)\n",
" print(\"Mounted: %s at %s\" % (source, mount_point))"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"# dbutils.fs.unmount('/mnt/data')"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"# os.listdir('/dbfs/mnt/data/')"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.1"
},
"name": "mnt_blob",
"notebookId": 4057188818416716
},
"nbformat": 4,
"nbformat_minor": 1
}

View File

@ -0,0 +1,665 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Azure DevOps\n",
"\n",
"With Azure DevOps data scientists and application developers can work together to create and maintain AI-infused applications. Using a DevOps mindset is not new to software developers, who are used to running applications in production. However, data scientists in the past have often worked in silos and not followed best practices to facilitate the transition from development to production. With Azure DevOps data scientists can now develop with an eye toward production."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Part 1: Getting started\n",
"\n",
"This lab allows you to perform setup for building a **Continuous Integration/Continuous Deployment** pipeline related to Anomoly Detection and Predictive Maintenance.\n",
"\n",
"### Pre-requisites\n",
"\n",
"- Azure account\n",
"- Azure DevOps account\n",
"- Azure Machine Learning Service Workspace\n",
"- Basic knowledge of Python\n",
"\n",
"After you launch your environment, follow the below steps:\n",
"\n",
"### Azure Machine Learning Service Workspace\n",
"\n",
"We will begin the lab by creating a new Machine Learning Service Workspace using Azure portal:\n",
"\n",
"1. Login to Azure portal using the credentials provided with the environment.\n",
"\n",
"2. Select **Create a Resource** and search the marketplace for **Machine Learning Service Workspace**.\n",
"\n",
"![Market Place](../images/marketplace.png)\n",
"\n",
"3. Select **Machine Learning Service Workspace** followed by **Create**:\n",
"\n",
"![Create Workspace](../images/createWorkspace.png)\n",
"\n",
"4. Populate the mandatory fields (Workspace name, Subscription, Resource group and Location):\n",
"\n",
"![Workspace Fields](../images/workspaceFields.png)\n",
"\n",
"### Sign in to Azure DevOps\n",
"\n",
"Go to **https://dev.azure.com** and login using your Azure username and password. You will be asked to provide a name and email. An organization is created for you based on the name you provide. Within the organization, you will be asked to create a project. Name your project \"ADPM\" and click on **Create project**. With private projects, only people you give access to will be able to view this project. After logging in, you should see the below:\n",
"\n",
"![Get Started](../images/getStarted.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create Service connection\n",
"\n",
"The build pipeline for our project will need the proper permission settings so that it can create a remote compute target in Azure. This can be done by setting up a **Service Connection** and authorizing the build pipeline to use this connection.\n",
"\n",
"> If we didn't set up this **service connection**, we would have to interactively log into Azure (e.g. az login) everytime we run the build pipeline.\n",
"\n",
"Setting up a service connection involves the following steps:\n",
"1. Click on **Project settings** in the bottom-left corner of your screen.\n",
"2. On the next page, search for menu section **Pipelines** and select **Service Connection**.\n",
"3. Create a **New service connection**, of type **Azure Resource Manager**.\n",
"\n",
"![Get Started](../images/createServiceConnection.png)\n",
"\n",
"4. On the page you are presented with, scroll down and click on the link saying **use the full version of the service connection dialog**.\n",
"\n",
"![Get Started](../images/changeToFullVersionServiceConnection.png)\n",
"\n",
"5. Begin filling out the full version of the form. All the information you need is provided in the lab setup page. If you closed this page, a link to it was emailed to you. Look for emails from **No Reply (CloudLabs) <noreply@cloudlabs.ai>**.\n",
"\n",
"![Get Started](../images/fullDialogueServiceConnection.png \"width=50\")\n",
"\n",
" - **Important!** Set **connection name** to **serviceConnection** (careful about capitalization).\n",
" - For **Service principal client ID** paste the field called **Application/Client Id** in the lab setup page.\n",
" - Set **Scope level** to **Subscription**.\n",
" - For **Subscription**, select the same which you have been using throughout the course. You may already have a compute target in there (e.g. \"aml-copute\") and a AML workspace.\n",
" - **Important!** Leave **Resource Group** empty.\n",
" - For **Service principal key** paste the filed called **Application Secret Key** in the lab setup page.\n",
" - Allow all pipelines to use this connection.\n",
" - Click on **Verify connection** to make sure the connection is valid and then click on **OK**."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Repository\n",
"\n",
"After you create your project in Azure DevOps, the next step is to clone our repository into your DevOps project. The simplest way is to go to **Repos > Files > Import** as shown below. Provide the clone url (https://github.com/azure/learnai-customai-airlift) in the wizard to import.\n",
"\n",
"![import repository](../images/importGit.png)\n",
"\n",
"You should now be able to see the git repo in your project."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Part 2: Building a pipeline\n",
"\n",
"Tha aim of this lab is to demonstrate how you can build a Continuous Integration/Continuous Deployment pipeline and kick it off when there is a new commit. This scenario is typically very common when a developer has updated the application part of the code repository or when the training script from a data scientist is updated."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Hosted Agents\n",
"\n",
"With Azure Pipelines, you've got a convenient option to build and deploy using a **Microsoft-hosted agent**. Each time you run a pipeline, you get a fresh virtual machine and maintenance/upgrades are taken care of. The virtual machine is discarded after one use. The Microsoft-hosted agent pool provides 5 virtual machine images to choose from:\n",
"\n",
"- Ubuntu 16.04\n",
"- Visual Studio 2017 on Windows Server 2016\n",
"- macOS 10.13\n",
"- Windows Server 1803 (win1803) - for running Windows containers\n",
"- Visual Studio 2015 on Windows Server 2012R2\n",
"\n",
"YAML-based pipelines will default to the Microsoft-hosted agent pool. You simply need to specify which virtual machine image you want to use."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Code Repository\n",
"\n",
"The repo is organized as follows:\n",
"\n",
"```\n",
" code\n",
" code/testing/\n",
" code/scoring/\n",
" code/aml_config/\n",
" data_sample\n",
" azure-pipelines.yml\n",
"```\n",
"\n",
"The `code` folder contains all the python scripts to build the pipeline. The testing and scoring scripts are located in `code/testing/` and `code/scoring/` respectively. The config files created by the scripts are stored in `code/aml_config/`.\n",
"\n",
"Sample data is created in `data_sample` that is used for testing. `azure-pipelines.yml` file at the root of your repository contains the instructions for the pipeline."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## About the scripts\n",
"\n",
"For the purpose of DevOps, it's best not to use a Notebook because it can be error-prone. Instead, we have all the code sitting in individual Python scripts. This means that if we used a Notebook to develop our scripts, like we did throughout this course, we have some work to do to refactor the code and turn it into a series of modular Python scripts. We would also add scripts for running various tests everytime our build is triggered, such as unit tests, integration tests, tests to measure **drift** (a degradation over time of the predictions returned by the model on incoming data), etc.\n",
"\n",
"Let's take a look at the different scripts we have to deal with and what each does."
]
},
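{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before walking through the pipeline scripts, here is a flavor of what one of those build-time tests can look like: a minimal, hypothetical schema check in the spirit of `code/testing/data_test.py`. The expected column names below are assumptions for illustration, not the actual schema:\n",
"\n",
"```python\n",
"import sys\n",
"\n",
"import pandas as pd\n",
"\n",
"# assumed input schema; adjust to the real feature set\n",
"EXPECTED_COLUMNS = {\"datetime\", \"machineID\", \"volt\", \"rotate\", \"pressure\", \"vibration\"}\n",
"\n",
"\n",
"def check_schema(csv_path):\n",
"    df = pd.read_csv(csv_path)\n",
"    missing = EXPECTED_COLUMNS - set(df.columns)\n",
"    if missing:\n",
"        raise ValueError(\"schema test failed, missing columns: %s\" % sorted(missing))\n",
"    print(\"schema OK\")\n",
"\n",
"\n",
"if __name__ == \"__main__\":\n",
"    check_schema(sys.argv[1])\n",
"```"
]
},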
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# %load ./code/pipeline.py\n",
"\n",
"############################### load required libraries\n",
"\n",
"import os\n",
"import pandas as pd\n",
"import json\n",
"\n",
"import azureml.core\n",
"from azureml.core import Workspace, Run, Experiment, Datastore\n",
"from azureml.core.compute import AmlCompute\n",
"from azureml.core.compute import ComputeTarget\n",
"from azureml.core.runconfig import CondaDependencies, RunConfiguration\n",
"from azureml.core.runconfig import DEFAULT_CPU_IMAGE\n",
"from azureml.telemetry import set_diagnostics_collection\n",
"from azureml.pipeline.steps import PythonScriptStep\n",
"from azureml.pipeline.core import Pipeline, PipelineData, StepSequence\n",
"\n",
"print(\"SDK Version:\", azureml.core.VERSION)\n",
"\n",
"############################### load workspace and create experiment\n",
"\n",
"ws = Workspace.from_config()\n",
"print('Workspace name: ' + ws.name, \n",
" 'Subscription id: ' + ws.subscription_id, \n",
" 'Resource group: ' + ws.resource_group, sep = '\\n')\n",
"\n",
"experiment_name = 'aml-pipeline-cicd' # choose a name for experiment\n",
"project_folder = '.' # project folder\n",
"\n",
"experiment = Experiment(ws, experiment_name)\n",
"print(\"Location:\", ws.location)\n",
"output = {}\n",
"output['SDK version'] = azureml.core.VERSION\n",
"output['Subscription ID'] = ws.subscription_id\n",
"output['Workspace'] = ws.name\n",
"output['Resource Group'] = ws.resource_group\n",
"output['Location'] = ws.location\n",
"output['Project Directory'] = project_folder\n",
"output['Experiment Name'] = experiment.name\n",
"pd.set_option('display.max_colwidth', -1)\n",
"pd.DataFrame(data = output, index = ['']).T\n",
"\n",
"set_diagnostics_collection(send_diagnostics=True)\n",
"\n",
"############################### create a run config\n",
"\n",
"cd = CondaDependencies.create(pip_packages=[\"azureml-sdk==1.0.17\", \"azureml-train-automl==1.0.17\", \"pyculiarity\", \"pytictoc\", \"cryptography==2.5\", \"pandas\"])\n",
"\n",
"amlcompute_run_config = RunConfiguration(framework = \"python\", conda_dependencies = cd)\n",
"amlcompute_run_config.environment.docker.enabled = False\n",
"amlcompute_run_config.environment.docker.gpu_support = False\n",
"amlcompute_run_config.environment.docker.base_image = DEFAULT_CPU_IMAGE\n",
"amlcompute_run_config.environment.spark.precache_packages = False\n",
"\n",
"############################### create AML compute\n",
"\n",
"aml_compute_target = \"aml-compute\"\n",
"try:\n",
" aml_compute = AmlCompute(ws, aml_compute_target)\n",
" print(\"found existing compute target.\")\n",
"except:\n",
" print(\"creating new compute target\")\n",
" \n",
" provisioning_config = AmlCompute.provisioning_configuration(vm_size = \"STANDARD_D2_V2\", \n",
" idle_seconds_before_scaledown=1800, \n",
" min_nodes = 0, \n",
" max_nodes = 4)\n",
" aml_compute = ComputeTarget.create(ws, aml_compute_target, provisioning_config)\n",
" aml_compute.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)\n",
" \n",
"print(\"Azure Machine Learning Compute attached\")\n",
"\n",
"############################### point to data and scripts\n",
"\n",
"# we use this for exchanging data between pipeline steps\n",
"def_data_store = ws.get_default_datastore()\n",
"\n",
"# get pointer to default blob store\n",
"def_blob_store = Datastore(ws, \"workspaceblobstore\")\n",
"print(\"Blobstore's name: {}\".format(def_blob_store.name))\n",
"\n",
"# Naming the intermediate data as anomaly data and assigning it to a variable\n",
"anomaly_data = PipelineData(\"anomaly_data\", datastore = def_blob_store)\n",
"print(\"Anomaly data object created\")\n",
"\n",
"# model = PipelineData(\"model\", datastore = def_data_store)\n",
"# print(\"Model data object created\")\n",
"\n",
"anom_detect = PythonScriptStep(name = \"anomaly_detection\",\n",
" # script_name=\"anom_detect.py\",\n",
" script_name = \"CICD/code/anom_detect.py\",\n",
" arguments = [\"--output_directory\", anomaly_data],\n",
" outputs = [anomaly_data],\n",
" compute_target = aml_compute, \n",
" source_directory = project_folder,\n",
" allow_reuse = True,\n",
" runconfig = amlcompute_run_config)\n",
"print(\"Anomaly Detection Step created.\")\n",
"\n",
"automl_step = PythonScriptStep(name = \"automl_step\",\n",
" # script_name = \"automl_step.py\", \n",
" script_name = \"CICD/code/automl_step.py\", \n",
" arguments = [\"--input_directory\", anomaly_data],\n",
" inputs = [anomaly_data],\n",
" # outputs = [model],\n",
" compute_target = aml_compute, \n",
" source_directory = project_folder,\n",
" allow_reuse = True,\n",
" runconfig = amlcompute_run_config)\n",
"\n",
"print(\"AutoML Training Step created.\")\n",
"\n",
"############################### set up, validate and run pipeline\n",
"\n",
"steps = [anom_detect, automl_step]\n",
"print(\"Step lists created\")\n",
"\n",
"pipeline = Pipeline(workspace = ws, steps = steps)\n",
"print (\"Pipeline is built\")\n",
"\n",
"pipeline.validate()\n",
"print(\"Pipeline validation complete\")\n",
"\n",
"pipeline_run = experiment.submit(pipeline) #, regenerate_outputs=True)\n",
"print(\"Pipeline is submitted for execution\")\n",
"\n",
"# Wait until the run finishes.\n",
"pipeline_run.wait_for_completion(show_output = False)\n",
"print(\"Pipeline run completed\")\n",
"\n",
"############################### upload artifacts to AML Workspace\n",
"\n",
"# Download aml_config info and output of automl_step\n",
"def_data_store.download(target_path = '.',\n",
" prefix = 'aml_config',\n",
" show_progress = True,\n",
" overwrite = True)\n",
"\n",
"def_data_store.download(target_path = '.',\n",
" prefix = 'outputs',\n",
" show_progress = True,\n",
" overwrite = True)\n",
"print(\"Updated aml_config and outputs folder\")\n",
"\n",
"model_fname = 'model.pkl'\n",
"model_path = os.path.join(\"outputs\", model_fname)\n",
"\n",
"# Upload the model file explicitly into artifacts (for CI/CD)\n",
"pipeline_run.upload_file(name = model_path, path_or_stream = model_path)\n",
"print('Uploaded the model {} to experiment {}'.format(model_fname, pipeline_run.experiment.name))\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The script `pipeline.py` run `anom_detect.py` and `automl_step.py` in that order. Let's see what these two scripts contain."
]
},
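{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that the execution order is not declared anywhere: because `automl_step` lists the `anomaly_data` `PipelineData` object among its inputs and `anom_detect` lists it among its outputs, the SDK infers the dependency between the two steps. A minimal sketch of the pattern (step and script names are illustrative):\n",
"\n",
"```python\n",
"from azureml.pipeline.core import PipelineData\n",
"from azureml.pipeline.steps import PythonScriptStep\n",
"\n",
"intermediate = PipelineData(\"intermediate\", datastore=def_blob_store)\n",
"\n",
"producer = PythonScriptStep(name=\"producer\", script_name=\"produce.py\",\n",
"                            arguments=[\"--out\", intermediate], outputs=[intermediate],\n",
"                            compute_target=aml_compute, source_directory=\".\")\n",
"consumer = PythonScriptStep(name=\"consumer\", script_name=\"consume.py\",\n",
"                            arguments=[\"--in\", intermediate], inputs=[intermediate],\n",
"                            compute_target=aml_compute, source_directory=\".\")\n",
"\n",
"# listing only the consumer would still pull in the producer via the data dependency\n",
"# pipeline = Pipeline(workspace=ws, steps=[consumer])\n",
"```"
]
},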
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# %load ./code/anom_detect.py"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# %load ./code/automl_step.py"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"%load ./code/evaluate_model.py\n",
"%load ./code/register_model.py\n",
"%load ./code/create_scoring_image.py\n",
"%load ./code/deploy_aci.py\n",
"%load ./code/aci_service_test.py\n",
"\n",
"%load ./code/deploy_aks.py\n",
"%load ./code/aks_service_test.py\n",
"%load ./code/data_prep.py\n",
"%load ./code/scoring/score.py\n",
"%load ./code/testing/data_test.py"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# %load ./azure-pipelines.yml\n",
"pool:\n",
" vmImage: 'Ubuntu 16.04'\n",
"steps:\n",
"- task: UsePythonVersion@0\n",
" inputs:\n",
" versionSpec: 3.5\n",
" architecture: 'x64'\n",
"\n",
"- task: DownloadSecureFile@1\n",
" inputs:\n",
" name: configFile\n",
" secureFile: config.json\n",
"- script: echo \"Printing the secure file path\" \n",
"- script: cp $(Agent.TempDirectory)/config.json $(Build.SourcesDirectory)\n",
"\n",
"- task: CondaEnvironment@1\n",
" displayName: 'Create Conda Environment '\n",
" inputs:\n",
" createCustomEnvironment: true\n",
" environmentName: azuremlsdk\n",
" packageSpecs: 'python=3.6'\n",
" updateConda: false\n",
" createOptions: 'cython==0.29 urllib3<1.24'\n",
"- script: |\n",
" pip install --user azureml-sdk==1.0.17 pandas\n",
" displayName: 'Install prerequisites'\n",
"\n",
"- task: AzureCLI@1\n",
" displayName: 'Azure CLI CICD/code/pipeline.py'\n",
" inputs:\n",
" azureSubscription: 'serviceConnection'\n",
" scriptLocation: inlineScript\n",
" inlineScript: 'python CICD/code/pipeline.py'\n",
"\n",
"- task: AzureCLI@1\n",
" displayName: 'Azure CLI CICD/code/evaluate_model.py'\n",
" inputs:\n",
" azureSubscription: 'serviceConnection'\n",
" scriptLocation: inlineScript\n",
" inlineScript: 'python CICD/code/evaluate_model.py'\n",
"\n",
"- task: AzureCLI@1\n",
" displayName: 'Azure CLI CICD/code/register_model.py'\n",
" inputs:\n",
" azureSubscription: 'serviceConnection'\n",
" scriptLocation: inlineScript\n",
" inlineScript: 'python CICD/code/register_model.py'\n",
"\n",
"- task: AzureCLI@1\n",
" displayName: 'Azure CLI CICD/code/create_scoring_image.py'\n",
" inputs:\n",
" azureSubscription: 'serviceConnection'\n",
" scriptLocation: inlineScript\n",
" inlineScript: 'python CICD/code/create_scoring_image.py'\n",
"\n",
"- task: AzureCLI@1\n",
" displayName: 'Azure CLI CICD/code/deploy_aci.py'\n",
" inputs:\n",
" azureSubscription: 'serviceConnection'\n",
" scriptLocation: inlineScript\n",
" inlineScript: 'python CICD/code/deploy_aci.py'\n",
" \n",
"- task: AzureCLI@1\n",
" displayName: 'Azure CLI CICD/code/aci_service_test.py'\n",
" inputs:\n",
" azureSubscription: 'serviceConnection'\n",
" scriptLocation: inlineScript\n",
" inlineScript: 'python CICD/code/aci_service_test.py'\n",
"- script: |\n",
" python CICD/code/testing/data_test.py CICD/data_sample/predmain_bad_schema.csv\n",
" displayName: 'Test Schema'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Creating a config file and uploading it as a Secure File\n",
"\n",
"On your own labtop, create a file called `config.json` to capture the `subscription_id`, `resource_group`, `workspace_name` and `workspace_region`:\n",
"\n",
"```\n",
"{\n",
" \"subscription_id\": \".......\",\n",
" \"resource_group\": \".......\",\n",
" \"workspace_name\": \".......\",\n",
" \"workspace_region\": \".......\"\n",
"}\n",
"```\n",
"\n",
"You can get all of the info from the Machine Learning Service Workspace created in the portal as shown below. **Attention:** For `workspace_region` use one word and all lowercase, e.g. `westus2`.\n",
"\n",
"![ML Workspace](../images/configFileOnPortal.png)"
]
},
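{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before uploading the file, you can sanity-check it locally by loading the workspace from it (this assumes the Azure ML SDK is installed and that you are logged in, e.g. via `az login`):\n",
"\n",
"```python\n",
"from azureml.core import Workspace\n",
"\n",
"# from_config() looks for config.json in the current directory (or .azureml/)\n",
"ws = Workspace.from_config()\n",
"print(ws.name, ws.resource_group, ws.location, sep=\"\\n\")\n",
"```"
]
},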
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It's not best practice to commit the above config information to your source repository. To address this, we can use the Secure Files library to store files such as signing certificates, Apple Provisioning Profiles, Android Keystore files, and SSH keys on the server without having to commit them to your source repository. Secure files are defined and managed in the Library tab in Azure Pipelines.\n",
"\n",
"The contents of the secure files are encrypted and can only be used during the build or release pipeline by referencing them from a task. There's a size limit of 10 MB for each secure file.\n",
"\n",
"#### Upload Secure File\n",
"\n",
"1. Select **Pipelines**, **Library** and **Secure Files**, then **+Secure File** to upload `config.json` file.\n",
"\n",
"![Upload Secure File](../images/uploadSecureFile.png)\n",
"\n",
"2. Select the uploaded file `config.json` and ensure **Authorize for use in all pipelines** is ticked and click on **Save**. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Creating a build\n",
"\n",
"Azure Pipelines allow you to build AI applications without needing to set up any infrastructure of your own. Python is preinstalled on Microsoft-hosted agents in Azure Pipelines. You can use Linux, macOS, or Windows agents to run your builds.\n",
"\n",
"#### New Pipeline\n",
"\n",
"1. To create a new pipeline, select **New pipeline** from the Pipelines blade:\n",
"\n",
" ![New Pipeline](../images/newPipeline.png)\n",
"\n",
"2. You will be prompted with **Where is your code?**. Select **Azure Repos** followed by your repo.\n",
"\n",
"3. Select **Run**. Once the agent is allocated, you'll start seeing the live logs of the build.\n",
"\n",
"#### Notification\n",
"\n",
"The summary and status of the build will be sent to the email registered (i.e. Azure login user). Login using the email registered at `www.office.com` to view the notification."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Azure Pipelines with YAML\n",
"\n",
"You can define your pipeline using a YAML file: `azure-pipelines.yml` alongside the rest of the code for your app. The big advantage of using YAML is that the pipeline is versioned with the code and follows the same branching structure. \n",
"\n",
"The basic steps include:\n",
"\n",
"1. Configure Azure Pipelines to use your Git repo.\n",
"2. Edit your `azure-pipelines.yml` file to define your build.\n",
"3. Push your code to your version control repository which kicks off the default trigger to build and deploy.\n",
"4. Code is now updated, built, tested, and packaged. It can be deployed to any target.\n",
"\n",
"![Pipelines-Image-Yam](../images/pipelines-image-yaml.png)\n",
"\n",
"\n",
"Open the yml file in the repo to understand the build steps."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Creating test scripts\n",
"\n",
"In this workshop, multiple tests are included:\n",
"\n",
"1. A basic test script `code/testing/data_test.py` is provided to test the schema of the json data for prediction using sample data in `data_sample/predmain_bad_schema.csv`.\n",
"\n",
"2. `code/aci_service_test.py` and `code/aks_service_test.py` to test deployment using ACI and AKS respectively.\n",
"\n",
"#### Exercise\n",
"\n",
"- Can you either extend `code/testing/data_test.py` or create a new one to check for the feature types? \n",
"\n",
"- `code/aci_service_test.py` and `code/aks_service_test.py` scripts check if you are getting scores from the deployed service. Can you check if you are getting the desired scores by modifying the scripts?\n",
"\n",
"- Make sure `azure-pipelines.yml` captures the above changes"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Build trigger (continuous deployment trigger)\n",
"\n",
"Along with the time triggers, we cann can also create a release every time a new build is available.\n",
"\n",
"1. Enable the *Continuous deployment trigger* and ensure *Enabled* is selected in the *Continuous deployment trigger* configuration as shown below:\n",
"\n",
"![Release Build Trigger](../images/releaseBuildTrigger.png)\n",
"\n",
"2. Populate the branch in *Build branch filters*. A release will be triggered only for a build that is from one of the branches populated. For example, selecting \"master\" will trigger a release for every build from the master branch.\n",
"\n",
"#### Approvals\n",
"\n",
"For the QC task, you will recieve an *Azure DevOps Notifaction* email to view approval. On selecting *View Approval*, you will be taken to the following page to approve/reject:\n",
"\n",
"![Pending Approval](../images/pendingApproval.png)\n",
"\n",
"There is also provision to include comments with approval/reject:\n",
"\n",
"![Approval Comments](../images/approvalComments.png)\n",
"\n",
"Once the post-deployment approvals are approved by the users chosen, the pipeline will be listed with a green tick next to QC under the list of release pipelines: \n",
"\n",
"![Release Passed](../images/releasePassed.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Application Insights (optional)\n",
"\n",
"For your convenience, Azure Application Insights is automatically added when you create the Azure Machine Learning workspace. In this section, we will look at how we can investigate the predictions from the service created using `Analytics`. Analytics is the powerful search and query tool of Application Insights. Analytics is a web tool so no setup is required.\n",
"\n",
"Run the below script (after replacing `<scoring_url>` and `<key>`) locally to obtain the predictions. You can also change `input_j` to obtain different predictions.\n",
"\n",
"```python\n",
"import requests\n",
"import json\n",
"\n",
"input_j = [[1.92168882e+02, 5.82427351e+02, 2.09748253e+02, 4.32529303e+01, 1.52377597e+01, 5.37307613e+01, 1.15729573e+01, 4.27624778e+00, 1.68042813e+02, 4.61654301e+02, 1.03138200e+02, 4.08555785e+01, 1.80809993e+01, 4.85402042e+01, 1.09373285e+01, 4.18269355e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 3.07200000e+03, 5.64000000e+02, 2.22900000e+03, 9.84000000e+02, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 3.03000000e+02, 6.63000000e+02, 3.18300000e+03, 3.03000000e+02, 5.34300000e+03, 4.26300000e+03, 6.88200000e+03, 1.02300000e+03, 1.80000000e+01]]\n",
"\n",
"data = json.dumps({'data': input_j})\n",
"test_sample = bytes(data, encoding = 'utf8')\n",
"\n",
"url = '<scoring_url>'\n",
"api_key = '<key>' \n",
"headers = {'Content-Type':'application/json', 'Authorization':('Bearer '+ api_key)}\n",
"\n",
"resp = requests.post(url, test_sample, headers=headers)\n",
"print(resp.text)\n",
"\n",
"```\n",
"\n",
"1. From the Machine Learning Workspace in the portal, Select `Application Insights` in the overview tab:\n",
"\n",
"![ML Workspace](../images/mlworkspace.png)\n",
"\n",
"2. Select Analytics.\n",
"\n",
"3. The predictions will be logged which can be queried in the Log Analytics page in the Azure portal as shown below. For example, to query `requests`, run the following query:\n",
"\n",
"````\n",
" requests\n",
" | where timestamp > ago(3h)\n",
"````\n",
"\n",
"![LogAnalytics Query](../images/logAnalyticsQuery.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Data Changes\n",
"\n",
"A data scientist may want to trigger the pipeline when new data is available. To illustrate this, a small incremental data is made available in `data_sample\\telemetry_incremental.csv` which is picked up in the below code snippet of anom_detect.py:\n",
"\n",
"````python\n",
" print(\"Adding incremental data...\")\n",
" telemetry_incremental = pd.read_csv(os.path.join('data_sample/', 'telemetry_incremental.csv'))\n",
" telemetry = telemetry.append(telemetry_incremental, ignore_index=True)\n",
"````\n",
"\n",
"The data changes would cause a change in the model evaluation and if it's better than the baseline model, it would be propagated for deployment."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.1"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

19
temp/devops/README.md Normal file
View File

@ -0,0 +1,19 @@
# Introduction
In this course, we will implement a Continuous Integration (CI)/Continuous Delivery (CD) pipeline for Anomaly Detection and Predictive Maintenance applications. For developing an AI application, there are frequently two streams of work:
1. Data Scientists building machine learning models
2. App developers building the application and exposing it to end users to consume
In short, the pipeline is designed to kick off for each new commit, run the test suite and, if the tests pass, take the latest build, package it in a Docker container, and deploy it to create a scoring service, as shown below.
![Architecture](images/architecture.png)
## Modules Covered
The goal of this course is to cover the following modules:
* Introduction to CI/CD
* Create a CI/CD pipeline using Azure
* Customize a CI/CD pipeline using Azure
* Learn how to develop a Machine Learning pipeline to update models and create a service

View File

@ -0,0 +1,71 @@
pool:
vmImage: 'Ubuntu 16.04'
steps:
- task: UsePythonVersion@0
inputs:
versionSpec: 3.5
architecture: 'x64'
- task: DownloadSecureFile@1
inputs:
name: configFile
secureFile: config.json
- script: echo "Printing the secure file path"
- script: cp $(Agent.TempDirectory)/config.json $(Build.SourcesDirectory)
- task: CondaEnvironment@1
displayName: 'Create Conda Environment '
inputs:
createCustomEnvironment: true
environmentName: azuremlsdk
packageSpecs: 'python=3.6'
updateConda: false
createOptions: 'cython==0.29 urllib3<1.24'
- script: |
pip install --user azureml-sdk pandas
displayName: 'Install prerequisites'
- task: AzureCLI@1
displayName: 'Azure CLI CICD/code/pipeline.py'
inputs:
azureSubscription: 'serviceConnection'
scriptLocation: inlineScript
inlineScript: 'python CICD/code/pipeline.py'
- task: AzureCLI@1
displayName: 'Azure CLI CICD/code/evaluate_model.py'
inputs:
azureSubscription: 'serviceConnection'
scriptLocation: inlineScript
inlineScript: 'python CICD/code/evaluate_model.py'
- task: AzureCLI@1
displayName: 'Azure CLI CICD/code/register_model.py'
inputs:
azureSubscription: 'serviceConnection'
scriptLocation: inlineScript
inlineScript: 'python CICD/code/register_model.py'
- task: AzureCLI@1
displayName: 'Azure CLI CICD/code/create_scoring_image.py'
inputs:
azureSubscription: 'serviceConnection'
scriptLocation: inlineScript
inlineScript: 'python CICD/code/create_scoring_image.py'
- task: AzureCLI@1
displayName: 'Azure CLI CICD/code/deploy_aci.py'
inputs:
azureSubscription: 'serviceConnection'
scriptLocation: inlineScript
inlineScript: 'python CICD/code/deploy_aci.py'
- task: AzureCLI@1
displayName: 'Azure CLI CICD/code/aci_service_test.py'
inputs:
azureSubscription: 'serviceConnection'
scriptLocation: inlineScript
inlineScript: 'python CICD/code/aci_service_test.py'
- script: |
python CICD/code/testing/data_test.py CICD/data_sample/predmain_bad_schema.csv
displayName: 'Test Schema'

View File

@ -0,0 +1,7 @@
.ipynb_checkpoints
azureml-logs
.azureml
.git
outputs
azureml-setup
docs

View File

@ -0,0 +1,38 @@
import json, sys
from azureml.core import Workspace
from azureml.core.webservice import Webservice
# Get workspace
ws = Workspace.from_config()
# Get the ACI Details
try:
with open("aml_config/aci_webservice.json") as f:
config = json.load(f)
except:
print('No new model, thus no deployment on ACI')
#raise Exception('No new model to register as production model perform better')
sys.exit(0)
service_name = config['aci_name']
# Get the hosted web service
service = Webservice(name=service_name, workspace=ws)
# Input for Model with all features
input_j = [[1.62168882e+02, 4.82427351e+02, 1.09748253e+02, 4.32529303e+01, 3.52377597e+01, 4.37307613e+01, 1.15729573e+01, 4.27624778e+00, 1.68042813e+02, 4.61654301e+02, 1.03138200e+02, 4.08555785e+01, 1.80809993e+01, 4.85402042e+01, 1.09373285e+01, 4.18269355e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 3.07200000e+03, 5.64000000e+02, 2.22900000e+03, 9.84000000e+02, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 3.03000000e+02, 6.63000000e+02, 3.18300000e+03, 3.03000000e+02, 5.34300000e+03, 4.26300000e+03, 6.88200000e+03, 1.02300000e+03, 1.80000000e+01]]
print(input_j)
test_sample = json.dumps({'data': input_j})
test_sample = bytes(test_sample,encoding = 'utf8')
try:
prediction = service.run(input_data = test_sample)
print(prediction)
except Exception as e:
result = str(e)
print(result)
raise Exception('ACI service is not working as expected')

View File

@ -0,0 +1,43 @@
import os, json, sys
from azureml.core import Workspace
from azureml.core.webservice import Webservice
# Get workspace
ws = Workspace.from_config()
# Get the AKS Details
os.chdir('./CICD')
try:
with open("aml_config/aks_webservice.json") as f:
config = json.load(f)
except:
print('No new model, thus no deployment on AKS')
#raise Exception('No new model to register as production model perform better')
sys.exit(0)
service_name = config['aks_service_name']
# Get the hosted web service
service = Webservice(workspace=ws, name=service_name)
# Input for Model with all features
input_j = [[1.62168882e+02, 4.82427351e+02, 1.09748253e+02, 4.32529303e+01, 3.52377597e+01, 4.37307613e+01, 1.15729573e+01, 4.27624778e+00, 1.68042813e+02, 4.61654301e+02, 1.03138200e+02, 4.08555785e+01, 1.80809993e+01, 4.85402042e+01, 1.09373285e+01, 4.18269355e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 3.07200000e+03, 5.64000000e+02, 2.22900000e+03, 9.84000000e+02, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 3.03000000e+02, 6.63000000e+02, 3.18300000e+03, 3.03000000e+02, 5.34300000e+03, 4.26300000e+03, 6.88200000e+03, 1.02300000e+03, 1.80000000e+01]]
print(input_j)
test_sample = json.dumps({'data': input_j})
test_sample = bytes(test_sample,encoding = 'utf8')
try:
prediction = service.run(input_data = test_sample)
print(prediction)
except Exception as e:
result = str(e)
print(result)
raise Exception('AKS service is not working as expected')
# Delete aks after test
#service.delete()

View File

@ -0,0 +1,115 @@
import argparse
import pickle
import pandas as pd
import os
from pyculiarity import detect_ts # python port of Twitter AD lib
from pytictoc import TicToc # so we can time our operations
def rolling_average(df, column, n=24):
"""
Calculates a rolling average according to Welford's online algorithm (Donald Knuth's Art of Computer Programming, Vol 2, page 232, 3rd edition).
https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Welford's_Online_algorithm
:param df: a dataframe with time series in columns
:param column: name of the column of interest
:param n: number of measurements to consider
:return: a new dataframe with columns 'datetime' and 'value', where 'value' is the rolling average of the chosen column
"""
ra = [0] * df.shape[0]
ra[0] = df[column].values[0]
for r in range(1, df.shape[0]):
curr_n = float(min(n, r))
ra[r] = ra[r-1] + (df[column].values[r] - ra[r-1])/curr_n
df = pd.DataFrame(data={'datetime': df['datetime'], 'value': ra})
return df
def do_ad(df, alpha=0.005, max_anoms=0.1, only_last=None, longterm=False, e_value=False, direction='both'):
"""
This method performs the actual anomaly detection. It expects a dataframe with a timestamp column
and a single value column, e.g. the output of rolling_average above.
:param df: a dataframe with a timestamp column and one value column
:param alpha: see the pyculiarity documentation for the meaning of these parameters
:param max_anoms:
:param only_last:
:param longterm:
:param e_value:
:param direction:
:return: a numpy array containing the timestamps of the detected anomalies
"""
results = detect_ts(df,
max_anoms=max_anoms,
alpha=alpha,
direction=direction,
e_value=e_value,
longterm=longterm,
only_last=only_last)
return results['anoms']['timestamp'].values
parser = argparse.ArgumentParser("anom_detect")
parser.add_argument("--output_directory", type=str, help="output directory")
args = parser.parse_args()
print("output directory: %s" % args.output_directory)
os.makedirs(args.output_directory, exist_ok=True)
# public store of telemetry data
data_dir = 'https://sethmottstore.blob.core.windows.net/predmaint/'
print("Reading data ... ", end="")
telemetry = pd.read_csv(os.path.join(data_dir, 'telemetry.csv'))
print("Done.")
print("Adding incremental data...")
telemetry_incremental = pd.read_csv(os.path.join('CICD/data_sample/', 'telemetry_incremental.csv'))
telemetry = telemetry.append(telemetry_incremental, ignore_index=True)
print("Done.")
print("Parsing datetime...", end="")
telemetry['datetime'] = pd.to_datetime(telemetry['datetime'], format="%m/%d/%Y %I:%M:%S %p")
print("Done.")
window_size = 12 # how many measures to include in rolling average
sensors = telemetry.columns[2:] # sensors are stored in column 2 on
window_sizes = [window_size] * len(sensors) # this can be changed to have individual window_sizes for each sensor
machine_ids = telemetry['machineID'].unique()
t = TicToc()
for machine_id in machine_ids[:1]: # TODO: remove the [:1] slice, which is only here to keep test runs short
df = telemetry.loc[telemetry.loc[:, 'machineID'] == machine_id, :]
t.tic()
print("Working on sensor: ")
for s, sensor in enumerate(sensors):
N = window_sizes[s]
print(" %s " % sensor)
df_ra = rolling_average(df, sensor, N)
anoms_timestamps = do_ad(df_ra)
df_anoms = pd.DataFrame(data={'datetime': anoms_timestamps, 'machineID': [machine_id] * len(anoms_timestamps), 'errorID': [sensor] * len(anoms_timestamps)})
# if this is the first machine and sensor, we initialize a new dataframe
if machine_id == 1 and s == 0:
df_anoms_all = df_anoms
else: # otherwise we append the newly detected anomalies to the existing dataframe
df_anoms_all = df_anoms_all.append(df_anoms, ignore_index=True)
# store of output
obj = {}
obj["df_anoms"] = df_anoms_all
out_file = os.path.join(args.output_directory, "anoms.pkl")
with open(out_file, "wb") as fp:
pickle.dump(obj, fp)
t.toc("Processing machine %s took" % machine_id)

View File

@ -0,0 +1,337 @@
import argparse
import json
import logging
import os
import pickle
import random
import urllib.request

import numpy as np
import pandas as pd
from sklearn import datasets
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.metrics import roc_auc_score
from sklearn.externals import joblib

import azureml.core
from azureml.core.experiment import Experiment
from azureml.core.run import Run
from azureml.core.workspace import Workspace
from azureml.train.automl import AutoMLConfig
from azureml.train.automl.run import AutoMLRun
from azureml.telemetry import set_diagnostics_collection
def download_data():
os.makedirs('../data', exist_ok = True)
container = 'https://sethmottstore.blob.core.windows.net/predmaint/'
urllib.request.urlretrieve(container + 'telemetry.csv', filename='../data/telemetry.csv')
urllib.request.urlretrieve(container + 'maintenance.csv', filename='../data/maintenance.csv')
urllib.request.urlretrieve(container + 'machines.csv', filename='../data/machines.csv')
urllib.request.urlretrieve(container + 'failures.csv', filename='../data/failures.csv')
# we replace errors.csv with anoms.csv (results from running anomaly detection)
# urllib.request.urlretrieve(container + 'errors.csv', filename='../data/errors.csv')
urllib.request.urlretrieve(container + 'anoms.csv', filename='../data/anoms.csv')
df_telemetry = pd.read_csv('../data/telemetry.csv', header=0)
df_telemetry['datetime'] = pd.to_datetime(df_telemetry['datetime'], format="%m/%d/%Y %I:%M:%S %p")
df_errors = pd.read_csv('../data/anoms.csv', header=0)
df_errors['datetime'] = pd.to_datetime(df_errors['datetime'])
rep_dir = {"volt":"error1", "rotate":"error2", "pressure":"error3", "vibration":"error4"}
df_errors = df_errors.replace({"errorID": rep_dir})
df_subset = df_errors.loc[(df_errors.datetime.between('2015-01-01', '2016-01-01')) & (df_errors.machineID == 1)]
df_subset.head()
df_fails = pd.read_csv('../data/failures.csv', header=0)
df_fails['datetime'] = pd.to_datetime(df_fails['datetime'], format="%m/%d/%Y %I:%M:%S %p")
df_maint = pd.read_csv('../data/maintenance.csv', header=0)
df_maint['datetime'] = pd.to_datetime(df_maint['datetime'], format="%m/%d/%Y %I:%M:%S %p")
df_machines = pd.read_csv('../data/machines.csv', header=0)
df_errors['errorID'] = df_errors['errorID'].apply(lambda x: int(x[-1]))
df_maint['comp'] = df_maint['comp'].apply(lambda x: int(x[-1]))
df_fails['failure'] = df_fails['failure'].apply(lambda x: int(x[-1]))
return df_telemetry, df_errors, df_subset, df_fails, df_maint, df_machines
def get_datetime_diffs(df_left, df_right, catvar, prefix, window, on, lagon = None, diff_type = 'timedelta64[h]', validate = 'one_to_one', show_example = True):
keys = ['machineID', 'datetime']
df_dummies = pd.get_dummies(df_right[catvar], prefix=prefix)
df_wide = pd.concat([df_right.loc[:, keys], df_dummies], axis=1)
df_wide = df_wide.groupby(keys).sum().reset_index()
df = df_left.merge(df_wide, how="left", on=keys, validate = validate).fillna(0)
# run a rolling window through event flags to aggregate data
dummy_col_names = df_dummies.columns
df = df.groupby('machineID').rolling(window=window, on=lagon)[dummy_col_names].max()
df.reset_index(inplace=True)
df = df.loc[df.index % on == on-1]
df.reset_index(inplace=True, drop=True)
df_first = df.groupby('machineID', as_index=False).nth(0)
# calculate the time of the last event and the time elapsed since
for col in dummy_col_names:
whenlast, diffcol = 'last_' + col, 'd' + col
df.loc[:, col].fillna(value = 0, inplace=True)
# let's assume an event happened in row 0, so we don't have missing values for the time elapsed
df.iloc[df_first.index, df.columns.get_loc(col)] = 1
df.loc[df[col] == 1, whenlast] = df.loc[df[col] == 1, 'datetime']
# for the first occurrence we don't know when it last happened, so we assume it happened then
df.iloc[df_first.index, df.columns.get_loc(whenlast)] = df.iloc[df_first.index, df.columns.get_loc('datetime')]
df[whenlast].fillna(method='ffill', inplace=True)
# df.loc[df[whenlast] > df['datetime'], whenlast] = np.nan
df.loc[df[whenlast] <= df['datetime'], diffcol] = (df['datetime'] - df[whenlast]).astype(diff_type)
df.drop(columns = whenlast, inplace=True)
if show_example:
col = np.random.choice(dummy_col_names, size = 1)[0]
idx = np.random.choice(df.loc[df[col] == 1, :].index.tolist(), size = 1)[0]
print('Example:\n')
print(df.loc[df.index.isin(range(idx-3, idx+5)), ['datetime', col, 'd' + col]])
return df
def get_rolling_aggregates(df, colnames, suffixes, window, on, groupby, lagon = None):
"""
calculates rolling averages and standard deviations
Arguments:
df -- dataframe to run it on
colnames -- names of columns we want rolling statistics for
suffixes -- suffixes attached to the new columns (provide a list with strings)
window -- the lag over which rolling statistics are calculated
on -- the interval at which rolling statistics are calculated
groupby -- the column used to group results by
lagon -- the name of the datetime column used to compute lags (if none specified it defaults to row number)
Returns:
a dataframe with rolling statistics over a specified lag calculated over a specified interval
"""
rolling_colnames = [c + suffixes[0] for c in colnames]
df_rolling_mean = df.groupby(groupby).rolling(window=window, on=lagon)[colnames].mean()
df_rolling_mean.columns = rolling_colnames
df_rolling_mean.reset_index(inplace=True)
rolling_colnames = [c + suffixes[1] for c in colnames]
df_rolling_sd = df.groupby(groupby).rolling(window=window, on=lagon)[colnames].var()
df_rolling_sd.columns = rolling_colnames
df_rolling_sd = df_rolling_sd.apply(np.sqrt)
df_rolling_sd.reset_index(inplace=True, drop=True)
df_res = pd.concat([df_rolling_mean, df_rolling_sd], axis=1)
df_res = df_res.loc[df_res.index % on == on-1]
return df_res
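# Hedged usage sketch for get_rolling_aggregates (illustration only; the toy frame
# below is hypothetical but mimics the telemetry schema used in this script):
#   df_toy = pd.DataFrame({'machineID': [1] * 6,
#                          'datetime': pd.date_range('2015-01-01', periods=6, freq='H'),
#                          'volt': [160.0, 162.5, 158.1, 161.7, 159.9, 163.2]})
#   get_rolling_aggregates(df_toy, ['volt'], suffixes=['_ma_3', '_sd_3'],
#                          window=3, on=3, groupby='machineID', lagon='datetime')
# this returns one row per 3-hour interval, with volt_ma_3 (rolling mean) and
# volt_sd_3 (rolling standard deviation)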
parser = argparse.ArgumentParser("automl_train")
parser.add_argument("--input_directory", type=str, help="input directory")
args = parser.parse_args()
print("input directory: %s" % args.input_directory)
run = Run.get_context()
ws = run.experiment.workspace
def_data_store = ws.get_default_datastore()
# Choose a name for the experiment and specify the project folder.
experiment_name = 'automl-local-classification'
project_folder = '.'
experiment = Experiment(ws, experiment_name)
print("Location:", ws.location)
output = {}
output['SDK version'] = azureml.core.VERSION
output['Subscription ID'] = ws.subscription_id
output['Workspace'] = ws.name
output['Resource Group'] = ws.resource_group
output['Location'] = ws.location
output['Project Directory'] = project_folder
output['Experiment Name'] = experiment.name
pd.set_option('display.max_colwidth', -1)
pd.DataFrame(data=output, index=['']).T
set_diagnostics_collection(send_diagnostics=True)
print("SDK Version:", azureml.core.VERSION)
df_telemetry, df_errors, df_subset, df_fails, df_maint, df_machines = download_data()
with open(os.path.join(args.input_directory, "anoms.pkl"), "rb") as fp:
obj = pickle.load(fp)
df_errors = obj['df_anoms']
rep_dir = {"volt":"error1", "rotate":"error2", "pressure":"error3", "vibration":"error4"}
df_errors = df_errors.replace({"errorID": rep_dir})
df_errors['errorID'] = df_errors['errorID'].apply(lambda x: int(x[-1]))
df_join = pd.merge(left=df_maint, right=df_fails.rename(columns={'failure':'comp'}), how = 'outer', indicator=True,
on=['datetime', 'machineID', 'comp'], validate='one_to_one')
df_join.head()
df_left = df_telemetry.loc[:, ['datetime', 'machineID']] # we set this table aside to join all our results with later
# this will make it easier to automatically create features with the right column names
# df_errors['errorID'] = df_errors['errorID'].apply(lambda x: int(x[-1]))
# df_maint['comp'] = df_maint['comp'].apply(lambda x: int(x[-1]))
# df_fails['failure'] = df_fails['failure'].apply(lambda x: int(x[-1]))
cols_to_average = df_telemetry.columns[-4:]
df_telemetry_rolling_3h = get_rolling_aggregates(df_telemetry, cols_to_average,
suffixes = ['_ma_3', '_sd_3'],
window = 3, on = 3,
groupby = 'machineID', lagon = 'datetime')
df_telemetry_rolling_12h = get_rolling_aggregates(df_telemetry, cols_to_average,
suffixes = ['_ma_12', '_sd_12'],
window = 12, on = 3,
groupby = 'machineID', lagon = 'datetime')
df_telemetry_rolling = pd.concat([df_telemetry_rolling_3h, df_telemetry_rolling_12h.drop(['machineID', 'datetime'], axis=1)], axis=1)
df_telemetry_feat_roll = df_left.merge(df_telemetry_rolling, how="inner", on=['machineID', 'datetime'], validate = "one_to_one")
df_telemetry_feat_roll.fillna(method='bfill', inplace=True)
df_telemetry_feat_roll.head()
del df_telemetry_rolling, df_telemetry_rolling_3h, df_telemetry_rolling_12h
df_errors_feat_roll = get_datetime_diffs(df_left, df_errors, catvar='errorID', prefix='e', window = 6, lagon = 'datetime', on = 3)
df_errors_feat_roll.tail()
df_errors_feat_roll.loc[df_errors_feat_roll['machineID'] == 2, :].head()
df_maint_feat_roll = get_datetime_diffs(df_left, df_maint, catvar='comp', prefix='m',
window = 6, lagon = 'datetime', on = 3, show_example=False)
df_maint_feat_roll.tail()
df_maint_feat_roll.loc[df_maint_feat_roll['machineID'] == 2, :].head()
df_fails_feat_roll = get_datetime_diffs(df_left, df_fails, catvar='failure', prefix='f',
window = 6, lagon = 'datetime', on = 3, show_example=False)
df_fails_feat_roll.tail()
assert(df_errors_feat_roll.shape[0] == df_fails_feat_roll.shape[0] == df_maint_feat_roll.shape[0] == df_telemetry_feat_roll.shape[0])
df_all = pd.concat([df_telemetry_feat_roll,
df_errors_feat_roll.drop(columns=['machineID', 'datetime']),
df_maint_feat_roll.drop(columns=['machineID', 'datetime']),
df_fails_feat_roll.drop(columns=['machineID', 'datetime'])], axis = 1, verify_integrity=True)
# df_all = pd.merge(left=df_telemetry_feat_roll, right=df_all, on = ['machineID', 'datetime'], validate='one_to_one')
df_all = pd.merge(left=df_all, right=df_machines, how="left", on='machineID', validate = 'many_to_one')
del df_join, df_left
del df_telemetry_feat_roll, df_errors_feat_roll, df_fails_feat_roll, df_maint_feat_roll
for i in range(1, 5): # iterate over the four components
# find all the times a component failed for a given machine
df_temp = df_all.loc[df_all['f_' + str(i)] == 1, ['machineID', 'datetime']]
label = 'y_' + str(i) # name of target column (one per component)
df_all[label] = 0
for n in range(df_temp.shape[0]): # iterate over all the failure times
machineID, datetime = df_temp.iloc[n, :]
dt_end = datetime - pd.Timedelta('3 hours') # 3 hours prior to failure
dt_start = datetime - pd.Timedelta('2 days') # n days prior to failure
if n % 500 == 0:
print("a failure occured on machine {0} at {1}, so {2} is set to 1 between {4} and {3}".format(machineID, datetime, label, dt_end, dt_start))
df_all.loc[(df_all['machineID'] == machineID) &
(df_all['datetime'].between(dt_start, dt_end)), label] = 1
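# Worked example of the labeling rule above (hypothetical values): if component 2
# fails on machine 5 at 2015-06-10 12:00, then y_2 is set to 1 for machine 5 from
# 2015-06-08 12:00 (failure minus 2 days) to 2015-06-10 09:00 (failure minus 3
# hours), i.e. the model learns to predict failures at least 3 hours in advance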
df_all.columns
X_drop = ['datetime', 'machineID', 'f_1', 'f_2', 'f_3', 'f_4', 'y_1', 'y_2', 'y_3', 'y_4', 'model']
Y_keep = ['y_1', 'y_2', 'y_3', 'y_4']
X_train = df_all.loc[df_all['datetime'] < '2015-10-01', ].drop(X_drop, axis=1)
y_train = df_all.loc[df_all['datetime'] < '2015-10-01', Y_keep]
X_test = df_all.loc[df_all['datetime'] > '2015-10-15', ].drop(X_drop, axis=1)
y_test = df_all.loc[df_all['datetime'] > '2015-10-15', Y_keep]
primary_metric = 'AUC_weighted'
automl_config = AutoMLConfig(task = 'classification',
preprocess = False,
name = experiment_name,
debug_log = 'automl_errors.log',
primary_metric = primary_metric,
max_time_sec = 1200,
iterations = 2,
n_cross_validations = 2,
verbosity = logging.INFO,
X = X_train.values, # we convert from pandas to numpy arrays using .values
y = y_train.values[:, 0], # we convert from pandas to numpy arrays using .values
path = project_folder, )
local_run = experiment.submit(automl_config, show_output = True)
# Wait until the run finishes.
local_run.wait_for_completion(show_output = True)
# create new AutoMLRun object to ensure everything is in order
ml_run = AutoMLRun(experiment = experiment, run_id = local_run.id)
# aux function for comparing performance of runs (quick workaround for automl's _get_max_min_comparator)
def maximize(x, y):
if x >= y:
return x
else:
return y
# the next couple of lines are a stripped-down version of automl's get_output
children = list(ml_run.get_children())
best_run = None # will be child run with best performance
best_score = None # performance of that child run
for child in children:
candidate_score = child.get_metrics()[primary_metric]
if not np.isnan(candidate_score):
if best_score is None:
best_score = candidate_score
best_run = child
else:
new_score = maximize(best_score, candidate_score)
if new_score != best_score:
best_score = new_score
best_run = child
# print accuracy
best_accuracy = best_run.get_metrics()['accuracy']
print("Best run accuracy:", best_accuracy)
# download model and save to pkl
model_path = "outputs/model.pkl"
best_run.download_file(name=model_path, output_file_path=model_path)
# Writing the run id to /aml_config/run_id.json
run_id = {}
run_id['run_id'] = best_run.id
run_id['experiment_name'] = best_run.experiment.name
# save run info
os.makedirs('aml_config', exist_ok = True)
with open('aml_config/run_id.json', 'w') as outfile:
json.dump(run_id, outfile)
# upload run info and model (pkl) to def_data_store, so that the rest of the pipeline can access them
def_data_store.upload(src_dir = 'aml_config', target_path = 'aml_config', overwrite = True)
def_data_store.upload(src_dir = 'outputs', target_path = 'outputs', overwrite = True)

@@ -0,0 +1,57 @@
import os, json, sys
from azureml.core import Workspace
from azureml.core.image import ContainerImage, Image
from azureml.core.model import Model
# Get workspace
ws = Workspace.from_config()
# Get the latest model details
try:
with open("aml_config/model.json") as f:
config = json.load(f)
except:
print('No new model to register, thus no need to create a new scoring image')
#raise Exception('No new model to register as production model perform better')
sys.exit(0)
model_name = config['model_name']
model_version = config['model_version']
model_list = Model.list(workspace=ws)
model, = (m for m in model_list if m.version==model_version and m.name==model_name)
print('Model picked: {} \nModel Description: {} \nModel Version: {}'.format(model.name, model.description, model.version))
os.chdir('./CICD/code/scoring')
image_name = "predmaintenance-model-score"
image_config = ContainerImage.image_configuration(execution_script = "score.py",
runtime = "python-slim",
conda_file = "conda_dependencies.yml",
description = "Image with predictive maintenance model",
tags = {'area': "predictive maintenance", 'type': "classification"}
)
image = Image.create(name = image_name,
models = [model],
image_config = image_config,
workspace = ws)
image.wait_for_creation(show_output = True)
os.chdir('../../../')
if image.creation_state != 'Succeeded':
raise Exception(f'Image creation status: {image.creation_state}')
print('{}(v.{} [{}]) stored at {} with build log {}'.format(image.name, image.version, image.creation_state, image.image_location, image.image_build_log_uri))
# Writing the image details to /aml_config/image.json
image_json = {}
image_json['image_name'] = image.name
image_json['image_version'] = image.version
image_json['image_location'] = image.image_location
with open('aml_config/image.json', 'w') as outfile:
json.dump(image_json,outfile)
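# For illustration, the resulting aml_config/image.json looks roughly like this
# (all values hypothetical):
#   {"image_name": "predmaintenance-model-score", "image_version": 3,
#    "image_location": "myregistry.azurecr.io/predmaintenance-model-score:3"}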

@@ -0,0 +1,51 @@
import os, json, datetime, sys
from operator import attrgetter
from azureml.core import Workspace
from azureml.core.model import Model
from azureml.core.image import Image
from azureml.core.webservice import Webservice
from azureml.core.webservice import AciWebservice
# Get workspace
ws = Workspace.from_config()
# Get the Image to deploy details
try:
with open("aml_config/image.json") as f:
config = json.load(f)
except:
print('No new model, thus no deployment on ACI')
#raise Exception('No new model to register as production model perform better')
sys.exit(0)
image_name = config['image_name']
image_version = config['image_version']
images = Image.list(workspace=ws)
image, = (m for m in images if m.version==image_version and m.name == image_name)
print('From image.json, Image used to deploy webservice on ACI: {}\nImage Version: {}\nImage Location = {}'.format(image.name, image.version, image.image_location))
aciconfig = AciWebservice.deploy_configuration(cpu_cores=1,
memory_gb=1,
tags={'area': "pred-maintenance", 'type': "automl"},
description='A sample description')
aci_service_name='aciwebservice'+ datetime.datetime.now().strftime('%m%d%H')
service = Webservice.deploy_from_image(deployment_config=aciconfig,
image=image,
name=aci_service_name,
workspace=ws)
service.wait_for_deployment()
print('Deployed ACI Webservice: {} \nWebservice Uri: {}'.format(service.name, service.scoring_uri))
#service=Webservice(name ='aciws0622', workspace =ws)
# Writing the ACI details to /aml_config/aci_webservice.json
aci_webservice = {}
aci_webservice['aci_name'] = service.name
aci_webservice['aci_url'] = service.scoring_uri
with open('aml_config/aci_webservice.json', 'w') as outfile:
json.dump(aci_webservice,outfile)

@@ -0,0 +1,76 @@
import os, json, datetime, sys
from operator import attrgetter
from azureml.core import Workspace
from azureml.core.model import Model
from azureml.core.image import Image
from azureml.core.compute import AksCompute, ComputeTarget
from azureml.core.webservice import Webservice, AksWebservice
# Get workspace
ws = Workspace.from_config()
# Get the Image to deploy details
try:
with open("aml_config/image.json") as f:
config = json.load(f)
except:
print('No new model, thus no deployment on AKS')
#raise Exception('No new model to register as production model perform better')
sys.exit(0)
image_name = config['image_name']
image_version = config['image_version']
images = Image.list(workspace=ws)
image, = (m for m in images if m.version==image_version and m.name == image_name)
print('From image.json, Image used to deploy webservice: {}\nImage Version: {}\nImage Location = {}'.format(image.name, image.version, image.image_location))
# Check if an AKS cluster is already available
try:
with open("aml_config/aks_webservice.json") as f:
config = json.load(f)
aks_name = config['aks_name']
aks_service_name = config['aks_service_name']
compute_list = ws.compute_targets()
aks_target, =(c for c in compute_list if c.name ==aks_name)
service=Webservice(name =aks_service_name, workspace =ws)
print('Updating AKS service {} with image: {}'.format(aks_service_name,image.image_location))
service.update(image=image)
except:
aks_name = 'aks'+ datetime.datetime.now().strftime('%m%d%H')
aks_service_name = 'akswebservice'+ datetime.datetime.now().strftime('%m%d%H')
prov_config = AksCompute.provisioning_configuration(agent_count = 6, vm_size = 'Standard_F2', location='eastus')
print('No AKS found in aks_webservice.json. Creating new Aks: {} and AKS Webservice: {}'.format(aks_name,aks_service_name))
# Create the cluster
aks_target = ComputeTarget.create(workspace = ws,
name = aks_name,
provisioning_configuration = prov_config)
aks_target.wait_for_completion(show_output = True)
print(aks_target.provisioning_state)
print(aks_target.provisioning_errors)
# Use the default configuration (can also provide parameters to customize)
aks_config = AksWebservice.deploy_configuration(enable_app_insights=True)
service = Webservice.deploy_from_image(workspace = ws,
name = aks_service_name,
image = image,
deployment_config = aks_config,
deployment_target = aks_target)
service.wait_for_deployment(show_output = True)
print(service.state)
print('Deployed AKS Webservice: {} \nWebservice Uri: {}'.format(service.name, service.scoring_uri))
# Writing the AKS details to /aml_config/aks_webservice.json
aks_webservice = {}
aks_webservice['aks_name'] = aks_name
aks_webservice['aks_service_name'] = service.name
aks_webservice['aks_url'] = service.scoring_uri
aks_webservice['aks_keys'] = service.get_keys()
with open('aml_config/aks_webservice.json', 'w') as outfile:
json.dump(aks_webservice,outfile)

@@ -0,0 +1,59 @@
import os, json
from azureml.core import Workspace
from azureml.core import Experiment
from azureml.core.model import Model
import azureml.core
from azureml.core import Run
# Get workspace
ws = Workspace.from_config()
# Parameterize the metrics on which the models should be compared
# Add golden data set on which all the model performance can be evaluated
# Get the latest run_id
with open("aml_config/run_id.json") as f:
config = json.load(f)
new_model_run_id = config["run_id"]
experiment_name = config["experiment_name"]
exp = Experiment(workspace = ws, name = experiment_name)
try:
# Get the most recently registered model; we assume that is the model in production. Download it and compare it with the newly trained model by running a test on the same data set.
model_list = Model.list(ws)
production_model = next(filter(lambda x: x.created_time == max(model.created_time for model in model_list), model_list))
production_model_run_id = production_model.tags.get('run_id')
run_list = exp.get_runs()
# production_model_run = next(filter(lambda x: x.id == production_model_run_id, run_list))
# Get the run history for both the production model and the newly trained model and compare their accuracy
production_model_run = Run(exp,run_id=production_model_run_id)
new_model_run = Run(exp,run_id=new_model_run_id)
production_model_metric = production_model_run.get_metrics().get('accuracy')
new_model_metric = new_model_run.get_metrics().get('accuracy')
print('Current Production model accuracy: {}, New trained model accuracy: {}'.format(production_model_metric, new_model_metric))
promote_new_model=False
if new_model_metric > production_model_metric: # with accuracy, higher is better
promote_new_model=True
print('The newly trained model performs better, so it will be registered')
except:
promote_new_model=True
print('This is the first model to be trained, thus nothing to evaluate for now')
run_id = {}
run_id['run_id'] = ''
# Writing the run id to /aml_config/run_id.json
if promote_new_model:
run_id['run_id'] = new_model_run_id
run_id['experiment_name'] = experiment_name
with open('aml_config/run_id.json', 'w') as outfile:
json.dump(run_id,outfile)
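# Worked example of the promotion rule (hypothetical numbers): with a production
# accuracy of 0.89 and a new-model accuracy of 0.91, promote_new_model is True and
# run_id.json keeps the new run id; with 0.87 instead of 0.91 it stays False and
# run_id.json is written with an empty run_id, so downstream steps exit early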

@@ -0,0 +1,150 @@
############################### load required libraries
import os
import pandas as pd
import json
import azureml.core
from azureml.core import Workspace, Run, Experiment, Datastore
from azureml.core.compute import AmlCompute
from azureml.core.compute import ComputeTarget
from azureml.core.runconfig import CondaDependencies, RunConfiguration
from azureml.core.runconfig import DEFAULT_CPU_IMAGE
from azureml.telemetry import set_diagnostics_collection
from azureml.pipeline.steps import PythonScriptStep
from azureml.pipeline.core import Pipeline, PipelineData, StepSequence
print("SDK Version:", azureml.core.VERSION)
############################### load workspace and create experiment
ws = Workspace.from_config()
print('Workspace name: ' + ws.name,
'Subscription id: ' + ws.subscription_id,
'Resource group: ' + ws.resource_group, sep = '\n')
experiment_name = 'aml-pipeline-cicd' # choose a name for experiment
project_folder = '.' # project folder
experiment = Experiment(ws, experiment_name)
print("Location:", ws.location)
output = {}
output['SDK version'] = azureml.core.VERSION
output['Subscription ID'] = ws.subscription_id
output['Workspace'] = ws.name
output['Resource Group'] = ws.resource_group
output['Location'] = ws.location
output['Project Directory'] = project_folder
output['Experiment Name'] = experiment.name
pd.set_option('display.max_colwidth', -1)
pd.DataFrame(data = output, index = ['']).T
set_diagnostics_collection(send_diagnostics=True)
############################### create a run config
cd = CondaDependencies.create(pip_packages=["azureml-sdk==1.0.17", "azureml-train-automl==1.0.17", "pyculiarity", "pytictoc", "cryptography==2.5", "pandas"])
amlcompute_run_config = RunConfiguration(framework = "python", conda_dependencies = cd)
amlcompute_run_config.environment.docker.enabled = False
amlcompute_run_config.environment.docker.gpu_support = False
amlcompute_run_config.environment.docker.base_image = DEFAULT_CPU_IMAGE
amlcompute_run_config.environment.spark.precache_packages = False
############################### create AML compute
aml_compute_target = "aml-compute"
try:
aml_compute = AmlCompute(ws, aml_compute_target)
print("found existing compute target.")
except:
print("creating new compute target")
provisioning_config = AmlCompute.provisioning_configuration(vm_size = "STANDARD_D2_V2",
idle_seconds_before_scaledown=1800,
min_nodes = 0,
max_nodes = 4)
aml_compute = ComputeTarget.create(ws, aml_compute_target, provisioning_config)
aml_compute.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)
print("Azure Machine Learning Compute attached")
############################### point to data and scripts
# we use this for exchanging data between pipeline steps
def_data_store = ws.get_default_datastore()
# get pointer to default blob store
def_blob_store = Datastore(ws, "workspaceblobstore")
print("Blobstore's name: {}".format(def_blob_store.name))
# Naming the intermediate data as anomaly data and assigning it to a variable
anomaly_data = PipelineData("anomaly_data", datastore = def_blob_store)
print("Anomaly data object created")
# model = PipelineData("model", datastore = def_data_store)
# print("Model data object created")
anom_detect = PythonScriptStep(name = "anomaly_detection",
# script_name="anom_detect.py",
script_name = "CICD/code/anom_detect.py",
arguments = ["--output_directory", anomaly_data],
outputs = [anomaly_data],
compute_target = aml_compute,
source_directory = project_folder,
allow_reuse = True,
runconfig = amlcompute_run_config)
print("Anomaly Detection Step created.")
automl_step = PythonScriptStep(name = "automl_step",
# script_name = "automl_step.py",
script_name = "CICD/code/automl_step.py",
arguments = ["--input_directory", anomaly_data],
inputs = [anomaly_data],
# outputs = [model],
compute_target = aml_compute,
source_directory = project_folder,
allow_reuse = True,
runconfig = amlcompute_run_config)
print("AutoML Training Step created.")
############################### set up, validate and run pipeline
steps = [anom_detect, automl_step]
print("Step lists created")
pipeline = Pipeline(workspace = ws, steps = steps)
print ("Pipeline is built")
pipeline.validate()
print("Pipeline validation complete")
pipeline_run = experiment.submit(pipeline) #, regenerate_outputs=True)
print("Pipeline is submitted for execution")
# Wait until the run finishes.
pipeline_run.wait_for_completion(show_output = False)
print("Pipeline run completed")
############################### upload artifacts to AML Workspace
# Download aml_config info and output of automl_step
def_data_store.download(target_path = '.',
prefix = 'aml_config',
show_progress = True,
overwrite = True)
def_data_store.download(target_path = '.',
prefix = 'outputs',
show_progress = True,
overwrite = True)
print("Updated aml_config and outputs folder")
model_fname = 'model.pkl'
model_path = os.path.join("outputs", model_fname)
# Upload the model file explicitly into artifacts (for CI/CD)
pipeline_run.upload_file(name = model_path, path_or_stream = model_path)
print('Uploaded the model {} to experiment {}'.format(model_fname, pipeline_run.experiment.name))

@@ -0,0 +1,58 @@
import os, json, sys
from azureml.core import Workspace
from azureml.core import Run
from azureml.core import Experiment
from azureml.core.model import Model
from azureml.core.runconfig import RunConfiguration
# Get workspace
ws = Workspace.from_config()
# Get the latest evaluation result
try:
with open("aml_config/run_id.json") as f:
config = json.load(f)
if not config["run_id"]:
raise Exception('No new model to register, as the production model performs better')
except:
print('No new model to register, as the production model performs better')
#raise Exception('No new model to register as production model perform better')
sys.exit(0)
run_id = config["run_id"]
experiment_name = config["experiment_name"]
exp = Experiment(workspace = ws, name = experiment_name)
run = Run(experiment = exp, run_id = run_id)
print('Files stored with the run: {}'.format(run.get_file_names()))
print('Run ID for last run: {}'.format(run_id))
model_local_dir="model"
os.makedirs(model_local_dir,exist_ok=True)
# Download Model to Project root directory
model_name= 'model.pkl'
run.download_file(name = './outputs/'+model_name,
output_file_path = './model/'+model_name)
print('Downloaded model {} to Project root directory'.format(model_name))
os.chdir('./model')
model = Model.register(model_path = model_name, # this points to a local file
model_name = model_name, # this is the name the model is registered as
tags = {'area': "predictive maintenance", 'type': "automl", 'run_id' : run_id},
description="Model for predictive maintenance dataset",
workspace = ws)
os.chdir('..')
print('Model registered: {} \nModel Description: {} \nModel Version: {}'.format(model.name, model.description, model.version))
# Remove the evaluate.json as we no longer need it
# os.remove("aml_config/evaluate.json")
# Writing the registered model details to /aml_config/model.json
model_json = {}
model_json['model_name'] = model.name
model_json['model_version'] = model.version
model_json['run_id'] = run_id
with open('aml_config/model.json', 'w') as outfile:
json.dump(model_json,outfile)

@@ -0,0 +1,13 @@
name: myenv
channels:
- defaults
dependencies:
- python=3.6.2
- pip:
- scikit-learn==0.19.1
- azureml-sdk[automl]
- azureml-monitoring
- pyculiarity
- scipy
- numpy
- pandas

@@ -0,0 +1,301 @@
import datetime
import pandas as pd
from pyculiarity import detect_ts
import os
import pickle
import json
from sklearn.externals import joblib
from azureml.core.model import Model
import azureml.train.automl
from azureml.monitoring import ModelDataCollector
import time
import glob
import numpy as np
import scipy
def create_data_dict(data, sensors):
"""
:param data: a one-row DataFrame with the latest sensor readings
:param sensors: list of sensor column names
:return: dict of lists with the raw values plus '_avg' and '_an' placeholder fields
"""
data_dict = {}
for column in data.columns:
data_dict[column] = [data[column].values[0]]
if column in sensors:
data_dict[column + '_avg'] = [0.0]
data_dict[column + '_an'] = [False]
return data_dict
def init_df():
"""
Initialize an empty DataFrame; rows are appended later by append_data
:return: an empty pd.DataFrame
"""
# data_dict = create_data_dict(data)
df = pd.DataFrame() #data=data_dict, index=data_dict['timestamp'])
return df
def append_data(df, data, sensors):
"""
We either add the data and the results (res_dict) of the anomaly detection to the existing data frame,
or create a new one if the data frame is empty
"""
data_dict = create_data_dict(data, sensors)
# TODO: this is only necessary because currently the webservice doesn't receive any timestamps
if df.shape[0] == 0:
prv_timestamp = datetime.datetime(2015, 1, 1, 5, 0) # the first appended reading becomes 1/1/2015 6:00:00 AM (one hour is added below)
else:
prv_timestamp = df['timestamp'].max()
data_dict['timestamp'] = [prv_timestamp + datetime.timedelta(hours=1)]
df = df.append(pd.DataFrame(data=data_dict, index=data_dict['timestamp']))
return df
def generate_stream(telemetry, n=None):
"""
n is the number of sensor readings we are simulating
"""
if not n:
n = telemetry.shape[0]
machine_ids = [1] # telemetry['machineID'].unique()
timestamps = telemetry['timestamp'].unique()
# sort test_data by timestamp
# on every iteration, shuffle machine IDs
# then loop over machine IDs
#t = TicToc()
for timestamp in timestamps:
#t.tic()
np.random.shuffle(machine_ids)
for machine_id in machine_ids:
data = telemetry.loc[(telemetry['timestamp'] == timestamp) & (telemetry['machineID'] == machine_id), :]
run(data)
#t.toc("Processing all machines took")
def load_df(data):
machineID = data['machineID'].values[0]
filename = os.path.join(storage_location, "data_w_anoms_ID_%03d.csv" % machineID)
if os.path.exists(filename):
df = pd.read_csv(filename)
df['timestamp'] = pd.to_datetime(df['timestamp'], format="%Y-%m-%d %H:%M:%S")
else:
df = pd.DataFrame()
return df
def save_df(df):
"""
:param df:
:return:
"""
machine_id = df['machineID'].iloc[0]
filename = os.path.join(storage_location, "data_w_anoms_ID_%03d.csv" % machine_id)
df.to_csv(filename, index=False)
def running_avgs(df, sensors, window_size=24, only_copy=False):
"""
Calculates rolling average according to Welford's online algorithm.
https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Online
This updates the running-average column next to each sensor column (suffix '_avg')
:param df: a dataframe with time series in columns
:param sensors: names of the sensor columns of interest
:param window_size: number of measurements to consider
:param only_copy: if True, copy the current value instead of averaging
:return: None
"""
curr_n = df.shape[0]
row_index = curr_n - 1
window_size = min(window_size, curr_n)
for sensor in sensors:
val_col_index = df.columns.get_loc(sensor)
avg_col_index = df.columns.get_loc(sensor + "_avg")
curr_value = df.iloc[row_index, val_col_index]
if curr_n == 1 or only_copy: # first reading: there is nothing to average against yet
df.iloc[row_index, avg_col_index] = curr_value
else:
prv_avg = df.iloc[row_index - 1, avg_col_index]
df.iloc[row_index, avg_col_index] = prv_avg + (curr_value - prv_avg) / window_size
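# Worked example of the update above (hypothetical numbers): with window_size = 12,
# a previous running average of 10.0 and a new reading of 16.0, the new running
# average is 10.0 + (16.0 - 10.0) / 12 = 10.5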
def init():
global model
global prediction_dc
global storage_location
storage_location = "/tmp/output"
if not os.path.exists(storage_location):
os.makedirs(storage_location)
# next, we delete previous output files
files = glob.glob(os.path.join(storage_location,'*'))
for f in files:
os.remove(f)
model_name = "model.pkl"
model_path = Model.get_model_path(model_name = model_name)
# deserialize the model file back into a sklearn model
model = joblib.load(model_path)
prediction_dc = ModelDataCollector("automl_model", identifier="predictions", feature_names=["prediction"])
def run(rawdata, window=14 * 24):
"""
:param data:
:param window:
:return:
"""
try:
# set some parameters for the AD algorithm
alpha = 0.1
max_anoms = 0.05
only_last = None # alternatively, we can set this to 'hr' or 'day'
json_data = json.loads(rawdata)['data']
# this is the beginning of anomaly detection code
# TODO: the anomaly detection service expected one row of a pd.DataFrame w/ a timestamp and machine id, but here we only get a list of values
# we therefore create a time stamp ourselves
# and create a data frame that the anomaly detection code can understand
# eventually, we want this to be harmonized!
timestamp = time.strftime("%m/%d/%Y %H:%M:%S", time.localtime())
machineID = 1 # TODO scipy.random.choice(100)
telemetry_data = json_data[0][8:16:2]
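# note: [8:16:2] picks the elements at indices 8, 10, 12 and 14 of the feature
# vector and treats them as the four raw sensor values below; this relies on the
# layout produced by the upstream featurization and is an assumption of this script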
sensors = ['volt','pressure','vibration', 'rotate']
data_dict = {}
data_dict['timestamp'] = [timestamp]
data_dict['machineID'] = [machineID]
for i in range(0,4):
data_dict[sensors[i]] = [telemetry_data[i]]
telemetry_df = pd.DataFrame(data=data_dict)
telemetry_df['timestamp'] = pd.to_datetime(telemetry_df['timestamp'])
# load dataframe
df = load_df(telemetry_df)
# add current sensor readings to data frame, also adds fields for anomaly detection results
df = append_data(df, telemetry_df, sensors)
# # calculate running averages (no need to do this here, because we are already sending preprocessed data)
# # TODO: this is disabled for now, because we are dealing with pre-processed data
# running_avgs(df, sensors, only_copy=True)
# note timestamp so that we can update the correct row of the dataframe later
timestamp = df['timestamp'].max()
# we get a copy of the current (also last) row of the dataframe
current_row = df.loc[df['timestamp'] == timestamp, :]
# determine how many sensor readings we already have
rows = df.shape[0]
# if the data frame doesn't have enough rows for our sliding window size, we just return (setting that we have no
# anomalies)
if rows < window:
save_df(df)
json_data = current_row.to_json()
return json.dumps({"result": [0]})
# determine the first row of the data frame that falls into the sliding window
start_row = rows - window
# a flag to indicate whether we detected an anomaly in any of the sensors after this reading
detected_an_anomaly = False
anom_list = []
# we loop over the sensor columns
for column in sensors:
df_s = df.iloc[start_row:rows][['timestamp', column + "_avg"]].copy()
# pyculiarity expects two columns with particular names
df_s.columns = ['timestamp', 'value']
# we reset the timestamps, so that the current measurement is the last within the sliding time window
# df_s = reset_time(df_s)
# calculate the median value within each time sliding window
# values = df_s.groupby(df_s.index.date)['value'].median()
# create dataframe with median values etc.
# df_agg = pd.DataFrame(data={'timestamp': pd.to_datetime(values.index), 'value': values})
# find anomalies
results = detect_ts(df_s, max_anoms=max_anoms,
alpha=alpha,
direction='both',
e_value=False,
only_last=only_last)
# create a data frame where we mark for each day whether it was an anomaly
df_s = df_s.merge(results['anoms'], on='timestamp', how='left')
# mark the current sensor reading as an anomaly. Specifically, if we get an anomaly in the sliding window
# leading up to (and including) the current sensor reading, we mark the current sensor reading as an anomaly. Note,
# alternatively one could mark all the sensor readings that fall within the sliding window as anomalies.
# However, we prefer our approach, because without the current sensor reading the other sensor readings in
# this sliding window may not have been an anomaly
# current_row[column + '_an'] = not np.isnan(df_agg.tail(1)['anoms'].iloc[0])
if not np.isnan(df_s.tail(1)['anoms'].iloc[0]):
current_row.iloc[0, current_row.columns.get_loc(column + '_an')] = True
detected_an_anomaly = True
anom_list.append(1.0)
else:
anom_list.append(0.0)
# It's only necessary to update the current row in the data frame, if we detected an anomaly
if detected_an_anomaly:
df.loc[df['timestamp'] == timestamp, :] = current_row
save_df(df)
json_data[0][8:16:2] = anom_list
# # this is the end of anomaly detection code
data = np.array(json_data)
result = model.predict(data)
prediction_dc.collect(result)
print ("saving prediction data" + time.strftime("%H:%M:%S"))
except Exception as e:
result = str(e)
return json.dumps({"error": result})
return json.dumps({"result":result.tolist()})

@@ -0,0 +1,32 @@
# test integrity of the input data
import sys
import os
import numpy as np
import pandas as pd
# number of features
n_columns = 37
def check_schema(X):
n_actual_columns = X.shape[1]
if n_actual_columns != n_columns:
print("Error: found {} feature columns. The data should have {} feature columns.".format(n_actual_columns, n_columns))
return False
return True
def main():
filename = sys.argv[1]
if not os.path.exists(filename):
print("Error: The file {} does not exist".format(filename))
return
dataset = pd.read_csv(filename)
if check_schema(dataset[dataset.columns[:-1]]):
print("Data schema test succeeded")
else:
print("Data schema test failed")
if __name__ == "__main__":
main()
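# Example invocation (the path is hypothetical):
#   python data_test.py data/train.csv
# prints "Data schema test succeeded" when the file has the expected 37 feature
# columns plus one label column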

@@ -0,0 +1,7 @@
.ipynb_checkpoints
azureml-logs
.azureml
.git
outputs
azureml-setup
docs

@@ -0,0 +1,38 @@
import numpy
import os, json, datetime, sys
from operator import attrgetter
from azureml.core import Workspace
from azureml.core.model import Model
from azureml.core.image import Image
from azureml.core.webservice import Webservice
from azureml.core.webservice import AciWebservice
# Get workspace
ws = Workspace.from_config()
# Get the ACI Details
try:
with open("aml_config/aci_webservice.json") as f:
config = json.load(f)
except:
print('No new model, thus no deployment on ACI')
#raise Exception('No new model to register as production model perform better')
sys.exit(0)
service_name = config['aci_name']
# Get the hosted web service
service=Webservice(name = service_name, workspace =ws)
# Input for Model with all features
input_j = [[1.62168882e+02, 4.82427351e+02, 1.09748253e+02, 4.32529303e+01, 3.52377597e+01, 4.37307613e+01, 1.15729573e+01, 4.27624778e+00, 1.68042813e+02, 4.61654301e+02, 1.03138200e+02, 4.08555785e+01, 1.80809993e+01, 4.85402042e+01, 1.09373285e+01, 4.18269355e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 3.07200000e+03, 5.64000000e+02, 2.22900000e+03, 9.84000000e+02, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 3.03000000e+02, 6.63000000e+02, 3.18300000e+03, 3.03000000e+02, 5.34300000e+03, 4.26300000e+03, 6.88200000e+03, 1.02300000e+03, 1.80000000e+01]]
print(input_j)
test_sample = json.dumps({'data': input_j})
test_sample = bytes(test_sample,encoding = 'utf8')
try:
prediction = service.run(input_data = test_sample)
print(prediction)
except Exception as e:
result = str(e)
print(result)
raise Exception('ACI service is not working as expected')

@@ -0,0 +1,43 @@
import numpy
import os, json, datetime, sys
from operator import attrgetter
from azureml.core import Workspace
from azureml.core.model import Model
from azureml.core.image import Image
from azureml.core.webservice import Webservice
# Get workspace
ws = Workspace.from_config()
# Get the AKS Details
os.chdir('./devops')
try:
with open("aml_config/aks_webservice.json") as f:
config = json.load(f)
except:
print('No new model, thus no deployment on AKS')
#raise Exception('No new model to register as production model perform better')
sys.exit(0)
service_name = config['aks_service_name']
# Get the hosted web service
service=Webservice(workspace=ws, name=service_name)
# Input for Model with all features
input_j = [[1.62168882e+02, 4.82427351e+02, 1.09748253e+02, 4.32529303e+01, 3.52377597e+01, 4.37307613e+01, 1.15729573e+01, 4.27624778e+00, 1.68042813e+02, 4.61654301e+02, 1.03138200e+02, 4.08555785e+01, 1.80809993e+01, 4.85402042e+01, 1.09373285e+01, 4.18269355e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 3.07200000e+03, 5.64000000e+02, 2.22900000e+03, 9.84000000e+02, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 3.03000000e+02, 6.63000000e+02, 3.18300000e+03, 3.03000000e+02, 5.34300000e+03, 4.26300000e+03, 6.88200000e+03, 1.02300000e+03, 1.80000000e+01]]
print(input_j)
test_sample = json.dumps({'data': input_j})
test_sample = bytes(test_sample,encoding = 'utf8')
try:
prediction = service.run(input_data = test_sample)
print(prediction)
except Exception as e:
result = str(e)
print(result)
raise Exception('AKS service is not working as expected')
# Delete aks after test
#service.delete()

@@ -0,0 +1,65 @@
import argparse
import pickle
import pandas as pd
import os
from pyculiarity import detect_ts # python port of Twitter AD lib
from pytictoc import TicToc # so we can time our operations
parser = argparse.ArgumentParser("anom_detect")
parser.add_argument("--output_directory", type = str, help = "output directory")
args = parser.parse_args()
print("output directory: %s" % args.output_directory)
os.makedirs(args.output_directory, exist_ok = True)
# public store of telemetry data
data_dir = 'https://coursematerial.blob.core.windows.net/data/telemetry'
print("Reading data ... ", end = "")
telemetry = pd.read_csv(os.path.join(data_dir, 'telemetry.csv'))
print("Done.")
print("Adding incremental data...")
telemetry_incremental = pd.read_csv(os.path.join('CICD/data_sample/', 'telemetry_incremental.csv'))
telemetry = telemetry.append(telemetry_incremental, ignore_index = True)
print("Done.")
print("Parsing datetime...", end = "")
telemetry['datetime'] = pd.to_datetime(telemetry['datetime'], format = "%m/%d/%Y %I:%M:%S %p")
print("Done.")
window_size = 12 # how many measures to include in rolling average
sensors = telemetry.columns[2:] # sensor readings start at the third column (index 2)
window_sizes = [window_size] * len(sensors) # this can be changed to have individual window_sizes for each sensor
machine_ids = telemetry['machineID'].unique()
t = TicToc()
for machine_id in machine_ids[:1]: # TODO: remove the [:1] slice; it is only here so we can test on a single machine
df = telemetry.loc[telemetry.loc[:, 'machineID'] == machine_id, :]
t.tic()
print("Working on sensor: ")
for s, sensor in enumerate(sensors):
N = window_sizes[s]
print(" %s " % sensor)
df_ra = rolling_average(df, sensor, N)
anoms_timestamps = do_ad(df_ra)
df_anoms = pd.DataFrame(data = {'datetime': anoms_timestamps, 'machineID': [machine_id] * len(anoms_timestamps), 'errorID': [sensor] * len(anoms_timestamps)})
# if this is the first machine and sensor, we initialize a new dataframe
if machine_id == 1 and s == 0:
df_anoms_all = df_anoms
else: # otherwise we append the newly detected anomalies to the existing dataframe
df_anoms_all = df_anoms_all.append(df_anoms, ignore_index = True)
# store the output
obj = {}
obj["df_anoms"] = df_anoms_all
out_file = os.path.join(args.output_directory, "anoms.pkl")
with open(out_file, "wb") as fp:
pickle.dump(obj, fp)
t.toc("Processing machine %s took" % machine_id)

@@ -0,0 +1,326 @@
############################### load required libraries
import argparse
import json
import logging
import numpy as np
import os
import pandas as pd
import pickle
import random
import urllib.request
from sklearn import datasets
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.metrics import roc_auc_score
from sklearn.externals import joblib
import azureml.core
from azureml.core.run import Run
from azureml.core.experiment import Experiment
from azureml.core.workspace import Workspace
from azureml.train.automl import AutoMLConfig
from azureml.train.automl.run import AutoMLRun
from azureml.telemetry import set_diagnostics_collection
############################### set up experiment
parser = argparse.ArgumentParser("automl_train")
parser.add_argument("--input_directory", default = "data", type = str, help = "input directory")
args = parser.parse_args()
print("input directory: %s" % args.input_directory)
run = Run.get_context()
ws = run.experiment.workspace
# Choose a name for the experiment and specify the project folder.
experiment_name = 'automl-local-classification'
project_folder = '.'
experiment = Experiment(ws, experiment_name)
output = {}
output['SDK version'] = azureml.core.VERSION
output['Subscription ID'] = ws.subscription_id
output['Workspace'] = ws.name
output['Resource Group'] = ws.resource_group
output['Location'] = ws.location
output['Project Directory'] = project_folder
output['Experiment Name'] = experiment.name
print("Run info:", output)
set_diagnostics_collection(send_diagnostics = True)
############################### define functions
def download_data():
"""
download the anomaly detection and predictive maintenance data
:return: all the data
"""
os.makedirs('../data', exist_ok = True)
container = 'https://coursematerial.blob.core.windows.net/data/telemetry/'
urllib.request.urlretrieve(container + 'telemetry.csv', filename = '../data/telemetry.csv')
urllib.request.urlretrieve(container + 'maintenance.csv', filename = '../data/maintenance.csv')
urllib.request.urlretrieve(container + 'machines.csv', filename = '../data/machines.csv')
urllib.request.urlretrieve(container + 'failures.csv', filename = '../data/failures.csv')
# we replace errors.csv with anoms.csv (results from running anomaly detection)
# urllib.request.urlretrieve(container + 'errors.csv', filename = '../data/errors.csv')
urllib.request.urlretrieve(container + 'anoms.csv', filename = '../data/anoms.csv')
df_telemetry = pd.read_csv('../data/telemetry.csv', header = 0)
df_errors = pd.read_csv('../data/anoms.csv', header = 0)
df_fails = pd.read_csv('../data/failures.csv', header = 0)
df_maint = pd.read_csv('../data/maintenance.csv', header = 0)
df_machines = pd.read_csv('../data/machines.csv', header = 0)
df_telemetry['datetime'] = pd.to_datetime(df_telemetry['datetime'], format = "%m/%d/%Y %I:%M:%S %p")
df_errors['datetime'] = pd.to_datetime(df_errors['datetime'])
rep_dir = {"volt":"error1", "rotate":"error2", "pressure":"error3", "vibration":"error4"}
df_errors = df_errors.replace({"errorID": rep_dir})
df_errors['errorID'] = df_errors['errorID'].apply(lambda x: int(x[-1]))
df_fails['datetime'] = pd.to_datetime(df_fails['datetime'], format = "%m/%d/%Y %I:%M:%S %p")
df_fails['failure'] = df_fails['failure'].apply(lambda x: int(x[-1]))
df_maint['datetime'] = pd.to_datetime(df_maint['datetime'], format = "%m/%d/%Y %I:%M:%S %p")
df_maint['comp'] = df_maint['comp'].apply(lambda x: int(x[-1]))
return df_telemetry, df_errors, df_fails, df_maint, df_machines
def get_rolling_aggregates(df, colnames, suffixes, window, on, groupby, lagon = None):
"""
calculates rolling averages and standard deviations
:param df: dataframe to run it on
:param colnames: names of columns we want rolling statistics for
:param suffixes: suffixes attached to the new columns (provide a list with strings)
:param window: the lag over which rolling statistics are calculated
:param on: the interval at which rolling statistics are calculated
:param groupby: the column used to group results by
:param lagon: the name of the datetime column used to compute lags (if none specified it defaults to row number)
:return: a dataframe with rolling statistics over a specified lag calculated over a specified interval
"""
rolling_colnames = [c + suffixes[0] for c in colnames]
df_rolling_mean = df.groupby(groupby).rolling(window=window, on=lagon)[colnames].mean()
df_rolling_mean.columns = rolling_colnames
df_rolling_mean.reset_index(inplace=True)
rolling_colnames = [c + suffixes[1] for c in colnames]
df_rolling_sd = df.groupby(groupby).rolling(window=window, on=lagon)[colnames].var()
df_rolling_sd.columns = rolling_colnames
df_rolling_sd = df_rolling_sd.apply(np.sqrt)
df_rolling_sd.reset_index(inplace=True, drop=True)
df_res = pd.concat([df_rolling_mean, df_rolling_sd], axis=1)
df_res = df_res.loc[df_res.index % on == on-1]
return df_res
def get_datetime_diffs(df_left, df_right, catvar, prefix, window, on, lagon = None, diff_type = 'timedelta64[h]', validate = 'one_to_one', show_example = True):
"""
calculates for every timestamp the time elapsed since the last time an event occurred, where an event is either an error or anomaly, maintenance, or failure
:param df_left: the telemetry data collected at regular intervals
:param df_right: the event data collected at irregular intervals
:param catvar: the name of the categorical column that encodes the event
:param prefix: the prefix for the new column showing time elapsed
:param window: window size for detecting event
:param on: frequency we want the results to be in
:param lagon: the name of the datetime column used to compute lags (if none specified it defaults to row number)
:param diff_type: the unit we want time differences to be measured in (hour by default)
:param validate: whether we should validate results
:param show_example: whether we should show an example to check that things are working
:return: a dataframe with rolling statistics over a specified lag calculated over a specified interval
"""
keys = ['machineID', 'datetime']
df_dummies = pd.get_dummies(df_right[catvar], prefix=prefix)
df_wide = pd.concat([df_right.loc[:, keys], df_dummies], axis=1)
df_wide = df_wide.groupby(keys).sum().reset_index()
df = df_left.merge(df_wide, how="left", on=keys, validate = validate).fillna(0)
# run a rolling window through event flags to aggregate data
dummy_col_names = df_dummies.columns
df = df.groupby('machineID').rolling(window=window, on=lagon)[dummy_col_names].max()
df.reset_index(inplace=True)
df = df.loc[df.index % on == on-1]
df.reset_index(inplace=True, drop=True)
df_first = df.groupby('machineID', as_index=False).nth(0)
# calculate the time of the last event and the time elapsed since
for col in dummy_col_names:
whenlast, diffcol = 'last_' + col, 'd' + col
df.loc[:, col].fillna(value = 0, inplace=True)
# let's assume an event happened in row 0, so we don't have missing values for the time elapsed
df.iloc[df_first.index, df.columns.get_loc(col)] = 1
df.loc[df[col] == 1, whenlast] = df.loc[df[col] == 1, 'datetime']
# for the first occurrence we don't know when it last happened, so we assume it happened then
df.iloc[df_first.index, df.columns.get_loc(whenlast)] = df.iloc[df_first.index, df.columns.get_loc('datetime')]
df[whenlast].fillna(method='ffill', inplace=True)
# df.loc[df[whenlast] > df['datetime'], whenlast] = np.nan
df.loc[df[whenlast] <= df['datetime'], diffcol] = (df['datetime'] - df[whenlast]).astype(diff_type)
df.drop(columns = whenlast, inplace=True)
if show_example:
col = np.random.choice(dummy_col_names, size = 1)[0]
idx = np.random.choice(df.loc[df[col] == 1, :].index.tolist(), size = 1)[0]
print('Example:\n')
print(df.loc[df.index.isin(range(idx-3, idx+5)), ['datetime', col, 'd' + col]])
return df
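# Hedged usage sketch for get_datetime_diffs (illustration only; df_left and
# df_maint are built below): for maintenance events this produces columns m_1..m_4
# (was component i maintained within the last 6 hours?) and dm_1..dm_4 (hours since
# component i was last maintained), e.g.
#   get_datetime_diffs(df_left, df_maint, catvar='comp', prefix='m',
#                      window=6, lagon='datetime', on=3, show_example=False)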
############################### get and preprocess data
df_telemetry, df_errors, df_fails, df_maint, df_machines = download_data()
df_left = df_telemetry.loc[:, ['datetime', 'machineID']] # we set this table aside to join all our results with later
cols_to_average = df_telemetry.columns[-4:]
df_telemetry_rolling_3h = get_rolling_aggregates(df_telemetry, cols_to_average,
suffixes = ['_ma_3', '_sd_3'],
window = 3, on = 3,
groupby = 'machineID', lagon = 'datetime')
df_telemetry_rolling_12h = get_rolling_aggregates(df_telemetry, cols_to_average,
suffixes = ['_ma_12', '_sd_12'],
window = 12, on = 3,
groupby = 'machineID', lagon = 'datetime')
df_telemetry_rolling = pd.concat([df_telemetry_rolling_3h, df_telemetry_rolling_12h.drop(['machineID', 'datetime'], axis=1)], axis=1)
df_telemetry_feat_roll = df_left.merge(df_telemetry_rolling, how = "inner", on = ['machineID', 'datetime'], validate = "one_to_one")
df_telemetry_feat_roll.fillna(method = 'bfill', inplace = True)
# df_telemetry_feat_roll.head()
del df_telemetry_rolling, df_telemetry_rolling_3h, df_telemetry_rolling_12h
df_errors_feat_roll = get_datetime_diffs(df_left, df_errors, catvar = 'errorID', prefix = 'e', window = 6, lagon = 'datetime', on = 3)
# df_errors_feat_roll.tail()
df_errors_feat_roll.loc[df_errors_feat_roll['machineID'] == 2, :].head()
df_maint_feat_roll = get_datetime_diffs(df_left, df_maint, catvar = 'comp', prefix = 'm',
window = 6, lagon = 'datetime', on = 3, show_example = False)
# df_maint_feat_roll.tail()
df_maint_feat_roll.loc[df_maint_feat_roll['machineID'] == 2, :].head()
df_fails_feat_roll = get_datetime_diffs(df_left, df_fails, catvar = 'failure', prefix = 'f',
window = 6, lagon = 'datetime', on = 3, show_example = False)
# df_fails_feat_roll.tail()
assert(df_errors_feat_roll.shape[0] == df_fails_feat_roll.shape[0] == df_maint_feat_roll.shape[0] == df_telemetry_feat_roll.shape[0])
df_all = pd.concat([df_telemetry_feat_roll,
df_errors_feat_roll.drop(columns = ['machineID', 'datetime']),
df_maint_feat_roll.drop(columns = ['machineID', 'datetime']),
df_fails_feat_roll.drop(columns = ['machineID', 'datetime'])], axis = 1, verify_integrity = True)
df_all = pd.merge(left = df_all, right = df_machines, how = "left", on = 'machineID', validate = 'many_to_one')
del df_left, df_telemetry_feat_roll, df_errors_feat_roll, df_fails_feat_roll, df_maint_feat_roll
for i in range(1, 5): # iterate over the four components
# find all the times a component failed for a given machine
df_temp = df_all.loc[df_all['f_' + str(i)] == 1, ['machineID', 'datetime']]
label = 'y_' + str(i) # name of target column (one per component)
df_all[label] = 0
for n in range(df_temp.shape[0]): # iterate over all the failure times
machineID, datetime = df_temp.iloc[n, :]
dt_end = datetime - pd.Timedelta('3 hours') # 3 hours prior to failure
dt_start = datetime - pd.Timedelta('2 days') # n days prior to failure
if n % 500 == 0:
print("a failure occured on machine {0} at {1}, so {2} is set to 1 between {4} and {3}".format(machineID, datetime, label, dt_end, dt_start))
df_all.loc[(df_all['machineID'] == machineID) &
(df_all['datetime'].between(dt_start, dt_end)), label] = 1
############################### run automl experiment
X_drop = ['datetime', 'machineID', 'f_1', 'f_2', 'f_3', 'f_4', 'y_1', 'y_2', 'y_3', 'y_4', 'model']
Y_keep = ['y_1', 'y_2', 'y_3', 'y_4']
X_train = df_all.loc[df_all['datetime'] < '2015-10-01', ].drop(X_drop, axis=1)
y_train = df_all.loc[df_all['datetime'] < '2015-10-01', Y_keep]
X_test = df_all.loc[df_all['datetime'] > '2015-10-15', ].drop(X_drop, axis=1)
y_test = df_all.loc[df_all['datetime'] > '2015-10-15', Y_keep]
primary_metric = 'AUC_weighted'
automl_config = AutoMLConfig(task = 'classification',
preprocess = False,
name = experiment_name,
debug_log = 'automl_errors.log',
primary_metric = primary_metric,
max_time_sec = 1200,
iterations = 2,
n_cross_validations = 2,
verbosity = logging.INFO,
X = X_train.values, # we convert from pandas to numpy arrays using .values
y = y_train.values[:, 0], # we convert from pandas to numpy arrays using .values
path = project_folder, )
local_run = experiment.submit(automl_config, show_output = True)
# Wait until the run finishes.
local_run.wait_for_completion(show_output = True)
# create new AutoMLRun object to ensure everything is in order
ml_run = AutoMLRun(experiment = experiment, run_id = local_run.id)
# aux function for comparing performance of runs (quick workaround for automl's _get_max_min_comparator)
def maximize(x, y):
if x >= y:
return x
else:
return y
# the next couple of lines are a stripped-down version of automl's get_output
children = list(ml_run.get_children())
best_run = None # will be child run with best performance
best_score = None # performance of that child run
for child in children:
candidate_score = child.get_metrics()[primary_metric]
if not np.isnan(candidate_score):
if best_score is None:
best_score = candidate_score
best_run = child
else:
new_score = maximize(best_score, candidate_score)
if new_score != best_score:
best_score = new_score
best_run = child
# print accuracy
best_accuracy = best_run.get_metrics()['accuracy']
print("Best run accuracy:", best_accuracy)
# download model and save to pkl
model_path = "outputs/model.pkl"
best_run.download_file(name = model_path, output_file_path = model_path)
# Writing the run id to /aml_config/run_id.json
run_id = {}
run_id['run_id'] = best_run.id
run_id['experiment_name'] = best_run.experiment.name
# save run info
os.makedirs('aml_config', exist_ok = True)
with open('aml_config/run_id.json', 'w') as outfile:
json.dump(run_id, outfile)
############################### upload run info and model pkl to def_data_store
def_data_store = ws.get_default_datastore()
def_data_store.upload(src_dir = 'aml_config', target_path = 'aml_config', overwrite = True)
def_data_store.upload(src_dir = 'outputs', target_path = 'outputs', overwrite = True)

View file

@@ -0,0 +1,57 @@
import os, json, sys
from azureml.core import Workspace
from azureml.core.image import ContainerImage, Image
from azureml.core.model import Model
# Get workspace
ws = Workspace.from_config()
# Get the latest model details
try:
with open("aml_config/model.json") as f:
config = json.load(f)
except:
    print('No new model to register, thus no need to create a new scoring image')
#raise Exception('No new model to register as production model perform better')
sys.exit(0)
model_name = config['model_name']
model_version = config['model_version']
model_list = Model.list(workspace=ws)
model, = (m for m in model_list if m.version==model_version and m.name==model_name)
print('Model picked: {} \nModel Description: {} \nModel Version: {}'.format(model.name, model.description, model.version))
os.chdir('./devops/code/scoring')
image_name = "predmaintenance-model-score"
image_config = ContainerImage.image_configuration(execution_script = "score.py",
runtime = "python-slim",
conda_file = "conda_dependencies.yml",
description = "Image with predictive maintenance model",
                                                  tags = {'area': "predictive maintenance", 'type': "classification"}
)
image = Image.create(name = image_name,
models = [model],
image_config = image_config,
workspace = ws)
image.wait_for_creation(show_output = True)
os.chdir('../../../')
if image.creation_state != 'Succeeded':
    raise Exception('Image creation status: {}'.format(image.creation_state))
print('{}(v.{} [{}]) stored at {} with build log {}'.format(image.name, image.version, image.creation_state, image.image_location, image.image_build_log_uri))
# Writing the image details to /aml_config/image.json
image_json = {}
image_json['image_name'] = image.name
image_json['image_version'] = image.version
image_json['image_location'] = image.image_location
with open('aml_config/image.json', 'w') as outfile:
json.dump(image_json,outfile)

View file

@@ -0,0 +1,51 @@
import os, json, datetime, sys
from operator import attrgetter
from azureml.core import Workspace
from azureml.core.model import Model
from azureml.core.image import Image
from azureml.core.webservice import Webservice
from azureml.core.webservice import AciWebservice
# Get workspace
ws = Workspace.from_config()
# Get the Image to deploy details
try:
with open("aml_config/image.json") as f:
config = json.load(f)
except:
print('No new model, thus no deployment on ACI')
#raise Exception('No new model to register as production model perform better')
sys.exit(0)
image_name = config['image_name']
image_version = config['image_version']
images = Image.list(workspace=ws)
image, = (m for m in images if m.version==image_version and m.name == image_name)
print('From image.json, Image used to deploy webservice on ACI: {}\nImage Version: {}\nImage Location = {}'.format(image.name, image.version, image.image_location))
aciconfig = AciWebservice.deploy_configuration(cpu_cores=1,
memory_gb=1,
tags={'area': "pred-maintenance", 'type': "automl"},
description='A sample description')
aci_service_name='aciwebservice'+ datetime.datetime.now().strftime('%m%d%H')
service = Webservice.deploy_from_image(deployment_config=aciconfig,
image=image,
name=aci_service_name,
workspace=ws)
service.wait_for_deployment()
print('Deployed ACI Webservice: {} \nWebservice Uri: {}'.format(service.name, service.scoring_uri))
#service=Webservice(name ='aciws0622', workspace =ws)
# Writing the ACI details to /aml_config/aci_webservice.json
aci_webservice = {}
aci_webservice['aci_name'] = service.name
aci_webservice['aci_url'] = service.scoring_uri
with open('aml_config/aci_webservice.json', 'w') as outfile:
json.dump(aci_webservice,outfile)
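A minimal sketch of smoke-testing the ACI endpoint recorded above; the payload shape (a JSON body whose "data" field holds 37-value feature rows) is an assumption based on the scoring script and data test elsewhere in this commit:

import json
import requests

with open('aml_config/aci_webservice.json') as f:
    aci = json.load(f)
payload = json.dumps({"data": [[0.0] * 37]})  # hypothetical all-zero feature row
headers = {'Content-Type': 'application/json'}
resp = requests.post(aci['aci_url'], data = payload, headers = headers)
print(resp.status_code, resp.text)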

View file

@@ -0,0 +1,76 @@
import os, json, datetime, sys
from operator import attrgetter
from azureml.core import Workspace
from azureml.core.model import Model
from azureml.core.image import Image
from azureml.core.compute import AksCompute, ComputeTarget
from azureml.core.webservice import Webservice, AksWebservice
# Get workspace
ws = Workspace.from_config()
# Get the Image to deploy details
try:
with open("aml_config/image.json") as f:
config = json.load(f)
except:
    print('No new model, thus no deployment on AKS')
#raise Exception('No new model to register as production model perform better')
sys.exit(0)
image_name = config['image_name']
image_version = config['image_version']
images = Image.list(workspace=ws)
image, = (m for m in images if m.version==image_version and m.name == image_name)
print('From image.json, Image used to deploy webservice: {}\nImage Version: {}\nImage Location = {}'.format(image.name, image.version, image.image_location))
# Check if AKS already Available
try:
with open("aml_config/aks_webservice.json") as f:
config = json.load(f)
aks_name = config['aks_name']
aks_service_name = config['aks_service_name']
compute_list = ws.compute_targets()
aks_target, =(c for c in compute_list if c.name ==aks_name)
service=Webservice(name =aks_service_name, workspace =ws)
print('Updating AKS service {} with image: {}'.format(aks_service_name,image.image_location))
service.update(image=image)
except:
aks_name = 'aks'+ datetime.datetime.now().strftime('%m%d%H')
aks_service_name = 'akswebservice'+ datetime.datetime.now().strftime('%m%d%H')
prov_config = AksCompute.provisioning_configuration(agent_count = 6, vm_size = 'Standard_F2', location='eastus')
print('No AKS found in aks_webservice.json. Creating new Aks: {} and AKS Webservice: {}'.format(aks_name,aks_service_name))
# Create the cluster
aks_target = ComputeTarget.create(workspace = ws,
name = aks_name,
provisioning_configuration = prov_config)
aks_target.wait_for_completion(show_output = True)
print(aks_target.provisioning_state)
print(aks_target.provisioning_errors)
# Use the default configuration (can also provide parameters to customize)
aks_config = AksWebservice.deploy_configuration(enable_app_insights=True)
service = Webservice.deploy_from_image(workspace = ws,
name = aks_service_name,
image = image,
deployment_config = aks_config,
deployment_target = aks_target)
service.wait_for_deployment(show_output = True)
print(service.state)
print('Deployed AKS Webservice: {} \nWebservice Uri: {}'.format(service.name, service.scoring_uri))
# Writing the AKS details to /aml_config/aks_webservice.json
aks_webservice = {}
aks_webservice['aks_name'] = aks_name
aks_webservice['aks_service_name'] = service.name
aks_webservice['aks_url'] = service.scoring_uri
aks_webservice['aks_keys'] = service.get_keys()
with open('aml_config/aks_webservice.json', 'w') as outfile:
json.dump(aks_webservice,outfile)
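A minimal sketch (same assumed payload shape as the ACI example) of calling the AKS endpoint, which unlike ACI requires an Authorization header built from one of the keys saved above:

import json
import requests

with open('aml_config/aks_webservice.json') as f:
    aks = json.load(f)
headers = {'Content-Type': 'application/json',
           'Authorization': 'Bearer ' + aks['aks_keys'][0]}  # primary key
payload = json.dumps({"data": [[0.0] * 37]})  # hypothetical all-zero feature row
resp = requests.post(aks['aks_url'], data = payload, headers = headers)
print(resp.status_code, resp.text)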

View file

@@ -0,0 +1,56 @@
import os, json
import azureml.core
from azureml.core import Workspace
from azureml.core import Experiment
from azureml.core import Run
from azureml.core.model import Model
# Get workspace
ws = Workspace.from_config()
# Parameterize the metrics on which the models should be compared
# Add a golden data set on which all model performance can be evaluated
# Get the latest run_id
with open("aml_config/run_id.json") as f:
config = json.load(f)
new_model_run_id = config["run_id"]
experiment_name = config["experiment_name"]
exp = Experiment(workspace = ws, name = experiment_name)
try:
    # Get the most recently registered model; we assume that is the model in production. Download this model and compare it with the recently trained model by running a test on the same data set.
model_list = Model.list(ws)
production_model = next(filter(lambda x: x.created_time == max(model.created_time for model in model_list), model_list))
production_model_run_id = production_model.tags.get('run_id')
run_list = exp.get_runs()
# production_model_run = next(filter(lambda x: x.id == production_model_run_id, run_list))
    # Get the run history for both production model and newly trained model and compare accuracy
production_model_run = Run(exp,run_id=production_model_run_id)
new_model_run = Run(exp,run_id=new_model_run_id)
production_model_metric = production_model_run.get_metrics().get('accuracy')
new_model_metric = new_model_run.get_metrics().get('accuracy')
print('Current Production model accuracy: {}, New trained model accuracy: {}'.format(production_model_metric, new_model_metric))
promote_new_model=False
    if new_model_metric > production_model_metric:
promote_new_model = True
print('New trained model performs better, thus it will be registered')
except:
promote_new_model = True
print('This is the first model to be trained, thus nothing to evaluate for now')
run_id = {}
run_id['run_id'] = ''
# Writing the run id to /aml_config/run_id.json
if promote_new_model:
run_id['run_id'] = new_model_run_id
run_id['experiment_name'] = experiment_name
with open('aml_config/run_id.json', 'w') as outfile:
json.dump(run_id,outfile)
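A minimal sketch (with hypothetical accuracy values) of the promotion rule this script implements: the new model is promoted when its accuracy beats production, or unconditionally when no production model exists yet:

def should_promote(new_acc, prod_acc):
    # prod_acc is None when no model has been registered yet
    return prod_acc is None or new_acc > prod_acc

assert should_promote(0.91, 0.88)      # better model gets promoted
assert not should_promote(0.85, 0.88)  # worse model gets rejected
assert should_promote(0.85, None)      # first model is always promoted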

View file

@@ -0,0 +1,173 @@
import os
import urllib.request
import numpy as np
import pandas as pd
from pyculiarity import detect_ts

def download_data():
"""
download the anomaly detection and predictive maintenance data
:return: all the data
"""
os.makedirs('../data', exist_ok = True)
container = 'https://coursematerial.blob.core.windows.net/data/telemetry/'
urllib.request.urlretrieve(container + 'telemetry.csv', filename = '../data/telemetry.csv')
urllib.request.urlretrieve(container + 'maintenance.csv', filename = '../data/maintenance.csv')
urllib.request.urlretrieve(container + 'machines.csv', filename = '../data/machines.csv')
urllib.request.urlretrieve(container + 'failures.csv', filename = '../data/failures.csv')
# we replace errors.csv with anoms.csv (results from running anomaly detection)
# urllib.request.urlretrieve(container + 'errors.csv', filename = '../data/errors.csv')
urllib.request.urlretrieve(container + 'anoms.csv', filename = '../data/anoms.csv')
df_telemetry = pd.read_csv('../data/telemetry.csv', header = 0)
df_errors = pd.read_csv('../data/anoms.csv', header = 0)
df_fails = pd.read_csv('../data/failures.csv', header = 0)
df_maint = pd.read_csv('../data/maintenance.csv', header = 0)
df_machines = pd.read_csv('../data/machines.csv', header = 0)
df_telemetry['datetime'] = pd.to_datetime(df_telemetry['datetime'], format = "%m/%d/%Y %I:%M:%S %p")
df_errors['datetime'] = pd.to_datetime(df_errors['datetime'])
rep_dir = {"volt":"error1", "rotate":"error2", "pressure":"error3", "vibration":"error4"}
df_errors = df_errors.replace({"errorID": rep_dir})
df_errors['errorID'] = df_errors['errorID'].apply(lambda x: int(x[-1]))
df_fails['datetime'] = pd.to_datetime(df_fails['datetime'], format = "%m/%d/%Y %I:%M:%S %p")
df_fails['failure'] = df_fails['failure'].apply(lambda x: int(x[-1]))
df_maint['datetime'] = pd.to_datetime(df_maint['datetime'], format = "%m/%d/%Y %I:%M:%S %p")
df_maint['comp'] = df_maint['comp'].apply(lambda x: int(x[-1]))
return df_telemetry, df_errors, df_fails, df_maint, df_machines
def get_rolling_aggregates(df, colnames, suffixes, window, on, groupby, lagon = None):
"""
calculates rolling averages and standard deviations
:param df: dataframe to run it on
:param colnames: names of columns we want rolling statistics for
:param suffixes: suffixes attached to the new columns (provide a list with strings)
:param window: the lag over which rolling statistics are calculated
:param on: the interval at which rolling statistics are calculated
:param groupby: the column used to group results by
:param lagon: the name of the datetime column used to compute lags (if none specified it defaults to row number)
:return: a dataframe with rolling statistics over a specified lag calculated over a specified interval
"""
rolling_colnames = [c + suffixes[0] for c in colnames]
df_rolling_mean = df.groupby(groupby).rolling(window=window, on=lagon)[colnames].mean()
df_rolling_mean.columns = rolling_colnames
df_rolling_mean.reset_index(inplace=True)
rolling_colnames = [c + suffixes[1] for c in colnames]
df_rolling_sd = df.groupby(groupby).rolling(window=window, on=lagon)[colnames].var()
df_rolling_sd.columns = rolling_colnames
df_rolling_sd = df_rolling_sd.apply(np.sqrt)
df_rolling_sd.reset_index(inplace=True, drop=True)
df_res = pd.concat([df_rolling_mean, df_rolling_sd], axis=1)
df_res = df_res.loc[df_res.index % on == on-1]
return df_res
def get_datetime_diffs(df_left, df_right, catvar, prefix, window, on, lagon = None, diff_type = 'timedelta64[h]', validate = 'one_to_one', show_example = True):
"""
    calculates for every timestamp the time elapsed since the last time an event occurred, where an event is either an error or anomaly, maintenance, or failure
:param df_left: the telemetry data collected at regular intervals
:param df_right: the event data collected at irregular intervals
:param catvar: the name of the categorical column that encodes the event
:param prefix: the prefix for the new column showing time elapsed
:param window: window size for detecting event
:param on: frequency we want the results to be in
:param lagon: the name of the datetime column used to compute lags (if none specified it defaults to row number)
:param diff_type: the unit we want time differences to be measured in (hour by default)
:param validate: whether we should validate results
:param show_example: whether we should show an example to check that things are working
:return: a dataframe with rolling statistics over a specified lag calculated over a specified interval
"""
keys = ['machineID', 'datetime']
df_dummies = pd.get_dummies(df_right[catvar], prefix=prefix)
df_wide = pd.concat([df_right.loc[:, keys], df_dummies], axis=1)
df_wide = df_wide.groupby(keys).sum().reset_index()
df = df_left.merge(df_wide, how="left", on=keys, validate = validate).fillna(0)
# run a rolling window through event flags to aggregate data
dummy_col_names = df_dummies.columns
df = df.groupby('machineID').rolling(window=window, on=lagon)[dummy_col_names].max()
df.reset_index(inplace=True)
df = df.loc[df.index % on == on-1]
df.reset_index(inplace=True, drop=True)
df_first = df.groupby('machineID', as_index=False).nth(0)
# calculate the time of the last event and the time elapsed since
for col in dummy_col_names:
whenlast, diffcol = 'last_' + col, 'd' + col
        df[col] = df[col].fillna(value = 0)
# let's assume an event happened in row 0, so we don't have missing values for the time elapsed
df.iloc[df_first.index, df.columns.get_loc(col)] = 1
df.loc[df[col] == 1, whenlast] = df.loc[df[col] == 1, 'datetime']
# for the first occurence we don't know when it last happened, so we assume it happened then
df.iloc[df_first.index, df.columns.get_loc(whenlast)] = df.iloc[df_first.index, df.columns.get_loc('datetime')]
        df[whenlast] = df[whenlast].fillna(method='ffill')
# df.loc[df[whenlast] > df['datetime'], whenlast] = np.nan
df.loc[df[whenlast] <= df['datetime'], diffcol] = (df['datetime'] - df[whenlast]).astype(diff_type)
df.drop(columns = whenlast, inplace=True)
    if show_example:
col = np.random.choice(dummy_col_names, size = 1)[0]
idx = np.random.choice(df.loc[df[col] == 1, :].index.tolist(), size = 1)[0]
print('Example:\n')
print(df.loc[df.index.isin(range(idx-3, idx+5)), ['datetime', col, 'd' + col]])
return df
def rolling_average(df, column, n = 24):
"""
Calculates rolling average according to Welford's online algorithm (Donald Knuth's Art of Computer Programming, Vol 2, page 232, 3rd edition).
https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Welford's_Online_algorithm
    This returns a new two-column dataframe with the smoothed values
    :param df: a dataframe with time series in columns
    :param column: name of the column of interest
    :param n: number of measurements to consider
    :return: a dataframe with a 'datetime' column and the rolling average in a 'value' column
"""
ra = [0] * df.shape[0]
ra[0] = df[column].values[0]
for r in range(1, df.shape[0]):
curr_n = float(min(n, r))
ra[r] = ra[r-1] + (df[column].values[r] - ra[r-1])/curr_n
df = pd.DataFrame(data = {'datetime': df['datetime'], 'value': ra})
return df
def do_ad(df, alpha = 0.005, max_anoms = 0.1, only_last = None, longterm = False, e_value = False, direction = 'both'):
"""
    This method performs the actual anomaly detection, expecting a dataframe
    with a timestamp column and one value column, as produced by rolling_average
    :param df: a dataframe with a timestamp column and one column with telemetry data
    :param alpha: see the pyculiarity documentation for the meaning of these parameters
    :param max_anoms:
    :param only_last:
    :param longterm:
    :param e_value:
    :param direction:
    :return: a numpy array with the timestamps of the detected anomalies
"""
results = detect_ts(df,
max_anoms = max_anoms,
alpha = alpha,
direction = direction,
e_value = e_value,
longterm = longterm,
only_last = only_last)
return results['anoms']['timestamp'].values
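A minimal sketch of chaining these helpers for one machine's voltage readings, assuming download_data above has been run and that pyculiarity accepts the two-column timestamp/value frame produced by rolling_average:

df_telemetry, df_errors, df_fails, df_maint, df_machines = download_data()
df_volt = df_telemetry.loc[df_telemetry['machineID'] == 1, ['datetime', 'volt']]
df_smooth = rolling_average(df_volt, 'volt', n = 24)  # smooth before detecting anomalies
anoms = do_ad(df_smooth)                              # timestamps flagged as anomalous
print('{} anomalies found for machine 1'.format(len(anoms)))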

View file

@@ -0,0 +1,148 @@
############################### load required libraries
import os
import pandas as pd
import json
import azureml.core
print("SDK Version:", azureml.core.VERSION)
from azureml.core import Workspace, Run, Experiment, Datastore
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.runconfig import CondaDependencies, RunConfiguration
from azureml.core.runconfig import DEFAULT_CPU_IMAGE
from azureml.telemetry import set_diagnostics_collection
from azureml.pipeline.steps import PythonScriptStep
from azureml.pipeline.core import Pipeline, PipelineData, StepSequence
############################### load workspace and create experiment
ws = Workspace.from_config()
print('Workspace name: ' + ws.name,
'Subscription id: ' + ws.subscription_id,
'Resource group: ' + ws.resource_group, sep = '\n')
experiment_name = 'aml-pipeline-cicd' # choose a name for experiment
project_folder = '.' # project folder
experiment = Experiment(ws, experiment_name)
output = {}
output['SDK version'] = azureml.core.VERSION
output['Subscription ID'] = ws.subscription_id
output['Workspace'] = ws.name
output['Resource Group'] = ws.resource_group
output['Location'] = ws.location
output['Project Directory'] = project_folder
output['Experiment Name'] = experiment.name
pd.set_option('display.max_colwidth', -1)
print(pd.DataFrame(data = output, index = ['']).T)
set_diagnostics_collection(send_diagnostics = True)
############################### create a run config
cd = CondaDependencies.create(pip_packages = ["azureml-sdk==1.0.17", "azureml-train-automl==1.0.17", "pyculiarity", "pytictoc", "cryptography==2.5", "pandas"])
amlcompute_run_config = RunConfiguration(framework = "python", conda_dependencies = cd)
amlcompute_run_config.environment.docker.enabled = False
amlcompute_run_config.environment.docker.gpu_support = False
amlcompute_run_config.environment.docker.base_image = DEFAULT_CPU_IMAGE
amlcompute_run_config.environment.spark.precache_packages = False
############################### create AML compute
aml_compute_target = "aml-compute"
try:
aml_compute = AmlCompute(ws, aml_compute_target)
print("found existing compute target.")
except:
print("creating new compute target")
provisioning_config = AmlCompute.provisioning_configuration(vm_size = "STANDARD_D2_V2",
idle_seconds_before_scaledown=1800,
min_nodes = 0,
max_nodes = 4)
aml_compute = ComputeTarget.create(ws, aml_compute_target, provisioning_config)
aml_compute.wait_for_completion(show_output = True, min_node_count = None, timeout_in_minutes = 20)
print("Azure Machine Learning Compute attached")
############################### point to data and scripts
# we use this for exchanging data between pipeline steps
def_data_store = ws.get_default_datastore()
# get pointer to default blob store
def_blob_store = Datastore(ws, "workspaceblobstore")
print("Blobstore's name: {}".format(def_blob_store.name))
# Naming the intermediate data as anomaly data and assigning it to a variable
anomaly_data = PipelineData("anomaly_data", datastore = def_blob_store)
print("Anomaly data object created")
# model = PipelineData("model", datastore = def_data_store)
# print("Model data object created")
anom_detect = PythonScriptStep(name = "anomaly_detection",
# script_name="anom_detect.py",
script_name = "CICD/code/anom_detect.py",
arguments = ["--output_directory", anomaly_data],
outputs = [anomaly_data],
compute_target = aml_compute,
source_directory = project_folder,
allow_reuse = True,
runconfig = amlcompute_run_config)
print("Anomaly Detection Step created.")
automl_step = PythonScriptStep(name = "automl_step",
# script_name = "automl_step.py",
script_name = "CICD/code/automl_step.py",
arguments = ["--input_directory", anomaly_data],
inputs = [anomaly_data],
# outputs = [model],
compute_target = aml_compute,
source_directory = project_folder,
allow_reuse = True,
runconfig = amlcompute_run_config)
print("AutoML Training Step created.")
############################### set up, validate and run pipeline
steps = [anom_detect, automl_step]
print("Step lists created")
pipeline = Pipeline(workspace = ws, steps = steps)
print ("Pipeline is built")
pipeline.validate()
print("Pipeline validation complete")
pipeline_run = experiment.submit(pipeline) #, regenerate_outputs=True)
print("Pipeline is submitted for execution")
# Wait until the run finishes.
pipeline_run.wait_for_completion(show_output = False)
print("Pipeline run completed")
############################### upload artifacts to AML Workspace
# Download aml_config info and output of automl_step
def_data_store.download(target_path = '.',
prefix = 'aml_config',
show_progress = True,
overwrite = True)
def_data_store.download(target_path = '.',
prefix = 'outputs',
show_progress = True,
overwrite = True)
print("Updated aml_config and outputs folder")
model_fname = 'model.pkl'
model_path = os.path.join("outputs", model_fname)
# Upload the model file explicitly into artifacts (for CI/CD)
pipeline_run.upload_file(name = model_path, path_or_stream = model_path)
print('Uploaded the model {} to experiment {}'.format(model_fname, pipeline_run.experiment.name))
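A minimal sketch (the name and description are hypothetical) of how this pipeline could additionally be published, so the release pipeline can trigger it through a REST endpoint instead of resubmitting it from a build agent:

published = pipeline.publish(name = "anomaly-automl-pipeline",
                             description = "anomaly detection followed by AutoML training")
print("Published pipeline endpoint:", published.endpoint)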

View file

@@ -0,0 +1,56 @@
import os, json,sys
from azureml.core import Workspace
from azureml.core import Run
from azureml.core import Experiment
from azureml.core.model import Model
from azureml.core.runconfig import RunConfiguration
# Get workspace
ws = Workspace.from_config()
# Get the latest evaluation result
try:
with open("aml_config/run_id.json") as f:
config = json.load(f)
if not config["run_id"]:
raise Exception('No new model to register as production model perform better')
except:
print('No new model to register as production model perform better')
#raise Exception('No new model to register as production model perform better')
sys.exit(0)
run_id = config["run_id"]
experiment_name = config["experiment_name"]
exp = Experiment(workspace = ws, name = experiment_name)
run = Run(experiment = exp, run_id = run_id)
print('Files in run: {}'.format(run.get_file_names()))
print('Run ID for last run: {}'.format(run_id))
model_local_dir = "model"
os.makedirs(model_local_dir, exist_ok = True)
# Download Model to Project root directory
model_name = 'model.pkl'
run.download_file(name = './outputs/' + model_name,
output_file_path = './model/' + model_name)
print('Downloaded model {} to Project root directory'.format(model_name))
os.chdir('./model')
model = Model.register(model_path = model_name, # this points to a local file
model_name = model_name, # this is the name the model is registered as
tags = {'area': "predictive maintenance", 'type': "automl", 'run_id' : run_id},
description = "Model for predictive maintenance dataset",
workspace = ws)
os.chdir('..')
print('Model registered: {} \nModel Description: {} \nModel Version: {}'.format(model.name, model.description, model.version))
# Remove the evaluate.json as we no longer need it
# os.remove("aml_config/evaluate.json")
# Writing the registered model details to /aml_config/model.json
model_json = {}
model_json['model_name'] = model.name
model_json['model_version'] = model.version
model_json['run_id'] = run_id
with open('aml_config/model.json', 'w') as outfile:
json.dump(model_json,outfile)
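A minimal sketch of how a later step can look this model up again through the run_id tag attached at registration (Model.list accepts tag filters):

matches = Model.list(ws, name = model_json['model_name'],
                     tags = [['run_id', model_json['run_id']]])
print('Found {} matching registered model(s)'.format(len(matches)))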

View file

@@ -0,0 +1,13 @@
name: myenv
channels:
- defaults
dependencies:
- python=3.6.2
- pip:
- scikit-learn==0.19.1
- azureml-sdk[automl]
- azureml-monitoring
- pyculiarity
- scipy
- numpy
- pandas

View file

@@ -0,0 +1,301 @@
import datetime
import pandas as pd
from pyculiarity import detect_ts
import os
import pickle
import json
from sklearn.externals import joblib
from azureml.core.model import Model
import azureml.train.automl
from azureml.monitoring import ModelDataCollector
import time
import glob
import numpy as np
import scipy
def create_data_dict(data, sensors):
    """
    Creates one dict entry per column, plus a running average ('_avg') and an
    anomaly flag ('_an') entry for every sensor column
    :param data: a one-row dataframe with the latest sensor readings
    :param sensors: list of sensor column names
    :return: a dictionary that can be appended to the per-machine dataframe
    """
data_dict = {}
for column in data.columns:
data_dict[column] = [data[column].values[0]]
if column in sensors:
data_dict[column + '_avg'] = [0.0]
data_dict[column + '_an'] = [False]
return data_dict
def init_df():
    """
    Initializes an empty DataFrame to which streamed readings are appended
    :return: an empty pd.DataFrame
    """
    df = pd.DataFrame()
    return df
def append_data(df, data, sensors):
"""
We either add the data and the results (res_dict) of the anomaly detection to the existing data frame,
or create a new one if the data frame is empty
"""
data_dict = create_data_dict(data, sensors)
    # TODO: this is only necessary because currently the webservice doesn't receive timestamps
    if df.shape[0] == 0:
        prv_timestamp = datetime.datetime(2015, 1, 1, 5, 0) # so the first simulated reading lands on 1/1/2015 6:00:00 AM
else:
prv_timestamp = df['timestamp'].max()
data_dict['timestamp'] = [prv_timestamp + datetime.timedelta(hours=1)]
df = df.append(pd.DataFrame(data=data_dict, index=data_dict['timestamp']))
return df
def generate_stream(telemetry, n=None):
"""
n is the number of sensor readings we are simulating
"""
if not n:
n = telemetry.shape[0]
machine_ids = [1] # telemetry['machineID'].unique()
timestamps = telemetry['timestamp'].unique()
# sort test_data by timestamp
# on every iteration, shuffle machine IDs
# then loop over machine IDs
#t = TicToc()
for timestamp in timestamps:
#t.tic()
np.random.shuffle(machine_ids)
for machine_id in machine_ids:
data = telemetry.loc[(telemetry['timestamp'] == timestamp) & (telemetry['machineID'] == machine_id), :]
run(data)
#t.toc("Processing all machines took")
def load_df(data):
machineID = data['machineID'].values[0]
filename = os.path.join(storage_location, "data_w_anoms_ID_%03d.csv" % machineID)
if os.path.exists(filename):
df = pd.read_csv(filename)
df['timestamp'] = pd.to_datetime(df['timestamp'], format="%Y-%m-%d %H:%M:%S")
else:
df = pd.DataFrame()
return df
def save_df(df):
    """
    Persists the per-machine dataframe so state survives between scoring calls
    :param df: the dataframe with readings and anomaly flags for one machine
    :return: None
    """
    machine_id = df['machineID'].iloc[0]
filename = os.path.join(storage_location, "data_w_anoms_ID_%03d.csv" % machine_id)
df.to_csv(filename, index=False)
def running_avgs(df, sensors, window_size=24, only_copy=False):
    """
    Calculates running averages according to Welford's online algorithm.
    https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Online
    This updates the '_avg' column next to each sensor column in place
    :param df: a dataframe with time series in columns
    :param sensors: names of the sensor columns of interest
    :param window_size: number of measurements to consider
    :param only_copy: if True, copy the raw value instead of updating the average
    :return: None
    """
curr_n = df.shape[0]
row_index = curr_n - 1
window_size = min(window_size, curr_n)
for sensor in sensors:
val_col_index = df.columns.get_loc(sensor)
avg_col_index = df.columns.get_loc(sensor + "_avg")
        curr_value = df.iloc[row_index, val_col_index]
        if curr_n == 0 or only_copy:
            df.iloc[row_index, avg_col_index] = curr_value
        else:
            prv_avg = df.iloc[row_index - 1, avg_col_index]
            df.iloc[row_index, avg_col_index] = prv_avg + (curr_value - prv_avg) / window_size
def init():
global model
global prediction_dc
global storage_location
storage_location = "/tmp/output"
if not os.path.exists(storage_location):
os.makedirs(storage_location)
# next, we delete previous output files
files = glob.glob(os.path.join(storage_location,'*'))
for f in files:
os.remove(f)
model_name = "model.pkl"
model_path = Model.get_model_path(model_name = model_name)
# deserialize the model file back into a sklearn model
model = joblib.load(model_path)
prediction_dc = ModelDataCollector("automl_model", identifier="predictions", feature_names=["prediction"])
def run(rawdata, window=14 * 24):
    """
    :param rawdata: JSON string with a 'data' field holding the feature rows
    :param window: sliding window size (in hours) used for anomaly detection
    :return: JSON string with the model prediction, or an error message
    """
try:
# set some parameters for the AD algorithm
alpha = 0.1
max_anoms = 0.05
        only_last = None # alternatively, we can set this to 'hr' or 'day'
json_data = json.loads(rawdata)['data']
# this is the beginning of anomaly detection code
# TODO: the anomaly detection service expected one row of a pd.DataFrame w/ a timestamp and machine id, but here we only get a list of values
# we therefore create a time stamp ourselves
# and create a data frame that the anomaly detection code can understand
# eventually, we want this to be harmonized!
timestamp = time.strftime("%m/%d/%Y %H:%M:%S", time.localtime())
machineID = 1 # TODO scipy.random.choice(100)
telemetry_data = json_data[0][8:16:2]
sensors = ['volt','pressure','vibration', 'rotate']
data_dict = {}
data_dict['timestamp'] = [timestamp]
data_dict['machineID'] = [machineID]
for i in range(0,4):
data_dict[sensors[i]] = [telemetry_data[i]]
telemetry_df = pd.DataFrame(data=data_dict)
telemetry_df['timestamp'] = pd.to_datetime(telemetry_df['timestamp'])
# load dataframe
df = load_df(telemetry_df)
# add current sensor readings to data frame, also adds fields for anomaly detection results
df = append_data(df, telemetry_df, sensors)
# # calculate running averages (no need to do this here, because we are already sending preprocessed data)
# # TODO: this is disabled for now, because we are dealing with pre-processed data
# running_avgs(df, sensors, only_copy=True)
# note timestamp so that we can update the correct row of the dataframe later
timestamp = df['timestamp'].max()
# we get a copy of the current (also last) row of the dataframe
current_row = df.loc[df['timestamp'] == timestamp, :]
# determine how many sensor readings we already have
rows = df.shape[0]
        # if the data frame doesn't have enough rows for our sliding window size, we just return
        # (indicating that we found no anomalies)
if rows < window:
save_df(df)
json_data = current_row.to_json()
return json.dumps({"result": [0]})
# determine the first row of the data frame that falls into the sliding window
start_row = rows - window
# a flag to indicate whether we detected an anomaly in any of the sensors after this reading
detected_an_anomaly = False
anom_list = []
# we loop over the sensor columns
for column in sensors:
            df_s = df.iloc[start_row:rows][['timestamp', column + "_avg"]]
# pyculiarity expects two columns with particular names
df_s.columns = ['timestamp', 'value']
# we reset the timestamps, so that the current measurement is the last within the sliding time window
# df_s = reset_time(df_s)
# calculate the median value within each time sliding window
# values = df_s.groupby(df_s.index.date)['value'].median()
# create dataframe with median values etc.
# df_agg = pd.DataFrame(data={'timestamp': pd.to_datetime(values.index), 'value': values})
# find anomalies
results = detect_ts(df_s, max_anoms=max_anoms,
alpha=alpha,
direction='both',
e_value=False,
only_last=only_last)
# create a data frame where we mark for each day whether it was an anomaly
df_s = df_s.merge(results['anoms'], on='timestamp', how='left')
            # mark the current sensor reading as an anomaly. Specifically, if we get an anomaly in the sliding window
            # leading up to (and including) the current sensor reading, we mark the current sensor reading as an anomaly. Note,
            # alternatively one could mark all the sensor readings that fall within the sliding window as anomalies.
            # However, we prefer our approach, because without the current sensor reading the other sensor readings in
            # this sliding window may not have been an anomaly
# current_row[column + '_an'] = not np.isnan(df_agg.tail(1)['anoms'].iloc[0])
if not np.isnan(df_s.tail(1)['anoms'].iloc[0]):
                current_row.iloc[0, current_row.columns.get_loc(column + '_an')] = True
detected_an_anomaly = True
anom_list.append(1.0)
else:
anom_list.append(0.0)
# It's only necessary to update the current row in the data frame, if we detected an anomaly
if detected_an_anomaly:
df.loc[df['timestamp'] == timestamp, :] = current_row
save_df(df)
json_data[0][8:16:2] = anom_list
# # this is the end of anomaly detection code
data = np.array(json_data)
result = model.predict(data)
prediction_dc.collect(result)
print ("saving prediction data" + time.strftime("%H:%M:%S"))
except Exception as e:
result = str(e)
return json.dumps({"error": result})
return json.dumps({"result":result.tolist()})

View file

@@ -0,0 +1,32 @@
# test integrity of the input data
import sys
import os
import numpy as np
import pandas as pd
# number of features
n_columns = 37
def check_schema(X):
n_actual_columns = X.shape[1]
if n_actual_columns != n_columns:
print("Error: found {} feature columns. The data should have {} feature columns.".format(n_actual_columns, n_columns))
return False
return True
def main():
filename = sys.argv[1]
if not os.path.exists(filename):
print("Error: The file {} does not exist".format(filename))
return
dataset = pd.read_csv(filename)
if check_schema(dataset[dataset.columns[:-1]]):
print("Data schema test succeeded")
else:
print("Data schema test failed")
if __name__ == "__main__":
main()
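A usage note: the script expects the path to a CSV file as its only command-line argument and prints whether the schema test succeeded or failed, so a CI build step can run it against incoming data and gate on its output.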

View file

@@ -0,0 +1,6 @@
{
"subscription_id": ".......",
"resource_group": ".......",
"workspace_name": ".......",
"workspace_region": "......."
}
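For reference, Workspace.from_config(), used throughout the scripts above, reads a filled-in copy of this file (written to aml_config/config.json by this version of the SDK) to locate the workspace.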

View file

@@ -0,0 +1,2 @@
1.62168882e+02, 4.82427351e+02, 1.09748253e+02, 4.32529303e+01, 3.52377597e+01, 4.37307613e+01, 1.15729573e+01, 4.27624778e+00, 1.68042813e+02, 4.61654301e+02, 1.03138200e+02, 4.08555785e+01, 1.80809993e+01, 4.85402042e+01, 1.09373285e+01, 4.18269355e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 3.07200000e+03, 5.64000000e+02, 2.22900000e+03, 9.84000000e+02, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 3.03000000e+02, 6.63000000e+02, 3.18300000e+03, 3.03000000e+02, 5.34300000e+03, 4.26300000e+03, 6.88200000e+03, 1.02300000e+03, 1.80000000e+01

View file

@@ -0,0 +1,33 @@
datetime,machineID,volt,rotate,pressure,vibration
8/9/2015 5:00:00 AM,89,156.022596809483,499.186773543787,94.6935081356238,47.9299454212229
8/9/2015 6:00:00 AM,89,189.338289348546,481.699267606406,115.672119136436,33.2653137178549
8/9/2015 7:00:00 AM,89,161.157427445239,428.727765607777,98.9656750339584,41.4441610087944
8/9/2015 8:00:00 AM,89,161.502408348608,453.436246296314,103.372133704982,39.2582621215621
8/9/2015 9:00:00 AM,89,162.711527502725,474.580253904076,106.913159242991,38.3757632773898
8/9/2015 10:00:00 AM,89,166.443990032135,431.671589345972,103.886570207659,39.9456973566939
8/9/2015 11:00:00 AM,89,178.877688597966,375.234725956535,84.5290039772805,33.7262250941115
8/9/2015 12:00:00 PM,89,152.710305382723,528.240939612068,117.482500743972,42.2221279079703
8/9/2015 1:00:00 PM,89,178.297689547913,439.9008196176,86.5297382410314,41.8368689980528
8/9/2015 2:00:00 PM,89,178.449967910224,461.03136701172,98.3063510756247,43.8636647313615
8/9/2015 3:00:00 PM,89,173.639372721488,450.261022137965,112.418628429993,37.4570141147091
8/9/2015 4:00:00 PM,89,153.189370579857,352.018187762502,98.9312630483397,29.0874981562648
8/9/2015 5:00:00 PM,89,199.945957715423,421.809350524228,91.2802844059766,39.2105928790095
8/9/2015 6:00:00 PM,89,166.408336082299,466.800808863573,117.552067959932,43.5182645195745
8/9/2015 7:00:00 PM,89,167.376369450821,522.687921580833,95.3470267846314,38.6090811803213
8/9/2015 8:00:00 PM,89,138.101762905172,431.361050254412,94.9800280124435,36.9840404537683
8/9/2015 9:00:00 PM,89,149.29819536088,488.138545310211,110.176286869331,44.7414170785692
8/9/2015 10:00:00 PM,89,169.349342103404,420.718947026669,90.2096031418544,44.4246012177021
8/9/2015 11:00:00 PM,89,157.884780585546,374.46347545389,105.531747266353,37.4048802607342
8/10/2015 12:00:00 AM,89,174.92102638734,393.681456167383,94.2123283383687,38.4380787184679
8/10/2015 1:00:00 AM,89,173.859334040321,474.998720872934,114.831449991881,26.9997142587449
8/10/2015 2:00:00 AM,89,147.507135631244,434.592467073247,109.14774266869,38.0553522602426
8/10/2015 3:00:00 AM,89,182.508384464887,475.127724817095,88.7916931828417,36.9715744818552
8/10/2015 4:00:00 AM,89,196.365633856682,392.765937285152,72.888759644257,45.5800850607391
8/10/2015 5:00:00 AM,89,188.078669648455,407.441122417009,97.5390134742126,34.3381545913848
8/10/2015 6:00:00 AM,89,146.16908298412,526.089383569558,97.3899974138672,33.3395802812222
8/10/2015 7:00:00 AM,89,180.966858727778,539.805309324902,98.3638631679925,39.6400150497035
8/10/2015 8:00:00 AM,89,173.114942080223,475.993018274367,94.3389221073905,44.808235501154
8/10/2015 9:00:00 AM,89,165.710025826903,541.353097455816,97.6247228539178,35.4394473823794
8/10/2015 10:00:00 AM,89,203.685406201796,473.07005430003,76.5413087938538,45.7219151497101
8/10/2015 11:00:00 AM,89,193.493560935218,441.351844215338,89.5960554969496,40.9367542256887
8/10/2015 12:00:00 PM,89,177.320149053588,285.642227983577,87.9600045132346,45.7532914751573