Clean up before delivery
|
@ -3,4 +3,6 @@ config.json
|
|||
train.csv
|
||||
test.csv
|
||||
sample_submission.csv
|
||||
mnt_blob_rw.ipynb
|
||||
data/*
|
||||
.ipynb_checkpoints
|
|
@ -0,0 +1,497 @@
|
|||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Azure DevOps\n",
|
||||
"\n",
|
||||
"With Azure DevOps data scientists and application developers can work together to create and maintain AI-infused applications. Using a DevOps mindset is not new to software developers, who are used to running applications in production. However, data scientists in the past have often worked in silos and not followed best practices to facilitate the transition from development to production. With Azure DevOps data scientists can now develop with an eye toward production."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Part 1: Getting started\n",
|
||||
"\n",
|
||||
"This lab allows you to perform setup for building a **Continuous Integration/Continuous Deployment** pipeline related to Anomoly Detection and Predictive Maintenance.\n",
|
||||
"\n",
|
||||
"### Pre-requisites\n",
|
||||
"\n",
|
||||
"- Azure account\n",
|
||||
"- Azure DevOps account\n",
|
||||
"- Azure Machine Learning Service Workspace\n",
|
||||
"- Basic knowledge of Python\n",
|
||||
"\n",
|
||||
"After you launch your environment, follow the below steps:\n",
|
||||
"\n",
|
||||
"### Azure Machine Learning Service Workspace\n",
|
||||
"\n",
|
||||
"We will begin the lab by creating a new Machine Learning Service Workspace using Azure portal:\n",
|
||||
"\n",
|
||||
"1. Login to Azure portal using the credentials provided with the environment.\n",
|
||||
"\n",
|
||||
"2. Select **Create a Resource** and search the marketplace for **Machine Learning Service Workspace**.\n",
|
||||
"\n",
|
||||
"![Market Place](../images/marketplace.png)\n",
|
||||
"\n",
|
||||
"3. Select **Machine Learning Service Workspace** followed by **Create**:\n",
|
||||
"\n",
|
||||
"![Create Workspace](../images/createWorkspace.png)\n",
|
||||
"\n",
|
||||
"4. Populate the mandatory fields (Workspace name, Subscription, Resource group and Location):\n",
|
||||
"\n",
|
||||
"![Workspace Fields](../images/workspaceFields.png)\n",
|
||||
"\n",
|
||||
"### Sign in to Azure DevOps\n",
|
||||
"\n",
|
||||
"Go to **https://dev.azure.com** and login using your Azure username and password. You will be asked to provide a name and email. An organization is created for you based on the name you provide. Within the organization, you will be asked to create a project. Name your project \"ADPM\" and click on **Create project**. With private projects, only people you give access to will be able to view this project. After logging in, you should see the below:\n",
|
||||
"\n",
|
||||
"![Get Started](../images/getStarted.png)"
|
||||
]
|
||||
},
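{
"cell_type": "markdown",
"metadata": {},
"source": [
"> As an alternative to the portal, the workspace can also be created from the Python SDK. Below is a minimal sketch; the workspace name, resource group and region are placeholders you would replace with your own values:\n",
"\n",
"```python\n",
"from azureml.core import Workspace\n",
"\n",
"# placeholder values -- replace with your own before running\n",
"ws = Workspace.create(name='myworkspace',\n",
"                      subscription_id='<subscription_id>',\n",
"                      resource_group='myresourcegroup',\n",
"                      create_resource_group=True,\n",
"                      location='westus2')\n",
"\n",
"# persist the workspace details locally so they can be re-loaded with Workspace.from_config()\n",
"ws.write_config()\n",
"```"
]
},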
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Create Service connection\n",
|
||||
"\n",
|
||||
"The build pipeline for our project will need the proper permission settings so that it can create a remote compute target in Azure. This can be done by setting up a **Service Connection** and authorizing the build pipeline to use this connection.\n",
|
||||
"\n",
|
||||
"> If we didn't set up this **service connection**, we would have to interactively log into Azure (e.g. az login) everytime we run the build pipeline.\n",
|
||||
"\n",
|
||||
"Setting up a service connection involves the following steps:\n",
|
||||
"1. Click on **Project settings** in the bottom-left corner of your screen.\n",
|
||||
"2. On the next page, search for menu section **Pipelines** and select **Service Connection**.\n",
|
||||
"3. Create a **New service connection**, of type **Azure Resource Manager**.\n",
|
||||
"\n",
|
||||
"![Get Started](../images/createServiceConnection.png)\n",
|
||||
"\n",
|
||||
"4. On the page you are presented with, scroll down and click on the link saying **use the full version of the service connection dialog**.\n",
|
||||
"\n",
|
||||
"![Get Started](../images/changeToFullVersionServiceConnection.png)\n",
|
||||
"\n",
|
||||
"5. Begin filling out the full version of the form. All the information you need is provided in the lab setup page. If you closed this page, a link to it was emailed to you. Look for emails from **No Reply (CloudLabs) <noreply@cloudlabs.ai>**.\n",
|
||||
"\n",
|
||||
"![Get Started](../images/fullDialogueServiceConnection.png \"width=50\")\n",
|
||||
"\n",
|
||||
" - **Important!** Set **connection name** to **serviceConnection** (careful about capitalization).\n",
|
||||
" - For **Service principal client ID** paste the field called **Application/Client Id** in the lab setup page.\n",
|
||||
" - Set **Scope level** to **Subscription**.\n",
|
||||
" - For **Subscription**, select the same which you have been using throughout the course. You may already have a compute target in there (e.g. \"aml-copute\") and a AML workspace.\n",
|
||||
" - **Important!** Leave **Resource Group** empty.\n",
|
||||
" - For **Service principal key** paste the filed called **Application Secret Key** in the lab setup page.\n",
|
||||
" - Allow all pipelines to use this connection.\n",
|
||||
" - Click on **Verify connection** to make sure the connection is valid and then click on **OK**."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Repository\n",
|
||||
"\n",
|
||||
"After you create your project in Azure DevOps, the next step is to clone our repository into your DevOps project. The simplest way is to go to **Repos > Files > Import** as shown below. Provide the clone url (https://github.com/azure/learnai-customai-airlift) in the wizard to import.\n",
|
||||
"\n",
|
||||
"![import repository](../images/importGit.png)\n",
|
||||
"\n",
|
||||
"You should now be able to see the git repo in your project."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Part 2: Building a pipeline\n",
|
||||
"\n",
|
||||
"Tha aim of this lab is to demonstrate how you can build a Continuous Integration/Continuous Deployment pipeline and kick it off when there is a new commit. This scenario is typically very common when a developer has updated the application part of the code repository or when the training script from a data scientist is updated."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Hosted Agents\n",
|
||||
"\n",
|
||||
"With Azure Pipelines, you've got a convenient option to build and deploy using a **Microsoft-hosted agent**. Each time you run a pipeline, you get a fresh virtual machine and maintenance/upgrades are taken care of. The virtual machine is discarded after one use. The Microsoft-hosted agent pool provides 5 virtual machine images to choose from:\n",
|
||||
"\n",
|
||||
"- Ubuntu 16.04\n",
|
||||
"- Visual Studio 2017 on Windows Server 2016\n",
|
||||
"- macOS 10.13\n",
|
||||
"- Windows Server 1803 (win1803) - for running Windows containers\n",
|
||||
"- Visual Studio 2015 on Windows Server 2012R2\n",
|
||||
"\n",
|
||||
"YAML-based pipelines will default to the Microsoft-hosted agent pool. You simply need to specify which virtual machine image you want to use."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Code Repository\n",
|
||||
"\n",
|
||||
"The repo is organized as follows:\n",
|
||||
"\n",
|
||||
"```\n",
|
||||
" code\n",
|
||||
" code/testing/\n",
|
||||
" code/scoring/\n",
|
||||
" code/aml_config/\n",
|
||||
" data_sample\n",
|
||||
" azure-pipelines.yml\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"The `code` folder contains all the python scripts to build the pipeline. The testing and scoring scripts are located in `code/testing/` and `code/scoring/` respectively. The config files created by the scripts are stored in `code/aml_config/`."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## About the scripts\n",
|
||||
"\n",
|
||||
"For the purpose of DevOps, it's best not to use a Notebook because it can be error-prone. Instead, we have all the code sitting in individual Python scripts. This means that if we used a Notebook to develop our scripts, like we did throughout this course, we have some work to do to refactor the code and turn it into a series of modular Python scripts. We would also add scripts for running various tests everytime our build is triggered, such as unit tests, integration tests, tests to measure **drift** (a degradation over time of the predictions returned by the model on incoming data), etc."
|
||||
]
|
||||
},
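{
"cell_type": "markdown",
"metadata": {},
"source": [
"As an illustration, a drift test can be as simple as comparing the distribution of each incoming sensor reading against a stored training baseline. The sketch below is hypothetical; the file names, sensor columns and threshold are assumptions rather than part of this repo:\n",
"\n",
"```python\n",
"import pandas as pd\n",
"from scipy.stats import ks_2samp\n",
"\n",
"# hypothetical file names -- not part of this repo\n",
"baseline = pd.read_csv('baseline_telemetry.csv')\n",
"incoming = pd.read_csv('incoming_telemetry.csv')\n",
"\n",
"# flag a sensor when a two-sample KS test rejects equality of the two distributions\n",
"for sensor in ['volt', 'rotate', 'pressure', 'vibration']:\n",
"    stat, p_value = ks_2samp(baseline[sensor], incoming[sensor])\n",
"    if p_value < 0.01:\n",
"        print(\"possible drift in %s (KS statistic %.3f)\" % (sensor, stat))\n",
"```"
]
},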
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Let's take a look at a brief overview of what each script does:\n",
|
||||
"\n",
|
||||
"| num | script | what it does |\n",
|
||||
"| --- | ------------------------ | ----------------------------------------------- |\n",
|
||||
"| 1 | anom_detect.py | detect anomalies in data and output them |\n",
|
||||
"| 2 | automl_step.py | train a PdM model using automated ML |\n",
|
||||
"| 3 | pipeline.py | runs 1 and 2 against a remote compute target |\n",
|
||||
"| 4 | evaluate_model.py | evaluates the result of 2 |\n",
|
||||
"| 5 | register_model.py | registeres the best model |\n",
|
||||
"| 6 | scoring/score.py | scoring script |\n",
|
||||
"| 7 | create_scoring_image.py | creates a scoring image from the scoring script |\n",
|
||||
"| 8 | deploy_aci.py | deploys scoring image to ACI |\n",
|
||||
"| 9 | aci_service_test.py | tests the ACI deployment |\n",
|
||||
"| 10 | testing/data_test.py | used to test the ACI deployment |\n",
|
||||
"| 11 | deploy_aks.py | deploys the AKS deployment |\n",
|
||||
"| 12 | aks_service_test.py | tests the AKS deployment |\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"In addition to the Python scripts. We have another script called `azure-pipeline.yml`, which contains in it the logic for our build. Like a **conda config** file or a **dockerfile**, this file allows us to set in place *infrastructure as code*. Let's take a look at its content:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 35,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Overwriting ./azure-pipelines.yml\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# %load ./azure-pipelines.yml\n",
|
||||
"pool:\n",
|
||||
" vmImage: 'Ubuntu 16.04'\n",
|
||||
"steps:\n",
|
||||
"- task: UsePythonVersion@0\n",
|
||||
" inputs:\n",
|
||||
" versionSpec: 3.5\n",
|
||||
" architecture: 'x64'\n",
|
||||
"\n",
|
||||
"- task: DownloadSecureFile@1\n",
|
||||
" inputs:\n",
|
||||
" name: configFile\n",
|
||||
" secureFile: config.json\n",
|
||||
"- script: echo \"Printing the secure file path\" \n",
|
||||
"- script: cp $(Agent.TempDirectory)/config.json $(Build.SourcesDirectory)\n",
|
||||
"\n",
|
||||
"- task: CondaEnvironment@1\n",
|
||||
" displayName: 'Create Conda Environment '\n",
|
||||
" inputs:\n",
|
||||
" createCustomEnvironment: true\n",
|
||||
" environmentName: azuremlsdk\n",
|
||||
" packageSpecs: 'python=3.6'\n",
|
||||
" updateConda: false\n",
|
||||
" createOptions: 'cython==0.29 urllib3<1.24'\n",
|
||||
"- script: |\n",
|
||||
" pip install --user azureml-sdk==1.0.17 pandas\n",
|
||||
" displayName: 'Install prerequisites'\n",
|
||||
"\n",
|
||||
"- task: AzureCLI@1\n",
|
||||
" displayName: 'Azure CLI devops/code/pipeline.py'\n",
|
||||
" inputs:\n",
|
||||
" azureSubscription: 'serviceConnection'\n",
|
||||
" scriptLocation: inlineScript\n",
|
||||
" inlineScript: 'python devops/code/pipeline.py'\n",
|
||||
"\n",
|
||||
"- task: AzureCLI@1\n",
|
||||
" displayName: 'Azure CLI devops/code/evaluate_model.py'\n",
|
||||
" inputs:\n",
|
||||
" azureSubscription: 'serviceConnection'\n",
|
||||
" scriptLocation: inlineScript\n",
|
||||
" inlineScript: 'python devops/code/evaluate_model.py'\n",
|
||||
"\n",
|
||||
"- task: AzureCLI@1\n",
|
||||
" displayName: 'Azure CLI devops/code/register_model.py'\n",
|
||||
" inputs:\n",
|
||||
" azureSubscription: 'serviceConnection'\n",
|
||||
" scriptLocation: inlineScript\n",
|
||||
" inlineScript: 'python devops/code/register_model.py'\n",
|
||||
"\n",
|
||||
"- task: AzureCLI@1\n",
|
||||
" displayName: 'Azure CLI devops/code/create_scoring_image.py'\n",
|
||||
" inputs:\n",
|
||||
" azureSubscription: 'serviceConnection'\n",
|
||||
" scriptLocation: inlineScript\n",
|
||||
" inlineScript: 'python devops/code/create_scoring_image.py'\n",
|
||||
"\n",
|
||||
"- task: AzureCLI@1\n",
|
||||
" displayName: 'Azure CLI devops/code/deploy_aci.py'\n",
|
||||
" inputs:\n",
|
||||
" azureSubscription: 'serviceConnection'\n",
|
||||
" scriptLocation: inlineScript\n",
|
||||
" inlineScript: 'python devops/code/deploy_aci.py'\n",
|
||||
" \n",
|
||||
"- task: AzureCLI@1\n",
|
||||
" displayName: 'Azure CLI devops/code/aci_service_test.py'\n",
|
||||
" inputs:\n",
|
||||
" azureSubscription: 'serviceConnection'\n",
|
||||
" scriptLocation: inlineScript\n",
|
||||
" inlineScript: 'python devops/code/aci_service_test.py'\n",
|
||||
"- script: |\n",
|
||||
" python devops/code/testing/data_test.py devops/data_sample/predmain_bad_schema.csv\n",
|
||||
" displayName: 'Test Schema'"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Creating a config file and uploading it as a Secure File\n",
|
||||
"\n",
|
||||
"On your own labtop, create a file called `config.json` to capture the `subscription_id`, `resource_group`, `workspace_name` and `workspace_region`:\n",
|
||||
"\n",
|
||||
"```\n",
|
||||
"{\n",
|
||||
" \"subscription_id\": \".......\",\n",
|
||||
" \"resource_group\": \".......\",\n",
|
||||
" \"workspace_name\": \".......\",\n",
|
||||
" \"workspace_region\": \".......\"\n",
|
||||
"}\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"You can get all of the info from the Machine Learning Service Workspace created in the portal as shown below. **Attention:** For `workspace_region` use one word and all lowercase, e.g. `westus2`.\n",
|
||||
"\n",
|
||||
"![ML Workspace](../images/configFileOnPortal.png)"
|
||||
]
|
||||
},
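{
"cell_type": "markdown",
"metadata": {},
"source": [
"The scripts in this repo read this file through `Workspace.from_config()`, which searches the current directory and its parents for the config file:\n",
"\n",
"```python\n",
"from azureml.core import Workspace\n",
"\n",
"# picks up config.json from the current directory or a parent directory\n",
"ws = Workspace.from_config()\n",
"print('Workspace name: ' + ws.name,\n",
"      'Resource group: ' + ws.resource_group, sep = '\\n')\n",
"```"
]
},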
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"It's not best practice to commit the above config information to your source repository. To address this, we can use the Secure Files library to store files such as signing certificates, Apple Provisioning Profiles, Android Keystore files, and SSH keys on the server without having to commit them to your source repository. Secure files are defined and managed in the Library tab in Azure Pipelines.\n",
|
||||
"\n",
|
||||
"The contents of the secure files are encrypted and can only be used during the build or release pipeline by referencing them from a task. There's a size limit of 10 MB for each secure file.\n",
|
||||
"\n",
|
||||
"#### Upload Secure File\n",
|
||||
"\n",
|
||||
"1. Select **Pipelines**, **Library** and **Secure Files**, then **+Secure File** to upload `config.json` file.\n",
|
||||
"\n",
|
||||
"![Upload Secure File](../images/uploadSecureFile.png)\n",
|
||||
"\n",
|
||||
"2. Select the uploaded file `config.json` and ensure **Authorize for use in all pipelines** is ticked and click on **Save**. "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Creating a build\n",
|
||||
"\n",
|
||||
"Azure Pipelines allow you to build AI applications without needing to set up any infrastructure of your own. Python is preinstalled on Microsoft-hosted agents in Azure Pipelines. You can use Linux, macOS, or Windows agents to run your builds.\n",
|
||||
"\n",
|
||||
"#### New Pipeline\n",
|
||||
"\n",
|
||||
"1. To create a new pipeline, select **New pipeline** from the Pipelines blade:\n",
|
||||
"\n",
|
||||
" ![New Pipeline](../images/newPipeline.png)\n",
|
||||
"\n",
|
||||
"2. You will be prompted with **Where is your code?**. Select **Azure Repos** followed by your repo.\n",
|
||||
"\n",
|
||||
"3. Select **Run**. Once the agent is allocated, you'll start seeing the live logs of the build.\n",
|
||||
"\n",
|
||||
"#### Notification\n",
|
||||
"\n",
|
||||
"The summary and status of the build will be sent to the email registered (i.e. Azure login user). Login using the email registered at `www.office.com` to view the notification."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Azure Pipelines with YAML\n",
|
||||
"\n",
|
||||
"You can define your pipeline using a YAML file: `azure-pipelines.yml` alongside the rest of the code for your app. The big advantage of using YAML is that the pipeline is versioned with the code and follows the same branching structure. \n",
|
||||
"\n",
|
||||
"The basic steps include:\n",
|
||||
"\n",
|
||||
"1. Configure Azure Pipelines to use your Git repo.\n",
|
||||
"2. Edit your `azure-pipelines.yml` file to define your build.\n",
|
||||
"3. Push your code to your version control repository which kicks off the default trigger to build and deploy.\n",
|
||||
"4. Code is now updated, built, tested, and packaged. It can be deployed to any target.\n",
|
||||
"\n",
|
||||
"![Pipelines-Image-Yam](../images/pipelines-image-yaml.png)\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"Open the yml file in the repo to understand the build steps."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Creating test scripts\n",
|
||||
"\n",
|
||||
"In this workshop, multiple tests are included:\n",
|
||||
"\n",
|
||||
"1. A basic test script `code/testing/data_test.py` is provided to test the schema of the json data for prediction using sample data in `data_sample/predmain_bad_schema.csv`.\n",
|
||||
"\n",
|
||||
"2. `code/aci_service_test.py` and `code/aks_service_test.py` to test deployment using ACI and AKS respectively.\n",
|
||||
"\n",
|
||||
"#### Exercise\n",
|
||||
"\n",
|
||||
"- Can you either extend `code/testing/data_test.py` or create a new one to check for the feature types? \n",
|
||||
"\n",
|
||||
"- `code/aci_service_test.py` and `code/aks_service_test.py` scripts check if you are getting scores from the deployed service. Can you check if you are getting the desired scores by modifying the scripts?\n",
|
||||
"\n",
|
||||
"- Make sure `azure-pipelines.yml` captures the above changes"
|
||||
]
|
||||
},
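{
"cell_type": "markdown",
"metadata": {},
"source": [
"One possible starting point for the first exercise is a dtype check. This is only a sketch; the expected types below are assumptions you would adapt to the real schema:\n",
"\n",
"```python\n",
"import sys\n",
"import pandas as pd\n",
"\n",
"# assumed expected dtypes -- adapt these to the real telemetry schema\n",
"expected_types = {'machineID': 'int64', 'volt': 'float64',\n",
"                  'rotate': 'float64', 'pressure': 'float64',\n",
"                  'vibration': 'float64'}\n",
"\n",
"df = pd.read_csv(sys.argv[1])\n",
"for col, expected in expected_types.items():\n",
"    actual = str(df[col].dtype)\n",
"    assert actual == expected, \"column %s has type %s, expected %s\" % (col, actual, expected)\n",
"print(\"feature type check passed\")\n",
"```"
]
},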
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"#### Build trigger (continuous deployment trigger)\n",
|
||||
"\n",
|
||||
"Along with the time triggers, we cann can also create a release every time a new build is available.\n",
|
||||
"\n",
|
||||
"1. Enable the *Continuous deployment trigger* and ensure *Enabled* is selected in the *Continuous deployment trigger* configuration as shown below:\n",
|
||||
"\n",
|
||||
"![Release Build Trigger](../images/releaseBuildTrigger.png)\n",
|
||||
"\n",
|
||||
"2. Populate the branch in *Build branch filters*. A release will be triggered only for a build that is from one of the branches populated. For example, selecting \"master\" will trigger a release for every build from the master branch.\n",
|
||||
"\n",
|
||||
"#### Approvals\n",
|
||||
"\n",
|
||||
"For the QC task, you will recieve an *Azure DevOps Notifaction* email to view approval. On selecting *View Approval*, you will be taken to the following page to approve/reject:\n",
|
||||
"\n",
|
||||
"![Pending Approval](../images/pendingApproval.png)\n",
|
||||
"\n",
|
||||
"There is also provision to include comments with approval/reject:\n",
|
||||
"\n",
|
||||
"![Approval Comments](../images/approvalComments.png)\n",
|
||||
"\n",
|
||||
"Once the post-deployment approvals are approved by the users chosen, the pipeline will be listed with a green tick next to QC under the list of release pipelines: \n",
|
||||
"\n",
|
||||
"![Release Passed](../images/releasePassed.png)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Application Insights (optional)\n",
|
||||
"\n",
|
||||
"For your convenience, Azure Application Insights is automatically added when you create the Azure Machine Learning workspace. In this section, we will look at how we can investigate the predictions from the service created using `Analytics`. Analytics is the powerful search and query tool of Application Insights. Analytics is a web tool so no setup is required.\n",
|
||||
"\n",
|
||||
"Run the below script (after replacing `<scoring_url>` and `<key>`) locally to obtain the predictions. You can also change `input_j` to obtain different predictions.\n",
|
||||
"\n",
|
||||
"```python\n",
|
||||
"import requests\n",
|
||||
"import json\n",
|
||||
"\n",
|
||||
"input_j = [[1.92168882e+02, 5.82427351e+02, 2.09748253e+02, 4.32529303e+01, 1.52377597e+01, 5.37307613e+01, 1.15729573e+01, 4.27624778e+00, 1.68042813e+02, 4.61654301e+02, 1.03138200e+02, 4.08555785e+01, 1.80809993e+01, 4.85402042e+01, 1.09373285e+01, 4.18269355e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 3.07200000e+03, 5.64000000e+02, 2.22900000e+03, 9.84000000e+02, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 3.03000000e+02, 6.63000000e+02, 3.18300000e+03, 3.03000000e+02, 5.34300000e+03, 4.26300000e+03, 6.88200000e+03, 1.02300000e+03, 1.80000000e+01]]\n",
|
||||
"\n",
|
||||
"data = json.dumps({'data': input_j})\n",
|
||||
"test_sample = bytes(data, encoding = 'utf8')\n",
|
||||
"\n",
|
||||
"url = '<scoring_url>'\n",
|
||||
"api_key = '<key>' \n",
|
||||
"headers = {'Content-Type':'application/json', 'Authorization':('Bearer '+ api_key)}\n",
|
||||
"\n",
|
||||
"resp = requests.post(url, test_sample, headers=headers)\n",
|
||||
"print(resp.text)\n",
|
||||
"\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"1. From the Machine Learning Workspace in the portal, Select `Application Insights` in the overview tab:\n",
|
||||
"\n",
|
||||
"![ML Workspace](../images/mlworkspace.png)\n",
|
||||
"\n",
|
||||
"2. Select Analytics.\n",
|
||||
"\n",
|
||||
"3. The predictions will be logged which can be queried in the Log Analytics page in the Azure portal as shown below. For example, to query `requests`, run the following query:\n",
|
||||
"\n",
|
||||
"````\n",
|
||||
" requests\n",
|
||||
" | where timestamp > ago(3h)\n",
|
||||
"````\n",
|
||||
"\n",
|
||||
"![LogAnalytics Query](../images/logAnalyticsQuery.png)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Data Changes\n",
|
||||
"\n",
|
||||
"A data scientist may want to trigger the pipeline when new data is available. To illustrate this, a small incremental data is made available in `data_sample\\telemetry_incremental.csv` which is picked up in the below code snippet of anom_detect.py:\n",
|
||||
"\n",
|
||||
"````python\n",
|
||||
" print(\"Adding incremental data...\")\n",
|
||||
" telemetry_incremental = pd.read_csv(os.path.join('data_sample/', 'telemetry_incremental.csv'))\n",
|
||||
" telemetry = telemetry.append(telemetry_incremental, ignore_index=True)\n",
|
||||
"````\n",
|
||||
"\n",
|
||||
"The data changes would cause a change in the model evaluation and if it's better than the baseline model, it would be propagated for deployment."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python [conda env:learnai-adpm]",
|
||||
"language": "python",
|
||||
"name": "conda-env-learnai-adpm-py"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.6.8"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 2
|
||||
}
|
|
@ -1,8 +1,8 @@
|
|||
# Introduction
|
||||
|
||||
In this course, we will implement a Continuous Integration (CI)/Continuous Delivery (CD) pipeline for Anomaly Detection and Predictive Maintenance applications. For developing an AI application, there are frequently two streams of work:
|
||||
1. Data Scientists building machine learning models
|
||||
2. App developers building the application and exposing it to end users to consume
|
||||
|
||||
In short, the pipeline is designed to kick off for each new commit and run the test suite; if the tests pass, it takes the latest build, packages it in a Docker container, and deploys it to create a scoring service, as shown below.
|
||||
|
@ -17,26 +17,3 @@ The goal of this course is to cover the following modules:
|
|||
* Create a CI/CD pipeline using Azure
|
||||
* Customize a CI/CD pipeline using Azure
|
||||
* Learn how to develop a Machine Learning pipeline to update models and create service
|
||||
|
||||
|
||||
## How to Use this Site
|
||||
|
||||
*This site is intended to be the main resource to an instructor-led course, but anyone is welcome to learn here. The intent is to make this site self-guided and it is getting there.*
|
||||
|
||||
We recommend cloning this repository onto your local computer with a git-based program (like GitHub desktop for Windows) or you may download the site contents as a zip file by going to "Clone or Download" at the upper right of this repository.
|
||||
|
||||
|
||||
It is recommended that you do the labs in the below order:
|
||||
|
||||
1. lab00.0_Setup
|
||||
2. lab01.1_BuildPipeline
|
||||
|
||||
**For Instructor-Led:**
|
||||
* We recommend downloading the site contents or cloning it to your local computer.
|
||||
* Follow along with the classroom instructions and training sessions.
|
||||
* When there is a lab indicated, you may find the lab instructions in the Labs folder.
|
||||
|
||||
**For Self-Study:**
|
||||
* We recommend downloading the site contents or cloning it to your local computer if you can do so.
|
||||
* Go to Decks folder and follow along with the slides.
|
||||
* When there is a lab indicated, you may find the lab instructions in the Labs folder.
|
|
@ -22,50 +22,50 @@ steps:
|
|||
updateConda: false
|
||||
createOptions: 'cython==0.29 urllib3<1.24'
|
||||
- script: |
|
||||
pip install --user azureml-sdk pandas
|
||||
pip install --user azureml-sdk==1.0.17 pandas
|
||||
displayName: 'Install prerequisites'
|
||||
|
||||
- task: AzureCLI@1
|
||||
displayName: 'Azure CLI CICD/code/pipeline.py'
|
||||
displayName: 'Azure CLI devops/code/pipeline.py'
|
||||
inputs:
|
||||
azureSubscription: 'serviceConnection'
|
||||
scriptLocation: inlineScript
|
||||
inlineScript: 'python CICD/code/pipeline.py'
|
||||
inlineScript: 'python devops/code/pipeline.py'
|
||||
|
||||
- task: AzureCLI@1
|
||||
displayName: 'Azure CLI CICD/code/evaluate_model.py'
|
||||
displayName: 'Azure CLI devops/code/evaluate_model.py'
|
||||
inputs:
|
||||
azureSubscription: 'serviceConnection'
|
||||
scriptLocation: inlineScript
|
||||
inlineScript: 'python CICD/code/evaluate_model.py'
|
||||
inlineScript: 'python devops/code/evaluate_model.py'
|
||||
|
||||
- task: AzureCLI@1
|
||||
displayName: 'Azure CLI CICD/code/register_model.py'
|
||||
displayName: 'Azure CLI devops/code/register_model.py'
|
||||
inputs:
|
||||
azureSubscription: 'serviceConnection'
|
||||
scriptLocation: inlineScript
|
||||
inlineScript: 'python CICD/code/register_model.py'
|
||||
inlineScript: 'python devops/code/register_model.py'
|
||||
|
||||
- task: AzureCLI@1
|
||||
displayName: 'Azure CLI CICD/code/create_scoring_image.py'
|
||||
displayName: 'Azure CLI devops/code/create_scoring_image.py'
|
||||
inputs:
|
||||
azureSubscription: 'serviceConnection'
|
||||
scriptLocation: inlineScript
|
||||
inlineScript: 'python CICD/code/create_scoring_image.py'
|
||||
inlineScript: 'python devops/code/create_scoring_image.py'
|
||||
|
||||
- task: AzureCLI@1
|
||||
displayName: 'Azure CLI CICD/code/deploy_aci.py'
|
||||
displayName: 'Azure CLI devops/code/deploy_aci.py'
|
||||
inputs:
|
||||
azureSubscription: 'serviceConnection'
|
||||
scriptLocation: inlineScript
|
||||
inlineScript: 'python CICD/code/deploy_aci.py'
|
||||
inlineScript: 'python devops/code/deploy_aci.py'
|
||||
|
||||
- task: AzureCLI@1
|
||||
displayName: 'Azure CLI CICD/code/aci_service_test.py'
|
||||
displayName: 'Azure CLI devops/code/aci_service_test.py'
|
||||
inputs:
|
||||
azureSubscription: 'serviceConnection'
|
||||
scriptLocation: inlineScript
|
||||
inlineScript: 'python CICD/code/aci_service_test.py'
|
||||
inlineScript: 'python devops/code/aci_service_test.py'
|
||||
- script: |
|
||||
python CICD/code/testing/data_test.py CICD/data_sample/predmain_bad_schema.csv
|
||||
displayName: 'Test Schema'
|
||||
python devops/code/testing/data_test.py devops/data_sample/predmain_bad_schema.csv
|
||||
displayName: 'Test Schema'
|
||||
|
|
|
@ -11,7 +11,7 @@ from azureml.core.webservice import Webservice
|
|||
ws = Workspace.from_config()
|
||||
|
||||
# Get the AKS Details
|
||||
os.chdir('./CICD')
|
||||
os.chdir('./devops')
|
||||
try:
|
||||
with open("aml_config/aks_webservice.json") as f:
|
||||
config = json.load(f)
|
||||
|
|
|
@ -44,8 +44,6 @@ def do_ad(df, alpha=0.005, max_anoms=0.1, only_last=None, longterm=False, e_valu
|
|||
:param direction:
|
||||
:return: a pd.Series containing anomalies. If not an anomaly, entry will be NaN, otherwise the sensor reading
|
||||
"""
|
||||
|
||||
|
||||
results = detect_ts(df,
|
||||
max_anoms=max_anoms,
|
||||
alpha=alpha,
|
||||
|
@ -56,6 +54,7 @@ def do_ad(df, alpha=0.005, max_anoms=0.1, only_last=None, longterm=False, e_valu
|
|||
|
||||
return results['anoms']['timestamp'].values
|
||||
|
||||
|
||||
parser = argparse.ArgumentParser("anom_detect")
|
||||
|
||||
parser.add_argument("--output_directory", type=str, help="output directory")
|
||||
|
@ -67,13 +66,12 @@ os.makedirs(args.output_directory, exist_ok=True)
|
|||
# public store of telemetry data
|
||||
data_dir = 'https://sethmottstore.blob.core.windows.net/predmaint/'
|
||||
|
||||
|
||||
print("Reading data ... ", end="")
|
||||
telemetry = pd.read_csv(os.path.join(data_dir, 'telemetry.csv'))
|
||||
print("Done.")
|
||||
|
||||
print("Adding incremental data...")
|
||||
telemetry_incremental = pd.read_csv(os.path.join('CICD/data_sample/', 'telemetry_incremental.csv'))
|
||||
telemetry_incremental = pd.read_csv(os.path.join('devops/data_sample/', 'telemetry_incremental.csv'))
|
||||
telemetry = telemetry.append(telemetry_incremental, ignore_index=True)
|
||||
print("Done.")
|
||||
|
||||
|
@ -81,7 +79,6 @@ print("Parsing datetime...", end="")
|
|||
telemetry['datetime'] = pd.to_datetime(telemetry['datetime'], format="%m/%d/%Y %I:%M:%S %p")
|
||||
print("Done.")
|
||||
|
||||
|
||||
window_size = 12 # how many measures to include in rolling average
|
||||
sensors = telemetry.columns[2:] # sensors are stored in column 2 on
|
||||
window_sizes = [window_size] * len(sensors) # this can be changed to have individual window_sizes for each sensor
|
||||
|
@ -116,5 +113,3 @@ for machine_id in machine_ids[:1]: # TODO: make sure to remove the [:2], this is
|
|||
pickle.dump(obj, fp)
|
||||
|
||||
t.toc("Processing machine %s took" % machine_id)
|
||||
|
||||
|
||||
|
|
|
@ -46,6 +46,7 @@ import os
|
|||
def download_data():
|
||||
os.makedirs('../data', exist_ok = True)
|
||||
container = 'https://sethmottstore.blob.core.windows.net/predmaint/'
|
||||
|
||||
urllib.request.urlretrieve(container + 'telemetry.csv', filename='../data/telemetry.csv')
|
||||
urllib.request.urlretrieve(container + 'maintenance.csv', filename='../data/maintenance.csv')
|
||||
urllib.request.urlretrieve(container + 'machines.csv', filename='../data/machines.csv')
|
||||
|
@ -53,6 +54,7 @@ def download_data():
|
|||
# we replace errors.csv with anoms.csv (results from running anomaly detection)
|
||||
# urllib.request.urlretrieve(container + 'errors.csv', filename='../data/errors.csv')
|
||||
urllib.request.urlretrieve(container + 'anoms.csv', filename='../data/anoms.csv')
|
||||
|
||||
df_telemetry = pd.read_csv('../data/telemetry.csv', header=0)
|
||||
df_telemetry['datetime'] = pd.to_datetime(df_telemetry['datetime'], format="%m/%d/%Y %I:%M:%S %p")
|
||||
df_errors = pd.read_csv('../data/anoms.csv', header=0)
|
||||
|
@ -69,8 +71,10 @@ def download_data():
|
|||
df_errors['errorID'] = df_errors['errorID'].apply(lambda x: int(x[-1]))
|
||||
df_maint['comp'] = df_maint['comp'].apply(lambda x: int(x[-1]))
|
||||
df_fails['failure'] = df_fails['failure'].apply(lambda x: int(x[-1]))
|
||||
|
||||
return df_telemetry, df_errors, df_subset, df_fails, df_maint, df_machines
|
||||
|
||||
|
||||
def get_datetime_diffs(df_left, df_right, catvar, prefix, window, on, lagon = None, diff_type = 'timedelta64[h]', validate = 'one_to_one', show_example = True):
|
||||
keys = ['machineID', 'datetime']
|
||||
df_dummies = pd.get_dummies(df_right[catvar], prefix=prefix)
|
||||
|
@ -104,6 +108,7 @@ def get_datetime_diffs(df_left, df_right, catvar, prefix, window, on, lagon = No
|
|||
print(df.loc[df.index.isin(range(idx-3, idx+5)), ['datetime', col, 'd' + col]])
|
||||
return df
|
||||
|
||||
|
||||
def get_rolling_aggregates(df, colnames, suffixes, window, on, groupby, lagon = None):
|
||||
"""
|
||||
calculates rolling averages and standard deviations
|
||||
|
@ -137,7 +142,6 @@ def get_rolling_aggregates(df, colnames, suffixes, window, on, groupby, lagon =
|
|||
return df_res
|
||||
|
||||
|
||||
|
||||
parser = argparse.ArgumentParser("automl_train")
|
||||
|
||||
parser.add_argument("--input_directory", type=str, help="input directory")
|
||||
|
@ -150,14 +154,9 @@ run = Run.get_context()
|
|||
ws = run.experiment.workspace
|
||||
def_data_store = ws.get_default_datastore()
|
||||
|
||||
|
||||
# Choose a name for the experiment and specify the project folder.
|
||||
experiment_name = 'automl-local-classification'
|
||||
project_folder = '.'
|
||||
|
||||
|
||||
|
||||
|
||||
experiment = Experiment(ws, experiment_name)
|
||||
print("Location:", ws.location)
|
||||
output = {}
|
||||
|
@ -191,9 +190,9 @@ df_join.head()
|
|||
df_left = df_telemetry.loc[:, ['datetime', 'machineID']] # we set this aside to this table to join all our results with
|
||||
|
||||
# this will make it easier to automatically create features with the right column names
|
||||
#df_errors['errorID'] = df_errors['errorID'].apply(lambda x: int(x[-1]))
|
||||
#df_maint['comp'] = df_maint['comp'].apply(lambda x: int(x[-1]))
|
||||
#df_fails['failure'] = df_fails['failure'].apply(lambda x: int(x[-1]))
|
||||
# df_errors['errorID'] = df_errors['errorID'].apply(lambda x: int(x[-1]))
|
||||
# df_maint['comp'] = df_maint['comp'].apply(lambda x: int(x[-1]))
|
||||
# df_fails['failure'] = df_fails['failure'].apply(lambda x: int(x[-1]))
|
||||
|
||||
cols_to_average = df_telemetry.columns[-4:]
|
||||
|
||||
|
@ -266,23 +265,20 @@ X_test = df_all.loc[df_all['datetime'] > '2015-10-15', ].drop(X_drop, axis=1)
|
|||
y_test = df_all.loc[df_all['datetime'] > '2015-10-15', Y_keep]
|
||||
|
||||
|
||||
azureml.train.automl.constants.Metric.CLASSIFICATION_PRIMARY_SET
|
||||
|
||||
primary_metric = 'AUC_weighted'
|
||||
|
||||
automl_config = AutoMLConfig(task='classification',
|
||||
preprocess=False,
|
||||
name=experiment_name,
|
||||
debug_log='automl_errors.log',
|
||||
primary_metric=primary_metric,
|
||||
max_time_sec=1200,
|
||||
iterations=2,
|
||||
n_cross_validations=2,
|
||||
verbosity=logging.INFO,
|
||||
automl_config = AutoMLConfig(task = 'classification',
|
||||
preprocess = False,
|
||||
name = experiment_name,
|
||||
debug_log = 'automl_errors.log',
|
||||
primary_metric = primary_metric,
|
||||
max_time_sec = 1200,
|
||||
iterations = 2,
|
||||
n_cross_validations = 2,
|
||||
verbosity = logging.INFO,
|
||||
X = X_train.values, # we convert from pandas to numpy arrays using .values
|
||||
y = y_train.values[:, 0], # we convert from pandas to numpy arrays using .values
|
||||
path=project_folder, )
|
||||
|
||||
path = project_folder, )
|
||||
|
||||
local_run = experiment.submit(automl_config, show_output = True)
|
||||
|
||||
|
@ -331,11 +327,11 @@ run_id['run_id'] = best_run.id
|
|||
run_id['experiment_name'] = best_run.experiment.name
|
||||
|
||||
# save run info
|
||||
os.makedirs('aml_config', exist_ok=True)
|
||||
os.makedirs('aml_config', exist_ok = True)
|
||||
with open('aml_config/run_id.json', 'w') as outfile:
|
||||
json.dump(run_id, outfile)
|
||||
|
||||
# upload run info and model (pkl) to def_data_store, so that the pipeline master can access it
|
||||
def_data_store.upload(src_dir='aml_config', target_path='aml_config', overwrite=True)
|
||||
def_data_store.upload(src_dir = 'aml_config', target_path = 'aml_config', overwrite = True)
|
||||
|
||||
def_data_store.upload(src_dir='outputs', target_path='outputs', overwrite=True)
|
||||
def_data_store.upload(src_dir = 'outputs', target_path = 'outputs', overwrite = True)
|
||||
|
|
|
@ -24,7 +24,7 @@ model_list = Model.list(workspace=ws)
|
|||
model, = (m for m in model_list if m.version==model_version and m.name==model_name)
|
||||
print('Model picked: {} \nModel Description: {} \nModel Version: {}'.format(model.name, model.description, model.version))
|
||||
|
||||
os.chdir('./CICD/code/scoring')
|
||||
os.chdir('./devops/code/scoring')
|
||||
image_name = "predmaintenance-model-score"
|
||||
|
||||
image_config = ContainerImage.image_configuration(execution_script = "score.py",
|
||||
|
|
|
@ -1,31 +1,33 @@
|
|||
|
||||
############################### load required libraries
|
||||
|
||||
import os
|
||||
import pandas as pd
|
||||
import json
|
||||
|
||||
import azureml.core
|
||||
from azureml.core import Workspace, Run, Experiment, Datastore
|
||||
from azureml.core.compute import AmlCompute
|
||||
from azureml.core.compute import ComputeTarget
|
||||
from azureml.core.runconfig import CondaDependencies, RunConfiguration
|
||||
from azureml.core.runconfig import DEFAULT_CPU_IMAGE
|
||||
|
||||
from azureml.telemetry import set_diagnostics_collection
|
||||
|
||||
from azureml.pipeline.steps import PythonScriptStep
|
||||
from azureml.pipeline.core import Pipeline, PipelineData, StepSequence
|
||||
|
||||
import pandas as pd
|
||||
import json
|
||||
|
||||
print("SDK Version:", azureml.core.VERSION)
|
||||
|
||||
############################### load workspace and create experiment
|
||||
|
||||
ws = Workspace.from_config()
|
||||
print('Workspace name: ' + ws.name,
|
||||
'Subscription id: ' + ws.subscription_id,
|
||||
'Resource group: ' + ws.resource_group, sep = '\n')
|
||||
|
||||
|
||||
experiment_name = 'aml-pipeline_cicd' # choose a name for experiment
|
||||
experiment_name = 'aml-pipeline-cicd' # choose a name for experiment
|
||||
project_folder = '.' # project folder
|
||||
|
||||
experiment=Experiment(ws, experiment_name)
|
||||
experiment = Experiment(ws, experiment_name)
|
||||
print("Location:", ws.location)
|
||||
output = {}
|
||||
output['SDK version'] = azureml.core.VERSION
|
||||
|
@ -36,23 +38,22 @@ output['Location'] = ws.location
|
|||
output['Project Directory'] = project_folder
|
||||
output['Experiment Name'] = experiment.name
|
||||
pd.set_option('display.max_colwidth', -1)
|
||||
pd.DataFrame(data=output, index=['']).T
|
||||
pd.DataFrame(data = output, index = ['']).T
|
||||
|
||||
set_diagnostics_collection(send_diagnostics=True)
|
||||
|
||||
print("SDK Version:", azureml.core.VERSION)
|
||||
############################### create a run config
|
||||
|
||||
cd = CondaDependencies.create(pip_packages=["azureml-sdk==1.0.17", "azureml-train-automl==1.0.17", "pyculiarity", "pytictoc", "cryptography==2.5", "pandas"])
|
||||
|
||||
# Runconfig
|
||||
amlcompute_run_config = RunConfiguration(framework="python", conda_dependencies=cd)
|
||||
amlcompute_run_config = RunConfiguration(framework = "python", conda_dependencies = cd)
|
||||
amlcompute_run_config.environment.docker.enabled = False
|
||||
amlcompute_run_config.environment.docker.gpu_support = False
|
||||
amlcompute_run_config.environment.docker.base_image = DEFAULT_CPU_IMAGE
|
||||
amlcompute_run_config.environment.spark.precache_packages = False
|
||||
|
||||
############################### create AML compute
|
||||
|
||||
# create AML compute
|
||||
aml_compute_target = "aml-compute"
|
||||
try:
|
||||
aml_compute = AmlCompute(ws, aml_compute_target)
|
||||
|
@ -60,15 +61,17 @@ try:
|
|||
except:
|
||||
print("creating new compute target")
|
||||
|
||||
provisioning_config = AmlCompute.provisioning_configuration(vm_size = "STANDARD_D2_V2",
idle_seconds_before_scaledown=1800,
|
||||
min_nodes = 0,
|
||||
max_nodes = 4)
|
||||
aml_compute = ComputeTarget.create(ws, aml_compute_target, provisioning_config)
|
||||
aml_compute.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)
|
||||
|
||||
print("Azure Machine Learning Compute attached")
|
||||
|
||||
############################### point to data and scripts
|
||||
|
||||
# we use this for exchanging data between pipeline steps
|
||||
def_data_store = ws.get_default_datastore()
|
||||
|
||||
|
@ -77,43 +80,42 @@ def_blob_store = Datastore(ws, "workspaceblobstore")
|
|||
print("Blobstore's name: {}".format(def_blob_store.name))
|
||||
|
||||
# Naming the intermediate data as anomaly data and assigning it to a variable
|
||||
anomaly_data = PipelineData("anomaly_data", datastore=def_blob_store)
|
||||
anomaly_data = PipelineData("anomaly_data", datastore = def_blob_store)
|
||||
print("Anomaly data object created")
|
||||
|
||||
# model = PipelineData("model", datastore=def_data_store)
|
||||
# model = PipelineData("model", datastore = def_data_store)
|
||||
# print("Model data object created")
|
||||
|
||||
|
||||
|
||||
anom_detect = PythonScriptStep(name="anomaly_detection",
|
||||
anom_detect = PythonScriptStep(name = "anomaly_detection",
|
||||
# script_name="anom_detect.py",
|
||||
script_name="CICD/code/anom_detect.py",
|
||||
arguments=["--output_directory", anomaly_data],
|
||||
outputs=[anomaly_data],
|
||||
compute_target=aml_compute,
|
||||
source_directory=project_folder,
|
||||
allow_reuse=True,
|
||||
runconfig=amlcompute_run_config)
|
||||
script_name = "devops/code/anom_detect.py",
|
||||
arguments = ["--output_directory", anomaly_data],
|
||||
outputs = [anomaly_data],
|
||||
compute_target = aml_compute,
|
||||
source_directory = project_folder,
|
||||
allow_reuse = True,
|
||||
runconfig = amlcompute_run_config)
|
||||
print("Anomaly Detection Step created.")
|
||||
|
||||
|
||||
automl_step = PythonScriptStep(name="automl_step",
|
||||
# script_name="automl_step.py",
|
||||
script_name="CICD/code/automl_step.py",
|
||||
arguments=["--input_directory", anomaly_data],
|
||||
inputs=[anomaly_data],
|
||||
# outputs=[model],
|
||||
compute_target=aml_compute,
|
||||
source_directory=project_folder,
|
||||
allow_reuse=True,
|
||||
runconfig=amlcompute_run_config)
|
||||
automl_step = PythonScriptStep(name = "automl_step",
|
||||
# script_name = "automl_step.py",
|
||||
script_name = "devops/code/automl_step.py",
|
||||
arguments = ["--input_directory", anomaly_data],
|
||||
inputs = [anomaly_data],
|
||||
# outputs = [model],
|
||||
compute_target = aml_compute,
|
||||
source_directory = project_folder,
|
||||
allow_reuse = True,
|
||||
runconfig = amlcompute_run_config)
|
||||
|
||||
print("AutoML Training Step created.")
|
||||
|
||||
############################### set up, validate and run pipeline
|
||||
|
||||
steps = [anom_detect, automl_step]
|
||||
print("Step lists created")
|
||||
|
||||
pipeline = Pipeline(workspace=ws, steps=steps)
|
||||
pipeline = Pipeline(workspace = ws, steps = steps)
|
||||
print ("Pipeline is built")
|
||||
|
||||
pipeline.validate()
|
||||
|
@ -126,16 +128,18 @@ print("Pipeline is submitted for execution")
|
|||
pipeline_run.wait_for_completion(show_output = False)
|
||||
print("Pipeline run completed")
|
||||
|
||||
# Download aml_config info and output of automl_step
|
||||
def_data_store.download(target_path='.',
|
||||
prefix='aml_config',
|
||||
show_progress=True,
|
||||
overwrite=True)
|
||||
############################### upload artifacts to AML Workspace
|
||||
|
||||
def_data_store.download(target_path='.',
|
||||
prefix='outputs',
|
||||
show_progress=True,
|
||||
overwrite=True)
|
||||
# Download aml_config info and output of automl_step
|
||||
def_data_store.download(target_path = '.',
|
||||
prefix = 'aml_config',
|
||||
show_progress = True,
|
||||
overwrite = True)
|
||||
|
||||
def_data_store.download(target_path = '.',
|
||||
prefix = 'outputs',
|
||||
show_progress = True,
|
||||
overwrite = True)
|
||||
print("Updated aml_config and outputs folder")
|
||||
|
||||
model_fname = 'model.pkl'
|
||||
|
|
|
@ -1,71 +0,0 @@
|
|||
# Setup
|
||||
|
||||
This lab allows you to perform the setup for building a Continuous Integration/Continuous Deployment pipeline related to Anomaly Detection and Predictive Maintenance.
|
||||
|
||||
### Pre-requisites
|
||||
|
||||
- Azure account
|
||||
- Azure DevOps Account
|
||||
- Azure Machine Learning Service Workspace
|
||||
- Basic knowledge of Python
|
||||
|
||||
After you launch your environment, follow the below steps:
|
||||
|
||||
### Azure Machine Learning Service Workspace
|
||||
|
||||
We will begin the lab by creating a new Machine Learning Service Workspace using Azure portal:
|
||||
|
||||
1. Log in to the Azure portal using the credentials provided with the environment.
|
||||
|
||||
2. Select `Create a Resource` and search the marketplace for `Machine Learning Service Workspace`.
|
||||
|
||||
![Market Place](../../images/marketplace.png)
|
||||
|
||||
3. Select `Machine Learning Service Workspace` followed by `Create`:
|
||||
|
||||
![Create Workspace](../../images/createWorkspace.png)
|
||||
|
||||
4. Populate the mandatory fields (Workspace name, Subscription, Resource group and Location):
|
||||
|
||||
![Workspace Fields](../../images/workspaceFields.png)
|
||||
|
||||
### Sign in to Azure DevOps
|
||||
|
||||
Go to https://dev.azure.com and log in using the username and password provided. After logging in, you should see the below:
|
||||
|
||||
![Get Started](../../images/getStarted.png)
|
||||
|
||||
### Create a Project
|
||||
|
||||
Create a Private project by providing a `Project name`. With private projects, only people you give access to will be able to view this project. Select `Create` to create the project.
|
||||
|
||||
### Create Service connection
|
||||
|
||||
The build pipeline for our project will need the proper permission settings so that it can create a remote compute target in Azure. This can be done by setting up a `Service Connection` and authorizing the build pipeline to use this connection.
|
||||
|
||||
> If we didn't set up this `service connection`, we would have to interactively log into Azure (e.g. az login) everytime we run the build pipeline.
|
||||
|
||||
Setting up a service connection involves the following steps:
|
||||
1. Click on `Project settings` in the bottom-left corner of your screen.
|
||||
1. On the next page, search for menu section `Pipelines` and select `Service Connection`.
|
||||
1. Create a `New service connection`, of type `Azure Resource Manager`.
|
||||
1. Properties of connection:
|
||||
1. `Service Principal Authentication`
|
||||
1. **Important!** Set `connection name` to "serviceConnection" (careful about capitalization).
|
||||
1. `Scope level`: Subscription
|
||||
1. `Subscription`: Select the one you have been using throughout the course. You may already have a compute target (e.g. "aml-compute") and an AML workspace in there.
|
||||
1. **Important!** Leave `Resource Group` empty.
|
||||
1. Allow all pipelines to use this connection.
|
||||
|
||||
|
||||
|
||||
|
||||
### Repository
|
||||
|
||||
After you create your project in Azure DevOps, the next step is to clone our repository into your DevOps project. The simplest way is to import using the `import` wizard found in Repos -> Files -> Import as shown below. Provide the clone url (https://github.com/azure/learnai-customai-airlift) in the wizard to import.
|
||||
|
||||
![import repository](../../images/importGit.png)
|
||||
|
||||
After running the above steps, your repo should now be populated and look like the below:
|
||||
|
||||
![Git Repo](../../images/gitRepo.png)
|
|
@ -1,232 +0,0 @@
|
|||
# Building the pipeline
|
||||
|
||||
The aim of this lab is to demonstrate how you can build a Continuous Integration/Continuous Deployment pipeline and kick it off when there is a new commit. This scenario is very common when a developer has updated the application part of the code repository or when a data scientist's training script is updated.
|
||||
|
||||
### A. Hosted Agents
|
||||
|
||||
With Azure Pipelines, you've got a convenient option to build and deploy using a **Microsoft-hosted agent**. Each time you run a pipeline, you get a fresh virtual machine, and maintenance/upgrades are taken care of for you. The virtual machine is discarded after one use. The Microsoft-hosted agent pool provides 5 virtual machine images to choose from:
|
||||
|
||||
- Ubuntu 16.04
|
||||
- Visual Studio 2017 on Windows Server 2016
|
||||
- macOS 10.13
|
||||
- Windows Server 1803 (win1803) - for running Windows containers
|
||||
- Visual Studio 2015 on Windows Server 2012R2
|
||||
|
||||
YAML-based pipelines will default to the Microsoft-hosted agent pool. You simply need to specify which virtual machine image you want to use.
|
||||
|
||||
### B. Code Repository
|
||||
|
||||
The repo is organized as follows:
|
||||
|
||||
```
|
||||
code
|
||||
code/testing/
|
||||
code/scoring/
|
||||
code/aml_config/
|
||||
data_sample
|
||||
azure-pipelines.yml
|
||||
```
|
||||
|
||||
The `code` folder contains all the python scripts to build the pipeline. The testing and scoring scripts are located in `code/testing/` and `code/scoring/` respectively. The config files created by the scripts are stored in `code/aml_config/`.
|
||||
|
||||
Sample data used for testing is provided in `data_sample`. The `azure-pipelines.yml` file at the root of your repository contains the instructions for the pipeline.
|
||||
|
||||
### C. Config
|
||||
|
||||
Create a file called `config.json` to capture the `subscription_id`, `resource_group`, `workspace_name` and `workspace_region`:
|
||||
|
||||
```
|
||||
{
|
||||
"subscription_id": ".......",
|
||||
"resource_group": ".......",
|
||||
"workspace_name": ".......",
|
||||
"workspace_region": "......."
|
||||
}
|
||||
```
|
||||
|
||||
You can get all of the info from the Machine Learning service workspace created in the portal as shown below:
|
||||
|
||||
![ML Workspace](../../images/mlworkspace.png)
|
||||
|
||||
### D. Secure Files
|
||||
|
||||
It's not best practice to commit the above config information to your source repository. To address this, we can use the Secure Files library to store files such as signing certificates, Apple Provisioning Profiles, Android Keystore files, and SSH keys on the server without having to commit them to your source repository. Secure files are defined and managed in the Library tab in Azure Pipelines.
|
||||
|
||||
The contents of the secure files are encrypted and can only be used during the build or release pipeline by referencing them from a task. There's a size limit of 10 MB for each secure file.
|
||||
|
||||
#### Upload Secure File
|
||||
|
||||
1. Select Pipelines, Library and Secure Files as shown below:
|
||||
|
||||
![Upload Secure File](../../images/uploadSecureFile.png)
|
||||
|
||||
2. Select `+Secure File` to upload the `config.json` file.
|
||||
|
||||
3. Select the uploaded file `config.json` and ensure `Authorize for use in all pipelines` is ticked. Select `Save`:
|
||||
|
||||
![Authorize Pipeline](../../images/authorizePipeline.png)
|
||||
|
||||
### E. Build
|
||||
|
||||
Azure Pipelines allow you to build AI applications without needing to set up any infrastructure of your own. Python is preinstalled on Microsoft-hosted agents in Azure Pipelines. You can use Linux, macOS, or Windows agents to run your builds.
|
||||
|
||||
#### New Pipeline
|
||||
|
||||
1. To create a new pipeline, select `New pipeline` from the Pipelines blade:
|
||||
|
||||
![New Pipeline](../../images/newPipeline.png)
|
||||
|
||||
2. You will be prompted with "Where is your code?". Select `Azure Repos` followed by your repo.
|
||||
|
||||
3. Select `Run`. Once the agent is allocated, you'll start seeing the live logs of the build.
|
||||
|
||||
#### Interactive Authentication
|
||||
|
||||
At the train step, you will receive a message for interactive authentication as shown below. Open a web browser, go to https://microsoft.com/devicelogin, and enter the code to authenticate so that the build can resume.
|
||||
|
||||
![Interactive Auth](../../images/interactiveAuth.png)
|
||||
|
||||
On success, the build status will appear as follows:
|
||||
|
||||
![Job](../../images/job.png)
|
||||
|
||||
#### Notification
|
||||
|
||||
The summary and status of the build will be sent to the email registered (i.e. Azure login user). Login using the email registered at `www.office.com` to view the notification.
|
||||
|
||||
### F. Azure Pipelines with YAML
|
||||
|
||||
You can define your pipeline using a YAML file: `azure-pipelines.yml` alongside the rest of the code for your app. The big advantage of using YAML is that the pipeline is versioned with the code and follows the same branching structure.
|
||||
|
||||
The basic steps include:
|
||||
|
||||
1. Configure Azure Pipelines to use your Git repo.
|
||||
2. Edit your `azure-pipelines.yml` file to define your build.
|
||||
3. Push your code to your version control repository which kicks off the default trigger to build and deploy.
|
||||
4. Code is now updated, built, tested, and packaged. It can be deployed to any target.
|
||||
|
||||
![Pipelines-Image-Yam](../../images/pipelines-image-yaml.png)
|
||||
|
||||
|
||||
Open the yml file in the repo to understand the build steps.
|
||||
|
||||
### G. Test
|
||||
|
||||
In this workshop, multiple tests are included:
|
||||
|
||||
1. A basic test script `code/testing/data_test.py` is provided to test the schema of the json data for prediction using sample data in `data_sample/predmain_bad_schema.csv`.
|
||||
|
||||
2. `code/aci_service_test.py` and `code/aks_service_test.py` to test deployment using ACI and AKS respectively.
|
||||
|
||||
#### Exercise
|
||||
|
||||
- Can you either extend `code/testing/data_test.py` or create a new one to check for the feature types?
|
||||
|
||||
- `code/aci_service_test.py` and `code/aks_service_test.py` scripts check if you are getting scores from the deployed service. Can you check if you are getting the desired scores by modifying the scripts?
|
||||
|
||||
- Make sure `azure-pipelines.yml` captures the above changes
|
||||
|
||||
### H. Release
|
||||
|
||||
In this section, you will learn how to schedule releases at specific times by defining one or more scheduled release triggers.
|
||||
|
||||
#### Create Release Pipeline
|
||||
|
||||
#### Time Trigger
|
||||
|
||||
1. Choose the schedule icon in the Artifacts section of your pipeline and enable scheduled release triggers. Note: you can configure multiple schedules.
|
||||
|
||||
![Release Time Trigger](../../images/releaseTimeTrigger.png)
|
||||
|
||||
2. Select a time to schedule release trigger. For viewing the trigger execution, you can choose a trigger time that's about 10 mins from now.
|
||||
|
||||
#### Build Trigger (Continuous deployment trigger)
|
||||
|
||||
Along with the time triggers, we can also create a release every time a new build is available.
|
||||
|
||||
1. Enable the *Continuous deployment trigger* and ensure *Enabled* is selected in the *Continuous deployment trigger* configuration as shown below:
|
||||
|
||||
![Release Build Trigger](../../images/releaseBuildTrigger.png)
|
||||
|
||||
2. Populate the branch in *Build branch filters*. A release will be triggered only for a build that is from one of the branches populated. For example, selecting "master" will trigger a release for every build from the master branch.
|
||||
|
||||
#### Approvals
|
||||
|
||||
For the QC task, you will receive an *Azure DevOps Notification* email to view the approval. On selecting *View Approval*, you will be taken to the following page to approve/reject:
|
||||
|
||||
![Pending Approval](../../images/pendingApproval.png)
|
||||
|
||||
You can also include comments with an approval or rejection:
|
||||
|
||||
![Approval Comments](../../images/approvalComments.png)
|
||||
|
||||
Once the chosen users grant the post-deployment approvals, the pipeline will be listed with a green tick next to QC in the list of release pipelines:
|
||||
|
||||
![Release Passed](../../images/releasePassed.png)
|
||||
|
||||
### I. Application Insights (Optional)
|
||||
|
||||
For your convenience, Azure Application Insights is automatically added when you create the Azure Machine Learning workspace. In this section, we will look at how to investigate the predictions from the service using `Analytics`, the powerful search and query tool of Application Insights. Analytics is a web tool, so no setup is required.
|
||||
|
||||
Run the below script (after replacing `<scoring_url>` and `<key>`) locally to obtain the predictions. You can also change `input_j` to obtain different predictions.
|
||||
|
||||
```python
|
||||
import requests
|
||||
import json
|
||||
|
||||
input_j = [[1.92168882e+02, 5.82427351e+02, 2.09748253e+02, 4.32529303e+01, 1.52377597e+01, 5.37307613e+01, 1.15729573e+01, 4.27624778e+00, 1.68042813e+02, 4.61654301e+02, 1.03138200e+02, 4.08555785e+01, 1.80809993e+01, 4.85402042e+01, 1.09373285e+01, 4.18269355e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 3.07200000e+03, 5.64000000e+02, 2.22900000e+03, 9.84000000e+02, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 3.03000000e+02, 6.63000000e+02, 3.18300000e+03, 3.03000000e+02, 5.34300000e+03, 4.26300000e+03, 6.88200000e+03, 1.02300000e+03, 1.80000000e+01]]
|
||||
|
||||
data = json.dumps({'data': input_j})
|
||||
test_sample = bytes(data, encoding = 'utf8')
|
||||
|
||||
url = '<scoring_url>'
|
||||
api_key = '<key>'
|
||||
headers = {'Content-Type':'application/json', 'Authorization':('Bearer '+ api_key)}
|
||||
|
||||
resp = requests.post(url, test_sample, headers=headers)
|
||||
print(resp.text)
|
||||
|
||||
```
|
||||
|
||||
1. From the Machine Learning Workspace in the portal, select `Application Insights` in the overview tab:
|
||||
|
||||
![ML Workspace](../../images/mlworkspace.png)
|
||||
|
||||
2. Select Analytics.
|
||||
|
||||
3. The predictions are logged and can be queried on the Log Analytics page in the Azure portal, as shown below. For example, to query `requests`, run the following query:
|
||||
|
||||
````
|
||||
requests
|
||||
| where timestamp > ago(3h)
|
||||
````
|
||||
|
||||
![LogAnalytics Query](../../images/logAnalyticsQuery.png)
|
||||
|
||||
|
||||
### J. Service Principal Authentication (Optional)
|
||||
|
||||
The `ServicePrincipalAuthentication` class allows authentication using a service principal instead of a user's own identity. It is ideal for automation and CI/CD scenarios where interactive authentication is not desired. The snippet below shows how you can load a workspace using `ServicePrincipalAuthentication`.
|
||||
|
||||
````python
|
||||
from azureml.core import Workspace
from azureml.core.authentication import ServicePrincipalAuthentication
|
||||
spa = ServicePrincipalAuthentication('<tenant_id>', '<username>', '<password>')
|
||||
# Example: spa = ServicePrincipalAuthentication('0e4cb6d6-25c5-4c27-a1ca-42a112c18b71', '59d48937-6e62-4223-b07a-711b11ad24b6', 'zcnv77SNT*Vu')
|
||||
ws = Workspace.from_config(auth=spa)
|
||||
````
|
||||
|
||||
#### Exercise
|
||||
|
||||
1. Replace `tenant_id`, `username` and `password` with the values generated during lab creation, reading them from a secure file as sketched below. Modify `ws = Workspace.from_config()` in the scripts to use `ServicePrincipalAuthentication`. Run the build again to confirm that interactive authentication is no longer needed.
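Here is a minimal sketch of one way to approach this exercise, assuming the credentials were saved to a JSON secure file. The file name `service_principal.json` and its keys (`tenant_id`, `client_id`, `password`) are hypothetical; match them to whatever your secure file actually contains.

```python
import json

from azureml.core import Workspace
from azureml.core.authentication import ServicePrincipalAuthentication

# Load the credentials from the secure file made available to the build
# (e.g. via a DownloadSecureFile task) instead of hard-coding them.
with open('service_principal.json') as f:
    creds = json.load(f)

spa = ServicePrincipalAuthentication(creds['tenant_id'],
                                     creds['client_id'],
                                     creds['password'])

# Replaces the interactive ws = Workspace.from_config()
ws = Workspace.from_config(auth=spa)
print("Authenticated to workspace:", ws.name)
```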
|
||||
|
||||
### K. Data Changes
|
||||
|
||||
A data scientist may want to trigger the pipeline when new data is available. To illustrate this, a small incremental dataset is made available in `data_sample/telemetry_incremental.csv`, which is picked up in the below code snippet of `anom_detect.py`:
|
||||
|
||||
````python
|
||||
print("Adding incremental data...")
|
||||
telemetry_incremental = pd.read_csv(os.path.join('data_sample/', 'telemetry_incremental.csv'))
|
||||
telemetry = telemetry.append(telemetry_incremental, ignore_index=True)
|
||||
````
|
||||
|
||||
The new data changes the model evaluation; if the resulting model is better than the baseline model, it is propagated for deployment, as in the sketch below.
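The comparison itself lives in the repo's evaluation step; the following is only a rough sketch of the idea (not the actual `evaluate_model.py` logic), with a made-up metric name and values.

```python
def should_promote(new_metrics, baseline_metrics, metric="accuracy"):
    """Promote the new model only if it beats the baseline on the chosen metric."""
    return new_metrics.get(metric, 0.0) > baseline_metrics.get(metric, 0.0)


# Hypothetical evaluation results after retraining on the incremental data.
if should_promote({"accuracy": 0.91}, {"accuracy": 0.88}):
    print("New model outperforms the baseline -- propagating for deployment.")
else:
    print("Keeping the baseline model.")
```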
|
After Width: | Height: | Size: 201 KiB |
After Width: | Height: | Size: 210 KiB |
After Width: | Height: | Size: 137 KiB |
After Width: | Height: | Size: 171 KiB |
Binary data
images/importGit.png
Before Width: | Height: | Size: 128 KiB After Width: | Height: | Size: 157 KiB |
After Width: | Height: | Size: 107 KiB |
|
@ -0,0 +1,367 @@
|
|||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"##![LearnAI Header](https://coursematerial.blob.core.windows.net/assets/LearnAI_header.png)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Applying a pipeline to structured streaming data\n",
|
||||
"\n",
|
||||
"## Overview (see also [Programming Guide](https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html))\n",
|
||||
"\n",
|
||||
"Structured Streaming is a scalable and fault-tolerant stream processing engine built on the Spark SQL engine. You can express your streaming computation the same way you would express a batch computation on static data. The Spark SQL engine will take care of running it incrementally and continuously and updating the final result as streaming data continues to arrive. You can use the Dataset/DataFrame API in Scala, Java, Python or R to express streaming aggregations, event-time windows, stream-to-batch joins, etc. The computation is executed on the same optimized Spark SQL engine. Finally, the system ensures end-to-end exactly-once fault-tolerance guarantees through checkpointing and Write-Ahead Logs. In short, Structured Streaming provides fast, scalable, fault-tolerant, end-to-end exactly-once stream processing without the user having to reason about streaming.\n",
|
||||
"\n",
|
||||
"Internally, by default, Structured Streaming queries are processed using a micro-batch processing engine, which processes data streams as a series of small batch jobs thereby achieving end-to-end latencies as low as 100 milliseconds and exactly-once fault-tolerance guarantees. However, since Spark 2.3, we have introduced a new low-latency processing mode called Continuous Processing, which can achieve end-to-end latencies as low as 1 millisecond with at-least-once guarantees. Without changing the Dataset/DataFrame operations in your queries, you will be able to choose the mode based on your application requirements."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Load previously saved model\n",
|
||||
"\n",
|
||||
"Let's take in the model we saved earlier, and apply it to some streaming data!"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/html": [
|
||||
"<style scoped>\n",
|
||||
" .ansiout {\n",
|
||||
" display: block;\n",
|
||||
" unicode-bidi: embed;\n",
|
||||
" white-space: pre-wrap;\n",
|
||||
" word-wrap: break-word;\n",
|
||||
" word-break: break-all;\n",
|
||||
" font-family: \"Source Code Pro\", \"Menlo\", monospace;;\n",
|
||||
" font-size: 13px;\n",
|
||||
" color: #555;\n",
|
||||
" margin-left: 4px;\n",
|
||||
" line-height: 19px;\n",
|
||||
" }\n",
|
||||
"</style>\n",
|
||||
"<div class=\"ansiout\"></div>"
|
||||
]
|
||||
},
|
||||
"metadata": {},
|
||||
"output_type": "display_data"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"from pyspark.ml.pipeline import PipelineModel\n",
|
||||
"\n",
|
||||
"fileName = \"my_pipeline\"\n",
|
||||
"pipelineModel = PipelineModel.load(fileName)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Initiate Data Stream\n",
|
||||
"\n",
|
||||
"Here, we are going to simulate streaming data, by reading in the DataFrame from the previous lab, but serving it as a stream to our pipeline.\n",
|
||||
"\n",
|
||||
"**Note**: You must specify a schema when creating a streaming source DataFrame. Why!?"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/html": [
|
||||
"<style scoped>\n",
|
||||
" .ansiout {\n",
|
||||
" display: block;\n",
|
||||
" unicode-bidi: embed;\n",
|
||||
" white-space: pre-wrap;\n",
|
||||
" word-wrap: break-word;\n",
|
||||
" word-break: break-all;\n",
|
||||
" font-family: \"Source Code Pro\", \"Menlo\", monospace;;\n",
|
||||
" font-size: 13px;\n",
|
||||
" color: #555;\n",
|
||||
" margin-left: 4px;\n",
|
||||
" line-height: 19px;\n",
|
||||
" }\n",
|
||||
"</style>\n",
|
||||
"<div class=\"ansiout\"></div>"
|
||||
]
|
||||
},
|
||||
"metadata": {},
|
||||
"output_type": "display_data"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"from pyspark.sql.types import *\n",
|
||||
"\n",
|
||||
"schema = StructType([\n",
|
||||
" StructField(\"tweet\",StringType()), \n",
|
||||
" StructField(\"existence\",IntegerType()),\n",
|
||||
" StructField(\"confidence\",FloatType())])\n",
|
||||
"\n",
|
||||
"streamingData = (spark\n",
|
||||
" .readStream\n",
|
||||
" .schema(schema)\n",
|
||||
" .option(\"maxFilesPerTrigger\", 1)\n",
|
||||
" .parquet(\"dbfs:/gwDF\"))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Now we are going to use our `pipelineModel` to transform the `streamingData`. The output will be called `stream`: a confusion matrix for evaluating the performance of the model."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 8,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/html": [
|
||||
"<style scoped>\n",
|
||||
" .table-result-container {\n",
|
||||
" max-height: 300px;\n",
|
||||
" overflow: auto;\n",
|
||||
" }\n",
|
||||
" table, th, td {\n",
|
||||
" border: 1px solid black;\n",
|
||||
" border-collapse: collapse;\n",
|
||||
" }\n",
|
||||
" th, td {\n",
|
||||
" padding: 5px;\n",
|
||||
" }\n",
|
||||
" th {\n",
|
||||
" text-align: left;\n",
|
||||
" }\n",
|
||||
"</style><div class='table-result-container'><table class='table-result'><thead style='background-color: white'><tr><th>existence</th><th>prediction</th><th>count</th></tr></thead><tbody><tr><td>0</td><td>0.0</td><td>890</td></tr><tr><td>0</td><td>1.0</td><td>185</td></tr><tr><td>1</td><td>0.0</td><td>58</td></tr><tr><td>1</td><td>1.0</td><td>2997</td></tr></tbody></table></div>"
|
||||
]
|
||||
},
|
||||
"metadata": {},
|
||||
"output_type": "display_data"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"stream = (pipelineModel\n",
|
||||
" .transform(streamingData)\n",
|
||||
" .groupBy(\"existence\", \"prediction\")\n",
|
||||
" .count()\n",
|
||||
" .sort(\"existence\", \"prediction\"))\n",
|
||||
"\n",
|
||||
"display(stream)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Optimization\n",
|
||||
"\n",
|
||||
"Why is this stream taking so long? What configuration should we set?"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 10,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/html": [
|
||||
"<style scoped>\n",
|
||||
" .ansiout {\n",
|
||||
" display: block;\n",
|
||||
" unicode-bidi: embed;\n",
|
||||
" white-space: pre-wrap;\n",
|
||||
" word-wrap: break-word;\n",
|
||||
" word-break: break-all;\n",
|
||||
" font-family: \"Source Code Pro\", \"Menlo\", monospace;;\n",
|
||||
" font-size: 13px;\n",
|
||||
" color: #555;\n",
|
||||
" margin-left: 4px;\n",
|
||||
" line-height: 19px;\n",
|
||||
" }\n",
|
||||
"</style>\n",
|
||||
"<div class=\"ansiout\"><span class=\"ansired\">Out[</span><span class=\"ansired\">4</span><span class=\"ansired\">]: </span>'200'\n",
|
||||
"</div>"
|
||||
]
|
||||
},
|
||||
"metadata": {},
|
||||
"output_type": "display_data"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"spark.conf.get(\"spark.sql.shuffle.partitions\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 11,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/html": [
|
||||
"<style scoped>\n",
|
||||
" .ansiout {\n",
|
||||
" display: block;\n",
|
||||
" unicode-bidi: embed;\n",
|
||||
" white-space: pre-wrap;\n",
|
||||
" word-wrap: break-word;\n",
|
||||
" word-break: break-all;\n",
|
||||
" font-family: \"Source Code Pro\", \"Menlo\", monospace;;\n",
|
||||
" font-size: 13px;\n",
|
||||
" color: #555;\n",
|
||||
" margin-left: 4px;\n",
|
||||
" line-height: 19px;\n",
|
||||
" }\n",
|
||||
"</style>\n",
|
||||
"<div class=\"ansiout\"></div>"
|
||||
]
|
||||
},
|
||||
"metadata": {},
|
||||
"output_type": "display_data"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"spark.conf.set(\"spark.sql.shuffle.partitions\", \"8\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"> See this [post](https://umbertogriffo.gitbooks.io/apache-spark-best-practices-and-tuning/content/sparksqlshufflepartitions_draft.html) for a detailed look into how to estimate the size of your data and choosing the right number of partitions. \n",
|
||||
"\n",
|
||||
"Let's try this again"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 13,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/html": [
|
||||
"<style scoped>\n",
|
||||
" .table-result-container {\n",
|
||||
" max-height: 300px;\n",
|
||||
" overflow: auto;\n",
|
||||
" }\n",
|
||||
" table, th, td {\n",
|
||||
" border: 1px solid black;\n",
|
||||
" border-collapse: collapse;\n",
|
||||
" }\n",
|
||||
" th, td {\n",
|
||||
" padding: 5px;\n",
|
||||
" }\n",
|
||||
" th {\n",
|
||||
" text-align: left;\n",
|
||||
" }\n",
|
||||
"</style><div class='table-result-container'><table class='table-result'><thead style='background-color: white'><tr><th>existence</th><th>prediction</th><th>count</th></tr></thead><tbody><tr><td>0</td><td>0.0</td><td>890</td></tr><tr><td>0</td><td>1.0</td><td>185</td></tr><tr><td>1</td><td>0.0</td><td>58</td></tr><tr><td>1</td><td>1.0</td><td>2997</td></tr></tbody></table></div>"
|
||||
]
|
||||
},
|
||||
"metadata": {},
|
||||
"output_type": "display_data"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"stream = (pipelineModel\n",
|
||||
" .transform(streamingData)\n",
|
||||
" .groupBy(\"existence\", \"prediction\")\n",
|
||||
" .count()\n",
|
||||
" .sort(\"existence\", \"prediction\"))\n",
|
||||
"\n",
|
||||
"display(stream)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Save the output\n",
|
||||
"\n",
|
||||
"We can save the output of the processed stream to a file."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import re\n",
|
||||
"\n",
|
||||
"streamingView = \"username\"\n",
|
||||
"checkpointFile = \"checkPoint\"\n",
|
||||
"dbutils.fs.rm(checkpointFile, True) # clear out the checkpointing directory\n",
|
||||
"\n",
|
||||
"(stream\n",
|
||||
" .writeStream\n",
|
||||
" .format(\"memory\")\n",
|
||||
" .option(\"checkpointLocation\", checkpointFile)\n",
|
||||
" .outputMode(\"complete\")\n",
|
||||
" .queryName(streamingView)\n",
|
||||
" .start())"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"display(sql(\"select * from \" + streamingView))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
|
||||
"\n",
|
||||
"Licensed under the MIT License."
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.7.1"
|
||||
},
|
||||
"name": "05_structured_streaming",
|
||||
"notebookId": 4057188818416178
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 1
|
||||
}
|
|
@ -1,220 +0,0 @@
|
|||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Exporting notebooks from ADB\n",
|
||||
"\n",
|
||||
"This notebook does two things:\n",
|
||||
"1. It recursively exports a folder recursively as a dbc archive.\n",
|
||||
"1. It recursively exports all notebooks in a folder as jupyter notebooks.\n",
|
||||
"\n",
|
||||
"We start with the setup"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"path = 'notebooks' # the path to folder in your ADB workspace\n",
|
||||
"\n",
|
||||
"region = 'westus'\n",
|
||||
"username = 'wopauli@microsoft.com' "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"We configure the personal access token we configured in ADB. We are reading it in here to reduce the odds of accidentally exposing it."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"with open('token.txt', 'r') as f:\n",
|
||||
" token = f.read().strip()\n",
|
||||
" \n",
|
||||
"headers = {\n",
|
||||
" 'Authorization': 'Bearer %s' % token\n",
|
||||
"}"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Next we download the entire DBC archive. THis serves multiple purposes:\n",
|
||||
"1. We have it exported.\n",
|
||||
"1. We will list its contents so that we can export jupyter notebooks one by one."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Starting export of DBC archive. This might take a while, depending on your connection.\n",
|
||||
"Done.\n",
|
||||
"Writing to file.\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"import requests\n",
|
||||
"\n",
|
||||
"url = 'https://%s.azuredatabricks.net/api/2.0/workspace/export?path=/Users/%s/%s&direct_download=true&format=DBC' % (region, username, path)\n",
|
||||
"\n",
|
||||
"print(\"Starting export of DBC archive. This might take a while, depending on your connection.\")\n",
|
||||
"r = requests.get(url=url, headers=headers)\n",
|
||||
"print(\"Done.\")\n",
|
||||
"\n",
|
||||
"if r.ok:\n",
|
||||
" print(\"Writing to file.\")\n",
|
||||
" with open(path + '.dbc', 'wb') as f:\n",
|
||||
" f.write(r.content)\n",
|
||||
"else:\n",
|
||||
" print(\"Downloading notebook archive failed\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"We list the notebooks contained in the archive."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import zipfile\n",
|
||||
"\n",
|
||||
"path_to_zip_file = './notebooks.dbc'\n",
|
||||
"zip_ref = zipfile.ZipFile(path_to_zip_file, 'r')\n",
|
||||
"\n",
|
||||
"files = zip_ref.namelist()\n",
|
||||
"\n",
|
||||
"notebooks = [x for x in files if x.endswith('.python')]"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"We iterate through the notebooks, and export one by one as a jupyter notebook."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Working on: notebooks/day_2/05_automated_ML\n",
|
||||
"Working on: notebooks/day_2/03_aml_getting_started\n",
|
||||
"Working on: notebooks/day_2/04_ml_experimentation\n",
|
||||
"Working on: notebooks/day_2/01_logistic_regression\n",
|
||||
"Working on: notebooks/day_2/06_deployment\n",
|
||||
"Working on: notebooks/day_2/02_random_forests\n",
|
||||
"Working on: notebooks/day_1/04_hyperparameter_tuning\n",
|
||||
"Working on: notebooks/day_1/05_structured_streaming\n",
|
||||
"Working on: notebooks/day_1/03_sentiment_analysis\n",
|
||||
"Working on: notebooks/day_1/01_introduction\n",
|
||||
"Working on: notebooks/day_1/02_feature_engineering\n",
|
||||
"Working on: notebooks/tests/run_notebooks\n",
|
||||
"Working on: notebooks/includes/mnt_blob_rw\n",
|
||||
"Working on: notebooks/includes/mnt_blob\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"import os\n",
|
||||
"\n",
|
||||
"for notebook in notebooks:\n",
|
||||
" notebook = os.path.splitext(notebook)[0]\n",
|
||||
" print(\"Working on: %s\" % notebook)\n",
|
||||
" url = 'https://%s.azuredatabricks.net/api/2.0/workspace/export?path=/Users/%s/%s&direct_download=true&format=JUPYTER' % (region, username, notebook)\n",
|
||||
"\n",
|
||||
" r = requests.get(url=url, headers=headers)\n",
|
||||
" if r.ok:\n",
|
||||
" notebook_path, ipynb_notebook = os.path.split(notebook + \".ipynb\")\n",
|
||||
" \n",
|
||||
" if not os.path.exists(notebook_path):\n",
|
||||
" os.makedirs(notebook_path)\n",
|
||||
" \n",
|
||||
" with open(os.path.join(notebook_path, ipynb_notebook), 'wb') as f:\n",
|
||||
" f.write(r.content)\n",
|
||||
" else:\n",
|
||||
" print(\"Failed: %s\" % notebook)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"'notebooks/includes'"
|
||||
]
|
||||
},
|
||||
"execution_count": 6,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"notebook_path"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Consider using the following command to clear the output of all notebooks. \n",
|
||||
"\n",
|
||||
"*Note:* this may require `git bash` or `bash`, and may not work in vania\n",
|
||||
"\n",
|
||||
"jupyter nbconvert --ClearOutputPreprocessor.enabled=True --inplace Notebook.ipynb"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.7.1"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 2
|
||||
}
|
|
@ -1 +1,106 @@
|
|||
{"cells":[{"cell_type":"code","source":["import os\n\nsource = \"wasbs://data@coursematerial.blob.core.windows.net\"\nmount_point = \"/mnt/data\"\nextra_configs = {\"fs.azure.sas.data.coursematerial.blob.core.windows.net\":\"?sv=2018-03-28&ss=bfqt&srt=sco&sp=rwdlacup&se=2019-07-01T02:17:07Z&st=2019-02-14T19:17:07Z&spr=https&sig=1%2FnXywpfU6%2FYNLl0Zs1t5M8PF5p8ES7SPFX78tPtmYY%3D\"}\n\ntry:\n if len(os.listdir('/dbfs/mnt/data/')) > 0:\n print(\"Already mounted.\")\n else:\n dbutils.fs.mount(\n source = source,\n mount_point = mount_point,\n extra_configs = extra_configs)\n print(\"Mounted: %s at %s\" % (source, mount_point))\nexcept:\n dbutils.fs.mount(\n source = source,\n mount_point = mount_point,\n extra_configs = extra_configs)\n print(\"Mounted: %s at %s\" % (source, mount_point))"],"metadata":{},"outputs":[{"metadata":{},"output_type":"display_data","data":{"text/html":["<style scoped>\n .ansiout {\n display: block;\n unicode-bidi: embed;\n white-space: pre-wrap;\n word-wrap: break-word;\n word-break: break-all;\n font-family: \"Source Code Pro\", \"Menlo\", monospace;;\n font-size: 13px;\n color: #555;\n margin-left: 4px;\n line-height: 19px;\n }\n</style>\n<div class=\"ansiout\">Mounted: wasbs://data@coursematerial.blob.core.windows.net at /mnt/data\n</div>"]}}],"execution_count":1},{"cell_type":"code","source":["# dbutils.fs.unmount('/mnt/data')"],"metadata":{},"outputs":[],"execution_count":2},{"cell_type":"code","source":["# os.listdir('/dbfs/mnt/data/')"],"metadata":{},"outputs":[{"metadata":{},"output_type":"display_data","data":{"text/html":["<style scoped>\n .ansiout {\n display: block;\n unicode-bidi: embed;\n white-space: pre-wrap;\n word-wrap: break-word;\n word-break: break-all;\n font-family: \"Source Code Pro\", \"Menlo\", monospace;;\n font-size: 13px;\n color: #555;\n margin-left: 4px;\n line-height: 19px;\n }\n</style>\n<div class=\"ansiout\"><span class=\"ansired\">---------------------------------------------------------------------------</span>\n<span class=\"ansired\">FileNotFoundError</span> Traceback (most recent call last)\n<span class=\"ansigreen\"><command-1350352043586502></span> in <span class=\"ansicyan\"><module></span><span class=\"ansiblue\">()</span>\n<span class=\"ansigreen\">----> 1</span><span class=\"ansiyellow\"> </span>os<span class=\"ansiyellow\">.</span>listdir<span class=\"ansiyellow\">(</span><span class=\"ansiblue\">'/dbfs/mnt/data/'</span><span class=\"ansiyellow\">)</span><span class=\"ansiyellow\"></span>\n\n<span class=\"ansired\">FileNotFoundError</span>: [Errno 2] No such file or directory: '/dbfs/mnt/data/'</div>"]}}],"execution_count":3},{"cell_type":"code","source":[""],"metadata":{},"outputs":[],"execution_count":4}],"metadata":{"name":"mnt_blob","notebookId":4057188818416716},"nbformat":4,"nbformat_minor":0}
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/html": [
|
||||
"<style scoped>\n",
|
||||
" .ansiout {\n",
|
||||
" display: block;\n",
|
||||
" unicode-bidi: embed;\n",
|
||||
" white-space: pre-wrap;\n",
|
||||
" word-wrap: break-word;\n",
|
||||
" word-break: break-all;\n",
|
||||
" font-family: \"Source Code Pro\", \"Menlo\", monospace;;\n",
|
||||
" font-size: 13px;\n",
|
||||
" color: #555;\n",
|
||||
" margin-left: 4px;\n",
|
||||
" line-height: 19px;\n",
|
||||
" }\n",
|
||||
"</style>\n",
|
||||
"<div class=\"ansiout\">Mounted: wasbs://data@coursematerial.blob.core.windows.net at /mnt/data\n",
|
||||
"</div>"
|
||||
]
|
||||
},
|
||||
"metadata": {},
|
||||
"output_type": "display_data"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"import os\n",
|
||||
"\n",
|
||||
"source = \"wasbs://data@coursematerial.blob.core.windows.net\"\n",
|
||||
"mount_point = \"/mnt/data\"\n",
|
||||
"extra_configs = {\"fs.azure.sas.data.coursematerial.blob.core.windows.net\":\"?sv=2018-03-28&ss=bfqt&srt=sco&sp=rwdlacup&se=2019-07-01T02:17:07Z&st=2019-02-14T19:17:07Z&spr=https&sig=1%2FnXywpfU6%2FYNLl0Zs1t5M8PF5p8ES7SPFX78tPtmYY%3D\"}\n",
|
||||
"\n",
|
||||
"try:\n",
|
||||
" if len(os.listdir('/dbfs/mnt/data/')) > 0:\n",
|
||||
" print(\"Already mounted.\")\n",
|
||||
" else:\n",
|
||||
" dbutils.fs.mount(\n",
|
||||
" source = source,\n",
|
||||
" mount_point = mount_point,\n",
|
||||
" extra_configs = extra_configs)\n",
|
||||
" print(\"Mounted: %s at %s\" % (source, mount_point))\n",
|
||||
"except:\n",
|
||||
" dbutils.fs.mount(\n",
|
||||
" source = source,\n",
|
||||
" mount_point = mount_point,\n",
|
||||
" extra_configs = extra_configs)\n",
|
||||
" print(\"Mounted: %s at %s\" % (source, mount_point))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# dbutils.fs.unmount('/mnt/data')"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# os.listdir('/dbfs/mnt/data/')"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.7.1"
|
||||
},
|
||||
"name": "mnt_blob",
|
||||
"notebookId": 4057188818416716
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 1
|
||||
}
|
||||
|
|
|
@ -0,0 +1,665 @@
|
|||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Azure DevOps\n",
|
||||
"\n",
|
||||
"With Azure DevOps data scientists and application developers can work together to create and maintain AI-infused applications. Using a DevOps mindset is not new to software developers, who are used to running applications in production. However, data scientists in the past have often worked in silos and not followed best practices to facilitate the transition from development to production. With Azure DevOps data scientists can now develop with an eye toward production."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Part 1: Getting started\n",
|
||||
"\n",
|
||||
"This lab allows you to perform setup for building a **Continuous Integration/Continuous Deployment** pipeline related to Anomoly Detection and Predictive Maintenance.\n",
|
||||
"\n",
|
||||
"### Pre-requisites\n",
|
||||
"\n",
|
||||
"- Azure account\n",
|
||||
"- Azure DevOps account\n",
|
||||
"- Azure Machine Learning Service Workspace\n",
|
||||
"- Basic knowledge of Python\n",
|
||||
"\n",
|
||||
"After you launch your environment, follow the below steps:\n",
|
||||
"\n",
|
||||
"### Azure Machine Learning Service Workspace\n",
|
||||
"\n",
|
||||
"We will begin the lab by creating a new Machine Learning Service Workspace using Azure portal:\n",
|
||||
"\n",
|
||||
"1. Login to Azure portal using the credentials provided with the environment.\n",
|
||||
"\n",
|
||||
"2. Select **Create a Resource** and search the marketplace for **Machine Learning Service Workspace**.\n",
|
||||
"\n",
|
||||
"![Market Place](../images/marketplace.png)\n",
|
||||
"\n",
|
||||
"3. Select **Machine Learning Service Workspace** followed by **Create**:\n",
|
||||
"\n",
|
||||
"![Create Workspace](../images/createWorkspace.png)\n",
|
||||
"\n",
|
||||
"4. Populate the mandatory fields (Workspace name, Subscription, Resource group and Location):\n",
|
||||
"\n",
|
||||
"![Workspace Fields](../images/workspaceFields.png)\n",
|
||||
"\n",
|
||||
"### Sign in to Azure DevOps\n",
|
||||
"\n",
|
||||
"Go to **https://dev.azure.com** and login using your Azure username and password. You will be asked to provide a name and email. An organization is created for you based on the name you provide. Within the organization, you will be asked to create a project. Name your project \"ADPM\" and click on **Create project**. With private projects, only people you give access to will be able to view this project. After logging in, you should see the below:\n",
|
||||
"\n",
|
||||
"![Get Started](../images/getStarted.png)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Create Service connection\n",
|
||||
"\n",
|
||||
"The build pipeline for our project will need the proper permission settings so that it can create a remote compute target in Azure. This can be done by setting up a **Service Connection** and authorizing the build pipeline to use this connection.\n",
|
||||
"\n",
|
||||
"> If we didn't set up this **service connection**, we would have to interactively log into Azure (e.g. az login) everytime we run the build pipeline.\n",
|
||||
"\n",
|
||||
"Setting up a service connection involves the following steps:\n",
|
||||
"1. Click on **Project settings** in the bottom-left corner of your screen.\n",
|
||||
"2. On the next page, search for menu section **Pipelines** and select **Service Connection**.\n",
|
||||
"3. Create a **New service connection**, of type **Azure Resource Manager**.\n",
|
||||
"\n",
|
||||
"![Get Started](../images/createServiceConnection.png)\n",
|
||||
"\n",
|
||||
"4. On the page you are presented with, scroll down and click on the link saying **use the full version of the service connection dialog**.\n",
|
||||
"\n",
|
||||
"![Get Started](../images/changeToFullVersionServiceConnection.png)\n",
|
||||
"\n",
|
||||
"5. Begin filling out the full version of the form. All the information you need is provided in the lab setup page. If you closed this page, a link to it was emailed to you. Look for emails from **No Reply (CloudLabs) <noreply@cloudlabs.ai>**.\n",
|
||||
"\n",
|
||||
"![Get Started](../images/fullDialogueServiceConnection.png \"width=50\")\n",
|
||||
"\n",
|
||||
" - **Important!** Set **connection name** to **serviceConnection** (careful about capitalization).\n",
|
||||
" - For **Service principal client ID** paste the field called **Application/Client Id** in the lab setup page.\n",
|
||||
" - Set **Scope level** to **Subscription**.\n",
|
||||
" - For **Subscription**, select the same which you have been using throughout the course. You may already have a compute target in there (e.g. \"aml-copute\") and a AML workspace.\n",
|
||||
" - **Important!** Leave **Resource Group** empty.\n",
|
||||
" - For **Service principal key** paste the filed called **Application Secret Key** in the lab setup page.\n",
|
||||
" - Allow all pipelines to use this connection.\n",
|
||||
" - Click on **Verify connection** to make sure the connection is valid and then click on **OK**."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Repository\n",
|
||||
"\n",
|
||||
"After you create your project in Azure DevOps, the next step is to clone our repository into your DevOps project. The simplest way is to go to **Repos > Files > Import** as shown below. Provide the clone url (https://github.com/azure/learnai-customai-airlift) in the wizard to import.\n",
|
||||
"\n",
|
||||
"![import repository](../images/importGit.png)\n",
|
||||
"\n",
|
||||
"You should now be able to see the git repo in your project."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Part 2: Building a pipeline\n",
|
||||
"\n",
|
||||
"Tha aim of this lab is to demonstrate how you can build a Continuous Integration/Continuous Deployment pipeline and kick it off when there is a new commit. This scenario is typically very common when a developer has updated the application part of the code repository or when the training script from a data scientist is updated."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Hosted Agents\n",
|
||||
"\n",
|
||||
"With Azure Pipelines, you've got a convenient option to build and deploy using a **Microsoft-hosted agent**. Each time you run a pipeline, you get a fresh virtual machine and maintenance/upgrades are taken care of. The virtual machine is discarded after one use. The Microsoft-hosted agent pool provides 5 virtual machine images to choose from:\n",
|
||||
"\n",
|
||||
"- Ubuntu 16.04\n",
|
||||
"- Visual Studio 2017 on Windows Server 2016\n",
|
||||
"- macOS 10.13\n",
|
||||
"- Windows Server 1803 (win1803) - for running Windows containers\n",
|
||||
"- Visual Studio 2015 on Windows Server 2012R2\n",
|
||||
"\n",
|
||||
"YAML-based pipelines will default to the Microsoft-hosted agent pool. You simply need to specify which virtual machine image you want to use."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Code Repository\n",
|
||||
"\n",
|
||||
"The repo is organized as follows:\n",
|
||||
"\n",
|
||||
"```\n",
|
||||
" code\n",
|
||||
" code/testing/\n",
|
||||
" code/scoring/\n",
|
||||
" code/aml_config/\n",
|
||||
" data_sample\n",
|
||||
" azure-pipelines.yml\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"The `code` folder contains all the python scripts to build the pipeline. The testing and scoring scripts are located in `code/testing/` and `code/scoring/` respectively. The config files created by the scripts are stored in `code/aml_config/`.\n",
|
||||
"\n",
|
||||
"Sample data is created in `data_sample` that is used for testing. `azure-pipelines.yml` file at the root of your repository contains the instructions for the pipeline."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## About the scripts\n",
|
||||
"\n",
|
||||
"For the purpose of DevOps, it's best not to use a Notebook because it can be error-prone. Instead, we have all the code sitting in individual Python scripts. This means that if we used a Notebook to develop our scripts, like we did throughout this course, we have some work to do to refactor the code and turn it into a series of modular Python scripts. We would also add scripts for running various tests everytime our build is triggered, such as unit tests, integration tests, tests to measure **drift** (a degradation over time of the predictions returned by the model on incoming data), etc.\n",
|
||||
"\n",
|
||||
"Let's take a look at the different scripts we have to deal with and what each does."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# %load ./code/pipeline.py\n",
|
||||
"\n",
|
||||
"############################### load required libraries\n",
|
||||
"\n",
|
||||
"import os\n",
|
||||
"import pandas as pd\n",
|
||||
"import json\n",
|
||||
"\n",
|
||||
"import azureml.core\n",
|
||||
"from azureml.core import Workspace, Run, Experiment, Datastore\n",
|
||||
"from azureml.core.compute import AmlCompute\n",
|
||||
"from azureml.core.compute import ComputeTarget\n",
|
||||
"from azureml.core.runconfig import CondaDependencies, RunConfiguration\n",
|
||||
"from azureml.core.runconfig import DEFAULT_CPU_IMAGE\n",
|
||||
"from azureml.telemetry import set_diagnostics_collection\n",
|
||||
"from azureml.pipeline.steps import PythonScriptStep\n",
|
||||
"from azureml.pipeline.core import Pipeline, PipelineData, StepSequence\n",
|
||||
"\n",
|
||||
"print(\"SDK Version:\", azureml.core.VERSION)\n",
|
||||
"\n",
|
||||
"############################### load workspace and create experiment\n",
|
||||
"\n",
|
||||
"ws = Workspace.from_config()\n",
|
||||
"print('Workspace name: ' + ws.name, \n",
|
||||
" 'Subscription id: ' + ws.subscription_id, \n",
|
||||
" 'Resource group: ' + ws.resource_group, sep = '\\n')\n",
|
||||
"\n",
|
||||
"experiment_name = 'aml-pipeline-cicd' # choose a name for experiment\n",
|
||||
"project_folder = '.' # project folder\n",
|
||||
"\n",
|
||||
"experiment = Experiment(ws, experiment_name)\n",
|
||||
"print(\"Location:\", ws.location)\n",
|
||||
"output = {}\n",
|
||||
"output['SDK version'] = azureml.core.VERSION\n",
|
||||
"output['Subscription ID'] = ws.subscription_id\n",
|
||||
"output['Workspace'] = ws.name\n",
|
||||
"output['Resource Group'] = ws.resource_group\n",
|
||||
"output['Location'] = ws.location\n",
|
||||
"output['Project Directory'] = project_folder\n",
|
||||
"output['Experiment Name'] = experiment.name\n",
|
||||
"pd.set_option('display.max_colwidth', -1)\n",
|
||||
"pd.DataFrame(data = output, index = ['']).T\n",
|
||||
"\n",
|
||||
"set_diagnostics_collection(send_diagnostics=True)\n",
|
||||
"\n",
|
||||
"############################### create a run config\n",
|
||||
"\n",
|
||||
"cd = CondaDependencies.create(pip_packages=[\"azureml-sdk==1.0.17\", \"azureml-train-automl==1.0.17\", \"pyculiarity\", \"pytictoc\", \"cryptography==2.5\", \"pandas\"])\n",
|
||||
"\n",
|
||||
"amlcompute_run_config = RunConfiguration(framework = \"python\", conda_dependencies = cd)\n",
|
||||
"amlcompute_run_config.environment.docker.enabled = False\n",
|
||||
"amlcompute_run_config.environment.docker.gpu_support = False\n",
|
||||
"amlcompute_run_config.environment.docker.base_image = DEFAULT_CPU_IMAGE\n",
|
||||
"amlcompute_run_config.environment.spark.precache_packages = False\n",
|
||||
"\n",
|
||||
"############################### create AML compute\n",
|
||||
"\n",
|
||||
"aml_compute_target = \"aml-compute\"\n",
|
||||
"try:\n",
|
||||
" aml_compute = AmlCompute(ws, aml_compute_target)\n",
|
||||
" print(\"found existing compute target.\")\n",
|
||||
"except:\n",
|
||||
" print(\"creating new compute target\")\n",
|
||||
" \n",
|
||||
" provisioning_config = AmlCompute.provisioning_configuration(vm_size = \"STANDARD_D2_V2\", \n",
|
||||
" idle_seconds_before_scaledown=1800, \n",
|
||||
" min_nodes = 0, \n",
|
||||
" max_nodes = 4)\n",
|
||||
" aml_compute = ComputeTarget.create(ws, aml_compute_target, provisioning_config)\n",
|
||||
" aml_compute.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)\n",
|
||||
" \n",
|
||||
"print(\"Azure Machine Learning Compute attached\")\n",
|
||||
"\n",
|
||||
"############################### point to data and scripts\n",
|
||||
"\n",
|
||||
"# we use this for exchanging data between pipeline steps\n",
|
||||
"def_data_store = ws.get_default_datastore()\n",
|
||||
"\n",
|
||||
"# get pointer to default blob store\n",
|
||||
"def_blob_store = Datastore(ws, \"workspaceblobstore\")\n",
|
||||
"print(\"Blobstore's name: {}\".format(def_blob_store.name))\n",
|
||||
"\n",
|
||||
"# Naming the intermediate data as anomaly data and assigning it to a variable\n",
|
||||
"anomaly_data = PipelineData(\"anomaly_data\", datastore = def_blob_store)\n",
|
||||
"print(\"Anomaly data object created\")\n",
|
||||
"\n",
|
||||
"# model = PipelineData(\"model\", datastore = def_data_store)\n",
|
||||
"# print(\"Model data object created\")\n",
|
||||
"\n",
|
||||
"anom_detect = PythonScriptStep(name = \"anomaly_detection\",\n",
|
||||
" # script_name=\"anom_detect.py\",\n",
|
||||
" script_name = \"CICD/code/anom_detect.py\",\n",
|
||||
" arguments = [\"--output_directory\", anomaly_data],\n",
|
||||
" outputs = [anomaly_data],\n",
|
||||
" compute_target = aml_compute, \n",
|
||||
" source_directory = project_folder,\n",
|
||||
" allow_reuse = True,\n",
|
||||
" runconfig = amlcompute_run_config)\n",
|
||||
"print(\"Anomaly Detection Step created.\")\n",
|
||||
"\n",
|
||||
"automl_step = PythonScriptStep(name = \"automl_step\",\n",
|
||||
" # script_name = \"automl_step.py\", \n",
|
||||
" script_name = \"CICD/code/automl_step.py\", \n",
|
||||
" arguments = [\"--input_directory\", anomaly_data],\n",
|
||||
" inputs = [anomaly_data],\n",
|
||||
" # outputs = [model],\n",
|
||||
" compute_target = aml_compute, \n",
|
||||
" source_directory = project_folder,\n",
|
||||
" allow_reuse = True,\n",
|
||||
" runconfig = amlcompute_run_config)\n",
|
||||
"\n",
|
||||
"print(\"AutoML Training Step created.\")\n",
|
||||
"\n",
|
||||
"############################### set up, validate and run pipeline\n",
|
||||
"\n",
|
||||
"steps = [anom_detect, automl_step]\n",
|
||||
"print(\"Step lists created\")\n",
|
||||
"\n",
|
||||
"pipeline = Pipeline(workspace = ws, steps = steps)\n",
|
||||
"print (\"Pipeline is built\")\n",
|
||||
"\n",
|
||||
"pipeline.validate()\n",
|
||||
"print(\"Pipeline validation complete\")\n",
|
||||
"\n",
|
||||
"pipeline_run = experiment.submit(pipeline) #, regenerate_outputs=True)\n",
|
||||
"print(\"Pipeline is submitted for execution\")\n",
|
||||
"\n",
|
||||
"# Wait until the run finishes.\n",
|
||||
"pipeline_run.wait_for_completion(show_output = False)\n",
|
||||
"print(\"Pipeline run completed\")\n",
|
||||
"\n",
|
||||
"############################### upload artifacts to AML Workspace\n",
|
||||
"\n",
|
||||
"# Download aml_config info and output of automl_step\n",
|
||||
"def_data_store.download(target_path = '.',\n",
|
||||
" prefix = 'aml_config',\n",
|
||||
" show_progress = True,\n",
|
||||
" overwrite = True)\n",
|
||||
"\n",
|
||||
"def_data_store.download(target_path = '.',\n",
|
||||
" prefix = 'outputs',\n",
|
||||
" show_progress = True,\n",
|
||||
" overwrite = True)\n",
|
||||
"print(\"Updated aml_config and outputs folder\")\n",
|
||||
"\n",
|
||||
"model_fname = 'model.pkl'\n",
|
||||
"model_path = os.path.join(\"outputs\", model_fname)\n",
|
||||
"\n",
|
||||
"# Upload the model file explicitly into artifacts (for CI/CD)\n",
|
||||
"pipeline_run.upload_file(name = model_path, path_or_stream = model_path)\n",
|
||||
"print('Uploaded the model {} to experiment {}'.format(model_fname, pipeline_run.experiment.name))\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"The script `pipeline.py` run `anom_detect.py` and `automl_step.py` in that order. Let's see what these two scripts contain."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# %load ./code/anom_detect.py"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# %load ./code/automl_step.py"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"%load ./code/evaluate_model.py\n",
|
||||
"%load ./code/register_model.py\n",
|
||||
"%load ./code/create_scoring_image.py\n",
|
||||
"%load ./code/deploy_aci.py\n",
|
||||
"%load ./code/aci_service_test.py\n",
|
||||
"\n",
|
||||
"%load ./code/deploy_aks.py\n",
|
||||
"%load ./code/aks_service_test.py\n",
|
||||
"%load ./code/data_prep.py\n",
|
||||
"%load ./code/scoring/score.py\n",
|
||||
"%load ./code/testing/data_test.py"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# %load ./azure-pipelines.yml\n",
|
||||
"pool:\n",
|
||||
" vmImage: 'Ubuntu 16.04'\n",
|
||||
"steps:\n",
|
||||
"- task: UsePythonVersion@0\n",
|
||||
" inputs:\n",
|
||||
" versionSpec: 3.5\n",
|
||||
" architecture: 'x64'\n",
|
||||
"\n",
|
||||
"- task: DownloadSecureFile@1\n",
|
||||
" inputs:\n",
|
||||
" name: configFile\n",
|
||||
" secureFile: config.json\n",
|
||||
"- script: echo \"Printing the secure file path\" \n",
|
||||
"- script: cp $(Agent.TempDirectory)/config.json $(Build.SourcesDirectory)\n",
|
||||
"\n",
|
||||
"- task: CondaEnvironment@1\n",
|
||||
" displayName: 'Create Conda Environment '\n",
|
||||
" inputs:\n",
|
||||
" createCustomEnvironment: true\n",
|
||||
" environmentName: azuremlsdk\n",
|
||||
" packageSpecs: 'python=3.6'\n",
|
||||
" updateConda: false\n",
|
||||
" createOptions: 'cython==0.29 urllib3<1.24'\n",
|
||||
"- script: |\n",
|
||||
" pip install --user azureml-sdk==1.0.17 pandas\n",
|
||||
" displayName: 'Install prerequisites'\n",
|
||||
"\n",
|
||||
"- task: AzureCLI@1\n",
|
||||
" displayName: 'Azure CLI CICD/code/pipeline.py'\n",
|
||||
" inputs:\n",
|
||||
" azureSubscription: 'serviceConnection'\n",
|
||||
" scriptLocation: inlineScript\n",
|
||||
" inlineScript: 'python CICD/code/pipeline.py'\n",
|
||||
"\n",
|
||||
"- task: AzureCLI@1\n",
|
||||
" displayName: 'Azure CLI CICD/code/evaluate_model.py'\n",
|
||||
" inputs:\n",
|
||||
" azureSubscription: 'serviceConnection'\n",
|
||||
" scriptLocation: inlineScript\n",
|
||||
" inlineScript: 'python CICD/code/evaluate_model.py'\n",
|
||||
"\n",
|
||||
"- task: AzureCLI@1\n",
|
||||
" displayName: 'Azure CLI CICD/code/register_model.py'\n",
|
||||
" inputs:\n",
|
||||
" azureSubscription: 'serviceConnection'\n",
|
||||
" scriptLocation: inlineScript\n",
|
||||
" inlineScript: 'python CICD/code/register_model.py'\n",
|
||||
"\n",
|
||||
"- task: AzureCLI@1\n",
|
||||
" displayName: 'Azure CLI CICD/code/create_scoring_image.py'\n",
|
||||
" inputs:\n",
|
||||
" azureSubscription: 'serviceConnection'\n",
|
||||
" scriptLocation: inlineScript\n",
|
||||
" inlineScript: 'python CICD/code/create_scoring_image.py'\n",
|
||||
"\n",
|
||||
"- task: AzureCLI@1\n",
|
||||
" displayName: 'Azure CLI CICD/code/deploy_aci.py'\n",
|
||||
" inputs:\n",
|
||||
" azureSubscription: 'serviceConnection'\n",
|
||||
" scriptLocation: inlineScript\n",
|
||||
" inlineScript: 'python CICD/code/deploy_aci.py'\n",
|
||||
" \n",
|
||||
"- task: AzureCLI@1\n",
|
||||
" displayName: 'Azure CLI CICD/code/aci_service_test.py'\n",
|
||||
" inputs:\n",
|
||||
" azureSubscription: 'serviceConnection'\n",
|
||||
" scriptLocation: inlineScript\n",
|
||||
" inlineScript: 'python CICD/code/aci_service_test.py'\n",
|
||||
"- script: |\n",
|
||||
" python CICD/code/testing/data_test.py CICD/data_sample/predmain_bad_schema.csv\n",
|
||||
" displayName: 'Test Schema'"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Creating a config file and uploading it as a Secure File\n",
|
||||
"\n",
|
||||
"On your own labtop, create a file called `config.json` to capture the `subscription_id`, `resource_group`, `workspace_name` and `workspace_region`:\n",
|
||||
"\n",
|
||||
"```\n",
|
||||
"{\n",
|
||||
" \"subscription_id\": \".......\",\n",
|
||||
" \"resource_group\": \".......\",\n",
|
||||
" \"workspace_name\": \".......\",\n",
|
||||
" \"workspace_region\": \".......\"\n",
|
||||
"}\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"You can get all of the info from the Machine Learning Service Workspace created in the portal as shown below. **Attention:** For `workspace_region` use one word and all lowercase, e.g. `westus2`.\n",
|
||||
"\n",
|
||||
"![ML Workspace](../images/configFileOnPortal.png)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"It's not best practice to commit the above config information to your source repository. To address this, we can use the Secure Files library to store files such as signing certificates, Apple Provisioning Profiles, Android Keystore files, and SSH keys on the server without having to commit them to your source repository. Secure files are defined and managed in the Library tab in Azure Pipelines.\n",
|
||||
"\n",
|
||||
"The contents of the secure files are encrypted and can only be used during the build or release pipeline by referencing them from a task. There's a size limit of 10 MB for each secure file.\n",
|
||||
"\n",
|
||||
"#### Upload Secure File\n",
|
||||
"\n",
|
||||
"1. Select **Pipelines**, **Library** and **Secure Files**, then **+Secure File** to upload `config.json` file.\n",
|
||||
"\n",
|
||||
"![Upload Secure File](../images/uploadSecureFile.png)\n",
|
||||
"\n",
|
||||
"2. Select the uploaded file `config.json` and ensure **Authorize for use in all pipelines** is ticked and click on **Save**. "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Creating a build\n",
|
||||
"\n",
|
||||
"Azure Pipelines allow you to build AI applications without needing to set up any infrastructure of your own. Python is preinstalled on Microsoft-hosted agents in Azure Pipelines. You can use Linux, macOS, or Windows agents to run your builds.\n",
|
||||
"\n",
|
||||
"#### New Pipeline\n",
|
||||
"\n",
|
||||
"1. To create a new pipeline, select **New pipeline** from the Pipelines blade:\n",
|
||||
"\n",
|
||||
" ![New Pipeline](../images/newPipeline.png)\n",
|
||||
"\n",
|
||||
"2. You will be prompted with **Where is your code?**. Select **Azure Repos** followed by your repo.\n",
|
||||
"\n",
|
||||
"3. Select **Run**. Once the agent is allocated, you'll start seeing the live logs of the build.\n",
|
||||
"\n",
|
||||
"#### Notification\n",
|
||||
"\n",
|
||||
"The summary and status of the build will be sent to the email registered (i.e. Azure login user). Login using the email registered at `www.office.com` to view the notification."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Azure Pipelines with YAML\n",
|
||||
"\n",
|
||||
"You can define your pipeline using a YAML file: `azure-pipelines.yml` alongside the rest of the code for your app. The big advantage of using YAML is that the pipeline is versioned with the code and follows the same branching structure. \n",
|
||||
"\n",
|
||||
"The basic steps include:\n",
|
||||
"\n",
|
||||
"1. Configure Azure Pipelines to use your Git repo.\n",
|
||||
"2. Edit your `azure-pipelines.yml` file to define your build.\n",
|
||||
"3. Push your code to your version control repository which kicks off the default trigger to build and deploy.\n",
|
||||
"4. Code is now updated, built, tested, and packaged. It can be deployed to any target.\n",
|
||||
"\n",
|
||||
"![Pipelines-Image-Yam](../images/pipelines-image-yaml.png)\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"Open the yml file in the repo to understand the build steps."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Creating test scripts\n",
|
||||
"\n",
|
||||
"In this workshop, multiple tests are included:\n",
|
||||
"\n",
|
||||
"1. A basic test script `code/testing/data_test.py` is provided to test the schema of the json data for prediction using sample data in `data_sample/predmain_bad_schema.csv`.\n",
|
||||
"\n",
|
||||
"2. `code/aci_service_test.py` and `code/aks_service_test.py` to test deployment using ACI and AKS respectively.\n",
|
||||
"\n",
|
||||
"#### Exercise\n",
|
||||
"\n",
|
||||
"- Can you either extend `code/testing/data_test.py` or create a new one to check for the feature types? \n",
|
||||
"\n",
|
||||
"- `code/aci_service_test.py` and `code/aks_service_test.py` scripts check if you are getting scores from the deployed service. Can you check if you are getting the desired scores by modifying the scripts?\n",
|
||||
"\n",
|
||||
"- Make sure `azure-pipelines.yml` captures the above changes"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"#### Build trigger (continuous deployment trigger)\n",
|
||||
"\n",
|
||||
"Along with the time triggers, we cann can also create a release every time a new build is available.\n",
|
||||
"\n",
|
||||
"1. Enable the *Continuous deployment trigger* and ensure *Enabled* is selected in the *Continuous deployment trigger* configuration as shown below:\n",
|
||||
"\n",
|
||||
"![Release Build Trigger](../images/releaseBuildTrigger.png)\n",
|
||||
"\n",
|
||||
"2. Populate the branch in *Build branch filters*. A release will be triggered only for a build that is from one of the branches populated. For example, selecting \"master\" will trigger a release for every build from the master branch.\n",
|
||||
"\n",
|
||||
"#### Approvals\n",
|
||||
"\n",
|
||||
"For the QC task, you will recieve an *Azure DevOps Notifaction* email to view approval. On selecting *View Approval*, you will be taken to the following page to approve/reject:\n",
|
||||
"\n",
|
||||
"![Pending Approval](../images/pendingApproval.png)\n",
|
||||
"\n",
|
||||
"There is also provision to include comments with approval/reject:\n",
|
||||
"\n",
|
||||
"![Approval Comments](../images/approvalComments.png)\n",
|
||||
"\n",
|
||||
"Once the post-deployment approvals are approved by the users chosen, the pipeline will be listed with a green tick next to QC under the list of release pipelines: \n",
|
||||
"\n",
|
||||
"![Release Passed](../images/releasePassed.png)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Application Insights (optional)\n",
|
||||
"\n",
|
||||
"For your convenience, Azure Application Insights is automatically added when you create the Azure Machine Learning workspace. In this section, we will look at how we can investigate the predictions from the service created using `Analytics`. Analytics is the powerful search and query tool of Application Insights. Analytics is a web tool so no setup is required.\n",
|
||||
"\n",
|
||||
"Run the below script (after replacing `<scoring_url>` and `<key>`) locally to obtain the predictions. You can also change `input_j` to obtain different predictions.\n",
|
||||
"\n",
|
||||
"```python\n",
|
||||
"import requests\n",
|
||||
"import json\n",
|
||||
"\n",
|
||||
"input_j = [[1.92168882e+02, 5.82427351e+02, 2.09748253e+02, 4.32529303e+01, 1.52377597e+01, 5.37307613e+01, 1.15729573e+01, 4.27624778e+00, 1.68042813e+02, 4.61654301e+02, 1.03138200e+02, 4.08555785e+01, 1.80809993e+01, 4.85402042e+01, 1.09373285e+01, 4.18269355e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 3.07200000e+03, 5.64000000e+02, 2.22900000e+03, 9.84000000e+02, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 3.03000000e+02, 6.63000000e+02, 3.18300000e+03, 3.03000000e+02, 5.34300000e+03, 4.26300000e+03, 6.88200000e+03, 1.02300000e+03, 1.80000000e+01]]\n",
|
||||
"\n",
|
||||
"data = json.dumps({'data': input_j})\n",
|
||||
"test_sample = bytes(data, encoding = 'utf8')\n",
|
||||
"\n",
|
||||
"url = '<scoring_url>'\n",
|
||||
"api_key = '<key>' \n",
|
||||
"headers = {'Content-Type':'application/json', 'Authorization':('Bearer '+ api_key)}\n",
|
||||
"\n",
|
||||
"resp = requests.post(url, test_sample, headers=headers)\n",
|
||||
"print(resp.text)\n",
|
||||
"\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"1. From the Machine Learning Workspace in the portal, Select `Application Insights` in the overview tab:\n",
|
||||
"\n",
|
||||
"![ML Workspace](../images/mlworkspace.png)\n",
|
||||
"\n",
|
||||
"2. Select Analytics.\n",
|
||||
"\n",
|
||||
"3. The predictions will be logged which can be queried in the Log Analytics page in the Azure portal as shown below. For example, to query `requests`, run the following query:\n",
|
||||
"\n",
|
||||
"````\n",
|
||||
" requests\n",
|
||||
" | where timestamp > ago(3h)\n",
|
||||
"````\n",
|
||||
"\n",
|
||||
"![LogAnalytics Query](../images/logAnalyticsQuery.png)"
|
||||
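    "\n",
    "You can refine such queries further. For example, this illustrative variant groups recent requests by result code:\n",
    "\n",
    "````\n",
    "    requests\n",
    "    | where timestamp > ago(3h)\n",
    "    | summarize count() by resultCode\n",
    "````"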
]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Data Changes\n",
    "\n",
    "A data scientist may want to trigger the pipeline when new data is available. To illustrate this, a small incremental data set is provided in `data_sample/telemetry_incremental.csv`, which is picked up in the following snippet of `anom_detect.py`:\n",
    "\n",
    "````python\n",
    "    print(\"Adding incremental data...\")\n",
    "    telemetry_incremental = pd.read_csv(os.path.join('data_sample/', 'telemetry_incremental.csv'))\n",
    "    telemetry = telemetry.append(telemetry_incremental, ignore_index=True)\n",
    "````\n",
    "\n",
    "The data change affects the model evaluation; if the newly trained model is better than the baseline model, it is propagated for deployment."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
@ -0,0 +1,19 @@
# Introduction

In this course, we will implement a Continuous Integration (CI)/Continuous Delivery (CD) pipeline for Anomaly Detection and Predictive Maintenance applications. When developing an AI application, there are frequently two streams of work:

1. Data scientists building machine learning models
2. App developers building the application and exposing it to end users to consume

In short, the pipeline is designed to kick off for each new commit and run the test suite; if the tests pass, it takes the latest build, packages it in a Docker container, and deploys the container to create a scoring service, as shown below.

![Architecture](images/architecture.png)

## Modules Covered

The goal of this course is to cover the following modules:

* Introduction to CI/CD
* Create a CI/CD pipeline using Azure
* Customize a CI/CD pipeline using Azure
* Learn how to develop a Machine Learning pipeline to update models and create a service
@ -0,0 +1,71 @@
pool:
  vmImage: 'Ubuntu 16.04'
steps:
- task: UsePythonVersion@0
  inputs:
    versionSpec: 3.5
    architecture: 'x64'

- task: DownloadSecureFile@1
  inputs:
    name: configFile
    secureFile: config.json
- script: echo "Printing the secure file path"
- script: cp $(Agent.TempDirectory)/config.json $(Build.SourcesDirectory)

- task: CondaEnvironment@1
  displayName: 'Create Conda Environment'
  inputs:
    createCustomEnvironment: true
    environmentName: azuremlsdk
    packageSpecs: 'python=3.6'
    updateConda: false
    createOptions: 'cython==0.29 urllib3<1.24'
- script: |
    pip install --user azureml-sdk pandas
  displayName: 'Install prerequisites'

- task: AzureCLI@1
  displayName: 'Azure CLI CICD/code/pipeline.py'
  inputs:
    azureSubscription: 'serviceConnection'
    scriptLocation: inlineScript
    inlineScript: 'python CICD/code/pipeline.py'

- task: AzureCLI@1
  displayName: 'Azure CLI CICD/code/evaluate_model.py'
  inputs:
    azureSubscription: 'serviceConnection'
    scriptLocation: inlineScript
    inlineScript: 'python CICD/code/evaluate_model.py'

- task: AzureCLI@1
  displayName: 'Azure CLI CICD/code/register_model.py'
  inputs:
    azureSubscription: 'serviceConnection'
    scriptLocation: inlineScript
    inlineScript: 'python CICD/code/register_model.py'

- task: AzureCLI@1
  displayName: 'Azure CLI CICD/code/create_scoring_image.py'
  inputs:
    azureSubscription: 'serviceConnection'
    scriptLocation: inlineScript
    inlineScript: 'python CICD/code/create_scoring_image.py'

- task: AzureCLI@1
  displayName: 'Azure CLI CICD/code/deploy_aci.py'
  inputs:
    azureSubscription: 'serviceConnection'
    scriptLocation: inlineScript
    inlineScript: 'python CICD/code/deploy_aci.py'

- task: AzureCLI@1
  displayName: 'Azure CLI CICD/code/aci_service_test.py'
  inputs:
    azureSubscription: 'serviceConnection'
    scriptLocation: inlineScript
    inlineScript: 'python CICD/code/aci_service_test.py'
- script: |
    python CICD/code/testing/data_test.py CICD/data_sample/predmain_bad_schema.csv
  displayName: 'Test Schema'
@ -0,0 +1,7 @@
.ipynb_checkpoints
azureml-logs
.azureml
.git
outputs
azureml-setup
docs
@ -0,0 +1,38 @@
import numpy
import os, json, datetime, sys
from operator import attrgetter
from azureml.core import Workspace
from azureml.core.model import Model
from azureml.core.image import Image
from azureml.core.webservice import Webservice
from azureml.core.webservice import AciWebservice

# Get workspace
ws = Workspace.from_config()

# Get the ACI details
try:
    with open("aml_config/aci_webservice.json") as f:
        config = json.load(f)
except:
    print('No new model, thus no deployment on ACI')
    # raise Exception('No new model to register as production model performs better')
    sys.exit(0)

service_name = config['aci_name']
# Get the hosted web service
service = Webservice(name=service_name, workspace=ws)

# Input for model with all features
input_j = [[1.62168882e+02, 4.82427351e+02, 1.09748253e+02, 4.32529303e+01, 3.52377597e+01, 4.37307613e+01, 1.15729573e+01, 4.27624778e+00, 1.68042813e+02, 4.61654301e+02, 1.03138200e+02, 4.08555785e+01, 1.80809993e+01, 4.85402042e+01, 1.09373285e+01, 4.18269355e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 3.07200000e+03, 5.64000000e+02, 2.22900000e+03, 9.84000000e+02, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 3.03000000e+02, 6.63000000e+02, 3.18300000e+03, 3.03000000e+02, 5.34300000e+03, 4.26300000e+03, 6.88200000e+03, 1.02300000e+03, 1.80000000e+01]]
print(input_j)
test_sample = json.dumps({'data': input_j})
test_sample = bytes(test_sample, encoding='utf8')
try:
    prediction = service.run(input_data=test_sample)
    print(prediction)
except Exception as e:
    result = str(e)
    print(result)
    raise Exception('ACI service is not working as expected')
@ -0,0 +1,43 @@
import numpy
import os, json, datetime, sys
from operator import attrgetter
from azureml.core import Workspace
from azureml.core.model import Model
from azureml.core.image import Image
from azureml.core.webservice import Webservice


# Get workspace
ws = Workspace.from_config()

# Get the AKS details
os.chdir('./CICD')
try:
    with open("aml_config/aks_webservice.json") as f:
        config = json.load(f)
except:
    print('No new model, thus no deployment on AKS')
    # raise Exception('No new model to register as production model performs better')
    sys.exit(0)

service_name = config['aks_service_name']
# Get the hosted web service
service = Webservice(workspace=ws, name=service_name)

# Input for model with all features
input_j = [[1.62168882e+02, 4.82427351e+02, 1.09748253e+02, 4.32529303e+01, 3.52377597e+01, 4.37307613e+01, 1.15729573e+01, 4.27624778e+00, 1.68042813e+02, 4.61654301e+02, 1.03138200e+02, 4.08555785e+01, 1.80809993e+01, 4.85402042e+01, 1.09373285e+01, 4.18269355e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 3.07200000e+03, 5.64000000e+02, 2.22900000e+03, 9.84000000e+02, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 3.03000000e+02, 6.63000000e+02, 3.18300000e+03, 3.03000000e+02, 5.34300000e+03, 4.26300000e+03, 6.88200000e+03, 1.02300000e+03, 1.80000000e+01]]

print(input_j)
test_sample = json.dumps({'data': input_j})
test_sample = bytes(test_sample, encoding='utf8')
try:
    prediction = service.run(input_data=test_sample)
    print(prediction)
except Exception as e:
    result = str(e)
    print(result)
    raise Exception('AKS service is not working as expected')

# Delete the AKS service after the test
# service.delete()
@ -0,0 +1,115 @@
import argparse
import pickle
import pandas as pd
import os
from pyculiarity import detect_ts  # python port of Twitter AD lib
from pytictoc import TicToc  # so we can time our operations

def rolling_average(df, column, n=24):
    """
    Calculates the rolling average according to Welford's online algorithm (Donald Knuth's Art of Computer Programming, Vol 2, page 232, 3rd edition).
    https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Welford's_Online_algorithm

    This adds a column next to the column of interest, with the suffix '_<n>' on the column name.

    :param df: a dataframe with time series in columns
    :param column: name of the column of interest
    :param n: number of measurements to consider
    :return: a dataframe with 'datetime' and the rolling average of the column as 'value'
    """

    ra = [0] * df.shape[0]
    ra[0] = df[column].values[0]

    for r in range(1, df.shape[0]):
        curr_n = float(min(n, r))
        ra[r] = ra[r-1] + (df[column].values[r] - ra[r-1])/curr_n

    df = pd.DataFrame(data={'datetime': df['datetime'], 'value': ra})
    return df


def do_ad(df, alpha=0.005, max_anoms=0.1, only_last=None, longterm=False, e_value=False, direction='both'):
    """
    This method performs the actual anomaly detection. It expects a dataframe holding the time series of a single sensor.

    :param df: a dataframe with a timestamp column and one more column with telemetry data
    :param alpha: see the pyculiarity documentation for the meaning of these parameters
    :param max_anoms:
    :param only_last:
    :param longterm:
    :param e_value:
    :param direction:
    :return: the timestamps of the detected anomalies
    """
    results = detect_ts(df,
                        max_anoms=max_anoms,
                        alpha=alpha,
                        direction=direction,
                        e_value=e_value,
                        longterm=longterm,
                        only_last=only_last)

    return results['anoms']['timestamp'].values


parser = argparse.ArgumentParser("anom_detect")

parser.add_argument("--output_directory", type=str, help="output directory")
args = parser.parse_args()

print("output directory: %s" % args.output_directory)
os.makedirs(args.output_directory, exist_ok=True)

# public store of telemetry data
data_dir = 'https://sethmottstore.blob.core.windows.net/predmaint/'

print("Reading data ... ", end="")
telemetry = pd.read_csv(os.path.join(data_dir, 'telemetry.csv'))
print("Done.")

print("Adding incremental data...")
telemetry_incremental = pd.read_csv(os.path.join('CICD/data_sample/', 'telemetry_incremental.csv'))
telemetry = telemetry.append(telemetry_incremental, ignore_index=True)
print("Done.")

print("Parsing datetime...", end="")
telemetry['datetime'] = pd.to_datetime(telemetry['datetime'], format="%m/%d/%Y %I:%M:%S %p")
print("Done.")

window_size = 12  # how many measures to include in the rolling average
sensors = telemetry.columns[2:]  # sensors are stored in column 2 onwards
window_sizes = [window_size] * len(sensors)  # this can be changed to individual window sizes for each sensor
machine_ids = telemetry['machineID'].unique()

t = TicToc()
for machine_id in machine_ids[:1]:  # TODO: make sure to remove the [:1], which is only here to let us test this quickly
    df = telemetry.loc[telemetry.loc[:, 'machineID'] == machine_id, :]
    t.tic()
    print("Working on sensor: ")
    for s, sensor in enumerate(sensors):
        N = window_sizes[s]
        print("   %s " % sensor)

        df_ra = rolling_average(df, sensor, N)
        anoms_timestamps = do_ad(df_ra)

        df_anoms = pd.DataFrame(data={'datetime': anoms_timestamps, 'machineID': [machine_id] * len(anoms_timestamps), 'errorID': [sensor] * len(anoms_timestamps)})

        # if this is the first machine and sensor, we initialize a new dataframe
        if machine_id == 1 and s == 0:
            df_anoms_all = df_anoms
        else:  # otherwise we append the newly detected anomalies to the existing dataframe
            df_anoms_all = df_anoms_all.append(df_anoms, ignore_index=True)

    # store the output
    obj = {}
    obj["df_anoms"] = df_anoms_all

    out_file = os.path.join(args.output_directory, "anoms.pkl")
    with open(out_file, "wb") as fp:
        pickle.dump(obj, fp)

    t.toc("Processing machine %s took" % machine_id)
@ -0,0 +1,337 @@
import argparse
import json
import logging
import os
import pickle
import random
import urllib.request

import numpy as np
import pandas as pd
from sklearn import datasets
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.metrics import roc_auc_score
from sklearn.externals import joblib

import azureml.core
from azureml.core.experiment import Experiment
from azureml.core.workspace import Workspace
from azureml.core.run import Run
from azureml.train.automl import AutoMLConfig
from azureml.train.automl.run import AutoMLRun

from azureml.telemetry import set_diagnostics_collection


def download_data():
    os.makedirs('../data', exist_ok=True)
    container = 'https://sethmottstore.blob.core.windows.net/predmaint/'

    urllib.request.urlretrieve(container + 'telemetry.csv', filename='../data/telemetry.csv')
    urllib.request.urlretrieve(container + 'maintenance.csv', filename='../data/maintenance.csv')
    urllib.request.urlretrieve(container + 'machines.csv', filename='../data/machines.csv')
    urllib.request.urlretrieve(container + 'failures.csv', filename='../data/failures.csv')
    # we replace errors.csv with anoms.csv (results from running anomaly detection)
    # urllib.request.urlretrieve(container + 'errors.csv', filename='../data/errors.csv')
    urllib.request.urlretrieve(container + 'anoms.csv', filename='../data/anoms.csv')

    df_telemetry = pd.read_csv('../data/telemetry.csv', header=0)
    df_telemetry['datetime'] = pd.to_datetime(df_telemetry['datetime'], format="%m/%d/%Y %I:%M:%S %p")
    df_errors = pd.read_csv('../data/anoms.csv', header=0)
    df_errors['datetime'] = pd.to_datetime(df_errors['datetime'])
    rep_dir = {"volt": "error1", "rotate": "error2", "pressure": "error3", "vibration": "error4"}
    df_errors = df_errors.replace({"errorID": rep_dir})
    df_subset = df_errors.loc[(df_errors.datetime.between('2015-01-01', '2016-01-01')) & (df_errors.machineID == 1)]
    df_subset.head()
    df_fails = pd.read_csv('../data/failures.csv', header=0)
    df_fails['datetime'] = pd.to_datetime(df_fails['datetime'], format="%m/%d/%Y %I:%M:%S %p")
    df_maint = pd.read_csv('../data/maintenance.csv', header=0)
    df_maint['datetime'] = pd.to_datetime(df_maint['datetime'], format="%m/%d/%Y %I:%M:%S %p")
    df_machines = pd.read_csv('../data/machines.csv', header=0)
    df_errors['errorID'] = df_errors['errorID'].apply(lambda x: int(x[-1]))
    df_maint['comp'] = df_maint['comp'].apply(lambda x: int(x[-1]))
    df_fails['failure'] = df_fails['failure'].apply(lambda x: int(x[-1]))

    return df_telemetry, df_errors, df_subset, df_fails, df_maint, df_machines


def get_datetime_diffs(df_left, df_right, catvar, prefix, window, on, lagon=None, diff_type='timedelta64[h]', validate='one_to_one', show_example=True):
    keys = ['machineID', 'datetime']
    df_dummies = pd.get_dummies(df_right[catvar], prefix=prefix)
    df_wide = pd.concat([df_right.loc[:, keys], df_dummies], axis=1)
    df_wide = df_wide.groupby(keys).sum().reset_index()
    df = df_left.merge(df_wide, how="left", on=keys, validate=validate).fillna(0)
    # run a rolling window through the event flags to aggregate data
    dummy_col_names = df_dummies.columns
    df = df.groupby('machineID').rolling(window=window, on=lagon)[dummy_col_names].max()
    df.reset_index(inplace=True)
    df = df.loc[df.index % on == on-1]
    df.reset_index(inplace=True, drop=True)
    df_first = df.groupby('machineID', as_index=False).nth(0)
    # calculate the time of the last event and the time elapsed since
    for col in dummy_col_names:
        whenlast, diffcol = 'last_' + col, 'd' + col
        df.loc[:, col].fillna(value=0, inplace=True)
        # let's assume an event happened in row 0, so we don't have missing values for the time elapsed
        df.iloc[df_first.index, df.columns.get_loc(col)] = 1
        df.loc[df[col] == 1, whenlast] = df.loc[df[col] == 1, 'datetime']
        # for the first occurrence we don't know when it last happened, so we assume it happened then
        df.iloc[df_first.index, df.columns.get_loc(whenlast)] = df.iloc[df_first.index, df.columns.get_loc('datetime')]
        df[whenlast].fillna(method='ffill', inplace=True)
        # df.loc[df[whenlast] > df['datetime'], whenlast] = np.nan
        df.loc[df[whenlast] <= df['datetime'], diffcol] = (df['datetime'] - df[whenlast]).astype(diff_type)
        df.drop(columns=whenlast, inplace=True)
    if show_example == True:
        col = np.random.choice(dummy_col_names, size=1)[0]
        idx = np.random.choice(df.loc[df[col] == 1, :].index.tolist(), size=1)[0]
        print('Example:\n')
        print(df.loc[df.index.isin(range(idx-3, idx+5)), ['datetime', col, 'd' + col]])
    return df


def get_rolling_aggregates(df, colnames, suffixes, window, on, groupby, lagon=None):
    """
    Calculates rolling averages and standard deviations.

    Arguments:
    df -- dataframe to run it on
    colnames -- names of columns we want rolling statistics for
    suffixes -- suffixes attached to the new columns (provide a list with strings)
    window -- the lag over which rolling statistics are calculated
    on -- the interval at which rolling statistics are calculated
    groupby -- the column used to group results by
    lagon -- the name of the datetime column used to compute lags (if none specified it defaults to row number)

    Returns:
    a dataframe with rolling statistics over a specified lag calculated over a specified interval
    """

    rolling_colnames = [c + suffixes[0] for c in colnames]
    df_rolling_mean = df.groupby(groupby).rolling(window=window, on=lagon)[colnames].mean()
    df_rolling_mean.columns = rolling_colnames
    df_rolling_mean.reset_index(inplace=True)

    rolling_colnames = [c + suffixes[1] for c in colnames]
    df_rolling_sd = df.groupby(groupby).rolling(window=window, on=lagon)[colnames].var()
    df_rolling_sd.columns = rolling_colnames
    df_rolling_sd = df_rolling_sd.apply(np.sqrt)
    df_rolling_sd.reset_index(inplace=True, drop=True)

    df_res = pd.concat([df_rolling_mean, df_rolling_sd], axis=1)
    df_res = df_res.loc[df_res.index % on == on-1]
    return df_res


parser = argparse.ArgumentParser("automl_train")

parser.add_argument("--input_directory", type=str, help="input directory")
args = parser.parse_args()

print("input directory: %s" % args.input_directory)

run = Run.get_context()

ws = run.experiment.workspace
def_data_store = ws.get_default_datastore()

# Choose a name for the experiment and specify the project folder.
experiment_name = 'automl-local-classification'
project_folder = '.'
experiment = Experiment(ws, experiment_name)
print("Location:", ws.location)
output = {}
output['SDK version'] = azureml.core.VERSION
output['Subscription ID'] = ws.subscription_id
output['Workspace'] = ws.name
output['Resource Group'] = ws.resource_group
output['Location'] = ws.location
output['Project Directory'] = project_folder
output['Experiment Name'] = experiment.name
pd.set_option('display.max_colwidth', -1)
pd.DataFrame(data=output, index=['']).T

set_diagnostics_collection(send_diagnostics=True)

print("SDK Version:", azureml.core.VERSION)

df_telemetry, df_errors, df_subset, df_fails, df_maint, df_machines = download_data()

with open(os.path.join(args.input_directory, "anoms.pkl"), "rb") as fp:
    obj = pickle.load(fp)

df_errors = obj['df_anoms']
rep_dir = {"volt": "error1", "rotate": "error2", "pressure": "error3", "vibration": "error4"}
df_errors = df_errors.replace({"errorID": rep_dir})
df_errors['errorID'] = df_errors['errorID'].apply(lambda x: int(x[-1]))

df_join = pd.merge(left=df_maint, right=df_fails.rename(columns={'failure': 'comp'}), how='outer', indicator=True,
                   on=['datetime', 'machineID', 'comp'], validate='one_to_one')
df_join.head()

df_left = df_telemetry.loc[:, ['datetime', 'machineID']]  # we set this table aside to join all our results with

# this will make it easier to automatically create features with the right column names
# df_errors['errorID'] = df_errors['errorID'].apply(lambda x: int(x[-1]))
# df_maint['comp'] = df_maint['comp'].apply(lambda x: int(x[-1]))
# df_fails['failure'] = df_fails['failure'].apply(lambda x: int(x[-1]))

cols_to_average = df_telemetry.columns[-4:]

df_telemetry_rolling_3h = get_rolling_aggregates(df_telemetry, cols_to_average,
                                                 suffixes=['_ma_3', '_sd_3'],
                                                 window=3, on=3,
                                                 groupby='machineID', lagon='datetime')

df_telemetry_rolling_12h = get_rolling_aggregates(df_telemetry, cols_to_average,
                                                  suffixes=['_ma_12', '_sd_12'],
                                                  window=12, on=3,
                                                  groupby='machineID', lagon='datetime')

df_telemetry_rolling = pd.concat([df_telemetry_rolling_3h, df_telemetry_rolling_12h.drop(['machineID', 'datetime'], axis=1)], axis=1)

df_telemetry_feat_roll = df_left.merge(df_telemetry_rolling, how="inner", on=['machineID', 'datetime'], validate="one_to_one")
df_telemetry_feat_roll.fillna(method='bfill', inplace=True)
df_telemetry_feat_roll.head()

del df_telemetry_rolling, df_telemetry_rolling_3h, df_telemetry_rolling_12h
df_errors_feat_roll = get_datetime_diffs(df_left, df_errors, catvar='errorID', prefix='e', window=6, lagon='datetime', on=3)
df_errors_feat_roll.tail()

df_errors_feat_roll.loc[df_errors_feat_roll['machineID'] == 2, :].head()

df_maint_feat_roll = get_datetime_diffs(df_left, df_maint, catvar='comp', prefix='m',
                                        window=6, lagon='datetime', on=3, show_example=False)
df_maint_feat_roll.tail()

df_maint_feat_roll.loc[df_maint_feat_roll['machineID'] == 2, :].head()

df_fails_feat_roll = get_datetime_diffs(df_left, df_fails, catvar='failure', prefix='f',
                                        window=6, lagon='datetime', on=3, show_example=False)
df_fails_feat_roll.tail()

assert(df_errors_feat_roll.shape[0] == df_fails_feat_roll.shape[0] == df_maint_feat_roll.shape[0] == df_telemetry_feat_roll.shape[0])
df_all = pd.concat([df_telemetry_feat_roll,
                    df_errors_feat_roll.drop(columns=['machineID', 'datetime']),
                    df_maint_feat_roll.drop(columns=['machineID', 'datetime']),
                    df_fails_feat_roll.drop(columns=['machineID', 'datetime'])], axis=1, verify_integrity=True)

# df_all = pd.merge(left=df_telemetry_feat_roll, right=df_all, on=['machineID', 'datetime'], validate='one_to_one')
df_all = pd.merge(left=df_all, right=df_machines, how="left", on='machineID', validate='many_to_one')
del df_join, df_left
del df_telemetry_feat_roll, df_errors_feat_roll, df_fails_feat_roll, df_maint_feat_roll

for i in range(1, 5):  # iterate over the four components
    # find all the times a component failed for a given machine
    df_temp = df_all.loc[df_all['f_' + str(i)] == 1, ['machineID', 'datetime']]
    label = 'y_' + str(i)  # name of target column (one per component)
    df_all[label] = 0
    for n in range(df_temp.shape[0]):  # iterate over all the failure times
        machineID, datetime = df_temp.iloc[n, :]
        dt_end = datetime - pd.Timedelta('3 hours')  # 3 hours prior to failure
        dt_start = datetime - pd.Timedelta('2 days')  # 2 days prior to failure
        if n % 500 == 0:
            print("a failure occurred on machine {0} at {1}, so {2} is set to 1 between {3} and {4}".format(machineID, datetime, label, dt_start, dt_end))
        df_all.loc[(df_all['machineID'] == machineID) &
                   (df_all['datetime'].between(dt_start, dt_end)), label] = 1

df_all.columns

X_drop = ['datetime', 'machineID', 'f_1', 'f_2', 'f_3', 'f_4', 'y_1', 'y_2', 'y_3', 'y_4', 'model']
Y_keep = ['y_1', 'y_2', 'y_3', 'y_4']

X_train = df_all.loc[df_all['datetime'] < '2015-10-01', ].drop(X_drop, axis=1)
y_train = df_all.loc[df_all['datetime'] < '2015-10-01', Y_keep]

X_test = df_all.loc[df_all['datetime'] > '2015-10-15', ].drop(X_drop, axis=1)
y_test = df_all.loc[df_all['datetime'] > '2015-10-15', Y_keep]


primary_metric = 'AUC_weighted'

automl_config = AutoMLConfig(task='classification',
                             preprocess=False,
                             name=experiment_name,
                             debug_log='automl_errors.log',
                             primary_metric=primary_metric,
                             max_time_sec=1200,
                             iterations=2,
                             n_cross_validations=2,
                             verbosity=logging.INFO,
                             X=X_train.values,  # we convert from pandas to numpy arrays using .values
                             y=y_train.values[:, 0],  # we convert from pandas to numpy arrays using .values
                             path=project_folder)

local_run = experiment.submit(automl_config, show_output=True)

# Wait until the run finishes.
local_run.wait_for_completion(show_output=True)

# create a new AutoMLRun object to ensure everything is in order
ml_run = AutoMLRun(experiment=experiment, run_id=local_run.id)

# aux function for comparing the performance of runs (quick workaround for automl's _get_max_min_comparator)
def maximize(x, y):
    if x >= y:
        return x
    else:
        return y

# the next couple of lines are a stripped-down version of automl's get_output
children = list(ml_run.get_children())

best_run = None  # will be the child run with the best performance
best_score = None  # performance of that child run

for child in children:
    candidate_score = child.get_metrics()[primary_metric]
    if not np.isnan(candidate_score):
        if best_score is None:
            best_score = candidate_score
            best_run = child
        else:
            new_score = maximize(best_score, candidate_score)
            if new_score != best_score:
                best_score = new_score
                best_run = child

# print accuracy
best_accuracy = best_run.get_metrics()['accuracy']
print("Best run accuracy:", best_accuracy)

# download the model and save it to pkl
model_path = "outputs/model.pkl"
best_run.download_file(name=model_path, output_file_path=model_path)

# Writing the run id to /aml_config/run_id.json
run_id = {}
run_id['run_id'] = best_run.id
run_id['experiment_name'] = best_run.experiment.name

# save run info
os.makedirs('aml_config', exist_ok=True)
with open('aml_config/run_id.json', 'w') as outfile:
    json.dump(run_id, outfile)

# upload run info and model (pkl) to def_data_store, so that the pipeline master can access it
def_data_store.upload(src_dir='aml_config', target_path='aml_config', overwrite=True)

def_data_store.upload(src_dir='outputs', target_path='outputs', overwrite=True)
@ -0,0 +1,57 @@
import os, json, sys
from azureml.core import Workspace
from azureml.core.image import ContainerImage, Image
from azureml.core.model import Model

# Get workspace
ws = Workspace.from_config()

# Get the latest model details

try:
    with open("aml_config/model.json") as f:
        config = json.load(f)
except:
    print('No new model to register, thus no need to create a new scoring image')
    # raise Exception('No new model to register as production model performs better')
    sys.exit(0)

model_name = config['model_name']
model_version = config['model_version']


model_list = Model.list(workspace=ws)
model, = (m for m in model_list if m.version == model_version and m.name == model_name)
print('Model picked: {} \nModel Description: {} \nModel Version: {}'.format(model.name, model.description, model.version))

os.chdir('./CICD/code/scoring')
image_name = "predmaintenance-model-score"

image_config = ContainerImage.image_configuration(execution_script="score.py",
                                                  runtime="python-slim",
                                                  conda_file="conda_dependencies.yml",
                                                  description="Image with predictive maintenance model",
                                                  tags={'area': "pred-maintenance", 'type': "automl"})

image = Image.create(name=image_name,
                     models=[model],
                     image_config=image_config,
                     workspace=ws)

image.wait_for_creation(show_output=True)
os.chdir('../../../')

if image.creation_state != 'Succeeded':
    raise Exception('Image creation status: {}'.format(image.creation_state))

print('{}(v.{} [{}]) stored at {} with build log {}'.format(image.name, image.version, image.creation_state, image.image_location, image.image_build_log_uri))

# Writing the image details to /aml_config/image.json
image_json = {}
image_json['image_name'] = image.name
image_json['image_version'] = image.version
image_json['image_location'] = image.image_location
with open('aml_config/image.json', 'w') as outfile:
    json.dump(image_json, outfile)
@ -0,0 +1,51 @@
import os, json, datetime, sys
from operator import attrgetter
from azureml.core import Workspace
from azureml.core.model import Model
from azureml.core.image import Image
from azureml.core.webservice import Webservice
from azureml.core.webservice import AciWebservice

# Get workspace
ws = Workspace.from_config()

# Get the details of the image to deploy
try:
    with open("aml_config/image.json") as f:
        config = json.load(f)
except:
    print('No new model, thus no deployment on ACI')
    # raise Exception('No new model to register as production model performs better')
    sys.exit(0)


image_name = config['image_name']
image_version = config['image_version']

images = Image.list(workspace=ws)
image, = (m for m in images if m.version == image_version and m.name == image_name)
print('From image.json, Image used to deploy webservice on ACI: {}\nImage Version: {}\nImage Location = {}'.format(image.name, image.version, image.image_location))

aciconfig = AciWebservice.deploy_configuration(cpu_cores=1,
                                               memory_gb=1,
                                               tags={'area': "pred-maintenance", 'type': "automl"},
                                               description='A sample description')

aci_service_name = 'aciwebservice' + datetime.datetime.now().strftime('%m%d%H')

service = Webservice.deploy_from_image(deployment_config=aciconfig,
                                       image=image,
                                       name=aci_service_name,
                                       workspace=ws)

service.wait_for_deployment()
print('Deployed ACI Webservice: {} \nWebservice Uri: {}'.format(service.name, service.scoring_uri))

# service = Webservice(name='aciws0622', workspace=ws)
# Writing the ACI details to /aml_config/aci_webservice.json
aci_webservice = {}
aci_webservice['aci_name'] = service.name
aci_webservice['aci_url'] = service.scoring_uri
with open('aml_config/aci_webservice.json', 'w') as outfile:
    json.dump(aci_webservice, outfile)
@ -0,0 +1,76 @@
import os, json, datetime, sys
from operator import attrgetter
from azureml.core import Workspace
from azureml.core.model import Model
from azureml.core.image import Image
from azureml.core.compute import AksCompute, ComputeTarget
from azureml.core.webservice import Webservice, AksWebservice

# Get workspace
ws = Workspace.from_config()

# Get the details of the image to deploy
try:
    with open("aml_config/image.json") as f:
        config = json.load(f)
except:
    print('No new model, thus no deployment on AKS')
    # raise Exception('No new model to register as production model performs better')
    sys.exit(0)

image_name = config['image_name']
image_version = config['image_version']

images = Image.list(workspace=ws)
image, = (m for m in images if m.version == image_version and m.name == image_name)
print('From image.json, Image used to deploy webservice: {}\nImage Version: {}\nImage Location = {}'.format(image.name, image.version, image.image_location))

# Check if an AKS cluster is already available
try:
    with open("aml_config/aks_webservice.json") as f:
        config = json.load(f)
    aks_name = config['aks_name']
    aks_service_name = config['aks_service_name']
    compute_list = ws.compute_targets()
    aks_target, = (c for c in compute_list if c.name == aks_name)
    service = Webservice(name=aks_service_name, workspace=ws)
    print('Updating AKS service {} with image: {}'.format(aks_service_name, image.image_location))
    service.update(image=image)
except:
    aks_name = 'aks' + datetime.datetime.now().strftime('%m%d%H')
    aks_service_name = 'akswebservice' + datetime.datetime.now().strftime('%m%d%H')
    prov_config = AksCompute.provisioning_configuration(agent_count=6, vm_size='Standard_F2', location='eastus')
    print('No AKS found in aks_webservice.json. Creating new AKS: {} and AKS Webservice: {}'.format(aks_name, aks_service_name))
    # Create the cluster
    aks_target = ComputeTarget.create(workspace=ws,
                                      name=aks_name,
                                      provisioning_configuration=prov_config)

    aks_target.wait_for_completion(show_output=True)
    print(aks_target.provisioning_state)
    print(aks_target.provisioning_errors)

    # Use the default configuration (can also provide parameters to customize)
    aks_config = AksWebservice.deploy_configuration(enable_app_insights=True)

    service = Webservice.deploy_from_image(workspace=ws,
                                           name=aks_service_name,
                                           image=image,
                                           deployment_config=aks_config,
                                           deployment_target=aks_target)

    service.wait_for_deployment(show_output=True)
    print(service.state)
print('Deployed AKS Webservice: {} \nWebservice Uri: {}'.format(service.name, service.scoring_uri))


# Writing the AKS details to /aml_config/aks_webservice.json
aks_webservice = {}
aks_webservice['aks_name'] = aks_name
aks_webservice['aks_service_name'] = service.name
aks_webservice['aks_url'] = service.scoring_uri
aks_webservice['aks_keys'] = service.get_keys()
with open('aml_config/aks_webservice.json', 'w') as outfile:
    json.dump(aks_webservice, outfile)
@ -0,0 +1,59 @@
import os, json
from azureml.core import Workspace
from azureml.core import Experiment
from azureml.core.model import Model
import azureml.core
from azureml.core import Run


# Get workspace
ws = Workspace.from_config()

# TODO: Parameterize the metrics on which the models should be compared

# TODO: Add a golden data set on which all model performance can be evaluated

# Get the latest run_id
with open("aml_config/run_id.json") as f:
    config = json.load(f)

new_model_run_id = config["run_id"]
experiment_name = config["experiment_name"]
exp = Experiment(workspace=ws, name=experiment_name)


try:
    # Get the most recently registered model; we assume that is the model in production.
    # Download this model and compare it with the recently trained model by running tests on the same data set.
    model_list = Model.list(ws)
    production_model = next(filter(lambda x: x.created_time == max(model.created_time for model in model_list), model_list))
    production_model_run_id = production_model.tags.get('run_id')
    run_list = exp.get_runs()
    # production_model_run = next(filter(lambda x: x.id == production_model_run_id, run_list))

    # Get the run history for both the production model and the newly trained model and compare accuracy
    production_model_run = Run(exp, run_id=production_model_run_id)
    new_model_run = Run(exp, run_id=new_model_run_id)

    production_model_metric = production_model_run.get_metrics().get('accuracy')
    new_model_metric = new_model_run.get_metrics().get('accuracy')
    print('Current Production model accuracy: {}, New trained model accuracy: {}'.format(production_model_metric, new_model_metric))

    promote_new_model = False
    # higher accuracy is better, so promote the new model only if it beats the production model
    if new_model_metric > production_model_metric:
        promote_new_model = True
        print('New trained model performs better, thus it will be registered')
except:
    promote_new_model = True
    print('This is the first model to be trained, thus nothing to evaluate for now')

run_id = {}
run_id['run_id'] = ''
# Writing the run id to /aml_config/run_id.json
if promote_new_model:
    run_id['run_id'] = new_model_run_id

run_id['experiment_name'] = experiment_name
with open('aml_config/run_id.json', 'w') as outfile:
    json.dump(run_id, outfile)
@ -0,0 +1,150 @@

############################### load required libraries

import os
import pandas as pd
import json

import azureml.core
from azureml.core import Workspace, Run, Experiment, Datastore
from azureml.core.compute import AmlCompute
from azureml.core.compute import ComputeTarget
from azureml.core.runconfig import CondaDependencies, RunConfiguration
from azureml.core.runconfig import DEFAULT_CPU_IMAGE
from azureml.telemetry import set_diagnostics_collection
from azureml.pipeline.steps import PythonScriptStep
from azureml.pipeline.core import Pipeline, PipelineData, StepSequence

print("SDK Version:", azureml.core.VERSION)

############################### load workspace and create experiment

ws = Workspace.from_config()
print('Workspace name: ' + ws.name,
      'Subscription id: ' + ws.subscription_id,
      'Resource group: ' + ws.resource_group, sep='\n')

experiment_name = 'aml-pipeline-cicd'  # choose a name for the experiment
project_folder = '.'  # project folder

experiment = Experiment(ws, experiment_name)
print("Location:", ws.location)
output = {}
output['SDK version'] = azureml.core.VERSION
output['Subscription ID'] = ws.subscription_id
output['Workspace'] = ws.name
output['Resource Group'] = ws.resource_group
output['Location'] = ws.location
output['Project Directory'] = project_folder
output['Experiment Name'] = experiment.name
pd.set_option('display.max_colwidth', -1)
pd.DataFrame(data=output, index=['']).T

set_diagnostics_collection(send_diagnostics=True)

############################### create a run config

cd = CondaDependencies.create(pip_packages=["azureml-sdk==1.0.17", "azureml-train-automl==1.0.17", "pyculiarity", "pytictoc", "cryptography==2.5", "pandas"])

amlcompute_run_config = RunConfiguration(framework="python", conda_dependencies=cd)
amlcompute_run_config.environment.docker.enabled = False
amlcompute_run_config.environment.docker.gpu_support = False
amlcompute_run_config.environment.docker.base_image = DEFAULT_CPU_IMAGE
amlcompute_run_config.environment.spark.precache_packages = False

############################### create AML compute

aml_compute_target = "aml-compute"
try:
    aml_compute = AmlCompute(ws, aml_compute_target)
    print("found existing compute target.")
except:
    print("creating new compute target")

    provisioning_config = AmlCompute.provisioning_configuration(vm_size="STANDARD_D2_V2",
                                                                idle_seconds_before_scaledown=1800,
                                                                min_nodes=0,
                                                                max_nodes=4)
    aml_compute = ComputeTarget.create(ws, aml_compute_target, provisioning_config)
    aml_compute.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)

print("Azure Machine Learning Compute attached")

############################### point to data and scripts

# we use this for exchanging data between pipeline steps
def_data_store = ws.get_default_datastore()

# get a pointer to the default blob store
def_blob_store = Datastore(ws, "workspaceblobstore")
print("Blobstore's name: {}".format(def_blob_store.name))

# Naming the intermediate data as anomaly data and assigning it to a variable
anomaly_data = PipelineData("anomaly_data", datastore=def_blob_store)
print("Anomaly data object created")

# model = PipelineData("model", datastore=def_data_store)
# print("Model data object created")

anom_detect = PythonScriptStep(name="anomaly_detection",
                               # script_name="anom_detect.py",
                               script_name="CICD/code/anom_detect.py",
                               arguments=["--output_directory", anomaly_data],
                               outputs=[anomaly_data],
                               compute_target=aml_compute,
                               source_directory=project_folder,
                               allow_reuse=True,
                               runconfig=amlcompute_run_config)
print("Anomaly Detection Step created.")

automl_step = PythonScriptStep(name="automl_step",
                               # script_name="automl_step.py",
                               script_name="CICD/code/automl_step.py",
                               arguments=["--input_directory", anomaly_data],
                               inputs=[anomaly_data],
                               # outputs=[model],
                               compute_target=aml_compute,
                               source_directory=project_folder,
                               allow_reuse=True,
                               runconfig=amlcompute_run_config)

print("AutoML Training Step created.")

############################### set up, validate and run pipeline

steps = [anom_detect, automl_step]
print("Step list created")

pipeline = Pipeline(workspace=ws, steps=steps)
print("Pipeline is built")

pipeline.validate()
print("Pipeline validation complete")

pipeline_run = experiment.submit(pipeline)  # , regenerate_outputs=True)
print("Pipeline is submitted for execution")

# Wait until the run finishes.
pipeline_run.wait_for_completion(show_output=False)
print("Pipeline run completed")

############################### upload artifacts to AML Workspace

# Download aml_config info and output of automl_step
def_data_store.download(target_path='.',
                        prefix='aml_config',
                        show_progress=True,
                        overwrite=True)

def_data_store.download(target_path='.',
                        prefix='outputs',
                        show_progress=True,
                        overwrite=True)
print("Updated aml_config and outputs folder")

model_fname = 'model.pkl'
model_path = os.path.join("outputs", model_fname)

# Upload the model file explicitly into artifacts (for CI/CD)
pipeline_run.upload_file(name=model_path, path_or_stream=model_path)
print('Uploaded the model {} to experiment {}'.format(model_fname, pipeline_run.experiment.name))
@ -0,0 +1,58 @@
import os, json, sys
from azureml.core import Workspace
from azureml.core import Run
from azureml.core import Experiment
from azureml.core.model import Model

from azureml.core.runconfig import RunConfiguration

# Get workspace
ws = Workspace.from_config()

# Get the latest evaluation result
try:
    with open("aml_config/run_id.json") as f:
        config = json.load(f)
    if not config["run_id"]:
        raise Exception('No new model to register, as the production model performs better')
except:
    print('No new model to register, as the production model performs better')
    # raise Exception('No new model to register, as the production model performs better')
    sys.exit(0)

run_id = config["run_id"]
experiment_name = config["experiment_name"]
exp = Experiment(workspace=ws, name=experiment_name)

run = Run(experiment=exp, run_id=run_id)
print(run.get_file_names())
print('Run ID for last run: {}'.format(run_id))
model_local_dir = "model"
os.makedirs(model_local_dir, exist_ok=True)

# Download the model to the project root directory
model_name = 'model.pkl'
run.download_file(name='./outputs/' + model_name,
                  output_file_path='./model/' + model_name)
print('Downloaded model {} to Project root directory'.format(model_name))
os.chdir('./model')
model = Model.register(model_path=model_name,  # this points to a local file
                       model_name=model_name,  # this is the name the model is registered as
                       tags={'area': "predictive maintenance", 'type': "automl", 'run_id': run_id},
                       description="Model for predictive maintenance dataset",
                       workspace=ws)
os.chdir('..')
print('Model registered: {} \nModel Description: {} \nModel Version: {}'.format(model.name, model.description, model.version))

# Remove the evaluate.json as we no longer need it
# os.remove("aml_config/evaluate.json")

# Writing the registered model details to /aml_config/model.json
model_json = {}
model_json['model_name'] = model.name
model_json['model_version'] = model.version
model_json['run_id'] = run_id
with open('aml_config/model.json', 'w') as outfile:
    json.dump(model_json, outfile)
@ -0,0 +1,13 @@
name: myenv
channels:
  - defaults
dependencies:
  - python=3.6.2
  - pip:
    - scikit-learn==0.19.1
    - azureml-sdk[automl]
    - azureml-monitoring
    - pyculiarity
    - scipy
    - numpy
    - pandas
@ -0,0 +1,301 @@
import datetime
import pandas as pd
from pyculiarity import detect_ts
import os
import pickle
import json
from sklearn.externals import joblib
from azureml.core.model import Model
import azureml.train.automl
from azureml.monitoring import ModelDataCollector
import time
import glob
import numpy as np
import scipy

def create_data_dict(data, sensors):
    """
    Builds a column dict from one row of data, adding placeholder fields for the rolling average and anomaly flag of each sensor.

    :param data: one-row dataframe of sensor readings
    :param sensors: names of the sensor columns
    :return: dict of single-element lists, keyed by column name
    """
    data_dict = {}
    for column in data.columns:
        data_dict[column] = [data[column].values[0]]
        if column in sensors:
            data_dict[column + '_avg'] = [0.0]
            data_dict[column + '_an'] = [False]

    return data_dict


def init_df():
    """
    Initializes an empty DataFrame.

    :return: an empty pd.DataFrame
    """

    # data_dict = create_data_dict(data)

    df = pd.DataFrame()  # data=data_dict, index=data_dict['timestamp'])

    return df


def append_data(df, data, sensors):
    """
    We either add the data and the results (res_dict) of the anomaly detection to the existing data frame,
    or create a new one if the data frame is empty
    """
    data_dict = create_data_dict(data, sensors)

    # todo: this is only necessary because currently the webservice doesn't get any timestamps
    if df.shape[0] == 0:
        prv_timestamp = datetime.datetime(2015, 1, 1, 5, 0)  # 1/1/2015 5:00:00 AM
    else:
        prv_timestamp = df['timestamp'].max()

    data_dict['timestamp'] = [prv_timestamp + datetime.timedelta(hours=1)]

    df = df.append(pd.DataFrame(data=data_dict, index=data_dict['timestamp']))

    return df


def generate_stream(telemetry, n=None):
    """
    n is the number of sensor readings we are simulating
    """

    if not n:
        n = telemetry.shape[0]

    machine_ids = [1]  # telemetry['machineID'].unique()
    timestamps = telemetry['timestamp'].unique()

    # sort test_data by timestamp
    # on every iteration, shuffle machine IDs
    # then loop over machine IDs

    # t = TicToc()
    for timestamp in timestamps:
        # t.tic()
        np.random.shuffle(machine_ids)
        for machine_id in machine_ids:
            data = telemetry.loc[(telemetry['timestamp'] == timestamp) & (telemetry['machineID'] == machine_id), :]
            run(data)
        # t.toc("Processing all machines took")


def load_df(data):
    machineID = data['machineID'].values[0]

    filename = os.path.join(storage_location, "data_w_anoms_ID_%03d.csv" % machineID)
    if os.path.exists(filename):
        df = pd.read_csv(filename)
        df['timestamp'] = pd.to_datetime(df['timestamp'], format="%Y-%m-%d %H:%M:%S")
    else:
        df = pd.DataFrame()

    return df


def save_df(df):
    """
    Saves the dataframe of a machine to a per-machine CSV file.

    :param df: dataframe with a 'machineID' column
    :return: None
    """
    machine_id = df['machineID'].iloc[0]

    filename = os.path.join(storage_location, "data_w_anoms_ID_%03d.csv" % machine_id)

    df.to_csv(filename, index=False)


def running_avgs(df, sensors, window_size=24, only_copy=False):
    """
    Calculates running averages according to Welford's online algorithm.
    https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Online

    This fills the '_avg' column next to each column of interest.

    :param df: a dataframe with time series in columns
    :param sensors: names of the sensor columns of interest
    :param window_size: number of measurements to consider
    :param only_copy: if True, copy the current value instead of averaging
    :return: None
    """

    curr_n = df.shape[0]
    row_index = curr_n - 1
    window_size = min(window_size, curr_n)

    for sensor in sensors:
        val_col_index = df.columns.get_loc(sensor)
        avg_col_index = df.columns.get_loc(sensor + "_avg")

        curr_value = df.iloc[row_index, val_col_index]

        if curr_n == 0 or only_copy:
            df.iloc[row_index, avg_col_index] = curr_value
        else:
            prv_avg = df.iloc[(row_index - 1), avg_col_index]
            df.iloc[row_index, avg_col_index] = prv_avg + (curr_value - prv_avg) / window_size


def init():
    global model
    global prediction_dc
    global storage_location

    storage_location = "/tmp/output"

    if not os.path.exists(storage_location):
        os.makedirs(storage_location)

    # next, we delete previous output files
    files = glob.glob(os.path.join(storage_location, '*'))

    for f in files:
        os.remove(f)

    model_name = "model.pkl"

    model_path = Model.get_model_path(model_name=model_name)
    # deserialize the model file back into a sklearn model
    model = joblib.load(model_path)
    prediction_dc = ModelDataCollector("automl_model", identifier="predictions", feature_names=["prediction"])


def run(rawdata, window=14 * 24):
    """
    Scores one sensor reading and flags anomalies over a sliding window of past readings.

    :param rawdata: JSON string with a 'data' field holding one row of telemetry values
    :param window: size of the sliding window of past readings
    :return: JSON string with the prediction result
    """

    try:
        # set some parameters for the AD algorithm
        alpha = 0.1
        max_anoms = 0.05
        only_last = None  # alternatively, we can set this to 'hr' or 'day'

        json_data = json.loads(rawdata)['data']

        # this is the beginning of the anomaly detection code
        # TODO: the anomaly detection service expected one row of a pd.DataFrame w/ a timestamp and machine id, but here we only get a list of values
        # we therefore create a time stamp ourselves
        # and create a data frame that the anomaly detection code can understand
        # eventually, we want this to be harmonized!
        timestamp = time.strftime("%m/%d/%Y %H:%M:%S", time.localtime())
        machineID = 1  # TODO scipy.random.choice(100)
        telemetry_data = json_data[0][8:16:2]
        sensors = ['volt', 'pressure', 'vibration', 'rotate']

        data_dict = {}
        data_dict['timestamp'] = [timestamp]
        data_dict['machineID'] = [machineID]

        for i in range(0, 4):
            data_dict[sensors[i]] = [telemetry_data[i]]

        telemetry_df = pd.DataFrame(data=data_dict)
        telemetry_df['timestamp'] = pd.to_datetime(telemetry_df['timestamp'])

        # load the dataframe
        df = load_df(telemetry_df)

        # add current sensor readings to the data frame; this also adds fields for the anomaly detection results
        df = append_data(df, telemetry_df, sensors)

        # # calculate running averages (no need to do this here, because we are already sending preprocessed data)
        # # TODO: this is disabled for now, because we are dealing with pre-processed data
        # running_avgs(df, sensors, only_copy=True)

        # note the timestamp so that we can update the correct row of the dataframe later
        timestamp = df['timestamp'].max()

        # we get a copy of the current (also last) row of the dataframe
        current_row = df.loc[df['timestamp'] == timestamp, :]

        # determine how many sensor readings we already have
        rows = df.shape[0]

        # if the data frame doesn't have enough rows for our sliding window size, we just return (flagging no anomalies)
        if rows < window:
            save_df(df)
            json_data = current_row.to_json()

            return json.dumps({"result": [0]})

        # determine the first row of the data frame that falls into the sliding window
        start_row = rows - window

        # a flag to indicate whether we detected an anomaly in any of the sensors after this reading
        detected_an_anomaly = False

        anom_list = []
        # we loop over the sensor columns
        for column in sensors:
            df_s = df.iloc[start_row:rows][['timestamp', column + "_avg"]]
|
||||
|
||||
# pyculiarity expects two columns with particular names
|
||||
df_s.columns = ['timestamp', 'value']
|
||||
|
||||
# we reset the timestamps, so that the current measurement is the last within the sliding time window
|
||||
# df_s = reset_time(df_s)
|
||||
|
||||
# calculate the median value within each time sliding window
|
||||
# values = df_s.groupby(df_s.index.date)['value'].median()
|
||||
|
||||
# create dataframe with median values etc.
|
||||
# df_agg = pd.DataFrame(data={'timestamp': pd.to_datetime(values.index), 'value': values})
|
||||
|
||||
# find anomalies
|
||||
results = detect_ts(df_s, max_anoms=max_anoms,
|
||||
alpha=alpha,
|
||||
direction='both',
|
||||
e_value=False,
|
||||
only_last=only_last)
|
||||
|
||||
# create a data frame where we mark for each day whether it was an anomaly
|
||||
df_s = df_s.merge(results['anoms'], on='timestamp', how='left')
|
||||
|
||||
# mark the current sensor reading as anomaly Specifically, if we get an anomaly in the the sliding window
|
||||
# leading up (including) the current sensor reading, we mark the current sensor reading as anomaly note,
|
||||
# alternatively one could mark all the sensor readings that fall within the sliding window as anomalies.
|
||||
# However, we prefer our approach, because without the current sensor reading the other sensor readings in
|
||||
# this sliding window may not have been an anomaly
|
||||
# current_row[column + '_an'] = not np.isnan(df_agg.tail(1)['anoms'].iloc[0])
|
||||
if not np.isnan(df_s.tail(1)['anoms'].iloc[0]):
|
||||
current_row.ix[0,column + '_an'] = True
|
||||
detected_an_anomaly = True
|
||||
anom_list.append(1.0)
|
||||
else:
|
||||
anom_list.append(0.0)
|
||||
|
||||
# It's only necessary to update the current row in the data frame, if we detected an anomaly
|
||||
if detected_an_anomaly:
|
||||
df.loc[df['timestamp'] == timestamp, :] = current_row
|
||||
save_df(df)
|
||||
|
||||
json_data[0][8:16:2] = anom_list
|
||||
|
||||
# # this is the end of anomaly detection code
|
||||
|
||||
data = np.array(json_data)
|
||||
result = model.predict(data)
|
||||
prediction_dc.collect(result)
|
||||
print ("saving prediction data" + time.strftime("%H:%M:%S"))
|
||||
except Exception as e:
|
||||
result = str(e)
|
||||
return json.dumps({"error": result})
|
||||
|
||||
return json.dumps({"result":result.tolist()})
|
|
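# Illustrative only: a minimal local smoke test for the scoring entry points above.
# Assumptions not taken from the source: init() is being called somewhere a registered
# model named "model.pkl" can be resolved (normally this happens inside the AML scoring
# container), and the payload mirrors the 37-feature rows used by the test scripts below.
import json

sample_row = [0.0] * 37                      # hypothetical feature vector
payload = json.dumps({"data": [sample_row]})

init()                                       # loads the model and prepares /tmp/output
print(run(payload))                          # '{"result": [0]}' until the sliding window fills up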
@ -0,0 +1,32 @@
# test integrity of the input data

import sys
import os
import numpy as np
import pandas as pd

# number of features
n_columns = 37

def check_schema(X):
    n_actual_columns = X.shape[1]
    if n_actual_columns != n_columns:
        print("Error: found {} feature columns. The data should have {} feature columns.".format(n_actual_columns, n_columns))
        return False
    return True

def main():
    filename = sys.argv[1]
    if not os.path.exists(filename):
        print("Error: The file {} does not exist".format(filename))
        return

    dataset = pd.read_csv(filename)
    if check_schema(dataset[dataset.columns[:-1]]):
        print("Data schema test succeeded")
    else:
        print("Data schema test failed")

if __name__ == "__main__":
    main()
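# Illustrative only: one way to wire the schema check above into a CI step. The script
# and CSV paths are assumptions based on the repo layout used elsewhere in this PR.
import subprocess

completed = subprocess.run(
    ["python", "data_test.py", "CICD/data_sample/telemetry_incremental.csv"],
    stdout=subprocess.PIPE, universal_newlines=True)
print(completed.stdout)                      # expect "Data schema test succeeded" on a healthy file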
@ -0,0 +1,7 @@
.ipynb_checkpoints
azureml-logs
.azureml
.git
outputs
azureml-setup
docs
@ -0,0 +1,38 @@
import numpy
import os, json, datetime, sys
from operator import attrgetter
from azureml.core import Workspace
from azureml.core.model import Model
from azureml.core.image import Image
from azureml.core.webservice import Webservice
from azureml.core.webservice import AciWebservice

# Get workspace
ws = Workspace.from_config()

# Get the ACI details
try:
    with open("aml_config/aci_webservice.json") as f:
        config = json.load(f)
except:
    print('No new model, thus no deployment on ACI')
    #raise Exception('No new model to register as production model performs better')
    sys.exit(0)

service_name = config['aci_name']
# Get the hosted web service
service = Webservice(name=service_name, workspace=ws)

# Input for model with all features
input_j = [[1.62168882e+02, 4.82427351e+02, 1.09748253e+02, 4.32529303e+01, 3.52377597e+01, 4.37307613e+01, 1.15729573e+01, 4.27624778e+00, 1.68042813e+02, 4.61654301e+02, 1.03138200e+02, 4.08555785e+01, 1.80809993e+01, 4.85402042e+01, 1.09373285e+01, 4.18269355e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 3.07200000e+03, 5.64000000e+02, 2.22900000e+03, 9.84000000e+02, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 3.03000000e+02, 6.63000000e+02, 3.18300000e+03, 3.03000000e+02, 5.34300000e+03, 4.26300000e+03, 6.88200000e+03, 1.02300000e+03, 1.80000000e+01]]
print(input_j)
test_sample = json.dumps({'data': input_j})
test_sample = bytes(test_sample, encoding='utf8')
try:
    prediction = service.run(input_data=test_sample)
    print(prediction)
except Exception as e:
    result = str(e)
    print(result)
    raise Exception('ACI service is not working as expected')
@ -0,0 +1,43 @@
import numpy
import os, json, datetime, sys
from operator import attrgetter
from azureml.core import Workspace
from azureml.core.model import Model
from azureml.core.image import Image
from azureml.core.webservice import Webservice


# Get workspace
ws = Workspace.from_config()

# Get the AKS details
os.chdir('./devops')
try:
    with open("aml_config/aks_webservice.json") as f:
        config = json.load(f)
except:
    print('No new model, thus no deployment on AKS')
    #raise Exception('No new model to register as production model performs better')
    sys.exit(0)

service_name = config['aks_service_name']
# Get the hosted web service
service = Webservice(workspace=ws, name=service_name)

# Input for model with all features
input_j = [[1.62168882e+02, 4.82427351e+02, 1.09748253e+02, 4.32529303e+01, 3.52377597e+01, 4.37307613e+01, 1.15729573e+01, 4.27624778e+00, 1.68042813e+02, 4.61654301e+02, 1.03138200e+02, 4.08555785e+01, 1.80809993e+01, 4.85402042e+01, 1.09373285e+01, 4.18269355e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 3.07200000e+03, 5.64000000e+02, 2.22900000e+03, 9.84000000e+02, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 3.03000000e+02, 6.63000000e+02, 3.18300000e+03, 3.03000000e+02, 5.34300000e+03, 4.26300000e+03, 6.88200000e+03, 1.02300000e+03, 1.80000000e+01]]

print(input_j)
test_sample = json.dumps({'data': input_j})
test_sample = bytes(test_sample, encoding='utf8')
try:
    prediction = service.run(input_data=test_sample)
    print(prediction)
except Exception as e:
    result = str(e)
    print(result)
    raise Exception('AKS service is not working as expected')

# Delete AKS after test
#service.delete()
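# Illustrative only: the same smoke test as above, but over raw HTTP instead of the SDK,
# reading the endpoint and keys from the aml_config/aks_webservice.json file used above
# (the 'aks_keys' field is written by the AKS deployment script further below). The
# Bearer-token header is the standard auth scheme for AML AKS endpoints; treat this as
# a sketch, not part of the source.
import json
import requests

with open("aml_config/aks_webservice.json") as f:
    cfg = json.load(f)

sample = {"data": [[0.0] * 37]}              # hypothetical feature vector
headers = {"Content-Type": "application/json",
           "Authorization": "Bearer " + cfg["aks_keys"][0]}
resp = requests.post(cfg["aks_url"], data=json.dumps(sample), headers=headers)
print(resp.status_code, resp.text)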
@ -0,0 +1,65 @@
import argparse
import pickle
import pandas as pd
import os
from pyculiarity import detect_ts  # python port of Twitter AD lib
from pytictoc import TicToc  # so we can time our operations
# note: rolling_average and do_ad are expected to be in scope from the shared helper module included in this PR

parser = argparse.ArgumentParser("anom_detect")

parser.add_argument("--output_directory", type=str, help="output directory")
args = parser.parse_args()

print("output directory: %s" % args.output_directory)
os.makedirs(args.output_directory, exist_ok=True)

# public store of telemetry data
data_dir = 'https://coursematerial.blob.core.windows.net/data/telemetry'

print("Reading data ... ", end="")
telemetry = pd.read_csv(os.path.join(data_dir, 'telemetry.csv'))
print("Done.")

print("Adding incremental data...")
telemetry_incremental = pd.read_csv(os.path.join('CICD/data_sample/', 'telemetry_incremental.csv'))
telemetry = telemetry.append(telemetry_incremental, ignore_index=True)
print("Done.")

print("Parsing datetime...", end="")
telemetry['datetime'] = pd.to_datetime(telemetry['datetime'], format="%m/%d/%Y %I:%M:%S %p")
print("Done.")

window_size = 12  # how many measures to include in the rolling average
sensors = telemetry.columns[2:]  # sensor readings are stored from column 2 on
window_sizes = [window_size] * len(sensors)  # this can be changed to have individual window sizes for each sensor
machine_ids = telemetry['machineID'].unique()

t = TicToc()
for machine_id in machine_ids[:1]:  # TODO: make sure to remove the [:1], this is just here to allow us to test this quickly
    df = telemetry.loc[telemetry.loc[:, 'machineID'] == machine_id, :]
    t.tic()
    print("Working on sensor: ")
    for s, sensor in enumerate(sensors):
        N = window_sizes[s]
        print("  %s " % sensor)

        df_ra = rolling_average(df, sensor, N)
        anoms_timestamps = do_ad(df_ra)

        df_anoms = pd.DataFrame(data={'datetime': anoms_timestamps, 'machineID': [machine_id] * len(anoms_timestamps), 'errorID': [sensor] * len(anoms_timestamps)})

        # if this is the first machine and sensor, we initialize a new dataframe
        if machine_id == 1 and s == 0:
            df_anoms_all = df_anoms
        else:  # otherwise we append the newly detected anomalies to the existing dataframe
            df_anoms_all = df_anoms_all.append(df_anoms, ignore_index=True)

    # store output
    obj = {}
    obj["df_anoms"] = df_anoms_all

    out_file = os.path.join(args.output_directory, "anoms.pkl")
    with open(out_file, "wb") as fp:
        pickle.dump(obj, fp)

    t.toc("Processing machine %s took" % machine_id)
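# Illustrative only: inspecting the pickled output of the step above. The local file
# name is an assumption; in the pipeline it lives in the step's --output_directory.
import pickle

with open("anoms.pkl", "rb") as fp:
    obj = pickle.load(fp)

df_anoms = obj["df_anoms"]                   # columns: datetime, machineID, errorID
print(df_anoms.head())
df_anoms.to_csv("anoms.csv", index=False)    # the CSV shape consumed by the AutoML step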
@ -0,0 +1,326 @@

############################### load required libraries

import argparse
import json
import logging
import numpy as np
import os
import pandas as pd
import pickle
import random
import urllib.request

from sklearn import datasets
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.metrics import roc_auc_score
from sklearn.externals import joblib

import azureml.core
from azureml.core.run import Run
from azureml.core.experiment import Experiment
from azureml.core.workspace import Workspace
from azureml.train.automl import AutoMLConfig
from azureml.train.automl.run import AutoMLRun
from azureml.telemetry import set_diagnostics_collection

############################### set up experiment

parser = argparse.ArgumentParser("automl_train")
parser.add_argument("--input_directory", default="data", type=str, help="input directory")
args = parser.parse_args()
print("input directory: %s" % args.input_directory)

run = Run.get_context()
ws = run.experiment.workspace

# Choose a name for the experiment and specify the project folder.
experiment_name = 'automl-local-classification'
project_folder = '.'
experiment = Experiment(ws, experiment_name)

output = {}
output['SDK version'] = azureml.core.VERSION
output['Subscription ID'] = ws.subscription_id
output['Workspace'] = ws.name
output['Resource Group'] = ws.resource_group
output['Location'] = ws.location
output['Project Directory'] = project_folder
output['Experiment Name'] = experiment.name
print("Run info:", output)

set_diagnostics_collection(send_diagnostics=True)

############################### define functions

def download_data():
    """
    download the anomaly detection and predictive maintenance data
    :return: all the data
    """
    os.makedirs('../data', exist_ok=True)
    container = 'https://coursematerial.blob.core.windows.net/data/telemetry/'

    urllib.request.urlretrieve(container + 'telemetry.csv', filename='../data/telemetry.csv')
    urllib.request.urlretrieve(container + 'maintenance.csv', filename='../data/maintenance.csv')
    urllib.request.urlretrieve(container + 'machines.csv', filename='../data/machines.csv')
    urllib.request.urlretrieve(container + 'failures.csv', filename='../data/failures.csv')
    # we replace errors.csv with anoms.csv (results from running anomaly detection)
    # urllib.request.urlretrieve(container + 'errors.csv', filename='../data/errors.csv')
    urllib.request.urlretrieve(container + 'anoms.csv', filename='../data/anoms.csv')

    df_telemetry = pd.read_csv('../data/telemetry.csv', header=0)
    df_errors = pd.read_csv('../data/anoms.csv', header=0)
    df_fails = pd.read_csv('../data/failures.csv', header=0)
    df_maint = pd.read_csv('../data/maintenance.csv', header=0)
    df_machines = pd.read_csv('../data/machines.csv', header=0)

    df_telemetry['datetime'] = pd.to_datetime(df_telemetry['datetime'], format="%m/%d/%Y %I:%M:%S %p")

    df_errors['datetime'] = pd.to_datetime(df_errors['datetime'])
    rep_dir = {"volt": "error1", "rotate": "error2", "pressure": "error3", "vibration": "error4"}
    df_errors = df_errors.replace({"errorID": rep_dir})
    df_errors['errorID'] = df_errors['errorID'].apply(lambda x: int(x[-1]))

    df_fails['datetime'] = pd.to_datetime(df_fails['datetime'], format="%m/%d/%Y %I:%M:%S %p")
    df_fails['failure'] = df_fails['failure'].apply(lambda x: int(x[-1]))

    df_maint['datetime'] = pd.to_datetime(df_maint['datetime'], format="%m/%d/%Y %I:%M:%S %p")
    df_maint['comp'] = df_maint['comp'].apply(lambda x: int(x[-1]))

    return df_telemetry, df_errors, df_fails, df_maint, df_machines


def get_rolling_aggregates(df, colnames, suffixes, window, on, groupby, lagon=None):
    """
    calculates rolling averages and standard deviations

    :param df: dataframe to run it on
    :param colnames: names of columns we want rolling statistics for
    :param suffixes: suffixes attached to the new columns (provide a list with strings)
    :param window: the lag over which rolling statistics are calculated
    :param on: the interval at which rolling statistics are calculated
    :param groupby: the column used to group results by
    :param lagon: the name of the datetime column used to compute lags (if none specified it defaults to row number)
    :return: a dataframe with rolling statistics over a specified lag calculated over a specified interval
    """
    rolling_colnames = [c + suffixes[0] for c in colnames]
    df_rolling_mean = df.groupby(groupby).rolling(window=window, on=lagon)[colnames].mean()
    df_rolling_mean.columns = rolling_colnames
    df_rolling_mean.reset_index(inplace=True)

    rolling_colnames = [c + suffixes[1] for c in colnames]
    df_rolling_sd = df.groupby(groupby).rolling(window=window, on=lagon)[colnames].var()
    df_rolling_sd.columns = rolling_colnames
    df_rolling_sd = df_rolling_sd.apply(np.sqrt)
    df_rolling_sd.reset_index(inplace=True, drop=True)

    df_res = pd.concat([df_rolling_mean, df_rolling_sd], axis=1)
    df_res = df_res.loc[df_res.index % on == on - 1]

    return df_res


def get_datetime_diffs(df_left, df_right, catvar, prefix, window, on, lagon=None, diff_type='timedelta64[h]', validate='one_to_one', show_example=True):
    """
    calculates for every timestamp the time elapsed since the last time an event occurred,
    where an event is either an error/anomaly, maintenance, or failure

    :param df_left: the telemetry data collected at regular intervals
    :param df_right: the event data collected at irregular intervals
    :param catvar: the name of the categorical column that encodes the event
    :param prefix: the prefix for the new column showing time elapsed
    :param window: window size for detecting events
    :param on: the frequency we want the results to be in
    :param lagon: the name of the datetime column used to compute lags (if none specified it defaults to row number)
    :param diff_type: the unit we want time differences to be measured in (hours by default)
    :param validate: whether we should validate results
    :param show_example: whether we should show an example to check that things are working
    :return: a dataframe with rolling statistics over a specified lag calculated over a specified interval
    """
    keys = ['machineID', 'datetime']
    df_dummies = pd.get_dummies(df_right[catvar], prefix=prefix)
    df_wide = pd.concat([df_right.loc[:, keys], df_dummies], axis=1)
    df_wide = df_wide.groupby(keys).sum().reset_index()
    df = df_left.merge(df_wide, how="left", on=keys, validate=validate).fillna(0)

    # run a rolling window through event flags to aggregate data
    dummy_col_names = df_dummies.columns
    df = df.groupby('machineID').rolling(window=window, on=lagon)[dummy_col_names].max()
    df.reset_index(inplace=True)
    df = df.loc[df.index % on == on - 1]
    df.reset_index(inplace=True, drop=True)
    df_first = df.groupby('machineID', as_index=False).nth(0)

    # calculate the time of the last event and the time elapsed since
    for col in dummy_col_names:
        whenlast, diffcol = 'last_' + col, 'd' + col
        df.loc[:, col].fillna(value=0, inplace=True)
        # let's assume an event happened in row 0, so we don't have missing values for the time elapsed
        df.iloc[df_first.index, df.columns.get_loc(col)] = 1
        df.loc[df[col] == 1, whenlast] = df.loc[df[col] == 1, 'datetime']
        # for the first occurrence we don't know when it last happened, so we assume it happened then
        df.iloc[df_first.index, df.columns.get_loc(whenlast)] = df.iloc[df_first.index, df.columns.get_loc('datetime')]
        df[whenlast].fillna(method='ffill', inplace=True)
        # df.loc[df[whenlast] > df['datetime'], whenlast] = np.nan
        df.loc[df[whenlast] <= df['datetime'], diffcol] = (df['datetime'] - df[whenlast]).astype(diff_type)
        df.drop(columns=whenlast, inplace=True)

    if show_example:
        col = np.random.choice(dummy_col_names, size=1)[0]
        idx = np.random.choice(df.loc[df[col] == 1, :].index.tolist(), size=1)[0]
        print('Example:\n')
        print(df.loc[df.index.isin(range(idx - 3, idx + 5)), ['datetime', col, 'd' + col]])

    return df


############################### get and preprocess data

df_telemetry, df_errors, df_fails, df_maint, df_machines = download_data()

df_left = df_telemetry.loc[:, ['datetime', 'machineID']]  # we set this table aside to join all our results with

cols_to_average = df_telemetry.columns[-4:]

df_telemetry_rolling_3h = get_rolling_aggregates(df_telemetry, cols_to_average,
                                                 suffixes=['_ma_3', '_sd_3'],
                                                 window=3, on=3,
                                                 groupby='machineID', lagon='datetime')

df_telemetry_rolling_12h = get_rolling_aggregates(df_telemetry, cols_to_average,
                                                  suffixes=['_ma_12', '_sd_12'],
                                                  window=12, on=3,
                                                  groupby='machineID', lagon='datetime')

df_telemetry_rolling = pd.concat([df_telemetry_rolling_3h, df_telemetry_rolling_12h.drop(['machineID', 'datetime'], axis=1)], axis=1)

df_telemetry_feat_roll = df_left.merge(df_telemetry_rolling, how="inner", on=['machineID', 'datetime'], validate="one_to_one")
df_telemetry_feat_roll.fillna(method='bfill', inplace=True)
# df_telemetry_feat_roll.head()

del df_telemetry_rolling, df_telemetry_rolling_3h, df_telemetry_rolling_12h
df_errors_feat_roll = get_datetime_diffs(df_left, df_errors, catvar='errorID', prefix='e', window=6, lagon='datetime', on=3)
# df_errors_feat_roll.tail()

df_errors_feat_roll.loc[df_errors_feat_roll['machineID'] == 2, :].head()

df_maint_feat_roll = get_datetime_diffs(df_left, df_maint, catvar='comp', prefix='m',
                                        window=6, lagon='datetime', on=3, show_example=False)
# df_maint_feat_roll.tail()

df_maint_feat_roll.loc[df_maint_feat_roll['machineID'] == 2, :].head()

df_fails_feat_roll = get_datetime_diffs(df_left, df_fails, catvar='failure', prefix='f',
                                        window=6, lagon='datetime', on=3, show_example=False)
# df_fails_feat_roll.tail()

assert(df_errors_feat_roll.shape[0] == df_fails_feat_roll.shape[0] == df_maint_feat_roll.shape[0] == df_telemetry_feat_roll.shape[0])

df_all = pd.concat([df_telemetry_feat_roll,
                    df_errors_feat_roll.drop(columns=['machineID', 'datetime']),
                    df_maint_feat_roll.drop(columns=['machineID', 'datetime']),
                    df_fails_feat_roll.drop(columns=['machineID', 'datetime'])], axis=1, verify_integrity=True)

df_all = pd.merge(left=df_all, right=df_machines, how="left", on='machineID', validate='many_to_one')

del df_left, df_telemetry_feat_roll, df_errors_feat_roll, df_fails_feat_roll, df_maint_feat_roll

for i in range(1, 5):  # iterate over the four components
    # find all the times a component failed for a given machine
    df_temp = df_all.loc[df_all['f_' + str(i)] == 1, ['machineID', 'datetime']]
    label = 'y_' + str(i)  # name of target column (one per component)
    df_all[label] = 0
    for n in range(df_temp.shape[0]):  # iterate over all the failure times
        machineID, datetime = df_temp.iloc[n, :]
        dt_end = datetime - pd.Timedelta('3 hours')  # 3 hours prior to failure
        dt_start = datetime - pd.Timedelta('2 days')  # 2 days prior to failure
        if n % 500 == 0:
            print("a failure occurred on machine {0} at {1}, so {2} is set to 1 between {4} and {3}".format(machineID, datetime, label, dt_end, dt_start))
        df_all.loc[(df_all['machineID'] == machineID) &
                   (df_all['datetime'].between(dt_start, dt_end)), label] = 1

############################### run automl experiment

X_drop = ['datetime', 'machineID', 'f_1', 'f_2', 'f_3', 'f_4', 'y_1', 'y_2', 'y_3', 'y_4', 'model']
Y_keep = ['y_1', 'y_2', 'y_3', 'y_4']

X_train = df_all.loc[df_all['datetime'] < '2015-10-01', :].drop(X_drop, axis=1)
y_train = df_all.loc[df_all['datetime'] < '2015-10-01', Y_keep]

X_test = df_all.loc[df_all['datetime'] > '2015-10-15', :].drop(X_drop, axis=1)
y_test = df_all.loc[df_all['datetime'] > '2015-10-15', Y_keep]

primary_metric = 'AUC_weighted'

automl_config = AutoMLConfig(task='classification',
                             preprocess=False,
                             name=experiment_name,
                             debug_log='automl_errors.log',
                             primary_metric=primary_metric,
                             max_time_sec=1200,
                             iterations=2,
                             n_cross_validations=2,
                             verbosity=logging.INFO,
                             X=X_train.values,  # we convert from pandas to numpy arrays using .values
                             y=y_train.values[:, 0],  # [:, 0] trains on the first component only
                             path=project_folder)

local_run = experiment.submit(automl_config, show_output=True)

# Wait until the run finishes.
local_run.wait_for_completion(show_output=True)

# create a new AutoMLRun object to ensure everything is in order
ml_run = AutoMLRun(experiment=experiment, run_id=local_run.id)

# aux function for comparing performance of runs (quick workaround for automl's _get_max_min_comparator)
def maximize(x, y):
    if x >= y:
        return x
    else:
        return y

# the next couple of lines are a stripped-down version of automl's get_output
children = list(ml_run.get_children())

best_run = None  # will be the child run with the best performance
best_score = None  # performance of that child run

for child in children:
    candidate_score = child.get_metrics()[primary_metric]
    if not np.isnan(candidate_score):
        if best_score is None:
            best_score = candidate_score
            best_run = child
        else:
            new_score = maximize(best_score, candidate_score)
            if new_score != best_score:
                best_score = new_score
                best_run = child

# print accuracy
best_accuracy = best_run.get_metrics()['accuracy']
print("Best run accuracy:", best_accuracy)

# download model and save to pkl
model_path = "outputs/model.pkl"
best_run.download_file(name=model_path, output_file_path=model_path)

# Writing the run id to /aml_config/run_id.json
run_id = {}
run_id['run_id'] = best_run.id
run_id['experiment_name'] = best_run.experiment.name

# save run info
os.makedirs('aml_config', exist_ok=True)
with open('aml_config/run_id.json', 'w') as outfile:
    json.dump(run_id, outfile)

############################### upload run info and model pkl to def_data_store

def_data_store = ws.get_default_datastore()
def_data_store.upload(src_dir='aml_config', target_path='aml_config', overwrite=True)
def_data_store.upload(src_dir='outputs', target_path='outputs', overwrite=True)
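# Illustrative only: a toy check of the labelling rule above, which sets y = 1 in the
# window from 2 days to 3 hours before each failure. All values here are made up.
import pandas as pd

times = pd.date_range("2015-01-01", periods=24, freq="3H")
toy = pd.DataFrame({"machineID": 1, "datetime": times, "y_1": 0})
failure_time = times[-1]
dt_end = failure_time - pd.Timedelta("3 hours")
dt_start = failure_time - pd.Timedelta("2 days")
toy.loc[(toy["machineID"] == 1) & (toy["datetime"].between(dt_start, dt_end)), "y_1"] = 1
print(toy["y_1"].sum(), "of", len(toy), "readings labelled positive")   # 16 of 24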
@ -0,0 +1,57 @@
import os, json, sys
from azureml.core import Workspace
from azureml.core.image import ContainerImage, Image
from azureml.core.model import Model

# Get workspace
ws = Workspace.from_config()

# Get the latest model details

try:
    with open("aml_config/model.json") as f:
        config = json.load(f)
except:
    print('No new model to register, thus no need to create a new scoring image')
    #raise Exception('No new model to register as production model performs better')
    sys.exit(0)

model_name = config['model_name']
model_version = config['model_version']

model_list = Model.list(workspace=ws)
model, = (m for m in model_list if m.version == model_version and m.name == model_name)
print('Model picked: {} \nModel Description: {} \nModel Version: {}'.format(model.name, model.description, model.version))

os.chdir('./devops/code/scoring')
image_name = "predmaintenance-model-score"

image_config = ContainerImage.image_configuration(execution_script="score.py",
                                                  runtime="python-slim",
                                                  conda_file="conda_dependencies.yml",
                                                  description="Image with predictive maintenance model",
                                                  tags={'area': "predictive maintenance", 'type': "automl"})

image = Image.create(name=image_name,
                     models=[model],
                     image_config=image_config,
                     workspace=ws)

image.wait_for_creation(show_output=True)
os.chdir('../../../')

if image.creation_state != 'Succeeded':
    raise Exception('Image creation status: {}'.format(image.creation_state))

print('{}(v.{} [{}]) stored at {} with build log {}'.format(image.name, image.version, image.creation_state, image.image_location, image.image_build_log_uri))

# Writing the image details to /aml_config/image.json
image_json = {}
image_json['image_name'] = image.name
image_json['image_version'] = image.version
image_json['image_location'] = image.image_location
with open('aml_config/image.json', 'w') as outfile:
    json.dump(image_json, outfile)
@ -0,0 +1,51 @@
import os, json, datetime, sys
from operator import attrgetter
from azureml.core import Workspace
from azureml.core.model import Model
from azureml.core.image import Image
from azureml.core.webservice import Webservice
from azureml.core.webservice import AciWebservice

# Get workspace
ws = Workspace.from_config()

# Get the image to deploy
try:
    with open("aml_config/image.json") as f:
        config = json.load(f)
except:
    print('No new model, thus no deployment on ACI')
    #raise Exception('No new model to register as production model performs better')
    sys.exit(0)


image_name = config['image_name']
image_version = config['image_version']

images = Image.list(workspace=ws)
image, = (m for m in images if m.version == image_version and m.name == image_name)
print('From image.json, Image used to deploy webservice on ACI: {}\nImage Version: {}\nImage Location = {}'.format(image.name, image.version, image.image_location))

aciconfig = AciWebservice.deploy_configuration(cpu_cores=1,
                                               memory_gb=1,
                                               tags={'area': "pred-maintenance", 'type': "automl"},
                                               description='ACI webservice for the predictive maintenance model')

aci_service_name = 'aciwebservice' + datetime.datetime.now().strftime('%m%d%H')

service = Webservice.deploy_from_image(deployment_config=aciconfig,
                                       image=image,
                                       name=aci_service_name,
                                       workspace=ws)

service.wait_for_deployment()
print('Deployed ACI Webservice: {} \nWebservice Uri: {}'.format(service.name, service.scoring_uri))

#service = Webservice(name='aciws0622', workspace=ws)
# Writing the ACI details to /aml_config/aci_webservice.json
aci_webservice = {}
aci_webservice['aci_name'] = service.name
aci_webservice['aci_url'] = service.scoring_uri
with open('aml_config/aci_webservice.json', 'w') as outfile:
    json.dump(aci_webservice, outfile)
@ -0,0 +1,76 @@
import os, json, datetime, sys
from operator import attrgetter
from azureml.core import Workspace
from azureml.core.model import Model
from azureml.core.image import Image
from azureml.core.compute import AksCompute, ComputeTarget
from azureml.core.webservice import Webservice, AksWebservice

# Get workspace
ws = Workspace.from_config()

# Get the image to deploy
try:
    with open("aml_config/image.json") as f:
        config = json.load(f)
except:
    print('No new model, thus no deployment on AKS')
    #raise Exception('No new model to register as production model performs better')
    sys.exit(0)

image_name = config['image_name']
image_version = config['image_version']

images = Image.list(workspace=ws)
image, = (m for m in images if m.version == image_version and m.name == image_name)
print('From image.json, Image used to deploy webservice: {}\nImage Version: {}\nImage Location = {}'.format(image.name, image.version, image.image_location))

# Check if an AKS cluster is already available
try:
    with open("aml_config/aks_webservice.json") as f:
        config = json.load(f)
    aks_name = config['aks_name']
    aks_service_name = config['aks_service_name']
    compute_list = ws.compute_targets()
    aks_target, = (c for c in compute_list if c.name == aks_name)
    service = Webservice(name=aks_service_name, workspace=ws)
    print('Updating AKS service {} with image: {}'.format(aks_service_name, image.image_location))
    service.update(image=image)
except:
    aks_name = 'aks' + datetime.datetime.now().strftime('%m%d%H')
    aks_service_name = 'akswebservice' + datetime.datetime.now().strftime('%m%d%H')
    prov_config = AksCompute.provisioning_configuration(agent_count=6, vm_size='Standard_F2', location='eastus')
    print('No AKS found in aks_webservice.json. Creating new AKS: {} and AKS Webservice: {}'.format(aks_name, aks_service_name))
    # Create the cluster
    aks_target = ComputeTarget.create(workspace=ws,
                                      name=aks_name,
                                      provisioning_configuration=prov_config)

    aks_target.wait_for_completion(show_output=True)
    print(aks_target.provisioning_state)
    print(aks_target.provisioning_errors)

    # Use the default configuration (can also provide parameters to customize)
    aks_config = AksWebservice.deploy_configuration(enable_app_insights=True)

    service = Webservice.deploy_from_image(workspace=ws,
                                           name=aks_service_name,
                                           image=image,
                                           deployment_config=aks_config,
                                           deployment_target=aks_target)

    service.wait_for_deployment(show_output=True)
    print(service.state)
    print('Deployed AKS Webservice: {} \nWebservice Uri: {}'.format(service.name, service.scoring_uri))


# Writing the AKS details to /aml_config/aks_webservice.json
aks_webservice = {}
aks_webservice['aks_name'] = aks_name
aks_webservice['aks_service_name'] = service.name
aks_webservice['aks_url'] = service.scoring_uri
aks_webservice['aks_keys'] = service.get_keys()
with open('aml_config/aks_webservice.json', 'w') as outfile:
    json.dump(aks_webservice, outfile)
@ -0,0 +1,56 @@
import os, json
import azureml.core
from azureml.core import Workspace
from azureml.core import Experiment
from azureml.core import Run
from azureml.core.model import Model

# Get workspace
ws = Workspace.from_config()

# Parameterize the metrics on which the models should be compared

# Add a golden data set on which all model performance can be evaluated

# Get the latest run_id
with open("aml_config/run_id.json") as f:
    config = json.load(f)

new_model_run_id = config["run_id"]
experiment_name = config["experiment_name"]
exp = Experiment(workspace=ws, name=experiment_name)

try:
    # Get the most recently registered model; we assume that is the model in production.
    # Download this model and compare it with the newly trained model by running tests on the same data set.
    model_list = Model.list(ws)
    production_model = next(filter(lambda x: x.created_time == max(model.created_time for model in model_list), model_list))
    production_model_run_id = production_model.tags.get('run_id')
    run_list = exp.get_runs()
    # production_model_run = next(filter(lambda x: x.id == production_model_run_id, run_list))

    # Get the run history for both the production model and the newly trained model and compare their accuracy
    production_model_run = Run(exp, run_id=production_model_run_id)
    new_model_run = Run(exp, run_id=new_model_run_id)

    production_model_metric = production_model_run.get_metrics().get('accuracy')
    new_model_metric = new_model_run.get_metrics().get('accuracy')
    print('Current Production model accuracy: {}, New trained model accuracy: {}'.format(production_model_metric, new_model_metric))

    promote_new_model = False
    if new_model_metric > production_model_metric:  # higher accuracy is better
        promote_new_model = True
        print('New trained model performs better, thus it will be registered')
except:
    promote_new_model = True
    print('This is the first model to be trained, thus nothing to evaluate for now')

run_id = {}
run_id['run_id'] = ''
# Writing the run id to /aml_config/run_id.json
if promote_new_model:
    run_id['run_id'] = new_model_run_id

run_id['experiment_name'] = experiment_name
with open('aml_config/run_id.json', 'w') as outfile:
    json.dump(run_id, outfile)
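# Illustrative only: the promotion rule above, factored into a pure function so it can
# be unit-tested without a workspace. The tie-breaking choice (ties keep the production
# model) is an assumption, not prescribed by the source.
def should_promote(new_accuracy, production_accuracy):
    if production_accuracy is None:          # no model in production yet
        return True
    return new_accuracy > production_accuracy

assert should_promote(0.90, None)
assert should_promote(0.91, 0.90)
assert not should_promote(0.85, 0.90)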
@ -0,0 +1,173 @@
# shared feature-engineering and anomaly-detection helpers
import os
import urllib.request
import numpy as np
import pandas as pd
from pyculiarity import detect_ts

def download_data():
    """
    download the anomaly detection and predictive maintenance data
    :return: all the data
    """
    os.makedirs('../data', exist_ok=True)
    container = 'https://coursematerial.blob.core.windows.net/data/telemetry/'

    urllib.request.urlretrieve(container + 'telemetry.csv', filename='../data/telemetry.csv')
    urllib.request.urlretrieve(container + 'maintenance.csv', filename='../data/maintenance.csv')
    urllib.request.urlretrieve(container + 'machines.csv', filename='../data/machines.csv')
    urllib.request.urlretrieve(container + 'failures.csv', filename='../data/failures.csv')
    # we replace errors.csv with anoms.csv (results from running anomaly detection)
    # urllib.request.urlretrieve(container + 'errors.csv', filename='../data/errors.csv')
    urllib.request.urlretrieve(container + 'anoms.csv', filename='../data/anoms.csv')

    df_telemetry = pd.read_csv('../data/telemetry.csv', header=0)
    df_errors = pd.read_csv('../data/anoms.csv', header=0)
    df_fails = pd.read_csv('../data/failures.csv', header=0)
    df_maint = pd.read_csv('../data/maintenance.csv', header=0)
    df_machines = pd.read_csv('../data/machines.csv', header=0)

    df_telemetry['datetime'] = pd.to_datetime(df_telemetry['datetime'], format="%m/%d/%Y %I:%M:%S %p")

    df_errors['datetime'] = pd.to_datetime(df_errors['datetime'])
    rep_dir = {"volt": "error1", "rotate": "error2", "pressure": "error3", "vibration": "error4"}
    df_errors = df_errors.replace({"errorID": rep_dir})
    df_errors['errorID'] = df_errors['errorID'].apply(lambda x: int(x[-1]))

    df_fails['datetime'] = pd.to_datetime(df_fails['datetime'], format="%m/%d/%Y %I:%M:%S %p")
    df_fails['failure'] = df_fails['failure'].apply(lambda x: int(x[-1]))

    df_maint['datetime'] = pd.to_datetime(df_maint['datetime'], format="%m/%d/%Y %I:%M:%S %p")
    df_maint['comp'] = df_maint['comp'].apply(lambda x: int(x[-1]))

    return df_telemetry, df_errors, df_fails, df_maint, df_machines


def get_rolling_aggregates(df, colnames, suffixes, window, on, groupby, lagon=None):
    """
    calculates rolling averages and standard deviations

    :param df: dataframe to run it on
    :param colnames: names of columns we want rolling statistics for
    :param suffixes: suffixes attached to the new columns (provide a list with strings)
    :param window: the lag over which rolling statistics are calculated
    :param on: the interval at which rolling statistics are calculated
    :param groupby: the column used to group results by
    :param lagon: the name of the datetime column used to compute lags (if none specified it defaults to row number)
    :return: a dataframe with rolling statistics over a specified lag calculated over a specified interval
    """
    rolling_colnames = [c + suffixes[0] for c in colnames]
    df_rolling_mean = df.groupby(groupby).rolling(window=window, on=lagon)[colnames].mean()
    df_rolling_mean.columns = rolling_colnames
    df_rolling_mean.reset_index(inplace=True)

    rolling_colnames = [c + suffixes[1] for c in colnames]
    df_rolling_sd = df.groupby(groupby).rolling(window=window, on=lagon)[colnames].var()
    df_rolling_sd.columns = rolling_colnames
    df_rolling_sd = df_rolling_sd.apply(np.sqrt)
    df_rolling_sd.reset_index(inplace=True, drop=True)

    df_res = pd.concat([df_rolling_mean, df_rolling_sd], axis=1)
    df_res = df_res.loc[df_res.index % on == on - 1]

    return df_res


def get_datetime_diffs(df_left, df_right, catvar, prefix, window, on, lagon=None, diff_type='timedelta64[h]', validate='one_to_one', show_example=True):
    """
    calculates for every timestamp the time elapsed since the last time an event occurred,
    where an event is either an error/anomaly, maintenance, or failure

    :param df_left: the telemetry data collected at regular intervals
    :param df_right: the event data collected at irregular intervals
    :param catvar: the name of the categorical column that encodes the event
    :param prefix: the prefix for the new column showing time elapsed
    :param window: window size for detecting events
    :param on: the frequency we want the results to be in
    :param lagon: the name of the datetime column used to compute lags (if none specified it defaults to row number)
    :param diff_type: the unit we want time differences to be measured in (hours by default)
    :param validate: whether we should validate results
    :param show_example: whether we should show an example to check that things are working
    :return: a dataframe with rolling statistics over a specified lag calculated over a specified interval
    """
    keys = ['machineID', 'datetime']
    df_dummies = pd.get_dummies(df_right[catvar], prefix=prefix)
    df_wide = pd.concat([df_right.loc[:, keys], df_dummies], axis=1)
    df_wide = df_wide.groupby(keys).sum().reset_index()
    df = df_left.merge(df_wide, how="left", on=keys, validate=validate).fillna(0)

    # run a rolling window through event flags to aggregate data
    dummy_col_names = df_dummies.columns
    df = df.groupby('machineID').rolling(window=window, on=lagon)[dummy_col_names].max()
    df.reset_index(inplace=True)
    df = df.loc[df.index % on == on - 1]
    df.reset_index(inplace=True, drop=True)
    df_first = df.groupby('machineID', as_index=False).nth(0)

    # calculate the time of the last event and the time elapsed since
    for col in dummy_col_names:
        whenlast, diffcol = 'last_' + col, 'd' + col
        df.loc[:, col].fillna(value=0, inplace=True)
        # let's assume an event happened in row 0, so we don't have missing values for the time elapsed
        df.iloc[df_first.index, df.columns.get_loc(col)] = 1
        df.loc[df[col] == 1, whenlast] = df.loc[df[col] == 1, 'datetime']
        # for the first occurrence we don't know when it last happened, so we assume it happened then
        df.iloc[df_first.index, df.columns.get_loc(whenlast)] = df.iloc[df_first.index, df.columns.get_loc('datetime')]
        df[whenlast].fillna(method='ffill', inplace=True)
        # df.loc[df[whenlast] > df['datetime'], whenlast] = np.nan
        df.loc[df[whenlast] <= df['datetime'], diffcol] = (df['datetime'] - df[whenlast]).astype(diff_type)
        df.drop(columns=whenlast, inplace=True)

    if show_example:
        col = np.random.choice(dummy_col_names, size=1)[0]
        idx = np.random.choice(df.loc[df[col] == 1, :].index.tolist(), size=1)[0]
        print('Example:\n')
        print(df.loc[df.index.isin(range(idx - 3, idx + 5)), ['datetime', col, 'd' + col]])

    return df


def rolling_average(df, column, n=24):
    """
    Calculates a rolling average according to Welford's online algorithm (Donald Knuth's
    Art of Computer Programming, Vol 2, page 232, 3rd edition).
    https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Welford's_Online_algorithm

    :param df: a dataframe with time series in columns
    :param column: name of the column of interest
    :param n: number of measurements to consider
    :return: a dataframe with the 'datetime' column and the smoothed values in a 'value' column
    """
    ra = [0] * df.shape[0]
    ra[0] = df[column].values[0]

    for r in range(1, df.shape[0]):
        curr_n = float(min(n, r))
        ra[r] = ra[r - 1] + (df[column].values[r] - ra[r - 1]) / curr_n

    df = pd.DataFrame(data={'datetime': df['datetime'], 'value': ra})

    return df


def do_ad(df, alpha=0.005, max_anoms=0.1, only_last=None, longterm=False, e_value=False, direction='both'):
    """
    This method performs the actual anomaly detection. It expects a dataframe with a
    time column and a single value column of telemetry data.

    :param df: a dataframe with a time column and one column with telemetry data
    :param alpha: see the pyculiarity documentation for the meaning of these parameters
    :param max_anoms:
    :param only_last:
    :param longterm:
    :param e_value:
    :param direction:
    :return: the timestamps of the detected anomalies, as a numpy array
    """
    results = detect_ts(df,
                        max_anoms=max_anoms,
                        alpha=alpha,
                        direction=direction,
                        e_value=e_value,
                        longterm=longterm,
                        only_last=only_last)

    return results['anoms']['timestamp'].values
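# Illustrative only: the incremental update used by rolling_average above, replayed on
# a toy series. Note the divisor is min(n, r), as in the source, so this is a smoothed
# approximation of a windowed mean rather than an exact one.
values = [10.0, 12.0, 11.0, 13.0]
ra = values[0]
for r in range(1, len(values)):
    ra += (values[r] - ra) / float(min(24, r))
print(ra)                                    # 12.0 for this series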
@ -0,0 +1,148 @@

############################### load required libraries

import os
import pandas as pd
import json

import azureml.core
print("SDK Version:", azureml.core.VERSION)
from azureml.core import Workspace, Run, Experiment, Datastore
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.runconfig import CondaDependencies, RunConfiguration
from azureml.core.runconfig import DEFAULT_CPU_IMAGE
from azureml.telemetry import set_diagnostics_collection
from azureml.pipeline.steps import PythonScriptStep
from azureml.pipeline.core import Pipeline, PipelineData, StepSequence


############################### load workspace and create experiment

ws = Workspace.from_config()
print('Workspace name: ' + ws.name,
      'Subscription id: ' + ws.subscription_id,
      'Resource group: ' + ws.resource_group, sep='\n')

experiment_name = 'aml-pipeline-cicd'  # choose a name for the experiment
project_folder = '.'  # project folder
experiment = Experiment(ws, experiment_name)

output = {}
output['SDK version'] = azureml.core.VERSION
output['Subscription ID'] = ws.subscription_id
output['Workspace'] = ws.name
output['Resource Group'] = ws.resource_group
output['Location'] = ws.location
output['Project Directory'] = project_folder
output['Experiment Name'] = experiment.name
pd.set_option('display.max_colwidth', -1)
print(pd.DataFrame(data=output, index=['']).T)

set_diagnostics_collection(send_diagnostics=True)

############################### create a run config

cd = CondaDependencies.create(pip_packages=["azureml-sdk==1.0.17", "azureml-train-automl==1.0.17", "pyculiarity", "pytictoc", "cryptography==2.5", "pandas"])

amlcompute_run_config = RunConfiguration(framework="python", conda_dependencies=cd)
amlcompute_run_config.environment.docker.enabled = False
amlcompute_run_config.environment.docker.gpu_support = False
amlcompute_run_config.environment.docker.base_image = DEFAULT_CPU_IMAGE
amlcompute_run_config.environment.spark.precache_packages = False

############################### create AML compute

aml_compute_target = "aml-compute"
try:
    aml_compute = AmlCompute(ws, aml_compute_target)
    print("found existing compute target.")
except:
    print("creating new compute target")

    provisioning_config = AmlCompute.provisioning_configuration(vm_size="STANDARD_D2_V2",
                                                                idle_seconds_before_scaledown=1800,
                                                                min_nodes=0,
                                                                max_nodes=4)
    aml_compute = ComputeTarget.create(ws, aml_compute_target, provisioning_config)
    aml_compute.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)

print("Azure Machine Learning Compute attached")

############################### point to data and scripts

# we use this for exchanging data between pipeline steps
def_data_store = ws.get_default_datastore()

# get pointer to default blob store
def_blob_store = Datastore(ws, "workspaceblobstore")
print("Blobstore's name: {}".format(def_blob_store.name))

# Naming the intermediate data as anomaly data and assigning it to a variable
anomaly_data = PipelineData("anomaly_data", datastore=def_blob_store)
print("Anomaly data object created")

# model = PipelineData("model", datastore = def_data_store)
# print("Model data object created")

anom_detect = PythonScriptStep(name="anomaly_detection",
                               # script_name="anom_detect.py",
                               script_name="CICD/code/anom_detect.py",
                               arguments=["--output_directory", anomaly_data],
                               outputs=[anomaly_data],
                               compute_target=aml_compute,
                               source_directory=project_folder,
                               allow_reuse=True,
                               runconfig=amlcompute_run_config)
print("Anomaly Detection Step created.")

automl_step = PythonScriptStep(name="automl_step",
                               # script_name="automl_step.py",
                               script_name="CICD/code/automl_step.py",
                               arguments=["--input_directory", anomaly_data],
                               inputs=[anomaly_data],
                               # outputs=[model],
                               compute_target=aml_compute,
                               source_directory=project_folder,
                               allow_reuse=True,
                               runconfig=amlcompute_run_config)

print("AutoML Training Step created.")

############################### set up, validate and run pipeline

steps = [anom_detect, automl_step]
print("Step lists created")

pipeline = Pipeline(workspace=ws, steps=steps)
print("Pipeline is built")

pipeline.validate()
print("Pipeline validation complete")

pipeline_run = experiment.submit(pipeline)  #, regenerate_outputs=True)
print("Pipeline is submitted for execution")

# Wait until the run finishes.
pipeline_run.wait_for_completion(show_output=False)
print("Pipeline run completed")

############################### upload artifacts to AML Workspace

# Download aml_config info and output of automl_step
def_data_store.download(target_path='.',
                        prefix='aml_config',
                        show_progress=True,
                        overwrite=True)

def_data_store.download(target_path='.',
                        prefix='outputs',
                        show_progress=True,
                        overwrite=True)
print("Updated aml_config and outputs folder")

model_fname = 'model.pkl'
model_path = os.path.join("outputs", model_fname)

# Upload the model file explicitly into artifacts (for CI/CD)
pipeline_run.upload_file(name=model_path, path_or_stream=model_path)
print('Uploaded the model {} to experiment {}'.format(model_fname, pipeline_run.experiment.name))
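# Illustrative only: the pipeline above could also be published so the CI system can
# trigger it over REST instead of rebuilding it on every run. pipeline.publish() is
# part of azureml.pipeline.core; the name and description here are assumptions.
published_pipeline = pipeline.publish(name="anomaly-automl-pipeline",
                                      description="anomaly detection + AutoML training")
print("Published pipeline endpoint:", published_pipeline.endpoint)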
@ -0,0 +1,56 @@
import os, json, sys
from azureml.core import Workspace
from azureml.core import Run
from azureml.core import Experiment
from azureml.core.model import Model
from azureml.core.runconfig import RunConfiguration

# Get workspace
ws = Workspace.from_config()

# Get the latest evaluation result
try:
    with open("aml_config/run_id.json") as f:
        config = json.load(f)
    if not config["run_id"]:
        raise Exception('No new model to register, as the production model performs better')
except Exception:
    print('No new model to register, as the production model performs better')
    # raise Exception('No new model to register, as the production model performs better')
    sys.exit(0)

run_id = config["run_id"]
experiment_name = config["experiment_name"]
exp = Experiment(workspace = ws, name = experiment_name)

run = Run(experiment = exp, run_id = run_id)
names = run.get_file_names()
print('Run ID for last run: {}'.format(run_id))
model_local_dir = "model"
os.makedirs(model_local_dir, exist_ok = True)

# Download model to project root directory
model_name = 'model.pkl'
run.download_file(name = './outputs/' + model_name,
                  output_file_path = './model/' + model_name)
print('Downloaded model {} to project root directory'.format(model_name))
os.chdir('./model')
model = Model.register(model_path = model_name,  # this points to a local file
                       model_name = model_name,  # this is the name the model is registered as
                       tags = {'area': "predictive maintenance", 'type': "automl", 'run_id': run_id},
                       description = "Model for predictive maintenance dataset",
                       workspace = ws)
os.chdir('..')
print('Model registered: {} \nModel Description: {} \nModel Version: {}'.format(model.name, model.description, model.version))

# Remove the evaluate.json as we no longer need it
# os.remove("aml_config/evaluate.json")

# Write the registered model details to aml_config/model.json
model_json = {}
model_json['model_name'] = model.name
model_json['model_version'] = model.version
model_json['run_id'] = run_id
with open('aml_config/model.json', 'w') as outfile:
    json.dump(model_json, outfile)
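
# A downstream release stage can recover the registered model from this file.
# A minimal sketch, assuming the same aml_config layout (illustrative only):
# with open('aml_config/model.json') as f:
#     model_details = json.load(f)
# model = Model(ws, name = model_details['model_name'],
#               version = model_details['model_version'])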
@ -0,0 +1,13 @@
name: myenv
channels:
  - defaults
dependencies:
  - python=3.6.2
  - pip:
    - scikit-learn==0.19.1
    - azureml-sdk[automl]
    - azureml-monitoring
    - pyculiarity
    - scipy
    - numpy
    - pandas
@ -0,0 +1,301 @@
import datetime
import pandas as pd
from pyculiarity import detect_ts
import os
import pickle
import json
from sklearn.externals import joblib
from azureml.core.model import Model
import azureml.train.automl
from azureml.monitoring import ModelDataCollector
import time
import glob
import numpy as np
import scipy


def create_data_dict(data, sensors):
    """
    Create a dictionary from one row of data, adding placeholder fields for the
    running average and anomaly flag of each sensor.

    :param data: single-row pd.DataFrame of sensor readings
    :param sensors: list of sensor column names
    :return: dict mapping column name -> single-element list
    """
    data_dict = {}
    for column in data.columns:
        data_dict[column] = [data[column].values[0]]
        if column in sensors:
            data_dict[column + '_avg'] = [0.0]
            data_dict[column + '_an'] = [False]

    return data_dict


def init_df():
    """
    Initialize an empty DataFrame

    :return: pd.DataFrame
    """
    # data_dict = create_data_dict(data)

    df = pd.DataFrame()  # data=data_dict, index=data_dict['timestamp'])

    return df


def append_data(df, data, sensors):
    """
    We either add the data and the results (res_dict) of the anomaly detection to the existing data frame,
    or create a new one if the data frame is empty
    """
    data_dict = create_data_dict(data, sensors)

    # TODO: this is only necessary because the webservice currently doesn't receive timestamps
    if df.shape[0] == 0:
        prv_timestamp = datetime.datetime(2015, 1, 1, 5, 0)  # so the first appended row is stamped 1/1/2015 6:00:00 AM
    else:
        prv_timestamp = df['timestamp'].max()

    data_dict['timestamp'] = [prv_timestamp + datetime.timedelta(hours=1)]

    df = df.append(pd.DataFrame(data=data_dict, index=data_dict['timestamp']))

    return df


def generate_stream(telemetry, n=None):
    """
    n is the number of sensor readings we are simulating
    """

    if not n:
        n = telemetry.shape[0]

    machine_ids = [1]  # telemetry['machineID'].unique()
    timestamps = telemetry['timestamp'].unique()[:n]  # limit the stream to n readings

    # sort test_data by timestamp
    # on every iteration, shuffle machine IDs
    # then loop over machine IDs

    # t = TicToc()
    for timestamp in timestamps:
        # t.tic()
        np.random.shuffle(machine_ids)
        for machine_id in machine_ids:
            data = telemetry.loc[(telemetry['timestamp'] == timestamp) & (telemetry['machineID'] == machine_id), :]
            run(data)
        # t.toc("Processing all machines took")


def load_df(data):
    machineID = data['machineID'].values[0]

    filename = os.path.join(storage_location, "data_w_anoms_ID_%03d.csv" % machineID)
    if os.path.exists(filename):
        df = pd.read_csv(filename)
        df['timestamp'] = pd.to_datetime(df['timestamp'], format="%Y-%m-%d %H:%M:%S")
    else:
        df = pd.DataFrame()

    return df


def save_df(df):
    """
    Save the per-machine data frame (with anomaly annotations) to CSV.

    :param df: pd.DataFrame of sensor readings for one machine
    :return: None
    """
    machine_id = df.ix[0, 'machineID']

    filename = os.path.join(storage_location, "data_w_anoms_ID_%03d.csv" % machine_id)

    df.to_csv(filename, index=False)


def running_avgs(df, sensors, window_size=24, only_copy=False):
    """
    Calculates the rolling average following Welford's online algorithm:
    https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Online

    This updates the column with the suffix '_avg' next to each sensor column.

    :param df: a dataframe with time series in columns
    :param sensors: names of the sensor columns of interest
    :param window_size: number of measurements to consider
    :param only_copy: if True, copy the current value instead of averaging
    :return: None
    """

    curr_n = df.shape[0]
    row_index = curr_n - 1
    window_size = min(window_size, curr_n)

    for sensor in sensors:
        val_col_index = df.columns.get_loc(sensor)
        avg_col_index = df.columns.get_loc(sensor + "_avg")

        curr_value = df.ix[row_index, val_col_index]

        if curr_n == 1 or only_copy:
            # first reading (or copy mode): the average is the value itself
            df.ix[row_index, avg_col_index] = curr_value
        else:
            prv_avg = df.ix[(row_index - 1), avg_col_index]
            df.ix[row_index, avg_col_index] = prv_avg + (curr_value - prv_avg) / window_size
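
# Welford's update for the mean of n observations is avg_new = avg_prev + (x - avg_prev) / n;
# dividing by the capped window_size instead, as above, turns the update into an
# exponentially weighted approximation of a sliding-window average rather than an exact mean.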


def init():
    global model
    global prediction_dc
    global storage_location

    storage_location = "/tmp/output"

    if not os.path.exists(storage_location):
        os.makedirs(storage_location)

    # next, we delete previous output files
    files = glob.glob(os.path.join(storage_location, '*'))

    for f in files:
        os.remove(f)

    model_name = "model.pkl"

    model_path = Model.get_model_path(model_name = model_name)
    # deserialize the model file back into a sklearn model
    model = joblib.load(model_path)
    prediction_dc = ModelDataCollector("automl_model", identifier="predictions", feature_names=["prediction"])
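    # Note: azureml-monitoring's ModelDataCollector typically only persists
    # collect() calls once data collection is enabled on the deployed web
    # service; the rows then land in the workspace's associated storage account.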


def run(rawdata, window=14 * 24):
    """
    Score one incoming sensor reading after annotating it with anomaly-detection results.

    :param rawdata: JSON string with a 'data' field holding the feature vector
    :param window: size of the sliding window (in hourly readings) used for anomaly detection
    :return: JSON string with the prediction result, or an error message
    """

    try:
        # set some parameters for the AD algorithm
        alpha = 0.1
        max_anoms = 0.05
        only_last = None  # alternatively, we can set this to 'hr' or 'day'
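        # In pyculiarity's seasonal hybrid ESD test (a port of Twitter's
        # AnomalyDetection package), max_anoms caps the fraction of readings that
        # may be flagged, alpha is the significance level of the test, and
        # only_last restricts reporting to the most recent hour or day.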

        json_data = json.loads(rawdata)['data']

        # this is the beginning of the anomaly detection code
        # TODO: the anomaly detection service expects one row of a pd.DataFrame with a timestamp and machine ID,
        # but here we only get a list of values. We therefore create a timestamp ourselves
        # and create a data frame that the anomaly detection code can understand.
        # Eventually, we want this to be harmonized!
        timestamp = time.strftime("%m/%d/%Y %H:%M:%S", time.localtime())
        machineID = 1  # TODO scipy.random.choice(100)
        telemetry_data = json_data[0][8:16:2]
        sensors = ['volt', 'pressure', 'vibration', 'rotate']
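        # The slice above picks entries 8, 10, 12 and 14 of the flattened feature
        # vector as stand-ins for the four raw sensor readings; as the TODO notes,
        # this mapping is a stopgap until the service receives named fields.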

        data_dict = {}
        data_dict['timestamp'] = [timestamp]
        data_dict['machineID'] = [machineID]

        for i in range(0, 4):
            data_dict[sensors[i]] = [telemetry_data[i]]

        telemetry_df = pd.DataFrame(data=data_dict)
        telemetry_df['timestamp'] = pd.to_datetime(telemetry_df['timestamp'])

        # load dataframe
        df = load_df(telemetry_df)

        # add current sensor readings to data frame, also adds fields for anomaly detection results
        df = append_data(df, telemetry_df, sensors)

        # # calculate running averages (no need to do this here, because we are already sending preprocessed data)
        # # TODO: this is disabled for now, because we are dealing with pre-processed data
        # running_avgs(df, sensors, only_copy=True)

        # note the timestamp, so that we can update the correct row of the dataframe later
        timestamp = df['timestamp'].max()

        # we get a copy of the current (also last) row of the dataframe
        current_row = df.loc[df['timestamp'] == timestamp, :]

        # determine how many sensor readings we already have
        rows = df.shape[0]

        # if the data frame doesn't have enough rows for our sliding window size,
        # we just return (indicating that we have no anomalies)
        if rows < window:
            save_df(df)
            json_data = current_row.to_json()

            return json.dumps({"result": [0]})

        # determine the first row of the data frame that falls into the sliding window
        start_row = rows - window

        # a flag to indicate whether we detected an anomaly in any of the sensors after this reading
        detected_an_anomaly = False

        anom_list = []
        # we loop over the sensor columns
        for column in sensors:
            df_s = df.ix[start_row:rows, ('timestamp', column + "_avg")]

            # pyculiarity expects two columns with particular names
            df_s.columns = ['timestamp', 'value']

            # we reset the timestamps, so that the current measurement is the last within the sliding time window
            # df_s = reset_time(df_s)

            # calculate the median value within each sliding time window
            # values = df_s.groupby(df_s.index.date)['value'].median()

            # create dataframe with median values etc.
            # df_agg = pd.DataFrame(data={'timestamp': pd.to_datetime(values.index), 'value': values})

            # find anomalies
            results = detect_ts(df_s, max_anoms=max_anoms,
                                alpha=alpha,
                                direction='both',
                                e_value=False,
                                only_last=only_last)

            # create a data frame where we mark for each day whether it was an anomaly
            df_s = df_s.merge(results['anoms'], on='timestamp', how='left')

            # Mark the current sensor reading as an anomaly. Specifically, if we get an anomaly in the
            # sliding window leading up to (and including) the current sensor reading, we mark the current
            # sensor reading as an anomaly. Note, alternatively one could mark all the sensor readings that
            # fall within the sliding window as anomalies. However, we prefer our approach, because without
            # the current sensor reading the other sensor readings in this sliding window may not have been
            # an anomaly.
            # current_row[column + '_an'] = not np.isnan(df_agg.tail(1)['anoms'].iloc[0])
            if not np.isnan(df_s.tail(1)['anoms'].iloc[0]):
                current_row.ix[0, column + '_an'] = True
                detected_an_anomaly = True
                anom_list.append(1.0)
            else:
                anom_list.append(0.0)

        # it's only necessary to update the current row in the data frame if we detected an anomaly
        if detected_an_anomaly:
            df.loc[df['timestamp'] == timestamp, :] = current_row
            save_df(df)

        json_data[0][8:16:2] = anom_list

        # # this is the end of the anomaly detection code

        data = np.array(json_data)
        result = model.predict(data)
        prediction_dc.collect(result)
        print("saving prediction data " + time.strftime("%H:%M:%S"))
    except Exception as e:
        result = str(e)
        return json.dumps({"error": result})

    return json.dumps({"result": result.tolist()})
@ -0,0 +1,32 @@
# test integrity of the input data

import sys
import os
import numpy as np
import pandas as pd

# number of features
n_columns = 37


def check_schema(X):
    n_actual_columns = X.shape[1]
    if n_actual_columns != n_columns:
        print("Error: found {} feature columns. The data should have {} feature columns.".format(n_actual_columns, n_columns))
        return False
    return True


def main():
    filename = sys.argv[1]
    if not os.path.exists(filename):
        print("Error: The file {} does not exist".format(filename))
        return

    dataset = pd.read_csv(filename)
    if check_schema(dataset[dataset.columns[:-1]]):
        print("Data schema test succeeded")
    else:
        print("Data schema test failed")


if __name__ == "__main__":
    main()
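
# example invocation (script and file names here are illustrative):
#   python data_test.py telemetry_test.csv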
@ -0,0 +1,6 @@
{
    "subscription_id": ".......",
    "resource_group": ".......",
    "workspace_name": ".......",
    "workspace_region": "......."
}
@ -0,0 +1,2 @@
1.62168882e+02, 4.82427351e+02, 1.09748253e+02, 4.32529303e+01, 3.52377597e+01, 4.37307613e+01, 1.15729573e+01, 4.27624778e+00, 1.68042813e+02, 4.61654301e+02, 1.03138200e+02, 4.08555785e+01, 1.80809993e+01, 4.85402042e+01, 1.09373285e+01, 4.18269355e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 3.07200000e+03, 5.64000000e+02, 2.22900000e+03, 9.84000000e+02, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 3.03000000e+02, 6.63000000e+02, 3.18300000e+03, 3.03000000e+02, 5.34300000e+03, 4.26300000e+03, 6.88200000e+03, 1.02300000e+03, 1.80000000e+01
@ -0,0 +1,33 @@
datetime,machineID,volt,rotate,pressure,vibration
8/9/2015 5:00:00 AM,89,156.022596809483,499.186773543787,94.6935081356238,47.9299454212229
8/9/2015 6:00:00 AM,89,189.338289348546,481.699267606406,115.672119136436,33.2653137178549
8/9/2015 7:00:00 AM,89,161.157427445239,428.727765607777,98.9656750339584,41.4441610087944
8/9/2015 8:00:00 AM,89,161.502408348608,453.436246296314,103.372133704982,39.2582621215621
8/9/2015 9:00:00 AM,89,162.711527502725,474.580253904076,106.913159242991,38.3757632773898
8/9/2015 10:00:00 AM,89,166.443990032135,431.671589345972,103.886570207659,39.9456973566939
8/9/2015 11:00:00 AM,89,178.877688597966,375.234725956535,84.5290039772805,33.7262250941115
8/9/2015 12:00:00 PM,89,152.710305382723,528.240939612068,117.482500743972,42.2221279079703
8/9/2015 1:00:00 PM,89,178.297689547913,439.9008196176,86.5297382410314,41.8368689980528
8/9/2015 2:00:00 PM,89,178.449967910224,461.03136701172,98.3063510756247,43.8636647313615
8/9/2015 3:00:00 PM,89,173.639372721488,450.261022137965,112.418628429993,37.4570141147091
8/9/2015 4:00:00 PM,89,153.189370579857,352.018187762502,98.9312630483397,29.0874981562648
8/9/2015 5:00:00 PM,89,199.945957715423,421.809350524228,91.2802844059766,39.2105928790095
8/9/2015 6:00:00 PM,89,166.408336082299,466.800808863573,117.552067959932,43.5182645195745
8/9/2015 7:00:00 PM,89,167.376369450821,522.687921580833,95.3470267846314,38.6090811803213
8/9/2015 8:00:00 PM,89,138.101762905172,431.361050254412,94.9800280124435,36.9840404537683
8/9/2015 9:00:00 PM,89,149.29819536088,488.138545310211,110.176286869331,44.7414170785692
8/9/2015 10:00:00 PM,89,169.349342103404,420.718947026669,90.2096031418544,44.4246012177021
8/9/2015 11:00:00 PM,89,157.884780585546,374.46347545389,105.531747266353,37.4048802607342
8/10/2015 12:00:00 AM,89,174.92102638734,393.681456167383,94.2123283383687,38.4380787184679
8/10/2015 1:00:00 AM,89,173.859334040321,474.998720872934,114.831449991881,26.9997142587449
8/10/2015 2:00:00 AM,89,147.507135631244,434.592467073247,109.14774266869,38.0553522602426
8/10/2015 3:00:00 AM,89,182.508384464887,475.127724817095,88.7916931828417,36.9715744818552
8/10/2015 4:00:00 AM,89,196.365633856682,392.765937285152,72.888759644257,45.5800850607391
8/10/2015 5:00:00 AM,89,188.078669648455,407.441122417009,97.5390134742126,34.3381545913848
8/10/2015 6:00:00 AM,89,146.16908298412,526.089383569558,97.3899974138672,33.3395802812222
8/10/2015 7:00:00 AM,89,180.966858727778,539.805309324902,98.3638631679925,39.6400150497035
8/10/2015 8:00:00 AM,89,173.114942080223,475.993018274367,94.3389221073905,44.808235501154
8/10/2015 9:00:00 AM,89,165.710025826903,541.353097455816,97.6247228539178,35.4394473823794
8/10/2015 10:00:00 AM,89,203.685406201796,473.07005430003,76.5413087938538,45.7219151497101
8/10/2015 11:00:00 AM,89,193.493560935218,441.351844215338,89.5960554969496,40.9367542256887
8/10/2015 12:00:00 PM,89,177.320149053588,285.642227983577,87.9600045132346,45.7532914751573
|