Configuration for environment customization (#206)
Parent: 1dbb72c400
Commit: 73602bd78c
@@ -1,4 +1,10 @@
 # Pipeline for the canary deployment workflow.
+
+resources:
+  containers:
+  - container: mlops
+    image: mcr.microsoft.com/mlops/python:latest
+
 pr: none
 trigger:
   branches:
@@ -31,7 +37,7 @@ stages:
     timeoutInMinutes: 0
     pool:
       vmImage: 'ubuntu-latest'
-    container: mcr.microsoft.com/mlops/python:latest
+    container: mlops
     steps:
     - task: AzureCLI@1
       inputs:

@@ -1,4 +1,10 @@
 # Pipeline to run basic code quality tests as part of pull requests to the master branch.
+
+resources:
+  containers:
+  - container: mlops
+    image: mcr.microsoft.com/mlops/python:latest
+
 trigger: none
 pr:
   branches:
@@ -8,11 +14,11 @@ pr:
 pool:
   vmImage: 'ubuntu-latest'
 
-container: mcr.microsoft.com/mlops/python:latest
+container: mlops
 
 variables:
 - template: diabetes_regression-variables.yml
 - group: devopsforai-aml-vg
 
 steps:
-- template: azdo-base-pipeline.yml
+- template: azdo-base-pipeline.yml

@@ -1,4 +1,10 @@
 # Continuous Integration (CI) pipeline that orchestrates the training, evaluation, registration, deployment, and testing of the diabetes_regression model.
+
+resources:
+  containers:
+  - container: mlops
+    image: mcr.microsoft.com/mlops/python:latest
+
 pr: none
 trigger:
   branches:
@@ -25,7 +31,7 @@ stages:
   jobs:
   - job: "Model_CI_Pipeline"
     displayName: "Model CI Pipeline"
-    container: mcr.microsoft.com/mlops/python:latest
+    container: mlops
     timeoutInMinutes: 0
     steps:
     - template: azdo-base-pipeline.yml
@@ -48,7 +54,7 @@ stages:
   - job: "Get_Pipeline_ID"
     condition: and(succeeded(), eq(coalesce(variables['auto-trigger-training'], 'true'), 'true'))
     displayName: "Get Pipeline ID for execution"
-    container: mcr.microsoft.com/mlops/python:latest
+    container: mlops
     timeoutInMinutes: 0
     steps:
     - task: AzureCLI@1
@@ -84,7 +90,7 @@ stages:
     dependsOn: "Run_ML_Pipeline"
     condition: always()
     displayName: "Determine if evaluation succeeded and new model is registered"
-    container: mcr.microsoft.com/mlops/python:latest
+    container: mlops
     timeoutInMinutes: 0
     steps:
     - template: diabetes_regression-template-get-model-version.yml
@@ -96,7 +102,7 @@ stages:
   jobs:
   - job: "Deploy_ACI"
     displayName: "Deploy to ACI"
-    container: mcr.microsoft.com/mlops/python:latest
+    container: mlops
     timeoutInMinutes: 0
     steps:
     - template: diabetes_regression-template-get-model-version.yml
@@ -129,7 +135,7 @@ stages:
   jobs:
   - job: "Deploy_AKS"
     displayName: "Deploy to AKS"
-    container: mcr.microsoft.com/mlops/python:latest
+    container: mlops
     timeoutInMinutes: 0
     steps:
     - template: diabetes_regression-template-get-model-version.yml
@@ -163,7 +169,7 @@ stages:
   jobs:
   - job: "Deploy_Webapp"
     displayName: "Deploy to Webapp"
-    container: mcr.microsoft.com/mlops/python:latest
+    container: mlops
     timeoutInMinutes: 0
     steps:
     - template: diabetes_regression-template-get-model-version.yml

@@ -1,4 +1,10 @@
 # Builds the container image that is used by other pipelines for scoring.
+
+resources:
+  containers:
+  - container: mlops
+    image: mcr.microsoft.com/mlops/python:latest
+
 pr: none
 trigger:
   branches:
@@ -16,7 +22,7 @@ trigger:
 pool:
   vmImage: 'ubuntu-latest'
 
-container: mcr.microsoft.com/mlops/python:latest
+container: mlops
 
 variables:
 - group: devopsforai-aml-vg
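
The pipeline diffs above all follow one pattern: the image is declared once under `resources.containers` with the alias `mlops`, and each job (or the pipeline as a whole, for single-job pipelines) refers to that alias instead of repeating the image name. A minimal sketch of a single-job pipeline after the change is shown below, with indentation restored; the `script` step is a placeholder for illustration and is not part of the commit.

```yaml
# Declare the job container once, under a short alias.
resources:
  containers:
  - container: mlops                               # alias referenced below
    image: mcr.microsoft.com/mlops/python:latest   # the only place the image name appears

trigger: none

pool:
  vmImage: 'ubuntu-latest'

# Refers to the alias declared above rather than to the full image name.
container: mlops

steps:
- script: python --version   # placeholder step, for illustration only
```

With this layout, switching a pipeline to a custom image should only require editing the `image:` line of the resource (plus an `endpoint:` entry when the image lives in a private registry).
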
@@ -2,6 +2,8 @@
 
 To use this existing project structure and scripts for your new ML project, you can quickly get started from the existing repository, bootstrap and create a template that works for your ML project. Bootstrapping will prepare a similar directory structure for your project which includes renaming files and folders, deleting and cleaning up some directories and fixing imports and absolute path based on your project name. This will enable reusing various resources like pre-built pipelines and scripts for your new project.
 
+## Generating the project structure
+
 To bootstrap from the existing MLOpsPython repository clone this repository, ensure Python is installed locally, and run bootstrap.py script as below
 
 `python bootstrap.py --d [dirpath] --n [projectname]`
@@ -11,3 +13,11 @@ Where `[dirpath]` is the absolute path to the root of your directory where MLOps
 The script renames folders, files and files' content from the base project name `diabetes` to your project name. However, you might need to manually rename variables defined in a variable group and their values.
 
 [This article](https://docs.microsoft.com/azure/machine-learning/tutorial-convert-ml-experiment-to-production#use-your-own-model-with-mlopspython-code-template) will also assist to use this code template for your own ML project.
+
+## Customizing the CI and AML environments
+
+In your project you will want to customize your own Docker image and Conda environment to use only the dependencies and tools required for your use case. This requires you to edit the following environment definition files:
+- The Azure ML training and scoring Conda environment defined in [conda_dependencies.yml](diabetes_regression/conda_dependencies.yml).
+- The CI Docker image and Conda environment used by the Azure DevOps build agent. See [instructions for customizing the Azure DevOps job container](../docs/custom_container.md).
+
+You will want to synchronize dependency versions as appropriate between both environment definitions (for example, ML libraries used both in training and in unit tests).
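
One way to keep the two definitions in step is to pin the shared packages to identical versions in both files. The fragments below are only an illustration of that idea: the environment names, the package list and every version number are assumptions, not the actual contents of the repository's files.

```yaml
# diabetes_regression/conda_dependencies.yml (training and scoring)
# Hypothetical contents, for illustration only.
name: diabetes_regression_training_env
dependencies:
  - python=3.7.5
  - scikit-learn=0.22.1
  - pip
  - pip:
      - azureml-sdk==1.1.5
```

```yaml
# diabetes_regression/ci_dependencies.yml (CI build agent)
# Hypothetical contents, for illustration only.
name: mlopspython_ci
dependencies:
  - python=3.7.5          # same interpreter version as the training environment
  - scikit-learn=0.22.1   # same pin as the training environment
  - pytest=5.3.1          # CI-only tool, not needed for training or scoring
  - pip
  - pip:
      - azureml-sdk==1.1.5
```

Only the packages that appear in both files need matching pins; CI-only tools such as test runners can be versioned independently.
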
@@ -0,0 +1,99 @@
+# Customizing the Azure DevOps job container
+
+The model training and deployment pipeline uses a Docker container
+on the Azure Pipelines agents to provide a reproducible environment
+to run test and deployment code.
+The image of the container
+`mcr.microsoft.com/mlops/python:latest` is built with this
+[Dockerfile](../environment_setup/Dockerfile).
+
+In your project you will want to build your own
+Docker image that only contains the dependencies and tools required for your
+use case. This image will likely be smaller and therefore faster, and it
+will be fully maintained by your team.
+
+## Provision an Azure Container Registry
+
+An Azure Container Registry is deployed alongside your Azure ML workspace to manage models.
+You can use that registry instance to store your MLOps container image as well, or
+provision a separate instance.
+
+## Create a Registry Service Connection
+
+[Create a service connection](https://docs.microsoft.com/en-us/azure/devops/pipelines/library/service-endpoints?view=azure-devops&tabs=yaml#sep-docreg) to your Azure Container Registry:
+- As *Connection type*, select *Docker Registry*
+- As *Registry type*, select *Azure Container Registry*
+- As *Azure container registry*, select your Container registry instance
+- As *Service connection name*, enter `acrconnection`
+
+## Update the environment definition
+
+Modify the [Dockerfile](../environment_setup/Dockerfile) and/or the
+[ci_dependencies.yml](../diabetes_regression/ci_dependencies.yml) CI Conda
+environment definition to tailor your environment.
+Conda provides a [reusable environment for training and deployment with Azure Machine Learning](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-use-environments).
+The Conda environment used for CI should use the same package versions as the Conda environment
+used for the Azure ML training and scoring environments (defined in [conda_dependencies.yml](../diabetes_regression/conda_dependencies.yml)).
+This enables you to run unit and integration tests using the exact same dependencies as used in the ML pipeline.
+
+If a package is available in a Conda package repository, then we recommend that
+you use the Conda installation rather than the pip installation. Conda packages
+typically come with prebuilt binaries that make installation more reliable.
+
+## Create a container build pipeline
+
+In your [Azure DevOps](https://dev.azure.com) project create a new build
+pipeline referring to the
+[environment_setup/docker-image-pipeline.yml](../environment_setup/docker-image-pipeline.yml)
+pipeline definition in your forked repository.
+
+Edit the [environment_setup/docker-image-pipeline.yml](../environment_setup/docker-image-pipeline.yml) file
+and replace the string `'public/mlops/python'` with a name suitable to describe your environment,
+e.g. `'mlops/diabetes_regression'`.
+
+Save and run the pipeline. This will build and push a container image to your Azure Container Registry with
+the name you have just edited. The next step is to modify the build pipeline to run the CI job on a container
+run from that image.
+
+## Modify the model pipeline
+
+Modify the model pipeline file [diabetes_regression-ci-build-train.yml](../.pipelines/diabetes_regression-ci-build-train.yml) by replacing this section:
+
+```
+resources:
+  containers:
+  - container: mlops
+    image: mcr.microsoft.com/mlops/python:latest
+```
+
+with (using the image name previously defined):
+
+```
+resources:
+  containers:
+  - container: mlops
+    image: mlops/diabetes_regression
+    endpoint: acrconnection
+```
+
+Run the pipeline and verify that it used your container.
+
+## Addressing conflicting dependencies
+
+Especially when working in a team, it's possible for environment changes across branches to interfere with one another.
+
+For example, if the master branch is using scikit-learn and you create a branch to use TensorFlow instead, and you
+decide to remove scikit-learn from the
+[ci_dependencies.yml](../diabetes_regression/ci_dependencies.yml) Conda environment definition
+and run the [docker-image-pipeline.yml](../environment_setup/docker-image-pipeline.yml) Docker image pipeline,
+then the master branch will stop building.
+
+You could leave scikit-learn in addition to TensorFlow in the environment, but that is not ideal, as you would have to take an extra step to remove scikit-learn after merging your branch to master.
+
+A better approach would be to use a distinct name for your modified environment, such as `mlops/diabetes_regression/tensorflow`.
+By changing the name of the image in your branch in both the container build pipeline
+[environment_setup/docker-image-pipeline.yml](../environment_setup/docker-image-pipeline.yml)
+and the model pipeline file
+[diabetes_regression-ci-build-train.yml](../.pipelines/diabetes_regression-ci-build-train.yml),
+and running both pipelines in sequence on your branch,
+you avoid any branch conflicts, and the name does not have to be changed after merging to master.
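
On the TensorFlow branch from the example above, the container resource in the model pipeline might look like the following sketch. It reuses the `acrconnection` service connection and the `mlops/diabetes_regression/tensorflow` image name from the text; the matching name change in `docker-image-pipeline.yml` is not shown, since that file's contents are not reproduced here.

```yaml
resources:
  containers:
  - container: mlops                              # alias stays the same, so job definitions are untouched
    image: mlops/diabetes_regression/tensorflow   # branch-specific image name
    endpoint: acrconnection                       # service connection to your Azure Container Registry
```

Because only the `image:` value differs between branches, the master branch keeps building against its own image while the branch builds against the new one.
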
@@ -157,7 +157,7 @@ performs linting, unit testing and publishes a training pipeline.
 ### Set up the Pipeline
 
 In your [Azure DevOps](https://dev.azure.com) project create and run a new build
-pipeline referring to the [diabetes_regression-ci-build-train.yml](./.pipelines/azdo-ci-build-train.yml)
+pipeline referring to the [diabetes_regression-ci-build-train.yml](../.pipelines/diabetes_regression-ci-build-train.yml)
 pipeline definition in your forked repository:
 
 ![configure ci build pipeline](./images/ci-build-pipeline-configure.png)
@@ -193,7 +193,7 @@ specified). Example ML pipelines using R have a single step to train a model. Th
 
 * The third stage of the pipeline, **Deploy to ACI**, deploys the model to the QA environment in [Azure Container Instances](https://azure.microsoft.com/en-us/services/container-instances/). It then runs a *smoke test* to validate the deployment, i.e. sends a sample query to the scoring web service and verifies that it returns a response in the expected format.
 
-The pipeline uses a Docker container on the Azure Pipelines agents to accomplish the pipeline steps. The image of the container ***mcr.microsoft.com/mlops/python:latest*** is built with this [Dockerfile](./environment_setup/Dockerfile) and it has all necessary dependencies installed for the purposes of this repository. This image serves as an example of using a custom Docker image that provides a pre-baked environment. This environment is guaranteed to be the same on any building agent, VM or local machine. In your project you will want to build your own Docker image that only contains the dependencies and tools required for your use case. This image will be more likely smaller and therefore faster, and it will be totally maintained by your team.
+The pipeline uses a Docker container on the Azure Pipelines agents to accomplish the pipeline steps. The image of the container ***mcr.microsoft.com/mlops/python:latest*** is built with this [Dockerfile](../environment_setup/Dockerfile) and it has all necessary dependencies installed for the purposes of this repository. This image serves as an example of using a custom Docker image that provides a pre-baked environment. This environment is guaranteed to be the same on any building agent, VM or local machine. In your project you will want to build your own Docker image that only contains the dependencies and tools required for your use case. This image will be more likely smaller and therefore faster, and it will be totally maintained by your team.
 
 Wait until the pipeline finishes and verify that there is a new model in the **ML Workspace**:
 
@@ -261,6 +261,7 @@ Make sure your webapp has the credentials to pull the image from the Azure Conta
 * The provided pipeline definition YAML file is a sample starting point, which you should tailor to your processes and environment.
 * You should edit the pipeline definition to remove unused stages. For example, if you are deploying to ACI and AKS, you should delete the unused `Deploy_Webapp` stage.
 * You may wish to enable [manual approvals](https://docs.microsoft.com/en-us/azure/devops/pipelines/process/approvals) before the deployment stages.
+* You may want to use [Azure DevOps self-hosted agents](https://docs.microsoft.com/en-us/azure/devops/pipelines/agents/agents?view=azure-devops&tabs=browser#install) to speed up your ML pipeline execution. The Docker container image for the ML pipeline is sizable, and having it cached on the agent between runs can trim several minutes from your runs.
 * You can install additional Conda or pip packages by modifying the YAML environment configurations under the `diabetes_regression` directory. Make sure to use fixed version numbers for all packages to ensure reproducibility, and use the same versions across environments.
 * You can explore aspects of model observability in the solution, such as:
   * **Logging**: navigate to the Application Insights instance linked to the Azure ML Portal,