Azure Machine Learning Enterprise Deployments Live event
In this session you will learn how to design and implement Azure Machine Learning (AzureML) using enterprise deployment features, so that you can create a secure configuration that complies with your company's policies (see Enterprise security and governance for AzureML). The content of this session is available in our GitHub repository.
Agenda
No. | Topic | Feature | Description |
---|---|---|---|
00. | Intro | | Introduction to the presenters and the overall session |
01. | Azure ML Components | | An overview of the resources deployed with AzureML |
02. | How to Organise | Overview | Guide around the common decision points when planning a deployment |
03. | | Team Structure | The way your Data Science teams are organised and collaborate on projects given use case and data segregation, or cost management requirements. |
04. | | Environments | The environments used as part of your development and release workflow to segregate development from production |
05. | | Regions | The location of your data and the audience you need to serve your Machine Learning solution to |
06. | Enterprise Security | | Security and governance features |
07. | | Networks | Virtual Networks, Private Endpoints |
08. | | Identity | Authentication, Users, and Roles |
09. | | Data protection | Failover & disaster recovery |
10. | Training & deployment | Working with data | Accessing data, Datasets, and Datastores |
11. | | Training | Scaling securely |
12. | | Deploy with endpoints | Real-time, Batch, and Pipelines |
13. | | Deployment targets | Where and how to deploy using AKS, ACI, and Managed Endpoints |
14. | | Monitoring | Monitor for availability, performance, and operation |
15. | Costs | Cost management | Best practices to optimize costs, manage budgets, and share quota with Azure Machine Learning |
Additional samples
A list of curated AzureML samples:
- Fun with AzureML repo
- Many models solution accelerator
- Official AzureML notebook samples
- MLOps starter
Frequently asked questions
Here is a list of great questions that came up during the live sessions:
- What is the difference between a compute cluster and an inference cluster? You can use an Azure Machine Learning compute cluster to distribute a training or batch inference process across a cluster of CPU or GPU compute nodes in the cloud. An inference cluster refers to an Azure Kubernetes Service (AKS) cluster where Azure Machine Learning can deploy trained machine learning models as real-time endpoints.
- How do I prepare a compute instance so that it already has a selected list of Python packages installed? You can use a setup script when deploying the compute instance, as described in the official Microsoft documentation.
- Is there a way to assign a cluster to specific users or user groups? Clusters "belong" to the workspace, and all users who have access to the workspace can use them. Normally the security boundary of a workspace is drawn around the ML project, where all team members have access to the same data and compute resources. This is not to be confused with compute instances, which are dedicated to a single user (as they may contain SSH keys to access remote git repositories). One can create a compute instance on behalf of another user but cannot use it.
- Can AzureML assist me with distributed training? Yes, the compute clusters of AzureML support both data and model parallelism. Start from this documentation and then reach out to your FastTrack for Azure Engineer owner or PM to get more technical support.
- How do I ensure that my blob store doesn't get filled with random data over time? You can use Azure Blob Storage lifecycle management, as described in this AzureML cost optimization article.
- Can one restrict access to the functionality of the workspace? Yes, AzureML integrates with Azure's role-based access control (RBAC) model, allowing you to fine-tune access as seen in this article.
- Why does AutoML show 100% sampling in the best model summary? Sampling is automatically enabled by AutoML to handle imbalanced data when needed. In our example, all data were used.
- What are the best practices to version my data, since the dataset only keeps a reference to my actual files? Have a look at this article for guidance regarding data versioning.
- What is the role of Key Vault in AzureML? If you are not using the identity-based data access option of AzureML, your datastore’s access key is stored in the associated Key Vault, which allows you to audit access to those secrets as seen in this article. Moreover, because Key Vault is a key component of AzureML, you can pull secrets using the overall AzureML authentication. For example, from code, you can use the `azureml.core.keyvault.Keyvault` class to pull secrets easily.
- I want to query my corporate databases from AzureML, should I store the credentials in the Key Vault? Although Key Vault is a great place to store credentials, I would not advise querying the corporate databases directly from within AzureML. Doing so risks latency, potential ingestion costs, security issues, and an accidental denial-of-service attack from an AzureML compute cluster that may "hammer" the database with parallel requests. Instead, I would advise copying the training data into ADLS Gen 2 using Azure Data Factory or Azure Synapse Analytics and then reading the data from there. You can read the hitchhiker’s guide to the data lake for guidance and best practices on organizing the files in ADLS Gen 2.
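
For the Key Vault question above, here is a minimal sketch of pulling a secret through the workspace's default Key Vault with the AzureML Python SDK v1 (`azureml-core`). The helper name `read_workspace_secret` and the secret name `db-password` are hypothetical and chosen only for illustration. The sketch relies on `Workspace.get_default_keyvault()` and `Keyvault.get_secret()` from the SDK. The helper only needs an object that exposes `get_default_keyvault()`, so it can be tried out without a live workspace.

```python
def read_workspace_secret(workspace, secret_name):
    """Return a secret from the Key Vault associated with an AzureML workspace.

    `workspace` is expected to be an azureml.core.Workspace (SDK v1), for
    example one obtained with Workspace.from_config(). Illustrative helper only.
    """
    if not secret_name:
        raise ValueError("secret_name must be a non-empty string")
    # get_default_keyvault() returns an azureml.core.keyvault.Keyvault object.
    keyvault = workspace.get_default_keyvault()
    # get_secret() looks the secret up by name in that Key Vault.
    return keyvault.get_secret(name=secret_name)


# Usage against a real workspace (requires azureml-core and a config.json):
#   from azureml.core import Workspace
#   ws = Workspace.from_config()
#   password = read_workspace_secret(ws, "db-password")  # hypothetical secret name
```

Keeping the lookup behind a small function makes it easy to swap in a stub workspace in unit tests, so no credentials are needed outside Azure.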
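
For the blob storage question above, the following sketch builds an Azure Blob Storage lifecycle management policy that deletes stale blobs. The function name, rule name, container prefix `my-container/scratch/`, and 90-day retention period are all assumptions made for illustration. Only the overall JSON shape (`rules`, `definition`, `filters`, `actions`) follows the lifecycle management policy format.

```python
import json


def build_delete_policy(prefix, days_after_modification):
    """Build a lifecycle management policy as a plain dict.

    The single rule deletes block blobs under `prefix` once they have gone
    unmodified for more than `days_after_modification` days.
    """
    if days_after_modification < 0:
        raise ValueError("days_after_modification must be non-negative")
    return {
        "rules": [
            {
                "enabled": True,
                "name": "delete-stale-blobs",  # hypothetical rule name
                "type": "Lifecycle",
                "definition": {
                    "filters": {
                        "blobTypes": ["blockBlob"],
                        # A prefix match starts with the container name.
                        "prefixMatch": [prefix],
                    },
                    "actions": {
                        "baseBlob": {
                            "delete": {
                                "daysAfterModificationGreaterThan": days_after_modification
                            }
                        }
                    },
                },
            }
        ]
    }


if __name__ == "__main__":
    # Hypothetical container/prefix and retention period.
    policy = build_delete_policy("my-container/scratch/", 90)
    print(json.dumps(policy, indent=2))
```

The printed JSON can be saved to a file and applied to the storage account, for instance through the Azure portal or with `az storage account management-policy create`. Review the prefix carefully first, since the rule permanently deletes matching blobs.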