From 3caedcb140d69c8f3044cd99b11111d384d2aecc Mon Sep 17 00:00:00 2001 From: Elkhan Yusubov Date: Tue, 16 May 2023 18:36:55 -0400 Subject: [PATCH] Fixed typos and updated references in 11 documents (#251) --- SECURITY.md | 2 +- code/aadScim/README.md | 12 +++---- ...nagementAnalytics-AzureDevOpsDeployment.md | 10 +++--- ...ataManagementAnalytics-CreateRepository.md | 4 +-- ...gementAnalytics-GitHubActionsDeployment.md | 8 ++--- docs/DataManagementAnalytics-KnownIssues.md | 2 +- docs/DataManagementAnalytics-Prerequisites.md | 36 +++++++++---------- ...ataManagementAnalytics-ServicePrincipal.md | 12 +++---- ...taManagementAnalytics-DatabricksGeneral.md | 7 ++-- ...nagementAnalytics-DatabricksManualSetup.md | 2 +- ...ManagementAnalytics-DatabricksScimSetup.md | 8 ++--- 11 files changed, 51 insertions(+), 52 deletions(-) diff --git a/SECURITY.md b/SECURITY.md index 8739751..7ad411c 100644 --- a/SECURITY.md +++ b/SECURITY.md @@ -4,7 +4,7 @@ Microsoft takes the security of our software products and services seriously, which includes all source code repositories managed through our GitHub organizations, which include [Microsoft](https://github.com/Microsoft), [Azure](https://github.com/Azure), [DotNet](https://github.com/dotnet), [AspNet](https://github.com/aspnet), [Xamarin](https://github.com/xamarin), and [our GitHub organizations](https://opensource.microsoft.com/). -If you believe you have found a security vulnerability in any Microsoft-owned repository that meets [Microsoft's definition of a security vulnerability](https://docs.microsoft.com/previous-versions/tn-archive/cc751383(v=technet.10)), please report it to us as described below. +If you believe you have found a security vulnerability in any Microsoft-owned repository that meets [Microsoft's definition of a security vulnerability](https://learn.microsoft.com/previous-versions/tn-archive/cc751383(v=technet.10)), please report it to us as described below. 
## Reporting Security Issues diff --git a/code/aadScim/README.md b/code/aadScim/README.md index 57613f3..6d32a2a 100644 --- a/code/aadScim/README.md +++ b/code/aadScim/README.md @@ -1,27 +1,27 @@ # Required Access for Service Principal for AAD SCIM automation for Databricks -[Link](https://docs.microsoft.com/en-us/graph/application-provisioning-configure-api?view=graph-rest-1.0&tabs=http) +[Link](https://learn.microsoft.com/en-us/azure/active-directory/app-provisioning/application-provisioning-configuration-api) Directory.ReadWrite.All, Application.ReadWrite.OwnedBy 1. Instantiate Application Rights: Application.ReadWrite.All, Directory.ReadWrite.All -Link: [Link](https://docs.microsoft.com/en-us/graph/api/applicationtemplate-instantiate?tabs=http&view=graph-rest-beta) +Link: [Link](https://learn.microsoft.com/en-us/graph/api/applicationtemplate-instantiate?tabs=http&view=graph-rest-beta) 2. Create the provisioning job based on the template Rights: Application.ReadWrite.OwnedBy, Directory.ReadWrite.All -Link: [Link](https://docs.microsoft.com/en-us/graph/api/synchronization-synchronizationtemplate-list?tabs=http&view=graph-rest-beta) +Link: [Link](https://learn.microsoft.com/en-us/graph/api/synchronization-synchronizationtemplate-list?tabs=http&view=graph-rest-beta) 3. Create the provisioning job Rights: Application.ReadWrite.OwnedBy, Directory.ReadWrite.All -Link: [Link](https://docs.microsoft.com/en-us/graph/api/synchronization-synchronizationjob-post?tabs=http&view=graph-rest-beta) +Link: [Link](https://learn.microsoft.com/en-us/graph/api/synchronization-synchronizationjob-post?tabs=http&view=graph-rest-beta) 4. 
Validate Credentials Rights: Application.ReadWrite.OwnedBy, Directory.ReadWrite.All -Link: [Link](https://docs.microsoft.com/en-us/graph/api/synchronization-synchronizationjob-validatecredentials?tabs=http&view=graph-rest-beta) +Link: [Link](https://learn.microsoft.com/en-us/graph/api/synchronization-synchronizationjob-validatecredentials?tabs=http&view=graph-rest-beta) 5. Save your credentials @@ -31,4 +31,4 @@ Link: TBD (not defined in the docs) 6. Start the provisioning job Rights: Application.ReadWrite.OwnedBy, Directory.ReadWrite.All -Link: [Link](https://docs.microsoft.com/en-us/graph/api/synchronization-synchronizationjob-start?tabs=http&view=graph-rest-beta) +Link: [Link](https://learn.microsoft.com/en-us/graph/api/synchronization-synchronizationjob-start?tabs=http&view=graph-rest-beta) diff --git a/docs/DataManagementAnalytics-AzureDevOpsDeployment.md b/docs/DataManagementAnalytics-AzureDevOpsDeployment.md index a205f1e..4ffa388 100644 --- a/docs/DataManagementAnalytics-AzureDevOpsDeployment.md +++ b/docs/DataManagementAnalytics-AzureDevOpsDeployment.md @@ -16,7 +16,7 @@ In the previous step we have generated a JSON output similar to the following, w First, you need to create an Azure Resource Manager service connection. To do so, execute the following steps: -1. First, you need to create an Azure DevOps Project. Instructions can be found [here](https://docs.microsoft.com/azure/devops/organizations/projects/create-project?view=azure-devops&tabs=preview-page). +1. First, you need to create an Azure DevOps Project. Instructions can be found [here](https://learn.microsoft.com/azure/devops/organizations/projects/create-project?view=azure-devops&tabs=preview-page). 1. In Azure DevOps, open the **Project settings**. 1. Now, select the **Service connections** page from the project settings page. 1. Choose **New service connection** and select **Azure Resource Manager**. @@ -33,7 +33,7 @@ First, you need to create an Azure Resource Manager service connection. 
To do so ![Connection DevOps](/docs/images/ConnectionDevOps.png) -More information can be found [here](https://docs.microsoft.com/azure/devops/pipelines/library/connect-to-azure?view=azure-devops#create-an-azure-resource-manager-service-connection-with-an-existing-service-principal). +More information can be found [here](https://learn.microsoft.com/azure/devops/pipelines/library/connect-to-azure?view=azure-devops#create-an-azure-resource-manager-service-connection-with-an-existing-service-principal). ## Update Parameters @@ -42,7 +42,7 @@ In order to deploy the Infrastructure as Code (IaC) templates to the desired Azu - `.ado/workflows/dataLandingZoneDeployment.yml` and - `infra/params.dev.json`. -Update these files in a seperate branch and then merge via Pull Request to trigger the initial deployment. +Update these files in a separate branch and then merge via Pull Request to trigger the initial deployment. ### Configure `dataLandingZoneDeployment.yml` @@ -61,7 +61,7 @@ The following table explains each of the parameters: |:--------------------------------------------|:-------------|:-------------| | **AZURE_SUBSCRIPTION_ID** | Specifies the subscription ID of the Data Landing Zone where all the resources will be deployed |
`xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx`
| | **AZURE_LOCATION** | Specifies the region where you want the resources to be deployed. Please check [Supported Regions](/docs/DataManagementAnalytics-Prerequisites.md#supported-regions). | `northeurope` | -| **AZURE_RESOURCE_MANAGER _CONNECTION_NAME** | Specifies the resource manager connection name in Azure DevOps. More details on how to create the resource manager service connection in Azure DevOps was described in the previous paragraph or [here](https://docs.microsoft.com/azure/devops/pipelines/library/connect-to-azure?view=azure-devops#create-an-azure-resource-manager-service-connection-with-an-existing-service-principal). | `my-connection-name` | +| **AZURE_RESOURCE_MANAGER _CONNECTION_NAME** | Specifies the resource manager connection name in Azure DevOps. More details on how to create the resource manager service connection in Azure DevOps were described in the previous paragraph or [here](https://learn.microsoft.com/azure/devops/pipelines/library/connect-to-azure?view=azure-devops#create-an-azure-resource-manager-service-connection-with-an-existing-service-principal). | `my-connection-name` | ### Configure `params.dev.json` @@ -150,7 +150,7 @@ After following the instructions and updating the parameters and variables in yo **Congratulations!** You have successfully executed all steps to deploy the template into your environment through Azure DevOps. -Now, you can navigate to the pipeline that you have created as part of step 5 and monitor it as each service is deployed. If you run into any issues, please check the [Known Issues](/docs/DataManagementAnalytics-KnownIssues.md) first and open an [issue](https://github.com/Azure/data-landing-zone/issues) if you come accross a potential bug in the repository. +Now, you can navigate to the pipeline that you have created as part of step 5 and monitor it as each service is deployed.
If you run into any issues, please check the [Known Issues](/docs/DataManagementAnalytics-KnownIssues.md) first and open an [issue](https://github.com/Azure/data-landing-zone/issues) if you come across a potential bug in the repository. >[Previous](/docs/DataManagementAnalytics-ServicePrincipal.md) >[Next](/docs/DataManagementAnalytics-KnownIssues.md) diff --git a/docs/DataManagementAnalytics-CreateRepository.md b/docs/DataManagementAnalytics-CreateRepository.md index f7dc4ba..52f67e7 100644 --- a/docs/DataManagementAnalytics-CreateRepository.md +++ b/docs/DataManagementAnalytics-CreateRepository.md @@ -1,6 +1,6 @@ # Data Landing Zone - Create repository from the template -First, you must generate your own respository based off this template respository. To do so, please follow the steps below: +First, you must generate your own repository based off this template repository. To do so, please follow the steps below: 1. On GitHub, navigate to the [main page of this repository](https://github.com/Azure/data-landing-zone). 1. Above the file list, click **Use this template** @@ -12,7 +12,7 @@ First, you must generate your own respository based off this template respositor ![Create Repository from Template](/docs/images/CreateRepoGH.png) 1. Type a name for your repository and an optional description. -1. Choose a repository visibility. For more information, see "[About repository visibility](https://docs.github.com/en/github/creating-cloning-and-archiving-repositories/about-repository-visibility)." +1. Choose a repository visibility. For more information, see "[About repository visibility](https://docs.github.com/en/github/creating-cloning-and-archiving-repositories/about-repository-visibility)." 1. Optionally, to include the directory structure and files from all branches in the template and not just the default branch, select **Include all branches**. 1. Click **Create repository from template**.
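The `AZURE_CREDENTIALS` value referenced by the deployment docs is the JSON emitted when the service principal is created. A minimal local sanity check before adding it as a secret might look like the sketch below; the placeholder values and the file name `azure-credentials.json` are assumptions, not part of the repository.

```shell
# Hypothetical sketch: write service principal JSON with placeholder values,
# then confirm the four fields the deployment workflows expect are present.
cat > azure-credentials.json <<'EOF'
{
  "clientId": "00000000-0000-0000-0000-000000000000",
  "clientSecret": "placeholder-secret",
  "subscriptionId": "00000000-0000-0000-0000-000000000000",
  "tenantId": "00000000-0000-0000-0000-000000000000"
}
EOF

# Report any missing field instead of failing silently.
for key in clientId clientSecret subscriptionId tenantId; do
  grep -q "\"$key\"" azure-credentials.json || echo "missing: $key"
done
```

With real output from service principal creation, any `missing:` line indicates the secret would be incomplete.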
diff --git a/docs/DataManagementAnalytics-GitHubActionsDeployment.md b/docs/DataManagementAnalytics-GitHubActionsDeployment.md index a772534..e3988ec 100644 --- a/docs/DataManagementAnalytics-GitHubActionsDeployment.md +++ b/docs/DataManagementAnalytics-GitHubActionsDeployment.md @@ -12,9 +12,9 @@ In the previous step we have generated a JSON output similar to the following, w } ``` -## Adding Secrets to GitHub respository +## Adding Secrets to GitHub repository -If you want to use GitHub Actions for deploying the resources, add the JSON output as a [repository secret](https://docs.github.com/en/actions/reference/encrypted-secrets#creating-encrypted-secrets-for-a-repository) with the name `AZURE_CREDENTIALS` in your GitHub repository: +If you want to use GitHub Actions for deploying the resources, add the JSON output as a [repository secret](https://docs.github.com/en/actions/reference/encrypted-secrets#creating-encrypted-secrets-for-a-repository) with the name `AZURE_CREDENTIALS` in your GitHub repository: ![GitHub Secrets](/docs/images/AzureCredentialsGH.png) @@ -35,7 +35,7 @@ In order to deploy the Infrastructure as Code (IaC) templates to the desired Azu - `.github/workflows/dataLandingZoneDeployment.yml` and - `infra/params.dev.json`. -Update these files in a seperate branch and then merge via Pull Request to trigger the initial deployment. +Update these files in a separate branch and then merge via Pull Request to trigger the initial deployment. ### Configure `dataLandingZoneDeployment.yml` @@ -104,7 +104,7 @@ After following the instructions and updating the parameters and variables in yo **Congratulations!** You have successfully executed all steps to deploy the template into your environment through GitHub Actions. -Now, you can navigate to the **Actions** tab of the main page of the repository, where you will see a workflow with the name `Data Landing Zone Deployment` running. Click on it to see how it deploys the environment.
If you run into any issues, please check the [Known Issues](/docs/DataManagementAnalytics-KnownIssues.md) first and open an [issue](https://github.com/Azure/data-landing-zone/issues) if you come accross a potential bug in the repository. +Now, you can navigate to the **Actions** tab of the main page of the repository, where you will see a workflow with the name `Data Landing Zone Deployment` running. Click on it to see how it deploys the environment. If you run into any issues, please check the [Known Issues](/docs/DataManagementAnalytics-KnownIssues.md) first and open an [issue](https://github.com/Azure/data-landing-zone/issues) if you come across a potential bug in the repository. >[Previous](/docs/DataManagementAnalytics-ServicePrincipal.md) >[Next](/docs/DataManagementAnalytics-KnownIssues.md) diff --git a/docs/DataManagementAnalytics-KnownIssues.md b/docs/DataManagementAnalytics-KnownIssues.md index f0d59ec..9931b52 100644 --- a/docs/DataManagementAnalytics-KnownIssues.md +++ b/docs/DataManagementAnalytics-KnownIssues.md @@ -18,7 +18,7 @@ ERROR: Deployment failed. Correlation ID: *** **Solution:** -This error message appears, in case during the deployment it tries to create a type of resource which has never been deployed before inside the subscription. We recommend to check prior the deployment whether the required resource providers are registered for your subscription and if needed, register them through the `Azure Portal`, `Azure Powershell` or `Azure CLI` as mentioned [here](https://docs.microsoft.com/azure/azure-resource-manager/management/resource-providers-and-types). +This error message appears when the deployment tries to create a type of resource which has never been deployed before inside the subscription.
We recommend checking, prior to the deployment, whether the required resource providers are registered for your subscription and, if needed, registering them through the `Azure Portal`, `Azure PowerShell` or `Azure CLI` as mentioned [here](https://learn.microsoft.com/azure/azure-resource-manager/management/resource-providers-and-types). ## Error: ProvisioningDisabled diff --git a/docs/DataManagementAnalytics-Prerequisites.md b/docs/DataManagementAnalytics-Prerequisites.md index 2f04afe..cee0de8 100644 --- a/docs/DataManagementAnalytics-Prerequisites.md +++ b/docs/DataManagementAnalytics-Prerequisites.md @@ -1,31 +1,31 @@ # Data Landing Zone - Prerequisites -This template repsitory contains all templates to deploy the Data Landing Zone of the Cloud-scale Analytics architecture. The Data Landing Zone is a logical construct and a unit of scale in the Cloud-scale Analytics architecture that enables data retention and execution of data workloads for generating insights and value with data. +This template repository contains all templates to deploy the Data Landing Zone of the Cloud-scale Analytics architecture. The Data Landing Zone is a logical construct and a unit of scale in the Cloud-scale Analytics architecture that enables data retention and execution of data workloads for generating insights and value with data. ## What will be deployed? -By navigating through the deployment steps, you will deploy the folowing setup in a subscription: +By navigating through the deployment steps, you will deploy the following setup in a subscription: -> **Note:** Before deploying the resources, we recommend to check registration status of the required resource providers in your subscription. For more information, see [Resource providers for Azure services](https://docs.microsoft.com/azure/azure-resource-manager/management/resource-providers-and-types).
+> **Note:** Before deploying the resources, we recommend checking the registration status of the required resource providers in your subscription. For more information, see [Resource providers for Azure services](https://learn.microsoft.com/azure/azure-resource-manager/management/resource-providers-and-types). ![Data Landing Zone](/docs/images/DataLandingZone.png) The deployment and code artifacts include the following services: -- [Virtual Network](https://docs.microsoft.com/azure/virtual-network/virtual-networks-overview) -- [Network Security Groups](https://docs.microsoft.com/azure/virtual-network/network-security-groups-overview) -- [Route Tables](https://docs.microsoft.com/azure/virtual-network/virtual-networks-udr-overview) -- [Key Vault](https://docs.microsoft.com/azure/key-vault/general) -- [Storage Account](https://docs.microsoft.com/azure/storage/common/storage-account-overview) -- [Data Lake Storage Gen2](https://docs.microsoft.com/azure/storage/blobs/data-lake-storage-introduction) -- [Data Factory](https://docs.microsoft.com/azure/data-factory/) -- [Self-Hosted Integration Runtime](https://docs.microsoft.com/azure/data-factory/create-self-hosted-integration-runtime) -- [Log Analytics](https://docs.microsoft.com/azure/azure-monitor/learn/quick-create-workspace) -- [SQL Server](https://docs.microsoft.com/sql/sql-server/?view=sql-server-ver15) -- [SQL Database](https://docs.microsoft.com/azure/azure-sql/database/) -- [Synapse Workspace](https://docs.microsoft.com/azure/synapse-analytics/) -- [Databricks](https://docs.microsoft.com/azure/databricks/) -- [Event Hub](https://docs.microsoft.com/azure/event-hubs/) +- [Virtual Network](https://learn.microsoft.com/azure/virtual-network/virtual-networks-overview) +- [Network Security Groups](https://learn.microsoft.com/azure/virtual-network/network-security-groups-overview) +- [Route Tables](https://learn.microsoft.com/azure/virtual-network/virtual-networks-udr-overview) +- [Key
Vault](https://learn.microsoft.com/azure/key-vault/general) +- [Storage Account](https://learn.microsoft.com/azure/storage/common/storage-account-overview) +- [Data Lake Storage Gen2](https://learn.microsoft.com/azure/storage/blobs/data-lake-storage-introduction) +- [Data Factory](https://learn.microsoft.com/azure/data-factory/) +- [Self-Hosted Integration Runtime](https://learn.microsoft.com/azure/data-factory/create-self-hosted-integration-runtime) +- [Log Analytics](https://learn.microsoft.com/azure/azure-monitor/learn/quick-create-workspace) +- [SQL Server](https://learn.microsoft.com/sql/sql-server/?view=sql-server-ver15) +- [SQL Database](https://learn.microsoft.com/azure/azure-sql/database/) +- [Synapse Workspace](https://learn.microsoft.com/azure/synapse-analytics/) +- [Databricks](https://learn.microsoft.com/azure/databricks/) +- [Event Hub](https://learn.microsoft.com/azure/event-hubs/) ## Code Structure @@ -77,7 +77,7 @@ Before we start with the deployment, please make sure that you have the followin - A **Data Management Landing Zone** deployed. For more information, check the [Data Management Landing Zone](https://github.com/Azure/data-management-zone) repository. - An Azure subscription. If you don't have an Azure subscription, [create your Azure free account today](https://azure.microsoft.com/free/). -- [User Access Administrator](https://docs.microsoft.com/azure/role-based-access-control/built-in-roles#user-access-administrator) or [Owner](https://docs.microsoft.com/azure/role-based-access-control/built-in-roles#owner) access to the subscription to be able to create a service principal and role assignments for it. +- [User Access Administrator](https://learn.microsoft.com/azure/role-based-access-control/built-in-roles#user-access-administrator) or [Owner](https://learn.microsoft.com/azure/role-based-access-control/built-in-roles#owner) access to the subscription to be able to create a service principal and role assignments for it. 
- For the deployment, please choose one of the **Supported Regions**. ## Deployment diff --git a/docs/DataManagementAnalytics-ServicePrincipal.md b/docs/DataManagementAnalytics-ServicePrincipal.md index 6567e63..a6bf801 100644 --- a/docs/DataManagementAnalytics-ServicePrincipal.md +++ b/docs/DataManagementAnalytics-ServicePrincipal.md @@ -2,7 +2,7 @@ A service principal with *Contributor*, *User Access Administrator*, *Private DNS Zone Contributor* and *Network Contributor* rights needs to be generated for authentication and authorization from GitHub or Azure DevOps to your Azure subscription. This is required to deploy resources to your environment. -> **Note:** The number of role assignments can be further reduced in a production scenario. The **Network Contributor** role assignment is just required in this repository to automatically setup the vnet peering between the data management landing zone and the data landing zone. Without this, DNS resolution will not work and in- and outbound traffic will be dropped because there is no line of sight to the Azure Firewall. The **Private DNS Zone Contributor** is also not required if the deployment of DNS A-records of the Private Endpoints is automated through Azure Policies with `deployIfNotExists` effect. The same is true for the **User Access Administrator** because some of the deployment can be automated using `deployIfNotExists` policies. +> **Note:** The number of role assignments can be further reduced in a production scenario. The **Network Contributor** role assignment is only required in this repository to automatically set up the VNet peering between the data management landing zone and the data landing zone. Without this, DNS resolution will not work and in- and outbound traffic will be dropped because there is no line of sight to the Azure Firewall.
The **Private DNS Zone Contributor** is also not required if the deployment of DNS A-records of the Private Endpoints is automated through Azure Policies with `deployIfNotExists` effect. The same is true for the **User Access Administrator** because some of the deployment can be automated using `deployIfNotExists` policies. ## Create Service Principal @@ -34,17 +34,17 @@ This will generate the following JSON output: > **Note:** Take note of the output. It will be required for the next steps. -## Adding additional role assigments +## Adding additional role assignments For automation purposes, more role assignments are required for the service principal. Additional required role assignments include: | Role Name | Description | Scope | |:----------|:------------|:------| -| [Private DNS Zone Contributor](https://docs.microsoft.com/azure/role-based-access-control/built-in-roles#private-dns-zone-contributor) | We expect you to deploy all Private DNS Zones for all data services into a single subscription and resource group. Therefor, the service principal needs to be Private DNS Zone Contributor on the global dns resource group which was created during the Data Management Landing Zone deployment. This is required to deploy A-records for the respective private endpoints.|
(Resource Group Scope) `/subscriptions/{datamanagement-subscriptionId}/resourceGroups/{resourceGroupName}`
| -| [Network Contributor](https://docs.microsoft.com/azure/role-based-access-control/built-in-roles#network-contributor) | In order to setup vnet peering between the Data Landing Zone vnet and the Data Management Landing Zone vnet, the service principal needs **Network Contributor** access rights on the resource group of the remote vnet. |
(Resource Group Scope) `/subscriptions/{datamanagement-subscriptionId}/resourceGroups/{resourceGroupName}`
| -| [User Access Administrator](https://docs.microsoft.com/azure/role-based-access-control/built-in-roles#user-access-administrator) | Required to share the self-hosted integration runtime that gets deployed into the `integration-rg` resource group with other Data Factories, like the one in the `shared-integration-rg` resource group, the service principal needs **User Access Administrator** rights on the Data Factory that gets deployed into the `integration-rg` resource group. It is also required to assign the Data Factory and Synapse managed identities access on the respective storage account file systems. |
(Resource Scope) `/subscriptions/{datalandingzone-subscriptionId}`
| -| [Reader](https://docs.microsoft.com/en-us/azure/role-based-access-control/built-in-roles#reader) | Required to read properties of the Purview account from the Data Management Landing Zone and then grant the managed identity of Purview **Reader** and **Storage Blob Data Reader** rights on the subscription. |
(Resource Scope) `/subscriptions/{{datamanagement-subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.Purview/accounts/{purviewAccountName}`
| +| [Private DNS Zone Contributor](https://learn.microsoft.com/azure/role-based-access-control/built-in-roles#private-dns-zone-contributor) | We expect you to deploy all Private DNS Zones for all data services into a single subscription and resource group. Therefore, the service principal needs to be Private DNS Zone Contributor on the global DNS resource group which was created during the Data Management Landing Zone deployment. This is required to deploy A-records for the respective private endpoints.|
(Resource Group Scope) `/subscriptions/{datamanagement-subscriptionId}/resourceGroups/{resourceGroupName}`
| +| [Network Contributor](https://learn.microsoft.com/azure/role-based-access-control/built-in-roles#network-contributor) | In order to set up VNet peering between the Data Landing Zone VNet and the Data Management Landing Zone VNet, the service principal needs **Network Contributor** access rights on the resource group of the remote VNet. |
(Resource Group Scope) `/subscriptions/{datamanagement-subscriptionId}/resourceGroups/{resourceGroupName}`
| +| [User Access Administrator](https://learn.microsoft.com/azure/role-based-access-control/built-in-roles#user-access-administrator) | To share the self-hosted integration runtime that gets deployed into the `integration-rg` resource group with other Data Factories, like the one in the `shared-integration-rg` resource group, the service principal needs **User Access Administrator** rights on the Data Factory that gets deployed into the `integration-rg` resource group. The role is also required to assign the Data Factory and Synapse managed identities access on the respective storage account file systems. |
(Resource Scope) `/subscriptions/{datalandingzone-subscriptionId}`
| +| [Reader](https://learn.microsoft.com/en-us/azure/role-based-access-control/built-in-roles#reader) | Required to read properties of the Purview account from the Data Management Landing Zone and then grant the managed identity of Purview **Reader** and **Storage Blob Data Reader** rights on the subscription. |
(Resource Scope) `/subscriptions/{datamanagement-subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.Purview/accounts/{purviewAccountName}`
| To add these role assignments, you can use the [Azure Portal](https://portal.azure.com/) or run the following commands using Azure CLI/Azure Powershell: diff --git a/docs/guidance/DataManagementAnalytics-DatabricksGeneral.md b/docs/guidance/DataManagementAnalytics-DatabricksGeneral.md index c06280f..36c9918 100644 --- a/docs/guidance/DataManagementAnalytics-DatabricksGeneral.md +++ b/docs/guidance/DataManagementAnalytics-DatabricksGeneral.md @@ -4,7 +4,7 @@ This document provides guidance for Azure Databricks and some of the considerati ## Enforcing Tags for Cost Management -Cluster tags allow to easily monitor the cost of cloud resources used by various groups in an organization. More information regarding tags can be found [here](https://docs.microsoft.com/en-us/azure/databricks/administration-guide/account-settings/usage-detail-tags-azure#tag-propagation) +Cluster tags allow to easily monitor the cost of cloud resources used by various groups in an organization. More information regarding tags can be found [here](https://learn.microsoft.com/en-us/azure/databricks/administration-guide/account-settings/usage-detail-tags-azure#tag-propagation) > **Note:** By default, Azure Databricks applies the following tags to each cluster: **Vendor**, **Creator**, **ClusterName** and **ClusterId** @@ -34,8 +34,8 @@ As a result, when the cluster is created, the VMs provisioned in the Managed Res > *These are just notes on how the external Hive metastore is configured for Azure Databricks. > This should be incorporated more formally into the documentation.* -- Hive metastore settings are changed by a global init script. This script is managed by the new [Global Init Scripts](https://docs.databricks.com/clusters/init-scripts.html#global-init-scripts) API. As of January, 2021, the new global init scripts API is in public preview. 
However, Microsoft's official position is that public preview features in Azure Databricks are ready for production environments and are supported by the Support Team. For more details, see: -[Azure Databricks Preview Releases](https://docs.microsoft.com/en-us/azure/databricks/release-notes/release-types) +- Hive metastore settings are changed by a global init script. This script is managed by the new [Global Init Scripts](https://docs.databricks.com/clusters/init-scripts.html#global-init-scripts) API. As of January 2021, the new global init scripts API is in public preview. However, Microsoft's official position is that public preview features in Azure Databricks are ready for production environments and are supported by the Support Team. For more details, see: +[Azure Databricks Preview Releases](https://learn.microsoft.com/en-us/azure/databricks/release-notes/release-types) - This solution uses [Azure Database for MySQL](https://azure.microsoft.com/en-us/services/mysql/) to store the Hive metastore. This database was chosen because it is more cost effective and because MySQL is highly compatible with Hive. @@ -44,4 +44,3 @@ As a result, when the cluster is created, the VMs provisioned in the Managed Res > *These are just notes on about the Spark Monitoring application and how the JAR files are loaded.* - Instead of copying pre-built binaries for the Spark Monitoring solution, the deployment process includes a step that will download the source code from GitHub and build it. This creates the JAR files that are then copied to the DBFS. This is handled in a notebook that is executed whenever a new Databricks workspace is provisioned.
- diff --git a/docs/guidance/DataManagementAnalytics-DatabricksManualSetup.md b/docs/guidance/DataManagementAnalytics-DatabricksManualSetup.md index 624b54a..021cfac 100644 --- a/docs/guidance/DataManagementAnalytics-DatabricksManualSetup.md +++ b/docs/guidance/DataManagementAnalytics-DatabricksManualSetup.md @@ -23,7 +23,7 @@ In order to simplify the manual setup and configuration of Databricks, we are pr The following prerequisites are required before executing the Powershell script: -1. First, you have to create an AAD application as described in [this subsection](https://docs.microsoft.com/en-us/azure/databricks/dev-tools/api/latest/aad/app-aad-token#configure-an-app-in-azure-portal). Please take note of the **Application (client) ID** and the **Directory (tenant) ID** of the AAD application, after executing the steps successfully. +1. First, you have to create an AAD application as described in [this subsection](https://learn.microsoft.com/en-us/azure/databricks/dev-tools/api/latest/aad/app-aad-token#configure-an-app-in-azure-portal). Please take note of the **Application (client) ID** and the **Directory (tenant) ID** of the AAD application, after executing the steps successfully. 1. You need to have **Owner** or **Contributor** access rights to the Databricks workspace. ## Execution of Powershell script diff --git a/docs/guidance/DataManagementAnalytics-DatabricksScimSetup.md b/docs/guidance/DataManagementAnalytics-DatabricksScimSetup.md index 96dce8f..4afdcfb 100644 --- a/docs/guidance/DataManagementAnalytics-DatabricksScimSetup.md +++ b/docs/guidance/DataManagementAnalytics-DatabricksScimSetup.md @@ -7,10 +7,10 @@ Azure Databricks supports SCIM, or System for Cross-domain Identity Management, - Databricks Workspace - Global Administrator rights for Azure AD -## Setup Databricks Scim Enterprise Application +## Setup Databricks SCIM Enterprise Application -1. Generate a Databricks Personal Access Token. 
A detailed step by step guidance can be found [here](https://docs.microsoft.com/en-us/azure/databricks/dev-tools/api/latest/authentication#--generate-a-personal-access-token). -2. Create an AAD Application/Service Principal. A detailed step by step guidance can be found [here](https://docs.microsoft.com/en-us/azure/databricks/dev-tools/api/latest/aad/app-aad-token#configure-an-app-in-azure-portal). Instead of granting it the `AzureDatabricks` API permission, grant it the following API permissions: +1. Generate a Databricks Personal Access Token. Detailed step-by-step guidance can be found [here](https://learn.microsoft.com/en-us/azure/databricks/dev-tools/api/latest/authentication#--generate-a-personal-access-token). +2. Create an AAD Application/Service Principal. Detailed step-by-step guidance can be found [here](https://learn.microsoft.com/en-us/azure/databricks/dev-tools/api/latest/aad/app-aad-token#configure-an-app-in-azure-portal). Instead of granting it the `AzureDatabricks` API permission, grant it the following API permissions: - Microsoft Graph > Application permissions > Directory.Readwrite.All - Microsoft Graph > Application permissions > Application.Readwrite.Ownedby - Microsoft Graph > Application permissions > Application.ReadWrite.All @@ -34,4 +34,4 @@ Azure Databricks supports SCIM, or System for Cross-domain Identity Management, -GroupIdList @('{objectId1}', '{objectId2}', ...) ``` -This script will automatically setup the AAD Enterprise Application for the SCIM synch. If you want to synch additional groups to your Databricks workspace, then go to the Enterprise Application with the name `{databricksWorkspaceName}-scim` and follow [these steps](https://docs.microsoft.com/en-us/azure/databricks/administration-guide/users-groups/scim/aad#assign-users-and-groups-to-the-application). +This script will automatically set up the AAD Enterprise Application for the SCIM sync.
If you want to sync additional groups to your Databricks workspace, go to the Enterprise Application named `{databricksWorkspaceName}-scim` and follow [these steps](https://learn.microsoft.com/en-us/azure/databricks/administration-guide/users-groups/scim/aad#assign-users-and-groups-to-the-application).
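The role-assignment scopes in the service principal document all follow the same Azure resource-ID pattern. The sketch below composes one such scope and prints (rather than runs) the matching Azure CLI command; the subscription ID, resource group name `global-dns`, and service principal object ID are placeholder assumptions.

```shell
# Hypothetical values: substitute your own IDs before using for real.
DM_SUBSCRIPTION_ID="00000000-0000-0000-0000-000000000000"
DNS_RESOURCE_GROUP="global-dns"   # assumed resource group name
SP_OBJECT_ID="11111111-1111-1111-1111-111111111111"

# Resource-group scope as used for the Private DNS Zone Contributor row.
SCOPE="/subscriptions/${DM_SUBSCRIPTION_ID}/resourceGroups/${DNS_RESOURCE_GROUP}"

# Print the command instead of executing it, so the sketch is side-effect free.
echo az role assignment create \
  --assignee "$SP_OBJECT_ID" \
  --role "Private DNS Zone Contributor" \
  --scope "$SCOPE"
```

The other rows in the table differ only in role name and in how deep the scope path goes (subscription, resource group, or individual resource).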