* Renaming ESA to DMA

* * added new gif
* updated portal experience

* filename updates

* linting

* Fixing someone else's linting errors!!

* fixing broken links

* * updated file names
* updated broken links

* check links automatically

* updated link checker

* check failed links

* updated paths and linting yml

* fixed few more links

* esa to dma

* updated default tags across all deployments

Co-authored-by: Marvin Buss <marvin.buss@gmail.com>
This commit is contained in:
Espen Brandt-Kjelsen 2021-12-13 10:00:10 +01:00 committed by GitHub
Parent f45e7f7c6b
Commit 4c33814b13
No key matching this signature was found
GPG key ID: 4AEE18F83AFDEB23
28 changed files with 155 additions and 154 deletions

View file

@ -1,35 +1,35 @@
# Enterprise-Scale Analytics - Data Management Zone
# Data Management & Analytics Scenario - Data Management Zone
## Objective
The [Enterprise-Scale Analytics](https://aka.ms/adopt/datamanagement) architecture provides a prescriptive data platform design coupled with Azure best practices and design principles. These principles serve as a compass for subsequent design decisions across critical technical domains. The architecture will continue to evolve alongside the Azure platform and is ultimately driven by the various design decisions that organizations must make to define their Azure data journey.
The [Data Management & Analytics scenario](https://aka.ms/adopt/datamanagement) provides a prescriptive data platform design coupled with Azure best practices and design principles. These principles serve as a compass for subsequent design decisions across critical technical domains. The architecture will continue to evolve alongside the Azure platform and is ultimately driven by the various design decisions that organizations must make to define their Azure data journey.
The Enterprise-Scale Analytics architecture consists of two core building blocks:
The Data Management & Analytics architecture consists of two core building blocks:
1. *Data Management Zone* which provides all data management and data governance capabilities for the data platform of an organization.
1. *Data Landing Zone* which is a logical construct and a unit of scale in the Enterprise-Scale Analytics architecture that enables data retention and execution of data workloads for generating insights and value with data.
1. *Data Landing Zone* which is a logical construct and a unit of scale in the Data Management & Analytics architecture that enables data retention and execution of data workloads for generating insights and value with data.
The architecture is modular by design and allows organizations to start small with a single Data Management Zone and Data Landing Zone, but also allows to scale to a multi-subscription data platform environment by adding more Data Landing Zones to the architecture. Thereby, the reference design allows to implement different modern data platform patterns like data-mesh, data-fabric as well as traditional datalake architectures. Enterprise-Scale Analytics has been very well aligned with the data-mesh approach, and is ideally suited to help organizations build data products and share these across business units of an organization. If core recommendations are followed, the resulting target architecture will put the customer on a path to sustainable scale.
The architecture is modular by design and allows organizations to start small with a single Data Management Zone and Data Landing Zone, but also allows them to scale to a multi-subscription data platform environment by adding more Data Landing Zones to the architecture. The reference design thereby supports different modern data platform patterns such as data mesh and data fabric as well as traditional data lake architectures. Data Management & Analytics is well aligned with the data mesh approach and is ideally suited to help organizations build data products and share them across the business units of an organization. If core recommendations are followed, the resulting target architecture will put the customer on a path to sustainable scale.
![Enterprise-Scale Analytics](/docs/images/EnterpriseScaleAnalytics.gif)
![Data Management & Analytics](/docs/images/DataManagementAnalytics.gif)
---
_The Enterprise-Scale Analytics architecture represents the strategic design path and target technical state for your Azure data platform._
_The Data Management & Analytics architecture represents the strategic design path and target technical state for your Azure data platform._
---
This respository describes the Data Management Zone, which is classified as data management hub. It is the heart of the Enterprise-Scale Analytics architecture pattern and enables central governance of data assets across all Data Landing Zones. Enterprise-Scale Anayltics targets the deployment of a single Data Management Zone instance inside a tenant of an organization.
This repository describes the Data Management Zone, which is classified as a data management hub. It is the heart of the Data Management & Analytics architecture pattern and enables central governance of data assets across all Data Landing Zones. The Data Management & Analytics scenario targets the deployment of a single Data Management Zone instance inside a tenant of an organization.
> **Note:** Before getting started with the deployment, please make sure you are familiar with the [complementary documentation in the Cloud Adoption Framework](https://aka.ms/adopt/datamanagement). After deploying your Data Management Zone, please move on to the [Data Landing Zone](https://github.com/Azure/data-landing-zone) deployment to create an environment in which you can start working on generating insights and value with data. The minimal recommended setup consists of a single Data Management Zone and a single [Data Landing Zone](https://github.com/Azure/data-landing-zone).
## Deploy Enterprise-Scale Analytics
## Deploy Data Management & Analytics Scenario
The Enterprise-Scale Analytics architecture is modular by design and allows customers to start with a small footprint and grow over time. In order to not end up in a migration project, customers should decide upfront how they want to organize data domains across Data Landing Zones. All Enterprise-Scale Analytics architecture building blocks can be deployed through the Azure Portal as well as through GitHub Actions workflows and Azure DevOps Pipelines. The template repositories contain sample YAML pipelines to more quickly get started with the setup of the environments.
The Data Management & Analytics architecture is modular by design and allows customers to start with a small footprint and grow over time. To avoid ending up in a migration project, customers should decide upfront how they want to organize data domains across Data Landing Zones. All Data Management & Analytics architecture building blocks can be deployed through the Azure Portal as well as through GitHub Actions workflows and Azure DevOps Pipelines. The template repositories contain sample YAML pipelines to help you get started quickly with setting up the environments.
| Reference implementation | Description | Deploy to Azure | Link |
|:---------------------------|:------------|:----------------|------|
| Enterprise-Scale Analytics | Deploys a Data Management Zone and one or multiple [Data Landing Zone](https://github.com/Azure/data-landing-zone) all at once. Provides less options than the the individual Data Management Zone and Data Landing Zone deployment options. Helps you to quickly get started and make yourself familiar with the reference design. For more advanced scenarios, please deploy the artifacts individually. |[![Deploy To Azure](https://aka.ms/deploytoazurebutton)](https://portal.azure.com/#blade/Microsoft_Azure_CreateUIDef/CustomDeploymentBlade/uri/https%3A%2F%2Fraw.githubusercontent.com%2FAzure%2Fdata-management-zone%2Fmain%2Fdocs%2Freference%2FenterpriseScaleAnalytics.json/uiFormDefinitionUri/https%3A%2F%2Fraw.githubusercontent.com%2FAzure%2Fdata-management-zone%2Fmain%2Fdocs%2Freference%2Fportal.enterpriseScaleAnalytics.json) | |
| Data Management & Analytics Scenario | Deploys a Data Management Zone and one or multiple [Data Landing Zones](https://github.com/Azure/data-landing-zone) all at once. Provides fewer options than the individual Data Management Zone and Data Landing Zone deployment options. Helps you to quickly get started and make yourself familiar with the reference design. For more advanced scenarios, please deploy the artifacts individually. |[![Deploy To Azure](https://aka.ms/deploytoazurebutton)](https://portal.azure.com/#blade/Microsoft_Azure_CreateUIDef/CustomDeploymentBlade/uri/https%3A%2F%2Fraw.githubusercontent.com%2FAzure%2Fdata-management-zone%2Fmain%2Fdocs%2Freference%2FdataManagementAnalytics.json/uiFormDefinitionUri/https%3A%2F%2Fraw.githubusercontent.com%2FAzure%2Fdata-management-zone%2Fmain%2Fdocs%2Freference%2Fportal.dataManagementAnalytics.json) | |
| Data Management Zone | Deploys a single Data Management Zone to a subscription. |[![Deploy To Azure](https://aka.ms/deploytoazurebutton)](https://portal.azure.com/#blade/Microsoft_Azure_CreateUIDef/CustomDeploymentBlade/uri/https%3A%2F%2Fraw.githubusercontent.com%2FAzure%2Fdata-management-zone%2Fmain%2Finfra%2Fmain.json/uiFormDefinitionUri/https%3A%2F%2Fraw.githubusercontent.com%2FAzure%2Fdata-management-zone%2Fmain%2Fdocs%2Freference%2Fportal.dataManagementZone.json) | [Repository](https://github.com/Azure/data-management-zone) |
| Data Landing Zone | Deploys a single Data Landing Zone to a subscription. Please deploy a Data Management Zone first. |[![Deploy To Azure](https://aka.ms/deploytoazurebutton)](https://portal.azure.com/#blade/Microsoft_Azure_CreateUIDef/CustomDeploymentBlade/uri/https%3A%2F%2Fraw.githubusercontent.com%2FAzure%2Fdata-landing-zone%2Fmain%2Finfra%2Fmain.json/uiFormDefinitionUri/https%3A%2F%2Fraw.githubusercontent.com%2FAzure%2Fdata-landing-zone%2Fmain%2Fdocs%2Freference%2Fportal.dataLandingZone.json) | [Repository](https://github.com/Azure/data-landing-zone) |
| Data Product Batch | Deploys a Data Workload template for Data Batch Analysis to a resource group inside a [Data Landing Zone](https://github.com/Azure/data-landing-zone). Please deploy a Data Management Zone and [Data Landing Zone](https://github.com/Azure/data-landing-zone) first. |[![Deploy To Azure](https://aka.ms/deploytoazurebutton)](https://portal.azure.com/#blade/Microsoft_Azure_CreateUIDef/CustomDeploymentBlade/uri/https%3A%2F%2Fraw.githubusercontent.com%2FAzure%2Fdata-product-batch%2Fmain%2Finfra%2Fmain.json/uiFormDefinitionUri/https%3A%2F%2Fraw.githubusercontent.com%2FAzure%2Fdata-product-batch%2Fmain%2Fdocs%2Freference%2Fportal.dataProduct.json) | [Repository](https://github.com/Azure/data-product-batch) |
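For readers who prefer the command line over the portal buttons, a minimal sketch of the same Data Management Zone deployment with Azure PowerShell is shown below. The template URI, parameter file path and subscription scope are assumptions based on the links above; if the template actually targets a resource group, use `New-AzResourceGroupDeployment` instead.

```powershell
# Hedged sketch: deploy the Data Management Zone template via Azure PowerShell.
# Assumes infra/main.json is a subscription-scope template and that you have
# prepared a parameter file (e.g. the params.dev.json referenced later in the docs).
Connect-AzAccount
Set-AzContext -Subscription "<data-management-zone-subscription-id>"

New-AzSubscriptionDeployment `
  -Name "DataManagementZone" `
  -Location "northeurope" `
  -TemplateUri "https://raw.githubusercontent.com/Azure/data-management-zone/main/infra/main.json" `
  -TemplateParameterFile "./infra/params.dev.json"
```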
@ -40,13 +40,13 @@ The Enterprise-Scale Analytics architecture is modular by design and allows cust
To deploy the Data Management Zone into your Azure Subscription, please follow the step-by-step instructions:
1. [Prerequisites](/docs/EnterpriseScaleAnalytics-Prerequisites.md)
2. [Create repository](/docs/EnterpriseScaleAnalytics-CreateRepository.md)
3. [Setting up Service Principal](/docs/EnterpriseScaleAnalytics-ServicePrincipal.md)
1. [Prerequisites](/docs/DataManagementAnalytics-Prerequisites.md)
2. [Create repository](/docs/DataManagementAnalytics-CreateRepository.md)
3. [Setting up Service Principal](/docs/DataManagementAnalytics-ServicePrincipal.md)
4. Template Deployment
1. [GitHub Action Deployment](/docs/EnterpriseScaleAnalytics-GitHubActionsDeployment.md)
2. [Azure DevOps Deployment](/docs/EnterpriseScaleAnalytics-AzureDevOpsDeployment.md)
5. [Known Issues](/docs/EnterpriseScaleAnalytics-KnownIssues.md)
1. [GitHub Action Deployment](/docs/DataManagementAnalytics-GitHubActionsDeployment.md)
2. [Azure DevOps Deployment](/docs/DataManagementAnalytics-AzureDevOpsDeployment.md)
5. [Known Issues](/docs/DataManagementAnalytics-KnownIssues.md)
## Contributing

View file

@ -60,7 +60,7 @@ The following table explains each of the parameters:
| Parameter | Description | Sample value |
|:--------------------------------------------|:-------------|:-------------|
| **AZURE_SUBSCRIPTION_ID** | Specifies the subscription ID of the Data Management Zone where all the resources will be deployed | <div style="width: 36ch">`xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx`</div> |
| **AZURE_LOCATION** | Specifies the region where you want the resources to be deployed. Please check [Supported Regions](/docs/EnterpriseScaleAnalytics-Prerequisites.md). | `northeurope` |
| **AZURE_LOCATION** | Specifies the region where you want the resources to be deployed. Please check [Supported Regions](/docs/DataManagementAnalytics-Prerequisites.md#supported-regions). | `northeurope` |
| **AZURE_RESOURCE_MANAGER _CONNECTION_NAME** | Specifies the resource manager connection name in Azure DevOps. More details on how to create the resource manager service connection in Azure DevOps was described in the previous paragraph or [here](https://docs.microsoft.com/azure/devops/pipelines/library/connect-to-azure?view=azure-devops#create-an-azure-resource-manager-service-connection-with-an-existing-service-principal). | `my-connection-name` |
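If you are unsure which values to use for the first two variables, the following Azure PowerShell sketch shows one way to look them up (assuming the Az module is installed and you are signed in to the right tenant):

```powershell
# AZURE_SUBSCRIPTION_ID: list the subscriptions your account can see
Get-AzSubscription | Select-Object Name, Id

# AZURE_LOCATION: list Azure region names; cross-check against the
# Supported Regions section referenced above before picking one
Get-AzLocation | Select-Object Location, DisplayName
```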
### Configure `params.dev.json`
@ -136,11 +136,11 @@ After following the instructions and updating the parameters and variables in yo
**Congratulations!** You have successfully executed all steps to deploy the template into your environment through Azure DevOps.
Now, you can navigate to the pipeline that you have created as part of step 5 and monitor it as each service is deployed. If you run into any issues, please check the [Known Issues](/docs/EnterpriseScaleAnalytics-KnownIssues.md) first and open an [issue](https://github.com/Azure/data-management-zone/issues) if you come accross a potential bug in the repository.
Now, you can navigate to the pipeline that you have created as part of step 5 and monitor it as each service is deployed. If you run into any issues, please check the [Known Issues](/docs/DataManagementAnalytics-KnownIssues.md) first and open an [issue](https://github.com/Azure/data-management-zone/issues) if you come across a potential bug in the repository.
## Add Root Collection Administrator role to Purview account
Since the Purview account is being deployed using a Service Principal, at this point, the only identity that has access to the Purview data plane, is the Service Principal. Use the [instructions in this guide](https://docs.microsoft.com/azure/purview/tutorial-metadata-policy-collections-apis#add-the-root-collection-administrator-role) to assign additional user accounts to the Purview Root Collection as Collection Admin.
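As a hedged illustration of that step, the sketch below adds a user as Root Collection Admin through the Purview control plane with Azure PowerShell. The `addRootCollectionAdmin` action and the API version used here are assumptions; follow the linked guide for the authoritative procedure.

```powershell
# Illustrative only: all names and IDs are placeholders, and the control-plane
# action/API version below are assumptions based on the linked tutorial.
$subscriptionId = "<subscription-id>"
$resourceGroup  = "<purview-resource-group>"
$accountName    = "<purview-account-name>"
$objectId       = (Get-AzADUser -UserPrincipalName "admin@contoso.com").Id

Invoke-AzRestMethod `
  -Method POST `
  -Path "/subscriptions/$subscriptionId/resourceGroups/$resourceGroup/providers/Microsoft.Purview/accounts/$accountName/addRootCollectionAdmin?api-version=2021-07-01" `
  -Payload (@{ objectId = $objectId } | ConvertTo-Json)
```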
>[Previous](/docs/EnterpriseScaleAnalytics-ServicePrincipal.md)
>[Next](/docs/EnterpriseScaleAnalytics-KnownIssues.md)
>[Previous](/docs/DataManagementAnalytics-ServicePrincipal.md)
>[Next](/docs/DataManagementAnalytics-KnownIssues.md)

View file

@ -16,5 +16,5 @@ First, you must generate your own respository based off this template respositor
1. Optionally, to include the directory structure and files from all branches in the template and not just the default branch, select **Include all branches**.
1. Click **Create repository from template**.
>[Previous](/docs/EnterpriseScaleAnalytics-Prerequisites.md)
>[Next](/docs/EnterpriseScaleAnalytics-ServicePrincipal.md)
>[Previous](/docs/DataManagementAnalytics-Prerequisites.md)
>[Next](/docs/DataManagementAnalytics-ServicePrincipal.md)

View file

@ -52,7 +52,7 @@ The following table explains each of the parameters:
| Parameter | Description | Sample value |
|:--------------------------------------------|:-------------|:-------------|
| **AZURE_SUBSCRIPTION_ID** | Specifies the subscription ID of the Data Management Zone where all the resources will be deployed | <div style="width: 36ch">`xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx`</div> |
| **AZURE_LOCATION** | Specifies the region where you want the resources to be deployed. Please check [Supported Regions](/docs/EnterpriseScaleAnalytics-Prerequisites.md) | `northeurope` |
| **AZURE_LOCATION** | Specifies the region where you want the resources to be deployed. Please check [Supported Regions](/docs/DataManagementAnalytics-Prerequisites.md#supported-regions) | `northeurope` |
### Configure `params.dev.json`
@ -89,11 +89,11 @@ After following the instructions and updating the parameters and variables in yo
**Congratulations!** You have successfully executed all steps to deploy the template into your environment through GitHub Actions.
Now, you can navigate to the **Actions** tab of the main page of the repository, where you will see a workflow with the name `Data Management Zone Deployment` running. Click on it to see how it deploys the environment. If you run into any issues, please check the [Known Issues](/docs/EnterpriseScaleAnalytics-KnownIssues.md) first and open an [issue](https://github.com/Azure/data-management-zone/issues) if you come accross a potential bug in the repository.
Now, you can navigate to the **Actions** tab of the main page of the repository, where you will see a workflow with the name `Data Management Zone Deployment` running. Click on it to see how it deploys the environment. If you run into any issues, please check the [Known Issues](/docs/DataManagementAnalytics-KnownIssues.md) first and open an [issue](https://github.com/Azure/data-management-zone/issues) if you come across a potential bug in the repository.
## Add Root Collection Administrator role to Purview account
Since the Purview account is being deployed using a Service Principal, at this point, the only identity that has access to the Purview data plane, is the Service Principal. Use the [instructions in this guide](https://docs.microsoft.com/azure/purview/tutorial-metadata-policy-collections-apis#add-the-root-collection-administrator-role) to assign additional user accounts to the Purview Root Collection as Collection Admin.
>[Previous](/docs/EnterpriseScaleAnalytics-ServicePrincipal.md)
>[Next](/docs/EnterpriseScaleAnalytics-KnownIssues.md)
>[Previous](/docs/DataManagementAnalytics-ServicePrincipal.md)
>[Next](/docs/DataManagementAnalytics-KnownIssues.md)

View file

@ -73,7 +73,7 @@ This error message appears, if the Purview soft-limit has been reached inside yo
**Solution:**
This error message appears, when deploying Purview and the `Microsoft.EventHub`, `Microsoft.Storage` and/or `Microsoft.Purview` Resource Provider (RP) is not registered for the subscription or it was registered during early preview phase. If the `Microsoft.Purview` RP is already registered, please un-register and re-register it. We have released a fix to our Deploy To Azure Buttons to overcome this issue until it gets fixed on the Purview side. If you still run into this problem, please open an Issue in this repo and start (re-)registering the three Resource Providers manually through the `Azure Portal`, `Azure Powershell` or `Azure CLI` as mentioned [here](https://docs.microsoft.com/azure/azure-resource-manager/management/resource-providers-and-types). Once that is done, use the "Redeploy" button on the failed deployment to restart the deployment and successfully deploy the components.
This error message appears when deploying Purview if the `Microsoft.EventHub`, `Microsoft.Storage` and/or `Microsoft.Purview` Resource Provider (RP) is not registered for the subscription or was registered during the early preview phase. If the `Microsoft.Purview` RP is already registered, please un-register and re-register it. We have released a fix to our Deploy To Azure Buttons to overcome this issue until it gets fixed on the Purview side. If you still run into this problem, please open an Issue in this repository and start (re-)registering the three Resource Providers manually through the `Azure Portal`, `Azure PowerShell` or `Azure CLI` as mentioned [here](https://docs.microsoft.com/azure/azure-resource-manager/management/resource-providers-and-types). Once that is done, use the "Redeploy" button on the failed deployment to restart the deployment and successfully deploy the components.
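A minimal sketch of that manual (re-)registration with Azure PowerShell, assuming you have sufficient rights on the subscription:

```powershell
Set-AzContext -Subscription "<subscription-id>"

# Re-register Microsoft.Purview if it was registered during the early preview phase
# (unregistering fails if Purview resources still exist in the subscription)
Unregister-AzResourceProvider -ProviderNamespace "Microsoft.Purview"
Register-AzResourceProvider   -ProviderNamespace "Microsoft.Purview"

# Register the remaining required resource providers
Register-AzResourceProvider -ProviderNamespace "Microsoft.EventHub"
Register-AzResourceProvider -ProviderNamespace "Microsoft.Storage"

# Verify the registration state before using the "Redeploy" button
Get-AzResourceProvider -ProviderNamespace "Microsoft.Purview" |
    Select-Object ProviderNamespace, RegistrationState
```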
>[Previous (Option (a) GitHub Actions)](/docs/EnterpriseScaleAnalytics-GitHubActionsDeployment.md)
>[Previous (Option (b) Azure DevOps)](/docs/EnterpriseScaleAnalytics-AzureDevOpsDeployment.md)
>[Previous (Option (a) GitHub Actions)](/docs/DataManagementAnalytics-GitHubActionsDeployment.md)
>[Previous (Option (b) Azure DevOps)](/docs/DataManagementAnalytics-AzureDevOpsDeployment.md)

View file

@ -1,6 +1,6 @@
# Data Management Zone - Prerequisites
This template repsitory contains all templates to deploy the Data Management Zone of the Enterprise-Scale Analytics architecture. The Data Management Zone is the central management instance to govern all data assets across all Data Landing Zones and possible even beyond that.
This template repository contains all templates to deploy the Data Management Zone of the Data Management & Analytics architecture. The Data Management Zone is the central management instance to govern all data assets across all Data Landing Zones and possibly even beyond that.
## What will be deployed?
@ -83,9 +83,9 @@ To use the Deploy to Azure Button, please click on the button below:
| Reference implementation | Description | Deploy to Azure |
|:---------------------------|:------------|:----------------|
| Enterprise-Scale Analytics | Deploys a Data Management Zone and one or multiple [Data Landing Zone](https://github.com/Azure/data-landing-zone) all at once. Provides less options than the the individual Data Management Zone and Data Landing Zone deployment options. Helps you to quickly get started and make yourself familiar with the reference design. For more advanced scenarios, please deploy the artifacts individually. |[![Deploy To Azure](https://aka.ms/deploytoazurebutton)](https://portal.azure.com/#blade/Microsoft_Azure_CreateUIDef/CustomDeploymentBlade/uri/https%3A%2F%2Fraw.githubusercontent.com%2FAzure%2Fdata-management-zone%2Fmain%2Fdocs%2Freference%2FenterpriseScaleAnalytics.json/uiFormDefinitionUri/https%3A%2F%2Fraw.githubusercontent.com%2FAzure%2Fdata-management-zone%2Fmain%2Fdocs%2Freference%2Fportal.enterpriseScaleAnalytics.json) |
| Data Management & Analytics Scenario | Deploys a Data Management Zone and one or multiple [Data Landing Zones](https://github.com/Azure/data-landing-zone) all at once. Provides fewer options than the individual Data Management Zone and Data Landing Zone deployment options. Helps you to quickly get started and make yourself familiar with the reference design. For more advanced scenarios, please deploy the artifacts individually. |[![Deploy To Azure](https://aka.ms/deploytoazurebutton)](https://portal.azure.com/#blade/Microsoft_Azure_CreateUIDef/CustomDeploymentBlade/uri/https%3A%2F%2Fraw.githubusercontent.com%2FAzure%2Fdata-management-zone%2Fmain%2Fdocs%2Freference%2FdataManagementAnalytics.json/uiFormDefinitionUri/https%3A%2F%2Fraw.githubusercontent.com%2FAzure%2Fdata-management-zone%2Fmain%2Fdocs%2Freference%2Fportal.dataManagementAnalytics.json) |
| Data Management Zone | Deploys a single Data Management Zone to a subscription. |[![Deploy To Azure](https://aka.ms/deploytoazurebutton)](https://portal.azure.com/#blade/Microsoft_Azure_CreateUIDef/CustomDeploymentBlade/uri/https%3A%2F%2Fraw.githubusercontent.com%2FAzure%2Fdata-management-zone%2Fmain%2Finfra%2Fmain.json/uiFormDefinitionUri/https%3A%2F%2Fraw.githubusercontent.com%2FAzure%2Fdata-management-zone%2Fmain%2Fdocs%2Freference%2Fportal.dataManagementZone.json) |
Alternatively, click on `Next` to follow the steps required to successfully deploy the Data Management Zone through GitHub Actions or Azure DevOps.
>[Next](/docs/EnterpriseScaleAnalytics-CreateRepository.md)
>[Next](/docs/DataManagementAnalytics-CreateRepository.md)

View file

@ -92,6 +92,6 @@ New-AzRoleAssignment `
-ResourceGroupName "{resourceGroupName}"
```
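For context, a fully spelled-out version of the truncated call above might look like the following sketch; the role name and identifiers are placeholders, not the specific values this guide requires.

```powershell
# Illustrative only: assign a role to the service principal at resource group scope.
New-AzRoleAssignment `
    -ApplicationId "<service-principal-application-id>" `
    -RoleDefinitionName "Contributor" `
    -ResourceGroupName "<resourceGroupName>"
```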
>[Previous](/docs/EnterpriseScaleAnalytics-CreateRepository.md)
>[Next (Option (a) GitHub Actions)](/docs/EnterpriseScaleAnalytics-GitHubActionsDeployment.md)
>[Next (Option (b) Azure DevOps)](/docs/EnterpriseScaleAnalytics-AzureDevOpsDeployment.md)
>[Previous](/docs/DataManagementAnalytics-CreateRepository.md)
>[Next (Option (a) GitHub Actions)](/docs/DataManagementAnalytics-GitHubActionsDeployment.md)
>[Next (Option (b) Azure DevOps)](/docs/DataManagementAnalytics-AzureDevOpsDeployment.md)

View file

@ -1,6 +1,6 @@
# Connecting to Environments Privately
The Enterprise-Scale reference architecture is secure by design and uses a multi-layered security approach to overcome common data exfiltration risks raised by customers. Features on a network, identity, data and service layer, enable customers to define granular access controls to only expose required data to a user. Even if some of these security mechanisms fail, data within the ESA platform stays secure.
The Data Management & Analytics Scenario reference architecture is secure by design and uses a multi-layered security approach to overcome common data exfiltration risks raised by customers. Features on the network, identity, data and service layers enable customers to define granular access controls to only expose required data to a user. Even if some of these security mechanisms fail, data within the platform stays secure.
Network features like private endpoints and disabled public network access greatly reduce the attack surface of a data platform of an organization. However, with these features enabled, additional steps need to be taken to successfully connect to the respective services like Storage Accounts, Synapse workspaces, Purview or Azure Machine Learning from the public internet. Therefore, this document will describe the most common options for connecting to services inside the Data Management Zone or Data Landing Zone in a simple and secure way.
@ -26,7 +26,7 @@ More details about Azure bAstion can be found [here](https://docs.microsoft.com/
### Deployment
To simplify the setup for Enterprise-Scale Aalytics users, we have been working on a Bicep/ARM template to quickly recreate this setup inside your Data Management Zone or Data Landing Zone. Our template will create the following setup inside your subscription:
To simplify the setup for Data Management & Analytics Scenario users, we have been working on a Bicep/ARM template to quickly recreate this setup inside your Data Management Zone or Data Landing Zone. Our template will create the following setup inside your subscription:
![Azure Bastion Architecture](/docs/images/AzureBastionArchitecture.png)
@ -66,7 +66,7 @@ If you want to connect to the VM using Azure Bastion, execute the following step
![Connect to New SQL Script](/docs/images/new-sql-script.png)
Only a single jumpbox in one of the Data Landing Zone is required to access services across all Data Landing Zones and Data Management Zones, if all the virtual networks have been peered with each other. More details on why this this network setup is recommended can be found [here](/docs/guidance/EnterpriseScaleAnalytics-NetworkArchitecture.md). A maximum of one Azure Bastion service is recommended per Data Landing Zone. If more users require access to the environment, additional Azure VMs can be added to the Data Landing Zone.
Only a single jumpbox in one of the Data Landing Zones is required to access services across all Data Landing Zones and Data Management Zones, if all the virtual networks have been peered with each other. More details on why this network setup is recommended can be found [here](/docs/guidance/DataManagementAnalytics-NetworkArchitecture.md). A maximum of one Azure Bastion service is recommended per Data Landing Zone. If more users require access to the environment, additional Azure VMs can be added to the Data Landing Zone.
## Point to Site (P2S) Connection

View file

@ -1,10 +1,10 @@
# Cost Estimates
This document will provide users an overview of what monthly cost can be expected when using Enterprise-Scale Analytics. Links to the official Cost Calculator will be provided where users can make changes depending on the expected amount of data and data throughput. All cost calculations will specify base cost of Enterprise-Scale Analytics. Base cost means the cost that occur if no data workloads run inside the respective subscriptions.
This document provides users an overview of the monthly cost that can be expected when using the Data Management & Analytics Scenario. Links to the official Cost Calculator are provided where users can make changes depending on the expected amount of data and data throughput. All cost calculations specify the base cost of the Data Management & Analytics Scenario. Base cost means the cost that occurs if no data workloads run inside the respective subscriptions.
## Data Management Zone
In a production scenario, it is recommended to rely on the Azure Firewall and Private DNS Zones hosted in the connectivity Hub of [Enterprise-Scale Landing Zones](https://github.com/Azure/Enterprise-Scale). In MVPs, users may rely on the Azure Firewall and Private DNS Zones bundled with Enterprise-Scale Analytics. Hence, two different cost calclations are provided which can be found below:
In a production scenario, it is recommended to rely on the Azure Firewall and Private DNS Zones hosted in the connectivity Hub of [Enterprise-Scale Landing Zones](https://github.com/Azure/Enterprise-Scale). In MVPs, users may rely on the Azure Firewall and Private DNS Zones bundled with Data Management & Analytics. Hence, two different cost calculations are provided, which can be found below:
- [Pricing Calculator - Data Management Zone w/o Azure Firewall and Private DNS Zones](https://azure.com/e/070df56959aa4ee89d42e60a1dc3c77b)
- [Pricing Calculator - Data Management Zone w/ Azure Firewall and Private DNS Zones](https://azure.com/e/3ebdcf80bc9b4d7bb385e555c027c9de)

View file

@ -30,7 +30,7 @@ For the first scenario, a rogue admin requires the following rights in a subscri
With these rights, a rogue admin has the possibility to create a private endpoint in the customer's Azure AD tenant that is linked to a service in a separate subscription and Azure AD tenant. The scenario is illustrated in Figure 1 (represented as connection A).
To do so, the user first needs to setup an external Azure AD tenant and Azure subscription. As a next step, the private endpoint needs to be created in the customer environment, by manually specifying the resource id of the service. Finally, the rogue admin needs to approve the private endpoint on the linked service hosted in the external Azure AD tenant to allow traffic over the connection.
To do so, the user first needs to set up an external Azure AD tenant and Azure subscription. As a next step, the private endpoint needs to be created in the customer environment by manually specifying the resource ID of the service. Finally, the rogue admin needs to approve the private endpoint on the linked service hosted in the external Azure AD tenant to allow traffic over the connection.
Once the private endpoint connection is approved by the rogue admin, corporate data could be copied over the corporate virtual network to an Azure service on an external Azure AD tenant (in case access was granted via Azure RBAC).
@ -112,7 +112,7 @@ For the second scenario, a rogue admin requires the following rights in the cust
With these rights, a rogue admin has the possibility to create a private endpoint in an external Azure AD tenant and subscription that is linked to a service in the customer's Azure AD tenant. The scenario is illustrated in Figure 1 (represented as connection B).
On this scenario, the rogue admin needs to first setup an external private Azure AD tenant and Azure subscription. As a next step, a private endpoint needs to be created in the environment of the rogue admin, by manually specifying the resource id and group id of the service in the corporate Azure AD tenant. Finally, the rogue admin needs to approve the private endpoint on the linked service to allow traffic over the connection across Azure AD tenants.
In this scenario, the rogue admin first needs to set up an external private Azure AD tenant and Azure subscription. As a next step, a private endpoint needs to be created in the environment of the rogue admin by manually specifying the resource ID and group ID of the service in the corporate Azure AD tenant. Finally, the rogue admin needs to approve the private endpoint on the linked service to allow traffic over the connection across Azure AD tenants.
Once the private endpoint is approved by the rogue admin or service owner, data can be accessed from the external virtual network.
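To make the mechanism concrete, the following hedged sketch shows how such a private endpoint with a manually specified resource ID and group ID could be created with Azure PowerShell; every name and ID is a placeholder.

```powershell
# Private link connection that points at a service in another subscription/tenant,
# addressed purely by its resource ID and group ID
$connection = New-AzPrivateLinkServiceConnection `
    -Name "manual-connection" `
    -PrivateLinkServiceId "/subscriptions/<external-subscription>/resourceGroups/<rg>/providers/Microsoft.Storage/storageAccounts/<account>" `
    -GroupId "blob"

# Subnet of the requesting virtual network
$subnet = Get-AzVirtualNetwork -Name "<vnet>" -ResourceGroupName "<rg>" |
    Get-AzVirtualNetworkSubnetConfig -Name "<subnet>"

# The connection still has to be approved on the target service afterwards
New-AzPrivateEndpoint `
    -Name "cross-tenant-private-endpoint" `
    -ResourceGroupName "<rg>" `
    -Location "northeurope" `
    -Subnet $subnet `
    -PrivateLinkServiceConnection $connection
```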
@ -234,7 +234,7 @@ Azure Synapse also uses managed virtual networks and therefore a similar policy
}
```
The policy above enforces the use of the data exfiltration feature of Synapse. Synapse also allows to deny any private endpoint that is coming from a service that is hosted outside of the customer tenant or a specified set of tenant ids. The policy above enforces exactly that and only allows the creation of managed private endpoints that are linked to services that are hosted in the customer tenant.
The policy above enforces the use of the data exfiltration feature of Synapse. Synapse also allows denying any private endpoint that comes from a service hosted outside of the customer tenant or outside a specified set of tenant IDs. The policy above enforces exactly that and only allows the creation of managed private endpoints that are linked to services that are hosted in the customer tenant.
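Once such a definition exists (custom or built-in), it can be assigned at the appropriate scope, for example with Azure PowerShell as in the hedged sketch below; the definition ID, assignment name and management group are placeholders.

```powershell
# Look up the policy definition (built-in definitions are addressed by their GUID name)
$definition = Get-AzPolicyDefinition -Name "<policy-definition-guid>"

# Assign it at management group scope, as recommended in the note above
New-AzPolicyAssignment `
    -Name "deny-synapse-external-mpe" `
    -DisplayName "Deny Synapse managed private endpoints outside the tenant" `
    -Scope "/providers/Microsoft.Management/managementGroups/<management-group-id>" `
    -PolicyDefinition $definition
```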
These policies are now available as built-in:

View file

@ -1,6 +1,6 @@
# Network Architecture Considerations
Enterprise-Scale Analytics promises the possibility to easily share and access datasets across multiple data domains and Data Landing Zones without critical bandwidth or latency limitations and without creating multiple copies of the same dataset. To deliver on that promise, different network designs had to be considered, evaluated and tested to make sure that these are compatible with the existing Hub and Spoke and vWAN deployments of corporations. Based on the current capabilities of Azure Networking Services it is recommended to rely on a meshed network architecture. What this means is that it is recommended to setup Vnet peering between
The Data Management and Analytics Scenario promises the possibility to easily share and access datasets across multiple data domains and Data Landing Zones without critical bandwidth or latency limitations and without creating multiple copies of the same dataset. To deliver on that promise, different network designs had to be considered, evaluated and tested to make sure that they are compatible with the existing Hub and Spoke and vWAN deployments of corporations. Based on the current capabilities of Azure networking services, it is recommended to rely on a meshed network architecture, which means setting up VNet peering between
1. The Connectivity Hub and Data Management Zone,
2. The Connectivity Hub and each Data Landing Zone,
@ -9,7 +9,7 @@ Enterprise-Scale Analytics promises the possibility to easily share and access d
![Meshed Network Architecture](/docs/images/NetworkOptions-NetworkMesh.png)
To explain the rationale behind the recommended design, this article will illustrate the advantages and disadvantages that come with each of the different network architecture approaches that were considered when designing Enterprise-Scale Analytics. Every design pattern will be evaluated along the following criteria: Cost, User Access Management, Service Management and Bandwidth & Latency. Each scenario will be analyzed with the following cross-Data Landing Zone use-case in mind:
To explain the rationale behind the recommended design, this article will illustrate the advantages and disadvantages that come with each of the different network architecture approaches that were considered when designing the Data Management and Analytics Scenario. Every design pattern will be evaluated along the following criteria: Cost, User Access Management, Service Management and Bandwidth & Latency. Each scenario will be analyzed with the following cross-Data Landing Zone use-case in mind:
---
@ -31,7 +31,7 @@ Summary: :heavy_plus_sign::heavy_plus_sign::heavy_plus_sign:
### Service Management
This network design has the benefit that there is no NVA acting as a single point of failure or throttling throughput. Not sending the datasets through the Connectivity Hub also reduces the management overhead for the central Azure platform team, as there is no need for scaling out the virtual appliance. This has the implication that the central Azure platform team can no longer inspect and log all traffic that is sent between Data Landing Zones. Nonetheless, this is not seen as disadvantage since Enterprise-Scale Analytics can be considered as coherent platform that spans across multiple subscriptions to allow for scale and overcome platform level limitations. If all resources would be hosted inside a single subscription, traffic would also not be inspected in the central Connectivity Hub. In addition, network logs can still be captured through the use of Network Security Group Flow Logs and additional application and service level logs can be consolidated and stored through the use of service specific Diagnostic Settings. All of these logs can be captured at scale through the use of [Azure Policies](/infra/Policies/PolicyDefinitions/DiagnosticSettings/). Also, this pattern allows for an Azure native DNS solution based on Private DNS Zones and allows for the automation of the DNS A-record lifecycle through [Azure Policies](/infra/Policies/PolicyDefinitions/PrivateDnsZoneGroups/).
This network design has the benefit that there is no NVA acting as a single point of failure or throttling throughput. Not sending the datasets through the Connectivity Hub also reduces the management overhead for the central Azure platform team, as there is no need for scaling out the virtual appliance. This has the implication that the central Azure platform team can no longer inspect and log all traffic that is sent between Data Landing Zones. Nonetheless, this is not seen as a disadvantage, since the Data Management and Analytics Scenario can be considered a coherent platform that spans across multiple subscriptions to allow for scale and overcome platform level limitations. If all resources were hosted inside a single subscription, traffic would also not be inspected in the central Connectivity Hub. In addition, network logs can still be captured through the use of Network Security Group Flow Logs and additional application and service level logs can be consolidated and stored through the use of service specific Diagnostic Settings. All of these logs can be captured at scale through the use of [Azure Policies](/infra/Policies/PolicyDefinitions/DiagnosticSettings/). Also, this pattern allows for an Azure native DNS solution based on Private DNS Zones and allows for the automation of the DNS A-record lifecycle through [Azure Policies](/infra/Policies/PolicyDefinitions/PrivateDnsZoneGroups/).
Summary: :heavy_plus_sign::heavy_plus_sign::heavy_plus_sign:
@ -55,7 +55,7 @@ Summary: :heavy_plus_sign::heavy_plus_sign::heavy_plus_sign:
### Summary
The meshed network design offers maximum bandwidth and low latency at minimal cost without any compromises from a user access management perspective or on the DNS layer. Hence, this network architecture design is the recommended for customers wanting to adopt Enterprise-Scale Analytics. If additional network policies need to be enforced within the Data Platform, it is advised to use Network Security Groups instead of central NVAs.
The meshed network design offers maximum bandwidth and low latency at minimal cost without any compromises from a user access management perspective or on the DNS layer. Hence, this network architecture design is the recommended design for customers wanting to adopt the Data Management and Analytics Scenario. If additional network policies need to be enforced within the Data Platform, it is advised to use Network Security Groups instead of central NVAs.
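As a rough illustration of what the mesh looks like operationally, the hedged sketch below peers a Data Management Zone VNet with one Data Landing Zone VNet using Azure PowerShell; repeat the pattern for every pair of zones and for the Connectivity Hub. All names are placeholders, and for cross-subscription peering the remote side is referenced by its full resource ID.

```powershell
$dmzVnet = Get-AzVirtualNetwork -Name "<dmz-vnet>" -ResourceGroupName "<dmz-network-rg>"
$dlzVnet = Get-AzVirtualNetwork -Name "<dlz001-vnet>" -ResourceGroupName "<dlz001-network-rg>"

# Peering has to be created in both directions
Add-AzVirtualNetworkPeering -Name "dmz-to-dlz001" -VirtualNetwork $dmzVnet -RemoteVirtualNetworkId $dlzVnet.Id
Add-AzVirtualNetworkPeering -Name "dlz001-to-dmz" -VirtualNetwork $dlzVnet -RemoteVirtualNetworkId $dmzVnet.Id
```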
## 2. Traditional Hub & Spoke Design (NOT Recommended)
@ -117,7 +117,7 @@ Summary: :heavy_minus_sign:
### Service Management
Similar to [Option 1](#1-meshed-network-architecture-recommended),this network design has the benefit that there is no NVA acting as a single point of failure or throttling throughput. Not sending the datasets through the Connectivity Hub also reduces the management overhead for the central Azure platform team, as there is no need for scaling out the virtual appliance. This has the implication that the central Azure platform team can no longer inspect and log all traffic that is sent between Data Landing Zones. Nonetheless, this is not seen as disadvantage since Enterprise-Scale Analytics can be considered as coherent platform that spans across multiple subscriptions to allow for scale and overcome platform level limitations. If all resources would be hosted inside a single subscription, traffic would also not be inspected in the central Connectivity Hub. In addition, network logs can still be captured through the use of Network Security Group Flow Logs and additional application and service level logs can be consolidated and stored through the use of service specific Diagnostic Settings. All of these logs can be captured at scale through the use of [Azure Policies](/infra/Policies/PolicyDefinitions/DiagnosticSettings/).
Similar to [Option 1](#1-meshed-network-architecture-recommended), this network design has the benefit that there is no NVA acting as a single point of failure or throttling throughput. Not sending the datasets through the Connectivity Hub also reduces the management overhead for the central Azure platform team, as there is no need for scaling out the virtual appliance. This has the implication that the central Azure platform team can no longer inspect and log all traffic that is sent between Data Landing Zones. Nonetheless, this is not seen as a disadvantage, since the Data Management and Analytics Scenario can be considered a coherent platform that spans across multiple subscriptions to allow for scale and overcome platform level limitations. If all resources were hosted inside a single subscription, traffic would also not be inspected in the central Connectivity Hub. In addition, network logs can still be captured through the use of Network Security Group Flow Logs and additional application and service level logs can be consolidated and stored through the use of service specific Diagnostic Settings. All of these logs can be captured at scale through the use of [Azure Policies](/infra/Policies/PolicyDefinitions/DiagnosticSettings/).
On the other hand, the exponential growth of the number of required Private Endpoints also increases the network address space required by the data platform, which is not optimal. Lastly, the above-mentioned DNS challenges are the biggest concern of this network architecture. An Azure native solution in the form of Private DNS Zones cannot be used. Hence, a third-party solution will be required that is capable of resolving FQDNs based on the origin/IP address of the requestor. In addition, tools and workflows must be developed and maintained to automate the lifecycle of Private DNS A-records, which drastically increases the management overhead compared to the proposed [Azure Policy driven solution](/infra/Policies/PolicyDefinitions/PrivateDnsZoneGroups/). It would also be possible to create a distributed DNS infrastructure using Private DNS Zones, but with this solution DNS islands would be created, which ultimately leads to issues when trying to access private link services hosted in other Landing Zones within the tenant. Therefore, this is not a viable alternative.
@ -161,7 +161,7 @@ Summary: :heavy_minus_sign::heavy_minus_sign::heavy_minus_sign:
### Service Management
Similar to [Option 1](#1-meshed-network-architecture-recommended), this network design has the benefit that there is no NVA acting as a single point of failure or throttling throughput. Not sending the datasets through the Connectivity Hub also reduces the management overhead for the central Azure platform team, as there is no need for scaling out the virtual appliance. This has the implication that the central Azure platform team can no longer inspect and log all traffic that is sent between Data Landing Zones. Nonetheless, this is not seen as disadvantage since Enterprise-Scale Analytics can be considered as coherent platform that spans across multiple subscriptions to allow for scale and overcome platform level limitations. If all resources would be hosted inside a single subscription, traffic would also not be inspected in the central Connectivity Hub. In addition, network logs can still be captured through the use of Network Security Group Flow Logs and additional application and service level logs can be consolidated and stored through the use of service specific Diagnostic Settings. All of these logs can be captured at scale through the use of [Azure Policies](/infra/Policies/PolicyDefinitions/DiagnosticSettings/). Also, this pattern allows for an Azure native DNS solution based on Private DNS Zones and allows for the automation of the DNS A-record lifecycle through [Azure Policies](/infra/Policies/PolicyDefinitions/PrivateDnsZoneGroups/).
Similar to [Option 1](#1-meshed-network-architecture-recommended), this network design has the benefit that there is no NVA acting as a single point of failure or throttling throughput. Not sending the datasets through the Connectivity Hub also reduces the management overhead for the central Azure platform team, as there is no need for scaling out the virtual appliance. This has the implication that the central Azure platform team can no longer inspect and log all traffic that is sent between Data Landing Zones. Nonetheless, this is not seen as a disadvantage, since the Data Management and Analytics Scenario can be considered a coherent platform that spans across multiple subscriptions to allow for scale and overcome platform level limitations. If all resources were hosted inside a single subscription, traffic would also not be inspected in the central Connectivity Hub. In addition, network logs can still be captured through the use of Network Security Group Flow Logs and additional application and service level logs can be consolidated and stored through the use of service specific Diagnostic Settings. All of these logs can be captured at scale through the use of [Azure Policies](/infra/Policies/PolicyDefinitions/DiagnosticSettings/). Also, this pattern allows for an Azure native DNS solution based on Private DNS Zones and allows for the automation of the DNS A-record lifecycle through [Azure Policies](/infra/Policies/PolicyDefinitions/PrivateDnsZoneGroups/).
Summary: :heavy_plus_sign::heavy_plus_sign::heavy_plus_sign:
@ -189,4 +189,4 @@ There are many benefits that come with this network architecture design. However
## Conclusion
After reviewing all network architecture options from various angles and identifying pros and cons of each proposed pattern, [Option 1](#1-meshed-network-architecture-recommended) is the clear winner. Not only from a throughput perspective, but also from a cost and management perspective the solution has tremendous benefits and therefore should be used when deploying Enterprise-Scale Analytics. Peering spoke Virtual Networks has not been common in the past, but not doing so has also led to various issues when starting to share datasets across domains and business units. In addition, one can also argue that Enterprise-Scale Analytics can be seen as coherent solution that just spans across multiple subscriptions. In a single subscription setup, the network traffic flow would be equal to the flow in the meshed network architecture, with the difference that within a single subsciption users will most likely hit [subscription level limits and quotas](https://docs.microsoft.com/en-us/azure/azure-resource-manager/management/azure-subscription-service-limits) of the platform, which is something that Enterprise-Scale Analytics wants to protect against.
After reviewing all network architecture options from various angles and identifying the pros and cons of each proposed pattern, [Option 1](#1-meshed-network-architecture-recommended) is the clear winner. Not only from a throughput perspective, but also from a cost and management perspective the solution has tremendous benefits and therefore should be used when deploying the Data Management and Analytics Scenario. Peering spoke Virtual Networks has not been common in the past, but not doing so has also led to various issues when starting to share datasets across domains and business units. In addition, one can also argue that the Data Management and Analytics Scenario can be seen as a coherent solution that just spans across multiple subscriptions. In a single subscription setup, the network traffic flow would be equal to the flow in the meshed network architecture, with the difference that within a single subscription users will most likely hit [subscription level limits and quotas](https://docs.microsoft.com/en-us/azure/azure-resource-manager/management/azure-subscription-service-limits) of the platform, which is something that the Data Management and Analytics Scenario wants to protect against.

View file

@ -1,38 +1,39 @@
# Azure Policies for Enterprise-Scale Analytics and AI
# Azure Policies for Data Management & Analytics Scenario
[Implementing custom policies](/azure/governance/policy/tutorials/create-and-manage) allows you to do more with Azure Policy. Enterprise-Scale Analytics and AI comes with a set of pre-created policies to help you implement the required guard rails in your environment.
[Implementing custom policies](/azure/governance/policy/tutorials/create-and-manage) allows you to do more with Azure Policy. The Data Management & Analytics Scenario comes with a set of pre-created policies to help you implement the required guard rails in your environment.
Enterprise-Scale Analytics and AI contains custom policies pertaining to **resource and cost management, authentication, encryption, network isolation, logging, resilience and more** that apply to the following services and areas:
Data Management & Analytics contains custom policies pertaining to **resource and cost management, authentication, encryption, network isolation, logging, resilience and more** that apply to the following services and areas:
- [All Services](#all-services)
- [Storage](#storage)
- [Key Vault](#key-vault)
- [Azure Data Factory](#azure-data-factory)
- [Azure Synapse Analytics](#azure-synapse-analytics)
- [Azure Purview](#azure-purview)
- [Azure Databricks](#azure-databricks)
- [Azure IoT Hub](#azure-iot-hub)
- [Azure Event Hubs](#azure-event-hubs)
- [Azure Stream Analytics](#azure-stream-analytics)
- [Azure Data Explorer](#azure-data-explorer)
- [Azure Cosmos DB](#azure-cosmos-db)
- [Azure Container Registry](#azure-container-registry)
- [Azure Cognitive Services](#azure-cognitive-services)
- [Azure Machine Learning](#azure-machine-learning)
- [Azure SQL Managed Instance](#azure-sql-managed-instance)
- [Azure SQL](#azure-sql)
- [Azure Database for MariaDB](#azure-database-for-mariadb)
- [Azure Database for MySQL](#azure-database-for-mysql)
- [Azure Database for PostgreSQL](#azure-database-for-postgresql)
- [Azure Cognitive Search](#azure-cognitive-search)
- [Azure DNS](#azure-dns)
- [Network Security Group](#network-security-group)
- [Batch](#batch)
- [Azure Cache for Redis](#azure-cache-for-redis)
- [Container Instances](#container-instances)
- [Azure Firewall](#azure-firewall)
- [HDInsight](#hdinsight)
- [Power BI](#power-bi)
- [Azure Policies for Data Management & Analytics Scenario](#azure-policies-for-data-management--analytics-scenario)
- [All Services](#all-services)
- [Storage](#storage)
- [Key Vault](#key-vault)
- [Azure Data Factory](#azure-data-factory)
- [Azure Synapse Analytics](#azure-synapse-analytics)
- [Azure Purview](#azure-purview)
- [Azure Databricks](#azure-databricks)
- [Azure IoT Hub](#azure-iot-hub)
- [Azure Event Hubs](#azure-event-hubs)
- [Azure Stream Analytics](#azure-stream-analytics)
- [Azure Data Explorer](#azure-data-explorer)
- [Azure Cosmos DB](#azure-cosmos-db)
- [Azure Container Registry](#azure-container-registry)
- [Azure Cognitive Services](#azure-cognitive-services)
- [Azure Machine Learning](#azure-machine-learning)
- [Azure SQL Managed Instance](#azure-sql-managed-instance)
- [Azure SQL](#azure-sql)
- [Azure Database for MariaDB](#azure-database-for-mariadb)
- [Azure Database for MySQL](#azure-database-for-mysql)
- [Azure Database for PostgreSQL](#azure-database-for-postgresql)
- [Azure Cognitive Search](#azure-cognitive-search)
- [Azure DNS](#azure-dns)
- [Network Security Group](#network-security-group)
- [Batch](#batch)
- [Azure Cache for Redis](#azure-cache-for-redis)
- [Container Instances](#container-instances)
- [Azure Firewall](#azure-firewall)
- [HDInsight](#hdinsight)
- [Power BI](#power-bi)
> [!NOTE]
> The policies provided below are not applied by default during deployment. They should be viewed as guidance-only and can be applied depending on business requirements. Policies should always be applied to the highest level possible and in most cases this will be a [management group](/azure/governance/management-groups/overview). All the policies are available in our GitHub repository.
@ -60,7 +61,7 @@ Enterprise-Scale Analytics and AI contains custom policies pertaining to **resou
|Deny-Storage-NetworkAclsIpRules|Network Isolation|Enforces network ip rules for storage account.|
|Deny-Storage-NetworkAclsVirtualNetworkRules|Network Isolation|Denies virtual network rules for storage account.|
|Deny-Storage-Sku|Resource Management|Enforces storage account SKUs.|
|Deny-Storage-SupportsHttpsTrafficOnly|Encryption|Enforces https traffic for storage account.|
|Deny-Storage-SupportsHttpsTrafficOnly|Encryption|Enforces HTTPS traffic for storage account.|
|Deploy-Storage-BlobServices|Resource Management|Deploy blob services default settings for storage account.|
|Deny-Storage-RoutingPreference|Network Isolation||
|Deny-Storage-Kind|Resource Management||
@ -152,10 +153,10 @@ Additional policies that are applied in the Databricks workspace through cluster
|Policy Name |Policy Area |Description |
|---------|---------|---------|
|Append-IotHub-MinimalTlsVersion|Encryption|Enforces minimal tls version for iot hub.|
|Audit-IotHub-PrivateEndpointId|Network Isolation|Audit public endpoints that are created in other subscriptions for iot hubs.|
|Deny-IotHub-PublicNetworkAccess|Network Isolation|Denies public network access for iot hub.|
|Deny-IotHub-Sku|Resource Management|Enforces iot hub SKUs.|
|Append-IotHub-MinimalTlsVersion|Encryption|Enforces minimal TLS version for IoT Hub.|
|Audit-IotHub-PrivateEndpointId|Network Isolation|Audit public endpoints that are created in other subscriptions for IoT Hubs.|
|Deny-IotHub-PublicNetworkAccess|Network Isolation|Denies public network access for IoT Hub.|
|Deny-IotHub-Sku|Resource Management|Enforces IoT Hub SKUs.|
|Deploy-IotHub-IoTSecuritySolutions|Security|Deploy Azure defender for IoT for IoT Hubs.|
## Azure Event Hubs

View file

@ -1,10 +1,10 @@
# Self-hosted CI/CD Agents
The Enterprise-Scale Analytics (ESA) reference implementation uses a multi-layered security approach to overcome common data exfiltration risks raised by customers and builds on top of existing network security considerations provided by the [Enterprise-Scale Landing Zones reference design](https://docs.microsoft.com/en-us/azure/cloud-adoption-framework/ready/enterprise-scale/network-topology-and-connectivity). One of the security layers in ESA is focussed on the networking configuration and proposes network isolation as a security construct to reduce the attack surface and exfiltration risk within a customer tenant. The network isolation within the data platform is mainly accomplished through the usage of Private Endpoints and the blocking of public network traffic through the configuration of service specific firewalls. All these design considerations are very important from a security standpoint to overcome the much discussed data exfiltration risk. However, all of this also comes at the cost of increased complexity when Data Product teams start deploying solutions into their environments through CI/CD pipelines.
The Data Management & Analytics Scenario (DMA) reference implementation uses a multi-layered security approach to overcome common data exfiltration risks raised by customers and builds on top of existing network security considerations provided by the [Enterprise-Scale Landing Zones reference design](https://docs.microsoft.com/en-us/azure/cloud-adoption-framework/ready/enterprise-scale/network-topology-and-connectivity). One of the security layers in DMA focuses on the networking configuration and proposes network isolation as a security construct to reduce the attack surface and exfiltration risk within a customer tenant. The network isolation within the data platform is mainly accomplished through the usage of Private Endpoints and the blocking of public network traffic through the configuration of service specific firewalls. All these design considerations are very important from a security standpoint to overcome the much-discussed data exfiltration risk. However, all of this also comes at the cost of increased complexity when Data Product teams start deploying solutions into their environments through CI/CD pipelines.
When a Data Product team wants to deploy application code to an Azure service (e.g. a web application configured with Private Endpoints) through CI/CD pipelines using the default Microsoft-hosted or GitHub-hosted agents for Azure DevOps Pipelines or GitHub Actions, the deployment will fail because the Azure service is not reachable from these publicly hosted virtual machines. The root cause is that the agents cannot resolve the private IP address of the services, as they cannot leverage the DNS services in the customer environment. Even if the traffic coming from a public agent were routed to the correct endpoint, the central firewall in the customer tenant would block it. Needless to say, opening firewall ports for such scenarios is not a recommended pattern.
As a result, customers have to setup self-hosted agents that are connected to the corporate vnet. As these are hosted on the corporate network, the agents will be able to resolve the private IPs of the services and therefore will be able to deploy application code on services that are hosted using private endpoints. In order to simplify this setup, this doc will describe step-by-step how self-hosted agents can be deployed into an Enterprise-Scale Analytics environment.
As a result, customers have to set up self-hosted agents that are connected to the corporate vnet. As these are hosted on the corporate network, the agents are able to resolve the private IPs of the services and can therefore deploy application code to services that are hosted using private endpoints. In order to simplify this setup, this doc describes step-by-step how self-hosted agents can be deployed into a Data Management & Analytics environment.
As each service has its own way of setting up agents, we will split the following sections by service.
@ -20,7 +20,7 @@ For more details, please visit the [docs](https://docs.microsoft.com/en-us/azure
### Deployment
Please follow the steps below to deploy a [Linux based Virtual Machine Scale Set Agents](https://docs.microsoft.com/en-us/azure/devops/pipelines/agents/scale-set-agents?view=azure-devops) into an existing Enterprise-Scale Analytics Data Landing Zone or Data Management Zone:
Please follow the steps below to deploy [Linux-based Virtual Machine Scale Set Agents](https://docs.microsoft.com/en-us/azure/devops/pipelines/agents/scale-set-agents?view=azure-devops) into an existing Data Management & Analytics Data Landing Zone or Data Management Zone:
1. Use the [Bicep templates](/docs/reference/buildagents/main.bicep), [ARM template](/docs/reference/buildagents/main.json) or the "Deploy To Azure" button to deploy a virtual machine scale set (VMSS) into the environment. We recommend using the existing `{prefix}-{environment}-mgmt` resource group and the `ServicesSubnet` for the deployment into your Data Management Zone or Data Landing Zone; see the sketch after this step for orientation.
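
The following is a hypothetical sketch of the kind of Linux scale set the referenced templates create. The parameter names, VM size and image reference are illustrative assumptions rather than the exact values in `main.bicep`; `subnetId` is expected to point at the existing `ServicesSubnet`.

```bicep
// Minimal sketch of a Linux VMSS for self-hosted agents joined to an existing subnet.
param location string = resourceGroup().location
param prefix string
param environment string
param adminUsername string
@secure()
param adminPassword string
param subnetId string // resource ID of the existing 'ServicesSubnet'

resource agentVmss 'Microsoft.Compute/virtualMachineScaleSets@2021-07-01' = {
  name: toLower('${prefix}-${environment}-agent-vmss001')
  location: location
  sku: {
    name: 'Standard_D2s_v3'
    tier: 'Standard'
    capacity: 2
  }
  properties: {
    // Azure DevOps scale set agents expect overprovisioning to be disabled
    // and a manual upgrade policy, as Azure DevOps manages the instances.
    overprovision: false
    upgradePolicy: {
      mode: 'Manual'
    }
    virtualMachineProfile: {
      osProfile: {
        computerNamePrefix: 'agent'
        adminUsername: adminUsername
        adminPassword: adminPassword
      }
      storageProfile: {
        imageReference: {
          publisher: 'Canonical'
          offer: '0001-com-ubuntu-server-focal'
          sku: '20_04-lts-gen2'
          version: 'latest'
        }
        osDisk: {
          createOption: 'FromImage'
          managedDisk: {
            storageAccountType: 'Premium_LRS'
          }
        }
      }
      networkProfile: {
        networkInterfaceConfigurations: [
          {
            name: 'nic01'
            properties: {
              primary: true
              ipConfigurations: [
                {
                  name: 'ipconfig01'
                  properties: {
                    subnet: {
                      id: subnetId
                    }
                  }
                }
              ]
            }
          }
        ]
      }
    }
  }
}
```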

View file

@ -1,4 +1,4 @@
# Scaling and Self-Service in Enterprise-Scale Anayltics
# Scaling and Self-Service in Data Management & Analytics Scenario
Efficient scaling within an enterprise data platform is a much-desired goal of many organizations, as business units should be enabled to build their own (data) solutions and applications that fulfill their requirements. However, achieving this goal is a challenge, as many existing data platforms are not built around the core concepts of scalability and decentralized ownership. This is not only true from an architectural standpoint but also becomes clear when looking at the team structure and operating model that underpin these data platforms.
@ -16,11 +16,11 @@ Also, many organizations compose their data platform architecture and teams arou
If we now take into consideration that within a large organization there is not a single data consumer but thousands of data consumers, the process described above makes clear why velocity is heavily impacted by such an architectural and organizational pattern. The centralized teams quickly become a bottleneck for the business units, which will ultimately result in slower innovation on the platform, limited effectiveness of data consumers and possibly even a situation where individual business units decide to build their own data platform.
## Methods of scaling in Enterprise-Scale Analytics
## Methods of scaling in the Data Management & Analytics Scenario
![Enterprise-Scale Analytics - Multiple Data Landing Zone Pattern](/docs/images/MultipleLandingZones.png)
![Data Management & Analytics Scenario - Multiple Data Landing Zone Pattern](/docs/images/MultipleLandingZones.png)
Enterprise-Scale Analytics (ESA) uses two core concepts to address the scaling challenges mentioned in the [introduction](#introduction):
Data Management & Analytics Scenario (DMA) uses two core concepts to address the scaling challenges mentioned in the [introduction](#introduction):
- Scaling through the concept of Data Landing Zones.
- Scaling through the concept of Data Product or Data Integrations to enable distributed and decentralized data ownership.
@ -29,7 +29,7 @@ The following paragraphs will elaborate on these core concepts and will also des
### Scaling with Data Landing Zones
Enterprise-Scale Analytics is centred around the concepts of Data Management Zone and Data Landing Zone. Each of these artifacts should land in a separate Azure subscription to allow for clear separation of duties, to follow the least privilege principal and to partially address the first issue mentioned in the [introduction](#introduction) around subscription scale issues. Hence, the minimal setup of Enterprise-Scale Analytics consists of a single Data Management Zone and a single Data Landing Zone.
The Data Management & Analytics Scenario is centred around the concepts of Data Management Zone and Data Landing Zone. Each of these artifacts should land in a separate Azure subscription to allow for clear separation of duties, to follow the least-privilege principle and to partially address the first issue mentioned in the [introduction](#introduction) around subscription scale issues. Hence, the minimal setup of the Data Management & Analytics Scenario consists of a single Data Management Zone and a single Data Landing Zone.
For large-scale data platform deployments, a single Data Landing Zone and hence a single subscription will not be sufficient, especially if we consider that companies build these platforms and make the respective investments to consistently and efficiently scale their data and analytics efforts in the upcoming years. Therefore, to fully overcome the subscription-level limitations and embrace the "subscription as scale unit" concept from [Enterprise-Scale Landing Zones](https://github.com/Azure/Enterprise-Scale), DMA allows an institution to further increase the data platform footprint by adding additional Data Landing Zones to the architecture. Furthermore, this also addresses the concern of a single Azure Data Lake Gen2 being used for a whole company, as each Data Landing Zone comes with a set of three Data Lakes. Ultimately, this leads to a distribution of projects and activities of multiple domains across more than one Azure subscription, allowing for greater scalability.
@ -45,18 +45,18 @@ The following factors should be considered when deciding about how many Data Lan
Considering all these factors, an enterprise should target no more than 15 Data Landing Zones as there is also some management overhead attached to each Data Landing Zone.
It is important to emphasize that Data Landing Zones are not creating data silos within an organization, as the recommended network setup in ESA enables secure and in-place data sharing across Landing Zones and therefore enables innovation across data domains and business units. Please read the [network design guidance](/docs/guidance/EnterpriseScaleAnalytics-NetworkArchitecture.md) to find out more about how this is achieved. The same is true for the identity layer. Since a single Azure AD tenant is recommended to be used, identities can be granted access to data assets within multiple Data Landing Zones. For the actual authorization process of users and identities to data assets it is recommended to leverage Azure Active Directory Entitlement Management and Access Packages.
It is important to emphasize that Data Landing Zones do not create data silos within an organization, as the recommended network setup in DMA enables secure and in-place data sharing across Landing Zones and therefore enables innovation across data domains and business units. Please read the [network design guidance](/docs/guidance/DataManagementAnalytics-NetworkArchitecture.md) to find out more about how this is achieved. The same is true for the identity layer. Since a single Azure AD tenant is recommended, identities can be granted access to data assets within multiple Data Landing Zones. For the actual authorization of users and identities to data assets, it is recommended to leverage Azure Active Directory Entitlement Management and Access Packages.
Lastly, it needs to be highlighted that the prescriptive DMA architecture and the concept of Data Landing Zones allow corporations to naturally increase the size of their data platform over time. Additional Data Landing Zones can be added in a phased approach and customers are not forced to start with a multi-Data Landing Zone setup right from the start. When adopting the pattern, companies should start by prioritizing a few Data Landing Zones as well as the Data Products that should land inside them, to make sure that the adoption of DMA is successful.
### Scaling with Data Products
Within a Data Landing Zone, Enterprise-Scale Analytics allows organizations scale through the concept of Data Integrations and Data Products. A Data Product is a unit or component of the data architecture that encapsulates functionality for making read-optimized datasets available for consumption by other data products. In the Azure context, a Data Product is an environment in form of resource group that allows cross-functional teams to implement data solutions and workloads on the platform. The associated team takes care of the end-to-end lifecycle of their data solution including ingest (only for Data Integration), cleansing, aggregation and serving tasks for a particular data-domain, sub-domain, dataset or project.
Within a Data Landing Zone, the Data Management & Analytics Scenario allows organizations to scale through the concept of Data Integrations and Data Products. A Data Product is a unit or component of the data architecture that encapsulates functionality for making read-optimized datasets available for consumption by other data products. In the Azure context, a Data Product is an environment in the form of a resource group that allows cross-functional teams to implement data solutions and workloads on the platform. The associated team takes care of the end-to-end lifecycle of their data solution, including ingest (only for Data Integration), cleansing, aggregation and serving tasks for a particular data domain, sub-domain, dataset or project.
A Data Integration pattern is a special kind of Data Product that is mainly concerned with the integration of data assets from source systems outside of the data platform onto a Data Landing Zone. These consistent data integration implementations reduce the impact on the transactional systems and allow multiple Data Product teams to consume the same dataset version without being concerned about the integration and without having to repeat the integration task.
Data Products, on the other hand, consume one or multiple data assets within the same Data Landing Zone or across multiple Data Landing Zones to generate new data assets, insights or business value. The resulting data assets may again be shared with other Data Product teams to further enhance the value created within the business.
With the Data Integration concept, Enterprise-Scale Analytics addresses the data integration and responsibility issue mentioned in the [introduction](#introduction). Instead of having an architectural design build around monolithic functional responsibilities for the ingestion of tables and integration of source systems, the reference design is pivoting the design towards a distributed, data domains driven architecture, where cross functional teams take over the end-to-end functional responsibility and ownership for the respective data scope. In summary, this means that instead of having a centralized technical stack and team being responsible for each and every orthogonal solution of the data processing workflow, we are distributing the end-to-end responsibility from ingestion to the serving layer for data domains or sub-domains across multiple autonomously working cross-functional Data Integration teams. The Data Integration teams own a domain or sub-domain capability and must be encouraged to serve datasets for access by any team for any purpose or project they may be working on.
With the Data Integration concept, the Data Management & Analytics Scenario addresses the data integration and responsibility issue mentioned in the [introduction](#introduction). Instead of an architectural design built around monolithic functional responsibilities for the ingestion of tables and integration of source systems, the reference design pivots towards a distributed, data-domain-driven architecture in which cross-functional teams take over the end-to-end functional responsibility and ownership for the respective data scope. In summary, this means that instead of having a centralized technical stack and team responsible for each and every orthogonal solution of the data processing workflow, we distribute the end-to-end responsibility from ingestion to the serving layer for data domains or sub-domains across multiple autonomously working, cross-functional Data Integration teams. The Data Integration teams own a domain or sub-domain capability and must be encouraged to serve datasets for access by any team for any purpose or project they may be working on.
This architectural paradigm shift ultimately leads to increased velocity within the data platform, as data consumers no longer have to rely on a set of centralized teams or fight for prioritization of the requested changes. As smaller teams take ownership of the end-to-end integration workflow, the feedback loop between data provider and data consumer is much shorter and therefore allows for much quicker prioritization, shorter development cycles and a more agile development process. Additionally, complex synchronization processes and release plans between teams are no longer required, as the cross-functional Data Integration team has full visibility of the end-to-end technical stack as well as any implications that may arise from the implemented changes. The team can apply software engineering practices to run unit and integration tests to minimize the overall impact on consumers.
@ -66,13 +66,13 @@ Integrated datasets from source systems become the foundation of the data platfo
### Summary
The sections above are summarizing the scaling mechanisms within Enterprise-Scale Analytics that organizations can use to grow their data estate within Azure over time without running into well-known technical limitations. Both scaling mechanisms are helping to overcome different technical complexities and can be used in an efficient and simple way.
The sections above summarize the scaling mechanisms within the Data Management & Analytics Scenario that organizations can use to grow their data estate within Azure over time without running into well-known technical limitations. Both scaling mechanisms help to overcome different technical complexities and can be used in an efficient and simple way.
In the previous section, we discussed strategies to enable scale, improve agility and shorten the development lifecycle within Data Integration and Data Product environments. However, this will only be possible if teams have appropriate access that enables self-service. How this can be achieved is discussed in the following section.
## Enabling Self-service for Data Products in Enterprise-Scale Analytics
## Enabling Self-service for Data Products in Data Management & Analytics Scenario
Self-service is one of the key building blocks of Enterprise-Scale Analytics to allow agility, quick development and sharing of Data Products within the respective Data Landing Zones of an organization. Data Product teams are usually the ones demanding such capabilities whereas central IT teams usually have their concerns when enabling business units to deploy their own services, features and application code.
Self-service is one of the key building blocks of the Data Management & Analytics Scenario, enabling agility, quick development and sharing of Data Products within the respective Data Landing Zones of an organization. Data Product teams are usually the ones demanding such capabilities, whereas central IT teams usually have concerns about enabling business units to deploy their own services, features and application code.
However, as an organization scales beyond a handful of Data Products and projects, it becomes impossible for a centralized team to address all requests in a short and timely manner. Hence, the only solution to this is self-service, where Data Product teams can still request support to overcome technical challenges, troubleshoot connectivity issues or solve other kinds of problems. To address the concerns of both teams, we will look at a number of aspects that will eventually enable self-service in a secure and controlled way without losing control over the overall compliance of the data platform environment.
@ -80,7 +80,7 @@ However, as an organization scales beyond a handful of Data Products and project
Azure Policy should be the core instrument of the Azure (Data) Platform team to ensure compliance of resources within the Data Management Zone, Data Landing Zones as well as other landing zones within the organization's tenant. This platform feature should be used to introduce guardrails and enforce adherence to the overall approved service configuration within the respective management group scope. The platform teams can use Azure Policy to, for example, enforce private endpoints for any storage accounts hosted within the data platform environment or enforce TLS 1.2 encryption in transit for any connections made to the storage accounts. When done right, this will prohibit any Data Product team from hosting services in a non-compliant state within the respective tenant scope.
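
As a sketch of how such a guardrail could be rolled out, the snippet below assigns an existing policy definition (for example one that denies storage accounts without a TLS 1.2 minimum) at the management group that contains the data platform subscriptions. The assignment name, display name and the `policyDefinitionId` parameter are placeholders, not artifacts from this repository.

```bicep
// Sketch: assign a guardrail policy at management group scope.
targetScope = 'managementGroup'

// Resource ID of an existing custom or built-in policy definition to enforce.
param policyDefinitionId string

resource tlsGuardrailAssignment 'Microsoft.Authorization/policyAssignments@2021-06-01' = {
  name: 'deny-storage-min-tls'
  properties: {
    displayName: 'Storage accounts must enforce TLS 1.2 in transit'
    policyDefinitionId: policyDefinitionId
    enforcementMode: 'Default'
  }
}
```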
The responsible IT teams should use this platform feature to address their security and compliance concerns and open up for a self-service approach within (Data) Landing Zones. [Please follow this link](/docs/guidance/EnterpriseScaleAnalytics-Policies.md) to read more about the Azure policies that are recommended for the ESA data platform environment.
The responsible IT teams should use this platform feature to address their security and compliance concerns and open up for a self-service approach within (Data) Landing Zones. [Please follow this link](/docs/guidance/DataManagementAnalytics-Policies.md) to read more about the Azure policies that are recommended for the DMA data platform environment.
### Azure RBAC assignments
@ -88,7 +88,7 @@ In order to develop, deliver and serve data products autonomously within the dat
The development team and their respective user identities should be allowed to access the development environment so that they can iterate more quickly, learn about certain capabilities within Azure services and troubleshoot issues effectively. Access to a development environment will help when developing or enhancing the infrastructure as code (IaC) as well as other code artifacts. Once an implementation within the development environment works as expected, it can be rolled out continuously to the higher environments. Higher environments, such as test and prod, should be locked off for the Data Product team. Only a service principal should have access to these environments and therefore all deployments must be executed through the service principal identity by using CI/CD pipelines. To summarize, in the development environment access rights should be provided to a service principal AND user identities, and in higher environments access rights should only be provided to a service principal identity.
To be able to create resources and role assignments between resources within the Data Product resource group, `Contributor` and `User Access Administrator` rights must be provided. This will allow the teams to create and control services within their environment within the [boundaries of Azure Policy](#azure-policies). As Enterprise-Scale Analytics recommends the usage of private endpoints to overcome the data exfiltration risk and as other connectivity options should be blocked by the Azure Platform team via policies, Data Product teams will require access rights to the shared virtual network of a Data Landing Zone to be able to successfully setup the required network connectivity for the services they are planning to use. To follow the least privilege principle, overcome conflicts between different Data Product teams and have a clear separation of teams, ESA proposes to create a dedicated subnet per Data Product team and create a `Network Contributor` role assignment to that subnet (child resource scope). This role assignment allows the teams to join the subnet using private endpoints.
To be able to create resources and role assignments between resources within the Data Product resource group, `Contributor` and `User Access Administrator` rights must be provided. This will allow the teams to create and control services within their environment within the [boundaries of Azure Policy](#azure-policies). As the Data Management & Analytics Scenario recommends the usage of private endpoints to overcome the data exfiltration risk, and as other connectivity options should be blocked by the Azure Platform team via policies, Data Product teams will require access rights to the shared virtual network of a Data Landing Zone to be able to successfully set up the required network connectivity for the services they are planning to use. To follow the least-privilege principle, overcome conflicts between different Data Product teams and have a clear separation of teams, DMA proposes to create a dedicated subnet per Data Product team and create a `Network Contributor` role assignment on that subnet (child resource scope). This role assignment allows the teams to join the subnet using private endpoints.
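
A minimal sketch of these role assignments in Bicep is shown below. The principal ID and network names are placeholders, the built-in role definition GUIDs are the well-known Azure values, and for simplicity the shared vnet is assumed to be reachable from the deployment's resource group scope (in practice the subnet assignment would typically be deployed as a module targeting the Data Landing Zone's network resource group).

```bicep
// Sketch of the role assignments for a Data Product team: Contributor and
// User Access Administrator on the team's resource group, Network Contributor
// on the team's dedicated subnet. Placeholders, not the repository's code.
param dataProductTeamPrincipalId string // object ID of the team's AAD group
param vnetName string                   // shared vnet of the Data Landing Zone
param subnetName string                 // the team's dedicated subnet

var contributorRoleId = subscriptionResourceId('Microsoft.Authorization/roleDefinitions', 'b24988ac-6180-42a0-ab88-20f7382dd24c')
var userAccessAdminRoleId = subscriptionResourceId('Microsoft.Authorization/roleDefinitions', '18d7d88d-d35e-4fb5-a5c3-7773c20a72d9')
var networkContributorRoleId = subscriptionResourceId('Microsoft.Authorization/roleDefinitions', '4d97b98b-1d4f-4787-a291-c67834d212e7')

// Contributor on the Data Product resource group (the deployment's target scope).
resource contributorAssignment 'Microsoft.Authorization/roleAssignments@2020-10-01-preview' = {
  name: guid(resourceGroup().id, dataProductTeamPrincipalId, contributorRoleId)
  properties: {
    principalId: dataProductTeamPrincipalId
    roleDefinitionId: contributorRoleId
    principalType: 'Group'
  }
}

// User Access Administrator on the same resource group.
resource userAccessAdminAssignment 'Microsoft.Authorization/roleAssignments@2020-10-01-preview' = {
  name: guid(resourceGroup().id, dataProductTeamPrincipalId, userAccessAdminRoleId)
  properties: {
    principalId: dataProductTeamPrincipalId
    roleDefinitionId: userAccessAdminRoleId
    principalType: 'Group'
  }
}

// Network Contributor scoped to the team's dedicated subnet.
resource vnet 'Microsoft.Network/virtualNetworks@2021-02-01' existing = {
  name: vnetName
}

resource subnet 'Microsoft.Network/virtualNetworks/subnets@2021-02-01' existing = {
  parent: vnet
  name: subnetName
}

resource networkContributorAssignment 'Microsoft.Authorization/roleAssignments@2020-10-01-preview' = {
  name: guid(subnet.id, dataProductTeamPrincipalId, networkContributorRoleId)
  scope: subnet
  properties: {
    principalId: dataProductTeamPrincipalId
    roleDefinitionId: networkContributorRoleId
    principalType: 'Group'
  }
}
```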
These first two role assignments enable self-service deployment of data services within these environments. To address the cost management concern, organizations should add a cost centre tag to the resource groups to enable cross-charging and distributed cost ownership. This raises awareness within the teams and encourages them to make the right decisions with respect to required SKUs and service tiers.
To also enable self-service use of other shared resources within the Data Landing Zone, a few additional role assignments are required. If access to a Databricks environment is required, organizations should use the [SCIM Sync from AAD](https://docs.microsoft.com/en-us/azure/databricks/administration-guide/users-groups/scim/aad) to provide access. This is important, as this mechanism automatically syncs users and groups from AAD to the Databricks data plane and also automatically removes access rights when an individual leaves the organization or business. This may not be the case if separate user accounts are used in Azure Databricks. Data Product teams should be added to the Databricks workspace in the `shared-product-rg`, whereas Data Integration teams should be added to the Databricks workspace in the `shared-integration-rg`. Within Azure Databricks, the Data Product teams should be given `Can Restart` access rights to a predefined cluster to be able to run workloads within the workspace.
@ -107,7 +107,7 @@ Lastly, the CI/CD automation will require setting up a connection to Azure which
When onboarding a Data Product or Data Integration team onto a Data Landing Zone, the team will be granted access to their dedicated resource group, their subnet as well as the shared resources. From this point in time, the ownership of the environment is handed over to the Data Product or Data Integration team respectively. These teams take over responsibility from both an end-to-end implementation and a cost ownership perspective.
To simplify the way to get started and reduce the lead time to create an environment for a specific use-case, organizations can provide Data Product reference patterns internally. These reference implementations consist of the Infrastructure as Code (IaC) definitions to successfully create a set of services for a specific use case such as batch data processing, streaming data processing or data science and demonstrate a path to success. Potentially, these patterns also include generic application code that can be used as a baseline when implementing data solutions. Data Product reference patterns may vary between organizations and highly depend on the utilized tools and common and repeatedly used data implementation patterns across Data Landing Zones. Enterprise-scale Analytics also provides a set of curated Data Product reference designs that can be used as baseline and that can be further enhanced by enterprises depending on their requirements. These can be found here:
To simplify getting started and reduce the lead time for creating an environment for a specific use case, organizations can provide Data Product reference patterns internally. These reference implementations consist of the Infrastructure as Code (IaC) definitions to successfully create a set of services for a specific use case, such as batch data processing, streaming data processing or data science, and demonstrate a path to success. Potentially, these patterns also include generic application code that can be used as a baseline when implementing data solutions. Data Product reference patterns may vary between organizations and highly depend on the utilized tools and the common, repeatedly used data implementation patterns across Data Landing Zones. The Data Management & Analytics Scenario also provides a set of curated Data Product reference designs that can be used as a baseline and further enhanced by enterprises depending on their requirements. These can be found here:
- [Data Product Batch](https://github.com/Azure/data-product-batch)
- [Data Product Streaming](https://github.com/Azure/data-product-streaming)

Binary data
docs/images/DataManagementAnalytics.gif Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 3.4 MiB

Binary data
docs/images/EnterpriseScaleAnalytics.gif

Binary file not shown.

Before

Width:  |  Height:  |  Size: 2.8 MiB

View file

@ -50,8 +50,8 @@ param administratorPassword string
// Variables
var name = toLower('${prefix}-${environment}')
var tagsDefault = {
Owner: 'Enterprise Scale Analytics'
Project: 'Enterprise Scale Analytics'
Owner: 'Data Management and Analytics Scenario'
Project: 'Data Management and Analytics Scenario'
Environment: environment
Toolkit: 'bicep'
Name: name

View file

@ -5,7 +5,7 @@
"_generator": {
"name": "bicep",
"version": "0.4.1008.15138",
"templateHash": "4335549807263718061"
"templateHash": "12653787604786881755"
}
},
"parameters": {
@ -110,8 +110,8 @@
"variables": {
"name": "[toLower(format('{0}-{1}', parameters('prefix'), parameters('environment')))]",
"tagsDefault": {
"Owner": "Enterprise Scale Analytics",
"Project": "Enterprise Scale Analytics",
"Owner": "Data Management and Analytics Scenario",
"Project": "Data Management and Analytics Scenario",
"Environment": "[parameters('environment')]",
"Toolkit": "bicep",
"Name": "[variables('name')]"

View file

@ -3,18 +3,18 @@
"view": {
"kind": "Form",
"properties": {
"title": "Enterprise-Scale Analytics - Bastion Host Deployment",
"title": "Data Management & Analytics Scenario - Bastion Host Deployment",
"steps": [
{
"name": "basics",
"label": "Bastion",
"elements": [
{
"name": "infoBoxEnterpriseScaleAnalytics",
"name": "infoBoxDataManagementAnalytics",
"type": "Microsoft.Common.InfoBox",
"visible": true,
"options": {
"text": "Enterprise-Scale Analytics is a prescriptive reference architecture for data with reference implementation provided by Microsoft. Visit 'aka.ms/adopt/datamanagement' for more details about the solution pattern.",
"text": "Data Management & Analytics Scenario is a prescriptive reference architecture for data with reference implementation provided by Microsoft. Visit 'aka.ms/adopt/datamanagement' for more details about the solution pattern.",
"style": "Info",
"uri": "https://aka.ms/adopt/datamanagement"
}
@ -67,7 +67,7 @@
"type": "Microsoft.Common.InfoBox",
"visible": true,
"options": {
"text": "Since not all service features are available in all regions, Enterprise-Scale Analytics is available in a subset of regions.",
"text": "Since not all service features are available in all regions, this deployment is available in a subset of regions.",
"style": "Info"
}
},

View file

@ -36,8 +36,8 @@ param administratorPassword string
// Variables
var name = toLower('${prefix}-${environment}')
var tagsDefault = {
Owner: 'Enterprise Scale Analytics'
Project: 'Enterprise Scale Analytics'
Owner: 'Data Management and Analytics Scenario'
Project: 'Data Management and Analytics Scenario'
Environment: environment
Toolkit: 'bicep'
Name: name

View file

@ -4,8 +4,8 @@
"metadata": {
"_generator": {
"name": "bicep",
"version": "0.4.613.9944",
"templateHash": "15262263412229329785"
"version": "0.4.1008.15138",
"templateHash": "10224713317377197829"
}
},
"parameters": {
@ -73,8 +73,8 @@
"variables": {
"name": "[toLower(format('{0}-{1}', parameters('prefix'), parameters('environment')))]",
"tagsDefault": {
"Owner": "Enterprise Scale Analytics",
"Project": "Enterprise Scale Analytics",
"Owner": "Data Management and Analytics Scenario",
"Project": "Data Management and Analytics Scenario",
"Environment": "[parameters('environment')]",
"Toolkit": "bicep",
"Name": "[variables('name')]"
@ -85,7 +85,7 @@
"resources": [
{
"type": "Microsoft.Resources/deployments",
"apiVersion": "2019-10-01",
"apiVersion": "2020-06-01",
"name": "selfHostedAgentDevOps001",
"properties": {
"expressionEvaluationOptions": {
@ -127,8 +127,8 @@
"metadata": {
"_generator": {
"name": "bicep",
"version": "0.4.613.9944",
"templateHash": "16656094663692197516"
"version": "0.4.1008.15138",
"templateHash": "4038817811886062802"
}
},
"parameters": {

View file

@ -3,18 +3,18 @@
"view": {
"kind": "Form",
"properties": {
"title": "Enterprise-Scale Analytics - Self-Hosted Agent",
"title": "Data Management & Analytics Scenario - Self-Hosted Agent",
"steps": [
{
"name": "basics",
"label": "Self-Hosted Agent",
"elements": [
{
"name": "infoBoxEnterpriseScaleAnalytics",
"name": "infoBoxDataManagementAnalytics",
"type": "Microsoft.Common.InfoBox",
"visible": true,
"options": {
"text": "Enterprise-Scale Analytics is a prescriptive reference architecture for data with reference implementation provided by Microsoft. Visit 'aka.ms/adopt/datamanagement' for more details about the solution pattern.",
"text": "Data Management & Analytics Scenario is a prescriptive reference architecture for data with reference implementation provided by Microsoft. Visit 'aka.ms/adopt/datamanagement' for more details about the solution pattern.",
"style": "Info",
"uri": "https://aka.ms/adopt/datamanagement"
}
@ -101,7 +101,7 @@
"type": "Microsoft.Common.InfoBox",
"visible": true,
"options": {
"text": "Since not all service features are available in all regions, Enterprise-Scale Analytics is available in a subset of regions.",
"text": "Since not all service features are available in all regions, this deployment is available in a subset of regions.",
"style": "Info"
}
},

View file

@ -5,7 +5,7 @@
"_generator": {
"name": "bicep",
"version": "0.4.1008.15138",
"templateHash": "7920090618025606816"
"templateHash": "12110763976061491219"
}
},
"parameters": {
@ -359,7 +359,7 @@
"_generator": {
"name": "bicep",
"version": "0.4.1008.15138",
"templateHash": "4335549807263718061"
"templateHash": "12653787604786881755"
}
},
"parameters": {
@ -464,8 +464,8 @@
"variables": {
"name": "[toLower(format('{0}-{1}', parameters('prefix'), parameters('environment')))]",
"tagsDefault": {
"Owner": "Enterprise Scale Analytics",
"Project": "Enterprise Scale Analytics",
"Owner": "Data Management and Analytics Scenario",
"Project": "Data Management and Analytics Scenario",
"Environment": "[parameters('environment')]",
"Toolkit": "bicep",
"Name": "[variables('name')]"

View file

@ -3,18 +3,18 @@
"view": {
"kind": "Form",
"properties": {
"title": "Enterprise-Scale Analytics",
"title": "Data Management & Analytics Scenario",
"steps": [
{
"name": "basics",
"label": "Enterprise-Scale Analytics",
"label": "Data Management & Analytics Scenario",
"elements": [
{
"name": "infoBoxEnterpriseScaleAnalytics",
"name": "infoBoxDataManagementAnalytics",
"type": "Microsoft.Common.InfoBox",
"visible": true,
"options": {
"text": "Enterprise-Scale Analytics is a prescriptive reference architecture for data with reference implementation provided by Microsoft. Visit 'aka.ms/adopt/datamanagement' for more details about the solution pattern.",
"text": "Data Management & Analytics Scenario is a prescriptive reference architecture for data with reference implementation provided by Microsoft. Visit 'aka.ms/adopt/datamanagement' for more details about the solution pattern.",
"style": "Info",
"uri": "https://aka.ms/adopt/datamanagement"
}
@ -99,7 +99,7 @@
"type": "Microsoft.Common.InfoBox",
"visible": true,
"options": {
"text": "Enterprise-Scale Analytics & AI uses a single landing zone for all data governance tasks. We are advising to use a separate subscription for the Data Management Zone and each Data Landing Zone.",
"text": "Data Management & Analytics Scenario uses a single landing zone for all data governance tasks. We are advising to use a separate subscription for the Data Management Zone and each Data Landing Zone.",
"style": "Info"
}
},
@ -175,7 +175,7 @@
"type": "Microsoft.Common.InfoBox",
"visible": true,
"options": {
"text": "Since not all service features are available in all regions, Enterprise-Scale Analytics is available in a subset of regions.",
"text": "Since not all service features are available in all regions, this deployment is available in a subset of regions.",
"style": "Info"
}
},
@ -308,7 +308,7 @@
"type": "Microsoft.Common.InfoBox",
"visible": true,
"options": {
"text": "Enterprise-Scale Analytics advises to scale data workloads through the means of one or multiple Data Landing Zones. Each Data Landing Zone should be deployed to a separate subscription.",
"text": "Data Management & Analytics Scenario advises to scale data workloads through the means of one or multiple Data Landing Zones. Each Data Landing Zone should be deployed to a separate subscription.",
"style": "Info"
}
},
@ -335,7 +335,7 @@
"type": "Microsoft.Common.InfoBox",
"visible": true,
"options": {
"text": "Since not all service features are available in all regions, Enterprise-Scale Analytics is available in a subset of regions.",
"text": "Since not all service features are available in all regions, this deployment is available in a subset of regions.",
"style": "Info"
}
},
@ -761,7 +761,7 @@
"type": "Microsoft.Common.TagsByResource",
"visible": true,
"resources": [
"EnterpriseScaleAnalytics"
"DataManagementAnalytics"
]
}
]
@ -783,7 +783,7 @@
"dataLandingZonePrefix": "[steps('dataLandingZoneSettings').dataLandingZonesName.dataLandingZonesPrefix]",
"enableBastionHostDeployment": "[steps('dataLandingZoneSettings').azureBastionSettings.azureBastionCheckbox]",
"virtualMachineImage": "[if(empty(steps('dataLandingZoneSettings').azureBastionSettings.virtualMachineImage), '', steps('dataLandingZoneSettings').azureBastionSettings.virtualMachineImage)]",
"tags": "[if(not(contains(steps('tags').tagsByResource, 'EnterpriseScaleAnalytics')), parse('{}'), first(map(parse(concat('[', string(steps('tags').tagsByResource), ']')), (item) => item.EnterpriseScaleAnalytics)))]"
"tags": "[if(not(contains(steps('tags').tagsByResource, 'DataManagementAnalytics')), parse('{}'), first(map(parse(concat('[', string(steps('tags').tagsByResource), ']')), (item) => item.DataManagementAnalytics)))]"
}
}
}

View file

@ -3,18 +3,18 @@
"view": {
"kind": "Form",
"properties": {
"title": "Enterprise-Scale Analytics - Data Management Zone",
"title": "Data Management & Analytics Scenario - Data Management Zone",
"steps": [
{
"name": "basics",
"label": "Data Management Zone",
"elements": [
{
"name": "infoBoxEnterpriseScaleAnalytics",
"name": "infoBoxDataManagementAnalytics",
"type": "Microsoft.Common.InfoBox",
"visible": true,
"options": {
"text": "Enterprise-Scale Analytics is a prescriptive reference architecture for data with reference implementation provided by Microsoft. Visit 'aka.ms/adopt/datamanagement' for more details about the solution pattern.",
"text": "Data Management & Analytics Scenario is a prescriptive reference architecture for data with reference implementation provided by Microsoft. Visit 'aka.ms/adopt/datamanagement' for more details about the solution pattern.",
"style": "Info",
"uri": "https://aka.ms/adopt/datamanagement"
}
@ -101,7 +101,7 @@
"type": "Microsoft.Common.InfoBox",
"visible": true,
"options": {
"text": "Since not all service features are available in all regions, Enterprise-Scale Analytics is available in a subset of regions.",
"text": "Since not all service features are available in all regions, this deployment is available in a subset of regions.",
"style": "Info"
}
},
@ -421,7 +421,7 @@
"type": "Microsoft.Common.TextBlock",
"visible": true,
"options": {
"text": "Specify whether the deployment takes place in an existing Enterprise-Scale Landing Zone environment. If so, it is recommended to make use of the existing network infrastructure in your connectivity hub in your tenant.",
"text": "Specify whether the deployment takes place in an existing Azure Landing Zone (former Enterprise-Scale Landing Zone) environment. If so, it is recommended to make use of the existing network infrastructure in your connectivity hub in your tenant.",
"link": {
"label": "Learn more",
"uri": "https://docs.microsoft.com/en-us/azure/cloud-adoption-framework/ready/enterprise-scale/architecture#high-level-architecture"
@ -430,7 +430,7 @@
},
{
"name": "disableDnsAndFirewallDeployment",
"label": "Deploy into Enterprise-Scale Landing Zone environment",
"label": "Deploy into Azure Landing Zone (former Enterprise-Scale Landing Zone) environment",
"type": "Microsoft.Common.OptionsGroup",
"visible": true,
"toolTip": "If 'Yes' is selected, you can configure the deployment to make use of the networking infrastructure in your ESLZ Connectivity Hub.",
@ -879,7 +879,7 @@
"type": "Microsoft.Common.TagsByResource",
"visible": true,
"resources": [
"EnterpriseScaleAnalytics"
"DataManagementAnalytics"
]
}
]
@ -894,7 +894,7 @@
"location": "[if(empty(steps('basics').deploymentDetails.locationName), '', steps('basics').deploymentDetails.locationName)]",
"environment": "[if(empty(steps('basics').dataManagementZoneName.environment), '', steps('basics').dataManagementZoneName.environment)]",
"prefix": "[if(empty(steps('basics').dataManagementZoneName.dataManagementZonePrefix), '', steps('basics').dataManagementZoneName.dataManagementZonePrefix)]",
"tags": "[if(not(contains(steps('tags').tagsByResource, 'EnterpriseScaleAnalytics')), parse('{}'), first(map(parse(concat('[', string(steps('tags').tagsByResource), ']')), (item) => item.EnterpriseScaleAnalytics)))]",
"tags": "[if(not(contains(steps('tags').tagsByResource, 'DataManagementAnalytics')), parse('{}'), first(map(parse(concat('[', string(steps('tags').tagsByResource), ']')), (item) => item.DataManagementAnalytics)))]",
"purviewRootCollectionAdminObjectIds": "[if(empty(steps('generalSettings').purviewSettings.purviewRootCollectionAdminObjectIds), parse('[]'), split(replace(steps('generalSettings').purviewSettings.purviewRootCollectionAdminObjectIds, ' ', ''), ','))]",
"vnetAddressPrefix": "[if(empty(steps('connectivitySettings').virtualNetworkConfiguration.virtualNetworkAddressCidrRange), '', steps('connectivitySettings').virtualNetworkConfiguration.virtualNetworkAddressCidrRange)]",
"azureFirewallSubnetAddressPrefix": "[if(empty(steps('connectivitySettings').virtualNetworkConfiguration.azureFirewallSubnetCidrRange), '', steps('connectivitySettings').virtualNetworkConfiguration.azureFirewallSubnetCidrRange)]",

View file

@ -63,8 +63,8 @@ param privateDnsZoneIdSynapse string = ''
// Variables
var name = toLower('${prefix}-${environment}')
var tagsDefault = {
Owner: 'Enterprise Scale Analytics'
Project: 'Enterprise Scale Analytics'
Owner: 'Data Management and Analytics Scenario'
Project: 'Data Management and Analytics Scenario'
Environment: environment
Toolkit: 'bicep'
Name: name

View file

@ -5,7 +5,7 @@
"_generator": {
"name": "bicep",
"version": "0.4.1008.15138",
"templateHash": "16584189192184544370"
"templateHash": "271023822918263702"
}
},
"parameters": {
@ -161,8 +161,8 @@
"variables": {
"name": "[toLower(format('{0}-{1}', parameters('prefix'), parameters('environment')))]",
"tagsDefault": {
"Owner": "Enterprise Scale Analytics",
"Project": "Enterprise Scale Analytics",
"Owner": "Data Management and Analytics Scenario",
"Project": "Data Management and Analytics Scenario",
"Environment": "[parameters('environment')]",
"Toolkit": "bicep",
"Name": "[variables('name')]"