You can also use Grafana to visualize your data from Log Analytics.

## Cost Management, Chargeback and Analysis

This section focuses on Azure Databricks billing, tools to manage and analyze cost, and how to charge costs back to teams.

Azure Databricks Billing:

First, it is important to understand the different workloads and tiers available with Azure Databricks. Azure Databricks is available in two tiers, Standard and Premium. The Premium tier offers additional features on top of what is available in the Standard tier, including role-based access control for notebooks, jobs, and tables, audit logs, Azure AD credential passthrough, conditional authentication, and many more. Please refer to https://azure.microsoft.com/en-us/pricing/details/databricks/ for the complete list.

Both the Premium and Standard tiers come with three types of workload:

1. Jobs Compute (previously called Data Engineering)
2. Jobs Light Compute (previously called Data Engineering Light)
3. All-purpose Compute (previously called Data Analytics)

Jobs Compute and Jobs Light Compute make it easy for data engineers to build and execute jobs, while All-purpose Compute makes it easy for data scientists to explore, visualize, manipulate, and share data and insights interactively. Depending on the use case, you can also use All-purpose Compute for data engineering or automated scenarios, especially if the incoming job rate is high.

When you create an Azure Databricks workspace and spin up a cluster, the following resources are consumed:

1. DBUs – A DBU is a unit of processing capability, billed on per-second usage
2. Virtual Machines – These represent your Databricks clusters that run the Databricks Runtime
3. Public IP Addresses – These represent the IP addresses consumed by the virtual machines while the cluster is running
4. Blob Storage – Each workspace comes with a default storage account
5. Managed Disks
6. Bandwidth – Bandwidth charges for any data transfer

| Service/Resource | Pricing |
| --- | --- |
| DBUs | https://azure.microsoft.com/en-us/pricing/details/databricks/ |
| VMs | https://azure.microsoft.com/en-us/pricing/details/databricks/ |
| Public IP Addresses | https://azure.microsoft.com/en-us/pricing/details/ip-addresses/ |
| Managed Disk | https://azure.microsoft.com/en-us/pricing/details/managed-disks/ |
| Bandwidth | https://azure.microsoft.com/en-us/pricing/details/bandwidth/ |

In addition, if you use other services as part of your end-to-end solution, such as Azure Cosmos DB or Azure Event Hubs, they are charged per their own pricing plans.

Per the details on the Azure Databricks pricing page, there are two options:

1. Pay as you go – Pay for the DBUs as you use them. Refer to the pricing page for DBU prices by SKU. Note: the per-hour DBU price for a given SKU differs across the Azure public cloud, Azure Government, and Azure China regions.

2. Pre-purchase or Reservations – You can get up to 37% savings over pay-as-you-go DBU prices when you pre-purchase Azure Databricks Units (DBUs) as Databricks Commit Units (DBCUs) for either one or three years. A Databricks Commit Unit (DBCU) normalizes usage from Azure Databricks workloads and tiers into a single purchase. Your DBU usage across those workloads and tiers draws down from the DBCUs until they are exhausted or the purchase term expires. The draw-down rate is equivalent to the price of the DBU, per the table above. Refer to the pricing page for the pre-purchase pricing.

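As a rough illustration of the draw-down mechanics, the following sketch uses hypothetical DBU prices and usage figures (always take the actual rates from the pricing page):

```python
# Illustrative sketch of DBCU draw-down: usage from any workload/tier draws
# down the committed units at that workload's DBU price. All figures below
# are hypothetical examples, not actual prices.

def draw_down(dbcu_balance, usage):
    """usage: list of (dbus_consumed, price_per_dbu) tuples."""
    for dbus, price in usage:
        dbcu_balance -= dbus * price  # draw-down rate equals the DBU price
    return dbcu_balance

remaining = draw_down(
    10_000,            # pre-purchased DBCUs
    [(5_000, 0.55),    # e.g. All-purpose Compute, assumed $0.55/DBU
     (8_000, 0.30)],   # e.g. Jobs Compute, assumed $0.30/DBU
)
print(remaining)  # roughly 4,850 DBCUs left in the commitment
```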
Since you are also billed for the VMs, you have both of the above options for the VMs as well:

1. Pay as you go
2. Reservations - https://azure.microsoft.com/en-us/pricing/reserved-vm-instances/

Below are a few examples of billing for Azure Databricks with pay as you go.

Depending on the type of workload your cluster runs, you will be charged for either the Jobs Compute, Jobs Light Compute, or All-purpose Compute workload. For example, if the cluster runs workloads triggered by the Databricks jobs scheduler, you will be charged for the Jobs Compute workload. If your cluster runs interactive features such as ad-hoc commands, you will be billed for the All-purpose Compute workload.

Accordingly, the pricing depends on the following components:

1. DBU SKU – DBU price based on the workload and tier
2. VM SKU – VM price based on the VM SKU
3. DBU Count – Each VM SKU has an associated DBU count. Example: D3v2 has a DBU count of 0.75
4. Region
5. Duration

* If you run a Premium tier cluster for 100 hours in East US 2 with 10 DS13v2 instances, the billing would be the following for the All-purpose Compute workload:
  * VM cost for 10 DS13v2 instances: 100 hours x 10 instances x $0.598/hour = $598
  * DBU cost for the All-purpose Compute workload for 10 DS13v2 instances: 100 hours x 10 instances x 2 DBU per node x $0.55/DBU = $1,100
  * The total cost would therefore be $598 (VM cost) + $1,100 (DBU cost) = $1,698.
* If you run a Premium tier cluster for 100 hours in East US 2 with 10 DS13v2 instances, the billing would be the following for the Jobs Compute workload:
  * VM cost for 10 DS13v2 instances: 100 hours x 10 instances x $0.598/hour = $598
  * DBU cost for the Jobs Compute workload for 10 DS13v2 instances: 100 hours x 10 instances x 2 DBU per node x $0.30/DBU = $600
  * The total cost would therefore be $598 (VM cost) + $600 (DBU cost) = $1,198.

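The arithmetic in the examples above can be sketched as a small helper. The rates used here ($0.598/hour for DS13v2, $0.55 and $0.30 per DBU) are the illustrative figures from the text; confirm current prices on the pricing page:

```python
# Pay-as-you-go cost sketch: VM cost and DBU cost accrue independently
# for the same cluster hours, and the bill is their sum.

def cluster_cost(hours, instances, vm_rate, dbu_per_node, dbu_rate):
    vm_cost = hours * instances * vm_rate
    dbu_cost = hours * instances * dbu_per_node * dbu_rate
    return vm_cost, dbu_cost, vm_cost + dbu_cost

# All-purpose Compute, Premium tier, 10 x DS13v2 (2 DBU per node), 100 hours
print(cluster_cost(100, 10, 0.598, 2, 0.55))  # roughly (598, 1100, 1698)

# Jobs Compute on the same cluster, at the lower DBU rate
print(cluster_cost(100, 10, 0.598, 2, 0.30))  # roughly (598, 600, 1198)
```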
In addition to the VM and DBU charges, there are additional charges for managed disks, public IP addresses, bandwidth, and any other resources such as Azure Storage or Azure Cosmos DB, depending on your application.

Azure Databricks Trial: If you are new to Azure Databricks, you can also use a Trial SKU that gives you free DBUs at the Premium tier for 14 days. You still need to pay for other resources, such as VMs and storage, consumed during this period. After the trial is over, you need to start paying for the DBUs.

Chargeback scenarios:

There are two broad scenarios we have seen with respect to charging back internal teams for shared Databricks resources:

1. Chargeback across a single Azure Databricks workspace: In this case, a single workspace is shared across multiple teams and you would like to charge back the individual teams. Individual teams use their own Databricks clusters and can be charged back at the cluster level.
2. Chargeback across multiple Azure Databricks workspaces: In this case, teams use their own workspaces and would like to charge back at the workspace level.

In addition to the default tags, customers can add custom tags to the resources:

1. Cluster Tags: You can create custom tags as key-value pairs when you create a cluster, and Azure Databricks applies these tags to the underlying cluster resources: VMs, DBUs, public IP addresses, and disks.
2. Pool Tags: You can create custom tags as key-value pairs when you create a pool, and Azure Databricks applies these tags to the underlying pool resources: VMs, public IP addresses, and disks. Pool-backed clusters inherit default and custom tags from the pool configuration.
3. Workspace Tags: You can create custom tags as key-value pairs when you create an Azure Databricks workspace. These tags apply to the underlying resources within the workspace: VMs, DBUs, and others.

Tags propagate for DBUs and VMs as follows:

1. Clusters created from pools
   a. DBU Tag = Workspace Tag + Pool Tag + Cluster Tag
   b. VM Tag = Workspace Tag + Pool Tag
2. Clusters not created from pools
   a. DBU Tag = Workspace Tag + Cluster Tag
   b. VM Tag = Workspace Tag + Cluster Tag

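The propagation rules above can be sketched as simple dictionary merges. The assumption that later sources win on duplicate keys is for illustration only; consult the Azure Databricks tagging documentation for the actual conflict precedence:

```python
# Sketch of tag propagation: DBUs and VMs combine tags from different sources
# depending on whether the cluster is pool-backed. Later dict.update() calls
# overwrite earlier keys (an assumed precedence, for illustration).

def dbu_tags(workspace, cluster, pool=None):
    tags = dict(workspace)
    if pool is not None:   # clusters created from a pool also carry pool tags
        tags.update(pool)
    tags.update(cluster)
    return tags

def vm_tags(workspace, cluster, pool=None):
    tags = dict(workspace)
    # VMs of pool-backed clusters get pool tags instead of cluster tags
    tags.update(pool if pool is not None else cluster)
    return tags

ws = {"costcenter": "1234"}
pool = {"team": "data-eng"}
cl = {"project": "etl"}
print(dbu_tags(ws, cl, pool))  # {'costcenter': '1234', 'team': 'data-eng', 'project': 'etl'}
print(vm_tags(ws, cl, pool))   # {'costcenter': '1234', 'team': 'data-eng'}
```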
These tags (default and custom) propagate to the Cost Analysis reports that you can access in the Azure portal. The section below explains how to do cost/usage analysis using these tags.

Cost/Usage Analysis

The Cost Analysis report is available under Cost Management within the Azure portal. Please refer to the Cost Management section for a detailed overview of how to use Cost Management.

Following are the key things to note about the pre-purchase plan:

2. To view the overall consumption for a pre-purchase, go to the Reservations page in the Azure portal. If you have multiple reservations, you can find all of them under Reservations, which lets you track the to-date usage of each reservation separately. See the Reservations page for how to access this information from various tools, including the REST API, PowerShell, and the CLI.

3. To get detailed utilization reports (as with pay as you go), the same Cost Management section above applies, with a few changes:

a. Use the field Amortized Cost instead of Actual Cost in the Azure portal.
b. For EA and Modern customers, the Meter Name reflects the exact DBU workload and tier in the cost reports. The report also shows the exact term of the reservation, as 1 year or 3 years. You still need to download the same Usage Details Version 2 report as mentioned here, or use the Power BI Cost Management connector. For Web and Direct customers, the product and meter name show as Azure Databricks Reservations-DBU and DBU respectively. To identify the workload SKU, you can find the MeterID under "additionalinfo" as the consumption meter.

4. For Web and Direct customers, you can calculate the normalized consumption for DBCUs using the steps below:
a. Refer to this table to get the Cost Management Ratio

Key Things to Note:

1. Cost is shown as 0 for the reservation. That is because the reservation is pre-paid; to calculate cost, start by looking at "consumedquantity".
2. meterid changes from the SKU-based IDs to a reservation-specific meterid.

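A minimal sketch of that calculation, assuming hypothetical usage rows and ratio values (the field names mirror those mentioned above; the actual ratios come from the table referenced earlier):

```python
# Sketch of normalizing reservation usage: cost shows as 0 because DBCUs are
# pre-paid, so start from "consumedquantity" and multiply by the draw-down
# ratio for that meter. The rows and ratio values here are hypothetical.

def dbcu_consumed(rows):
    total = 0.0
    for row in rows:
        if row["cost"] == 0:  # reservation rows report zero cost
            total += row["consumedquantity"] * row["ratio"]
    return total

rows = [
    {"meterid": "reservation-meter", "cost": 0, "consumedquantity": 1000, "ratio": 0.30},
    {"meterid": "reservation-meter", "cost": 0, "consumedquantity": 200, "ratio": 0.55},
]
print(dbcu_consumed(rows))  # roughly 410 DBCUs drawn down
```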