Issue identified

Azure Monitor managed service for Prometheus had an issue during regular deployments starting August 13, 2023, whereby Managed Prometheus metrics were automatically disabled on some AKS clusters that satisfy one or more of the criteria below.

  1. The AKS cluster had both Azure Monitor logs and Azure Monitor managed service for Prometheus enabled, and a PUT operation was performed on the cluster.
  2. The AKS cluster had Azure Monitor managed service for Prometheus enabled and Azure Monitor logs was enabled afterwards, or vice versa.

Mitigation steps

The issue was mitigated on the Azure service side on August 27, 2023, but some clusters might still have Prometheus metrics collection disabled and require customer action. Follow the steps below to re-enable Azure Monitor managed service for Prometheus on any impacted AKS cluster(s).

Step-1: Dis-associate the data collection rule (DCR) for Managed Prometheus from the impacted AKS cluster.

First identify the Prometheus DCR (Data Collection Rule) for the cluster. In most cases, the DCR is created in the same resource group as the AKS cluster and in the same region as the Azure Monitor Workspace, and its name has the MSProm prefix. Once the Prometheus DCR is identified for your AKS cluster, dis-associate it from the AKS cluster resource using one of the methods below.
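If you have many data collection rules, the naming convention described above can be applied programmatically. The following is a minimal sketch, not an official tool: it assumes you have already exported your DCRs as a list of JSON objects carrying `name`, `resourceGroup`, and `location` fields (the shape and the function name `find_default_prometheus_dcrs` are assumptions made for illustration).

```python
from typing import Dict, Iterable, List, Optional


def _norm(value: Optional[str]) -> str:
    """Lower-case and drop spaces so 'East US' and 'eastus' compare equal."""
    return (value or "").replace(" ", "").lower()


def find_default_prometheus_dcrs(
    dcrs: Iterable[Dict],
    cluster_resource_group: str,
    workspace_region: str,
    name_prefix: str = "MSProm",
) -> List[Dict]:
    """Return DCRs that match the usual Managed Prometheus conventions.

    Hypothetical helper: a DCR is kept when its name starts with the
    MSProm prefix, it lives in the AKS cluster's resource group, and its
    location equals the Azure Monitor Workspace region.
    The field names ('name', 'resourceGroup', 'location') are assumed.
    """
    return [
        dcr
        for dcr in dcrs
        if _norm(dcr.get("name")).startswith(_norm(name_prefix))
        and _norm(dcr.get("resourceGroup")) == _norm(cluster_resource_group)
        and _norm(dcr.get("location")) == _norm(workspace_region)
    ]
```

The comparison is case-insensitive because resource group names are not case-sensitive in Azure. An empty result suggests the DCR was created with a custom name or resource group, in which case it has to be identified manually.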

  • Method-1: Navigate to Data collection rules in the Azure portal, and then choose the Resources menu, which shows all the data collection rules in the selected subscriptions and resource groups. Filter the list by the AKS cluster's subscription and resource group. Once the DCR is identified, choose the Resources menu under the Configuration group for the DCR, which shows the associated AKS cluster. Select the cluster and delete it to remove the association.

  • Method-2: If you chose a custom DCR name and resource group for your Prometheus DCR, first identify the correct DCR, and then perform the step above on it.
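When the DCR has a custom name, the MSProm prefix is not available as a hint. One possible way to recognize a Prometheus DCR in that case is by its data flows rather than its name. The sketch below assumes that a Managed Prometheus DCR declares the `Microsoft-PrometheusMetrics` stream in one of its data flows, and that the DCR JSON exposes `dataFlows` either at the top level or under `properties`. Both the stream name and the JSON shape are assumptions to verify against your own DCR definition, and `is_prometheus_dcr` is a hypothetical helper name.

```python
from typing import Dict

# Assumed stream name used by Managed Prometheus DCRs; verify against your DCR.
PROMETHEUS_STREAM = "Microsoft-PrometheusMetrics"


def is_prometheus_dcr(dcr: Dict) -> bool:
    """Return True when any data flow of the DCR carries the Prometheus stream.

    Accepts both a flattened shape ({"dataFlows": [...]}) and an ARM-style
    shape ({"properties": {"dataFlows": [...]}}). Both shapes are assumed.
    """
    flows = (
        dcr.get("dataFlows")
        or (dcr.get("properties") or {}).get("dataFlows")
        or []
    )
    return any(PROMETHEUS_STREAM in (flow.get("streams") or []) for flow in flows)
```

Running this check over the DCRs associated with the impacted cluster narrows the list to the candidates that need to be dis-associated in Step-1.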

Step-2: After Step-1 is complete, re-enable Azure Monitor managed service for Prometheus using any of the supported deployment mechanisms, as documented here.

If you need further help recovering from this issue, please reach out to us at askazprom@microsoft.com.