Merge pull request #7 from Microsoft/springstone

Springstone
Sacha Narinx 2017-05-31 17:00:26 +04:00 committed by GitHub
Parents 1bb7a16656 f019424b65
Commit 803d1c8c57
6 changed files with 81 additions and 22 deletions

View file

@@ -1,4 +1,4 @@
-#Average CPU usage calculated over 10 minutes for the last 1 hours
+#Average CPU usage calculated over 10 minutes for the last 1 hour
 .\RB-ProcessLogs.ps1 `
 -ReportName "perfavgcpu" `
 -dynamicQuery "Type=Perf CounterName=""% Processor Time"" TimeGenerated>=NOW-1HOURS | measure avg(CounterValue) by Computer interval 30MINUTE"
@@ -6,7 +6,7 @@
 #All detected threats based on threat status rank
 .\RB-ProcessLogs.ps1 `
 -ReportName "securitydetectedthreats" `
--dynamicQuery "Type=ProtectionStatus ThreatStatusRank > 199 ThreatStatusRank != 470 | measure max(ThreatStatusRank) as Rank by Computer | Top 5000"
+-dynamicQuery "Type=ProtectionStatus ThreatStatusRank > 199 ThreatStatusRank != 470 | measure max(ThreatStatusRank) as Rank by Computer"
 #All Windows security login failures in the past 1 hour
 .\RB-ProcessLogs.ps1 `
@@ -16,4 +16,4 @@
 #All Linux Syslog errors in the past 1 hour
 .\RB-ProcessLogs.ps1 `
 -ReportName "linuxsyslogerrors" `
--dynamicQuery "Type=Syslog SeverityLevel=error TimeGenerated>NOW-1HOUR"
+-dynamicQuery "Type=Syslog SeverityLevel=error TimeGenerated>NOW-1HOURS"

View file

@@ -0,0 +1,32 @@
# zMonitor
## Cost Estimates
Cost estimates for the zMonitor solution depend largely on what you are monitoring and how often.
All pricing estimates in this document are based on the "West Europe" region.
### Tenant / Subscription
For tenants or subscriptions with a small footprint (a couple of VMs), the free tiers should be sufficient. Keep an eye on Log Analytics data usage, as you may need to change to the paid tier depending on the amount of data you are collecting. If you're processing logs, you can quickly exceed the 500 MB daily limit of the Log Analytics free tier.
| Component | Assumptions | Cost (monthly) |
| ----------------------------- | ------------------------- | ------------------ |
| Log Analytics | Free tier | $ 0.00 |
| Azure Automation | 500 minutes (Free) | $ 0.00 |
| | | **$ 0.00** |
### Service Provider / Central
| Component | Assumptions | Cost (monthly) |
| ----------------------------- | ------------------------- | ------------------ |
| Azure Storage Account (BLOB) | 10 GB stored | $ 0.20 |
| Stream Analytics | 1 Unit | $ 89.28 |
| Azure Cosmos DB | 2 GB stored, 400 RUs | $ 24.31 |
| Azure Automation | 500 minutes (Free) | $ 0.00 |
| | | **$ 113.79** |
* [Optional] Power BI - assumes you have a license for Power BI Desktop.

View file

@@ -17,22 +17,32 @@ The overall process for tenant monitoring for the service provider is:
 1. Receive tenant OMS logs as CSV in storage account container
 1. Use Stream Analytics to move the CSV into Cosmos DB (formerly DocumentDB)
-1. Run cleanup process through Azure Automation at least daily (cleans up the CSV container)
+1. Run cleanup process through Azure Automation at least daily (cleans up the CSV container, and archives processed CSVs)
 1. Visualize. This solution provides a work-in-progress PowerBI sample for viewing data. Visualization can be done through any mechanism familiar to you, including existing tools, as long as they can query Cosmos DB. PowerBI is provided for convenience as a starting point.
 ## Deployment
-Below are the basic steps required to deploy the service provider component of the solution, provided as interim guidance while working on the ARM template.
+Below are the basic steps required to deploy the service provider component of the solution, provided as interim guidance while working on the ARM template (currently limited by Stream Analytics configuration).
 What's needed to set up the service provider components of zMonitor:
 * Storage Account (BLOBs)
+  Storage for the CSV logs; Hot Locally Redundant (LRS) BLOB storage is sufficient. Cold may work but hasn't been tested.
   * Two containers
-    * Main logs container - the container where the logs get dropped from subscriptions/tenants.
-    * Archive lgos container
+    * Main logs container
+      The container where the logs get dropped from subscriptions/tenants.
+    * Archive logs container
+      Long-term retention of CSV logs, useful for later processing. Not directly required by this solution.
 * Azure Automation
+  Runs the CSV cleanup and archiving jobs (see the sketch after these deployment steps).
   * Deploy runbook: [RB-Ops-CleanupDaily][1]
-    * Schedule to run at least once a day
+    * Schedule to run at least once a day, recommended to run every hour or two
   * Update storage details in RB-Ops-CleanupDaily:
 ```PowerShell
 $StorageAccountName = "<STORAGE ACCOUNT>"
@@ -41,11 +51,19 @@ What's needed to set up the service provider components of zMonitor:
 $StorageAccountKey = "<STORAGE ACCOUNT KEY>"
 ```
 * Azure Cosmos DB
-  * Create database and collection
+  Where the log data gets stored in JSON format, and where we report from. When querying Cosmos DB, we'll need the connection details, including the URI and key (read-only is sufficient) - both available under the "Keys" property under "Settings" on the Cosmos DB blade.
+  * Create a database and a collection
   * Remember to set Time To Live (TTL) - recommended to set to 7 days (604800 seconds)
-  * Scale according to number of tenants, a starting scale on a single partition is 2000 RUs.
+    This auto-deletes records in Cosmos DB older than what's specified in the TTL setting, which keeps the collection size constrained and query performance reasonable. Adjust it according to your specific requirements. Remember, the original data is archived in the BLOB archive container.
+  * Scale according to number of tenants and query performance
+    Start scale on a single partition with 400 RUs. Increase RUs as query performance is impacted. Data ingest should not be impacted at 400 RUs, as we add data in short bursts.
 * Stream Analytics
-  * Configure input : storage account main logs containers
+  * Configure input : storage account main logs container
   * Configure output : Cosmos DB collection
   * Define the query:
 ```SQL
@@ -56,8 +74,11 @@ What's needed to set up the service provider components of zMonitor:
 FROM
 [StorageContainerCSVs]
 ```
+  * Start the stream job
+    ![Stream Analytics - Running](images/centralStreamAnalytics.png)
 * Visualize - PowerBI
-  * Configure connection to CosmosDB
+  * Configure connection to CosmosDB using URI and key (read-only)
   NOTE: Use the datasource connector "DocumentDB (Beta)"
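
The runbook internals aren't shown in this commit, so as orientation only, here is a minimal sketch of the kind of cleanup-and-archive pass RB-Ops-CleanupDaily performs, written with the AzureRM-era Azure.Storage cmdlets. The container names and the copy-then-delete flow are illustrative assumptions, not the shipped runbook logic:

```PowerShell
# Minimal illustrative sketch (assumed logic, not the actual
# RB-Ops-CleanupDaily runbook): archive processed CSVs from the main
# logs container, then delete them from the main container.
$StorageAccountName = "<STORAGE ACCOUNT>"
$StorageAccountKey  = "<STORAGE ACCOUNT KEY>"
$SourceContainer    = "logs"     # assumed name of the main logs container
$ArchiveContainer   = "archive"  # assumed name of the archive container

$context = New-AzureStorageContext -StorageAccountName $StorageAccountName `
    -StorageAccountKey $StorageAccountKey

foreach ($blob in Get-AzureStorageBlob -Container $SourceContainer -Context $context) {
    # Server-side copy into the archive container
    Start-AzureStorageBlobCopy -SrcContainer $SourceContainer -SrcBlob $blob.Name `
        -DestContainer $ArchiveContainer -DestBlob $blob.Name -Context $context | Out-Null

    # Wait for the copy to finish before removing the source CSV
    Get-AzureStorageBlobCopyState -Container $ArchiveContainer -Blob $blob.Name `
        -Context $context -WaitForComplete | Out-Null

    Remove-AzureStorageBlob -Container $SourceContainer -Blob $blob.Name -Context $context
}
```

Within a single storage account the server-side copy usually completes quickly; the sketch still waits on the copy state before deleting the source CSV.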

View file

@@ -6,15 +6,15 @@ The queries included in this sample solution are designed to be flexible and hig
 ### Daily Reports
-* Alerts generated in the past 24 hours that are still open
-* All Windows VMs that require updates
-* All VMs that are missing critical updates
-* All VMs with more than 2GB RAM available on average
-* All computers with their most recent data
+* Alerts generated in the past 24 hours that are still open ("activealertscritical")
+* All Windows VMs that require updates ("anyupdatesrequired")
+* All VMs that are missing critical updates ("criticalupdatesrequired")
+* All VMs with more than 2GB RAM available on average ("vmswithover2gbramavailable")
+* All computers with their most recent data ("allvmsmostrecentdata")
 ### Hourly Reports
-* Average CPU usage calculated over 10 minutes for the last 1 hours
-* All detected threats based on threat status rank
-* All Windows security login failures in the past 1 hours
-* All Linux Syslog errors in the past 1 hour
+* Average CPU usage calculated over 10 minutes for the last 1 hour ("perfavgcpu")
+* All detected threats based on threat status rank ("securitydetectedthreats")
+* All Windows security login failures in the past 1 hour ("acctloginfailurepasthour")
+* All Linux Syslog errors in the past 1 hour ("linuxsyslogerrors")
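
The quoted names are the -ReportName values passed to RB-ProcessLogs.ps1. As a purely illustrative sketch, assuming the wrapper loop and hashtable below (only the three hourly queries visible in the samples diff above are included), the hourly reports could be driven in one pass:

```PowerShell
# Illustrative only: the report-name/query pairs come from the samples
# file changed in this commit; the wrapper loop itself is an assumption.
$hourlyReports = @{
    'perfavgcpu'              = 'Type=Perf CounterName="% Processor Time" TimeGenerated>=NOW-1HOURS | measure avg(CounterValue) by Computer interval 30MINUTE'
    'securitydetectedthreats' = 'Type=ProtectionStatus ThreatStatusRank > 199 ThreatStatusRank != 470 | measure max(ThreatStatusRank) as Rank by Computer'
    'linuxsyslogerrors'       = 'Type=Syslog SeverityLevel=error TimeGenerated>NOW-1HOURS'
}

foreach ($report in $hourlyReports.GetEnumerator()) {
    # RB-ProcessLogs.ps1 ships with this solution (see the samples above)
    .\RB-ProcessLogs.ps1 -ReportName $report.Key -dynamicQuery $report.Value
}
```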

Binary data
Documentation/images/centralStreamAnalytics.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 112 KiB

View file

@@ -1,11 +1,17 @@
 # zMonitor
+An Azure platform native monitoring solution that enables Azure monitoring across multiple tenants or subscriptions.
 ## Overview
-An Azure platform native monitoring solution that enables monitoring across multiple tenants or subscriptions.
+Problem statement: a service provider with 50 tenants, whose Azure subscriptions are provisioned through CSP (Cloud Solution Provider), needs to consolidate operational telemetry to optimize running costs and deliver higher SLAs with minimal administrative overhead.
+Enter zMonitor, a reporting platform built on collected Log Analytics data that quickly surfaces insights across tenants or subscriptions: disk capacity status, VM performance (over- or under-utilized CPU/memory/disk), security vulnerabilities such as failed logons, update/patch status, application errors, and more.
 The primary goal is to utilize Azure native components and deliver a monitoring solution that is as simple as possible, highly configurable, scalable, and cost-effective. The driving force behind this solution was the need to monitor Azure resources across tenants (for service providers), using Azure native tools. While Azure generates many metrics and logs, surfacing this information across subscriptions/tenants and monitoring it effectively proved challenging.
+For an indication of potential costs for the solution, review the [cost estimate](Documentation/Cost-Estimate.md) documentation.
 ## Solution
 For monitoring within subscriptions, OMS Log Analytics is leveraged as the native log and metric aggregation toolset in Azure. Using the free tier of Log Analytics will be sufficient for most cases, but depends on the number of resources being monitored and the metrics being collected.