Telemetry and logs generator for benchmarks
Hauke Mallow 5034703008
Benchmark queries for Kusto and Spark SQL
2021-12-07 12:25:24 +01:00
.github/workflows Initial check-in of sample logs data generator tool 2021-02-21 12:35:40 +11:00
Data Fixed double quotes issue in data generated for 'Message' column. 2021-02-22 12:25:49 +11:00
Flows Initial check-in of sample logs data generator tool 2021-02-21 12:35:40 +11:00
Infrastructure Updated templates 2021-03-01 22:45:21 +11:00
Properties Initial check-in of sample logs data generator tool 2021-02-21 12:35:40 +11:00
Utilities Initial check-in of sample logs data generator tool 2021-02-21 12:35:40 +11:00
queries Benchmark queries for Kusto and Spark SQL 2021-12-07 12:25:24 +01:00
.gitignore adds arm template for storage accounts and batch account 2021-02-22 09:00:09 +01:00
BenchmarkLogGenerator.csproj Initial check-in of sample logs data generator tool 2021-02-21 12:35:40 +11:00
BenchmarkLogGenerator.sln Initial check-in of sample logs data generator tool 2021-02-21 12:35:40 +11:00
CODE_OF_CONDUCT.md Initial check-in of sample logs data generator tool 2021-02-21 12:35:40 +11:00
CommandLineArgs.cs Initial check-in of sample logs data generator tool 2021-02-21 12:35:40 +11:00
Enums.cs Initial check-in of sample logs data generator tool 2021-02-21 12:35:40 +11:00
Generator.cs Initial check-in of sample logs data generator tool 2021-02-21 12:35:40 +11:00
LICENSE Initial check-in of sample logs data generator tool 2021-02-21 12:35:40 +11:00
LogWriter.cs Initial check-in of sample logs data generator tool 2021-02-21 12:35:40 +11:00
PriorityQueue.cs Initial check-in of sample logs data generator tool 2021-02-21 12:35:40 +11:00
Program.cs Removed splitting of string to get storage acc connection strings in code and restricted EH to 1 GB. 2021-03-01 12:31:18 +11:00
README.md Update README.md 2021-03-01 23:10:59 +11:00
SECURITY.md Initial check-in of sample logs data generator tool 2021-02-21 12:35:40 +11:00
Scheduler.cs Initial check-in of sample logs data generator tool 2021-02-21 12:35:40 +11:00
runtimeconfig.template.json Initial check-in of sample logs data generator tool 2021-02-21 12:35:40 +11:00

README.md

Sample Data Generator

BenchmarkLogGenerator is a command-line tool that generates sample trace and error log data. The data can be used for proofs of concept or performance benchmarking. The tool supports the following configurable command-line parameters:

  1. -output: The location where output is written. Supported values are: LocalDisk, AzureStorage, EventHub.
  2. -size: The size of the data to generate. Supported values are: OneGB, OneTB, HundredTB. The default value is OneGB.
  3. -partition: The data partition to generate, from -1 to 9, where -1 means a single partition. The default value is -1. It is only relevant for the HundredTB data size.
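
The parameter rules above can be sketched as a small argument builder. This is only an illustration, not part of the tool: `build_args` and its `target` parameter are hypothetical, the destination flags (`-localPath`, `-cc`, `-eventHubConnection`) are taken from the example commands in this README, the `-partition:N` spelling is assumed to follow the same `-name:value` pattern, and rejecting EventHub output larger than OneGB is an assumption about how the 1 GB restriction might be enforced.

```python
from typing import List, Optional

# Values documented in the parameter list above.
OUTPUTS = {"LocalDisk", "AzureStorage", "EventHub"}
SIZES = {"OneGB", "OneTB", "HundredTB"}

# Destination flags as they appear in this README's example commands.
DEST_FLAG = {
    "LocalDisk": "-localPath",
    "AzureStorage": "-cc",
    "EventHub": "-eventHubConnection",
}


def build_args(output: str, size: str = "OneGB", partition: int = -1,
               target: Optional[str] = None) -> List[str]:
    """Hypothetical helper: validate values and return an argument list."""
    if output not in OUTPUTS:
        raise ValueError(f"unsupported -output value: {output}")
    if size not in SIZES:
        raise ValueError(f"unsupported -size value: {size}")
    if not -1 <= partition <= 9:
        raise ValueError("-partition must be between -1 and 9")
    # Assumption: treat the documented 1 GB Event Hub limit as a hard error.
    if output == "EventHub" and size != "OneGB":
        raise ValueError("EventHub output is restricted to OneGB")

    args = [f"-output:{output}", f"-size:{size}"]
    # -partition is only relevant for the HundredTB size, so omit it otherwise.
    if size == "HundredTB":
        args.append(f"-partition:{partition}")
    if target is not None:
        args.append(f"{DEST_FLAG[output]}:{target}")
    return args
```

The returned list could be passed to `subprocess.run(["BenchmarkLogGenerator.exe", *args])`. Building a list instead of a single string avoids shell quoting problems with connection strings that contain semicolons.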

Example commands (replace the values in CAPS with your environment's values):

  1. OneGB size

    BenchmarkLogGenerator.exe -output:AzureStorage -size:OneGB -cc:BLOB STORAGE CONN STR

    BenchmarkLogGenerator -output:LocalDisk -size:OneGB -localPath:"C:\DATA"

    BenchmarkLogGenerator -output:EventHub -eventHubConnection:Endpoint=sb://EHNAMESPACE.servicebus.windows.net/;EntityPath=EHNAME;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=KEYVALUE -size:OneGB

    Note: Data size is restricted to 1 GB for Event Hub.

  2. OneTB size

    BenchmarkLogGenerator.exe -output:AzureStorage -size:OneTB -cc:BLOB STORAGE CONN STR

    BenchmarkLogGenerator -output:LocalDisk -size:OneTB -localPath:"C:\DATA"

    You can also use the Azure Batch templates mentioned below to generate 1 TB of data; 3 Standard_D32_v3 VMs are enough to generate 1 TB.

  3. HundredTB size

    Use Azure Batch compute to generate 100 TB of data. The tool includes Azure Batch templates for creating the required batch pools and jobs. Follow these steps to generate 100 TB of data:

    a. Create the ten required Azure storage accounts and the Azure Batch account using this script, which calls the relevant ARM templates: “TelemetryLogsGeneratorAndBenchmark\Infrastructure\ARM\deploy.ps1”

    b. Create an application package that contains all files and dependencies required to run BenchmarkLogGenerator.exe. The application package is used to upload the required files to all batch pool nodes. Publish a self-contained .NET app using this command:

    dotnet publish -r win-x64 -c Release --self-contained

    This creates a folder named publish under bin/Release with all dependencies. Zip the publish folder so that it can be uploaded as an application package, then use this command to create the application package:

    az batch application package create --application-name GENERATOR --name testbatchacc --package-file publish.zip --resource-group sample-rg --version-name 1.0

    c. Create the Azure Batch pool using the following template. 10 Standard_D64_v3 VMs with partitions 0 to 9 take approximately 12 hours to generate 100 TB of data.

    az batch pool create --template generator-pool.json

    d. Create the batch job using the following template. Provide the 10 storage account connection strings created in step 3a, and create 10 tasks with partitions 0 to 9 respectively.

    Note: The number of VMs should match the number of tasks and partitions in generator-job.json.

    az batch job create --template generator-job.json
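
The pairing of partitions, tasks, and storage accounts described in steps c and d can be sketched as follows. This is a hedged illustration only: the contents of generator-job.json are not shown in this README, so `task_commands` and `per_vm_throughput_mb_s` are hypothetical helpers, the `-partition:N` spelling is assumed to follow the `-name:value` pattern of the other flags, and the throughput figure assumes decimal units (1 TB = 10^12 bytes) and an even split across VMs.

```python
from typing import List


def task_commands(connection_strings: List[str]) -> List[str]:
    """Hypothetical: one command line per partition (0 to 9), one storage account each."""
    if len(connection_strings) != 10:
        raise ValueError("expected exactly 10 storage account connection strings")
    return [
        "BenchmarkLogGenerator.exe -output:AzureStorage -size:HundredTB "
        f'-partition:{partition} -cc:"{conn}"'
        for partition, conn in enumerate(connection_strings)
    ]


def per_vm_throughput_mb_s(total_tb: float = 100, vms: int = 10,
                           hours: float = 12) -> float:
    """Average write rate each VM must sustain, in MB/s (decimal units)."""
    bytes_per_vm = total_tb * 1e12 / vms
    return bytes_per_vm / (hours * 3600) / 1e6


if __name__ == "__main__":
    # Placeholder connection strings; replace with the ten created in step 3a.
    for line in task_commands([f"CONN_STR_{i}" for i in range(10)]):
        print(line)
    print(f"{per_vm_throughput_mb_s():.1f} MB/s per VM")
```

Under these assumptions, each of the 10 VMs writes 10 TB in 12 hours, which works out to roughly 231 MB/s per VM. This gives a rough check on whether a given storage account or VM size can keep up.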

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.