* update rules

* markdownlint fixes

* fixup GC markdowns

* Disable bare link rule

* skip checking eng folder

* fix headings

* Fix rule ordering
This commit is contained in:
Bill Wert 2021-10-18 14:00:02 -07:00 committed by GitHub
Parent e7e94530cd
Commit 5d6fc74933
No key found matching this signature
GPG key ID: 4AEE18F83AFDEB23
21 changed files: 448 additions and 306 deletions

.github/workflows/markdownlint.yml (vendored)
View file

@ -7,6 +7,7 @@ on:
- ".markdownlint.json"
- ".github/workflows/markdownlint.yml"
- ".github/workflows/markdownlint-problem-matcher.json"
- "!eng/**"
jobs:
lint:

View file

@ -1,6 +1,6 @@
{
"default": false,
"MD009": {
"br_spaces": 0
}
"MD004": false,
"MD013": false,
"MD024": false,
"MD034": false
}

View file

@ -1,36 +1,59 @@
# Basic Scenarios
An introduction to how to run scenario tests can be found in the [Scenarios Tests Guide](./scenarios-workflow.md). The current document has specific instructions to run:
- [Basic Scenarios](#basic-scenarios)
- [Basic Startup Scenarios](#basic-startup-scenarios)
- [Basic Size On Disk Scenarios](#basic-size-on-disk-scenarios)
- [Step 1 Initialize Environment](#step-1-initialize-environment)
- [Step 2 Run Precommand](#step-2-run-precommand)
- [Step 3 Run Test](#step-3-run-test)
- [Step 4 Run Postcommand](#step-4-run-postcommand)
- [Command Matrix](#command-matrix)
- [Relevant Links](#relevant-links)
## Basic Startup Scenarios
Startup is a performance metric that measures the time to main (from process start to the `Main` method) of a running application. [Startup Tool](https://github.com/dotnet/performance/tree/main/src/tools/ScenarioMeasurement/Startup) is a test harness that measures throughput in general; its "TimeToMain" parser supports this metric and is used in all of the **Basic Startup Scenarios**.
[Scenarios Tests Guide](./scenarios-workflow.md) already walks through **startup time of an empty console template** as an example. For other startup scenarios, refer to [Command Matrix](#command-matrix).
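For instance, once a startup scenario's asset has been published (as in the guide's walkthrough), the measurement itself is started from the asset directory with:

```cmd
python3 test.py startup
```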
## Basic Size On Disk Scenarios
Size On Disk, as the name suggests, is a metric that recursively measures the sizes of a directory and its children. [SizeOnDisk Tool](https://github.com/dotnet/performance/tree/main/src/tools/ScenarioMeasurement/4Disk) is the test harness that provides this functionality and is used in all of the **Basic Size On Disk Scenarios**.
We will walk through the **Self-Contained Empty Console App Size On Disk** scenario as an example.
### Step 1 Initialize Environment
Same instructions as [Scenario Tests Guide - Step 1](./scenarios-workflow.md#step-1-initialize-environment).
### Step 2 Run Precommand
For **Self-Contained Empty Console App Size On Disk** scenario, run precommand to create an empty console template and publish it:
```cmd
cd emptyconsoletemplate
python3 pre.py publish -f net6.0 -c Release -r win-x64
```
`-f net6.0` targets the new template project at the `net6.0` framework; `-c Release` publishes in the Release configuration; `-r win-x64` takes an [RID](https://docs.microsoft.com/en-us/dotnet/core/rid-catalog) (Runtime Identifier) and specifies which runtime the app supports.
**Note that specifying the RID option `-r <RID>` publishes the app as an [SCD](https://docs.microsoft.com/en-us/dotnet/core/deploying/#publish-self-contained) (Self-Contained Deployment) app; without it, an [FDD](https://docs.microsoft.com/en-us/dotnet/core/deploying/#publish-framework-dependent) (Framework-Dependent Deployment) app is published.**
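For comparison, the FDD form of the same precommand simply omits the RID option (this matches the command used in the scenarios guide):

```cmd
python3 pre.py publish -f net6.0 -c Release
```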
Now there should be source code of the empty console template project under `app\` and published output under `pub\`.
### Step 3 Run Test
Now run the test:
```cmd
python3 test.py sod
```
[Size On Disk Tool](https://github.com/dotnet/performance/tree/main/src/tools/ScenarioMeasurement/4Disk) checks the default `pub\` directory and shows the sizes of the directory and its children:
```cmd
[2020/09/29 04:21:35][INFO] ----------------------------------------------
[2020/09/29 04:21:35][INFO] Initializing logger 2020-09-29 04:21:35.865708
[2020/09/29 04:21:35][INFO] ----------------------------------------------
@ -50,35 +73,41 @@ python3 test.py sod
[2020/09/29 04:21:36][INFO] pub\api-ms-win-core-file-l1-2-0.dll |18696.000 bytes |18696.000 bytes |18696.000 bytes
[2020/09/29 04:21:36][INFO] pub\api-ms-win-core-file-l2-1-0.dll |18696.000 bytes |18696.000 bytes |18696.000 bytes
```
### Step 4 Run Postcommand
Same instructions as [Scenario Tests Guide - Step 4](./scenarios-workflow.md#step-4-run-postcommand).
## Command Matrix
- \<tfm> values:
  - netcoreapp2.1
  - netcoreapp3.1
  - net5.0
  - net6.0
- \<-r RID> values:
- ""(WITHOUT `-r <RID>` --> FDD app)
- `"-r <RID>"` (WITH `-r` --> SCD app, [list of RID](https://docs.microsoft.com/en-us/dotnet/core/rid-catalog))
- ""(WITHOUT `-r <RID>` --> FDD app)
- `"-r <RID>"` (WITH `-r` --> SCD app, [list of RID](https://docs.microsoft.com/en-us/dotnet/core/rid-catalog))
| Scenario | Asset Directory | Precommand | Testcommand | Postcommand | Supported Framework | Supported Platform |
|-----------------------------------------------|-------------------------|-----------------------------------------------|-----------------|-------------|--------------------------------------------------|--------------------|
| Static Console Template Publish Startup | staticconsoletemplate | pre.py publish -f TFM -c Release | test.py startup | post.py | netcoreapp2.1;netcoreapp3.1;net5.0;net6.0 | Windows |
| Static Console Template Publish SizeOnDisk | staticconsoletemplate | pre.py publish -f TFM -c Release /<-r RID> | test.py sod | post.py | netcoreapp2.1;netcoreapp3.1;net5.0;net6.0 | Windows;Linux |
| Static Console Template Build SizeOnDisk | staticconsoletemplate | pre.py build -f TFM -c Release | test.py sod | post.py | netcoreapp2.1;netcoreapp3.1;net5.0;net6.0 | Windows;Linux |
| Static VB Console Template Publish Startup | staticvbconsoletemplate | pre.py publish -f TFM -c Release | test.py startup | post.py | netcoreapp2.1;netcoreapp3.1;net5.0;net6.0 | Windows |
| Static VB Console Template Publish SizeOnDisk | staticvbconsoletemplate | pre.py publish -f TFM -c Release /<-r RID> | test.py sod | post.py | netcoreapp2.1;netcoreapp3.1;net5.0;net6.0 | Windows;Linux |
| Static VB Console Template Build SizeOnDisk | staticvbconsoletemplate | pre.py build -f TFM -c Release | test.py sod | post.py | netcoreapp2.1;netcoreapp3.1;net5.0;net6.0 | Windows;Linux |
| Static Winforms Template Publish Startup | staticwinformstemplate | pre.py publish -f TFM -c Release | test.py startup | post.py | netcoreapp2.1;netcoreapp3.1 | Windows |
| Static Winforms Template Publish SizeOnDisk | staticwinformstemplate | pre.py publish -f TFM -c Release /<-r RID> | test.py sod | post.py | netcoreapp2.1;netcoreapp3.1 | Windows;Linux |
| Static Winforms Template Build SizeOnDisk | staticwinformstemplate | pre.py build -f TFM -c Release | test.py sod | post.py | netcoreapp2.1;netcoreapp3.1 | Windows;Linux |
| New Console Template Publish Startup | emptyconsoletemplate | pre.py publish -f TFM -c Release | test.py startup | post.py | netcoreapp2.1;netcoreapp3.1;net5.0;net6.0 | Windows |
| New Console Template Publish SizeOnDisk | emptyconsoletemplate | pre.py publish -f TFM -c Release /<-r RID> | test.py sod | post.py | netcoreapp2.1;netcoreapp3.1;net5.0;net6.0 | Windows;Linux |
| New Console Template Build SizeOnDisk | emptyconsoletemplate | pre.py build -f TFM -c Release | test.py sod | post.py | netcoreapp2.1;netcoreapp3.1;net5.0;net6.0 | Windows;Linux |
| New VB Console Template Publish Startup | emptyvbconsoletemplate | pre.py publish -f TFM -c Release | test.py startup | post.py | netcoreapp2.1;netcoreapp3.1;net5.0;net6.0 | Windows |
| New VB Console Template Publish SizeOnDisk | emptyvbconsoletemplate | pre.py publish -f TFM -c Release /<-r RID> | test.py sod | post.py | netcoreapp2.1;netcoreapp3.1;net5.0;net6.0 | Windows;Linux |
| New VB Console Template Build SizeOnDisk | emptyvbconsoletemplate | pre.py build -f TFM -c Release | test.py sod | post.py | netcoreapp2.1;netcoreapp3.1;net5.0;net6.0 | Windows;Linux |
## Relevant Links
- [SCD App](https://docs.microsoft.com/en-us/dotnet/core/deploying/#publish-self-contained)
- [FDD App](https://docs.microsoft.com/en-us/dotnet/core/deploying/#publish-framework-dependent)

View file

@ -2,17 +2,21 @@
## Table of Contents
- [Benchmarking workflow for dotnet/runtime repository](#benchmarking-workflow-for-dotnetruntime-repository)
- [Table of Contents](#table-of-contents)
- [Introduction](#introduction)
- [Code Organization](#code-organization)
- [dotnet runtime Prerequisites](#dotnet-runtime-prerequisites)
- [Preventing Regressions](#preventing-regressions)
- [Running against the latest .NET Core SDK](#running-against-the-latest-net-core-sdk)
- [Solving Regressions](#solving-regressions)
- [Repro Case](#repro-case)
- [Profiling](#profiling)
- [Running against Older Versions](#running-against-older-versions)
- [Confirmation](#confirmation)
- [Benchmarking new API](#benchmarking-new-api)
- [Reference](#reference)
- [PR](#pr)
## Introduction
@ -201,7 +205,6 @@ The next step is to send a PR to this repository with the aforementioned benchma
The real performance investigation starts with profiling. We have a comprehensive guide about profiling [dotnet/runtime](https://github.com/dotnet/runtime), and we really encourage you to read it: [Profiling dotnet/runtime workflow](./profiling-workflow-dotnet-runtime.md).
To profile the benchmarked code and produce an ETW Trace file ([read more](./benchmarkdotnet.md#Profiling)):
```cmd
@ -232,7 +235,6 @@ When you identify and fix the regression, you should use [ResultsComparer](../sr
Please take a moment to consider how the regression managed to enter the product. Are we now properly protected?
## Benchmarking new API
When developing new [dotnet/runtime](https://github.com/dotnet/runtime) features, we should be thinking about performance from day one. Part of doing this is writing benchmarks at the same time as our first unit tests. Keeping the benchmarks in a separate repository makes it a little harder to run the benchmarks against a new API, but it's still very easy.
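As a sketch of what such a benchmark can look like (the namespace, type, and method names below are illustrative, not taken from the repository):

```cs
using System;
using BenchmarkDotNet.Attributes;

namespace Illustrative.Benchmarks
{
    public class Perf_NewApi
    {
        // Input prepared once per benchmarked process, not per invocation.
        private readonly byte[] _bytes = new byte[1024];

        [Benchmark]
        public int SearchWithNewApi() => _bytes.AsSpan().IndexOf((byte)0xFF); // stand-in for the new API under test
    }
}
```

A class like this is added to the MicroBenchmarks project and then selected with the usual `--filter` expression.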
@ -275,7 +277,7 @@ The first thing you need to do is send a PR with the new API to the [dotnet/runt
```cmd
/home/adsitnik/projects/performance>python3 ./scripts/benchmarks_ci.py --filter $YourFilter -f net6.0
```
This script will try to pull the latest .NET Core SDK from the [dotnet/runtime](https://github.com/dotnet/runtime) nightly build, which should contain the new API that you just merged in your first PR, use it to build the MicroBenchmarks project, and then run the benchmarks that satisfy the filter you provided.
After you have confirmed that your benchmarks run successfully locally, your PR should be ready for the performance repo.

View file

@ -1,43 +1,61 @@
# Blazor Scenarios
An introduction to how to run scenario tests can be found in the [Scenarios Tests Guide](./scenarios-workflow.md). The current document has specific instructions to run:
- [New Blazorwasm Template Size On Disk](#new-blazorwasm-template-size-on-disk)
## New Blazorwasm Template Size On Disk
**New Blazorwasm Template Size On Disk** is a scenario test that measures the size of the published output of the blazorwasm template. In other words, our test harness *implicitly* calls
```cmd
dotnet new blazorwasm
```
and then
```cmd
dotnet publish -c Release -o pub
```
and measures the sizes of the `pub\` directory and its children with [SizeOnDisk Tool](https://github.com/dotnet/performance/tree/main/src/tools/ScenarioMeasurement/SizeOnDisk).
**For more information about scenario tests in general, see the [Scenario Tests Guide](./scenarios-workflow.md). The current document has specific instructions for running the blazor scenario tests.**
### Prerequisites
- python3 or newer
- dotnet runtime 5.0 or newer
### Step 1 Initialize Environment
Same instructions as [Step 1 in the Scenario Tests Guide](scenarios-workflow.md#step-1-initialize-environment).
### Step 2 Run Precommand
Run precommand to create and publish a new blazorwasm template:
```cmd
cd blazor
python3 pre.py publish --msbuild "/p:_TrimmerDumpDependencies=true"
```
Now there should be source code of the blazorwasm project under `app\` and published output under `pub\`. The `--msbuild "/p:_TrimmerDumpDependencies=true"` argument is optional and can be added to generate [linker dump](https://github.com/mono/linker/blob/main/src/analyzer/README.md) from the build, which will be saved to `blazor\app\obj\<Configuration>\<Runtime>\linked\linker-dependencies.xml.gz`.
### Step 3 Run Test
Run testcommand to measure the size on disk of the published output:
```cmd
py -3 test.py sod --scenario-name "SOD - New Blazor Template - Publish"
```
In the command, `sod` refers to the "Size On Disk" metric and [SizeOnDisk Tool](https://github.com/dotnet/performance/tree/main/src/tools/ScenarioMeasurement/SizeOnDisk) will be used for this scenario. Note that `--scenario-name` is optional and the value can be changed for your own reference.
The test output should look like the following:
```cmd
[2020/09/25 11:24:44][INFO] ----------------------------------------------
[2020/09/25 11:24:44][INFO] Initializing logger 2020-09-25 11:24:44.727500
[2020/09/25 11:24:44][INFO] ----------------------------------------------
@ -54,18 +72,21 @@ The test output should look like the following:
|72.000 count |72.000 count |72.000 count
[2020/09/25 11:24:46][INFO] Synthetic Wire Size - .br
```
[SizeOnDisk Tool](https://github.com/dotnet/performance/tree/main/src/tools/ScenarioMeasurement/SizeOnDisk) recursively measures the size of each folder and its children under the specified directory. In addition to the folders and files (path-like counters such as `pub\wwwroot\_framework\blazor.webassembly.js.gz`), it also generates aggregate counters for each file type (such as `Aggregate - .dll`). For this **New Blazorwasm Template Size On Disk** scenario, counters whose names start with `Synthetic Wire Size` are a counter type unique to blazorwasm; they simulate the size of the files actually transferred over the wire when the webpage loads.
### Step 4 Run Postcommand
Same instructions as [Step 4 in the Scenario Tests Guide](scenarios-workflow.md#step-4-run-postcommand).
## Command Matrix
For the purpose of quick reference, the commands can be summarized into the following matrix:
| Scenario | Asset Directory | Precommand | Testcommand | Postcommand | Supported Framework | Supported Platform |
|-------------------------------------|-----------------|-------------------------------------------------------------|-------------------------------------------------------------------|-------------|---------------------|--------------------|
| SOD - New Blazor Template - Publish | blazor | pre.py publish --msbuild "/p:_TrimmerDumpDependencies=true" | test.py sod --scenario-name "SOD - New Blazor Template - Publish" | post.py | net6.0 | Windows;Linux |
## Relevant Links
- [Blazorwasm](https://github.com/dotnet/aspnetcore/tree/main/src/Components)
- [IL Linker](https://github.com/mono/linker)

View file

@ -1,5 +1,6 @@
# Crossgen Scenarios
An introduction to how to run scenario tests can be found in the [Scenarios Tests Guide](./scenarios-workflow.md). The current document has specific instructions to run:
- [Crossgen Throughput Scenario](#crossgen-throughput-scenario)
@ -7,50 +8,67 @@ An introduction of how to run scenario tests can be found in [Scenarios Tests Gu
- [Size on Disk Scenario](#Size-on-Disk-Scenario)
## Before Running Any Scenario
### Prerequisites
- python3 or newer
- dotnet runtime 3.0 or newer
- terminal/command prompt **in Admin Mode** (for collecting kernel traces)
- clean state of the test machine (anti-virus scan is off and no other user program's running -- to minimize the influence of environment on the test)
### 1. Generate Core_Root
These performance tests use the built runtime test directory [Core_Root](https://github.com/dotnet/runtime/blob/main/docs/workflow/testing/using-corerun.md) for the crossgen tool itself and other runtime assemblies as compilation input. Core_Root is an intermediate output from the runtime build, which contains runtime assemblies and tools.
You can skip this step if you already have Core_Root. To generate Core_Root directory, first clone [dotnet/runtime repo](https://github.com/dotnet/runtime) and run:
```cmd
src\tests\build.cmd Release <arch> generatelayoutonly
```
See [the instructions for building coreclr tests](https://github.com/dotnet/runtime/blob/main/docs/workflow/testing/coreclr/windows-test-instructions.md) for more details on the build that creates the Core_Root directory.
If the build is successful, you should have a Core_Root directory at a path like:
```cmd
runtime\artifacts\tests\coreclr\<OS>.<Arch>.<BuildType>\Tests\Core_Root
```
### 2. Initialize Environment
Same instructions as [Scenario Tests Guide - Step 1](./scenarios-workflow.md#step-1-initialize-environment).
## Crossgen Throughput Scenario
**Crossgen Throughput** is a scenario test that measures the throughput of [crossgen compilation](https://github.com/dotnet/runtime/blob/main/docs/workflow/building/coreclr/crossgen.md). To be more specific, our test *implicitly* calls
```cmd
.\crossgen.exe <assembly to compile>
```
with other applicable arguments and measures its throughput. We will walk through crossgen compiling `System.Private.Xml.dll` as an example.
Ensure you've first followed the [preparatory steps](#Before-Running-Any-Scenario).
### 1. Run Precommand
For **Crossgen Throughput** scenario, unlike other scenarios there's no need to run any precommand (`pre.py`). Just switch to the test asset directory:
```terminal
cd crossgen
```
### 2. Run Test
Now run the test, in our example we use `System.Private.Xml.dll` under Core_Root as the input assembly to compile, and you can replace it with other assemblies **under Core_Root**.
```cmd
python3 test.py crossgen --core-root <path to core_root>\Core_Root --single System.Private.Xml.dll
```
This will run the test harness [Startup Tool](https://github.com/dotnet/performance/tree/main/src/tools/ScenarioMeasurement/Startup), which runs crossgen compilation in several iterations and measures its throughput. The result will be something like this:
```cmd
[2020/09/25 09:54:48][INFO] Parsing traces\Crossgen Throughput - System.Private.Xml.etl
[2020/09/25 09:54:49][INFO] Crossgen Throughput - System.Private.Xml.dll
[2020/09/25 09:54:49][INFO] Metric |Average |Min |Max
@ -59,40 +77,51 @@ This will run the test harness [Startup Tool](https://github.com/dotnet/performa
[2020/09/25 09:54:49][INFO] Time on Thread |7276.019 ms |7276.019 ms |7276.019 ms
```
### 3. Run Postcommand
Same instructions as [Scenario Tests Guide - Step 4](./scenarios-workflow.md#step-4-run-postcommand).
## Crossgen2 Throughput Scenario
Compared to `Crossgen Throughput` scenario, `Crossgen2 Throughput` Scenario measures more metrics, which are:
- Process Time (Throughput)
- Loading Interval
- Emitting Interval
- Jit Interval
- Compilation Interval
Steps to run **Crossgen2 Throughput** scenario are very similar to those of **Crossgen Throughput**. In addition to compilation of a single file, composite compilation is enabled in crossgen2, so the test command is different.
Ensure you've first followed the [preparatory steps](#Before-Running-Any-Scenario).
### 1. Run Precommand
Same as **Crossgen Throughput** scenario, there's no need to run any precommand (`pre.py`). Just switch to the test asset directory:
```cmd
cd crossgen2
```
### 2. Run Test
For the scenario that compiles a **single assembly**, we use `System.Private.Xml.dll` as an example; you can replace it with another assembly **under Core_Root**:
```cmd
python3 test.py crossgen2 --core-root <path to core_root>\Core_Root --single System.Private.Xml.dll
```
For the scenario that does **composite compilation**, we compile the majority of the runtime assemblies, represented by [framework-r2r.dll.rsp](https://github.com/dotnet/performance/blob/main/src/scenarios/crossgen2/framework-r2r.dll.rsp):
```cmd
python3 test.py crossgen2 --core-root <path to core_root>\Core_Root --composite <repo root>/src/scenarios/crossgen2/framework-r2r.dll.rsp
```
Note that for the composite scenario, the command line could exceed the maximum length if it took a list of paths to assemblies, so an `.rsp` file is used to avoid that. The `--composite <rsp file>` option refers to an rsp file that contains the list of assemblies to compile. A sample file, [framework-r2r.dll.rsp](https://github.com/dotnet/performance/blob/main/src/scenarios/crossgen2/framework-r2r.dll.rsp), can be found under the `crossgen2\` folder.
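An `.rsp` file is just a response file with one argument per line; a minimal hypothetical example (the paths below are illustrative) could look like:

```cmd
C:\runtime\artifacts\tests\coreclr\windows.x64.Release\Tests\Core_Root\System.Private.CoreLib.dll
C:\runtime\artifacts\tests\coreclr\windows.x64.Release\Tests\Core_Root\System.Runtime.dll
C:\runtime\artifacts\tests\coreclr\windows.x64.Release\Tests\Core_Root\System.Console.dll
```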
The test command runs the test harness [Startup Tool](https://github.com/dotnet/performance/tree/main/src/tools/ScenarioMeasurement/Startup), which runs crossgen2 compilation in several iterations and measures its throughput. The result should partially look like:
```cmd
[2020/09/25 10:25:09][INFO] Merging traces\Crossgen2 Throughput - Single - System.Private.perflabkernel.etl,traces\Crossgen2 Throughput - Single - System.Private.perflabuser.etl...
[2020/09/25 10:25:11][INFO] Trace Saved to traces\Crossgen2 Throughput - Single - System.Private.etl
[2020/09/25 10:25:11][INFO] Parsing traces\Crossgen2 Throughput - Single - System.Private.etl
@ -105,30 +134,39 @@ The test command runs the test harness [Startup Tool](https://github.com/dotnet/
[2020/09/25 10:25:15][INFO] Jit Interval |9464.402 ms |9464.402 ms |9464.402 ms
[2020/09/25 10:25:15][INFO] Compilation Interval |12827.350 ms |12827.350 ms |12827.350 ms
```
### 3. Run Postcommand
```cmd
python3 post.py
```
## Size on Disk Scenario
The size on disk scenario for crossgen/crossgen2 measures the sizes of generated ready-to-run images. These tests use the precommand to generate a ready-to-run image, then run the size-on-disk tool on the containing directory.
Ensure you've first followed the [preparatory steps](#Before-Running-Any-Scenario).
### 1. Run Precommand
```cmd
cd crossgen|crossgen2
python3 pre.py crossgen|crossgen2 --core-root <path to core_root> --single System.Private.Xml.dll
```
`--single` takes any framework assembly available in core_root.
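For example, assuming `System.Text.Json.dll` is present in your Core_Root, the crossgen2 form of the precommand would be:

```cmd
cd crossgen2
python3 pre.py crossgen2 --core-root <path to core_root> --single System.Text.Json.dll
```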
### 2. Run Test
For the scenario that compiles a **single assembly**, we use `System.Private.Xml.dll` as an example; you can replace it with another assembly **under Core_Root**:
```cmd
python3 test.py sod --dirs ./crossgen.out
```
The size-on-disk tool outputs an accounting of the file sizes under the crossgen output directory:
```cmd
[2020/10/26 19:00:06][INFO] Crossgen2 Size On Disk
[2020/10/26 19:00:06][INFO] Metric |Average |Min |Max
[2020/10/26 19:00:06][INFO] -----------------------------------------|------------------|------------------|------------------
@ -140,13 +178,15 @@ The size-on-disk tool outputs an accounting of the file sizes under the crossgen
[2020/10/26 19:00:06][INFO] Aggregate - .dll |8412672.000 bytes |8412672.000 bytes |8412672.000 bytes
[2020/10/26 19:00:06][INFO] Aggregate - .dll - Count |1.000 count |1.000 count |1.000 count
```
### 3. Run Postcommand
```cmd
python3 post.py
```
## Command Matrix
For the purpose of quick reference, the commands can be summarized into the following matrix:
| Scenario | Asset Directory | Precommand | Testcommand | Postcommand | Supported Framework | Supported Platform |
|----------------------------------------|-----------------|--------------------------------------------------------------------------------|-----------------------------------------------------------------------------------|-------------|---------------------|-------------------------|
@ -157,4 +197,5 @@ For the purpose of quick reference, the commands can be summarized into the foll
| Crossgen2 Size on Disk | crossgen2 | pre.py crossgen2 --core-root \<path to Core_Root> --single \<assembly name> | test.py sod --dirs crossgen.out | post.py | N/A | Windows-x64;Linux |
## Relevant Links
[Crossgen2 Compilation Structure Enhancements](https://github.com/dotnet/runtime/blob/main/docs/design/features/crossgen2-compilation-structure-enhancements.md)

View file

@ -13,31 +13,34 @@
## Table of Contents
- [Microbenchmark Design Guidelines](#microbenchmark-design-guidelines)
- [General Overview](#general-overview)
- [Table of Contents](#table-of-contents)
- [Mindset](#mindset)
- [Benchmarks are not Unit Tests](#benchmarks-are-not-unit-tests)
- [Benchmarks are Immutable](#benchmarks-are-immutable)
- [BenchmarkDotNet](#benchmarkdotnet)
- [Setup](#setup)
- [GlobalSetup](#globalsetup)
- [IterationSetup](#iterationsetup)
- [OperationsPerInvoke](#operationsperinvoke)
- [Test Cases](#test-cases)
- [Code Paths](#code-paths)
- [Array.Reverse](#arrayreverse)
- [Buffer.CopyMemory](#buffercopymemory)
- [Always the same input data](#always-the-same-input-data)
- [BenchmarkDotNet](#benchmarkdotnet-1)
- [Arguments](#arguments)
- [Params](#params)
- [ArgumentsSource](#argumentssource)
- [Generic benchmarks](#generic-benchmarks)
- [Best Practices](#best-practices)
- [Single Responsibility Principle](#single-responsibility-principle)
- [No Side-Effects](#no-side-effects)
- [Dead Code Elimination](#dead-code-elimination)
- [Loops](#loops)
- [Method Inlining](#method-inlining)
- [Be explicit](#be-explicit)
## Mindset
@ -47,7 +50,7 @@ Writing Benchmarks is much different than writing Unit Tests. So before you star
When writing Unit Tests, we ideally want to test all methods and properties of the given type. We also test both the happy and unhappy paths. The result of every Unit Test run is a single value: passed or failed.
Benchmarks are different. First of all, the result of a benchmark run is never a single value. It's a whole distribution, described with values like mean, standard deviation, min, max and so on. To get a meaningful distribution, the benchmark has to be executed many, many times. **This takes a lot of time**. With the current [recommended settings](https://github.com/dotnet/performance/blob/51d8f8483b139bb1edde97f917fa436671693f6f/src/harness/BenchmarkDotNet.Extensions/RecommendedConfig.cs#L17-L20) used in this repository, it takes on average six seconds to run a single benchmark.
The public API surface of .NET Standard 2.0 has tens of thousands of methods. If we had one benchmark for every public method, it would take two and a half days to run them all, not to mention the time it would take to analyze the results, filter out the false positives, and so on.
This is only one of the reasons why writing Benchmarks is different than writing Unit Tests.
@ -58,7 +61,7 @@ The goal of benchmarking is to test the performance of all the methods that are
The results of benchmark runs are exported to an internal Reporting System. Every benchmark is identified using the following `xUnit` ID pattern:
```cmd
namespace.typeName.methodName(paramName: paramValue)
```
@ -76,7 +79,7 @@ If you have some good reasons for changing the implementation of the benchmark y
BenchmarkDotNet is the benchmarking harness used in this repository. If you are new to BenchmarkDotNet, you should read [this introduction to BenchmarkDotNet](./benchmarkdotnet.md).
Key things that you need to remember:
* BenchmarkDotNet **does not require the user to provide the number of iterations and invocations per iteration**, it implements a smart heuristic based on standard error and runs the benchmark until the results are stable.
* BenchmarkDotNet runs every benchmark in a separate process, process isolation allows avoiding side-effects. The more memory allocated by given benchmark, the bigger the difference for in-proc vs out-proc execution.
@ -172,7 +175,7 @@ public OperationStatus Base64EncodeInPlace() => Base64.EncodeToUtf8InPlace(_dest
If you want to get a better understanding of it, you should read [this blog post](https://aakinshin.net/posts/stopwatch/#pitfalls) about stopwatch and follow the GitHub discussions in this [issue](https://github.com/dotnet/BenchmarkDotNet/issues/730) and [PR](https://github.com/dotnet/BenchmarkDotNet/pull/760).
**If using `[GlobalSetup]` is enough, you should NOT be using `[IterationSetup]`**
### OperationsPerInvoke
@ -231,7 +234,7 @@ public Span<byte> Slice()
BenchmarkDotNet is going to scale the result by the number provided in `OperationsPerInvoke` so the cost of creating the `Span` is going to be amortized:
```cmd
reportedResult = 1/16*SpanCtor + 1*Slice
```
@ -277,10 +280,10 @@ public static void Reverse<T>(T[] array, int index, int length)
Does it make sense to test the code paths that throw?
* No, because we would be measuring the performance of throwing and catching the exceptions. That was not the goal of this benchmark.
* No, because throwing exceptions should be exceptional and [exceptions should not be used to control flow](https://docs.microsoft.com/en-US/visualstudio/profiling/da0007-avoid-using-exceptions-for-control-flow?view=vs-2019). It's an edge case, we should focus on [common use cases, not edge cases](#Benchmarks-are-not-Unit-Tests).
Should we test the code path for an array with one or zero elements?
* No, because it does not perform any actual work. We would be benchmarking a branch and return from the method. If `Reverse` is inlinable, such a benchmark would be measuring the performance of `if (length <= 1)` and the throw checks.
* No, because it's not a common case. Moreover, it's very unlikely that removing this check from the code would pass the [dotnet/runtime](https://github.com/dotnet/runtime) repository code review and regress the performance in the future.
@ -524,7 +527,7 @@ public void Add() => _numbers.Add(12345);
In this particular benchmark, the list is growing with every benchmark invocation. `List<T>` is internally using an `Array` to store all the elements. When the array is not big enough to store one more element, a two times bigger array is allocated and all elements are copied from the old array to the new one. It means that every next `Add` operation takes more time. We might also get `OutOfMemoryException` at some point in time.
**Benchmarks should not have any side effects**.
### Dead Code Elimination

View file

@ -2,41 +2,42 @@
## Table of Contents
- [Profiling workflow for dotnet/runtime repository](#profiling-workflow-for-dotnetruntime-repository)
- [Table of Contents](#table-of-contents)
- [Introduction](#introduction)
- [Prerequisites](#prerequisites)
- [Build](#build)
- [Repro](#repro)
- [Project Settings](#project-settings)
- [Visual Studio Profiler](#visual-studio-profiler)
- [dotnet](#dotnet)
- [CPU Usage](#cpu-usage)
- [CoreRun](#corerun)
- [Allocation Tracking](#allocation-tracking)
- [PerfView](#perfview)
- [CPU Investigation](#cpu-investigation)
- [Filtering](#filtering)
- [Analyzing the Results](#analyzing-the-results)
- [Viewing Source Code](#viewing-source-code)
- [Identifying Regressions](#identifying-regressions)
- [VTune](#vtune)
- [When to use](#when-to-use)
- [Identifying Hotspots](#identifying-hotspots)
- [Troubleshooting](#troubleshooting)
- [Code](#code)
- [Skids](#skids)
- [Linux](#linux)
- [PerfCollect](#perfcollect)
- [Preparing Your Machine](#preparing-your-machine)
- [Preparing Repro](#preparing-repro)
- [Collecting a Trace](#collecting-a-trace)
- [Analyzing the Trace](#analyzing-the-trace)
## Introduction
**This doc explains how to profile local [dotnet/runtime](https://github.com/dotnet/runtime) builds and is targeted at [dotnet/runtime](https://github.com/dotnet/runtime) repository contributors.**
Before you start any performance investigation, you need to [build](#Build) [dotnet/runtime](https://github.com/dotnet/runtime) in **Release**, create a small [repro](#Repro) app and change the default [project settings](#Project-Settings). If you want to profile a BenchmarkDotNet test (like those in this repo), [BenchmarkDotNet has a built-in profiling option](https://github.com/dotnet/performance/blob/main/docs/benchmarkdotnet.md#profiling) to collect a trace.
The next step is to choose the right profiler depending on the OS:
@ -49,9 +50,9 @@ The next step is to choose the right profiler depending on the OS:
If you clearly need information on CPU instruction level, then depending on the hardware you should use [Intel VTune](#VTune) or [AMD uProf](https://developer.amd.com/amd-uprof/).
## Prerequisites
### Build
You need to build [dotnet/runtime](https://github.com/dotnet/runtime) in Release first:
@ -81,7 +82,7 @@ Once you rebuild the part of [dotnet/runtime](https://github.com/dotnet/runtime)
C:\Projects\runtime\src\libraries\System.Text.RegularExpressions\src> dotnet msbuild /p:Configuration=Release
```
### Repro
The next step is to prepare a small console app that executes the code that you want to profile. The app **should run for at least a few seconds** and **keep the overhead as small as possible to make sure it does not dominate the profile**.
@ -124,7 +125,7 @@ class Program
}
```
### Project Settings
It's recommended to disable Tiered JIT (to avoid the need of warmup) and emit full symbols (not enabled by default for Release builds):
@ -145,13 +146,13 @@ It's recommended to disable Tiered JIT (to avoid the need of warmup) and emit fu
```
## Visual Studio Profiler
Visual Studio Profiler is not as powerful as PerfView, but it's definitely more intuitive to use. If you don't know which profiler to use, you should use it by default.
To profile a local build of [dotnet/runtime](https://github.com/dotnet/runtime) and get symbol solving working in Visual Studio Profiler you can use the produced `dotnet` or `CoreRun`.
### dotnet
Following script launches a Visual Studio solution with environment variables required to use a local version of the .NET Core SDK:
@ -180,7 +181,7 @@ You can just save it as `startvs.cmd` file and run providing path to the `testho
startvs.cmd "C:\Projects\runtime\artifacts\bin\testhost\net6.0-windows-Release-x64\" "C:\Projects\repro\ProfilingDocs.sln"
```
### CPU Usage
Once you started the VS with the right environment variables you need to click on the `Debug` menu item and then choose `Performance Profiler` or just press `Alt+F2`:
@ -226,7 +227,7 @@ If you have configured everything properly you are able to see the CPU time spen
![External code](img/vs_profiler_7_source_code.png)
### CoreRun
If you prefer to use CoreRun instead of dotnet you need to select `Launch an executable`
@ -242,8 +243,7 @@ The alternative is to run the repro app using CoreRun yourself and use VS Profil
![Choose CoreRun process](img/vs_profiler_11_corerun_attach.png)
### Allocation Tracking
Since `DateTime.UtcNow` does not allocate managed memory, we are going to profile a different app:
@ -283,13 +283,13 @@ Again, if you have configured everything properly you are able to right click on
![Actual Source File](img/vs_profiler_15_memory_source_file.png)
## PerfView
PerfView is the ultimate .NET profiler, and if you are new to PerfView **it's recommended to read its tutorial or watch the tutorial [videos](https://channel9.msdn.com/Series/PerfView-Tutorial)**.
![Welcome Screen](img/perfview_0_welcome.png)
### CPU Investigation
We can **Collect** profile data by either **Run**ning a standalone executable (or command) or **Collect**ing the data machine wide with explicit start and stop.
@ -321,7 +321,7 @@ The `Metric/Interval` is a quick measurement of how CPU bound the trace is as a
![CPU Metric](img/perfview_7_cpu_metric.png)
### Filtering
Fundamentally, what is collected by the PerfView profiler is a sequence of stacks. A stack is collected every millisecond for each hardware processor on the machine. This is very detailed information and hence by default PerfView groups the stacks. This is very useful when you are profiling a real-world application in a production environment, but when you work on the .NET Team and you profile some simple repro app you care about all details and you don't want the results to be grouped by modules.
@ -357,7 +357,7 @@ As you can see, all the methods that were executed before the first and after la
This simple text representation of histogram can be very useful when profiling more complex scenarios, but in this case it just shows us that `DateTime.UtcNow` was executed all the time. But this is exactly what we wanted!
### Analyzing the Results
Once we get the data filtered we can start the analysis.
@ -379,7 +379,7 @@ If you wish you can see the entire `Call Tree` by clicking on the `Call Tree` t
![Flame Graph](img/perfview_19_flame_graph.png)
The graph starts at the bottom. Each box represents a method in the stack (inclusive CPU time). Every parent is the caller, children are the callees. The wider the box, the more time it was on-CPU.
For the leaf nodes the inclusive time == exclusive time. The difference between the parent and children box width (marked with red on the image below) is the exclusive parent (caller) time.
@ -389,7 +389,7 @@ parent.InclusiveTime - children.InclusiveTime = parent.ExclusiveTime
![Flame Graph Exclusive time](img/perfview_20_flame_graph_exclusive_time.png)
### Viewing Source Code
If you want to view the Source Code of the given method you need to right-click on it and select `Goto Source (Def)` menu item. Or just press `Alt+D`.
@ -401,7 +401,7 @@ If PerfView fails to show you the source code you should read the `Log` output.
**Note:** As of today, PerfView keeps the `.pdb` files [opened](https://github.com/microsoft/perfview/pull/979) after showing the source code. It means that if you keep the trace file opened in PerfView and try to rebuild [dotnet/runtime](https://github.com/dotnet/runtime) the build is going to fail. You might need to close PerfView to rebuild [dotnet/runtime](https://github.com/dotnet/runtime).
### Identifying Regressions
PerfView has a built-in support for identifying regressions. To use it you need to:
@ -415,7 +415,7 @@ PerfView has a built-in support for identifying regressions. To use it you need
It's recommended to use it instead of trying to eyeball complex Flame Graphs.
## VTune
Intel VTune is a very powerful profiler that allows for low-level profiling:
@ -425,9 +425,9 @@ Intel VTune is a very powerful profiler that allows for low-level profiling:
VTune **supports Windows, Linux and macOS!**
### When to use
Let's use PerfView to profile the following app that tries to reproduce [Potential regression: Dictionary of Value Types #25842](https://github.com/dotnet/coreclr/issues/25842):
```cs
using System.Collections.Generic;
@ -482,7 +482,7 @@ When we open the Flame Graph we can see that the Call Stack ends at `FindEntry`
**You should start every investigation with VS Profiler or PerfView. When you get to a point where you clearly need information on CPU instruction level and you are using Intel hardware, use VTune.**
### Identifying Hotspots
Run VTune **as Administrator|sudo** and click `New Project`:
@ -538,7 +538,7 @@ To go to the next hotsopot you can to click the `Go to Smaller Function Hotspot`
![Go to hotspot](img/vtune_goto_hotspot.png)
### Troubleshooting
If you ever run into any problem with VTune, you should check the `Collection Log`:
@ -546,7 +546,7 @@ If you ever run into any problem with VTune, you should check the `Collection Lo
If the error message does not tell you anything and you can't find any similar reports on the internet, you can ask for help on the [Intel VTune Amplifier forum](https://software.intel.com/en-us/forums/intel-vtune-amplifier).
### Code
VTune is capable of showing not only the output assembly code but also native and managed source code.
@ -558,7 +558,7 @@ If it ever fails to show the source code (the `Source` button is then greyed out
![Specify Sources](img/vtune_folders.png)
### Skids
Hardware Event-Based Sampling is vulnerable to [skids](https://github.com/brendangregg/skid-testing). When the event occurs, the counter increments and when it reaches the max interval value the event is fired with **current** Instruction Pointer. As an example we can use following source code:
@ -570,9 +570,9 @@ The profiler shows that a lot of inclusive CPU time was spent on the `xor` opera
![Skids](img/vtune_skids.png)
### Linux
VTune works great on Linux and as of today it's the only fully featured profiler that works with .NET Core on Linux.
It works best when installed and run as `sudo`:
@ -594,7 +594,7 @@ It can show the disassembly of profiled methods:
![VTune Linux ASM](img/vtune_linux_asm.png)
## PerfCollect
PerfCollect is a simple, yet very powerful script that allows for profiling .NET Core apps on Linux. It is internally leveraging LTTng and using perf.
@ -602,7 +602,7 @@ In contrary to `dotnet trace` it gives you native call stacks which are very use
It has its own excellent [documentation](https://github.com/dotnet/runtime/blob/main/docs/project/linux-performance-tracing.md) (a **highly recommended read**); the goal of this doc is not to duplicate it, but rather to show **how to profile a local [dotnet/runtime](https://github.com/dotnet/runtime) build running on a Linux VM from a Windows developer machine**. We need two OSes because as of today only PerfView is capable of opening a `PerfCollect` trace file.
### Preparing Your Machine
You need to install the script, make it an executable and run as sudo with `install` parameter to install all the dependencies.
@ -612,7 +612,7 @@ chmod +x perfcollect
sudo ./perfcollect install
```
### Preparing Repro
Before you collect a trace, you need to prepare a [Repro](#Repro). As of today, `PerfCollect` does not give you the possibility to run a standalone executable. It collects the data machine wide with explicit start and stop. The simplest way to create a repro app is to simply put the code that you want to profile inside a `while(true)` loop.
@ -648,7 +648,7 @@ namespace ProfilingDocs
scp -r "C:\Users\adsitnik\source\repos\ProfilingDocs\ProfilingDocs\bin\Release\netcoreapp3.1\ProfilingDocs.dll" adsitnik@11.222.33.444:/home/adsitnik/Projects/coreclr/bin/tests/Linux.x64.Release/Tests/Core_Root/ProfilingDocs.dll
```
### Collecting a Trace
To collect a trace, you need to open two terminals:
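The steps elided from this diff follow the standard `perfcollect` pattern; a sketch (the trace name, environment variables, and `corerun` invocation are illustrative -- `COMPlus_` is the prefix for .NET Core 3.x, newer runtimes use `DOTNET_`):

```bash
# Terminal 1: start collection; stop it with Ctrl+C once the repro has run long enough
sudo ./perfcollect collect sampleTrace

# Terminal 2: enable map/event generation for managed code, then run the repro
export COMPlus_PerfMapEnabled=1
export COMPlus_EnableEventLog=1
./corerun ProfilingDocs.dll
```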
@ -689,7 +689,7 @@ Trace saved to slowStartsWith.trace.zip
![PerfCollect Demo](img/perfcollect_demo.gif)
### Analyzing the Trace
As mentioned previously, currently only PerfView is capable of opening a `PerfCollect` trace file. So to analyze the trace file you need to copy it to a Windows machine. You can do that by using `scp`.

View file

@ -1,6 +1,6 @@
# Scenario Tests Guide
## Overview
Our existing scenario tests are under `src\scenarios` in this repo, where each subdirectory contains a test asset that can be combined with a specific set of commands to do measurements. Currently we have scenario tests for [SDK](./sdk-scenarios.md), [Crossgen](./crossgen-scenarios.md), [Blazor](./blazor-scenarios.md) and [other scenarios](./basic-scenarios.md).
@ -9,84 +9,107 @@ Our existing scenario tests are under `src\scenarios` in this repo, where each s
This is a general guideline on how the scenario tests are arranged in this repo. We will walk through it by measuring the **startup time of an empty console template** as a sample scenario. For other scenarios, refer to the following links:
- [How to run SDK scenario tests](./sdk-scenarios.md)
- [How to run Crossgen scenario tests](./crossgen-scenarios.md)
- [How to run Blazor tests](./blazor-scenarios.md)
- [How to run other Scenario tests](./basic-scenarios.md)
### Prerequisites
- python3 or newer
- dotnet runtime 3.0 or newer
- terminal/command prompt **in Admin Mode** (for collecting kernel traces)
- clean state of the test machine (anti-virus scan is off and no other user program's running -- to minimize the influence of environment on the test)
### Step 1 Initialize Environment
Before running the test, it is important to choose the right version of dotnet to test. Follow the guidance below to set up `PYTHONPATH` (to run our Python test harness) and dotnet directory for the desired test environment. This step is applicable to all scenarios and can only be run once for one environment.
Go to `src\scenarios` and run the following command:
#### Windows
Start a new PowerShell environment ***in Admin Mode*** and run:
```cmd
cd src\scenarios
.\init.ps1
```
The next steps will need to run in the same Powershell environment. You can also specify custom dotnet directory or download a new dotnet to use. Add `-Help` option for more information.
#### Linux
Start a new bash terminal ***with Root Access*** and run:
```bash
cd src/scenarios
. ./init.sh
```
The next steps will need to run in the same bash environment. You can also specify custom dotnet directory or download a new dotnet to use. Add `-h` or `-help` option for more information.
### Step 2 Run Precommand
Now you have `PYTHONPATH` set and dotnet to test in `PATH`, the next step is to run precommand to set up the specific test asset. Precommand is necessary for some scenarios and different test assets require different commands. **NOTE: for each test asset, not all commands are supported. Please refer to [Command Matrix](#command-matrix) for available scenarios.**
For some scenarios (not all), `pre.py` runs a defined precommand before the test run, which can (but is not limited to) set up the asset by either creating a new template or using a static template.
Format for running precommands:
```cmd
cd <asset directory> # switch to the specific asset directory
```
#### Windows
```cmd
py pre.py <command> <options> # run precommand
```
#### Linux
```bash
python3 pre.py <command> <options> # run precommand
```
In our **startup time of an empty console template** example, we can run
```cmd
cd emptyconsoletemplate
python3 pre.py publish -f net6.0 -c Release
```
The above command creates a new dotnet console template in the `emptyconsoletemplate\app\` folder, builds the project targeting net6.0 in Release, and publishes it to the `emptyconsoletemplate\pub\` folder.
Run `python3 pre.py --help` for more command options and their meanings.
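Conceptually, the precommand above is roughly equivalent to the following dotnet CLI calls (a simplified sketch; the harness adds more arguments and bookkeeping of its own):

```cmd
dotnet new console -o app
dotnet publish app -f net6.0 -c Release -o pub
```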
### Step 3 Run Test
Upon this step, the project source code should exist under `app\` directory. There should be published output under `pub\` if the precommand is "publish", and built output under `bin\` if the precommand is "build". Now the test should be ready to run. **NOTE: for each test asset, not all commands are supported. Please refer to [Command Matrix](#command-matrix) for available scenarios.**
`test.py` runs the test with a set of defined attributes.
Format for running test commands:
#### Windows
```cmd
py test.py <command> <test-specific options>
```
#### Linux
```bash
python3 test.py <command> <test-specific options>
```
In our **startup time of an empty console template** example, we can run
```cmd
python3 test.py startup
```
The above command runs the published app under `pub\` with specified iterations and measures its startup time.
The test report and traces are saved into the `emptyconsoletemplate\traces\` directory.
@ -96,13 +119,16 @@ Run `python3 test.py --help` for more command options and their meanings.
Optionally, run `post.py` to clean up the artifacts. It's the same command for all scenarios.
```cmd
py -3 post.py
```
The above command removes the `app`, `bin`, `traces`, `pub`, and `tmp` directories if they were generated.
### Command Matrix
Some command options are only applicable for certain test assets. Refer to the command matrix for each scenario category for a list of available command combinations:
- [SDK Command Matrix](./sdk-scenarios.md#command-matrix)
- [Crossgen Command Matrix](./crossgen-scenarios.md#command-matrix)
- [Blazor Command Matrix](./blazor-scenarios.md#command-matrix)
@ -1,11 +1,17 @@
# SDK Scenarios
An introduction to running scenario tests can be found in the [Scenarios Tests Guide](./scenarios-workflow.md). The current document has specific instructions to run:
- [SDK Build Throughput Scenario](#sdk-build-throughput-scenario)
## SDK Build Throughput Scenario
**SDK Build Throughput** is a scenario test that measures the throughput of the SDK build process. To be more specific, our test *implicitly calls*
```cmd
dotnet build <project>
```
with other applicable arguments and measures its throughput.
There are 2 types of SDK build --- *Clean Build* and *Build No Change*.
@ -14,28 +20,37 @@ There are 2 types of SDK build --- *Clean Build* and *Build No Change*.
- *Build No Change* simulates the build after the first-time-ever build. The test harness runs a warmup build which leaves the binaries, without cleanup between iterations.
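For intuition, *Build No Change* is roughly equivalent to timing a second `dotnet build` of an already-built project. A minimal sketch of what the harness effectively measures (the project path is illustrative; real runs are driven through `test.py` as shown below):

```cmd
:: warmup build leaves the binaries in place
dotnet build app\app.csproj
:: subsequent builds are measured, with no cleanup in between
dotnet build app\app.csproj
```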
### Prerequisites
- make sure the test directory is clean of artifacts (you can run `post.py` to remove existing artifact folders from the last run)
- Python 3 or newer
- dotnet runtime 3.1 or newer
- terminal/command prompt **in Admin Mode** (for collecting kernel traces)
- clean state of the test machine (anti-virus scans are off and no other user programs are running -- to minimize the influence of the environment on the test)
We will walk through the **SDK Console Template** scenario as an example.
### Step 1 Initialize Environment
Follow the same instructions as [Scenario Tests Guide - Step 1](./scenarios-workflow.md#step-1-initialize-environment).
### Step 2 Run Precommand
If you are not running the test for the first time and an `app\` folder exists under the asset directory, make sure it's removed so the previous test artifacts won't be reused. You can use `post.py` to clean it up:
```cmd
python3 post.py
```
Run the precommand to create a new console template.
```cmd
cd emptyconsoletemplate
python3 pre.py default -f net6.0
```
The `default` command prepares the asset (creating a new project if the asset is a template and copying it to `app\`). Now there should be source code of a console template project under `app\`.
Note that the SDK takes different paths for different TFMs, and you can configure which TFM your SDK tests against. However, your SDK version should be >= the TFM version, because the SDK cannot build a project that targets a newer runtime. Here's a matrix of valid SDK vs. TFM combinations:
| | netcoreapp2.1 | netcoreapp3.1 | net5.0 | net6.0 |
|--------------|---------------|---------------|--------|--------|
@ -44,22 +59,27 @@ Note that it is important to be aware SDK takes different paths for different TF
| .NET 5 SDK | x | x | x | |
| .NET 6 SDK | x | x | x | x |
You can change the TFM of the project by specifying `-f <tfm>`, which replaces the `<TargetFramework></TargetFramework>` property in the project file with the custom TFM value you specified (make sure it's a valid TFM value).
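For example, to retarget the template at net5.0 instead, one could clean up the previous asset and recreate it:

```cmd
:: remove the previous asset, then recreate it targeting net5.0
python3 post.py
python3 pre.py default -f net5.0
```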
### Step 3 Run Testcommand
Run the testcommand to measure the throughput of the SDK build.
For *Clean Build* test, run:
```cmd
python3 test.py sdk clean_build
```
For *Build No Change* test, run:
```cmd
python3 test.py sdk build_no_change
```
The test result should look like the following:
```cmd
[2020/09/27 23:51:22][INFO] Merging traces\emptycsconsoletemplate_SDK_build_no_change_startup.perflabkernel.etl...
[2020/09/27 23:51:22][INFO] Trace Saved to traces\emptycsconsoletemplate_SDK_build_no_change_startup.etl
[2020/09/27 23:51:22][INFO] Parsing traces\emptycsconsoletemplate_SDK_build_no_change_startup.etl
@ -71,18 +91,19 @@ The test result should look like the following:
```
### Step 4 Run Postcommand
Follow the same instructions as [Step 4 in Scenario Tests Guide](scenarios-workflow.md#step-4-run-postcommand).
## Command Matrix
- \<tfm> values:
  - netcoreapp2.1
  - netcoreapp3.1
  - net5.0
  - net6.0
- \<build option> values:
  - clean_build
  - build_no_change
| Scenario | Asset Directory | Precommand | Testcommand | Postcommand | Supported Framework | Supported Platform |
|:------------------------------|:---------------------|:-------------------------|:----------------------------|:------------|:------------------------------------------|:-------------------|
@ -95,4 +116,3 @@ Same instruction of [Step 4 in Scenario Tests Guide](scenarios-workflow.md#step-
| SDK Windows Forms Template | windowsforms | pre.py default -f \<tfm> | test.py sdk \<build option> | post.py | netcoreapp3.1 | Windows |
| SDK WPF Template | wpf | pre.py default -f \<tfm> | test.py sdk \<build option> | post.py | netcoreapp3.1 | Windows |
| SDK New Console | emptyconsoletemplate | N/A | test.py sdk new_console | post.py | netcoreapp2.1;netcoreapp3.1;net5.0;net6.0 | Windows;Linux |
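For example, per the matrix above, a full *Clean Build* run for the SDK Windows Forms template (Windows only, with netcoreapp3.1 as its only supported TFM) would look roughly like this:

```cmd
cd windowsforms
python3 pre.py default -f netcoreapp3.1
python3 test.py sdk clean_build
python3 post.py
```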
@ -15,7 +15,7 @@ The general workflow when using the GC infra is:
NOTE: If running under ARM/ARM64, the program's functionalities are limited to running certain benchmarks only and the setup process is slightly different. This is pointed out as necessary throughout this document. Look out for the _ARM NOTE_ labels. As for the other necessary tools without official ARM/ARM64 downloads (e.g. python, cmake), you can install and run the x86 versions.
## Setup
### Install python 3.7+
@ -23,7 +23,7 @@ You will need at least version 3.7 of Python.
WARN: Python 3.8.0 is [not compatible](https://github.com/jupyter/notebook/issues/4613) with Jupyter Notebook on Windows.
This should be fixed in 3.8.1.
On Windows, just go to <https://www.python.org/downloads/> and run the installer.
It's recommended to install a 64-bit version if possible, but not required.
On other systems, it's better to use your system's package manager.
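For example, on Debian/Ubuntu something like the following usually suffices (package names vary by distro; just make sure the packaged version is 3.7 or newer):

```bash
sudo apt-get install python3 python3-pip
```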
@ -97,7 +97,7 @@ total_physical_memory_mb:
Most (if not all) of these fields can be retrieved from your machine's _Task Manager_ and under _System_ within _Control Panel_.
## Tutorial
## Specifying tests/builds
@ -212,13 +212,13 @@ On Linux, only tests with containers require super user privileges.
You might get errors due to `dotnet` or `dotnet-trace` not being found. Or you might see an error:
```text
A fatal error occurred. The required library libhostfxr.so could not be found.
```
Or:
```text
A fatal error occurred, the default install location cannot be obtained.
```
@ -382,7 +382,7 @@ Now you know how to create, run, and analyze a test.
In many cases, all you need to use the infra is to manually modify a benchfile, then `run` and `diff` it.
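A minimal sketch of that loop, using the suite benchfile referenced elsewhere in these docs (a sketch only -- paths are illustrative, and results land in a `.yaml.out` directory next to the benchfile):

```sh
# run the benchmarks described in the benchfile
py . run bench/suite/low_memory_container.yaml
# then compare the resulting traces on one or more run-metrics
py . diff <baseline-trace>.etl <new-trace>.etl --run-metrics FirstToLastGCSeconds
```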
## Metrics
Analysis commands are based on metrics.
@ -403,6 +403,7 @@ py . analyze-single bench/suite/low_memory_container.yaml.out/defgcperfsim__a__o
```
Alternatively, this can also be done by using the process and path arguments:
```sh
py . analyze-single --process name:corerun --path bench\suite\low_memory_container.etl.out\defgcperfsim__a__only_config__tlgb0.2__0.etl --run-metrics FirstToLastGCSeconds --single-gc-metrics DurationMSec --single-heap-metrics InMB OutMB
```
@ -487,11 +488,11 @@ The output will look like:
As you can see, the run-metrics appear only once for the whole trace, the single-gc-metrics have different values for each GC, and the single-heap-metrics have a different value for each different heap in each GC.
## GCPerfSim
Although benchmarks can run any executable, they will usually run GCPerfSim. You can read its documentation in the [source](src/exec/GCPerfSim/GCPerfSim.cs).
## Running Without Traces
Normally tests are run while collecting events for advanced analysis.
@ -499,12 +500,12 @@ If you set `collect: none` in the `options` section of your [benchfile](docs/ben
If you don't have a trace, you are limited in the metrics you can use. No single-heap or single-gc-metrics are available since individual GCs aren't collected. However, GCPerfSim outputs information at the end which is stored in the test status file (a `.yaml` file with the same name as the trace file would have). You can view those metrics in the section "float metrics that only require test status" [here](docs/metrics.md).
## Limitations
* ARM/ARM64 are only supported to run basic tests (See above for further details).
* The `affinitize` and `memory_load_percent` properties of a benchfile's config are not yet implemented outside of Windows.
## Further Reading
See [example](docs/example.md) for a more detailed example involving more commands.
@ -515,7 +516,7 @@ Before modifying benchfiles, you should read [bench_file](docs/bench_file.md) wh
Commands can be run in a Jupyter notebook instead of on the command line. See [jupyter notebook](docs/jupyter%20notebook.md).
## Terms
### Metric
@ -548,6 +549,6 @@ May be an ETL or netperf file.
ETL files come from using PerfView to collect ETW events, which is the default on Windows.
Netperf files come from using dotnet-trace, which uses EventPipe. This is the only option on non-Windows systems.
## Contributing
See [contributing](docs/contributing.md).
@ -1,3 +1,5 @@
# Benchfile Description
(This file is generated by `py . lint`)
A benchfile lets us vary three different things: coreclrs, configs, and benchmarks.
@ -52,9 +54,7 @@ benchmarks:
```
## Detailed documentation of each type
## BenchFile
@ -98,8 +98,6 @@ benchmarks: `Mapping[str, [Benchmark](#Benchmark)]`
scores: `Mapping[str, Mapping[str, [ScoreElement](#ScoreElement)]] | None`
Mapping from an (arbitrary) score name to its specifier.
## BenchOptions
collect: `"none" | "gc" | "verbose" | "cpu_samples" | "thread_times" | None`
@ -147,8 +145,6 @@ always_use_dotnet_trace: `bool | None`
Has no effect on non-Windows.
Has no effect if `collect` is `none`.
## Benchmark
executable: `str | None`
@ -171,9 +167,8 @@ max_seconds: `float | None`
only_configs: `Sequence[str] | None`
Only run the test against configs with one of the specified names.
## Config
Allows setting environment variables, as well as container and memory load options.
WARN: Normally complus environment variables are specified in hexadecimal on the command line.
But when specifying them in a yaml file, use decimal.
@ -232,13 +227,13 @@ complus_tieredcompilation: `bool | None`
Set to true to enable tiered compilation
complus_bgcfltuningenabled: `bool | None`
Set to true to enable <https://github.com/dotnet/coreclr/pull/26695>
complus_bgcmemgoal: `int | None`
See comment on <https://github.com/dotnet/coreclr/pull/26695>
complus_bgcmemgoalslack: `int | None`
See comment on <https://github.com/dotnet/coreclr/pull/26695>
complus_gcconcurrentfinalization: `bool | None`
Enable concurrent finalization (not available in normal coreclr builds)
@ -259,8 +254,6 @@ coreclr_specific: `Mapping[str, [ConfigOptions](#ConfigOptions)] | None`
Maps coreclr name to config options for only that coreclr.
If present, should have an entry for every coreclr.
## ConfigOptions
complus_gcserver: `bool | None`
@ -317,13 +310,13 @@ complus_tieredcompilation: `bool | None`
Set to true to enable tiered compilation
complus_bgcfltuningenabled: `bool | None`
Set to true to enable <https://github.com/dotnet/coreclr/pull/26695>
complus_bgcmemgoal: `int | None`
See comment on <https://github.com/dotnet/coreclr/pull/26695>
complus_bgcmemgoalslack: `int | None`
See comment on <https://github.com/dotnet/coreclr/pull/26695>
complus_gcconcurrentfinalization: `bool | None`
Enable concurrent finalization (not available in normal coreclr builds)
@ -340,8 +333,6 @@ affinitize: `bool | None`
memory_load: `[MemoryLoadOptions](#MemoryLoadOptions) | None`
If set, the test runner will launch a second process that ensures this percentage of the system's memory is consumed.
## ConfigsVaryBy
name: `str`
@ -351,8 +342,6 @@ default_values: `Mapping[str, float] | None`
Value that coreclr would use without an explicit config.
Key is a coreclr name.
## CoreclrSpecifier
self_contained: `bool | None`
@ -386,9 +375,8 @@ architecture: `"amd64" | "x86" | "arm64" | "arm32" | None`
CoreRun.exe has the correct bitness corresponding to this architecture.
On non-Windows this is just for information.
## GCPerfSimArgs
Represents the arguments to GCPerfSim.
Read the GCPerfSim source for documentation.
@ -412,7 +400,6 @@ pohfi: `int`
allocType: `"simple" | "reference"`
testKind: `"time" | "highSurvival"`
## MemoryLoadOptions
percent: `float`
@ -422,8 +409,6 @@ no_readjust: `bool | None`
If true, the memory load process will never allocate or free any more memory after it's started.
If false, it will allocate or free in order to keep the system's memory at `percent`.
## ScoreElement
weight: `float`
@ -439,9 +424,8 @@ weight: `float`
par: `float | None`
Expected normal value for this metric.
## TestConfigContainer
Options for running the test in a container.
A container is a cgroup, or a job object on Windows.
Docker containers are not yet implemented.
@ -1,6 +1,6 @@
# Building Core_Root from the Runtime Repo
First, you need a clone of [dotnet/runtime](https://github.com/dotnet/runtime).
Note that it may be on an arbitrary commit (including one you might be currently working on).
Once you are ready to build the `Core_Root`, follow these steps.
@ -17,11 +17,13 @@ If you want to build for another OS/Architecture, use the `-os/--os` and `-arch/
From the root of the `runtime` repo, issue the following command:
On Windows
```powershell
.\build.cmd -s clr+libs -lc Release -rc Release
```
On Linux
```sh
./build.sh -s clr+libs -lc Release -rc Release
```
@ -45,11 +47,13 @@ At this point, the stage has been set to generate the `Core_Root`.
This time, move to the tests directory: `/runtime/src/tests/`. There, issue the following command:
On Windows
```powershell
.\build.cmd Release generatelayoutonly
```
On Linux
```sh
./build.sh -release -generatelayoutonly
```
@ -9,15 +9,12 @@ Each command can have up to one “name optional” argument, which should come
If an argument is a list (or tuple), the values should be separated by spaces. E.g., `py . diff a.etl b.etl --run-metrics HeapSizeBeforeMB_Mean HeapSizeAfterMB_Mean` . Here `a.etl` and `b.etl` are the values for the name-optional argument (`paths`), and `HeapSizeBeforeMB_Mean` and `HeapSizeAfterMB_Mean` are the values for the argument run-metrics .
## Boolean arguments
A boolean argument can be specified like `--arg true` or `--arg false`.
For convenience, `--arg` is shorthand for `--arg true` and all boolean arguments are optional and default to false. So if the argument exists it is true, else it is false.
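For instance (using a hypothetical command and boolean flag, named purely for illustration):

```sh
py . some-command --overwrite   # same as --overwrite true
py . some-command               # --overwrite is omitted, so it defaults to false
```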
## Argsfiles
For convenience, you could store some arguments in a file.
@ -43,12 +40,13 @@ And then run:
py . diff --argsfile bench/diff_low_memory_container.yaml
```
## gc-where arguments
Many commands have arguments ending in `gc-where` which take the following syntax:
```sh
py . analyze-single foo.etl --gc-where "Generation=2" "PauseDurationMSec>100"
```
In this case the argument filters it so we only print GCs that are gen 2 and took over 100ms.
The value of `gc-where` consists of a number of space-separated filters.
@ -60,7 +58,9 @@ The value may be a number or (unquoted) string.
You can also put any number of `or` in between the clauses, as in:
```sh
py . analyze-single foo.etl --gc-where "Generation=2" "PauseDurationMSec>100" or "Generation=0" "PauseDurationMSec<10"
```
This would print GCs that are either long gen2 or short gen0.
Any more complicated filters should be written manually in the code.
@ -1,4 +1,6 @@
# Contribution guide
## Adding commands
To add a new command, you need to add it to the `ALL_COMMANDS` mapping in `all_commands.py`.
@ -6,7 +8,7 @@ That file contains a sample "greet" command which shows how to create a command.
If the command outputs to the console, it's recommended to create a `Document` (from `document.py`) and then call `print_document`, instead of calling `print` directly and formatting text yourself. This makes it easier to construct tables and will format your text to the terminal's width.
## Adding a New Metric
You'll need to modify `run_metrics.md`, `single_gc_metrics.md` or `single_heap_metrics.md`.
@ -16,7 +18,7 @@ The value is some function type. Preferably, make this function an instance prop
Metrics always return a `Result` -- this allows the metric to fail without causing an entire command to exit with an exception. `FloatValue` is a result of a `float`. If the metric can't fail, return e.g. `float` instead of `FloatValue`, and convert the function to a `Result`-returning function with `ok_of_property`.
## Code Quality
We should make sure the code is clean with respect to the linter. Run `py . lint` to make sure it is clean. For now, do not worry about upgrading the dependencies as suggested by the linter; it won't work.
@ -24,18 +26,18 @@ When GCPerfSim is modified, it is important to run the full default suite with b
A full example on how to do this is [found here](modifying_and_testing_gcperfsim.md).
## C# and C dependencies
Non-Python code is handled by `build.py` which builds C# and C dependencies.
When you modify C# or C code (or dlls they depend on), they should automatically be rebuilt.
The code for building C dependencies is Windows-specific as currently only Windows needs these dependencies.
## Using a Custom TraceEvent
You may need to modify TraceEvent (which is part of PerfView) when working `managed-lib`, which uses it heavily. To do this:
* Check out the PerfView repository, make your changes, and build.
* In `src/analysis/managed-lib/GCPerf.csproj`, you can see there are two dependencies lists, one tagged `<!-- NUGET -->` and one tagged `<!-- LOCAL -->`. You can comment out the `NUGET` one and use `LOCAL` instead.
* Run `py . update-perfview-dlls path/to/perfview`, where `path/to/perfview` is the path to your PerfView checkout.
This does not have any effect immediately; you'll still need to do a rebuild in a later step.
* Uncomment `#define NEW_JOIN_ANALYSIS` in `Analysis.cs` and `MoreAnalysis.cs` in `src/analysis/managed-lib`.
@ -36,7 +36,7 @@ The following functionalities are supported:
for a function within a given time range.
### Requirements and Setup
To begin with, capture a trace with CPU Samples enabled. If you use GC Infra, then
you're good to go. If you capture it elsewhere, make sure to include the `process_name`
and the `seconds_taken` fields in the [test status yaml file](test_status_files.md).
@ -292,4 +292,4 @@ functions:
* To validate: Use `.is_ok()` or `.is_err()`.
* To extract the value: Use `.value`.
* If you know what behavior happened, you can also use `.ok()` and `.err()` respectively.
@ -1,3 +1,5 @@
# Metrics
(This file is generated by `py . lint`)
The following list does not include aggregate metrics.
@ -22,8 +24,7 @@ For example, `PctIsBlockingGen2` is a run-metric aggregating the single-gc-metri
A boolean aggregate can have a `Where` part too, as in `PctIsNonConcurrentWhereIsGen2`,
which tells you what percentage of Gen2 GCs were blocking.
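Such an aggregate can be requested like any other run-metric (a sketch; the trace path is illustrative):

```sh
py . analyze-single foo.etl --run-metrics PctIsNonConcurrentWhereIsGen2
```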
## single-heap-metrics
no bool metrics
@ -135,7 +136,7 @@ TotalJoinMSec
TotalMarkMSec
TotalMarkPromotedMB
TotalStolenMSec
Sum of each time the processor was stolen for this heap's thread.
adjust_handle_age_compact
adjust_handle_age_sweep
after_absorb
@ -190,8 +191,7 @@ verify_objects_done
waiting_in_join
working
## single-gc-metrics
## bool metrics
@ -326,12 +326,11 @@ StartMSec
SuspendDurationMSec
SuspendToGCStartMSec
TotalGCTime
WARN: Only works in increments of 1MS, may error on smaller GCs
Type
Value of GCType enum
## run-metrics
no bool metrics
@ -2,10 +2,12 @@
You can try `py -m pip install pythonnet`. If that works, you're done, but it may fail with:
```text
error: option --single-version-externally-managed not recognized
```
The fix is to install from source.
Instructions can be found [here](https://github.com/pythonnet/pythonnet/wiki/Installation).
## Troubleshooting on all operating systems
@ -18,7 +20,6 @@ To fix this, go to the *parent* directory and use `sudo python3.7 -m pip install
To verify that installation worked, run `import clr` in the python interpreter.
## Troubleshooting on Windows
### Remove shebang
@ -27,31 +28,34 @@ First: Running `py setup.py ...` may do nothing due to the shebang at the start
launching `python`, which opens the Windows Store or does nothing if given arguments.
This may be fixed by removing the shebang `#!/usr/bin/env python` at the start of `setup.py`.
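One way to drop that first line on Windows is a small PowerShell one-liner (a sketch; adjust the path to wherever your pythonnet checkout lives):

```powershell
# remove the first line (the shebang) from setup.py
(Get-Content setup.py | Select-Object -Skip 1) | Set-Content setup.py
```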
### You may need VS2015
There may be an error like:
```text
Cannot find the specified version of msbuild: '14'
```
or:
```text
Could not load file or assembly 'Microsoft.Build.Utilities, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a'
```
(The latter error message often begins by mentioning `RGiesecke.DllExport.targets`.)
If so, you may need to install Visual Studio 2015 (exactly, not a higher version, found [here](https://my.visualstudio.com/Downloads?q=visual%20studio%202015)).
This is not used as the main build tool, but it installs components that are apparently required.
### Modify setup.py to support a newer VS
You may see an error:
```text
MSBuild >=15.0 could not be found.
```
Actually, pythonnet's `setup.py` is hardcoded to work *only* with VS 15 (which is VS 2017) and not with higher versions such as 16 (VS 2019).
(Also, "Visual Studio Build Tools 2017" alone does not seem to be sufficient, you need the full IDE.)
But you can change this by modifying `setup.py`.
@ -68,18 +72,20 @@ return res
There don't appear to be any problems from using 2019 instead of 2017.
## Troubleshooting on non-Windows systems
### Downgrade mono
Pythonnet [does not work](https://github.com/pythonnet/pythonnet/issues/939) with the latest version of mono, so you'll need to downgrade that to version 5.
On Ubuntu the instructions are:
* Change `/etc/apt/sources.list.d/mono-official-stable.list` to:
```bash
deb https://download.mono-project.com/repo/ubuntu stable-bionic/snapshots/5.20.1 main
```
* `sudo apt remove mono-complete`
* `sudo apt update`
* `sudo apt autoremove`
@ -88,13 +94,13 @@ deb https://download.mono-project.com/repo/ubuntu stable-bionic/snapshots/5.20.1
Then to install from source:
### May be missing Python dev tools
If you see an error:
```text
fatal error: Python.h: No such file or directory
```
You likely have python installed but not development tools.
See <https://stackoverflow.com/questions/21530577/fatal-error-python-h-no-such-file-or-directory>.
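On Debian/Ubuntu, installing the headers usually looks like this (the exact package name depends on the interpreter version you use):

```bash
sudo apt-get install python3-dev   # or e.g. python3.7-dev to match the interpreter used above
```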
@ -7,14 +7,14 @@ This folder contains benchmarks of the most popular serializers.
* XML
* [XmlSerializer](https://docs.microsoft.com/en-us/dotnet/api/system.xml.serialization.xmlserializer) `4.3.0`
* JSON
* [DataContractJsonSerializer](https://docs.microsoft.com/en-us/dotnet/api/system.runtime.serialization.json.datacontractjsonserializer) `4.3.0`
* [Jil](https://github.com/kevin-montrose/Jil) `2.17.0`
* [JSON.NET](https://github.com/JamesNK/Newtonsoft.Json) `12.0.2`
* [Utf8Json](https://github.com/neuecc/Utf8Json) `1.3.7`
* Binary
* [BinaryFormatter](https://docs.microsoft.com/en-us/dotnet/api/system.runtime.serialization.formatters.binary.binaryformatter) `4.3.0`
* [MessagePack](https://github.com/neuecc/MessagePack-CSharp) `1.7.3.7`
* [protobuf-net](https://github.com/mgravell/protobuf-net) `2.4.0`
Missing: protobuf from Google and Bond from Microsoft
@ -30,4 +30,4 @@ Data Contracts were copied from a real Web App - [allReady](https://github.com/H
## Design Decisions
1. We want to compare "apples to apples", so the benchmarks are divided into few groups: `ToStream`, `FromStream`, `ToString`, `FromString`.
2. Stream benchmarks write to a pre-allocated MemoryStream, so the allocated bytes columns include only the cost of serialization.
@ -4,7 +4,7 @@ MICROSOFT PROVIDES THE DATASETS ON AN "AS IS" BASIS. MICROSOFT MAKES NO WARRANTI
The datasets are provided under the original terms that Microsoft received such datasets. See below for more information about each dataset.
## Wikipedia Detox
>This dataset is provided under [CC0](https://creativecommons.org/share-your-work/public-domain/cc0/). Redistributing the dataset "wikipedia-detox-250-line-data.tsv" with attribution:
>
@ -16,7 +16,7 @@ The datasets are provided under the original terms that Microsoft received such
>
>Original readme: https://meta.wikimedia.org/wiki/Research:Detox
## UCI Iris Flower Dataset
>Redistributing the dataset "iris.txt" with attribution:
>
@ -26,17 +26,17 @@ The datasets are provided under the original terms that Microsoft received such
>
>https://archive.ics.uci.edu/ml/datasets/iris
## Breast Cancer Wisconsin
Redistributing the dataset "breast-cancer.txt" with attribution:
> O. L. Mangasarian and W. H. Wolberg: "Cancer diagnosis via linear programming", SIAM News, Volume 23, Number 5, September 1990, pp 1 & 18.
>
> Original source: http://ftp.cs.wisc.edu:80/math-prog/cpo-dataset/machine-learn/cancer/cancer1/datacum
>
> Original readme: http://ftp.cs.wisc.edu/math-prog/cpo-dataset/machine-learn/cancer/cancer1/data.doc
## UCI Adult Dataset
> Redistributing the dataset "adult.tiny.with-schema.txt" with attribution:
>
@ -3,17 +3,20 @@
This simple tool allows for easy comparison of provided benchmark results.
It can be used to compare:
* historical results (e.g. before and after my changes)
* results for different OSes (e.g. Windows vs Ubuntu)
* results for different CPU architectures (e.g. x64 vs ARM64)
* results for different target frameworks (e.g. .NET Core 3.1 vs 5.0)
All you need to provide is:
* `--base` - path to folder/file with baseline results
* `--diff` - path to folder/file with diff results
* `--threshold` - threshold for Statistical Test. Examples: 5%, 10ms, 100ns, 1s
Optional arguments:
* `--top` - filter the diff to top/bottom `N` results
* `--noise` - noise threshold for the Statistical Test. The difference between 1.0ns and 1.1ns is 10%, but it's just noise. Examples: 0.5ns 1ns. The default value is 0.3ns.
* `--csv` - path to exported CSV results. Optional.