Docs - Update SuperBench documents (#101)

Update SuperBench documents.
Yifan Xiong 2021-06-25 13:40:14 +08:00 committed by GitHub
Parent c0c43b8f81
Commit 832e392f91
No key found matching this signature
GPG key ID: 4AEE18F83AFDEB23
13 changed files: 780 additions and 298 deletions

View file

@@ -11,6 +11,9 @@ insert_final_newline = true
[*.py]
max_line_length = 120
[*.{js,jsx,ts,tsx,md,mdx,css}]
indent_size = 2
[*.{yml,yaml}]
indent_size = 2

README.md (301 changed lines)
View file

@@ -1,4 +1,4 @@
-# SuperBenchmark
+# SuperBench
[![MIT licensed](https://img.shields.io/badge/license-MIT-brightgreen.svg)](LICENSE)
[![Lint](https://github.com/microsoft/superbenchmark/workflows/Lint/badge.svg)](https://github.com/microsoft/superbenchmark/actions?query=workflow%3ALint)
@@ -9,304 +9,9 @@
| cpu-unit-test | [![Build Status](https://dev.azure.com/msrasrg/SuperBenchmark/_apis/build/status/microsoft.superbenchmark?branchName=main)](https://dev.azure.com/msrasrg/SuperBenchmark/_build/latest?definitionId=77&branchName=main) |
| gpu-unit-test | [![Build Status](https://dev.azure.com/msrasrg/SuperBenchmark/_apis/build/status/cuda-unit-test?branchName=main)](https://dev.azure.com/msrasrg/SuperBenchmark/_build/latest?definitionId=80&branchName=main) |
-__SuperBench__ is a validation and profiling tool for AI infrastructure.
+**SuperBench** is a validation and profiling tool for AI infrastructure, which supports:
* AI infrastructure validation and diagnosis
  * Distributed validation tools to validate hundreds or thousands of servers automatically
  * Consider both raw hardware and E2E model performance with ML workload patterns
  * Build a contract to identify hardware issues
  * Provide infrastructure-oriented criteria as Performance/Quality Gates for hardware and system release
  * Provide detailed performance reports and advanced analysis tools
* AI workload benchmarking and profiling
  * Provide comprehensive performance comparison between different existing hardware
  * Provide insights for hardware and software co-design
It includes micro-benchmarks for primitive computation and communication benchmarking,
and model-benchmarks to measure domain-aware end-to-end deep learning workloads.
> 🔴 __Note__:
SuperBench is in the early pre-alpha stage for open source, and not ready for the general public yet.
If you want to jump in early, you can try building the latest code yourself.
## SuperBench capabilities, workflow and benchmarking metrics
The following graphic shows the capabilities provided by the SuperBench core framework and its extensions.
<img src="imgs/superbench_structure.png">
Benchmarking metrics provided by SuperBench are listed below.
<table>
<tbody>
<tr align="center" valign="bottom">
<td>
</td>
<td>
<b>Micro Benchmark</b>
<img src="imgs/bar.png"/>
</td>
<td>
<b>Model Benchmark</b>
<img src="imgs/bar.png"/>
</td>
</tr>
<tr valign="top">
<td align="center" valign="middle">
<b>Metrics</b>
</td>
<td>
<ul><li><b>Computation Benchmark</b></li>
<ul><li><b>Kernel Performance</b></li>
<ul>
<li>GFLOPS</li>
<li>TensorCore</li>
<li>cuBLAS</li>
<li>cuDNN</li>
</ul>
</ul>
<ul><li><b>Kernel Launch Time</b></li>
<ul>
<li>Kernel_Launch_Event_Time</li>
<li>Kernel_Launch_Wall_Time</li>
</ul>
</ul>
<ul><li><b>Operator Performance</b></li>
<ul><li>MatMul</li><li>Sharding_MatMul</li></ul>
</ul>
<ul><li><b>Memory</b></li>
<ul><li>H2D_Mem_BW_&lt;GPU ID&gt;</li>
<li>D2H_Mem_BW_&lt;GPU ID&gt;</li></ul>
</ul>
</ul>
<ul><li><b>Communication Benchmark</b></li>
<ul><li><b>Device P2P Bandwidth</b></li>
<ul><li>P2P_BW_Max</li><li>P2P_BW_Min</li><li>P2P_BW_Avg</li></ul>
</ul>
<ul><li><b>RDMA</b></li>
<ul><li>RDMA_Peak</li><li>RDMA_Avg</li></ul>
</ul>
<ul><li><b>NCCL</b></li>
<ul><li>NCCL_AllReduce</li></ul>
<ul><li>NCCL_AllGather</li></ul>
<ul><li>NCCL_broadcast</li></ul>
<ul><li>NCCL_reduce</li></ul>
<ul><li>NCCL_reduce_scatter</li></ul>
</ul>
</ul>
<ul><li><b>Computation-Communication Benchmark</b></li>
<ul><li><b>Mul_During_NCCL</b></li><li><b>MatMul_During_NCCL</b></li></ul>
</ul>
<ul><li><b>Storage Benchmark</b></li>
<ul><li><b>Disk</b></li>
<ul>
<li>Read/Write</li><li>Rand_Read/Rand_Write</li>
<li>R/W_Read</li><li>R/W_Write</li><li>Rand_R/W_Read</li><li>Rand_R/W_Write</li>
</ul>
</ul>
</ul>
</td>
<td>
<ul><li><b>CNN models</b></li>
<ul>
<li><b>ResNet</b></li>
<ul><li>ResNet-50</li><li>ResNet-101</li><li>ResNet-152</li></ul>
</ul>
<ul>
<li><b>DenseNet</b></li>
<ul><li>DenseNet-169</li><li>DenseNet-201</li></ul>
</ul>
<ul>
<li><b>VGG</b></li>
<ul><li>VGG-11</li><li>VGG-13</li><li>VGG-16</li><li>VGG-19</li></ul>
</ul>
<ul><li><b>Other CNN models</b></li><ul><li>...</li></ul></ul>
</ul>
<ul><li><b>BERT models</b></li>
<ul><li><b>BERT</b></li><li><b>BERT_LARGE</b></li></ul>
</ul>
<ul><li><b>LSTM</b></li></ul>
<ul><li><b>GPT-2</b></li></ul>
</td>
</tr>
</tbody>
</table>
## Installation
### Using Docker (_Preferred_)
__System Requirements__
* Platform: Ubuntu 18.04 or later (64-bit)
* Docker: Docker CE 19.03 or later
__Install SuperBench__
* Using Pre-Built Images
```sh
docker pull superbench/superbench:dev-cuda11.1.1
docker run -it --rm \
--privileged --net=host --ipc=host --gpus=all \
superbench/superbench:dev-cuda11.1.1 bash
```
* Building the Image
```sh
docker build -f dockerfile/cuda11.1.1.dockerfile -t superbench/superbench:dev .
```
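Then you can start a container from the locally built image in the same way as the pre-built one (the `superbench/superbench:dev` tag matches the build command above):
```sh
docker run -it --rm \
    --privileged --net=host --ipc=host --gpus=all \
    superbench/superbench:dev bash
```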
### Using Python
__System Requirements__
* Platform: Ubuntu 18.04 or later (64-bit); Windows 10 (64-bit) with WSL2
* Python: Python 3.6 or later, pip 18.0 or later
Check whether the Python environment is already configured:
```sh
# check Python version
python3 --version
# check pip version
python3 -m pip --version
```
If not, install the following:
* [Python](https://www.python.org/)
* [pip](https://pip.pypa.io/en/stable/installing/)
* [venv](https://docs.python.org/3/library/venv.html)
It's recommended to use a virtual environment (optional):
```sh
# create a new virtual environment
python3 -m venv --system-site-packages ./venv
# activate the virtual environment
source ./venv/bin/activate
# exit the virtual environment later
# after you finish running superbench
deactivate
```
__Install SuperBench__
* PyPI Binary
```sh
# not available yet
```
* From Source
```sh
# get source code
git clone https://github.com/microsoft/superbenchmark
cd superbenchmark
# install superbench
python3 -m pip install .
make postinstall
```
## Usage
### Run SuperBench
```sh
# run benchmarks in default settings
sb exec
# use a custom config
sb exec --config-file ./superbench/config/default.yaml
```
### Benchmark Gallery
Please find more benchmark examples [here](examples/benchmarks/).
## Developer Guide
If you want to develop a new feature, please follow the steps below to set up the development environment.
### Check Environment
Follow __[System Requirements](#using-python)__.
### Set Up
```sh
# get latest code
git clone https://github.com/microsoft/superbenchmark
cd superbenchmark
# install superbench
python3 -m pip install -e .[dev,test]
```
### Lint and Test
```sh
# format code using yapf
python3 setup.py format
# check code style with mypy and flake8
python3 setup.py lint
# run all unit tests
python3 setup.py test
```
### Submit a Pull Request
Please install `pre-commit` before `git commit` to run all pre-checks.
```sh
pre-commit install
```
Open a pull request to the main branch on GitHub.
## Contributing
### Contributor License Agreement
This project welcomes contributions and suggestions. Most contributions require you to agree to a
Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us
the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide
a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions
provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.
### Contributing principles
SuperBenchmark is an open-source project. Your participation and contribution are highly appreciated. There are several important things you need to know before contributing to this project:
#### What content can be added to SuperBenchmark
1. Bug fixes for existing features.
2. New features for the benchmark module (micro-benchmark, model-benchmark, etc.).
If you would like to contribute a new feature to SuperBenchmark, please submit your proposal first. In [GitHub Issues](https://github.com/microsoft/superbenchmark/issues), choose `Enhancement Request` to submit it. If the proposal is accepted, you can submit pull requests to the origin main branch.
#### Contribution steps
If you would like to contribute to the project, please follow the steps below for joint development on GitHub.
1. `Fork` the repo first to your personal GitHub account.
2. Check out from the main branch for feature development.
3. When you finish the feature, please fetch the latest code from the origin repo, merge it into your branch, and resolve any conflicts.
4. Submit a pull request to the origin main branch.
5. Please note that reviewers may leave comments or questions; please help address them by updating the pull request.
_Check the SuperBench website for more details._
## Trademarks


View file

@@ -0,0 +1,113 @@
---
id: micro-benchmarks
---
# Micro Benchmarks
## Benchmarking list
### Computation benchmark
### Communication benchmark
### Computation-communication benchmark
### Storage benchmark
## Benchmarking metrics
<table>
<tbody>
<tr align="center" valign="bottom">
<td>
</td>
<td>
<b>Micro Benchmark</b>
<img src={require('../assets/bar.png').default}/>
</td>
<td>
<b>Model Benchmark</b>
<img src={require('../assets/bar.png').default}/>
</td>
</tr>
<tr valign="top">
<td align="center" valign="middle">
<b>Metrics</b>
</td>
<td>
<ul><li><b>Computation Benchmark</b></li>
<ul><li><b>Kernel Performance</b></li>
<ul>
<li>GFLOPS</li>
<li>TensorCore</li>
<li>cuBLAS</li>
<li>cuDNN</li>
</ul>
</ul>
<ul><li><b>Kernel Launch Time</b></li>
<ul>
<li>Kernel_Launch_Event_Time</li>
<li>Kernel_Launch_Wall_Time</li>
</ul>
</ul>
<ul><li><b>Operator Performance</b></li>
<ul><li>MatMul</li><li>Sharding_MatMul</li></ul>
</ul>
<ul><li><b>Memory</b></li>
<ul><li>H2D_Mem_BW_&lt;GPU ID&gt;</li>
<li>D2H_Mem_BW_&lt;GPU ID&gt;</li></ul>
</ul>
</ul>
<ul><li><b>Communication Benchmark</b></li>
<ul><li><b>Device P2P Bandwidth</b></li>
<ul><li>P2P_BW_Max</li><li>P2P_BW_Min</li><li>P2P_BW_Avg</li></ul>
</ul>
<ul><li><b>RDMA</b></li>
<ul><li>RDMA_Peak</li><li>RDMA_Avg</li></ul>
</ul>
<ul><li><b>NCCL</b></li>
<ul><li>NCCL_AllReduce</li></ul>
<ul><li>NCCL_AllGather</li></ul>
<ul><li>NCCL_broadcast</li></ul>
<ul><li>NCCL_reduce</li></ul>
<ul><li>NCCL_reduce_scatter</li></ul>
</ul>
</ul>
<ul><li><b>Computation-Communication Benchmark</b></li>
<ul><li><b>Mul_During_NCCL</b></li><li><b>MatMul_During_NCCL</b></li></ul>
</ul>
<ul><li><b>Storage Benchmark</b></li>
<ul><li><b>Disk</b></li>
<ul>
<li>Read/Write</li><li>Rand_Read/Rand_Write</li>
<li>R/W_Read</li><li>R/W_Write</li><li>Rand_R/W_Read</li><li>Rand_R/W_Write</li>
</ul>
</ul>
</ul>
</td>
<td>
<ul><li><b>CNN models</b></li>
<ul>
<li><b>ResNet</b></li>
<ul><li>ResNet-50</li><li>ResNet-101</li><li>ResNet-152</li></ul>
</ul>
<ul>
<li><b>DenseNet</b></li>
<ul><li>DenseNet-169</li><li>DenseNet-201</li></ul>
</ul>
<ul>
<li><b>VGG</b></li>
<ul><li>VGG-11</li><li>VGG-13</li><li>VGG-16</li><li>VGG-19</li></ul>
</ul>
<ul><li><b>Other CNN models</b></li><ul><li>...</li></ul></ul>
</ul>
<ul><li><b>BERT models</b></li>
<ul><li><b>BERT</b></li><li><b>BERT_LARGE</b></li></ul>
</ul>
<ul><li><b>LSTM</b></li></ul>
<ul><li><b>GPT-2</b></li></ul>
</td>
</tr>
</tbody>
</table>

View file

@@ -0,0 +1,113 @@
---
id: model-benchmarks
---
# Model Benchmarks
## Benchmarking list
### GPT-2 models
### BERT models
### LSTM models
### CNN models
## Benchmarking metrics
<table>
<tbody>
<tr align="center" valign="bottom">
<td>
</td>
<td>
<b>Micro Benchmark</b>
<img src={require('../assets/bar.png').default}/>
</td>
<td>
<b>Model Benchmark</b>
<img src={require('../assets/bar.png').default}/>
</td>
</tr>
<tr valign="top">
<td align="center" valign="middle">
<b>Metrics</b>
</td>
<td>
<ul><li><b>Computation Benchmark</b></li>
<ul><li><b>Kernel Performance</b></li>
<ul>
<li>GFLOPS</li>
<li>TensorCore</li>
<li>cuBLAS</li>
<li>cuDNN</li>
</ul>
</ul>
<ul><li><b>Kernel Launch Time</b></li>
<ul>
<li>Kernel_Launch_Event_Time</li>
<li>Kernel_Launch_Wall_Time</li>
</ul>
</ul>
<ul><li><b>Operator Performance</b></li>
<ul><li>MatMul</li><li>Sharding_MatMul</li></ul>
</ul>
<ul><li><b>Memory</b></li>
<ul><li>H2D_Mem_BW_&lt;GPU ID&gt;</li>
<li>D2H_Mem_BW_&lt;GPU ID&gt;</li></ul>
</ul>
</ul>
<ul><li><b>Communication Benchmark</b></li>
<ul><li><b>Device P2P Bandwidth</b></li>
<ul><li>P2P_BW_Max</li><li>P2P_BW_Min</li><li>P2P_BW_Avg</li></ul>
</ul>
<ul><li><b>RDMA</b></li>
<ul><li>RDMA_Peak</li><li>RDMA_Avg</li></ul>
</ul>
<ul><li><b>NCCL</b></li>
<ul><li>NCCL_AllReduce</li></ul>
<ul><li>NCCL_AllGather</li></ul>
<ul><li>NCCL_broadcast</li></ul>
<ul><li>NCCL_reduce</li></ul>
<ul><li>NCCL_reduce_scatter</li></ul>
</ul>
</ul>
<ul><li><b>Computation-Communication Benchmark</b></li>
<ul><li><b>Mul_During_NCCL</b></li><li><b>MatMul_During_NCCL</b></li></ul>
</ul>
<ul><li><b>Storage Benchmark</b></li>
<ul><li><b>Disk</b></li>
<ul>
<li>Read/Write</li><li>Rand_Read/Rand_Write</li>
<li>R/W_Read</li><li>R/W_Write</li><li>Rand_R/W_Read</li><li>Rand_R/W_Write</li>
</ul>
</ul>
</ul>
</td>
<td>
<ul><li><b>CNN models</b></li>
<ul>
<li><b>ResNet</b></li>
<ul><li>ResNet-50</li><li>ResNet-101</li><li>ResNet-152</li></ul>
</ul>
<ul>
<li><b>DenseNet</b></li>
<ul><li>DenseNet-169</li><li>DenseNet-201</li></ul>
</ul>
<ul>
<li><b>VGG</b></li>
<ul><li>VGG-11</li><li>VGG-13</li><li>VGG-16</li><li>VGG-19</li></ul>
</ul>
<ul><li><b>Other CNN models</b></li><ul><li>...</li></ul></ul>
</ul>
<ul><li><b>BERT models</b></li>
<ul><li><b>BERT</b></li><li><b>BERT_LARGE</b></li></ul>
</ul>
<ul><li><b>LSTM</b></li></ul>
<ul><li><b>GPT-2</b></li></ul>
</td>
</tr>
</tbody>
</table>

docs/cli.md (new file, 158 lines)
View file

@@ -0,0 +1,158 @@
---
id: cli
---
# CLI
SuperBench provides a command line interface to help you use, deploy and run benchmarks.
```
$ sb
_____ ____ _
/ ____| | _ \ | |
| (___ _ _ _ __ ___ _ __| |_) | ___ _ __ ___| |__
\___ \| | | | '_ \ / _ \ '__| _ < / _ \ '_ \ / __| '_ \
____) | |_| | |_) | __/ | | |_) | __/ | | | (__| | | |
|_____/ \__,_| .__/ \___|_| |____/ \___|_| |_|\___|_| |_|
| |
|_|
Welcome to the SB CLI!
```
## SuperBench CLI commands
The following lists the usage and examples of `sb` commands:
### `sb deploy`
Deploy the SuperBench environments to all managed nodes.
```bash title="SB CLI"
sb deploy [--docker-image]
[--docker-password]
[--docker-username]
[--host-file]
[--host-list]
[--host-password]
[--host-username]
[--private-key]
```
#### Optional arguments
| Name | Default | Description |
| --- | --- | --- |
| `--docker-image` `-i` | `superbench/superbench` | Docker image URI. |
| `--docker-password` | `None` | Docker registry password if authentication is needed. |
| `--docker-username` | `None` | Docker registry username if authentication is needed. |
| `--host-file` `-f` | `None` | Path to Ansible inventory host file. |
| `--host-list` `-l` | `None` | Comma-separated host list. |
| `--host-password` | `None` | Host password or key passphrase if needed. |
| `--host-username` | `None` | Host username if needed. |
| `--private-key` | `None` | Path to private key if needed. |
#### Global arguments
| Name | Default | Description |
| --- | --- | --- |
| `--help` `-h` | N/A | Show help message. |
#### Examples
Deploy image `superbench/cuda:11.1` to all nodes in `./host.yaml`:
```bash title="SB CLI"
sb deploy --docker-image superbench/cuda:11.1 --host-file ./host.yaml
```
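If you don't maintain an inventory file, you can instead pass hosts on the command line and authenticate with a password; this is a sketch using only the flags documented above, with placeholder host addresses:
```bash title="SB CLI"
sb deploy --host-list 10.0.0.100,10.0.0.101 --host-username username --host-password [password]
```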
### `sb exec`
Execute the SuperBench benchmarks locally.
```bash title="SB CLI"
sb exec [--config-file]
[--config-override]
```
#### Optional arguments
| Name | Default | Description |
| --- | --- | --- |
| `--config-file` `-c` | `None` | Path to SuperBench config file. |
| `--config-override` `-C` | `None` | Extra arguments to override config_file. |
#### Global arguments
| Name | Default | Description |
| --- | --- | --- |
| `--help` `-h` | N/A | Show help message. |
#### Examples
Execute the GPT-2 model benchmark with the default configuration:
```bash title="SB CLI"
sb exec --config-override superbench.enable="['gpt2_models']"
```
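You can also combine a config file with inline overrides; for instance, the sketch below runs only the `kernel-launch` benchmark defined in the default config, using the flags documented above:
```bash title="SB CLI"
sb exec --config-file ./superbench/config/default.yaml --config-override superbench.enable="['kernel-launch']"
```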
### `sb run`
Run the SuperBench benchmarks in distributed mode on all managed nodes.
```bash title="SB CLI"
sb run [--config-file]
[--config-override]
[--docker-image]
[--docker-password]
[--docker-username]
[--host-file]
[--host-list]
[--host-password]
[--host-username]
[--private-key]
```
#### Optional arguments
| Name | Default | Description |
| --- | --- | --- |
| `--config-file` `-c` | `None` | Path to SuperBench config file. |
| `--config-override` `-C` | `None` | Extra arguments to override config_file. |
| `--docker-image` `-i` | `superbench/superbench` | Docker image URI. |
| `--docker-password` | `None` | Docker registry password if authentication is needed. |
| `--docker-username` | `None` | Docker registry username if authentication is needed. |
| `--host-file` `-f` | `None` | Path to Ansible inventory host file. |
| `--host-list` `-l` | `None` | Comma-separated host list. |
| `--host-password` | `None` | Host password or key passphrase if needed. |
| `--host-username` | `None` | Host username if needed. |
| `--private-key` | `None` | Path to private key if needed. |
#### Global arguments
| Name | Default | Description |
| --- | --- | --- |
| `--help` `-h` | N/A | Show help message. |
#### Examples
Run all benchmarks on all managed nodes in `./host.yaml` using image `superbench/cuda:11.1`
and the default benchmarking configuration:
```bash title="SB CLI"
sb run --docker-image superbench/cuda:11.1 --host-file ./host.yaml
```
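To run only a subset of benchmarks on the same nodes, add inline overrides; in this sketch, `gpt2_models` is one of the benchmarks defined in the default config:
```bash title="SB CLI"
sb run --docker-image superbench/cuda:11.1 --host-file ./host.yaml --config-override superbench.enable="['gpt2_models']"
```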
### `sb version`
Print the current SuperBench CLI version.
```bash title="SB CLI"
sb version
```
#### Global arguments
| Name | Default | Description |
| --- | --- | --- |
| `--help` `-h` | N/A | Show help message. |
#### Examples
Print version:
```bash title="SB CLI"
sb version
```

View file

@@ -0,0 +1,40 @@
---
id: contributing
---
# Contributing
## Contributor License Agreement
This project welcomes contributions and suggestions. Most contributions require you to agree to a
Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us
the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide
a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions
provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.
## Contributing principles
SuperBenchmark is an open-source project. Your participation and contribution are highly appreciated. There are several important things you need to know before contributing to this project:
### What content can be added to SuperBenchmark
1. Bug fixes for existing features.
2. New features for the benchmark module (micro-benchmark, model-benchmark, etc.).
If you would like to contribute a new feature to SuperBenchmark, please submit your proposal first. In [GitHub Issues](https://github.com/microsoft/superbenchmark/issues), choose `Enhancement Request` to submit it. If the proposal is accepted, you can submit pull requests to the origin main branch.
### Contribution steps
If you would like to contribute to the project, please follow the steps below for joint development on GitHub.
1. `Fork` the repo first to your personal GitHub account.
2. Check out from the main branch for feature development.
3. When you finish the feature, please fetch the latest code from the origin repo, merge it into your branch, and resolve any conflicts.
4. Submit a pull request to the origin main branch.
5. Please note that reviewers may leave comments or questions; please help address them by updating the pull request.

View file

@@ -0,0 +1,47 @@
---
id: development
---
# Development
If you want to develop a new feature, please follow the steps below to set up the development environment.
## Check Environment
Follow [System Requirements](../getting-started/installation.md).
## Set Up
```bash
git clone https://github.com/microsoft/superbenchmark
cd superbenchmark
python3 -m pip install -e .[dev,test]
```
## Lint and Test
Format code using yapf.
```bash
python3 setup.py format
```
Check code style with mypy and flake8.
```bash
python3 setup.py lint
```
Run unit tests.
```bash
python3 setup.py test
```
## Submit a Pull Request
Please install `pre-commit` before `git commit` to run all pre-checks.
```bash
pre-commit install
```
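You can also run all checks manually against the whole repo at any time; this is standard `pre-commit` usage, not specific to SuperBench:
```bash
pre-commit run --all-files
```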
Open a pull request to the main branch on GitHub.

View file

@@ -0,0 +1,154 @@
---
id: configuration
---
# Configuration
## SuperBench config
SuperBench uses a [YAML](https://yaml.org/spec/1.2/spec.html) config file to configure the details of benchmarking,
including which benchmarks to run, which distribution mode to choose, which parameters to use, etc.
Here's what the default config file looks like.
```yaml title="superbench/config/default.yaml"
# SuperBench Config
superbench:
enable: null
var:
default_local_mode: &default_local_mode
enable: true
modes:
- name: local
proc_num: 8
prefix: CUDA_VISIBLE_DEVICES={proc_rank}
parallel: yes
default_pytorch_mode: &default_pytorch_mode
enable: true
modes:
- name: torch.distributed
proc_num: 8
node_num: 1
frameworks:
- pytorch
common_model_config: &common_model_config
duration: 0
num_warmup: 16
num_steps: 128
precision:
- float32
- float16
model_action:
- train
benchmarks:
kernel-launch:
<<: *default_local_mode
gemm-flops:
<<: *default_local_mode
cudnn-function:
<<: *default_local_mode
cublas-function:
<<: *default_local_mode
matmul:
<<: *default_local_mode
frameworks:
- pytorch
sharding-matmul:
<<: *default_pytorch_mode
computation-communication-overlap:
<<: *default_pytorch_mode
gpt_models:
<<: *default_pytorch_mode
models:
- gpt2-small
- gpt2-large
parameters:
<<: *common_model_config
batch_size: 4
bert_models:
<<: *default_pytorch_mode
models:
- bert-base
- bert-large
parameters:
<<: *common_model_config
batch_size: 8
lstm_models:
<<: *default_pytorch_mode
models:
- lstm
parameters:
<<: *common_model_config
batch_size: 128
cnn_models:
<<: *default_pytorch_mode
models:
- resnet50
- resnet101
- resnet152
- densenet169
- densenet201
- vgg11
- vgg13
- vgg16
- vgg19
parameters:
<<: *common_model_config
batch_size: 128
```
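The config relies on standard YAML anchors (`&name`), aliases (`*name`), and merge keys (`<<:`) to avoid repetition. As a minimal illustration (not part of the real config), the `kernel-launch` entry in the two documents below ends up identical:
```yaml
# with an anchor and a merge key
var:
  base: &base
    enable: true
benchmarks:
  kernel-launch:
    <<: *base
---
# expanded form after merging
benchmarks:
  kernel-launch:
    enable: true
```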
By default, all benchmarks in the default configuration will run if you don't specify a customized configuration.
If you want a quick try, you can modify this config a little bit. For example, to run only ResNet models:
1. Copy the default config to a file named `resnet.yaml` in the current path.
```bash
cp superbench/config/default.yaml resnet.yaml
```
2. Enable only `cnn_models` in the config and remove all models except ResNet under `benchmarks.cnn_models.models`.
```yaml {3,10-13} title="resnet.yaml"
# SuperBench Config
superbench:
enable: ['cnn_models']
var:
# ...
# omit the middle part
# ...
cnn_models:
<<: *default_pytorch_mode
models:
- resnet50
- resnet101
- resnet152
parameters:
<<: *common_model_config
batch_size: 128
```
## Ansible Inventory
SuperBench leverages [Ansible](https://docs.ansible.com/ansible/latest/) to run benchmarking workloads on managed nodes.
You need to provide an [inventory](https://docs.ansible.com/ansible/latest/user_guide/intro_inventory.html) file
to configure the host list for managed nodes.
Here are some basic examples as a starting point.
* One managed node, which is the same node as the control node.
```ini title="local.ini"
[all]
localhost ansible_connection=local
```
* Two managed nodes: one is the control node and the other is accessed remotely.
```ini title="mix.ini"
[all]
localhost ansible_connection=local
10.0.0.100 ansible_user=username ansible_ssh_private_key_file=id_rsa
```
* Eight managed nodes, all accessed remotely.
```ini title="remote.ini"
[all]
10.0.0.[100:103]
10.0.0.[200:203]
[all:vars]
ansible_user=username
ansible_ssh_private_key_file=id_rsa
```
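Before running SuperBench, you can sanity-check that the control node reaches every managed node with Ansible's built-in `ping` module (plain Ansible usage, assuming the `ansible` CLI is available on the control node):
```bash
ansible -i remote.ini all -m ping
```
Every host should report `SUCCESS`.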

View file

@@ -0,0 +1,83 @@
---
id: installation
---
# Installation
SuperBench is used to run validations for AI infrastructure,
so you need to prepare one __control node__, which is used to run SuperBench commands,
and one or more __managed nodes__, which are going to be validated.
Usually the __control node__ can be a CPU node, while __managed nodes__ are GPU nodes with high-speed interconnects.
:::tip Tips
It is fine if you have only one GPU node and want to try SuperBench on it.
The control node and managed node can co-locate on the same machine.
:::
## Control node
Here are the system requirements for the control node.
### Requirements
* A recent version of Linux; you're highly encouraged to use Ubuntu 18.04 or later.
* [Python](https://www.python.org/) version 3.6 or later (which can be checked by running `python3 --version`).
* [Pip](https://pip.pypa.io/en/stable/installing/) version 18.0 or later (which can be checked by running `python3 -m pip --version`).
:::note
Windows is not supported due to the lack of Ansible support, but you can still use WSL2.
:::
Besides, the control node should be able to access all managed nodes through SSH.
If you are going to use a password instead of a private key for SSH, you also need to install `sshpass`.
```bash
sudo apt-get install sshpass
```
It is also recommended to use [venv](https://docs.python.org/3/library/venv.html) for virtual environments,
but it is not strictly necessary.
```bash
# create a new virtual environment
python3 -m venv --system-site-packages ./venv
# activate the virtual environment
source ./venv/bin/activate
# exit the virtual environment later
# after you finish running superbench
deactivate
```
### Build
You can clone the source from GitHub and build it.
```bash
git clone https://github.com/microsoft/superbenchmark
cd superbenchmark
python3 -m pip install .
make postinstall
```
After installation, you should be able to run the SB CLI.
```bash
sb
```
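You can also print the installed version as a quick check (see the `sb version` command in the CLI docs):
```bash
sb version
```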
## Managed nodes
Here are the system requirements for all managed GPU nodes.
### Requirements
* A recent version of Linux; you're highly encouraged to use Ubuntu 18.04 or later.
* Compatible GPU drivers should be installed correctly.
* For NVIDIA GPUs, driver version can be checked by running `nvidia-smi`.
* [Docker CE](https://docs.docker.com/engine/install/) version 19.03 or later (which can be checked by running `docker --version`).
* GPU support in Docker.
* For NVIDIA GPUs, install
[nvidia-container-toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#setting-up-nvidia-container-toolkit).
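After setting up the toolkit, you can verify that containers see the GPUs; the CUDA image tag below is only an example, any CUDA base image works:
```bash
# driver visible on the host
nvidia-smi
# driver visible inside a container
docker run --rm --gpus all nvidia/cuda:11.1.1-base-ubuntu18.04 nvidia-smi
```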

View file

@@ -0,0 +1,33 @@
---
id: run-superbench
---
# Run SuperBench
Having prepared the benchmark configuration and inventory files,
you can start to run SuperBench over all managed nodes.
## Deploy
Leveraging the `sb deploy` command, we can easily deploy the SuperBench environment to all managed nodes.
After running the following command, SuperBench will automatically access all nodes, pull the container image, and prepare the container.
```bash
sb deploy -f local.ini
```
Alternatively, to run on remote nodes, use the corresponding inventory file instead.
If you are using a password for SSH and cannot specify a private key in the inventory,
or your private key requires a passphrase before use, you can run
```bash
sb deploy -f remote.ini --host-password [password]
```
## Run
After deployment, you can start to run the SuperBench benchmarks on all managed nodes using the `sb run` command.
```bash
sb run -f local.ini -c resnet.yaml
```

docs/introduction.md (new file, 33 lines)
View file

@@ -0,0 +1,33 @@
---
id: introduction
---
# Introduction
## Features
__SuperBench__ is a validation and profiling tool for AI infrastructure, which supports:
* AI infrastructure validation and diagnosis
  * Distributed validation tools to validate hundreds or thousands of servers automatically
  * Consider both raw hardware and E2E model performance with ML workload patterns
  * Build a contract to identify hardware issues
  * Provide infrastructure-oriented criteria as Performance/Quality Gates for hardware and system release
  * Provide detailed performance reports and advanced analysis tools
* AI workload benchmarking and profiling
  * Provide comprehensive performance comparison between different existing hardware
  * Provide insights for hardware and software co-design
It includes micro-benchmarks for primitive computation and communication benchmarking,
and model-benchmarks to measure domain-aware end-to-end deep learning workloads.
:::note
SuperBench is in the early pre-alpha stage for open source, and not ready for the general public yet.
If you want to jump in early, you can try building the latest code yourself.
:::
## Overview
The following figure shows the capabilities provided by the SuperBench core framework and its extensions.
![SuperBench Structure](./assets/superbench_structure.png)