Parent
c0c43b8f81
Commit
832e392f91
@@ -11,6 +11,9 @@ insert_final_newline = true
[*.py]
max_line_length = 120

[*.{js,jsx,ts,tsx,md,mdx,css}]
indent_size = 2

[*.{yml,yaml}]
indent_size = 2
README.md
@@ -1,4 +1,4 @@
# SuperBenchmark
# SuperBench

[![MIT licensed](https://img.shields.io/badge/license-MIT-brightgreen.svg)](LICENSE)
[![Lint](https://github.com/microsoft/superbenchmark/workflows/Lint/badge.svg)](https://github.com/microsoft/superbenchmark/actions?query=workflow%3ALint)
@@ -9,304 +9,9 @@
| cpu-unit-test | [![Build Status](https://dev.azure.com/msrasrg/SuperBenchmark/_apis/build/status/microsoft.superbenchmark?branchName=main)](https://dev.azure.com/msrasrg/SuperBenchmark/_build/latest?definitionId=77&branchName=main) |
| gpu-unit-test | [![Build Status](https://dev.azure.com/msrasrg/SuperBenchmark/_apis/build/status/cuda-unit-test?branchName=main)](https://dev.azure.com/msrasrg/SuperBenchmark/_build/latest?definitionId=80&branchName=main) |

__SuperBench__ is a validation and profiling tool for AI infrastructure.

**SuperBench** is a validation and profiling tool for AI infrastructure, which supports:

* AI infrastructure validation and diagnosis
  * Distributed validation tools to validate hundreds or thousands of servers automatically
  * Consider both raw hardware and E2E model performance with ML workload patterns
  * Build a contract to identify hardware issues
  * Provide infrastructural-oriented criteria as Performance/Quality Gates for hardware and system release
  * Provide detailed performance report and advanced analysis tool
* AI workload benchmarking and profiling
  * Provide comprehensive performance comparison between different existing hardware
  * Provide insights for hardware and software co-design

It includes micro-benchmark for primitive computation and communication benchmarking,
and model-benchmark to measure domain-aware end-to-end deep learning workloads.

> 🔴 __Note__:
SuperBench is in the early pre-alpha stage for open source, and not ready for general public yet.
If you want to jump in early, you can try building latest code yourself.

## SuperBench capabilities, workflow and benchmarking metrics

The following graphic shows the capabilities provide by SuperBench core framework and its extension.

<img src="imgs/superbench_structure.png">

Benchmarking metrics provided by SuperBench are listed as below.

<table>
<tbody>
<tr align="center" valign="bottom">
<td>
</td>
<td>
<b>Micro Benchmark</b>
<img src="imgs/bar.png"/>
</td>
<td>
<b>Model Benchmark</b>
<img src="imgs/bar.png"/>
</td>
</tr>
<tr valign="top">
<td align="center" valign="middle">
<b>Metrics</b>
</td>
<td>
<ul><li><b>Computation Benchmark</b></li>
<ul><li><b>Kernel Performance</b></li>
<ul>
<li>GFLOPS</li>
<li>TensorCore</li>
<li>cuBLAS</li>
<li>cuDNN</li>
</ul>
</ul>
<ul><li><b>Kernel Launch Time</b></li>
<ul>
<li>Kernel_Launch_Event_Time</li>
<li>Kernel_Launch_Wall_Time</li>
</ul>
</ul>
<ul><li><b>Operator Performance</b></li>
<ul><li>MatMul</li><li>Sharding_MatMul</li></ul>
</ul>
<ul><li><b>Memory</b></li>
<ul><li>H2D_Mem_BW_<GPU ID></li>
<li>H2D_Mem_BW_<GPU ID></li></ul>
</ul>
</ul>
<ul><li><b>Communication Benchmark</b></li>
<ul><li><b>Device P2P Bandwidth</b></li>
<ul><li>P2P_BW_Max</li><li>P2P_BW_Min</li><li>P2P_BW_Avg</li></ul>
</ul>
<ul><li><b>RDMA</b></li>
<ul><li>RDMA_Peak</li><li>RDMA_Avg</li></ul>
</ul>
<ul><li><b>NCCL</b></li>
<ul><li>NCCL_AllReduce</li></ul>
<ul><li>NCCL_AllGather</li></ul>
<ul><li>NCCL_broadcast</li></ul>
<ul><li>NCCL_reduce</li></ul>
<ul><li>NCCL_reduce_scatter</li></ul>
</ul>
</ul>
<ul><li><b>Computation-Communication Benchmark</b></li>
<ul><li><b>Mul_During_NCCL</b></li><li><b>MatMul_During_NCCL</b></li></ul>
</ul>
<ul><li><b>Storage Benchmark</b></li>
<ul><li><b>Disk</b></li>
<ul>
<li>Read/Write</li><li>Rand_Read/Rand_Write</li>
<li>R/W_Read</li><li>R/W_Write</li><li>Rand_R/W_Read</li><li>Rand_R/W_Write</li>
</ul>
</ul>
</ul>
</td>
<td>
<ul><li><b>CNN models</b></li>
<ul>
<li><b>ResNet</b></li>
<ul><li>ResNet-50</li><li>ResNet-101</li><li>ResNet-152</li></ul>
</ul>
<ul>
<li><b>DenseNet</b></li>
<ul><li>DenseNet-169</li><li>DenseNet-201</li></ul>
</ul>
<ul>
<li><b>VGG</b></li>
<ul><li>VGG-11</li><li>VGG-13</li><li>VGG-16</li><li>VGG-19</li></ul>
</ul>
<ul><li><b>Other CNN models</b></li><ul><li>...</li></ul></ul>
</ul>
<ul><li><b>BERT models</b></li>
<ul><li><b>BERT</b></li><li><b>BERT_LARGE</b></li></ul>
</ul>
<ul><li><b>LSTM</b></li></ul>
<ul><li><b>GPT-2</b></li></ul>
</td>
</tr>
</tbody>
</table>


## Installation

### Using Docker (_Preferred_)

__System Requirements__

* Platform: Ubuntu 18.04 or later (64-bit)
* Docker: Docker CE 19.03 or later

__Install SuperBench__

* Using Pre-Build Images

```sh
docker pull superbench/superbench:dev-cuda11.1.1
docker run -it --rm \
    --privileged --net=host --ipc=host --gpus=all \
    superbench/superbench:dev-cuda11.1.1 bash
```

* Building the Image

```sh
docker build -f dockerfile/cuda11.1.1.dockerfile -t superbench/superbench:dev .
```

### Using Python

__System Requirements__

* Platform: Ubuntu 18.04 or later (64-bit); Windows 10 (64-bit) with WSL2
* Python: Python 3.6 or later, pip 18.0 or later

Check whether Python environment is already configured:
```sh
# check Python version
python3 --version
# check pip version
python3 -m pip --version
```
If not, install the followings:
* [Python](https://www.python.org/)
* [pip](https://pip.pypa.io/en/stable/installing/)
* [venv](https://docs.python.org/3/library/venv.html)

It's recommended to use a virtual environment (optional):
```sh
# create a new virtual environment
python3 -m venv --system-site-packages ./venv
# activate the virtual environment
source ./venv/bin/activate

# exit the virtual environment later
# after you finish running superbench
deactivate
```

__Install SuperBench__

* PyPI Binary

```sh
# not available yet
```

* From Source

```sh
# get source code
git clone https://github.com/microsoft/superbenchmark
cd superbenchmark

# install superbench
python3 -m pip install .
make postinstall
```


## Usage

### Run SuperBench

```sh
# run benchmarks in default settings
sb exec

# use a custom config
sb exec --config-file ./superbench/config/default.yaml
```

### Benchmark Gallary

Please find more benchmark examples [here](examples/benchmarks/).


## Developer Guide

If you want to develop new feature, please follow below steps to set up development environment.

### Check Environment

Follow __[System Requirements](#using-python)__.

### Set Up

```sh
# get latest code
git clone https://github.com/microsoft/superbenchmark
cd superbenchmark

# install superbench
python3 -m pip install -e .[dev,test]
```

### Lint and Test

```sh
# format code using yapf
python3 setup.py format

# check code style with mypy and flake8
python3 setup.py lint

# run all unit tests
python3 setup.py test
```

### Submit a Pull Request

Please install `pre-commit` before `git commit` to run all pre-checks.

```sh
pre-commit install
```

Open a pull request to main branch on GitHub.


## Contributing

### Contributor License Agreement

This project welcomes contributions and suggestions. Most contributions require you to agree to a
Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us
the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide
a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions
provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.

### Contributing principles

SuperBenchmark is an open-source project. Your participation and contribution are highly appreciated. There are several important things you need know before contributing to this project:

#### What content can be added to SuperBenchmark

1. Bug fixes for existing features.
2. New features for benchmark module (micro-benchmark, model-benchmark, etc.)

If you would like to contribute a new feature on SuperBenchmark, please submit your proposal first. In [GitHub Issues](https://github.com/microsoft/superbenchmark/issues) module, choose `Enhancement Request` to finish the submission. If the proposal is accepted, you can submit pull requests to origin main branch.

#### Contribution steps

If you would like to contribute to the project, please follow below steps of joint development on GitHub.

1. `Fork` the repo first to your personal GitHub account.
2. Checkout from main branch for feature development.
3. When you finish the feature, please fetch the latest code from origin repo, merge to your branch and resolve conflict.
4. Submit pull requests to origin main branch.
5. Please note that there might be comments or questions from reviewers. It will need your help to update the pull request.
_Check SuperBench website for more details._

## Trademarks
Before Width: | Height: | Size: 517 B | After Width: | Height: | Size: 517 B |
Before Width: | Height: | Size: 40 KiB | After Width: | Height: | Size: 40 KiB |
@@ -0,0 +1,113 @@
---
id: micro-benchmarks
---

# Micro Benchmarks

## Benchmarking list

### Computation benchmark

### Communication benchmark

### Computation-communication benchmark

### Storage benchmark


## Benchmarking metrics

<table>
<tbody>
<tr align="center" valign="bottom">
<td>
</td>
<td>
<b>Micro Benchmark</b>
<img src={require('../assets/bar.png').default}/>
</td>
<td>
<b>Model Benchmark</b>
<img src={require('../assets/bar.png').default}/>
</td>
</tr>
<tr valign="top">
<td align="center" valign="middle">
<b>Metrics</b>
</td>
<td>
<ul><li><b>Computation Benchmark</b></li>
<ul><li><b>Kernel Performance</b></li>
<ul>
<li>GFLOPS</li>
<li>TensorCore</li>
<li>cuBLAS</li>
<li>cuDNN</li>
</ul>
</ul>
<ul><li><b>Kernel Launch Time</b></li>
<ul>
<li>Kernel_Launch_Event_Time</li>
<li>Kernel_Launch_Wall_Time</li>
</ul>
</ul>
<ul><li><b>Operator Performance</b></li>
<ul><li>MatMul</li><li>Sharding_MatMul</li></ul>
</ul>
<ul><li><b>Memory</b></li>
<ul><li>H2D_Mem_BW_<GPU ID></li>
<li>D2H_Mem_BW_<GPU ID></li></ul>
</ul>
</ul>
<ul><li><b>Communication Benchmark</b></li>
<ul><li><b>Device P2P Bandwidth</b></li>
<ul><li>P2P_BW_Max</li><li>P2P_BW_Min</li><li>P2P_BW_Avg</li></ul>
</ul>
<ul><li><b>RDMA</b></li>
<ul><li>RDMA_Peak</li><li>RDMA_Avg</li></ul>
</ul>
<ul><li><b>NCCL</b></li>
<ul><li>NCCL_AllReduce</li></ul>
<ul><li>NCCL_AllGather</li></ul>
<ul><li>NCCL_broadcast</li></ul>
<ul><li>NCCL_reduce</li></ul>
<ul><li>NCCL_reduce_scatter</li></ul>
</ul>
</ul>
<ul><li><b>Computation-Communication Benchmark</b></li>
<ul><li><b>Mul_During_NCCL</b></li><li><b>MatMul_During_NCCL</b></li></ul>
</ul>
<ul><li><b>Storage Benchmark</b></li>
<ul><li><b>Disk</b></li>
<ul>
<li>Read/Write</li><li>Rand_Read/Rand_Write</li>
<li>R/W_Read</li><li>R/W_Write</li><li>Rand_R/W_Read</li><li>Rand_R/W_Write</li>
</ul>
</ul>
</ul>
</td>
<td>
<ul><li><b>CNN models</b></li>
<ul>
<li><b>ResNet</b></li>
<ul><li>ResNet-50</li><li>ResNet-101</li><li>ResNet-152</li></ul>
</ul>
<ul>
<li><b>DenseNet</b></li>
<ul><li>DenseNet-169</li><li>DenseNet-201</li></ul>
</ul>
<ul>
<li><b>VGG</b></li>
<ul><li>VGG-11</li><li>VGG-13</li><li>VGG-16</li><li>VGG-19</li></ul>
</ul>
<ul><li><b>Other CNN models</b></li><ul><li>...</li></ul></ul>
</ul>
<ul><li><b>BERT models</b></li>
<ul><li><b>BERT</b></li><li><b>BERT_LARGE</b></li></ul>
</ul>
<ul><li><b>LSTM</b></li></ul>
<ul><li><b>GPT-2</b></li></ul>
</td>
</tr>
</tbody>
</table>
@@ -0,0 +1,113 @@
---
id: model-benchmarks
---

# Model Benchmarks

## Benchmarking list

### GPT-2 models

### BERT models

### LSTM models

### CNN models


## Benchmarking metrics

<table>
<tbody>
<tr align="center" valign="bottom">
<td>
</td>
<td>
<b>Micro Benchmark</b>
<img src={require('../assets/bar.png').default}/>
</td>
<td>
<b>Model Benchmark</b>
<img src={require('../assets/bar.png').default}/>
</td>
</tr>
<tr valign="top">
<td align="center" valign="middle">
<b>Metrics</b>
</td>
<td>
<ul><li><b>Computation Benchmark</b></li>
<ul><li><b>Kernel Performance</b></li>
<ul>
<li>GFLOPS</li>
<li>TensorCore</li>
<li>cuBLAS</li>
<li>cuDNN</li>
</ul>
</ul>
<ul><li><b>Kernel Launch Time</b></li>
<ul>
<li>Kernel_Launch_Event_Time</li>
<li>Kernel_Launch_Wall_Time</li>
</ul>
</ul>
<ul><li><b>Operator Performance</b></li>
<ul><li>MatMul</li><li>Sharding_MatMul</li></ul>
</ul>
<ul><li><b>Memory</b></li>
<ul><li>H2D_Mem_BW_<GPU ID></li>
<li>D2H_Mem_BW_<GPU ID></li></ul>
</ul>
</ul>
<ul><li><b>Communication Benchmark</b></li>
<ul><li><b>Device P2P Bandwidth</b></li>
<ul><li>P2P_BW_Max</li><li>P2P_BW_Min</li><li>P2P_BW_Avg</li></ul>
</ul>
<ul><li><b>RDMA</b></li>
<ul><li>RDMA_Peak</li><li>RDMA_Avg</li></ul>
</ul>
<ul><li><b>NCCL</b></li>
<ul><li>NCCL_AllReduce</li></ul>
<ul><li>NCCL_AllGather</li></ul>
<ul><li>NCCL_broadcast</li></ul>
<ul><li>NCCL_reduce</li></ul>
<ul><li>NCCL_reduce_scatter</li></ul>
</ul>
</ul>
<ul><li><b>Computation-Communication Benchmark</b></li>
<ul><li><b>Mul_During_NCCL</b></li><li><b>MatMul_During_NCCL</b></li></ul>
</ul>
<ul><li><b>Storage Benchmark</b></li>
<ul><li><b>Disk</b></li>
<ul>
<li>Read/Write</li><li>Rand_Read/Rand_Write</li>
<li>R/W_Read</li><li>R/W_Write</li><li>Rand_R/W_Read</li><li>Rand_R/W_Write</li>
</ul>
</ul>
</ul>
</td>
<td>
<ul><li><b>CNN models</b></li>
<ul>
<li><b>ResNet</b></li>
<ul><li>ResNet-50</li><li>ResNet-101</li><li>ResNet-152</li></ul>
</ul>
<ul>
<li><b>DenseNet</b></li>
<ul><li>DenseNet-169</li><li>DenseNet-201</li></ul>
</ul>
<ul>
<li><b>VGG</b></li>
<ul><li>VGG-11</li><li>VGG-13</li><li>VGG-16</li><li>VGG-19</li></ul>
</ul>
<ul><li><b>Other CNN models</b></li><ul><li>...</li></ul></ul>
</ul>
<ul><li><b>BERT models</b></li>
<ul><li><b>BERT</b></li><li><b>BERT_LARGE</b></li></ul>
</ul>
<ul><li><b>LSTM</b></li></ul>
<ul><li><b>GPT-2</b></li></ul>
</td>
</tr>
</tbody>
</table>
@@ -0,0 +1,158 @@
---
id: cli
---

# CLI

SuperBench provides a command line interface to help you use, deploy and run benchmarks.
```
$ sb

  _____                       ____                  _
 / ____|                     |  _ \                | |
| (___  _   _ _ __   ___ _ __| |_) | ___ _ __   ___| |__
 \___ \| | | | '_ \ / _ \ '__|  _ < / _ \ '_ \ / __| '_ \
 ____) | |_| | |_) |  __/ |  | |_) |  __/ | | | (__| | | |
|_____/ \__,_| .__/ \___|_|  |____/ \___|_| |_|\___|_| |_|
             | |
             |_|

Welcome to the SB CLI!
```
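
You can print the command list and global options at any time; this is a quick check using the `--help` flag that is listed under Global arguments for every command below:
```bash title="SB CLI"
sb --help
```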

## SuperBench CLI commands

The following lists `sb` command usage and examples:

### `sb deploy`

Deploy the SuperBench environment to all managed nodes.
```bash title="SB CLI"
sb deploy [--docker-image]
          [--docker-password]
          [--docker-username]
          [--host-file]
          [--host-list]
          [--host-password]
          [--host-username]
          [--private-key]
```

#### Optional arguments

| Name | Default | Description |
| --- | --- | --- |
| `--docker-image` `-i` | `superbench/superbench` | Docker image URI. |
| `--docker-password` | `None` | Docker registry password if authentication is needed. |
| `--docker-username` | `None` | Docker registry username if authentication is needed. |
| `--host-file` `-f` | `None` | Path to Ansible inventory host file. |
| `--host-list` `-l` | `None` | Comma separated host list. |
| `--host-password` | `None` | Host password or key passphrase if needed. |
| `--host-username` | `None` | Host username if needed. |
| `--private-key` | `None` | Path to private key if needed. |

#### Global arguments

| Name | Default | Description |
| --- | --- | --- |
| `--help` `-h` | N/A | Show help message. |

#### Examples

Deploy image `superbench/cuda:11.1` to all nodes in `./host.yaml`:
```bash title="SB CLI"
sb deploy --docker-image superbench/cuda:11.1 --host-file ./host.yaml
```

### `sb exec`

Execute the SuperBench benchmarks locally.
```bash title="SB CLI"
sb exec [--config-file]
        [--config-override]
```

#### Optional arguments

| Name | Default | Description |
| --- | --- | --- |
| `--config-file` `-c` | `None` | Path to SuperBench config file. |
| `--config-override` `-C` | `None` | Extra arguments to override config_file. |

#### Global arguments

| Name | Default | Description |
| --- | --- | --- |
| `--help` `-h` | N/A | Show help message. |

#### Examples

Execute the GPT-2 model benchmark with the default configuration:
```bash title="SB CLI"
sb exec --config-override superbench.enable="['gpt2_models']"
```
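
A config file and overrides can also be combined; for instance, assuming the `resnet.yaml` config described in the Configuration doc:
```bash title="SB CLI"
sb exec -c resnet.yaml -C superbench.enable="['cnn_models']"
```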

### `sb run`

Run the SuperBench benchmarks on all managed nodes in a distributed way.
```bash title="SB CLI"
sb run [--config-file]
       [--config-override]
       [--docker-image]
       [--docker-password]
       [--docker-username]
       [--host-file]
       [--host-list]
       [--host-password]
       [--host-username]
       [--private-key]
```

#### Optional arguments

| Name | Default | Description |
| --- | --- | --- |
| `--config-file` `-c` | `None` | Path to SuperBench config file. |
| `--config-override` `-C` | `None` | Extra arguments to override config_file. |
| `--docker-image` `-i` | `superbench/superbench` | Docker image URI. |
| `--docker-password` | `None` | Docker registry password if authentication is needed. |
| `--docker-username` | `None` | Docker registry username if authentication is needed. |
| `--host-file` `-f` | `None` | Path to Ansible inventory host file. |
| `--host-list` `-l` | `None` | Comma separated host list. |
| `--host-password` | `None` | Host password or key passphrase if needed. |
| `--host-username` | `None` | Host username if needed. |
| `--private-key` | `None` | Path to private key if needed. |

#### Global arguments

| Name | Default | Description |
| --- | --- | --- |
| `--help` `-h` | N/A | Show help message. |

#### Examples

Run all benchmarks on all managed nodes in `./host.yaml` using image `superbench/cuda:11.1`
and the default benchmarking configuration:
```bash title="SB CLI"
sb run --docker-image superbench/cuda:11.1 --host-file ./host.yaml
```

### `sb version`

Print the current SuperBench CLI version.
```bash title="SB CLI"
sb version
```

#### Global arguments

| Name | Default | Description |
| --- | --- | --- |
| `--help` `-h` | N/A | Show help message. |

#### Examples

Print version:
```bash title="SB CLI"
sb version
```
@@ -0,0 +1,40 @@
---
id: contributing
---

# Contributing

## Contributor License Agreement

This project welcomes contributions and suggestions. Most contributions require you to agree to a
Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us
the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide
a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions
provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.

## Contributing principles

SuperBenchmark is an open-source project. Your participation and contribution are highly appreciated. There are several important things you need to know before contributing to this project:

### What content can be added to SuperBenchmark

1. Bug fixes for existing features.
2. New features for the benchmark modules (micro-benchmark, model-benchmark, etc.)

If you would like to contribute a new feature to SuperBenchmark, please submit your proposal first. In [GitHub Issues](https://github.com/microsoft/superbenchmark/issues), choose `Enhancement Request` to submit it. If the proposal is accepted, you can submit pull requests to the origin main branch.

### Contribution steps

If you would like to contribute to the project, please follow the steps below for joint development on GitHub.

1. `Fork` the repo first to your personal GitHub account.
2. Check out a new branch from the main branch for feature development.
3. When you finish the feature, fetch the latest code from the origin repo, merge it into your branch, and resolve any conflicts.
4. Submit pull requests to the origin main branch.
5. Please note that there might be comments or questions from reviewers; please help update the pull request accordingly.
@@ -0,0 +1,47 @@
---
id: development
---

# Development

If you want to develop a new feature, please follow the steps below to set up the development environment.

## Check Environment

Follow [System Requirements](../getting-started/installation.md).

## Set Up

```bash
git clone https://github.com/microsoft/superbenchmark
cd superbenchmark

python3 -m pip install -e .[dev,test]
```

## Lint and Test

Format code using yapf.
```bash
python3 setup.py format
```

Check code style with mypy and flake8.
```bash
python3 setup.py lint
```

Run unit tests.
```bash
python3 setup.py test
```

## Submit a Pull Request

Please install `pre-commit` before `git commit` to run all pre-checks.

```bash
pre-commit install
```
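
Once installed, the hooks run automatically on `git commit`; you can also run them manually against the whole tree with the standard `pre-commit` invocation:
```bash
# run every configured hook on all files, not only the staged ones
pre-commit run --all-files
```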

Open a pull request to the main branch on GitHub.
@@ -0,0 +1,154 @@
---
id: configuration
---

# Configuration

## SuperBench config

SuperBench uses a [YAML](https://yaml.org/spec/1.2/spec.html) config file to configure the details of benchmarking,
including which benchmarks to run, which distribution mode to choose, which parameters to use, etc.

Here's what the default config file looks like.

```yaml title="superbench/config/default.yaml"
# SuperBench Config
superbench:
  enable: null
  var:
    default_local_mode: &default_local_mode
      enable: true
      modes:
        - name: local
          proc_num: 8
          prefix: CUDA_VISIBLE_DEVICES={proc_rank}
          parallel: yes
    default_pytorch_mode: &default_pytorch_mode
      enable: true
      modes:
        - name: torch.distributed
          proc_num: 8
          node_num: 1
      frameworks:
        - pytorch
    common_model_config: &common_model_config
      duration: 0
      num_warmup: 16
      num_steps: 128
      precision:
        - float32
        - float16
      model_action:
        - train
  benchmarks:
    kernel-launch:
      <<: *default_local_mode
    gemm-flops:
      <<: *default_local_mode
    cudnn-function:
      <<: *default_local_mode
    cublas-function:
      <<: *default_local_mode
    matmul:
      <<: *default_local_mode
      frameworks:
        - pytorch
    sharding-matmul:
      <<: *default_pytorch_mode
    computation-communication-overlap:
      <<: *default_pytorch_mode
    gpt_models:
      <<: *default_pytorch_mode
      models:
        - gpt2-small
        - gpt2-large
      parameters:
        <<: *common_model_config
        batch_size: 4
    bert_models:
      <<: *default_pytorch_mode
      models:
        - bert-base
        - bert-large
      parameters:
        <<: *common_model_config
        batch_size: 8
    lstm_models:
      <<: *default_pytorch_mode
      models:
        - lstm
      parameters:
        <<: *common_model_config
        batch_size: 128
    cnn_models:
      <<: *default_pytorch_mode
      models:
        - resnet50
        - resnet101
        - resnet152
        - densenet169
        - densenet201
        - vgg11
        - vgg13
        - vgg16
        - vgg19
      parameters:
        <<: *common_model_config
        batch_size: 128
```
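
The config avoids repetition with standard YAML anchors and merge keys: `&name` defines a reusable block and `<<: *name` merges it into another mapping. A minimal sketch of the same mechanism, with illustrative key names only:
```yaml
# &base_mode defines an anchor; <<: *base_mode merges its keys into the target mapping
base_mode: &base_mode
  enable: true
  proc_num: 8

my_benchmark:
  <<: *base_mode   # inherits enable and proc_num
  proc_num: 4      # keys written locally override the merged ones
```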

By default, all benchmarks in the default configuration will run if you don't specify a customized configuration.

If you want a quick try, you can modify this config a little bit, for example, to run only the ResNet models:
1. Copy the default config to a file named `resnet.yaml` in the current path.
   ```bash
   cp superbench/config/default.yaml resnet.yaml
   ```
2. Enable only `cnn_models` in the config and remove all models except ResNet under `benchmarks.cnn_models.models`, then run with this customized config as shown after these steps.
   ```yaml {3,10-13} title="resnet.yaml"
   # SuperBench Config
   superbench:
     enable: ['cnn_models']
     var:
       # ...
       # omit the middle part
       # ...
       cnn_models:
         <<: *default_pytorch_mode
         models:
           - resnet50
           - resnet101
           - resnet152
         parameters:
           <<: *common_model_config
           batch_size: 128
   ```
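
With the customized config saved, the ResNet-only run can then be launched, for example with the `local.ini` inventory from the next section (see the Run SuperBench doc for details):
```bash
sb run -f local.ini -c resnet.yaml
```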

## Ansible Inventory

SuperBench leverages [Ansible](https://docs.ansible.com/ansible/latest/) to run benchmarking workloads on managed nodes,
so you need to provide an [inventory](https://docs.ansible.com/ansible/latest/user_guide/intro_inventory.html) file
to configure the host list for managed nodes.

Here are some basic examples as a starting point.
* One managed node, the same node as the control node.
  ```ini title="local.ini"
  [all]
  localhost ansible_connection=local
  ```
* Two managed nodes, one being the control node and the other accessed remotely.
  ```ini title="mix.ini"
  [all]
  localhost ansible_connection=local
  10.0.0.100 ansible_user=username ansible_ssh_private_key_file=id_rsa
  ```
* Eight managed nodes, all accessed remotely.
  ```ini title="remote.ini"
  [all]
  10.0.0.[100:103]
  10.0.0.[200:203]

  [all:vars]
  ansible_user=username
  ansible_ssh_private_key_file=id_rsa
  ```
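
If you want to verify an inventory before deploying, you can optionally test connectivity with Ansible directly; this is plain Ansible usage rather than an `sb` command:
```bash
# ping every host in the inventory over SSH
ansible all -i remote.ini -m ping
```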
@@ -0,0 +1,83 @@
---
id: installation
---

# Installation

SuperBench is used to run validations for AI infrastructure,
so you need to prepare one __control node__, which runs SuperBench commands,
and one or multiple __managed nodes__, which are going to be validated.

Usually the __control node__ can be a CPU node, while the __managed nodes__ are GPU nodes with high-speed interconnects.

:::tip Tips
It is fine if you have only one GPU node and want to try SuperBench on it.
The control node and a managed node can co-locate on the same machine.
:::

## Control node

Here are the system requirements for the control node.

### Requirements

* Latest version of Linux; you're highly encouraged to use Ubuntu 18.04 or later.
* [Python](https://www.python.org/) version 3.6 or later (which can be checked by running `python3 --version`).
* [Pip](https://pip.pypa.io/en/stable/installing/) version 18.0 or later (which can be checked by running `python3 -m pip --version`).

:::note
Windows is not supported due to lack of Ansible support, but you can still use WSL2.
:::

Besides, the control node should be able to access all managed nodes through SSH.
If you are going to use a password instead of a private key for SSH, you also need to install `sshpass`.

```bash
sudo apt-get install sshpass
```

It is also recommended to use [venv](https://docs.python.org/3/library/venv.html) for virtual environments,
but it is not strictly necessary.

```bash
# create a new virtual environment
python3 -m venv --system-site-packages ./venv
# activate the virtual environment
source ./venv/bin/activate

# exit the virtual environment later
# after you finish running superbench
deactivate
```

### Build

You can clone the source from GitHub and build it.

```bash
git clone https://github.com/microsoft/superbenchmark
cd superbenchmark

python3 -m pip install .
make postinstall
```

After installation, you should be able to run the SB CLI.

```bash
sb
```
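
As an extra sanity check, you can also print the installed CLI version (see the CLI doc for the full `sb version` reference):
```bash
sb version
```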

## Managed nodes

Here are the system requirements for all managed GPU nodes.

### Requirements

* Latest version of Linux; you're highly encouraged to use Ubuntu 18.04 or later.
* Compatible GPU drivers should be installed correctly.
  * For NVIDIA GPUs, the driver version can be checked by running `nvidia-smi`.
* [Docker CE](https://docs.docker.com/engine/install/) version 19.03 or later (which can be checked by running `docker --version`).
* GPU support in Docker (see the quick check after this list).
  * For NVIDIA GPUs, install
    [nvidia-container-toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#setting-up-nvidia-container-toolkit).
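
A quick way to confirm that Docker can reach the GPUs is to run `nvidia-smi` inside any CUDA-enabled container; the image tag below is only an example, and any CUDA base image you already have will do:
```bash
# should print the same GPU table as nvidia-smi on the host
docker run --rm --gpus=all nvidia/cuda:11.1.1-base-ubuntu18.04 nvidia-smi
```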
@@ -0,0 +1,33 @@
---
id: run-superbench
---

# Run SuperBench

Having prepared the benchmark configuration and inventory files,
you can start to run SuperBench over all managed nodes.

## Deploy

Leveraging the `sb deploy` command, we can easily deploy the SuperBench environment to all managed nodes.
After running the following command, SuperBench will automatically access all nodes, pull the container image, and prepare the container.

```bash
sb deploy -f local.ini
```

Alternatively, to run on remote nodes, use the corresponding inventory file instead, as shown below.
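
For example, with the `remote.ini` inventory from the Configuration doc:
```bash
sb deploy -f remote.ini
```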

If you are using a password instead of a private key for SSH and cannot specify the private key in the inventory,
or your private key requires a passphrase before use, you can do
```bash
sb deploy -f remote.ini --host-password [password]
```

## Run

After deployment, you can start to run the SuperBench benchmarks on all managed nodes using the `sb run` command.

```bash
sb run -f local.ini -c resnet.yaml
```
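
If you omit `--config-file`, SuperBench falls back to the default configuration and runs all benchmarks enabled there (see the Configuration doc), for example:
```bash
sb run -f local.ini
```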
@@ -0,0 +1,33 @@
---
id: introduction
---

# Introduction

## Features

__SuperBench__ is a validation and profiling tool for AI infrastructure, which supports:

* AI infrastructure validation and diagnosis
  * Distributed validation tools to validate hundreds or thousands of servers automatically
  * Consider both raw hardware and E2E model performance with ML workload patterns
  * Build a contract to identify hardware issues
  * Provide infrastructure-oriented criteria as Performance/Quality Gates for hardware and system release
  * Provide detailed performance reports and advanced analysis tools
* AI workload benchmarking and profiling
  * Provide comprehensive performance comparison between different existing hardware
  * Provide insights for hardware and software co-design

It includes micro-benchmarks for primitive computation and communication benchmarking,
and model-benchmarks to measure domain-aware end-to-end deep learning workloads.

:::note
SuperBench is in the early pre-alpha stage for open source, and not ready for the general public yet.
If you want to jump in early, you can try building the latest code yourself.
:::

## Overview

The following figure shows the capabilities provided by the SuperBench core framework and its extensions.

![SuperBench Structure](./assets/superbench_structure.png)