superbenchmark

Commit Graph

Author	SHA1	Message	Date
Yifan Xiong	f7ffc54522	CLI - Add command sb benchmark [list,list-parameters] (#279 ) __Description__ Add command `sb benchmark list` and `sb benchmark list-parameters` to support listing all optional parameters for benchmarks. <details> <summary>Examples</summary> <pre> $ sb benchmark list -n [a-z]+-bw -o table Result -------- mem-bw nccl-bw rccl-bw </pre> <pre> $ sb benchmark list-parameters -n mem-bw === mem-bw === optional arguments: --bin_dir str Specify the directory of the benchmark binary. --duration int The elapsed time of benchmark in seconds. --mem_type str [str ...] Memory types to benchmark. E.g. htod dtoh dtod. --memory str Memory argument for bandwidthtest. E.g. pinned unpinned. --run_count int The run count of benchmark. --shmoo_mode Enable shmoo mode for bandwidthtest. default values: {'bin_dir': None, 'duration': 0, 'mem_type': ['htod', 'dtoh'], 'memory': 'pinned', 'run_count': 1} </pre> </details> __Major Revisions__ * Add `sb benchmark list` to list benchmarks matching given name. * Add `sb benchmark list-parameters` to list parameters for benchmarks which match given name. __Minor Revisions__ * Sort format help text for argparse.	2022-01-18 08:40:03 +00:00
dependabot[bot]	9a909d2bed	Bump follow-redirects from 1.14.1 to 1.14.7 in /website (#282 ) Bumps [follow-redirects](https://github.com/follow-redirects/follow-redirects) from 1.14.1 to 1.14.7. - [Release notes](https://github.com/follow-redirects/follow-redirects/releases) - [Commits](https://github.com/follow-redirects/follow-redirects/compare/v1.14.1...v1.14.7) --- updated-dependencies: - dependency-name: follow-redirects dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-01-17 12:56:30 +08:00
dependabot[bot]	2538a7eedd	Bump shelljs from 0.8.4 to 0.8.5 in /website (#281 ) Bumps [shelljs](https://github.com/shelljs/shelljs) from 0.8.4 to 0.8.5. - [Release notes](https://github.com/shelljs/shelljs/releases) - [Changelog](https://github.com/shelljs/shelljs/blob/master/CHANGELOG.md) - [Commits](https://github.com/shelljs/shelljs/compare/v0.8.4...v0.8.5) --- updated-dependencies: - dependency-name: shelljs dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-01-17 10:51:54 +08:00
Yifan Xiong	ff563b66af	Release - SuperBench v0.4.0 (#278 ) __Description__ Cherry-pick bug fixes from v0.4.0 to main. __Major Revisions__ * Bug - Fix issues for Ansible and benchmarks (#267) * Tests - Refine test cases for microbenchmark (#268) * Bug - Build openmpi with ucx support in rocm dockerfiles (#269) * Benchmarks: Fix Bug - Fix fio build issue (#272) * Docs - Unify metric and add doc for cublas and cudnn functions (#271) * Monitor: Revision - Add 'monitor/' prefix to monitor metrics in result summary (#274) * Bug - Fix bug of detecting if gpu_index is none (#275) * Bug - Fix bugs in data diagnosis (#273) * Bug - Fix issue that the root mpi rank may not be the first in the hostfile (#270) * Benchmarks: Configuration - Update inference and network benchmarks in configs (#276) * Docs - Upgrade version and release note (#277) Co-authored-by: Yuting Jiang <v-yutjiang@microsoft.com>	2021-12-30 16:24:00 +08:00
Yuting Jiang	682ed06aee	Docs - Add usage for data diagnosis (#266 ) Description Add usage for data diagnosis.	2021-12-14 03:10:29 +00:00
guoshzhao	2e10fb0dcd	Docs - Update docs for monitor. (#265 ) Description Update docs for monitor.	2021-12-13 14:07:28 +00:00
Yifan Xiong	cb8a3cfb15	Benchmarks - Add transformers for TensorRT inference (#254 ) Add transformers for TensorRT inference.	2021-12-13 13:21:32 +00:00
Ziyue Yang	10012a0a47	Docs - Add benchmark metrics for cpu-memory-bw-latency (#264 ) Description Add benchmark metrics for cpu-memory-bw-latency.	2021-12-13 19:08:19 +08:00
Ziyue Yang	b6781968f2	Benchmarks: Fix Comment - Correct benchmark name in test_gpu_copy_bw_performance.py #263 Description Benchmarks: Fix Comment - Correct benchmark name in test_gpu_copy_bw_performance.py.	2021-12-13 07:02:39 +00:00
Hossein Pourreza	b590409e0f	Benchmarks: Add Benchmark - Add mlc benchmark to superbench (#216 ) Description Add mlc memory bandwidth and latency micro benchmark to Superbench. Major Revision - Add mlc benchmark with test and example files	2021-12-13 13:47:42 +08:00
yangpanMS	c403b1ca76	Docs - Add a small note for using release container version (#262 ) Description Minor doc change to highlight sb CLI version is independent of the sb container version.	2021-12-13 03:48:11 +00:00
guoshzhao	4d85630abb	Benchmarks: Add Benchmark - Add ONNXRuntime inference benchmark based on ORT python API (#245 ) Description Add ONNXRuntime inference benchmark based on ORT python API. Major Revision - Add `ORTInferenceBenchmark` class to export pytorch model to onnx model and do inference - Add tests and example for `ort-inference` benchmark - Update the introduction docs.	2021-12-10 13:53:11 +00:00
Yuting Jiang	c2f942cb6f	Analyzer: Add Feature - Add basic analysis features (#248 ) Description Add basic analysis features. Major Revision - Add statistics, correlations of the raw data - Add numeric outlier detection(inter_quartile_range) - Add boxplot for selected metric	2021-12-10 11:01:59 +00:00
guoshzhao	6e357fb9d2	Monitor: Integration - Integrate monitor into Superbench (#259 ) Description Integrate monitor into Superbench. Major Revision - Initialize, start and stop monitor in SB executor. - Parse the monitor data in SB runner and merge into benchmark results. - Specify ReduceType for monitor metrics, such as MAX, MIN and LAST. - Add monitor configs into config file.	2021-12-10 09:33:13 +00:00
guoshzhao	afea9913ae	Benchmarks: Fix Bug - Set reduce_op type for metric return_code (#261 ) Description Set the `reduce_op` type for metric `return_code` as `None`.	2021-12-10 16:02:29 +08:00
Yuting Jiang	ed2f3c3c82	CLI - Integrate data diagnosis (#260 ) Description Add cli to integrate data diagnosis module.	2021-12-10 06:11:00 +00:00
Yuting Jiang	9f56b2198f	Benchmarks: Unify metric names of benchmarks (#252 ) Description Unify metric names of benchmarks.	2021-12-09 04:48:42 +00:00
Yuting Jiang	c13ed2a297	Analyzer: Initialization - Add baseline-based data diagnosis module (#242 ) Description Add data diagnosis module. Major Revision - Add DataDiagnosis class to support rule-based data diagnosis for result summary jsonl file of multi nodes - Add RuleOp class to define rule operators	2021-12-08 18:22:00 +08:00
Yifan Xiong	213ab14bea	Bug - Fix issues for distributed runs (#258 ) Fix issues for distributed runs: * fix config for memory bandwidth benchmarks * add throttling for high concurrency docker pull * update rsync path and exclude directories * handle exceptions when creating summary * tune for logging	2021-12-08 06:55:13 +00:00
guoshzhao	44f0270ec4	Benchmarks: Add Feature - Add return_code metric into result (#256 ) Description Add return_code metric into result and revise unit tests.	2021-12-07 07:32:37 +00:00
Yuting Jiang	655f238dbb	Docs - Add doc for data diagnosis (#249 ) Description Add doc for data diagnosis, including input, output and baseline file schema.	2021-12-06 02:49:38 +00:00
Yifan Xiong	bd8f105d2e	Benchmarks - Add config file for NDm A100 v4 (#255 ) Add config file for Azure NDm A100 v4 SKU.	2021-12-04 01:17:23 +08:00
guoshzhao	8042fa34cf	Benchmarks: Configuration - Add gpt-small into config files. (#253 ) Description Add gpt-small into config files.	2021-12-02 11:12:55 +00:00
guoshzhao	371fd61cea	Benchmarks: Add Feature - Add 'ignore_invalid' option when registering benchmarks. (#247 ) Description If `ignore_invalid` is True and 'required' arguments are not set when registering the benchmark, the arguments should be provided by the user in the config, and argument checking is skipped.	2021-12-02 10:26:56 +00:00
Yifan Xiong	b4ea97bfa4	Benchmark: Replace `-c` argument with `-N` for `numactl` in Configuration (#250 ) Description Replace `-c` argument with `-N` for `numactl` since the old `-c`/`--cpubind` argument is deprecated.	2021-12-02 09:27:03 +00:00
Ziyue Yang	b0e759f599	Benchmarks: Build Pipeline - Upgrade FIO benchmark tool (#251 ) Description Upgrade FIO benchmark tool from 3.27 to 3.28.	2021-12-01 20:33:09 +08:00
Yuting Jiang	978e88efdd	Docs: Update ib validation microbenchmark metrics (#246 ) Description Update ib validation microbenchmark metrics.	2021-11-30 12:58:34 +00:00
dependabot[bot]	53e47fb231	Bump algoliasearch-helper from 3.5.5 to 3.6.2 in /website (#243 ) Bumps [algoliasearch-helper](https://github.com/algolia/algoliasearch-helper-js) from 3.5.5 to 3.6.2. - [Release notes](https://github.com/algolia/algoliasearch-helper-js/releases) - [Changelog](https://github.com/algolia/algoliasearch-helper-js/blob/develop/CHANGELOG) - [Commits](https://github.com/algolia/algoliasearch-helper-js/compare/3.5.5...3.6.2) --- updated-dependencies: - dependency-name: algoliasearch-helper dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2021-11-29 06:43:39 +00:00
Ziyue Yang	d89fcd4f93	Docs: Update gpu-copy benchmark metrics (#241 ) Description Update gpu-copy benchmark metrics.	2021-11-26 03:26:24 +00:00
Kaiyu Xie	6d85b03a6e	Benchmarks: Fix Typo - Fix typo in description of kernel launch overhead example (#244 ) Description Fix typo in description of kernel_launch_overhead.py	2021-11-25 06:28:41 +00:00
guoshzhao	4074f12c1c	Monitor: Initialization - Add Monitor and MonitorRecord class (#240 ) Description Add the initial version of Monitor. Major Revision - Add `Monitor` class to launch background process for monitoring. - Add `MonitorRecord` class to save the data from a single capture.	2021-11-18 15:54:18 +08:00
guoshzhao	cc70f9c18c	Benchmarks: Add Feature - Extend the device manager utility to support more functions. (#239 ) Description Rename `nvidia_helper` utility as `device_manager` module and support more functions: ``` device_manager.get_device_count() device_manager.get_device_utilization(idx) device_manager.get_device_temperature(idx) device_manager.get_device_power_limit(idx) device_manager.get_device_memory(idx) device_manager.get_device_row_remapped_info(idx) device_manager.get_device_ecc_error(idx) ```	2021-11-15 14:24:04 +08:00
Yifan Xiong	8a00c8a03b	Benchmarks - Add TensorRT inference benchmark (#236 ) __Description__ Add TensorRT inference benchmark for torchvision models. __Major Revision__ - Measure TensorRT inference performance.	2021-11-12 15:27:16 +08:00
Yuting Jiang	b913e1f668	Docs: Update docs to add network benchmarks for tcp and gpcnet (#238 ) Description Update docs to add network benchmarks for tcp and gpcnet.	2021-11-10 12:35:53 +00:00
Yuting Jiang	54919424c3	Benchmarks: Add Benchmark - Add ib traffic validation distributed benchmark (#215 ) Description Add ib traffic validation distributed benchmark. Major Revision - Add ib traffic validation distributed benchmark, example and test	2021-11-10 01:18:41 +08:00
guoshzhao	f15fdf7295	Docs: Update docs to add ORT AMD benchmarks based on docker (#237 ) Update docs to add ORT AMD benchmarks based on docker.	2021-11-09 14:28:42 +08:00
Ziyue Yang	008e0fe1d8	Benchmarks: Add Feature - Add CPU-initiated copy and dtod support to gpu-sm-copy benchmark (#230 ) Description This commit does the following: 1) Adds CPU-initiated copy benchmark; 2) Adds dtod benchmark; 3) Support scanning NUMA nodes and GPUs inside the benchmark program; 4) Change the name of gpu-sm-copy to gpu-copy.	2021-10-30 11:19:09 +08:00
Ziyue Yang	6a068e2534	Dockerfiles: Fix Bug - Fix ROCm GPG file URL (#234 ) Description This commit fixes the URL of ROCm GPG file.	2021-10-30 07:26:52 +08:00
Yifan Xiong	976803f8ea	Docs - Add introduction and metrics in benchmarks docs (#233 ) Add introduction and metrics for micro-benchmarks and model-benchmarks document.	2021-10-27 13:23:11 +00:00
guoshzhao	e98a68124e	Benchmarks: Add Benchmark - Add onnx model benchmarks based on docker image. (#227 ) Add RocmOnnxModelBenchmark class to run benchmarks packaged in superbench/benchmark:rocm4.3.1-onnxruntime1.9.0	2021-10-27 18:41:40 +08:00
Yuting Jiang	6003f2c2a2	Benchmarks: Add Benchmark - Add gpcnet microbenchmark (#229 ) Description Add gpcnet microbenchmark Major Revision - add 2 microbenchmarks for gpcnet: gpc-network-test and gpc-network-load-test - add related test and example file	2021-10-22 08:40:01 +00:00
guoshzhao	f841c8f466	Benchmarks: Add Feature - Support AMD and CUDA platform for DockerBenchmark. (#226 ) Description Add CudaDockerBenchmark and RocmDockerBenchmark to support amd and cuda platform for DockerBenchmark.	2021-10-22 15:22:15 +08:00
Yuting Jiang	b592a7c793	Benchmarks: Build Pipeline - Add gpcnet as git submodule and building logic (#228 ) Description Add gpcnet as git submodule and building logic. Major Revision - add gpcnet as a submodule - add build logic in third_party/Makefile	2021-10-21 11:28:51 +00:00
guoshzhao	455ad1f873	revise the term onnx to onnxruntime. (#232 ) Description Revise all occurrences of the term `onnx` to `onnxruntime`.	2021-10-21 04:29:27 +00:00
Yuting Jiang	2664850abd	Benchmarks: Add Benchmark - Add ib validation tool source code (#191 ) Description Add IB validation tool source code. The IB validation tool flexibly validates IB traffic of different patterns across multiple nodes. Major Revision - Add ib validation tool source code - Add cmake file to build the source code	2021-10-21 03:14:43 +00:00
Yifan Xiong	a030d7150b	CI/CD - Upgrade to latest agent image in pipeline (#231 ) Upgrade to latest agent image in pipeline, fix "Ubuntu16 image does not exist" issue.	2021-10-21 09:20:13 +08:00
Yuting Jiang	49cc8f9a8c	Benchmarks: Add Benchmark - Add tcp connectivity validation microbenchmark (#217 ) Description Add tcp connectivity validation microbenchmark which is to validate TCP connectivity between current node and several nodes in the hostfile. Major Revision - Add tcp connectivity validation microbenchmark and related test, example	2021-10-12 23:42:12 +00:00
Yifan Xiong	3d0fde1292	Docs - Refine document structure (#225 ) __Major Revisions__ * Refine document structure for user tutorial. __Minor Revisions__ * Add AMD part in installation. * Change default config file to latest link.	2021-10-12 18:44:42 +08:00
Yifan Xiong	5283bdebe8	CI/CD - Disable version update, allow security update only (#224 ) Disable dependabot version update, allow security update only. Reference: https://docs.github.com/en/code-security/supply-chain-security/keeping-your-dependencies-updated-automatically/configuration-options-for-dependency-updates#open-pull-requests-limit.	2021-10-12 16:08:33 +08:00
Yifan Xiong	849b6caca2	CI/CD - Add code security scanning (#206 ) Add code security scanning. __Major Revisions__ * enable dependabot auto updates * scan code with CodeQL	2021-10-11 14:38:58 +08:00


432 commits
