**Description**
Cherry-pick bug fixes from v0.8.0 to main.
**Major Revisions**
* Monitor - Fix the cgroup version checking logic (#502)
* Benchmark - Fix matrix size overflow issue in cuBLASLt GEMM (#503)
* Fix incorrect torch usage in communication wrapper for Distributed Inference Benchmark (#505)
* Analyzer - Fix bug in Python 3.8 due to pandas API change (#504)
* Bug - Fix metric retrieval from command output when an error happens (#506)
* Monitor - Collect real-time GPU power when benchmarking (#507)
* Add num_workers argument in model benchmark (#511)
* Remove unreachable condition when writing host list (#512)
* Update cuda11.8 image to cuda12.1 based on nvcr23.03 (#513)
* Doc - Fix wrong unit of cpu-memory-bw-latency in doc (#515)
* Docs - Upgrade version and release note (#508)
Co-authored-by: guoshzhao <guzhao@microsoft.com>
Co-authored-by: Ziyue Yang <ziyyang@microsoft.com>
Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com>
**Description**
This PR adds a micro-benchmark of distributed model inference workloads.
**Major Revision**
- Add a new micro-benchmark dist-inference.
- Add corresponding example and unit tests.
- Update configuration files to include this new micro-benchmark.
- Update micro-benchmark README.
---------
Co-authored-by: Peng Cheng <chengpeng5555@outlook.com>
Fix potential barrier timeout in init_process_group caused by a race
condition when reusing the same port. Change to different ports when running
multiple models sequentially in one process.
For example, when running vgg11/13/16/19, ports 29501-29504 will be used
respectively.
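A minimal sketch of the idea, assuming a single-node TCP rendezvous; the helper name and `base_port` are illustrative, not the benchmark's actual code:

```python
import torch.distributed as dist

def init_for_model(model_index, rank, world_size, base_port=29500):
    """Hypothetical helper: pick a distinct rendezvous port per model so that
    sequential runs in one process never race on a port that the previous
    model's process group may not have released yet."""
    port = base_port + 1 + model_index  # vgg11 -> 29501, vgg13 -> 29502, ...
    dist.init_process_group(
        backend='nccl',
        init_method=f'tcp://127.0.0.1:{port}',
        rank=rank,
        world_size=world_size,
    )
```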
**Description**
Support error tolerance in the cuDNN function micro-benchmark
**Major Revision**
- Revise micro_base so that the remaining commands keep running when one
command fails in the micro-benchmark (see the sketch after this list)
- Enable error tolerance by default in the cuDNN functions benchmark
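A minimal sketch of the "continue on failure" behavior described above; the function name and `error_tolerance` flag are illustrative, not the framework's actual attributes:

```python
import subprocess

def run_commands(commands, error_tolerance=True):
    """Run each benchmark command; with error tolerance enabled, a failing
    command is recorded but does not stop the remaining commands."""
    failures = []
    for cmd in commands:
        result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
        if result.returncode != 0:
            failures.append((cmd, result.returncode))
            if not error_tolerance:
                break
    return failures
```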
**Description**
Revise cublas-benchmark to support flexible warm-up and to fill input data
with a fixed number for performance tests, improving running efficiency.
**Major Revision**
- Remove num_in_steps from warm-up to give users a more flexible warm-up
setting
- Add support for generating input filled with a fixed number for performance
tests (see the sketch after this list)
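A minimal sketch of the idea only, not the benchmark's actual C++ code: filling inputs with a constant removes RNG cost from setup when only timing matters, while random data is kept for correctness runs. The function name and dtype are assumptions.

```python
import numpy as np

def make_input(shape, perf_mode=True, fill_value=1.0):
    """For performance runs, fill the input with a constant instead of
    generating random values; for correctness runs, keep random data so
    the results are meaningful."""
    if perf_mode:
        return np.full(shape, fill_value, dtype=np.float16)
    return np.random.randn(*shape).astype(np.float16)
```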
**Description**
The commit (e08b6d3a1c) installs an RCCL
version that causes an "undefined symbol: ncclGetLastError" error while
trying to import torch. Revert it to avoid the error.
**Description**
Cherry-pick bug fixes from v0.7.0 to main.
**Major Revisions**
* Benchmarks - Fix missing include in FP8 benchmark (#460)
* Fix bug in TE BERT model (#461)
* Doc - Update benchmark doc (#465)
* Bug - Fix incorrect data type detection in cublas-function source code (#464)
* Support `sb deploy` without pulling image (#466)
* Docs - Upgrade version and release note (#467)
Co-authored-by: Russell J. Hewett <russell.j.hewett@gmail.com>
Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com>
**Major Revision**
- Add an option for pattern mode to generate an mpi_pattern.txt file when a
path is specified.
- In MPI pattern mode, serial_index and parallel_index are added to each
benchmark as environment variables (see the sketch below).
**Minor Revision**
- Fix typo
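A minimal sketch of how a benchmark could read these indices; the environment variable names here are hypothetical, since the PR only states that serial_index and parallel_index are exposed to each benchmark:

```python
import os

# Hypothetical variable names; only serial_index and parallel_index
# themselves are stated in the PR description.
serial_index = int(os.environ.get('SB_MODE_SERIAL_INDEX', 0))
parallel_index = int(os.environ.get('SB_MODE_PARALLEL_INDEX', 0))
print(f'running pattern slot: serial={serial_index}, parallel={parallel_index}')
```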
Support FP8 in PyTorch BERT models:
* add fp8 hybrid/e4m3/e5m2 in precision arguments
* build BERT encoders with `te.TransformerLayer` to replace
`transformers.BertModel`
* wrap forward steps with fp8 autocast
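A minimal sketch of these three points using NVIDIA Transformer Engine; the layer sizes and input shape are illustrative assumptions, not the benchmark's actual BERT configuration:

```python
import torch
import transformer_engine.pytorch as te
from transformer_engine.common.recipe import DelayedScaling, Format

# Illustrative sizes only; the real benchmark takes them from the BERT config.
hidden_size, ffn_hidden_size, num_heads, num_layers = 1024, 4096, 16, 24

# Build the encoder stack from te.TransformerLayer instead of transformers.BertModel.
encoders = torch.nn.ModuleList(
    te.TransformerLayer(hidden_size, ffn_hidden_size, num_heads)
    for _ in range(num_layers)
).cuda()

# "hybrid" recipe: E4M3 for the forward pass, E5M2 for the backward pass.
fp8_recipe = DelayedScaling(fp8_format=Format.HYBRID)

x = torch.randn(512, 16, hidden_size, device='cuda')  # (seq, batch, hidden)
with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    for layer in encoders:
        x = layer(x)
```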
**Description**
Add correctness check in cublas-function benchmark.
**Major Revision**
- Add Python code for the correctness check in the cublas-function benchmark,
along with tests (a sketch follows below)
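A minimal sketch of what such a check could look like for a GEMM call, assuming host copies of the matrices; the function name and tolerances are assumptions, not the benchmark's actual code:

```python
import numpy as np

def check_gemm_correctness(a, b, result, rtol=1e-3, atol=1e-3):
    """Compare the benchmark's GEMM output against a numpy reference.

    `a`, `b`, and `result` are host copies of the input and output matrices;
    returns True when every element is within the given tolerances.
    """
    expected = a.astype(np.float32) @ b.astype(np.float32)
    return np.allclose(result.astype(np.float32), expected, rtol=rtol, atol=atol)
```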
**Description**
Add c source code of correctness check for cublas functions.
**Major Revision**
- Add correctness check for all supported cublas functions
- Add --correctness option to the binary
**Minor Revision**
- Fix a bug, and template fill_data and prepare_tensor to get correctly
memory-aligned output matrices for different data types
**Description**
Add stdout logging util module and enable real-time log flushing in executor
**Major Revision**
- Add a stdout logging util module to redirect stdout into the file log
- Enable stdout logging in the executor to write benchmark output to both
stdout and the file `sb-bench.log` (see the sketch below)
- Enable real-time log flushing in run_command of micro-benchmarks through
the `log_flushing` config
**Minor Revision**
- Add log_n_step args to enable regular step-time logging in model benchmarks
- Update related docs
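A minimal sketch of the redirection idea; the class name and behavior are illustrative, not SuperBench's actual logging util:

```python
import sys

class StdoutTee:
    """Everything written to stdout is also appended to a log file and
    flushed immediately for real-time output (illustrative only)."""

    def __init__(self, log_path):
        self._stdout = sys.stdout
        self._log = open(log_path, 'a')

    def write(self, message):
        self._stdout.write(message)
        self._log.write(message)
        self.flush()  # real-time flushing, analogous to the log_flushing config

    def flush(self):
        self._stdout.flush()
        self._log.flush()

sys.stdout = StdoutTee('sb-bench.log')
print('benchmark output goes to both the console and sb-bench.log')
```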
Add a non-zero return code for the `sb deploy` and `sb run` commands when
there are Ansible failures in the control plane.
The return code is set to the count of failures.
For failures caused by benchmarks, the return code is still set per benchmark
in the results JSON file.
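A minimal sketch of the exit behavior described above; the helper name is hypothetical, not the actual `sb` CLI code:

```python
import sys

def finalize_cli(ansible_failures):
    """Illustrative only: exit `sb deploy` / `sb run` with a return code equal
    to the number of Ansible failures; zero means the control plane succeeded.
    Benchmark-level failures are still reported per benchmark in the results
    JSON, not through this code path."""
    sys.exit(len(ansible_failures))
```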
Update the version to include the revision hash and date in the "{last
tag}+g{git hash}.d{date}" format; here are the examples:
* exact tag: 0.6.0
* commit after tag: 0.6.0+gcbb1b34
* commit after tag with local changes: 0.6.0+gcbb1b34.d20221028
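A minimal sketch that composes the format and reproduces the examples above; the helper name and arguments are illustrative, not the project's actual versioning code:

```python
from datetime import date

def build_version(last_tag, git_hash=None, dirty=False):
    """Compose "{last tag}+g{git hash}.d{date}" as described above."""
    version = last_tag
    if git_hash:
        version += f'+g{git_hash}'
    if dirty:
        version += f'.d{date.today():%Y%m%d}'
    return version

# build_version('0.6.0')                        -> '0.6.0'
# build_version('0.6.0', 'cbb1b34')             -> '0.6.0+gcbb1b34'
# build_version('0.6.0', 'cbb1b34', dirty=True) -> '0.6.0+gcbb1b34.d20221028' (on 2022-10-28)
```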