superbenchmark

Commit graph

Author	SHA1	Message	Date
Yifan Xiong	63cf2f1db4	Update Update.	2023-04-10 09:26:36 +08:00
Yifan Xiong	0d6e723c63	Fix num_workers to 0 in data loader Fix num_workers to 0 in data loader.	2023-04-09 17:03:21 +08:00
guoshzhao	103807090f	Monitor - Collect realtime GPU power when benchmarking. (#507 ) Description Collect realtime GPU power when benchmarking.	2023-04-07 20:26:55 +08:00
Yuting Jiang	9f18dea342	Bug - Fix bug to get metric from cmd when error happens (#506 ) Description Fix bug to get metric from cmd when error happens(cudnn-function/_time:4)	2023-04-06 11:13:29 +00:00
Yuting Jiang	14a4a44b01	Analyzer: Fix bug in python3.8 due to pandas api change (#504 ) Description Analyzer: Fix bug in python3.8 due to pandas api change. Major Revision - force check numeric only in dataframe for analysis - dataframe.append -> pd.concat - pd.ExcelWriter.save() -> pd.ExcelWriter.close()	2023-04-06 07:13:48 +00:00
Ziyue Yang	b97ddcf723	Fix wrong torch usage in communication wrapper for Distributed Inference Benchmark (#505 ) Description This commit fixes wrong `torch.empty_like` usage and missing dtype and device argument in communication wrappers.	2023-04-06 06:16:57 +00:00
Yifan Xiong	9d250cdd0c	Benchmark - Fix matrix size overflow issue in cuBLASLt GEMM (#503 ) Fix matrix size overflow issue when cast from int to size_t implicitly.	2023-04-06 13:16:08 +08:00
guoshzhao	26373edb78	Monitor - Fix the cgroup version checking logic. (#502 ) Description `grep cgroup /proc/filesystems` does not work on NDv4: the cgroup version there is v1, but the command reports v2. Instead, check for file existence to determine the cgroup version.	2023-04-03 09:22:43 +08:00
Yifan Xiong	97c9a41f14	Benchmark - Update TE FP8 model conversion (#499 ) __Description__ Update TE FP8 model conversion. __Major Revisions__ * Add 16-byte alignment comment. * Fix TE layer parameters type.	2023-03-28 15:01:41 +00:00
Yifan Xiong	c88c970943	Benchmarks - Support TE FP8 in BERT/GPT2 models (#496 ) Support Transformer Engine FP8 in existing PyTorch BERT/GPT2 models by converting linear/layernorm to TE layers.	2023-03-25 19:28:27 +08:00
Ziyue Yang	8daef211dd	Benchmarks - Add distributed inference benchmark (#493 ) Description This PR adds a micro-benchmark of distributed model inference workloads. Major Revision - Add a new micro-benchmark dist-inference. - Add corresponding example and unit tests. - Update configuration files to include this new micro-benchmark. - Update micro-benchmark README. --------- Co-authored-by: Peng Cheng <chengpeng5555@outlook.com>	2023-03-24 17:15:17 +08:00
guoshzhao	a9b45a072e	Monitor - Support cgroup V2 when read system metrics. (#491 ) Description Ubuntu 22.04 uses cgroup V2, which changes the file structure. Modify the monitor to support both cgroup v1 and v2.	2023-03-22 08:33:18 +00:00
Yifan Xiong	dbeba8056b	Benchmark - Support batch/shape range in cublaslt gemm (#494 ) Support batch and shape range with multiplication factors in cublaslt gemm benchmark.	2023-03-22 13:22:36 +08:00
rafsalas19	655bd0aa59	Adding HPL benchmark (#482 ) Description - Adding HPL benchmark --------- Co-authored-by: Ubuntu <azureuser@sbtestvm.jzlku1oskncengjiado35wf1hd.ax.internal.cloudapp.net> Co-authored-by: Peng Cheng <chengpeng5555@outlook.com>	2023-03-21 16:44:08 +00:00
Yifan Xiong	644b5395df	Benchmark - Fix torch.dist init issue with multiple models (#495 ) Fix potential barrier timeout in init_process_group due to race condition of using the same port. Change to different ports when running multiple models sequentially in one process. For example, when running vgg11/13/16/19, will use port 29501~29504 respectively.	2023-03-21 12:35:03 +00:00
Yuting Jiang	5a88db1601	Benchmarks: Support error tolerance in micro-benchmark for CuDNN function (#490 ) Description Support error tolerance in micro-benchmark for CuDNN function. Major Revision - revise micro_base to support running the remaining commands when one command fails in the micro-benchmark - set error tolerance to true in cudnn functions	2023-03-20 21:20:21 +08:00
Yifan Xiong	b808135c27	Benchmarks - Support tensor core precisions in cublaslt gemm (#492 ) Support FP64/TF32/FP16/BF16 in cublaslt (batch) GEMM.	2023-03-20 10:59:40 +08:00
dependabot[bot]	139d4df55f	Bump webpack from 5.39.1 to 5.76.1 in /website (#489 ) Bumps [webpack](https://github.com/webpack/webpack) from 5.39.1 to 5.76.1. - [Release notes](https://github.com/webpack/webpack/releases) - [Commits](webpack/webpack@v5.39.1...v5.76.1) --- updated-dependencies: - dependency-name: webpack dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2023-03-17 20:18:35 +08:00
Yifan Xiong	35f5390512	Pin setuptools version to v65.7.0 (#483 ) Pin setuptools version to [v65.7.0](https://setuptools.pypa.io/en/latest/history.html#v65-7-0) to avoid breaking changes since v66.0.0.	2023-03-06 11:43:44 +00:00
Yifan Xiong	2cc4cd03e2	Limit ansible_runner version for Python3.6 (#485 ) Limit ansible_runner version to less than 2.3.2 for Python3.6.	2023-03-06 18:54:45 +08:00
Yuting Jiang	eba298f5f0	Benchmarks: Revision - Support flexible warmup and non-random data initialization in cublas-benchmark (#479 ) Description revise cublas-benchmark for flexible warmup and fill data with fixed number for perf test to improve the running efficiency. Major Revision - remove num_in_steps for warmup to support more flexible warmup setting for users - Add support to generate input with fixed number for perf test	2023-02-28 06:35:18 +08:00
Yuting Jiang	0292366075	Benchmarks: Build Pipeline - Add support for cpu-only perftest in makefile (#480 ) Description Add support to install cpu-only perftest in makefile. Co-authored-by: Yuting Jiang <yuting.jiang@microsoft.com> Co-authored-by: Peng Cheng <chengpeng5555@outlook.com>	2023-02-24 11:19:46 +08:00
Yifan Xiong	bbb86c4a83	CI/CD - Free disk space in GitHub Action VHD (#481 ) Free more disk space in GitHub Action VHD.	2023-02-23 17:30:39 +08:00
Yuting Jiang	ec7f502c93	CI/CD - Upgrade networkx version to fix installation compatibility issue (#478 ) Description Upgrade networkx version to fix installation compatibility issue.	2023-02-17 05:36:21 +00:00
dependabot[bot]	f041b6eacc	Bump @sideway/formula from 3.0.0 to 3.0.1 in /website (#477 ) Bumps [@sideway/formula](https://github.com/sideway/formula) from 3.0.0 to 3.0.1. - [Release notes](https://github.com/sideway/formula/releases) - [Commits](hapijs/formula@v3.0.0...v3.0.1) --- updated-dependencies: - dependency-name: "@sideway/formula" dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2023-02-17 02:33:08 +00:00
dependabot[bot]	e1a489496c	Bump http-cache-semantics from 4.1.0 to 4.1.1 in /website (#474 ) Bumps [http-cache-semantics](https://github.com/kornelski/http-cache-semantics) from 4.1.0 to 4.1.1. - [Release notes](https://github.com/kornelski/http-cache-semantics/releases) - [Commits](kornelski/http-cache-semantics@v4.1.0...v4.1.1) --- updated-dependencies: - dependency-name: http-cache-semantics dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2023-02-16 14:14:47 +00:00
rafsalas19	32896ca477	Adding Stream Benchmark (#473 ) Description - Added stream benchmark - Added stream unit test - Added stream example - Modified docker files to build stream --------- Co-authored-by: Ubuntu <azureuser@sbtestvm.jzlku1oskncengjiado35wf1hd.ax.internal.cloudapp.net> Co-authored-by: Peng Cheng <chengpeng5555@outlook.com> Co-authored-by: Yifan Xiong <xiongyf@yandex.com>	2023-02-13 15:34:37 -05:00
Yuting Jiang	62a2913497	Executor - Support SuperBench Executor running on Windows (#475 ) Description Support SuperBench Executor running on Windows. Major Revision - Lazy import ansible related module	2023-02-13 08:20:07 +00:00
pnunna93	f21bfef2f3	Dockerfile: Remove fixed rccl version in rocm5.1.x docker file (#476 ) Description The commit(`e08b6d3a1c`) installs a rccl version which is causing "undefined symbol: ncclGetLastError" while trying to import torch. Revert it to avoid the error.	2023-02-07 15:24:26 +08:00
dependabot[bot]	121a5ddc5e	Bump ua-parser-js from 0.7.28 to 0.7.33 in /website (#469 ) Bumps [ua-parser-js](https://github.com/faisalman/ua-parser-js) from 0.7.28 to 0.7.33. - [Release notes](https://github.com/faisalman/ua-parser-js/releases) - [Changelog](https://github.com/faisalman/ua-parser-js/blob/master/changelog.md) - [Commits](faisalman/ua-parser-js@0.7.28...0.7.33) --- updated-dependencies: - dependency-name: ua-parser-js dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2023-01-30 10:58:24 +08:00
Yifan Xiong	b07fda155e	Release - SuperBench v0.7.0 (#468 ) Description Cherry-pick bug fixes from v0.7.0 to main. Major Revisions * Benchmarks - Fix missing include in FP8 benchmark (#460) * Fix bug in TE BERT model (#461) * Doc - Update benchmark doc (#465) * Bug: Fix bug for incorrect datatype judgement in cublas-function source code (#464) * Support `sb deploy` without pulling image (#466) * Docs - Upgrade version and release note (#467) Co-authored-by: Russell J. Hewett <russell.j.hewett@gmail.com> Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com>	2023-01-28 11:07:06 +08:00
Yuting Jiang	f380bc5eff	Bug: Fix bug for incorrect datatype judgement in cublas-function source code (#462 ) Description Fix bug for incorrect datatype judgement in cublas-function source code.	2023-01-17 10:51:57 +08:00
dependabot[bot]	65bae28c0d	Bump json5 from 1.0.1 to 1.0.2 in /website (#459 ) Bumps [json5](https://github.com/json5/json5) from 1.0.1 to 1.0.2. - [Release notes](https://github.com/json5/json5/releases) - [Changelog](https://github.com/json5/json5/blob/main/CHANGELOG.md) - [Commits](json5/json5@v1.0.1...v1.0.2) --- updated-dependencies: - dependency-name: json5 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-01-09 17:10:40 +08:00
Yang Wang	ccccd988df	Benchmarks - Support topo-aware, pair-wise, and K-batch pattern in nccl-bw benchmark (#454 ) Support traffic patterns across different devices in NCCL/RCCL test * change the metrics format if the pattern is specified	2023-01-04 12:30:32 +00:00
Yang Wang	8e748d5649	Runner - Generate host groups file in mpi mode (#458 ) Major Revision - Add an option for pattern to generate mpi_pattern.txt file if the path is specified. - In mpi pattern, serial_index and parallel_index will be added to each benchmark as environment variables. Minor Revision - Fix typo	2023-01-04 19:49:14 +08:00
Yifan Xiong	5197cdf5cb	Benchmarks - Support FP8 in BERT models (#446 ) Support FP8 in PyTorch BERT models: * add fp8 hybrid/e4m3/e5m2 in precision arguments * build BERT encoders with `te.TransformerLayer` to replace `transformers.BertModel` * wrap forward steps with fp8 autocast	2023-01-04 11:12:05 +08:00
Yang Wang	65e433c0c6	Runner: Support `topo-aware` and `k-batch` pattern in 'mpi' mode (#437 ) Description Support the following patterns in `mpi` mode: * `k-batch` * `topo-aware`	2023-01-03 10:28:35 +00:00
Yifan Xiong	fc661f7db3	Support GEMM benchmark on Hopper GPUs (#456 ) Support GEMM benchmark on Hopper GPUs.	2023-01-03 09:45:27 +00:00
Yifan Xiong	616e7a5a5a	Benchmarks - Integrate cublaslt micro-benchmark (#455 ) Integrate cublaslt-gemm micro-benchmark #451.	2023-01-03 08:54:40 +00:00
Yuting Jiang	75573f59da	Benchmarks: Micro benchmarks - Add correctness check in cublas-function benchmark (#452 ) Description Add correctness check in cublas-function benchmark. Major Revision - add python code of correctness check in cublas-function benchmark and test	2023-01-03 14:59:30 +08:00
Yifan Xiong	0591da5f49	Benchmarks - Add cuBLASLt FP16 and FP8 GEMM micro-benchmark (#451 ) Add micro-benchmark for cublaslt fp8 gemm.	2023-01-03 05:28:56 +00:00
Yuting Jiang	678b1251f1	Benchmarks: Micro benchmarks - add source code of correctness check for cublas functions (#450 ) Description Add c source code of correctness check for cublas functions. Major Revision - add correctness check for all supported cublas functions - add --correctness option into binary Minor Revision - fix bug and template fill_data and prepare_tensor to get right memory-alignment output matrix for different datatype	2023-01-03 04:20:10 +00:00
Yuting Jiang	9dfefce350	Executor - Add stdout logging util module and enable real-time logging flushing in executor (#445 ) Description Add stdout logging util module and enable real-time logging flushing in executor Major Revision - Add stdout logging util module to redirect stdout into file log - enable stdout logging in executor to write benchmark output into both stdout and file `sb-bench.log` - enable real-time log flushing in run_command of microbenchmarks through config `log_flushing` Minor Revision - add log_n_step args to enable regular step time log in model benchmarks - update related docs	2022-12-30 09:40:28 +00:00
Yang Wang	f2634d8608	Benchmarks - Support `pair-wise` pattern in IB validation benchmark (#453 ) Description * Reuse `gen_pair_wise_config` in micro-benchmark	2022-12-30 13:02:52 +08:00
Yifan Xiong	a3c65b2a57	Dockerfile - Add CUDA11.8 Docker image for Nvidia arch90 GPUs (#449 ) Add Docker image for arch90 NVIDIA GPUs: * add CUDA11.8 Dockerfile * update archs in Makefile and benchmarks accordingly * update image build pipeline	2022-12-29 12:19:38 +00:00
Yang Wang	7838b6b154	Runner - Support `pair-wise` pattern in `mpi` mode (#447 ) * Extract pair-wise pattern from ib_validation	2022-12-29 08:23:36 +00:00
dependabot[bot]	6186146d59	Bump qs and express in /website (#440 ) Bumps [qs](https://github.com/ljharb/qs) and [express](https://github.com/expressjs/express). These dependencies needed to be updated together. Updates `qs` from 6.7.0 to 6.11.0 - [Release notes](https://github.com/ljharb/qs/releases) - [Changelog](https://github.com/ljharb/qs/blob/main/CHANGELOG.md) - [Commits](https://github.com/ljharb/qs/compare/v6.7.0...v6.11.0) Updates `express` from 4.17.1 to 4.18.2 - [Release notes](https://github.com/expressjs/express/releases) - [Changelog](https://github.com/expressjs/express/blob/master/History.md) - [Commits](https://github.com/expressjs/express/compare/4.17.1...4.18.2) --- updated-dependencies: - dependency-name: qs dependency-type: indirect - dependency-name: express dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-12-28 13:47:06 +08:00
dependabot[bot]	de6deb0e2d	Bump decode-uri-component from 0.2.0 to 0.2.2 in /website (#439 ) Bumps [decode-uri-component](https://github.com/SamVerschueren/decode-uri-component) from 0.2.0 to 0.2.2. - [Release notes](https://github.com/SamVerschueren/decode-uri-component/releases) - [Commits](https://github.com/SamVerschueren/decode-uri-component/compare/v0.2.0...v0.2.2) --- updated-dependencies: - dependency-name: decode-uri-component dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-12-27 13:40:49 +08:00
Yuting Jiang	6583ba2e40	Benchmark: Revision - Add wait time option to resolve mem-bw unstable issue (#438 ) Description Add wait time option to resolve mem-bw unstable issue.	2022-12-14 17:21:02 +08:00
Yuting Jiang	1deb2eaa29	downgrade transformers version to fix tensorrt (#441 ) Description Downgrade transformers version to fix tensorrt test failure.	2022-12-14 14:19:32 +08:00
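
Commits a9b45a072e and 26373edb78 above describe choosing between cgroup v1 and v2 by checking for a file rather than parsing `/proc/filesystems`. The log does not say which file is checked, so the following is only a minimal sketch under an assumption: it relies on the kernel's cgroup v2 layout, where the unified hierarchy root contains a `cgroup.controllers` file. The function name `detect_cgroup_version` and its `cgroup_root` parameter are hypothetical and not taken from the SuperBench code.

```python
from pathlib import Path


def detect_cgroup_version(cgroup_root="/sys/fs/cgroup"):
    """Return 2 if cgroup_root looks like a cgroup v2 unified hierarchy, else 1.

    Assumption: the presence of `cgroup.controllers` at the root marks cgroup v2.
    This is an illustrative check, not necessarily the one SuperBench uses.
    """
    marker = Path(cgroup_root) / "cgroup.controllers"
    if marker.is_file():
        return 2
    # No v2 marker file found: fall back to treating the host as cgroup v1.
    return 1


if __name__ == "__main__":
    print("cgroup version:", detect_cgroup_version())
```

Taking the root as a parameter keeps the check testable against a temporary directory instead of the real `/sys/fs/cgroup`.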

376 commits (all branches)
