superbenchmark

Commit graph

Author	SHA1	Message	Date
pdr	479491279e	Dockerfile - Add support for arm64 build (#660 ) Add support for arm64 build: - Updated dockerfile for arm64 build - extend cpu stream compilation for neoverse - handle onnxruntime-gpu installation - third party builds filtering based on arch - disable cuda decode perf build for non x86	2024-11-06 23:16:12 +00:00
Yuting Jiang	2101e933cc	CI/CD - Fix MSCCL build error in CUDA12.4 docker build pipeline (#633 ) Description Fix MSCCL build error in CUDA12.4 docker build pipeline due to OOM issue.	2024-07-28 23:43:06 +00:00
Yuting Jiang	e304cf1572	Benchmarks: Micro benchmarks - add support for NVIDIA L4/L40/L40s GPUs in gemm-flops (#634 ) Description Add support GPU ARCH 8.9 for NVIDIA L4/L40/L40s GPUs in gemm-flops.	2024-07-26 02:42:17 +00:00
Yifan Xiong	2c88db907f	Release - SuperBench v0.10.0 (#607 ) Description Cherry-pick bug fixes from v0.10.0 to main. Major Revisions * Benchmarks: Microbenchmark - Support different hipblasLt data types in dist_inference #590 * Benchmarks: Microbenchmark - Support in-place for NCCL/RCCL benchmark #591 * Bug Fix - Fix NUMA Domains Swap Issue in NDv4 Topology File #592 * Benchmarks: Microbenchmark - Add data type option for NCCL and RCCL tests #595 * Benchmarks: Bug Fix - Make metrics of dist-inference-cpp aligned with PyTorch version #596 * CI/CD - Add ndv5 topo file #597 * Benchmarks: Microbenchmark - Improve AMD GPU P2P performance with fine-grained GPU memory #593 * Benchmarks: Build Pipeline - fix nccl and nccl test version to 2.18.3 to resolve hang issue in cuda12.2 docker #599 * Dockerfile - Bug fix for rocm docker build and deploy #598 * Benchmarks: Microbenchmark - Adapt to hipblasLt data type changes #603 * Benchmarks: Micro benchmarks - Update hipblaslt metric unit to tflops #604 * Monitor - Upgrade pyrsmi to amdsmi python library. #601 * Benchmarks: Micro benchmarks - add fp8 and initialization for hipblaslt benchmark #605 * Dockerfile - Add rocm6.0 dockerfile #602 * Bug Fix - Bug fix for latest megatron-lm benchmark #600 * Docs - Upgrade version and release note #606 Co-authored-by: Ziyue Yang <ziyyang@microsoft.com> Co-authored-by: Yang Wang <yangwang1@microsoft.com> Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com> Co-authored-by: guoshzhao <guzhao@microsoft.com>	2024-01-08 05:40:52 +00:00
Yuting Jiang	1f5031bd74	Dockerfile - Upgrade to rocm5.7 dockerfile (#587 ) Description upgrade to rocm5.7 dockerfile. --------- Co-authored-by: yukirora <yuting.jiang@microsoft.com>	2023-12-09 17:41:12 +00:00
Ziyue Yang	6ef3a0110f	Benchmarks: Add MSCCL Support for Nvidia GPU (#584 ) Description Add MSCCL support for Nvidia GPU	2023-12-07 19:57:28 +08:00
Yuting Jiang	dd5a6329ed	Benchmarks: Add benchmark: Megatron-LM/Megatron-Deepspeed GPT pretrain benchmark (#582 ) Description Megatron-LM/Megatron-Deepspeed GPT pretrain benchmark	2023-12-07 09:37:09 +08:00
Yuting Jiang	79089b6517	Benchmarks: Micro benchmark - Add hipBLASLt function benchmark (#576 ) Description hipblaslt function benchmark and rebase cublaslt function benchmark.	2023-11-22 19:48:10 +08:00
Yuting Jiang	27a10811af	Benchmarks: micro benchmark - source code for evaluating NVDEC decoding performance (#560 ) Description source code for evaluating NVDEC decoding performance. --------- Co-authored-by: yukirora <yuting.jiang@microsoft.com>	2023-08-22 10:56:33 +00:00
Yuting Jiang	e1df877bfe	Release - SuperBench v0.9.0 (#558 ) Description Cherry-pick bug fixes from v0.9.0 to main. Major Revision - CI/CD: pipeline - clean more disk space to fix rocm building image pipeline(#555 ) - Benchmarks: bug fix - use absolute path for input file in DirectXEncodingLatency(#554) - CI/CD - add push win docker image on release branch in pipeline (#552) - Docs - Upgrade version and release note(#557)	2023-07-27 10:42:31 +08:00
Yuting Jiang	865472177f	Benchmarks: Build Pipeline - add AMF in third party and build AMF encoding latency test (#543 ) Description add AMF in third party and build AMF encoding latency test.	2023-07-03 14:43:21 +00:00
rafsalas19	655bd0aa59	Adding HPL benchmark (#482 ) Description - Adding HPL benchmark --------- Co-authored-by: Ubuntu <azureuser@sbtestvm.jzlku1oskncengjiado35wf1hd.ax.internal.cloudapp.net> Co-authored-by: Peng Cheng <chengpeng5555@outlook.com>	2023-03-21 16:44:08 +00:00
Yuting Jiang	0292366075	Benchmarks: Build Pipeline - Add support for cpu-only perftest in makefile (#480 ) Description Add support to install cpu-only perftest in makefile. Co-authored-by: Yuting Jiang <yuting.jiang@microsoft.com> Co-authored-by: Peng Cheng <chengpeng5555@outlook.com>	2023-02-24 11:19:46 +08:00
rafsalas19	32896ca477	Adding Stream Benchmark (#473 ) Description - Added stream benchmark - Added stream unit test - Added stream example - Modified docker files to build stream --------- Co-authored-by: Ubuntu <azureuser@sbtestvm.jzlku1oskncengjiado35wf1hd.ax.internal.cloudapp.net> Co-authored-by: Peng Cheng <chengpeng5555@outlook.com> Co-authored-by: Yifan Xiong <xiongyf@yandex.com>	2023-02-13 15:34:37 -05:00
Yifan Xiong	a3c65b2a57	Dockerfile - Add CUDA11.8 Docker image for Nvidia arch90 GPUs (#449 ) Add Docker image for arch90 NVIDIA GPUs: * add CUDA11.8 Dockerfile * update archs in Makefile and benchmarks accordingly * update image build pipeline	2022-12-29 12:19:38 +00:00
Yuting Jiang	e335556d7a	Benchmarks: Build Pipeline - Degrade perftest submodule to fix stability issue (#386 ) Description Degrade perftest submodule to v4.4-0.37 to fix stability issue. Issue: rdma-loopback is not stable on public version(v0.5/v0.6-rc1) Docker Version: v0.6-rc1-cuda11.1 Testbed: 8 A100 40GB GPUs (1 NDv4 node) Result: New perftest version introduce the variance, max-min/mean = 2% for v4.4-0.37, 8% for v4.5-0.2	2022-08-16 11:49:10 +08:00
Yifan Xiong	9f03d5687a	Update dependencies and Dockerfile (#371 ) Update dependencies and Dockerfile: * upgrade nccl-tests and rccl-tests to current latest version to match NCCL/RCCL versions * unify image tag names on DockerHub * remove verbose output in Dockerfile and minor fix some flags	2022-07-06 10:31:41 +00:00
Yifan Xiong	483bf782e1	Update ROCm Dockerfile (#361 ) Description Update ROCm Dockerfile. Major Revisions - Add dockerfile for ROCm 5.1.3 - Merge 5.1.x and 5.0.x dockerfile - Remove 4.2 and 4.0 legacy - Update build pipeline accordingly	2022-06-19 17:26:39 +08:00
Yifan Xiong	60a3c74306	Fix cmake and build issues (#360 ) Description Fix cmake and build issues. Major Revision * Remove unnecessary boost build * Remove user-agent for mlc * Remove -j for third party to build each project in sequence * Fix ansible collections installation path	2022-06-15 13:07:57 +08:00
rafsalas19	ff51a3cee9	Benchmarks: Add Feature - Add GPU-Burn as microbenchmark (#324 ) Description Modifications adding GPU-Burn to SuperBench. - added third party submodule - modified Makefile to make gpu-burn binary - added/modified microbenchmarks to add gpu-burn python scripts - modified default and azure_ndv4 configs to add gpu-burn	2022-03-16 16:20:11 +08:00
Yuting Jiang	4f5027dbda	Benchmarks: Build Pipeline - Make gpcnet only for cuda (#316 ) Description Make gpcnet only for cuda.	2022-02-24 18:18:49 +08:00
Yuting Jiang	4abda6f5d4	Benchmarks: Build Pipeline - Update rccl-tests submodule to fix divide by zero error (#306 ) Description Update rccl-tests submodule to fix divide by zero error.	2022-02-09 14:46:29 +00:00
Yifan Xiong	3419447c11	Benchmarks - Support T4 and A10 in GEMM benchmark (#294 ) Support T4 and A10 in GEMM benchmark.	2022-01-29 13:26:00 +00:00
Yifan Xiong	ff563b66af	Release - SuperBench v0.4.0 (#278 ) __Description__ Cherry-pick bug fixes from v0.4.0 to main. __Major Revisions__ * Bug - Fix issues for Ansible and benchmarks (#267) * Tests - Refine test cases for microbenchmark (#268) * Bug - Build openmpi with ucx support in rocm dockerfiles (#269) * Benchmarks: Fix Bug - Fix fio build issue (#272) * Docs - Unify metric and add doc for cublas and cudnn functions (#271) * Monitor: Revision - Add 'monitor/' prefix to monitor metrics in result summary (#274) * Bug - Fix bug of detecting if gpu_index is none (#275) * Bug - Fix bugs in data diagnosis (#273) * Bug - Fix issue that the root mpi rank may not be the first in the hostfile (#270) * Benchmarks: Configuration - Update inference and network benchmarks in configs (#276) * Docs - Upgrade version and release note (#277) Co-authored-by: Yuting Jiang <v-yutjiang@microsoft.com>	2021-12-30 16:24:00 +08:00
Ziyue Yang	b0e759f599	Benchmarks: Build Pipeline - Upgrade FIO benchmark tool (#251 ) Description Upgrade FIO benchmark tool from 3.27 to 3.28.	2021-12-01 20:33:09 +08:00
Yuting Jiang	b592a7c793	Benchmarks: Build Pipeline - Add gpcnet as git submodule and building logic (#228 ) Description Add gpcnet as git submodule and building logic. Major Revision - add gpcnet as a submodule - add build logic in third_party/Makefile	2021-10-21 11:28:51 +00:00
Yifan Xiong	dfbd70b129	Release - SuperBench v0.3.0 (#212 ) Description Cherry-pick bug fixes from v0.3.0 to main. Major Revisions * Docs - Upgrade version and release note (#209) * Benchmarks: Build Pipeline - Update rccl-test git submodule to dc1ad48 (#210) * Benchmarks: Update - Update benchmarks in configuration file (#208) * CI/CD - Update GitHub Action VM (#211) * Benchmarks: Fix Bug - Fix wrong parameters for gpu-sm-copy-bw in configuration examples (#203) * CI/CD - Fix bug in build image for push event (#205) * Benchmark: Fix Bug - fix error message of communication-computation-overlap (#204) * Tool: Fix bug - Fix function naming issue in system info (#200) * CI/CD - Push images in GitHub Action (#202) * Bug - Fix torch.distributed command for single node (#201) * CLI - Integrate system info for node (#199) * Benchmarks: Code Revision - Revise CMake files for microbenchmarks. (#196) * CI/CD - Add ROCm image build in GitHub Actions (#194) * Bug: Fix bug - fix bug of hipBusBandwidth build (#193) * Benchmarks: Build Pipeline - Restore rocblas build logic (#197) * Bug: Fix Bug - Add barrier before 'destroy_process_group' in model benchmarks (#198) * Bug - Revise 'docker run' in sb deploy (#195) * Bug - Fix Bug : fix bug of error param operations to operation in rccl-bw of hpe config (#190) Co-authored-by: Yuting Jiang <v-yujiang@microsoft.com> Co-authored-by: Guoshuai Zhao <guzhao@microsoft.com> Co-authored-by: Ziyue Yang <ziyyang@microsoft.com>	2021-09-26 09:30:31 +08:00
Yuting Jiang	b90b47f3c1	Benchmarks: Build Pipeline - Support rocblas building in rocm4.0_ubuntu18.04_py3.6_pytorch_1.7.0 docker (#172 ) Description Revise rocblas building logic in third_party/makefile to support rocblas building in rocm4.0_ubuntu18.04_py3.6_pytorch_1.7.0 docker. Major Revision - add extra building logic including env about pthread limit and build command restrict to reduce amount of resource used Minor Revision - make rocm_version to be able to modify	2021-09-01 06:21:02 +08:00
Yuting Jiang	a1e5c90d43	Benchmarks: Build Pipeline - Add build logic of hipBusBandwidth in third_party (#151 ) Description Add build logic of hipBusBandwidth in third_party. Major Revision - Add build logic of hipBusBandwidth in third_party	2021-08-20 14:32:14 +08:00
Yuting Jiang	86c390a912	Benchmarks: Build Pipeline - Add rocBLAS building logic in third_party (#144 ) Description Add rocBLAS building logic in third_party. Major Revision - Add rocm_rocblas target in third_party/Makefile. - Add rocblas building logic	2021-08-02 17:30:02 +08:00
Yuting Jiang	a532eee414	Benchmarks: Build Pipeline - add rccl-tests as a submodule with building logic (#139 ) Description Support rocm in third_party/makefile and add rccl-tests as a submodule with building logic. Major Revision - Support rocm in third_party/makefile - Add rccl-tests as a submodule - Add build logic in third_party/Makefile for rccl-tests	2021-07-30 07:27:08 +08:00
Yuting Jiang	c88ce05611	Benchmarks: Build Pipeline - Support rocm in third_party/makefile (#140 ) Description Support rocm in third_party/makefile. Major Revision - Split rocm and cuda target in makefile - Add target in dockerfile	2021-07-29 14:18:36 +08:00
Ziyue Yang	4bbd7f513d	Benchmarks: Build Pipeline - Add FIO benchmark tool (#127 ) Description Add FIO benchmark tool into third-party dependency. Major Revision - Add FIO submodule into third-party directory and modify Makefile to enable it.	2021-07-19 12:59:28 +08:00
Yuting Jiang	419dea265a	Benchmarks: Build Pipeline - Add perftest as a submodule and add build logic (#129 ) Add perftest as a submodule and add build logic	2021-07-16 16:11:07 +08:00
Yuting Jiang	8c8beb4b04	Benchmarks: Build Pipeline - Add nccl-tests as a submodule and add build logic. (#128 ) Benchmarks: Build Pipeline - Add nccl-tests as a submodule and add build logic.	2021-07-16 14:51:56 +08:00
Yuting Jiang	9547ccc19a	Benchmarks: Fix bug - fix bug of third_party/cuda-samples git checkout issue when building docker (#126 ) * fix bug in docker build of third_party/cuda-samples	2021-07-15 16:41:57 +08:00
Yuting Jiang	f9550bd693	Benchmarks: Add Benchmark - Add memory bandwidth benchmark for cuda. (#114 ) Add microbenchmark, example, test, config for cuda memory performance and Add cuda-samples(tag with cuda version) as git submodule and update related makefile	2021-07-13 17:30:19 +08:00
guoshzhao	40d7905e6b	Benchmarks: Build Pipeline - Add cutlass as a submodule and add build logic. (#85 ) * add cutlass as submodule. * add build script for cutlass. * only support compute capability 7.0(V100) and 8.0(A100)	2021-06-01 11:17:44 +08:00

38 commits