superbenchmark

Commit graph

Author	SHA1	Message	Date
Yuting Jiang	7af75df392	Bug Fix: Data Diagnosis - Fix bug of failure test and warning of pandas in data diagnosis (#638 ) Description Fix bug of failure test and warning of pandas in data diagnosis. Major Revision - fix warning of pandas in replace and fillna due to type downcast - fix bug of failure check function only check one matched metric rather than all matched metrics - fix bug when converting regex into str of metrics when there're more than one match group	2024-08-16 09:04:24 +08:00
Yang Wang	46a5792915	Bug Fix - Update Docker Exec Command for Persistent HPCX Environment (#635 ) Add 10-hpcx.sh to /etc/profile.d Update the Docker exec command to ensure a persistent HPCX environment.	2024-08-13 16:35:01 +00:00
Yang Wang	9de841bc95	Use `types-setuptools` as `types-pkg_resources` is Yanked (#637 ) * https://pypi.org/project/types-pkg-resources/ * Use types-setuptools instead	2024-08-08 22:30:37 +08:00
Yuting Jiang	2101e933cc	CI/CD - Fix MSCCL build error in CUDA12.4 docker build pipeline (#633 ) Description Fix MSCCL build error in CUDA12.4 docker build pipeline due to OOM issue.	2024-07-28 23:43:06 +00:00
Yuting Jiang	e304cf1572	Benchmarks: Micro benchmarks - add support for NVIDIA L4/L40/L40s GPUs in gemm-flops (#634 ) Description Add support GPU ARCH 8.9 for NVIDIA L4/L40/L40s GPUs in gemm-flops.	2024-07-26 02:42:17 +00:00
dependabot[bot]	4e27142a59	Bump express from 4.18.2 to 4.19.2 in /website (#618 ) Bumps [express](https://github.com/expressjs/express) from 4.18.2 to 4.19.2. - [Release notes](https://github.com/expressjs/express/releases) - [Changelog](https://github.com/expressjs/express/blob/master/History.md) - [Commits](expressjs/express@4.18.2...4.19.2) --- updated-dependencies: - dependency-name: express dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2024-07-26 09:12:11 +08:00
dependabot[bot]	b4945fb29c	Bump ws from 6.2.2 to 6.2.3 in /website (#629 ) Bumps [ws](https://github.com/websockets/ws) from 6.2.2 to 6.2.3. - [Release notes](https://github.com/websockets/ws/releases) - [Commits](websockets/ws@6.2.2...6.2.3) --- updated-dependencies: - dependency-name: ws dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2024-07-25 15:59:48 +08:00
omahs	a4c87da0ac	Docs - fix typos (#628 ) Docs - fix typos	2024-07-25 03:49:19 +00:00
dependabot[bot]	4102302a96	Bump ip from 1.1.5 to 1.1.9 in /website (#610 ) Bumps [ip](https://github.com/indutny/node-ip) from 1.1.5 to 1.1.9. - [Commits](indutny/node-ip@v1.1.5...v1.1.9) --- updated-dependencies: - dependency-name: ip dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2024-07-25 02:15:52 +00:00
dependabot[bot]	6e556d76e8	Bump follow-redirects from 1.14.8 to 1.15.6 in /website (#613 ) Bumps [follow-redirects](https://github.com/follow-redirects/follow-redirects) from 1.14.8 to 1.15.6. - [Release notes](https://github.com/follow-redirects/follow-redirects/releases) - [Commits](follow-redirects/follow-redirects@v1.14.8...v1.15.6) --- updated-dependencies: - dependency-name: follow-redirects dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2024-07-24 23:59:58 +00:00
Yifan Xiong	1362732c79	Docs - Add BibTeX in README and repo (#632 ) Add BibTeX for citation in README and repo.	2024-07-23 18:31:21 -07:00
Yang Wang	9a3ce39d5a	Update omegaconf version to 2.3.0 (#631 ) Update `omegaconf` version to [2.3.0](https://pypi.org/project/omegaconf/2.3.0/) as omegaconf 2.0.6 has a non-standard dependency specifier PyYAML>=5.1.*. pip 24.1 will enforce this behaviour change. Discussion can be found at https://github.com/pypa/pip/issues/12063.	2024-07-23 14:46:28 -07:00
Yuting Jiang	7435f10a22	Dockerfile - Add CUDA 12.4 dockerfile (#619 ) Description Add CUDA 12.4 dockerfile. Major Revision - upgrade nvidia docker into 23.04 Minor Revision - upgrade hpcx into 2.18	2024-04-22 06:36:19 +00:00
Yuting Jiang	dc3846cbd4	Dockerfile - Upgrade mlc to v3.11 (#620 ) Description Upgrade mlc to v3.11.	2024-04-18 10:59:36 +08:00
Ziyue Yang	cc89ee591c	Benchmarks: Revise Code - Add hipblasLt tuning to dist-inference cpp implementation (#616 ) Description Adds hipblasLt tuning to dist-inference cpp implementation.	2024-04-02 09:56:33 +08:00
Yang Wang	eeaa9b1ac9	Bug Fix - Bug fix for cuda 12.2 dockerfile LD_LIBRARY_PATH issue (#614 ) Description Cuda 12.2 image will report undefined symbol error due to incomplete LD_LIBRARY_PATH: ![image](https://github.com/microsoft/superbenchmark/assets/25875482/1a7c48c7-cb6b-4e3a-abbe-dde23007a96b) ### How to reproduce: 1. Deploy sb with cuda12.2 image ``` sb deploy -f local.ini -i superbench/superbench:v0.10.0-cuda12.2 ``` 2. Enter the container ``` sudo docker exec -it sb-workspace bash ``` 3. Execute `mpirun`: ``` root@sb-container:~# mpirun mpirun: symbol lookup error: mpirun: undefined symbol: opal_libevent2022_event_base_loop ``` ### Fix * Append hpcx_load into /etc/bash.bashrc for updating env LD_LIBRARY_PATH each time	2024-03-21 15:05:55 +00:00
Yifan Xiong	2c88db907f	Release - SuperBench v0.10.0 (#607 ) Description Cherry-pick bug fixes from v0.10.0 to main. Major Revisions * Benchmarks: Microbenchmark - Support different hipblasLt data types in dist_inference #590 * Benchmarks: Microbenchmark - Support in-place for NCCL/RCCL benchmark #591 * Bug Fix - Fix NUMA Domains Swap Issue in NDv4 Topology File #592 * Benchmarks: Microbenchmark - Add data type option for NCCL and RCCL tests #595 * Benchmarks: Bug Fix - Make metrics of dist-inference-cpp aligned with PyTorch version #596 * CI/CD - Add ndv5 topo file #597 * Benchmarks: Microbenchmark - Improve AMD GPU P2P performance with fine-grained GPU memory #593 * Benchmarks: Build Pipeline - fix nccl and nccl test version to 2.18.3 to resolve hang issue in cuda12.2 docker #599 * Dockerfile - Bug fix for rocm docker build and deploy #598 * Benchmarks: Microbenchmark - Adapt to hipblasLt data type changes #603 * Benchmarks: Micro benchmarks - Update hipblaslt metric unit to tflops #604 * Monitor - Upgrade pyrsmi to amdsmi python library. #601 * Benchmarks: Micro benchmarks - add fp8 and initialization for hipblaslt benchmark #605 * Dockerfile - Add rocm6.0 dockerfile #602 * Bug Fix - Bug fix for latest megatron-lm benchmark #600 * Docs - Upgrade version and release note #606 Co-authored-by: Ziyue Yang <ziyyang@microsoft.com> Co-authored-by: Yang Wang <yangwang1@microsoft.com> Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com> Co-authored-by: guoshzhao <guzhao@microsoft.com>	2024-01-08 05:40:52 +00:00
Ziyue Yang	2c2096ed83	Benchmark: Revision - Fix -O2 option passing in gpu_copy ROCm build (#589 ) Description `add_compile_options` will not work for ROCm build, change it to setting `CMAKE_CXX_FLAGS`.	2023-12-11 04:34:51 +00:00
Ziyue Yang	719a427fe7	Benchmarks: Microbenchmark - Add distributed inference benchmark cpp implementation (#586 ) Description Add distributed inference benchmark cpp implementation.	2023-12-11 06:53:51 +08:00
Yuting Jiang	1f5031bd74	Dockerfile - Upgrade to rocm5.7 dockerfile (#587 ) Description upgrade to rocm5.7 dockerfile. --------- Co-authored-by: yukirora <yuting.jiang@microsoft.com>	2023-12-09 17:41:12 +00:00
Ziyue Yang	4fa60be7cd	Benchmarks: Micro benchmark - Add one-to-all, all-to-one, all-to-all support to gpu_copy_bw_performance (#588 ) Description Add one-to-all, all-to-one, all-to-all support to gpu_copy_bw_performance, and fix performance bug in gpu_copy	2023-12-08 23:22:38 +08:00
Ziyue Yang	6ef3a0110f	Benchmarks: Add MSCCL Support for Nvidia GPU (#584 ) Description Add MSCCL support for Nvidia GPU	2023-12-07 19:57:28 +08:00
Yuting Jiang	dd5a6329ed	Benchmarks: Add benchmark: Megatron-LM/Megatron-Deepspeed GPT pretrain benchmark (#582 ) Description Megatron-LM/Megatron-Deepspeed GPT pretrain benchmark	2023-12-07 09:37:09 +08:00
Ziyue Yang	254ea7feba	Benchmarks: Micro benchmark - Add graph mode in NCCL/RCCL benchmarks for latency metrics (#583 ) Description Revise NCCL/RCCL benchmarks to graph mode add latency metrics.	2023-12-05 16:48:13 +08:00
Yuting Jiang	9ae8c67093	Benchmarks: micro benchmark - Support cpu-gpu and gpu-cpu in ib-validation (#581 ) Description Benchmarks: micro benchmark - Support cpu-gpu and gpu-cpu in ib-validation Major Revision - Support cpu-gpu and gpu-cpu in ib-validation Minor Revision - support multi msg size, multi direction, multi ib commands in ib-validation	2023-12-04 22:20:46 +08:00
guoshzhao	028819b388	Monitor - Add support for AMD GPU. (#580 ) Description Add AMD support in monitor. Major Revision - Add library pyrsmi to collect metrics. - Currently can get device_utilization, device_power, device_used_memory and device_total_memory.	2023-11-27 18:45:56 +08:00
Yifan Xiong	1ad1c21c38	Dockerfile - Upgrade Docker image to CUDA 12.2 (#577 ) Upgrade Docker image to CUDA 12.2 for H100: * upgrade base image to 23.10 * fix onnxruntime version in python3.10 * fix compilation errors	2023-11-22 13:48:18 +00:00
Yuting Jiang	2235e084ab	Benchmarks: Micro benchmark - add initialization options for rocm gemm flops (#578 ) Description add initialization options for rocm gemm flops.	2023-11-22 12:52:22 +00:00
Yuting Jiang	79089b6517	Benchmarks: Micro benchmark - Add hipBLASLt function benchmark (#576 ) Description hipblaslt function benchmark and rebase cublaslt function benchmark.	2023-11-22 19:48:10 +08:00
guoshzhao	9f4880cb8e	Analyzer - Generate baseline given results from multiple nodes. (#575 ) Description Generate baseline given results from multiple nodes. Major Revision - Add sub command `sb result generate-baseline` - Add UT and docs --------- Co-authored-by: 454314380 <454314380@qq.com> Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com>	2023-11-22 14:42:32 +08:00
Yuting Jiang	f53d941a22	Benchmarks: micro benchmarks - add int8 support for cublaslt function (#574 ) Description add int8 support for cublaslt function.	2023-11-20 11:21:20 +08:00
Yuting Jiang	c7800bb8e0	Bug Fix - remove cp ptx file command in gpu burn test (#567 ) Description remove cp ptx file in gpu burn test since the command is run inside self.args.bin_dir dir. `d246bab430/superbench/benchmarks/micro_benchmarks/micro_base.py (L183)`	2023-11-14 03:52:56 +00:00
dependabot[bot]	ce3737f98b	Bump @babel/traverse from 7.14.5 to 7.23.2 in /website (#566 ) Bumps [@babel/traverse](https://github.com/babel/babel/tree/HEAD/packages/babel-traverse) from 7.14.5 to 7.23.2. - [Release notes](https://github.com/babel/babel/releases) - [Changelog](https://github.com/babel/babel/blob/main/CHANGELOG.md) - [Commits](https://github.com/babel/babel/commits/v7.23.2/packages/babel-traverse) --- updated-dependencies: - dependency-name: "@babel/traverse" dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2023-11-07 10:36:42 +08:00
dependabot[bot]	07477c3bae	Bump postcss from 8.3.5 to 8.4.31 in /website (#564 ) Bumps [postcss](https://github.com/postcss/postcss) from 8.3.5 to 8.4.31. - [Release notes](https://github.com/postcss/postcss/releases) - [Changelog](https://github.com/postcss/postcss/blob/main/CHANGELOG.md) - [Commits](postcss/postcss@8.3.5...8.4.31) --- updated-dependencies: - dependency-name: postcss dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2023-11-05 11:35:49 +00:00
Yuting Jiang	d246bab430	Dockerfile - update mlc version into 3.10 for cuda and rocm dockerfiles (#562 ) Description Update mlc version into 3.10 for cuda and rocm dockerfiles to be consistent with cuda12 dockerfile Co-authored-by: yukirora <yuting.jiang@microsoft.com>	2023-10-23 11:21:17 +08:00
Yuting Jiang	27a10811af	Benchmarks: micro benchmark - source code for evaluating NVDEC decoding performance (#560 ) Description source code for evaluating NVDEC decoding performance. --------- Co-authored-by: yukirora <yuting.jiang@microsoft.com>	2023-08-22 10:56:33 +00:00
Yuting Jiang	6c0205cece	Benchmarks: micro benchmarks - add source code for DirectXRenderPerf (#549 ) Description add source code for DirectXRenderPerf. --------- Co-authored-by: yukirora <yuting.jiang@microsoft.com>	2023-08-18 05:17:04 +00:00
pnunna93	67f2aa7237	Benchmarks: model benchmarks - change torch.distributed.launch to torchrun (#556 ) This PR has following changes - torch.distributed.launch changed to torchrun. torch.distributed.launch is deprecated in latest Pytorch and is recommended to move to torchrun - https://pytorch.org/docs/stable/elastic/run.html - Changes to AMD GPU detection logic. The AMD GPU detection logic throws warning when containers have only renderD in /dev/dri, this change would resolve those warnings --------- Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com>	2023-08-08 13:03:32 +08:00
Yuting Jiang	e1df877bfe	Release - SuperBench v0.9.0 (#558 ) Description Cherry-pick bug fixes from v0.9.0 to main. Major Revision - CI/CD: pipeline - clean more disk space to fix rocm building image pipeline(#555 ) - Benchmarks: bug fix - use absolute path for input file in DirectXEncodingLatency(#554) - CI/CD - add push win docker image on release branch in pipeline (#552) - Docs - Upgrade version and release note(#557)	2023-07-27 10:42:31 +08:00
dependabot[bot]	466b477e9d	Bump semver from 5.7.1 to 5.7.2 in /website (#550 ) Bumps [semver](https://github.com/npm/node-semver) from 5.7.1 to 5.7.2. - [Release notes](https://github.com/npm/node-semver/releases) - [Changelog](https://github.com/npm/node-semver/blob/v5.7.2/CHANGELOG.md) - [Commits](npm/node-semver@v5.7.1...v5.7.2) --- updated-dependencies: - dependency-name: semver dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2023-07-24 17:07:35 +08:00
Yuting Jiang	e8ac0b1e28	Benchmarks: micro benchmarks - add python code for DirectXGPUEncodingLatency (#548 ) Description add python code for DirectXGPUEncodingLatency.	2023-07-06 15:31:28 +08:00
Yuting Jiang	c8c079c2af	Benchmarks: micro benchmarks - add python code for DirectXGPUCopy (#546 ) Description add python code for DirectXGPUCopy.	2023-07-06 00:15:32 +08:00
Yuting Jiang	af4cfd5bbf	Benchmarks: micro benchmarks - add python code for DirecXGPUMemBw (#547 ) Description add python code for DirecXGPUMemBw.	2023-07-05 22:07:13 +08:00
Yuting Jiang	f1d608aef7	Benchmarks: micro benchmarks - add python code for DirectXGPUCoreFlops (#542 ) Description add python code for DirectX core flops and init DirectX test pipeline. Major Revision - add python code for DirectX core flops - init DirectX test pipeline Minor Revision - add test for DirectX core flops	2023-07-05 16:56:21 +08:00
Yuting Jiang	3704a432b9	CI/CD - Support DirectX test pipeline (#545 ) Description Support DirectX test pipeline.	2023-07-05 11:33:40 +08:00
Yuting Jiang	865472177f	Benchmarks: Build Pipeline - add AMF in third party and build AMF encoding latency test (#543 ) Description add AMF in third party and build AMF encoding latency test.	2023-07-03 14:43:21 +00:00
Yuting Jiang	97f7b1df86	Benchmarks: microbenchmark - add auto selecting algorithm support for cudnn functions (#540 ) Description add auto selecting algorithm support for cudnn functions. Major Revision - add auto selecting algorithm support for cudnn functions in source code - add 'auto_algo' option in benchmark - add related test	2023-06-30 12:58:41 +00:00
Lei Qu	c7d0beaf9e	Doc - Update outdate references in micro-benchmarks.md (#544 ) Modify link for Nvidia bandwidth test tool Description previous link is 404 Minor Revision update the link value to https://github.com/NVIDIA/cuda-samples/tree/master/Samples/1_Utilities/bandwidthTest	2023-06-30 19:17:41 +08:00
Yifan Xiong	7184bdd1ed	Benchmarks - Update result parsing in tensorrt inference (#541 ) * Update result parsing for newer tensorrt versions * Update arguments when load torchvision models	2023-06-30 11:22:46 +08:00
Yuting Jiang	f259913707	Benchmarks: Add benchmark - Add source code of DirectxGPUCopy microbenchmark (#486 ) Description Add source code of DirectxGPUCopy microbenchmark.	2023-06-29 19:38:01 +08:00

431 commits
