Yuting Jiang
7af75df392
Bug Fix: Data Diagnosis - Fix bug of failure test and warning of pandas in data diagnosis ( #638 )
...
**Description**
Fix a bug in the failure check and pandas warnings in data diagnosis.
**Major Revision**
- fix pandas warnings in `replace` and `fillna` caused by type downcasting
- fix a bug where the failure check function checked only one matched
metric rather than all matched metrics
- fix a bug when converting a metric regex into a string when there is
more than one match group
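A minimal sketch of the second fix, using only the standard library. The helper name `failed_metrics`, the metric names, and the failure value `-1` are hypothetical and not taken from the SuperBench code; the point is only that every metric matching the rule pattern is inspected, including when the pattern has more than one group.

```python
import re


def failed_metrics(row, pattern, failure_value=-1):
    """Return every metric matching `pattern` whose value signals failure.

    `failure_value` is an assumed sentinel for this sketch.
    """
    regex = re.compile(pattern)
    # Collect all metric names that match, not just the first one.
    matched = [name for name in row if regex.fullmatch(name)]
    return [name for name in matched if row[name] == failure_value]


# Hypothetical result row; the pattern below contains two match groups.
row = {
    "nccl-bw/allreduce_8_busbw": 120.5,
    "nccl-bw/allreduce_16_busbw": -1,
    "nccl-bw/return_code": 0,
}
print(failed_metrics(row, r"nccl-bw/allreduce_(\d+)_(busbw|algbw)"))
```

A check that stops at the first matched metric would look only at `allreduce_8_busbw` here and miss the failed `allreduce_16_busbw` entry.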
2024-08-16 09:04:24 +08:00
Yang Wang
46a5792915
Bug Fix - Update Docker Exec Command for Persistent HPCX Environment ( #635 )
...
Add 10-hpcx.sh to /etc/profile.d
Update the Docker exec command to ensure a persistent HPCX environment.
2024-08-13 16:35:01 +00:00
Yang Wang
9de841bc95
Use `types-setuptools` as `types-pkg_resources` is Yanked ( #637 )
...
* https://pypi.org/project/types-pkg-resources/
* Use types-setuptools instead
2024-08-08 22:30:37 +08:00
Yuting Jiang
2101e933cc
CI/CD - Fix MSCCL build error in CUDA12.4 docker build pipeline ( #633 )
...
**Description**
Fix MSCCL build error in the CUDA12.4 docker build pipeline caused by an
OOM issue.
2024-07-28 23:43:06 +00:00
Yuting Jiang
e304cf1572
Benchmarks: Micro benchmarks - add support for NVIDIA L4/L40/L40s GPUs in gemm-flops ( #634 )
...
**Description**
Add support for GPU ARCH 8.9 (NVIDIA L4/L40/L40s GPUs) in gemm-flops.
2024-07-26 02:42:17 +00:00
dependabot[bot]
4e27142a59
Bump express from 4.18.2 to 4.19.2 in /website ( #618 )
...
Bumps [express](https://github.com/expressjs/express ) from 4.18.2 to 4.19.2.
- [Release notes](https://github.com/expressjs/express/releases )
- [Changelog](https://github.com/expressjs/express/blob/master/History.md )
- [Commits](expressjs/express@4.18.2...4.19.2)
---
updated-dependencies:
- dependency-name: express
dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
2024-07-26 09:12:11 +08:00
dependabot[bot]
b4945fb29c
Bump ws from 6.2.2 to 6.2.3 in /website ( #629 )
...
Bumps [ws](https://github.com/websockets/ws ) from 6.2.2 to 6.2.3.
- [Release notes](https://github.com/websockets/ws/releases )
- [Commits](websockets/ws@6.2.2...6.2.3)
---
updated-dependencies:
- dependency-name: ws
dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
2024-07-25 15:59:48 +08:00
omahs
a4c87da0ac
Docs - fix typos ( #628 )
...
Docs - fix typos
2024-07-25 03:49:19 +00:00
dependabot[bot]
4102302a96
Bump ip from 1.1.5 to 1.1.9 in /website ( #610 )
...
Bumps [ip](https://github.com/indutny/node-ip ) from 1.1.5 to 1.1.9.
- [Commits](indutny/node-ip@v1.1.5...v1.1.9)
---
updated-dependencies:
- dependency-name: ip
dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
2024-07-25 02:15:52 +00:00
dependabot[bot]
6e556d76e8
Bump follow-redirects from 1.14.8 to 1.15.6 in /website ( #613 )
...
Bumps [follow-redirects](https://github.com/follow-redirects/follow-redirects ) from 1.14.8 to 1.15.6.
- [Release notes](https://github.com/follow-redirects/follow-redirects/releases )
- [Commits](follow-redirects/follow-redirects@v1.14.8...v1.15.6)
---
updated-dependencies:
- dependency-name: follow-redirects
dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
2024-07-24 23:59:58 +00:00
Yifan Xiong
1362732c79
Docs - Add BibTeX in README and repo ( #632 )
...
Add BibTeX for citation in README and repo.
2024-07-23 18:31:21 -07:00
Yang Wang
9a3ce39d5a
Update omegaconf version to 2.3.0 ( #631 )
...
Update `omegaconf` version to
[2.3.0](https://pypi.org/project/omegaconf/2.3.0/ ) as omegaconf 2.0.6
has a non-standard dependency specifier PyYAML>=5.1.*. pip 24.1 will
enforce this behaviour change.
Discussion can be found at https://github.com/pypa/pip/issues/12063 .
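To illustrate why `>=5.1.*` is non-standard: PEP 440 permits a trailing `.*` wildcard only with the `==` and `!=` operators. The following is a simplified standard-library sketch of that single rule, not pip's actual validation logic; the function name `has_invalid_wildcard` is hypothetical.

```python
import re

# Operators that PEP 440 allows to carry a trailing ".*" wildcard.
WILDCARD_OPERATORS = {"==", "!="}

# One "<operator><version>" clause, e.g. ">=5.1" or "==5.1.*".
CLAUSE = re.compile(r"^\s*(===|~=|==|!=|<=|>=|<|>)\s*(\S+)\s*$")


def has_invalid_wildcard(specifier):
    """Return True if any clause uses ".*" with an operator other than == or !=."""
    for clause in specifier.split(","):
        match = CLAUSE.match(clause)
        if match is None:
            continue  # not a simple clause; out of scope for this sketch
        operator, version = match.groups()
        if version.endswith(".*") and operator not in WILDCARD_OPERATORS:
            return True
    return False


print(has_invalid_wildcard(">=5.1.*"))  # the specifier from omegaconf 2.0.6
print(has_invalid_wildcard(">=5.1"))
print(has_invalid_wildcard("==5.1.*"))
```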
2024-07-23 14:46:28 -07:00
Yuting Jiang
7435f10a22
Dockerfile - Add CUDA 12.4 dockerfile ( #619 )
...
**Description**
Add CUDA 12.4 dockerfile.
**Major Revision**
- upgrade nvidia docker to 23.04
**Minor Revision**
- upgrade hpcx to 2.18
2024-04-22 06:36:19 +00:00
Yuting Jiang
dc3846cbd4
Dockerfile - Upgrade mlc to v3.11 ( #620 )
...
**Description**
Upgrade mlc to v3.11.
2024-04-18 10:59:36 +08:00
Ziyue Yang
cc89ee591c
Benchmarks: Revise Code - Add hipblasLt tuning to dist-inference cpp implementation ( #616 )
...
**Description**
Adds hipblasLt tuning to dist-inference cpp implementation.
2024-04-02 09:56:33 +08:00
Yang Wang
eeaa9b1ac9
Bug Fix - Bug fix for cuda 12.2 dockerfile LD_LIBRARY_PATH issue ( #614 )
...
**Description**
The CUDA 12.2 image reports an undefined symbol error due to an incomplete
LD_LIBRARY_PATH:
![image](https://github.com/microsoft/superbenchmark/assets/25875482/1a7c48c7-cb6b-4e3a-abbe-dde23007a96b )
### How to reproduce:
1. Deploy sb with cuda12.2 image
```
sb deploy -f local.ini -i superbench/superbench:v0.10.0-cuda12.2
```
2. Enter the container
```
sudo docker exec -it sb-workspace bash
```
3. Execute `mpirun`:
```
root@sb-container:~# mpirun
mpirun: symbol lookup error: mpirun: undefined symbol: opal_libevent2022_event_base_loop
```
### How to fix
* Append hpcx_load to /etc/bash.bashrc so that LD_LIBRARY_PATH is updated each time a shell starts
---------
2024-03-21 15:05:55 +00:00
Yifan Xiong
2c88db907f
Release - SuperBench v0.10.0 ( #607 )
...
**Description**
Cherry-pick bug fixes from v0.10.0 to main.
**Major Revisions**
* Benchmarks: Microbenchmark - Support different hipblasLt data types in dist_inference #590
* Benchmarks: Microbenchmark - Support in-place for NCCL/RCCL benchmark #591
* Bug Fix - Fix NUMA Domains Swap Issue in NDv4 Topology File #592
* Benchmarks: Microbenchmark - Add data type option for NCCL and RCCL tests #595
* Benchmarks: Bug Fix - Make metrics of dist-inference-cpp aligned with PyTorch version #596
* CI/CD - Add ndv5 topo file #597
* Benchmarks: Microbenchmark - Improve AMD GPU P2P performance with fine-grained GPU memory #593
* Benchmarks: Build Pipeline - fix nccl and nccl test version to 2.18.3 to resolve hang issue in cuda12.2 docker #599
* Dockerfile - Bug fix for rocm docker build and deploy #598
* Benchmarks: Microbenchmark - Adapt to hipblasLt data type changes #603
* Benchmarks: Micro benchmarks - Update hipblaslt metric unit to tflops #604
* Monitor - Upgrade pyrsmi to amdsmi python library. #601
* Benchmarks: Micro benchmarks - add fp8 and initialization for hipblaslt benchmark #605
* Dockerfile - Add rocm6.0 dockerfile #602
* Bug Fix - Bug fix for latest megatron-lm benchmark #600
* Docs - Upgrade version and release note #606
Co-authored-by: Ziyue Yang <ziyyang@microsoft.com>
Co-authored-by: Yang Wang <yangwang1@microsoft.com>
Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com>
Co-authored-by: guoshzhao <guzhao@microsoft.com>
2024-01-08 05:40:52 +00:00
Ziyue Yang
2c2096ed83
Benchmark: Revision - Fix -O2 option passing in gpu_copy ROCm build ( #589 )
...
**Description**
`add_compile_options` does not work for the ROCm build; set
`CMAKE_CXX_FLAGS` instead.
2023-12-11 04:34:51 +00:00
Ziyue Yang
719a427fe7
Benchmarks: Microbenchmark - Add distributed inference benchmark cpp implementation ( #586 )
...
**Description**
Add distributed inference benchmark cpp implementation.
2023-12-11 06:53:51 +08:00
Yuting Jiang
1f5031bd74
Dockerfile - Upgrade to rocm5.7 dockerfile ( #587 )
...
**Description**
upgrade to rocm5.7 dockerfile.
---------
Co-authored-by: yukirora <yuting.jiang@microsoft.com>
2023-12-09 17:41:12 +00:00
Ziyue Yang
4fa60be7cd
Benchmarks: Micro benchmark - Add one-to-all, all-to-one, all-to-all support to gpu_copy_bw_performance ( #588 )
...
**Description**
Add one-to-all, all-to-one, all-to-all support to
gpu_copy_bw_performance, and fix a performance bug in gpu_copy.
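As a rough illustration of the three patterns, the sketch below enumerates (source, destination) GPU index pairs for each one. The function name `copy_pairs`, the `root` parameter, and the exclusion of self-copies are assumptions made for this example; the commit does not describe how gpu_copy actually schedules its transfers.

```python
def copy_pairs(pattern, num_gpus, root=0):
    """List (src, dst) GPU index pairs for one copy pattern.

    Self-copies are left out, which is an assumption of this sketch.
    """
    others = [gpu for gpu in range(num_gpus) if gpu != root]
    if pattern == "one-to-all":
        # A single source GPU copies to every other GPU.
        return [(root, dst) for dst in others]
    if pattern == "all-to-one":
        # Every other GPU copies to a single destination GPU.
        return [(src, root) for src in others]
    if pattern == "all-to-all":
        # Every GPU copies to every other GPU.
        return [
            (src, dst)
            for src in range(num_gpus)
            for dst in range(num_gpus)
            if src != dst
        ]
    raise ValueError(f"unknown pattern: {pattern}")


for name in ("one-to-all", "all-to-one", "all-to-all"):
    print(name, copy_pairs(name, 4))
```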
2023-12-08 23:22:38 +08:00
Ziyue Yang
6ef3a0110f
Benchmarks: Add MSCCL Support for Nvidia GPU ( #584 )
...
**Description**
Add MSCCL support for Nvidia GPU
2023-12-07 19:57:28 +08:00
Yuting Jiang
dd5a6329ed
Benchmarks: Add benchmark: Megatron-LM/Megatron-Deepspeed GPT pretrain benchmark ( #582 )
...
**Description**
Megatron-LM/Megatron-Deepspeed GPT pretrain benchmark
2023-12-07 09:37:09 +08:00
Ziyue Yang
254ea7feba
Benchmarks: Micro benchmark - Add graph mode in NCCL/RCCL benchmarks for latency metrics ( #583 )
...
**Description**
Revise NCCL/RCCL benchmarks to add graph mode for latency metrics.
2023-12-05 16:48:13 +08:00
Yuting Jiang
9ae8c67093
Benchmarks: micro benchmark - Support cpu-gpu and gpu-cpu in ib-validation ( #581 )
...
**Description**
Benchmarks: micro benchmark - Support cpu-gpu and gpu-cpu in
ib-validation
**Major Revision**
- Support cpu-gpu and gpu-cpu in ib-validation
**Minor Revision**
- support multiple message sizes, multiple directions, and multiple ib
commands in ib-validation
2023-12-04 22:20:46 +08:00
guoshzhao
028819b388
Monitor - Add support for AMD GPU. ( #580 )
...
**Description**
Add AMD support in monitor.
**Major Revision**
- Add library pyrsmi to collect metrics.
- Currently collects device_utilization, device_power,
device_used_memory and device_total_memory.
2023-11-27 18:45:56 +08:00
Yifan Xiong
1ad1c21c38
Dockerfile - Upgrade Docker image to CUDA 12.2 ( #577 )
...
Upgrade Docker image to CUDA 12.2 for H100:
* upgrade base image to 23.10
* fix onnxruntime version in python3.10
* fix compilation errors
2023-11-22 13:48:18 +00:00
Yuting Jiang
2235e084ab
Benchmarks: Micro benchmark - add initialization options for rocm gemm flops ( #578 )
...
**Description**
add initialization options for rocm gemm flops.
2023-11-22 12:52:22 +00:00
Yuting Jiang
79089b6517
Benchmarks: Micro benchmark - Add hipBLASLt function benchmark ( #576 )
...
**Description**
Add hipBLASLt function benchmark and rebase cuBLASLt function benchmark.
2023-11-22 19:48:10 +08:00
guoshzhao
9f4880cb8e
Analyzer - Generate baseline given results from multiple nodes. ( #575 )
...
**Description**
Generate baseline given results from multiple nodes.
**Major Revision**
- Add sub command `sb result generate-baseline`
- Add UT and docs
---------
Co-authored-by: 454314380 <454314380@qq.com>
Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com>
2023-11-22 14:42:32 +08:00
Yuting Jiang
f53d941a22
Benchmarks: micro benchmarks - add int8 support for cublaslt function ( #574 )
...
**Description**
add int8 support for cublaslt function.
2023-11-20 11:21:20 +08:00
Yuting Jiang
c7800bb8e0
Bug Fix - remove cp ptx file command in gpu burn test ( #567 )
...
**Description**
Remove the command that copies the ptx file in the gpu burn test, since
the benchmark command already runs inside the `self.args.bin_dir` directory.
d246bab430/superbench/benchmarks/micro_benchmarks/micro_base.py (L183)
2023-11-14 03:52:56 +00:00
dependabot[bot]
ce3737f98b
Bump @babel/traverse from 7.14.5 to 7.23.2 in /website ( #566 )
...
Bumps [@babel/traverse](https://github.com/babel/babel/tree/HEAD/packages/babel-traverse ) from 7.14.5 to 7.23.2.
- [Release notes](https://github.com/babel/babel/releases )
- [Changelog](https://github.com/babel/babel/blob/main/CHANGELOG.md )
- [Commits](https://github.com/babel/babel/commits/v7.23.2/packages/babel-traverse )
---
updated-dependencies:
- dependency-name: "@babel/traverse"
dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
2023-11-07 10:36:42 +08:00
dependabot[bot]
07477c3bae
Bump postcss from 8.3.5 to 8.4.31 in /website ( #564 )
...
Bumps [postcss](https://github.com/postcss/postcss ) from 8.3.5 to 8.4.31.
- [Release notes](https://github.com/postcss/postcss/releases )
- [Changelog](https://github.com/postcss/postcss/blob/main/CHANGELOG.md )
- [Commits](postcss/postcss@8.3.5...8.4.31)
---
updated-dependencies:
- dependency-name: postcss
dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
2023-11-05 11:35:49 +00:00
Yuting Jiang
d246bab430
Dockerfile - update mlc version into 3.10 for cuda and rocm dockerfiles ( #562 )
...
**Description**
Update mlc version to 3.10 for cuda and rocm dockerfiles to be
consistent with the cuda12 dockerfile.
Co-authored-by: yukirora <yuting.jiang@microsoft.com>
2023-10-23 11:21:17 +08:00
Yuting Jiang
27a10811af
Benchmarks: micro benchmark - source code for evaluating NVDEC decoding performance ( #560 )
...
**Description**
source code for evaluating NVDEC decoding performance.
---------
Co-authored-by: yukirora <yuting.jiang@microsoft.com>
2023-08-22 10:56:33 +00:00
Yuting Jiang
6c0205cece
Benchmarks: micro benchmarks - add source code for DirectXRenderPerf ( #549 )
...
**Description**
add source code for DirectXRenderPerf.
---------
Co-authored-by: yukirora <yuting.jiang@microsoft.com>
2023-08-18 05:17:04 +00:00
pnunna93
67f2aa7237
Benchmarks: model benchmarks - change torch.distributed.launch to torchrun ( #556 )
...
This PR has the following changes:
- torch.distributed.launch changed to torchrun. torch.distributed.launch
is deprecated in the latest PyTorch, which recommends moving to torchrun -
https://pytorch.org/docs/stable/elastic/run.html
- Changes to the AMD GPU detection logic. The previous logic throws a
warning when containers have only renderD in /dev/dri; this change
resolves those warnings
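A hedged sketch of what the move to torchrun means for a launcher and a training script. `build_torchrun_command`, `get_local_rank`, and `train.py` are hypothetical names for illustration and are not taken from the SuperBench code. The torchrun flags shown are standard ones, and torchrun exports the per-process rank through the `LOCAL_RANK` environment variable rather than a `--local_rank` argument.

```python
import os


def build_torchrun_command(script, nproc_per_node, nnodes=1, node_rank=0,
                           master_addr="127.0.0.1", master_port=29500,
                           script_args=()):
    """Assemble a torchrun command line as an argument list."""
    return [
        "torchrun",
        f"--nproc_per_node={nproc_per_node}",
        f"--nnodes={nnodes}",
        f"--node_rank={node_rank}",
        f"--master_addr={master_addr}",
        f"--master_port={master_port}",
        script,
        *script_args,
    ]


def get_local_rank(default=0):
    """Read the local rank that torchrun sets in the environment."""
    return int(os.environ.get("LOCAL_RANK", default))


# Hypothetical script name and arguments.
print(" ".join(build_torchrun_command("train.py", 8, script_args=["--epochs", "1"])))
print(get_local_rank())
```

Inside the training script, reading `LOCAL_RANK` from the environment replaces parsing a `--local_rank` command-line argument.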
---------
Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com>
2023-08-08 13:03:32 +08:00
Yuting Jiang
e1df877bfe
Release - SuperBench v0.9.0 ( #558 )
...
**Description**
Cherry-pick bug fixes from v0.9.0 to main.
**Major Revision**
- CI/CD: pipeline - clean more disk space to fix rocm building image
pipeline(#555 )
- Benchmarks: bug fix - use absolute path for input file in
DirectXEncodingLatency(#554 )
- CI/CD - add push win docker image on release branch in pipeline (#552 )
- Docs - Upgrade version and release note(#557 )
2023-07-27 10:42:31 +08:00
dependabot[bot]
466b477e9d
Bump semver from 5.7.1 to 5.7.2 in /website ( #550 )
...
Bumps [semver](https://github.com/npm/node-semver ) from 5.7.1 to 5.7.2.
- [Release notes](https://github.com/npm/node-semver/releases )
- [Changelog](https://github.com/npm/node-semver/blob/v5.7.2/CHANGELOG.md )
- [Commits](npm/node-semver@v5.7.1...v5.7.2)
---
updated-dependencies:
- dependency-name: semver
dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
2023-07-24 17:07:35 +08:00
Yuting Jiang
e8ac0b1e28
Benchmarks: micro benchmarks - add python code for DirectXGPUEncodingLatency ( #548 )
...
**Description**
add python code for DirectXGPUEncodingLatency.
2023-07-06 15:31:28 +08:00
Yuting Jiang
c8c079c2af
Benchmarks: micro benchmarks - add python code for DirectXGPUCopy ( #546 )
...
**Description**
add python code for DirectXGPUCopy.
2023-07-06 00:15:32 +08:00
Yuting Jiang
af4cfd5bbf
Benchmarks: micro benchmarks - add python code for DirectXGPUMemBw ( #547 )
...
**Description**
add python code for DirectXGPUMemBw.
2023-07-05 22:07:13 +08:00
Yuting Jiang
f1d608aef7
Benchmarks: micro benchmarks - add python code for DirectXGPUCoreFlops ( #542 )
...
**Description**
add python code for DirectX core flops and init DirectX test pipeline.
**Major Revision**
- add python code for DirectX core flops
- init DirectX test pipeline
**Minor Revision**
- add test for DirectX core flops
2023-07-05 16:56:21 +08:00
Yuting Jiang
3704a432b9
CI/CD - Support DirectX test pipeline ( #545 )
...
**Description**
Support DirectX test pipeline.
2023-07-05 11:33:40 +08:00
Yuting Jiang
865472177f
Benchmarks: Build Pipeline - add AMF in third party and build AMF encoding latency test ( #543 )
...
**Description**
add AMF in third party and build AMF encoding latency test.
2023-07-03 14:43:21 +00:00
Yuting Jiang
97f7b1df86
Benchmarks: microbenchmark - add auto selecting algorithm support for cudnn functions ( #540 )
...
**Description**
add auto selecting algorithm support for cudnn functions.
**Major Revision**
- add auto selecting algorithm support for cudnn functions in source
code
- add 'auto_algo' option in benchmark
- add related test
2023-06-30 12:58:41 +00:00
Lei Qu
c7d0beaf9e
Doc - Update outdate references in micro-benchmarks.md ( #544 )
...
Modify link for Nvidia bandwidth test tool
**Description**
previous link is 404
**Minor Revision**
update the link value to
https://github.com/NVIDIA/cuda-samples/tree/master/Samples/1_Utilities/bandwidthTest
2023-06-30 19:17:41 +08:00
Yifan Xiong
7184bdd1ed
Benchmarks - Update result parsing in tensorrt inference ( #541 )
...
* Update result parsing for newer tensorrt versions
* Update arguments when loading torchvision models
2023-06-30 11:22:46 +08:00
Yuting Jiang
f259913707
Benchmarks: Add benchmark - Add source code of DirectxGPUCopy microbenchmark ( #486 )
...
**Description**
Add source code of DirectxGPUCopy microbenchmark.
2023-06-29 19:38:01 +08:00