cublas_function
Dockerfile - Upgrade Docker image to CUDA 12.2 ( #577 )
2023-11-22 13:48:18 +00:00
cublaslt_gemm
Benchmarks: micro benchmarks - add int8 support for cublaslt function ( #574 )
2023-11-20 11:21:20 +08:00
cuda_decode_performance
Dockerfile - Add support for arm64 build ( #660 )
2024-11-06 23:16:12 +00:00
cudnn_function
Benchmarks: microbenchmark - add auto selecting algorithm support for cudnn functions ( #540 )
2023-06-30 12:58:41 +00:00
directx_gemm_flops_performance
Benchmarks: micro benchmarks - add python code for DirectXGPUCoreFlops ( #542 )
2023-07-05 16:56:21 +08:00
directx_gpu_copy_performance
Benchmarks: micro benchmarks - add python code for DirectXGPUCopy ( #546 )
2023-07-06 00:15:32 +08:00
directx_mem_bw_performance
Benchmarks: micro benchmarks - add python code for DirecXGPUMemBw ( #547 )
2023-07-05 22:07:13 +08:00
directx_render_performance
Benchmarks: micro benchmarks - add source code for DirectXRenderPerf ( #549 )
2023-08-18 05:17:04 +00:00
directx_third_party
Benchmarks: micro benchmarks - add source code for DirectXRenderPerf ( #549 )
2023-08-18 05:17:04 +00:00
directx_utils
Benchmarks: Add benchmark - Add source code of DirectxGPUCopy microbenchmark ( #486 )
2023-06-29 19:38:01 +08:00
dist_inference_cpp
Benchmarks: Revise Code - Add hipblasLt tuning to dist-inference cpp implementation ( #616 )
2024-04-02 09:56:33 +08:00
gpu_copy_performance
Bug Fix - Fix numa error on grace cpu in gpu-copy ( #658 )
2024-11-05 23:10:51 +00:00
ib_validation_performance
Benchmarks: micro benchmark - Support cpu-gpu and gpu-cpu in ib-validation ( #581 )
2023-12-04 22:20:46 +08:00
kernel_launch_overhead
Dockerfile - Upgrade to rocm5.7 dockerfile ( #587 )
2023-12-09 17:41:12 +00:00
__init__.py
Benchmarks: Micro benchmark - Add hipBLASLt function benchmark ( #576 )
2023-11-22 19:48:10 +08:00
_export_torch_to_onnx.py
Benchmarks - Update result parsing in tensorrt inference ( #541 )
2023-06-30 11:22:46 +08:00
blaslt_function_base.py
Benchmarks: Micro benchmark - Add hipBLASLt function benchmark ( #576 )
2023-11-22 19:48:10 +08:00
computation_communication_overlap.py
Benchmarks: Unify metric names of benchmarks ( #252 )
2021-12-09 04:48:42 +00:00
cpu_hpl_performance.py
Adding HPL benchmark ( #482 )
2023-03-21 16:44:08 +00:00
cpu_memory_bw_latency_performance.py
Benchmarks: Add Feature - Provide option to save raw data into file. ( #333 )
2022-04-01 16:26:09 +08:00
cpu_stream_performance.py
Dockerfile - Add support for arm64 build ( #660 )
2024-11-06 23:16:12 +00:00
cublas_function.py
Benchmarks: Revision - Support flexible warmup and non-random data initialization in cublas-benchmark ( #479 )
2023-02-28 06:35:18 +08:00
cublaslt_function.py
Benchmarks: Micro benchmark - Add hipBLASLt function benchmark ( #576 )
2023-11-22 19:48:10 +08:00
cuda_common.cmake
Benchmarks: Micro benchmarks - add support for NVIDIA L4/L40/L40s GPUs in gemm-flops ( #634 )
2024-07-26 02:42:17 +00:00
cuda_gemm_flops_performance.py
Benchmarks: Micro benchmarks - add support for NVIDIA L4/L40/L40s GPUs in gemm-flops ( #634 )
2024-07-26 02:42:17 +00:00
cuda_memory_bw_performance.py
Benchmark: Revision - Add wait time option to resolve mem-bw unstable issue ( #438 )
2022-12-14 17:21:02 +08:00
cuda_nccl_bw_performance.py
Release - SuperBench v0.10.0 ( #607 )
2024-01-08 05:40:52 +00:00
cudnn_function.py
Benchmarks: microbenchmark - add auto selecting algorithm support for cudnn functions ( #540 )
2023-06-30 12:58:41 +00:00
directx_gemm_flops_performance.py
Benchmarks: micro benchmarks - add python code for DirectXGPUCoreFlops ( #542 )
2023-07-05 16:56:21 +08:00
directx_gpu_copy_performance.py
Benchmarks: micro benchmarks - add python code for DirectXGPUCopy ( #546 )
2023-07-06 00:15:32 +08:00
directx_gpu_encoding_latency.py
Release - SuperBench v0.9.0 ( #558 )
2023-07-27 10:42:31 +08:00
directx_mem_bw_performance.py
Benchmarks: micro benchmarks - add python code for DirecXGPUMemBw ( #547 )
2023-07-05 22:07:13 +08:00
disk_performance.py
Benchmarks: Add Feature - Provide option to save raw data into file. ( #333 )
2022-04-01 16:26:09 +08:00
dist_inference.py
Benchmarks: Revise Code - Add hipblasLt tuning to dist-inference cpp implementation ( #616 )
2024-04-02 09:56:33 +08:00
gemm_flops_performance_base.py
Benchmarks: Unify metric names of benchmarks ( #252 )
2021-12-09 04:48:42 +00:00
gpcnet_performance.py
Benchmarks: Add Feature - Provide option to save raw data into file. ( #333 )
2022-04-01 16:26:09 +08:00
gpu_burn_test.py
Bug Fix - remove cp ptx file command in gpu burn test ( #567 )
2023-11-14 03:52:56 +00:00
gpu_copy_bw_performance.py
Benchmarks: Micro benchmark - Add one-to-all, all-to-one, all-to-all support to gpu_copy_bw_performance ( #588 )
2023-12-08 23:22:38 +08:00
hipblaslt_function.py
Release - SuperBench v0.10.0 ( #607 )
2024-01-08 05:40:52 +00:00
ib_loopback_performance.py
Fix port conflict in ib loopback ( #375 )
2022-07-20 11:30:00 +08:00
ib_validation_performance.py
Benchmarks: micro benchmark - Support cpu-gpu and gpu-cpu in ib-validation ( #581 )
2023-12-04 22:20:46 +08:00
kernel_launch_overhead.py
Benchmarks: Add Feature - Provide option to save raw data into file. ( #333 )
2022-04-01 16:26:09 +08:00
memory_bw_performance_base.py
Benchmarks: Unify metric names of benchmarks ( #252 )
2021-12-09 04:48:42 +00:00
micro_base.py
Benchmarks: micro benchmarks - add python code for DirecXGPUMemBw ( #547 )
2023-07-05 22:07:13 +08:00
ort_inference_performance.py
Benchmarks: Add Feature - Add percentile metrics for ort and pytorch inference benchmarks ( #283 )
2022-01-19 10:49:56 +08:00
rocm_common.cmake
Release - SuperBench v0.10.0 ( #607 )
2024-01-08 05:40:52 +00:00
rocm_gemm_flops_performance.py
Benchmarks: Micro benchmark - add initialization options for rocm gemm flops ( #578 )
2023-11-22 12:52:22 +00:00
rocm_memory_bw_performance.py
Benchmarks: Add Feature - Provide option to save raw data into file. ( #333 )
2022-04-01 16:26:09 +08:00
sharding_matmul.py
Benchmarks: Unify metric names of benchmarks ( #252 )
2021-12-09 04:48:42 +00:00
tcp_connectivity.py
Benchmarks: Add Feature - Provide option to save raw data into file. ( #333 )
2022-04-01 16:26:09 +08:00
tensorrt_inference_performance.py
Benchmarks - Update result parsing in tensorrt inference ( #541 )
2023-06-30 11:22:46 +08:00