superbenchmark/superbench/benchmarks/micro_benchmarks
pdr 479491279e
Dockerfile - Add support for arm64 build (#660)
Add support for arm64 build:

- Update Dockerfile for arm64 build
- Extend CPU stream compilation for Neoverse
- Handle onnxruntime-gpu installation
- Filter third-party builds based on architecture
- Disable cuda decode perf build for non-x86
2024-11-06 23:16:12 +00:00
cublas_function Dockerfile - Upgrade Docker image to CUDA 12.2 (#577) 2023-11-22 13:48:18 +00:00
cublaslt_gemm Benchmarks: micro benchmarks - add int8 support for cublaslt function (#574) 2023-11-20 11:21:20 +08:00
cuda_decode_performance Dockerfile - Add support for arm64 build (#660) 2024-11-06 23:16:12 +00:00
cudnn_function Benchmarks: microbenchmark - add auto selecting algorithm support for cudnn functions (#540) 2023-06-30 12:58:41 +00:00
directx_gemm_flops_performance Benchmarks: micro benchmarks - add python code for DirectXGPUCoreFlops (#542) 2023-07-05 16:56:21 +08:00
directx_gpu_copy_performance Benchmarks: micro benchmarks - add python code for DirectXGPUCopy (#546) 2023-07-06 00:15:32 +08:00
directx_mem_bw_performance Benchmarks: micro benchmarks - add python code for DirectXGPUMemBw (#547) 2023-07-05 22:07:13 +08:00
directx_render_performance Benchmarks: micro benchmarks - add source code for DirectXRenderPerf (#549) 2023-08-18 05:17:04 +00:00
directx_third_party Benchmarks: micro benchmarks - add source code for DirectXRenderPerf (#549) 2023-08-18 05:17:04 +00:00
directx_utils Benchmarks: Add benchmark - Add source code of DirectxGPUCopy microbenchmark (#486) 2023-06-29 19:38:01 +08:00
dist_inference_cpp Benchmarks: Revise Code - Add hipblasLt tuning to dist-inference cpp implementation (#616) 2024-04-02 09:56:33 +08:00
gpu_copy_performance Bug Fix - Fix numa error on grace cpu in gpu-copy (#658) 2024-11-05 23:10:51 +00:00
ib_validation_performance Benchmarks: micro benchmark - Support cpu-gpu and gpu-cpu in ib-validation (#581) 2023-12-04 22:20:46 +08:00
kernel_launch_overhead Dockerfile - Upgrade to rocm5.7 dockerfile (#587) 2023-12-09 17:41:12 +00:00
__init__.py Benchmarks: Micro benchmark - Add hipBLASLt function benchmark (#576) 2023-11-22 19:48:10 +08:00
_export_torch_to_onnx.py Benchmarks - Update result parsing in tensorrt inference (#541) 2023-06-30 11:22:46 +08:00
blaslt_function_base.py Benchmarks: Micro benchmark - Add hipBLASLt function benchmark (#576) 2023-11-22 19:48:10 +08:00
computation_communication_overlap.py Benchmarks: Unify metric names of benchmarks (#252) 2021-12-09 04:48:42 +00:00
cpu_hpl_performance.py Adding HPL benchmark (#482) 2023-03-21 16:44:08 +00:00
cpu_memory_bw_latency_performance.py Benchmarks: Add Feature - Provide option to save raw data into file. (#333) 2022-04-01 16:26:09 +08:00
cpu_stream_performance.py Dockerfile - Add support for arm64 build (#660) 2024-11-06 23:16:12 +00:00
cublas_function.py Benchmarks: Revision - Support flexible warmup and non-random data initialization in cublas-benchmark (#479) 2023-02-28 06:35:18 +08:00
cublaslt_function.py Benchmarks: Micro benchmark - Add hipBLASLt function benchmark (#576) 2023-11-22 19:48:10 +08:00
cuda_common.cmake Benchmarks: Micro benchmarks - add support for NVIDIA L4/L40/L40s GPUs in gemm-flops (#634) 2024-07-26 02:42:17 +00:00
cuda_gemm_flops_performance.py Benchmarks: Micro benchmarks - add support for NVIDIA L4/L40/L40s GPUs in gemm-flops (#634) 2024-07-26 02:42:17 +00:00
cuda_memory_bw_performance.py Benchmark: Revision - Add wait time option to resolve mem-bw unstable issue (#438) 2022-12-14 17:21:02 +08:00
cuda_nccl_bw_performance.py Release - SuperBench v0.10.0 (#607) 2024-01-08 05:40:52 +00:00
cudnn_function.py Benchmarks: microbenchmark - add auto selecting algorithm support for cudnn functions (#540) 2023-06-30 12:58:41 +00:00
directx_gemm_flops_performance.py Benchmarks: micro benchmarks - add python code for DirectXGPUCoreFlops (#542) 2023-07-05 16:56:21 +08:00
directx_gpu_copy_performance.py Benchmarks: micro benchmarks - add python code for DirectXGPUCopy (#546) 2023-07-06 00:15:32 +08:00
directx_gpu_encoding_latency.py Release - SuperBench v0.9.0 (#558) 2023-07-27 10:42:31 +08:00
directx_mem_bw_performance.py Benchmarks: micro benchmarks - add python code for DirectXGPUMemBw (#547) 2023-07-05 22:07:13 +08:00
disk_performance.py Benchmarks: Add Feature - Provide option to save raw data into file. (#333) 2022-04-01 16:26:09 +08:00
dist_inference.py Benchmarks: Revise Code - Add hipblasLt tuning to dist-inference cpp implementation (#616) 2024-04-02 09:56:33 +08:00
gemm_flops_performance_base.py Benchmarks: Unify metric names of benchmarks (#252) 2021-12-09 04:48:42 +00:00
gpcnet_performance.py Benchmarks: Add Feature - Provide option to save raw data into file. (#333) 2022-04-01 16:26:09 +08:00
gpu_burn_test.py Bug Fix - remove cp ptx file command in gpu burn test (#567) 2023-11-14 03:52:56 +00:00
gpu_copy_bw_performance.py Benchmarks: Micro benchmark - Add one-to-all, all-to-one, all-to-all support to gpu_copy_bw_performance (#588) 2023-12-08 23:22:38 +08:00
hipblaslt_function.py Release - SuperBench v0.10.0 (#607) 2024-01-08 05:40:52 +00:00
ib_loopback_performance.py Fix port conflict in ib loopback (#375) 2022-07-20 11:30:00 +08:00
ib_validation_performance.py Benchmarks: micro benchmark - Support cpu-gpu and gpu-cpu in ib-validation (#581) 2023-12-04 22:20:46 +08:00
kernel_launch_overhead.py Benchmarks: Add Feature - Provide option to save raw data into file. (#333) 2022-04-01 16:26:09 +08:00
memory_bw_performance_base.py Benchmarks: Unify metric names of benchmarks (#252) 2021-12-09 04:48:42 +00:00
micro_base.py Benchmarks: micro benchmarks - add python code for DirectXGPUMemBw (#547) 2023-07-05 22:07:13 +08:00
ort_inference_performance.py Benchmarks: Add Feature - Add percentile metrics for ort and pytorch inference benchmarks (#283) 2022-01-19 10:49:56 +08:00
rocm_common.cmake Release - SuperBench v0.10.0 (#607) 2024-01-08 05:40:52 +00:00
rocm_gemm_flops_performance.py Benchmarks: Micro benchmark - add initialization options for rocm gemm flops (#578) 2023-11-22 12:52:22 +00:00
rocm_memory_bw_performance.py Benchmarks: Add Feature - Provide option to save raw data into file. (#333) 2022-04-01 16:26:09 +08:00
sharding_matmul.py Benchmarks: Unify metric names of benchmarks (#252) 2021-12-09 04:48:42 +00:00
tcp_connectivity.py Benchmarks: Add Feature - Provide option to save raw data into file. (#333) 2022-04-01 16:26:09 +08:00
tensorrt_inference_performance.py Benchmarks - Update result parsing in tensorrt inference (#541) 2023-06-30 11:22:46 +08:00