superbenchmark/examples/benchmarks
Ziyue Yang 8daef211dd
Benchmarks - Add distributed inference benchmark (#493)
**Description**
This PR adds a micro-benchmark of distributed model inference workloads.

**Major Revision**
- Add a new micro-benchmark, dist-inference.
- Add corresponding example and unit tests.
- Update configuration files to include this new micro-benchmark.
- Update micro-benchmark README.

---------

Co-authored-by: Peng Cheng <chengpeng5555@outlook.com>
2023-03-24 17:15:17 +08:00
computation_communication_overlap.py
cpu_hpl_performance.py
cpu_memory_bw_latency_performance.py
cpu_stream_performance.py
cublas_function.py
cuda_memory_bw_performance.py
cudnn_function.py
disk_performance.py
dist_inference.py
fambench.py
gemm_flops_cuda_performance.py
gpcnet_performance.py
gpu_burn_test.py
gpu_copy_bw_performance.py
ib_loopback_performance.py
ib_validation_performance.py
kernel_launch_overhead.py
matmul.py
nccl_bw_performance.py
ort_inference_performance.py
pytorch_bert_large.py
pytorch_cnn.py
pytorch_gpt2_large.py
pytorch_lstm.py
rocm_gemm_flops_performance.py
rocm_memory_bw_performance.py
rocm_onnxruntime_model_benchmark.py
sharding_matmul.py
tcp_connectivity.py
tensorrt_inference_performance.py