51761b3af1
**Description** Cherry-pick bug fixes from v0.8.0 to main. **Major Revisions** * Monitor - Fix the cgroup version checking logic (#502) * Benchmark - Fix matrix size overflow issue in cuBLASLt GEMM (#503) * Fix wrong torch usage in communication wrapper for Distributed Inference Benchmark (#505) * Analyzer: Fix bug in python3.8 due to pandas api change (#504) * Bug - Fix bug to get metric from cmd when error happens (#506) * Monitor - Collect realtime GPU power when benchmarking (#507) * Add num_workers argument in model benchmark (#511) * Remove unreachable condition when write host list (#512) * Update cuda11.8 image to cuda12.1 based on nvcr23.03 (#513) * Doc - Fix wrong unit of cpu-memory-bw-latency in doc (#515) * Docs - Upgrade version and release note (#508) Co-authored-by: guoshzhao <guzhao@microsoft.com> Co-authored-by: Ziyue Yang <ziyyang@microsoft.com> Co-authored-by: Yuting Jiang <yutingjiang@microsoft.com> |
||
---|---|---|
.. | ||
ISSUE_TEMPLATE | ||
workflows | ||
CODEOWNERS | ||
dependabot.yml | ||
pull_request_template.md |