superbenchmark/superbench/benchmarks
Yuting Jiang b5b1c3dac7
Bug - Fix bug of duration feature for model benchmarks in distributed mode. (#347)
**Description**
Fix a bug in the duration feature for model benchmarks when running in distributed mode.

**Major Revision**
- Add an all_reduce to synchronize the result of is_finished (the function that decides whether the model benchmark should stop) across ranks in each step.
  - This keeps ranks from disagreeing on when the duration has ended; otherwise some ranks may enter one more step and never finish.
- Add torch.cuda.synchronize() before and after the step-time measurement in train_step() for all model benchmarks.
  - Some operations in train_step() may be asynchronous, which results in incorrect step-time records (for example, lstm).
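
The first revision can be illustrated with a small, torch-free simulation. This is only a sketch under stated assumptions, not the code from the commit: `all_reduce_max` and `run_steps` are hypothetical names, the reduction is assumed to be a MAX over per-rank flags, and each rank's "elapsed time" is modeled as a simple sum of fixed step times.

```python
def all_reduce_max(flags):
    """Hypothetical stand-in for a collective all_reduce with a MAX op:
    every rank ends up holding the same reduced value."""
    reduced = max(flags)
    return [reduced for _ in flags]


def run_steps(step_times, duration, sync):
    """Return how many steps each simulated rank runs before stopping.

    step_times: per-rank time consumed by one step (assumed constant).
    duration:   time budget after which a rank considers itself finished.
    sync:       if True, ranks agree on the stop decision via all_reduce_max.
    """
    n = len(step_times)
    elapsed = [0.0] * n
    steps = [0] * n
    done = [False] * n
    while not all(done):
        # Every rank that has not stopped yet runs one more step.
        for rank in range(n):
            if not done[rank]:
                elapsed[rank] += step_times[rank]
                steps[rank] += 1
        # Local decision: has this rank used up its duration?
        flags = [int(elapsed[rank] >= duration) for rank in range(n)]
        if sync:
            # Shared decision: stop everywhere as soon as any rank is finished.
            flags = all_reduce_max(flags)
        done = [done[rank] or bool(flags[rank]) for rank in range(n)]
    return steps


# Two ranks with different step times and a duration of 5 time units.
print(run_steps([1.0, 2.0], 5.0, sync=False))  # step counts diverge
print(run_steps([1.0, 2.0], 5.0, sync=True))   # step counts match
```

Without the shared decision the two simulated ranks stop after different numbers of steps. In a real distributed run, that mismatch would leave one rank waiting in a collective operation that its peer never joins.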
2022-04-25 15:50:03 +08:00
| Name | Last commit | Date |
| --- | --- | --- |
| docker_benchmarks | Benchmarks: Add Benchmark - Add FAMBench based on docker benchmark (#338) | 2022-04-11 15:31:07 +08:00 |
| micro_benchmarks | Benchmarks: Add Feature - Provide option to save raw data into file. (#333) | 2022-04-01 16:26:09 +08:00 |
| model_benchmarks | Bug - Fix bug of duration feature for model benchmarks in distributed mode. (#347) | 2022-04-25 15:50:03 +08:00 |
| __init__.py | Runner: Add Feature - Generate summarized output files. (#157) | 2021-08-20 16:48:40 +08:00 |
| base.py | Benchmarks: Add Feature - Provide option to save raw data into file. (#333) | 2022-04-01 16:26:09 +08:00 |
| build.sh | Release - SuperBench v0.3.0 (#212) | 2021-09-26 09:30:31 +08:00 |
| context.py | revise the term onnx to onnxruntime. (#232) | 2021-10-21 04:29:27 +00:00 |
| reducer.py | Monitor: Integration - Integrate monitor into Superbench (#259) | 2021-12-10 09:33:13 +00:00 |
| registry.py | Benchmarks: Add Feature - Add 'ignore_invalid' option when register benchmarks. (#247) | 2021-12-02 10:26:56 +00:00 |
| result.py | Benchmarks: Add Feature - Provide option to save raw data into file. (#333) | 2022-04-01 16:26:09 +08:00 |
| return_code.py | Benchmarks: Add Benchmark - Add ib traffic validation distributed benchmark (#215) | 2021-11-10 01:18:41 +08:00 |