**Description**
Generate summarized output files from the results of all nodes. For each metric, apply the reduce operation specified by its `reduce_op`.
**Major Revision**
- Generate the summarized JSON file per node. Metric keys use the following formats, where `[]` marks an optional part:
  - For microbenchmarks: `{benchmark_name}/[{run_count}/]{metric_name}[:rank]`
  - For model benchmarks: `{benchmark_name}/{sub_benchmark_name}/[{run_count}/]{metric_name}`
```
{
  "kernel-launch/overhead_event:0": 0.00583,
  "kernel-launch/overhead_event:1": 0.00545,
  "kernel-launch/overhead_event:2": 0.00581,
  "kernel-launch/overhead_event:3": 0.00572,
  "kernel-launch/overhead_event:4": 0.00559,
  "kernel-launch/overhead_event:5": 0.00591,
  "kernel-launch/overhead_event:6": 0.00562,
  "kernel-launch/overhead_event:7": 0.00586,
  "resnet_models/pytorch-resnet50/steptime-train-float32": 544.0827468410134,
  "resnet_models/pytorch-resnet50/throughput-train-float32": 353.7607016465773,
  "resnet_models/pytorch-resnet50/steptime-train-float16": 425.40482617914677,
  "resnet_models/pytorch-resnet50/throughput-train-float16": 454.0142363793973,
  "pytorch-sharding-matmul/0/allreduce": 10.561786651611328,
  "pytorch-sharding-matmul/1/allreduce": 10.561786651611328,
  "pytorch-sharding-matmul/0/allgather": 10.088025093078613,
  "pytorch-sharding-matmul/1/allgather": 10.088025093078613
}
```
- Generate the summarized JSONL file for all nodes. Each line is the result from one node in JSON format.
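
The key formats and the JSONL layout described above can be sketched as follows. This is only an illustration under stated assumptions: `metric_key` and `to_jsonl` are hypothetical helper names that do not come from this PR, and the actual implementation may build keys and write files differently.

```python
import json


def metric_key(benchmark, metric, sub_benchmark=None, run_count=None, rank=None):
    """Build a summary key (hypothetical helper, not the PR's API).

    Microbenchmark:  {benchmark_name}/[{run_count}/]{metric_name}[:rank]
    Model benchmark: {benchmark_name}/{sub_benchmark_name}/[{run_count}/]{metric_name}
    """
    parts = [benchmark]
    if sub_benchmark is not None:
        parts.append(sub_benchmark)
    if run_count is not None:
        parts.append(str(run_count))
    parts.append(metric)
    key = "/".join(parts)
    if rank is not None:
        key += f":{rank}"
    return key


def to_jsonl(node_summaries):
    """Serialize one summary dict per node, one JSON object per line."""
    return "\n".join(json.dumps(summary) for summary in node_summaries)


# Two nodes, each with its own per-node summary dict.
node0 = {metric_key("kernel-launch", "overhead_event", rank=0): 0.00583}
node1 = {metric_key("kernel-launch", "overhead_event", rank=0): 0.00545}
jsonl_text = to_jsonl([node0, node1])
```

Keeping one JSON object per line means the all-nodes file can be read back node by node with `json.loads` on each line, without parsing the whole file at once.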
**Description**
Add reduce function support for output summary.
**Major Revision**
- Add a reducer class to maintain all reduce functions.
- Save the reduce type of each metric into `BenchmarkResult`.
- Fix unit tests.
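
A reducer class that maintains reduce functions could look roughly like the sketch below. The names `ReduceType`, `Reducer`, `register`, and `get` are assumptions chosen for illustration, as is the set of supported operations. None of them is confirmed by this description, and the class added in the PR may differ.

```python
from enum import Enum
from statistics import mean


class ReduceType(Enum):
    """Hypothetical set of reduce types; the real set may differ."""
    AVG = "avg"
    MAX = "max"
    MIN = "min"
    SUM = "sum"


class Reducer:
    """Registry mapping a reduce type to the function that implements it."""

    _functions = {}

    @classmethod
    def register(cls, reduce_type):
        """Return a decorator that registers a function under reduce_type."""
        def decorator(func):
            cls._functions[reduce_type] = func
            return func
        return decorator

    @classmethod
    def get(cls, reduce_type):
        """Return the registered function, or None if none is registered."""
        return cls._functions.get(reduce_type)


# Register built-in functions for each assumed reduce type.
Reducer.register(ReduceType.AVG)(mean)
Reducer.register(ReduceType.MAX)(max)
Reducer.register(ReduceType.MIN)(min)
Reducer.register(ReduceType.SUM)(sum)

# Reduce a list of per-rank values for one metric.
values = [10.5, 11.0, 9.5]
reduced = Reducer.get(ReduceType.MAX)(values)
```

Storing the reduce type alongside each metric lets the summary step look up the matching function at aggregation time instead of hard-coding one operation for all metrics.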