Yifan Xiong
e7b6af35c0
Website - Initialize SuperBench website ( #102 )
...
* Initialize SuperBench website.
* Add GitHub Actions to automatically build and publish.
2021-06-25 15:59:13 +08:00
Yifan Xiong
832e392f91
Docs - Update SuperBench documents ( #101 )
...
Update SuperBench documents.
2021-06-25 13:40:14 +08:00
Yifan Xiong
c0c43b8f81
Bug bash - Fix bugs in multi GPU benchmarks ( #98 )
...
* Add `sb deploy` command content.
* Fix inline if-expression syntax in playbook.
* Fix quote escape issue in bash command.
* Add custom env in config.
* Update default config for multi GPU benchmarks.
* Update MANIFEST.in to include jinja2 template.
* Require jinja2 minimum version.
* Fix occasional duplicate output in Ansible runner.
* Fix mixed color from Ansible and Python colorlog.
* Update according to comments.
* Change superbench.env from list to dict in config file.
2021-06-23 18:16:43 +08:00
guoshzhao
216c5b5c71
Benchmarks: Add Feature - Add DistributedImpl and DistributedBackend arguments for micro benchmark. ( #100 )
2021-06-21 23:34:05 +08:00
Yuting Jiang
3d72c07807
Bug bash - Rename bin name and metric name of cublas and cudnn microbenchmark ( #99 )
...
rename bin name and result metric of cublas and cudnn microbenchmark
2021-06-20 15:42:37 +08:00
Yifan Xiong
25ec3a7c1c
Dockerfile - Update CUDA 11.1.1 Dockerfile ( #96 )
...
Update packages and add build cache for CUDA 11.1.1 Dockerfile:
* Remove duplicate cmake and ompi, which are already in base image
* Add hpcx and sharp lib
* Add cache for gitmodules build
* Sort apt-get packages
2021-06-16 16:47:52 +08:00
Yifan Xiong
ddbc51a135
Bug bash - Fix bugs and refine log in single GPU benchmarks ( #97 )
...
Fix bugs and refine log in single GPU benchmarks:
* Fix none framework issue
* Fix empty parameter bug
* Remove missed mobilenet_v3 models
* Change benchmark registration log to debug level
* Add pid in logging
* Add missing benchmarks in default config
* Fix deprecated logging warn
2021-06-16 13:51:22 +08:00
guoshzhao
03b41be145
Benchmarks: Fix Bug - Fix OOM issue when running pytorch models sequentially. ( #93 )
...
* Clean up the cache.
2021-06-07 10:19:05 +08:00
guoshzhao
2d9be807a9
Benchmarks: Fix Bug - Fix return code overwrite issue ( #94 )
...
* fix return code reset issue
2021-06-04 18:02:12 +08:00
Yifan Xiong
6b0ca1cb05
Runner - Support local mode in runner ( #88 )
...
* Support local mode in runner.
2021-06-02 23:58:44 +08:00
guoshzhao
44c5103b5c
Benchmarks: Code Revision - Change default shape of sharding-matmul. ( #92 )
...
* Change default shape of sharding-matmul.
2021-06-02 10:50:09 +08:00
guoshzhao
6c6f526937
Benchmarks: Add Benchmark - Add FLOPs performance benchmark for cuda. ( #87 )
...
* add cuda flops performance benchmark.
2021-06-02 09:15:58 +08:00
guoshzhao
331c740a15
Benchmarks: Add Feature - Add nvml package to provide python interfaces of nvidia. ( #91 )
2021-06-01 23:31:07 +08:00
Yuting Jiang
83235433b2
Benchmarks: Add benchmark - add micro benchmark for cudnn test ( #89 )
...
* add python related cudnn microbenchmark
2021-06-01 22:24:35 +08:00
Yuting Jiang
0831748167
Benchmarks: Code Revision - add error return code for cublas microbenchmark ( #90 )
...
* add error return code for cublas micro benchmark
2021-06-01 16:12:15 +08:00
guoshzhao
40d7905e6b
Benchmarks: Build Pipeline - Add cutlass as a submodule and add build logic. ( #85 )
...
* add cutlass as submodule.
* add build script for cutlass.
* only support compute capability 7.0(V100) and 8.0(A100)
2021-06-01 11:17:44 +08:00
Yuting Jiang
61c258fe5b
Benchmarks: Add benchmark - add source code of cudnn function micro benchmark ( #78 )
...
* Benchmarks: Add benchmark - add source code of cudnn function micro benchmark
2021-06-01 10:33:23 +08:00
Yifan Xiong
5e9f948df2
Executor - Save benchmark results to file ( #86 )
...
* Save benchmark results to json file.
2021-05-31 13:05:12 +08:00
Yuting Jiang
18398fbaa2
Benchmarks: Add benchmark - add micro benchmark for cublas test ( #80 )
...
* add benchmark for cublas test
* format
* revise error handling and test
* add interface to read json file, revise json file path and include .json in packaging
* add random_seed in arguments
* revise preprocess of cublas benchmark
* fix lint error and note error in source code
* update according to comments
* revise input arguments from json file to custom str and convert json file to built-in dict list
* restore package config
* fix lint issue
* update platform and comments
* rename files to match source code dir and fix comments error
Co-authored-by: root <root@sb-validation-000001.51z1chmys5fuzfqyo4niepozre.bx.internal.cloudapp.net>
2021-05-31 10:31:53 +08:00
Yifan Xiong
8b4f613a76
Runner - Support torch.distributed mode in runner ( #81 )
...
* Support `torch.distributed` mode in runner.
* Support given `proc_num` and `node_num` in `torch.distributed` mode.
2021-05-28 12:29:39 +08:00
Yuting Jiang
87f6b371e8
Benchmarks: Add benchmark - add source code of cublas function micro benchmark ( #77 )
...
* Superbenchmark: Add benchmarks - add cublas function micro benchmark
* format
* add python benchmark for cublas functions, example and test file
* delete python related and rename some files
* revise cmd_helper and move json package to cmake
* resolve conflict
* revise error handling to try-catch and update some code style
* revise cmd_helper.h, cublas_helper.h, cublas_helper.cpp
* revise structure of the cublas function
* add some comments and move cuda_init and cuda_free
* add comments for class member
* add random seed, revise input from file to json string, simplify cmake
* delete json file in source code of cublas
* update according to comments
* limit batchcount=1 in initialization of cublas functions that do not use batch count
* revise and fix some errors of annotations
* update according to comments and revise construction of CublasFunction
Co-authored-by: root <root@sb-validation-000001.51z1chmys5fuzfqyo4niepozre.bx.internal.cloudapp.net>
2021-05-27 16:44:18 +08:00
Yuting Jiang
4489388cf0
reverse the order of cppbuild and lint ( #84 )
2021-05-26 20:38:27 +08:00
Yifan Xiong
e7f6d8ba78
CI/CD - Add integration tests for Ansible playbooks ( #82 )
...
* Add integration tests for Ansible playbooks
* Add `gpu_vendor` var to bypass gpu mount
2021-05-26 20:04:49 +08:00
Yuting Jiang
e996516259
Benchmarks: Build Pipeline - Revise path of installing cmake projects ( #83 )
...
* Unify SB_MICRO_PATH and SB_MICRO_LIB
* fix bug of lib path
2021-05-26 19:14:14 +08:00
TobeyQin
1652524aa0
Docs - Update README file on main page ( #79 )
...
* Update Readme file on main page
2021-05-25 15:06:54 +08:00
Yifan Xiong
c05e173b3d
Runner - Implement ansible client and runner ( #69 )
...
Implement ansible client and runner:
* add ansible client
* add deploy and check_env playbooks
2021-05-23 23:53:37 +08:00
guoshzhao
e977bbc17f
Benchmarks: Add Benchmark - Add kernel launch overhead benchmark. ( #74 )
...
* add kernel launch overhead benchmark.
2021-05-19 17:06:55 +08:00
Yuting Jiang
b7d0ee329f
expose interface of pin memory and modify cnn configuration ( #75 )
2021-05-19 10:52:45 +08:00
guoshzhao
7cfe7c16cf
Benchmarks: Add Benchmark - Add the source code of cuda kernel launch overhead benchmark. ( #71 )
...
* add cuda kernel launch overhead benchmark - source part.
* can customize the nvcc_archs_support.
* set SB_MICRO_PATH for azure pipeline tests.
2021-05-18 14:07:27 +08:00
guoshzhao
94d3765b49
Benchmarks: Build Pipeline - Call build script when setting up environment. ( #76 )
...
* call build script in Makefile.
* add cppbuild command for testing and docker env.
2021-05-18 11:49:23 +08:00
Yifan Xiong
977b1a7355
CLI - Refine CLI handlers ( #68 )
...
* use absolute path of input file
* parse registry uri from image
* merge common parts for arguments processing
2021-05-18 11:34:15 +08:00
Yifan Xiong
af6eb004df
CI/CD - Add GitHub Action to build and push image ( #70 )
...
* add GitHub Action to build and push image
* update Dockerfile to copy from context
2021-05-17 21:34:15 +08:00
guoshzhao
2bc7ada149
Benchmarks: Add Feature - Add script to build all cmake benchmark projects. ( #72 )
...
* add script to build all native benchmarks with cmake.
2021-05-17 18:16:54 +08:00
Yifan Xiong
0b2952d459
Setup - Add lint for cpp sources ( #73 )
...
__Major Revisions__
* add clang-format to lint cpp sources
* add cpp lint in GitHub Actions
2021-05-17 11:36:41 +08:00
guoshzhao
729e04ab94
Benchmarks: Code Revision - Revise MicroBenchmark class to be more flexible. ( #66 )
...
* Revise MicroBenchmark class to be more flexible.
* use command index not the command as the parameter.
* changes according to discussion.
2021-05-13 18:58:47 +08:00
Yifan Xiong
57ce473a02
Utils - Support lazy import ( #67 )
...
__Major Revision__
* Support lazy import.
* Not importing benchmarks when running `help`, `version`, `deploy` commands, etc.
2021-05-11 10:49:22 +08:00
guoshzhao
65292ae55b
Benchmarks: Code Revision - Revise the settings of CNN example models. ( #65 )
...
* revise example settings of cnn models.
2021-04-26 21:23:11 +08:00
guoshzhao
a7184da3f3
Benchmarks: Fix Bug - Increase default sample count for benchmarking. ( #64 )
2021-04-26 20:47:12 +08:00
guoshzhao
0324117fd3
Benchmarks: Fix Bug - Fix dataset precision for CNN and LSTM benchmarks.
2021-04-26 20:37:06 +08:00
Yifan Xiong
2299d238dd
CI/CD - Speed up CPU pipeline ( #61 )
...
Speed up Azure pipeline for CPU unit test.
2021-04-21 11:58:55 +08:00
guoshzhao
2a7ab691f1
Benchmarks: Add Benchmark - Add LSTM model benchmarks. ( #60 )
...
* Benchmarks: Add Benchmark - Add LSTM model benchmarks.
2021-04-20 10:53:44 +08:00
guoshzhao
902ea211d1
Benchmarks: Add Benchmark - Add CNN model benchmarks. ( #59 )
...
* Benchmarks: Add Benchmark - Add CNN model benchmarks.
2021-04-20 10:43:02 +08:00
guoshzhao
ce3ed24ab7
Benchmarks: Code Revision - Fix some issues for BERT benchmark. ( #58 )
...
Benchmarks: Code Revision - Fix some issues for BERT benchmark. (#58 )
2021-04-16 13:17:42 +08:00
guoshzhao
af567cf650
Benchmarks: Add Benchmark - Add GPT2 model benchmark. ( #57 )
...
* Benchmarks: Add Benchmark - Add GPT2 model benchmark.
2021-04-16 11:39:57 +08:00
guoshzhao
fb850af760
Benchmarks: Add Feature - Add interface to get all predefined parameters of all benchmarks. ( #56 )
...
* Benchmarks: Add Feature - Add interface to get all predefined parameters of all benchmarks.
2021-04-14 22:38:26 +08:00
Yuting Jiang
435b2d5eeb
Benchmarks: Add Benchmark - Add computation and communication overlap micro benchmark ( #39 )
...
* Benchmarks: Add Benchmark - add computation and communication overlap micro benchmark
* Benchmarks: Add benchmark - fix some format issues and typo
* Benchmarks: Add Benchmark - update according to comments and add test
* revise tests
* skip multi gpu test due to no multi gpu
Co-authored-by: v-yujiang <v-yujiang@microsoft.com>
2021-04-14 18:07:06 +08:00
Yifan Xiong
74c4d1b2a4
Setup: Code Revision - Rename dev branch to main in config and readme ( #55 )
...
* Rename dev branch to main and set it as default.
2021-04-14 17:40:34 +08:00
Yuting Jiang
5e6897200b
Benchmarks: Revise Test - Revise benchmark test util to support pytorch multi-GPU test ( #54 )
...
* Superbenchmark: Revise tests - revise benchmark test util to support multi gpu test
* modify test_sharding_matmul.py to match the tests util
2021-04-14 16:39:53 +08:00
Yifan Xiong
cb33c99ccb
Docs - Update README for public view ( #52 )
...
* update README for public view
2021-04-13 17:41:33 +08:00
Yifan Xiong
9bba27fa24
CI/CD - Fix issues in codecov ( #53 )
...
* add flags for different pipelines to merge reports
* add token for cuda pipeline
* update codecov/patch target
2021-04-13 16:25:35 +08:00