Commit graph

305 commits

Author SHA1 Message Date
Yifan Xiong e7b6af35c0
Website - Initialize SuperBench website (#102)
* Initialize SuperBench website.
* Add GitHub Actions for automatically build and publish.
2021-06-25 15:59:13 +08:00
Yifan Xiong 832e392f91
Docs - Update SuperBench documents (#101)
Update SuperBench documents.
2021-06-25 13:40:14 +08:00
Yifan Xiong c0c43b8f81
Bug bash - Fix bugs in multi GPU benchmarks (#98)
* Add `sb deploy` command content.
* Fix inline if-expression syntax in playbook.
* Fix quote escape issue in bash command.
* Add custom env in config.
* Update default config for multi GPU benchmarks.
* Update MANIFEST.in to include jinja2 template.
* Require jinja2 minimum version.
* Fix occasional duplicate output in Ansible runner.
* Fix mixed color from Ansible and Python colorlog.
* Update according to comments.
* Change superbench.env from list to dict in config file.
2021-06-23 18:16:43 +08:00
guoshzhao 216c5b5c71
Benchmarks: Add Feature - Add DistributedImpl and DistributedBackend arguments for micro benchmark. (#100)
2021-06-21 23:34:05 +08:00
Yuting Jiang 3d72c07807
Bug bash - Rename bin name and metric name of cublas and cudnn microbenchmark (#99)
rename bin name and result metric of cublas and cudnn microbenchmark
2021-06-20 15:42:37 +08:00
Yifan Xiong 25ec3a7c1c
Dockerfile - Update CUDA 11.1.1 Dockerfile (#96)
Update packages and add build cache for CUDA 11.1.1 Dockerfile:

* Remove duplicate cmake and ompi, which are already in base image
* Add hpcx and sharp lib
* Add cache for gitmodules build
* Sort apt-get packages
2021-06-16 16:47:52 +08:00
Yifan Xiong ddbc51a135
Bug bash - Fix bugs and refine log in single GPU benchmarks (#97)
Fix bugs and refine log in single GPU benchmarks:

* Fix none framework issue
* Fix empty parameter bug
* Remove missed mobilenet_v3 models
* Change benchmark registration log to debug level
* Add pid in logging
* Add missing benchmarks in default config
* Fix deprecated logging warn
2021-06-16 13:51:22 +08:00
guoshzhao 03b41be145
Benchmarks: Fix Bug - Fix OOM issue when run pytorch models sequentially. (#93)
* Clean up the cache.
2021-06-07 10:19:05 +08:00
guoshzhao 2d9be807a9
Benchmarks: Fix Bug - Fix return code overwrite issue (#94)
* fix return code reset issue
2021-06-04 18:02:12 +08:00
Yifan Xiong 6b0ca1cb05
Runner - Support local mode in runner (#88)
* Support local mode in runner.
2021-06-02 23:58:44 +08:00
guoshzhao 44c5103b5c
Benchmarks: Code Revision - Change default shape of sharding-matmul. (#92)
* Change default shape of sharding-matmul.
2021-06-02 10:50:09 +08:00
guoshzhao 6c6f526937
Benchmarks: Add Benchmark - Add FLOPs performance benchmark for cuda. (#87)
* add cuda flops performance benchmark.
2021-06-02 09:15:58 +08:00
guoshzhao 331c740a15
Benchmarks: Add Feature - Add nvml package to provide python interfaces of nvidia. (#91)
2021-06-01 23:31:07 +08:00
Yuting Jiang 83235433b2
Benchmarks: Add benchmark - add micro benchmark for cudnn test (#89)
* add python related cudnn microbenchmark
2021-06-01 22:24:35 +08:00
Yuting Jiang 0831748167
Benchmarks: Code Revision - add error return code for cublas microbenchmark (#90)
* add error return code for cublas micro benchmark
2021-06-01 16:12:15 +08:00
guoshzhao 40d7905e6b
Benchmarks: Build Pipeline - Add cutlass as a submodule and add build logic. (#85)
* add cutlass as submodule.
* add build script for cutlass.
* only support compute capability 7.0(V100) and 8.0(A100)
2021-06-01 11:17:44 +08:00
Yuting Jiang 61c258fe5b
Benchmarks: Add benchmark - add source code of cudnn function micro benchmark (#78)
* Benchmarks: Add benchmark - add source code of cudnn function micro benchmark
2021-06-01 10:33:23 +08:00
Yifan Xiong 5e9f948df2
Executor - Save benchmark results to file (#86)
* Save benchmark results to json file.
2021-05-31 13:05:12 +08:00
Yuting Jiang 18398fbaa2
Benchmarks: Add benchmark - add micro benchmark for cublas test (#80)
* add benchmark for cublas test

* format

* revise error handling and test

* add interface to read json file, revise json file path and include .json in packaging

* add random_seed in arguments

* revise preprocess of cublas benchmark

* fix lint error and note error in source code

* update according to comments

* revise input arguments from json file to custom str and convert json file to built-in dict list

* restore package config

* fix lint issue

* update platform and comments

* rename files to match source code dir and fix comments error

Co-authored-by: root <root@sb-validation-000001.51z1chmys5fuzfqyo4niepozre.bx.internal.cloudapp.net>
2021-05-31 10:31:53 +08:00
Yifan Xiong 8b4f613a76
Runner - Support torch.distributed mode in runner (#81)
* Support `torch.distributed` mode in runner.
* Support given `proc_num` and `node_num` in `torch.distributed` mode.
2021-05-28 12:29:39 +08:00
Yuting Jiang 87f6b371e8
Benchmarks: Add benchmark - add source code of cublas function micro benchmark (#77)
* Superbenchmark: Add benchmarks - add cublas function micro benchmark

* format

* add python benchmark for cublas functions, example and test file

* delete python related and rename some files

* revise cmd_helper and move json package to cmake

* resolve conflict

* revise error handling to try-catch and update some code style

* revise cmd_helper.h, cublas_helper.h, cublas_helper.cpp

* revise structure of the cublas function

* add some comments and move cuda_init and cuda_free

* add comments for class member

* add random seed, revise input from file to json string, simplify cmake

* delete json file in source code of cublas

* update according to comments

* limit batchcount=1 in initialization of cublas functions which do not use batch count

* revise and fix some errors of annotations

* update according to comments and revise construction of CublasFunction

Co-authored-by: root <root@sb-validation-000001.51z1chmys5fuzfqyo4niepozre.bx.internal.cloudapp.net>
2021-05-27 16:44:18 +08:00
Yuting Jiang 4489388cf0
reverse the order of cppbuild and lint (#84)
2021-05-26 20:38:27 +08:00
Yifan Xiong e7f6d8ba78
CI/CD - Add integration tests for Ansible playbooks (#82)
* Add integration tests for Ansible playbooks
* Add `gpu_vendor` var to bypass gpu mount
2021-05-26 20:04:49 +08:00
Yuting Jiang e996516259
Benchmarks: Build Pipeline - Revise path of installing cmake projects (#83)
* Unify SB_MICRO_PATH and SB_MICRO_LIB

* fix bug of lib path
2021-05-26 19:14:14 +08:00
TobeyQin 1652524aa0
Docs - Update README file on main page (#79)
* Update Readme file on main page
2021-05-25 15:06:54 +08:00
Yifan Xiong c05e173b3d
Runner - Implement ansible client and runner (#69)
Implement ansible client and runner:
* add ansible client
* add deploy and check_env playbooks
2021-05-23 23:53:37 +08:00
guoshzhao e977bbc17f
Benchmarks: Add Benchmark - Add kernel launch overhead benchmark. (#74)
* add kernel launch overhead benchmark.
2021-05-19 17:06:55 +08:00
Yuting Jiang b7d0ee329f
expose interface of pin memory and modify cnn configuration (#75)
2021-05-19 10:52:45 +08:00
guoshzhao 7cfe7c16cf
Benchmarks: Add Benchmark - Add the source code of cuda kernel launch overhead benchmark. (#71)
* add cuda kernel launch overhead benchmark - source part.
* can customize the nvcc_archs_support.
* set SB_MICRO_PATH for azure pipeline tests.
2021-05-18 14:07:27 +08:00
guoshzhao 94d3765b49
Benchmarks: Build Pipeline - Call build script when setup environment. (#76)
* call build script in Makefile.
* add cppbuild command for testing and docker env.
2021-05-18 11:49:23 +08:00
Yifan Xiong 977b1a7355
CLI - Refine CLI handlers (#68)
* use absolute path of input file
* parse registry uri from image
* merge common parts for arguments processing
2021-05-18 11:34:15 +08:00
Yifan Xiong af6eb004df
CI/CD - Add GitHub Action to build and push image (#70)
* add GitHub Action to build and push image
* update Dockerfile to copy from context
2021-05-17 21:34:15 +08:00
guoshzhao 2bc7ada149
Benchmarks: Add Feature - Add script to build all cmake benchmark projects. (#72)
* add script to build all native benchmarks with cmake.
2021-05-17 18:16:54 +08:00
Yifan Xiong 0b2952d459
Setup - Add lint for cpp sources (#73)
__Major Revisions__

* add clang-format to lint cpp sources
* add cpp lint in GitHub Actions
2021-05-17 11:36:41 +08:00
guoshzhao 729e04ab94
Benchmarks: Code Revision - Revise MicroBenchmark class to be more flexible. (#66)
* Revise MicroBenchmark class to be more flexible.
* use command index not the command as the parameter.
* changes according to discussion.
2021-05-13 18:58:47 +08:00
Yifan Xiong 57ce473a02
Utils - Support lazy import (#67)
__Major Revision__

* Support lazy import.
* Not importing benchmarks when running `help`, `version`, `deploy` commands, etc.
2021-05-11 10:49:22 +08:00
guoshzhao 65292ae55b
Benchmarks: Code Revision - Revise the settings of CNN example models. (#65)
* revise example settings of cnn models.
2021-04-26 21:23:11 +08:00
guoshzhao a7184da3f3
Benchmarks: Fix Bug - Increase default sample count for benchmarking. (#64)
2021-04-26 20:47:12 +08:00
guoshzhao 0324117fd3
Benchmarks: Fix Bug - Fix dataset precision for CNN and LSTM benchmarks.
2021-04-26 20:37:06 +08:00
Yifan Xiong 2299d238dd
CI/CD - Speed up CPU pipeline (#61)
Speed up Azure pipeline for CPU unit test.
2021-04-21 11:58:55 +08:00
guoshzhao 2a7ab691f1
Benchmarks: Add Benchmark - Add LSTM model benchmarks. (#60)
* Benchmarks: Add Benchmark - Add LSTM model benchmarks.
2021-04-20 10:53:44 +08:00
guoshzhao 902ea211d1
Benchmarks: Add Benchmark - Add CNN model benchmarks. (#59)
* Benchmarks: Add Benchmark - Add CNN model benchmarks.
2021-04-20 10:43:02 +08:00
guoshzhao ce3ed24ab7
Benchmarks: Code Revision - Fix some issues for BERT benchmark. (#58)
2021-04-16 13:17:42 +08:00
guoshzhao af567cf650
Benchmarks: Add Benchmark - Add GPT2 model benchmark. (#57)
* Benchmarks: Add Benchmark - Add GPT2 model benchmark.
2021-04-16 11:39:57 +08:00
guoshzhao fb850af760
Benchmarks: Add Feature - Add interface to get all predefined parameters of all benchmarks. (#56)
* Benchmarks: Add Feature - Add interface to get all predefined parameters of all benchmarks.
2021-04-14 22:38:26 +08:00
Yuting Jiang 435b2d5eeb
Benchmarks: Add Benchmark - Add computation and communication overlap micro benchmark (#39)
* Benchmarks: Add Benchmark - add computation and communication overlap micro benchmark

* Benchmarks: Add benchmark - fix some format issues and typo

* Benchmarks: Add Benchmark - update according to comments and add test

* revise tests

* skip multi gpu test due to no multi gpu

Co-authored-by: v-yujiang <v-yujiang@microsoft.com>
2021-04-14 18:07:06 +08:00
Yifan Xiong 74c4d1b2a4
Setup: Code Revision - Rename dev branch to main in config and readme (#55)
* Rename dev branch to main and set it as default.
2021-04-14 17:40:34 +08:00
Yuting Jiang 5e6897200b
Benchmarks: Revise Test - Revise benchmark test util to support pytorch multi-GPU test (#54)
* Superbenchmark: Revise tests - revise benchmark test util to support multi gpu test

* modify test_sharding_matmul.py to match the tests util
2021-04-14 16:39:53 +08:00
Yifan Xiong cb33c99ccb
Docs - Update README for public view (#52)
* update README for public view
2021-04-13 17:41:33 +08:00
Yifan Xiong 9bba27fa24
CI/CD - Fix issues in codecov (#53)
* add flags for different pipelines to merge reports
* add token for cuda pipeline
* update codecov/patch target
2021-04-13 16:25:35 +08:00