Michael Wyatt
b361c72761
Update DeepSpeed copyright license to Apache 2.0 ( #3111 )
...
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
2023-03-30 17:14:38 -07:00
Jeff Rasley
91d63e0228
update formatter version and style settings ( #3098 )
2023-03-27 07:55:19 -04:00
Joe Mayer
18713c6838
Updating API docs ( #2586 )
...
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
2022-12-08 20:28:23 -08:00
Siddharth Singh
5fe9d61065
Tensor parallelism for Mixture of Experts ( #2074 )
...
* tensor parallelism for mixture of experts
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
2022-08-01 00:33:05 +00:00
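Per the PR title above, this change lets the experts themselves be sharded with tensor parallelism. Below is a hedged sketch of the layer-level switch; the `enable_expert_tensor_parallelism` flag is our assumption about the resulting `MoE` API, and construction requires a distributed launch:

```python
import torch
from deepspeed.moe.layer import MoE

# Assumed API sketch (not verified against this exact PR): the flag shards
# each expert's weights across the tensor-parallel group. Requires a
# distributed launch so DeepSpeed can build the parallel groups.
expert = torch.nn.Linear(1024, 1024)
moe = MoE(hidden_size=1024,
          expert=expert,
          num_experts=8,
          enable_expert_tensor_parallelism=True)  # assumption: flag from this PR
```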
Alex Hedges
316c4a43e0
Add flake8 to pre-commit checks ( #2051 )
2022-07-25 16:48:08 -07:00
Jianfeng Liu
b4513f6310
fix softmax dim of Residual MoE in moe/layer.py ( #2110 )
...
Thanks a lot for finding and fixing this issue :)
Co-authored-by: Zhewei Yao <zheweiyao@gmail.com>
2022-07-20 01:51:34 +00:00
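For context on the fix above: in a Residual MoE block, a gate produces two logits per token that mix the dense-MLP branch with the expert branch, and the softmax must normalize over those two logits, not across tokens. A minimal PyTorch sketch with toy shapes and names (illustrative only, not the repository code):

```python
import torch
import torch.nn.functional as F

# Per-token coefficient that mixes the dense-MLP and expert branches.
hidden = torch.randn(4, 8, 16)          # (batch, seq, hidden), toy sizes
coef_proj = torch.nn.Linear(16, 2)      # 2 logits per token: [mlp, expert]
logits = coef_proj(hidden)              # (4, 8, 2)

coef = F.softmax(logits, dim=-1)        # correct: normalize the 2 branch logits
bad = F.softmax(logits, dim=0)          # wrong axis: normalizes across the batch

mlp_out = torch.randn(4, 8, 16)         # stand-ins for the two branch outputs
expert_out = torch.randn(4, 8, 16)
output = mlp_out * coef[..., 0:1] + expert_out * coef[..., 1:2]
```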
Karim Foda
735406e536
fix import errors ( #2026 )
...
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
2022-06-20 13:22:43 -07:00
Ammar Ahmad Awan
36ad3119d5
DeepSpeed comm backend v1 ( #1985 )
...
Co-authored-by: Quentin Anthony <qganthony@yahoo.com>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
2022-06-10 16:47:33 -07:00
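The comm backend introduced above exposes a torch.distributed-style API under `deepspeed.comm`. A hedged usage sketch, assuming a multi-process launch (e.g. via the `deepspeed` launcher) with one CUDA device per rank:

```python
import torch
import deepspeed
import deepspeed.comm as dist

# Sketch only: deepspeed.comm mirrors the torch.distributed collective API,
# letting DeepSpeed route calls through its own backend implementations.
deepspeed.init_distributed()        # defaults to NCCL on GPU

t = torch.ones(4, device="cuda")
dist.all_reduce(t)                  # element-wise sum across all ranks
if dist.get_rank() == 0:
    print(t)                        # each element equals the world size
```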
shjwudp
1e61c7a860
fix: Fix undefined variable in _create_expert_data_and_model_parallel and make it easier to understand ( #1826 )
...
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
2022-03-17 14:00:37 -07:00
Ammar Ahmad Awan
c0af6d90f7
Refactor MoE and Groups API to simplify model creation and management ( #1798 )
...
Co-authored-by: yaozhewei <zheweiy@berkeley.edu>
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
2022-02-28 11:46:40 -08:00
Jeff Rasley
e46d808a1b
MoE inference + PR-MoE model support ( #1705 )
...
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
Co-authored-by: Zhewei Yao <zheweiy@berkeley.edu>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Samyam Rajbhandari <samyamr@microsoft.com>
2022-01-18 16:25:01 -08:00
alexandremuzio
2887349cd4
Adding Tutel to MoE layer ( #1528 )
...
Co-authored-by: Alex Muzio <alferre@microsoft.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
2021-11-05 14:45:28 -07:00
Ammar Ahmad Awan
56635d5b6c
enable/disable moe token dropping. ( #1492 )
...
* Add a flag to enable/disable token dropping in moe/top-1 gating.
* fix syntax and formatting.
2021-10-28 17:08:00 +00:00
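The flag added above toggles capacity-based token dropping in top-1 gating. A self-contained sketch of the idea, with illustrative names and shapes that are not DeepSpeed's implementation:

```python
import torch
import torch.nn.functional as F

# Conceptual sketch of capacity-based token dropping in top-1 gating.
tokens, num_experts, capacity = 8, 4, 2
logits = torch.randn(tokens, num_experts)
gates = F.softmax(logits, dim=-1)
expert_idx = gates.argmax(dim=-1)             # top-1 expert per token

# Position of each token within its chosen expert's queue.
one_hot = F.one_hot(expert_idx, num_experts)  # (tokens, experts)
position = one_hot.cumsum(dim=0) * one_hot    # 1-based slot per token
kept = position.sum(dim=-1) <= capacity       # drop tokens past capacity

# With dropping disabled, capacity is instead raised to fit every token.
print(f"kept {kept.sum().item()} of {tokens} tokens")
```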
Ammar Ahmad Awan
9f5939d2a7
Remove dropout as client code can do it independently. ( #1354 )
...
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
2021-09-10 10:18:09 -07:00
Jeff Rasley
9cb64a1fc5
MoE read the docs update ( #1312 )
2021-08-17 10:50:08 -07:00
Ammar Ahmad Awan
f28432441b
DeepSpeed MoE ( #1310 )
...
Co-authored-by: Alex Muzio <Alex.Muzio@microsoft.com>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
Co-authored-by: Conglong Li <conglong.li@gmail.com>
Co-authored-by: Felipe Cruz Salinas <Andres.Cruz@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
Co-authored-by: Samyam Rajbhandari <samyamr@microsoft.com>
Co-authored-by: Shaden Smith <shaden.smith@microsoft.com>
Co-authored-by: Young Jin Kim <youki@microsoft.com>
Co-authored-by: bapatra <bapatra@microsoft.com>
2021-08-17 05:18:45 +00:00
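The initial MoE release above centers on `deepspeed.moe.layer.MoE`, which wraps a user-supplied expert module with gating and expert parallelism. A hedged sketch of that usage (assumes a distributed launch so parallel groups exist; defaults and the exact return values may differ across versions):

```python
import torch
from deepspeed.moe.layer import MoE

# Sketch: wrap an expert MLP with top-1 gating across 8 experts.
hidden_size = 512
expert = torch.nn.Sequential(
    torch.nn.Linear(hidden_size, 4 * hidden_size),
    torch.nn.ReLU(),
    torch.nn.Linear(4 * hidden_size, hidden_size),
)
moe = MoE(hidden_size=hidden_size, expert=expert, num_experts=8, k=1)

x = torch.randn(2, 10, hidden_size)
output, l_aux, exp_counts = moe(x)  # l_aux: load-balancing auxiliary loss
```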