Commit graph

16 commits

Author SHA1 Message Date
Michael Wyatt b361c72761
Update DeepSpeed copyright license to Apache 2.0 (#3111)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
2023-03-30 17:14:38 -07:00
Jeff Rasley 91d63e0228
update formatter version and style settings (#3098) 2023-03-27 07:55:19 -04:00
Joe Mayer 18713c6838
Updating API docs (#2586)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
2022-12-08 20:28:23 -08:00
Siddharth Singh 5fe9d61065
Tensor parallelism for Mixture of Experts (#2074)
* tensor parallelism for mixture of experts
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
2022-08-01 00:33:05 +00:00
Alex Hedges 316c4a43e0
Add flake8 to pre-commit checks (#2051) 2022-07-25 16:48:08 -07:00
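The commit above wires flake8 into the repository's pre-commit checks. As a hedged illustration (the actual hook entry and pinned `rev` in DeepSpeed's `.pre-commit-config.yaml` may differ), a minimal flake8 hook in a pre-commit config typically looks like:

```yaml
# .pre-commit-config.yaml — minimal sketch; rev is a placeholder, not the repo's pin
repos:
  - repo: https://github.com/pycqa/flake8
    rev: 6.0.0
    hooks:
      - id: flake8
```

Running `pre-commit run --all-files` then lints every tracked Python file before a commit is accepted.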
Jianfeng Liu b4513f6310
fix softmax dim of Residual MoE in moe/layer.py (#2110)
Thanks a lot for finding and fixing this issue :)
Co-authored-by: Zhewei Yao <zheweiyao@gmail.com>
2022-07-20 01:51:34 +00:00
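The fix above corrects the axis a softmax is applied over. In a Residual MoE layer, each token has a small vector of mixing coefficients (dense branch vs. expert branch) that must be normalized per token; applying softmax over the wrong dimension normalizes across tokens instead. A self-contained NumPy sketch of the difference (the shapes here are illustrative, not DeepSpeed's actual tensors):

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# coef: per-token mixing weights between the dense (residual) branch
# and the expert branch, shape (num_tokens, 2)
coef = np.array([[2.0, 0.5],
                 [0.1, 1.5]])

right = softmax(coef, axis=-1)  # each token's two weights sum to 1
wrong = softmax(coef, axis=0)   # normalizes across tokens: rows no longer sum to 1
```

With the correct axis, `right.sum(axis=-1)` is all ones; with the wrong axis the per-token weights are no longer a valid mixture.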
Karim Foda 735406e536
fix import errors (#2026)
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
2022-06-20 13:22:43 -07:00
Ammar Ahmad Awan 36ad3119d5
DeepSpeed comm backend v1 (#1985)
Co-authored-by: Quentin Anthony <qganthony@yahoo.com>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
2022-06-10 16:47:33 -07:00
shjwudp 1e61c7a860
fix: Fix undefined variable in _create_expert_data_and_model_parallel and make it easier to understand (#1826)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
2022-03-17 14:00:37 -07:00
Ammar Ahmad Awan c0af6d90f7
Refactor MoE and Groups API to simplify model creation and management (#1798)
Co-authored-by: yaozhewei <zheweiy@berkeley.edu>
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
2022-02-28 11:46:40 -08:00
Jeff Rasley e46d808a1b
MoE inference + PR-MoE model support (#1705)
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
Co-authored-by: Zhewei Yao <zheweiy@berkeley.edu>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Samyam Rajbhandari <samyamr@microsoft.com>
2022-01-18 16:25:01 -08:00
alexandremuzio 2887349cd4
Adding Tutel to MoE layer (#1528)
Co-authored-by: Alex Muzio <alferre@microsoft.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
2021-11-05 14:45:28 -07:00
Ammar Ahmad Awan 56635d5b6c
enable/disable moe token dropping. (#1492)
* Add a flag to enable/disable token dropping in moe/top-1 gating.

* fix syntax and formatting.
2021-10-28 17:08:00 +00:00
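The commit above adds a flag to enable or disable token dropping in top-1 gating. As a toy sketch of the underlying idea (a deliberate simplification, not DeepSpeed's gating code): each token is routed to its highest-scoring expert, and when dropping is enabled, tokens beyond an expert's capacity are discarded instead of routed.

```python
import numpy as np

def top1_gate(logits, capacity, drop_tokens=True):
    """Toy top-1 gating: route each token to its best-scoring expert.

    With drop_tokens=True, tokens arriving after an expert is at
    capacity are dropped; with drop_tokens=False, all tokens are kept.
    """
    num_tokens, num_experts = logits.shape
    expert = logits.argmax(axis=-1)        # chosen expert per token
    load = np.zeros(num_experts, dtype=int)
    assignment = []                        # routed (token, expert) pairs
    for t in range(num_tokens):
        e = int(expert[t])
        if drop_tokens and load[e] >= capacity:
            continue                       # expert is full: drop this token
        load[e] += 1
        assignment.append((t, e))
    return assignment
```

For example, with two tokens preferring the same expert and a capacity of 1, the second token is dropped when `drop_tokens=True` and kept otherwise.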
Ammar Ahmad Awan 9f5939d2a7
Remove dropout as client code can do it independently. (#1354)
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
2021-09-10 10:18:09 -07:00
Jeff Rasley 9cb64a1fc5
MoE read the docs update (#1312) 2021-08-17 10:50:08 -07:00
Ammar Ahmad Awan f28432441b
DeepSpeed MoE (#1310)
Co-authored-by: Alex Muzio <Alex.Muzio@microsoft.com>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
Co-authored-by: Conglong Li <conglong.li@gmail.com>
Co-authored-by: Felipe Cruz Salinas <Andres.Cruz@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
Co-authored-by: Samyam Rajbhandari <samyamr@microsoft.com>
Co-authored-by: Shaden Smith <shaden.smith@microsoft.com>
Co-authored-by: Young Jin Kim <youki@microsoft.com>
Co-authored-by: bapatra <bapatra@microsoft.com>
2021-08-17 05:18:45 +00:00