DeepSpeed

History

wyooyw b647fb2470 Fix expert grad scaling problem with ZeRO optimizer (#6546)	2024-10-23 00:08:39 +00:00

Fix [#6545]

work:
- expert gradient average: divide edp_world_size -> divide dp_world_size
- unit test: make sure model with different dp/ep has same expert gradient

---------

Co-authored-by: wangyiou <wangyiou@xiaohongshu.com>
Co-authored-by: Masahiro Tanaka <81312776+tohtana@users.noreply.github.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
autotuning	Update BUFSIZE to come from autotuner's constants.py, not numpy (#5686)	2024-06-19 14:19:50 -07:00
checkpoint	fix the missing argument in test and typo (#5730)	2024-07-08 21:44:33 +00:00
comm	Avoid security issues of subprocess shell (#6498)	2024-09-11 20:07:06 +00:00
compression	fix: Remove duplicate word the (#4051)	2023-07-27 09:33:13 -07:00
elasticity	Avoid security issues of subprocess shell (#6498)	2024-09-11 20:07:06 +00:00
inference	Rearrange inference OPS and stop using builder.load (#5490)	2024-10-09 01:22:28 +00:00
launcher	Accept btl_tcp_if_include option through launcher_args (#6613)	2024-10-14 19:26:24 +00:00
linear	OptimizedLinear updates (#5791)	2024-08-13 23:36:22 +00:00
model_implementations	Rearrange inference OPS and stop using builder.load (#5490)	2024-10-09 01:22:28 +00:00
module_inject	Enabled Qwen2-MoE Tensor Parallelism (TP) inference (#6551)	2024-10-09 15:23:16 +00:00
moe	reduce cpu host overhead when using moe (#5578)	2024-08-21 21:52:48 +00:00
monitor	Pydantic v2 migration (#5167)	2024-08-22 15:38:13 -07:00
nebula	fix typo with deepspeed/ (#3547)	2023-06-02 00:47:14 +00:00
nvme	DeepNVMe perf tuning (#6560)	2024-09-26 13:07:19 +00:00
ops	Rearrange inference OPS and stop using builder.load (#5490)	2024-10-09 01:22:28 +00:00
pipe	Update DeepSpeed copyright license to Apache 2.0 (#3111)	2023-03-30 17:14:38 -07:00
profiling	Add conditional on torch version for scaled_dot_product_attention (#6517)	2024-09-11 23:21:43 +00:00
runtime	Fix expert grad scaling problem with ZeRO optimizer (#6546)	2024-10-23 00:08:39 +00:00
sequence	Long sequence parallelism (Ulysses) integration with HuggingFace (#5774)	2024-08-21 01:46:50 +00:00
utils	add option to disable logger while compiling to avoid graph breaks (#6496)	2024-10-15 18:30:42 +00:00
__init__.py	Long sequence parallelism (Ulysses) integration with HuggingFace (#5774)	2024-08-21 01:46:50 +00:00
accelerator	Abstract accelerator (step 2) (#2560)	2023-01-06 23:40:58 -05:00
constants.py	Allow env var for timeout (#4405)	2023-10-10 08:56:10 -07:00
env_report.py	Add Windows scripts (deepspeed, ds_report). (#5699)	2024-07-09 01:05:09 +00:00
git_version_info.py	Make op builder detection adapt to accelerator change (#5206)	2024-03-12 20:48:29 +00:00