DeepSpeed/deepspeed
Nadav Elyahu 2fc702ed9f
DeepSpeedCheckpoint: support custom final ln idx (#5506)
Until now, only the last layer (idx=-1) was treated as the final layer norm,
via FINAL_LAYER_NORM_INDEX, which is set to -1.
This PR allows the user to pass a custom value for models where this
default value does not apply.
See an example of usage in the HabanaAI/Megatron-DeepSpeed fork repository:

c9feb8caca/tools/verify_checkpoint_non_tp_consistency.py (L296)
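
The change described above amounts to turning a hard-coded index into an overridable default. A minimal, self-contained sketch of that pattern follows. It is not DeepSpeed's implementation: the helper `select_final_layer_norm`, its `final_layer_norm_idx` parameter, and the layer names are hypothetical and used only for illustration. Only the constant name `FINAL_LAYER_NORM_INDEX` and its value of -1 come from the commit message.

```python
# Illustrative sketch only; does not import or reproduce DeepSpeed code.
# FINAL_LAYER_NORM_INDEX and its value (-1) are taken from the commit message.
FINAL_LAYER_NORM_INDEX = -1


def select_final_layer_norm(layer_keys, final_layer_norm_idx=FINAL_LAYER_NORM_INDEX):
    """Return the entry treated as the final layer norm.

    Hypothetical helper: by default the last entry is used; callers whose
    model places the final layer norm elsewhere pass a custom index.
    """
    return layer_keys[final_layer_norm_idx]


# Hypothetical ordered layer names, as they might appear in a checkpoint.
layer_keys = ["layer_01", "layer_02", "layer_03", "layer_04"]

default_choice = select_final_layer_norm(layer_keys)  # uses idx=-1
custom_choice = select_final_layer_norm(layer_keys, final_layer_norm_idx=-2)
```

Keeping -1 as the default leaves existing callers unaffected, while a model whose final layer norm is not the last layer can pass its own index.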

---------

Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Logan Adams <loadams@microsoft.com>
2024-05-28 23:42:59 +00:00
autotuning Modified regular expression (#5306) 2024-03-27 15:38:25 +00:00
checkpoint DeepSpeedCheckpoint: support custom final ln idx (#5506) 2024-05-28 23:42:59 +00:00
comm Adapt doc for #4405 (#5552) 2024-05-21 21:58:47 +00:00
compression fix: Remove duplicate word the (#4051) 2023-07-27 09:33:13 -07:00
elasticity Cleanup required_torch_version code and references. (#5370) 2024-04-10 15:39:24 +00:00
inference Rocm warp size fix (#5402) 2024-05-17 20:35:58 +00:00
launcher add device config env for the accelerator (#5396) 2024-04-20 23:35:50 +00:00
linear Fix crash when creating Torch tensor on NPU with device=get_accelerator().current_device() (#5464) 2024-05-07 00:05:54 +00:00
model_implementations Capture short kernel sequences to graph (#4318) 2023-12-20 20:51:36 +00:00
module_inject Switch from double quotes to match single quotes (#5530) 2024-05-13 20:20:21 -07:00
moe Fix RuntimeError for moe on XPU: tensors found at least two devices (#5519) 2024-05-21 15:01:05 +00:00
monitor New integration - CometMonitor (#5466) 2024-05-15 16:04:44 +00:00
nebula fix typo with deepspeed/ (#3547) 2023-06-02 00:47:14 +00:00
ops [INF] DSAttention allow input_mask to have false as value (#5546) 2024-05-22 20:22:53 +00:00
pipe Update DeepSpeed copyright license to Apache 2.0 (#3111) 2023-03-30 17:14:38 -07:00
profiling Update flops profiler to handle attn and __matmul__ (#4724) 2024-02-21 16:53:43 +00:00
runtime [MiCS] Remove the handle print on DeepSpeed side (#5574) 2024-05-28 17:08:42 +00:00
sequence Add Ulysses DistributedAttention compatibility (#5525) 2024-05-22 21:52:39 +00:00
utils Add throughput timer configuration (#5363) 2024-05-22 20:28:02 +00:00
__init__.py Add `distributed_port` for `deepspeed.initialize` (#5260) 2024-04-02 19:33:20 +00:00
accelerator Abstract accelerator (step 2) (#2560) 2023-01-06 23:40:58 -05:00
constants.py Allow env var for timeout (#4405) 2023-10-10 08:56:10 -07:00
env_report.py Make op builder detection adapt to accelerator change (#5206) 2024-03-12 20:48:29 +00:00
git_version_info.py Make op builder detection adapt to accelerator change (#5206) 2024-03-12 20:48:29 +00:00
pydantic_v1.py Introduce pydantic_v1 compatibility module for pydantic>=2.0.0 support (#4407) 2023-10-09 11:59:30 -07:00