mirror of https://github.com/microsoft/DeepSpeed.git
b1cb0dfc46
This PR introduces the Twin-Flow feature of ZeRO-Offload++, which improves end-to-end training iteration time by up to 6x on DGX-H100s.

This PR includes:
* Twin-Flow implementation inside ZeRO optimizer
* json config tutorial
* example using deepspeed
* unit tests

cc @jeffra @awan-10 @tjruwase @mrwyattii

Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>

||
---|---|---|
compression.md | ||
config-json.md | ||
deepspeed4science.md | ||
inference.md | ||
posts-landing.md | ||
posts_list_landing.md | ||
training.md | ||
tutorials-landing.md |