DeepSpeed/docs/_posts/2023-08-24-ulysses-chinese.md

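| title | excerpt | link | date | tags |
| --- | --- | --- | --- | --- |
| DeepSpeed Ulysses: System Optimizations for Training Transformer Models with Extremely Long Sequences | | https://github.com/microsoft/DeepSpeed/blob/master/blogs/deepspeed-ulysses/chinese/README.md | 2023-08-24 00:00:00 | training ZeRO Chinese |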