mirror of https://github.com/microsoft/DeepSpeed.git
8e891aa568
* fixing the softmax masking when using triangular masking * fix a bug in the layernorm backward kernels * revert back some changes & remove debug code * change the constants to a macro Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com> |
||
---|---|---|
.. | ||
benchmarks | ||
model | ||
onebit | ||
perf | ||
small_model_debugging | ||
unit | ||
conftest.py |