DeepSpeed/op_builder
Logan Adams ccfdb84e2a
FP6 quantization end-to-end. (#5234)
The user interface: https://github.com/microsoft/DeepSpeed-MII/pull/433
The nv-a6000 CI run against the MII branch linked above is
[here](https://github.com/microsoft/DeepSpeed/actions/runs/8192124606).

Co-authored-by: Zhen Zheng <zhengzhen@microsoft.com>
Co-authored-by: Shiyang Chen <csycfl@gmail.com>
Co-authored-by: Arash Bakhtiari <abakhtiari@microsoft.com>
Co-authored-by: Haojun Xia <xhjustc@mail.ustc.edu.cn>

---------

Co-authored-by: ZHENG, Zhen <zhengzhen.z@qq.com>
Co-authored-by: Shiyang Chen <csycfl@gmail.com>
Co-authored-by: Haojun Xia <xhjustc@mail.ustc.edu.cn>
Co-authored-by: Arash Bakhtiari <arash@bakhtiari.org>
Co-authored-by: Michael Wyatt <mrwyattii@gmail.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
2024-03-07 16:55:43 -08:00

| Name | Latest commit | Date |
| --- | --- | --- |
| `cpu` | Use ninja to speed up build (#5088) | 2024-02-21 02:20:11 +00:00 |
| `hpu` | Use ninja to speed up build (#5088) | 2024-02-21 02:20:11 +00:00 |
| `npu` | [NPU] Add NPU to support hybrid engine (#4831) | 2024-02-01 14:02:27 -08:00 |
| `xpu` | Use ninja to speed up build (#5088) | 2024-02-21 02:20:11 +00:00 |
| `__init__.py` | add type checker ignore to resolve that pylance can't resolved noqa annotation (#4102) | 2023-08-08 13:30:23 +00:00 |
| `all_ops.py` | Update DeepSpeed copyright license to Apache 2.0 (#3111) | 2023-03-30 17:14:38 -07:00 |
| `async_io.py` | move torch import (#4468) | 2023-10-09 01:33:17 +00:00 |
| `builder.py` | Use ninja to speed up build (#5088) | 2024-02-21 02:20:11 +00:00 |
| `cpu_adagrad.py` | ROCm 6.0 prep changes (#4537) | 2023-10-20 19:05:54 +00:00 |
| `cpu_adam.py` | ROCm 6.0 prep changes (#4537) | 2023-10-20 19:05:54 +00:00 |
| `cpu_lion.py` | feat: add Lion optimizer (#4331) | 2023-10-05 22:32:14 +00:00 |
| `evoformer_attn.py` | fix multiple definition while building evoformer (#4556) | 2023-10-26 21:48:07 +00:00 |
| `fused_adam.py` | Update DeepSpeed copyright license to Apache 2.0 (#3111) | 2023-03-30 17:14:38 -07:00 |
| `fused_lamb.py` | Update DeepSpeed copyright license to Apache 2.0 (#3111) | 2023-03-30 17:14:38 -07:00 |
| `fused_lion.py` | feat: add Lion optimizer (#4331) | 2023-10-05 22:32:14 +00:00 |
| `inference_core_ops.py` | FP6 quantization end-to-end. (#5234) | 2024-03-07 16:55:43 -08:00 |
| `inference_cutlass_builder.py` | Isolate src code and testing for DeepSpeed-FastGen (#4610) | 2023-11-07 12:03:44 -06:00 |
| `quantizer.py` | Enable quantizer op on ROCm (#4114) | 2024-01-10 23:48:54 +00:00 |
| `ragged_ops.py` | Use ninja to speed up build (#5088) | 2024-02-21 02:20:11 +00:00 |
| `ragged_utils.py` | Isolate src code and testing for DeepSpeed-FastGen (#4610) | 2023-11-07 12:03:44 -06:00 |
| `random_ltd.py` | ROCm 6.0 prep changes (#4537) | 2023-10-20 19:05:54 +00:00 |
| `sparse_attn.py` | Update torch version check in building sparse_attn (#3152) | 2023-04-14 16:24:39 +00:00 |
| `spatial_inference.py` | Update DeepSpeed copyright license to Apache 2.0 (#3111) | 2023-03-30 17:14:38 -07:00 |
| `stochastic_transformer.py` | Update DeepSpeed copyright license to Apache 2.0 (#3111) | 2023-03-30 17:14:38 -07:00 |
| `transformer.py` | ROCm 6.0 prep changes (#4537) | 2023-10-20 19:05:54 +00:00 |
| `transformer_inference.py` | Hybrid Engine Refactor and Llama Inference Support (#3425) | 2023-05-03 17:20:07 -07:00 |