* Fix auto TP for duplicate modules with different gems
* precommit and comments
* Comment
* Combine gem list of same named modules
* remove duplicates from gem_list before updating policy
* Add module attribute with name variation for ProphetNet
---------
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
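A purely hypothetical sketch of the gem-list combining and de-duplication steps described in the commits above; `gem_map` and `combine_gems` are illustrative names, not the actual auto-TP code in deepspeed/module_inject.

```python
# Hypothetical sketch: merge gem lists for same-named modules and drop
# duplicates before the policy is updated (all names here are illustrative).
from collections import defaultdict

def combine_gems(named_modules):
    """named_modules: iterable of (module_name, gem_list) pairs."""
    gem_map = defaultdict(list)
    for name, gems in named_modules:
        gem_map[name].extend(gems)  # same-named modules share one combined list
    # remove duplicates while keeping a stable order
    return {name: list(dict.fromkeys(gems)) for name, gems in gem_map.items()}
```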
This PR refactors the organization of meta tensor checkpoint loading as follows:
- Move get_param_names() abstract method definition from TransformerPolicy into MetaTensorContainer
- Model-specific get_param_names() definitions moved from policy into model-specific container
- selected_policy_g, megatron_v2_g, and transformer_config_g globals replaced with a single container_g global, since the container will contain all of the information those globals previously captured
- ckpt_load_enabled flag added to containers; it defaults to False in the base.py container and is set to True when the MetaTensorContainer feature is inherited
- Assertion added to replace_transformer_layer before checkpoint loading, checking that ckpt_load_enabled == True; otherwise an error message is printed saying the container does not support meta tensor checkpoint loading.
The aim of these changes is to couple the meta tensor checkpoint loading code more closely to the MetaTensorContainer and to allow better error reporting when checkpoint loading is used on model types that don't support this feature.
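A rough sketch of the flag-and-assert pattern described above; the class and function names mirror the description, but the bodies are illustrative rather than the actual DeepSpeed implementation.

```python
# Illustrative sketch of the ckpt_load_enabled pattern.
class BaseTransformerContainer:
    def __init__(self):
        # meta tensor checkpoint loading is off unless a container opts in
        self.ckpt_load_enabled = False

class MetaTensorContainer(BaseTransformerContainer):
    def __init__(self):
        super().__init__()
        self.ckpt_load_enabled = True

    def get_param_names(self):
        # model-specific containers override this with their parameter names
        raise NotImplementedError

def replace_transformer_layer(container, checkpoint=None):
    if checkpoint is not None:
        assert container.ckpt_load_enabled, \
            f"{type(container).__name__} does not support meta tensor checkpoint loading"
        # ... perform checkpoint loading ...
```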
This PR cleans up some container items and removes an unused qkv_merging parameter:
- Remove qkv_merging=True from BERT containers
- Change containers config object to ds_model_config
- Remove qkv_merging param
* Integrate accelerator abstraction interface into deepspeed/
* Fix error message in fp16/fused_optimizer
* fix error message in fp16/unfused_optimizer.py
* assign get_accelerator().pin_memory() result to input Tensor name
* no need to check cuda and whether nvtx supported
* move try-except into inner most block
* call Event() and Stream() in get_accelerator() for data type
* Make Stream and Event properties of the abstract interface so they can be used as data types in deepspeed
* Apply op_builder backend api change from #2705 from @jeffra
* fix tests where Builder NAME is used
* keep original ...Builder.NAME interface instead of ...Builder().NAME interface
* fix builder closure for installation
* fix randomltd builder
* add comments to clarify create_op_builder and get_op_builder
* fix compatibility with pip install -e
Co-authored-by: Cheng Li <pistasable@gmail.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
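A short usage sketch of the accelerator abstraction referenced in the commits above; `get_accelerator()` is the entry point in `deepspeed.accelerator`, but the surrounding code is only illustrative.

```python
import torch
from deepspeed.accelerator import get_accelerator

# device-agnostic replacements for direct torch.cuda usage
device = get_accelerator().device_name()            # e.g. "cuda" on NVIDIA backends
stream = get_accelerator().Stream()                 # Stream exposed by the abstract interface
event = get_accelerator().Event(enable_timing=True) # Event usable as a data type

# pin_memory result assigned back to the same tensor name, as in the commit above
buf = torch.empty(1024)
buf = get_accelerator().pin_memory(buf)
```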
* loop through pipe.model
* tp_parser first draft
* client_module must be type object
* Simplify layernorm tracking. Add unittest.
* cleanup
* Add more models to unittest
* cleanup inference pytest for merging
* Add unittest
* cleanup
* pre-commit
* unittest id and pytest marker
* try marian for unittest
* precommit
* Move tp code to separate file
* Add new auto tp file
* pre-commit and type
* Update deepspeed/module_inject/auto_tp.py
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
* Update deepspeed/module_inject/auto_tp.py
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
* Update tests/unit/inference/test_inference.py
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
* remove unused fillmask function
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
Co-authored-by: Lev Kurilenko <lekurile@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
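For the automatic tensor-parallelism (auto TP) work above, a hedged usage sketch: with kernel injection disabled, DeepSpeed inference falls back to the auto-TP parser in deepspeed/module_inject/auto_tp.py. The model and sizes are illustrative (a Marian model, matching the unit test mentioned above).

```python
import torch
import deepspeed
from transformers import AutoModelForSeq2SeqLM

# illustrative model; auto TP is exercised when kernel injection is off
model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-en-de")

model = deepspeed.init_inference(
    model,
    mp_size=2,                         # tensor-parallel degree
    dtype=torch.float16,
    replace_with_kernel_inject=False,  # no custom kernels -> auto TP parser is used
)
```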
* fix Opt injection & add injection verification check at inference test
* fix several issues
* remove fixture
* remove check_injection when no kernel is injected
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
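A minimal sketch of an injection verification check like the one added to the inference test above; detecting replaced modules by class name is an assumption, not the exact test code.

```python
def check_injection(model):
    # assumption: kernel injection replaces transformer blocks with
    # DeepSpeedTransformerInference modules, so at least one should be present
    replaced = [
        m for m in model.modules()
        if m.__class__.__name__ == "DeepSpeedTransformerInference"
    ]
    assert len(replaced) > 0, "No DeepSpeed inference modules were injected"
```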
This PR removes the zero-inference GatheredParameters context from replace_with_policy, since zero-inference is no longer needed after the introduction of meta tensor support for BLOOM.
This PR updates the MegatronLayerPolicy to set megatron_v2=True, which is required in order to properly transpose in the replace_with_policy() function.
After the change in this PR, in conjunction with PR #99 in the Megatron-DeepSpeed fork, the Megatron text-generation example works with DS inference.
* fix checkpoint loading when it is a dictionary
* fix some issues with saving ckpt & int8 inference
* fix quantized-inference & add generic support of checkpoint loading
* remove int8 hard-coded flag
* fix mlp return tensors
* fix several issues with loading checkpoints of GPT-J, GPT-NeoX, and OPT with different TP sizes
* add more comments & description for checkpoint-loading module
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
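A hedged sketch of the generic checkpoint-loading path described above: a small JSON file lists the shards and is passed to deepspeed.init_inference via the checkpoint argument. The file name, descriptor fields, and model are illustrative.

```python
import json
import torch
import deepspeed
from transformers import AutoModelForCausalLM

# illustrative checkpoint descriptor; the exact fields depend on the model family
ckpt_desc = {
    "type": "ds_model",
    "version": 1.0,
    "checkpoints": ["shard_00.pt", "shard_01.pt"],
}
with open("checkpoints.json", "w") as f:
    json.dump(ckpt_desc, f)

model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B")
engine = deepspeed.init_inference(
    model,
    mp_size=2,
    dtype=torch.float16,
    checkpoint="checkpoints.json",
    replace_with_kernel_inject=True,
)
```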
* pass down the new DS inference config to replace_transformer_layer.
* remove quantize_settings and rename the ep_mp_group.
* Fix model_config passing. Fixes gptj issue with wrong output.
* fix small bug in gpt-neo.
Co-authored-by: Reza Yazdani and Michael Wyatt
Changes to the inference API to accept a config dict, and cleanup of the Inference Engine to utilize the newly added inference config.
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
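A hedged usage sketch of the config-dict style of the inference API referenced above; the keys shown are common DeepSpeed inference config fields, but treat the exact set as an assumption.

```python
import torch
import deepspeed
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")

# the whole inference setup is expressed as one config dict
ds_config = {
    "dtype": torch.float16,
    "tensor_parallel": {"tp_size": 1},
    "replace_with_kernel_inject": True,
}
engine = deepspeed.init_inference(model, config=ds_config)
```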
Update the isinstance check inside the `replace_wo_policy` function to `tuple` and `str` instead of `dict`, since the layers are provided as a `tuple` type.
Co-authored-by: Lev Kurilenko <lekurile@microsoft.com>
Co-authored-by: Molly Smith <mosm@microsoft.com>
Co-authored-by: Lok Chand Koppaka <lokoppak@microsoft.com>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
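For the `replace_wo_policy` isinstance change above, a tiny illustrative sketch; the helper name and surrounding logic are assumptions, only the `(tuple, str)` check reflects the description.

```python
# Illustrative only: layer names reach replace_wo_policy as a tuple of names
# (or a single string), so membership checks must not assume a dict.
def contains_layer(layers, name):
    if isinstance(layers, str):
        layers = (layers,)
    assert isinstance(layers, tuple), "layers are expected as a tuple of names"
    return name in layers
```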
When loading a non-sharded checkpoint, update the progress bar (fix by @RezaYazdaniAminabadi) - I've just tested it and it works.
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
* [ds-inference] checkpoint loading => tqdm
Solves two issues:
- less noise, using a tqdm progress bar
- more informative: tells users how long to wait and how many shards to load
New way:
```
Loading 72 checkpoints: 12%|█▎ | 9/72 [01:12<08:39, 8.25s/it]
```
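A minimal sketch of the tqdm-based loading loop behind the output above, writing the progress bar from a single process; the loading helper itself is a stand-in.

```python
import torch
from tqdm import tqdm

def load_shards(checkpoint_files, rank=0):
    shards = []
    # disable=rank != 0 keeps only one process writing the progress bar
    for path in tqdm(checkpoint_files,
                     desc=f"Loading {len(checkpoint_files)} checkpoints",
                     disable=rank != 0):
        shards.append(torch.load(path, map_location="cpu"))  # stand-in loader
    return shards
```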
* write only from one process
* style
Co-authored-by: Quentin Anthony <qganthony@yahoo.com>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
* DeepSpeedInferenceConfig
get epsilon value from config
* epsilon -> layer_norm_eps
to keep the variable name the same as in DeepSpeedTransformerConfig
* DeepSpeedTransformerConfig
get epsilon value from config
* configurable stochastic_mode
e.g.:
1. True for LM pre-training
2. False for LM fine-tuning on a task
* Updated replace_module.py
check whether layer_norm_eps is an attribute of the config, defaulting to 1e-12 (see the sketch below)
Co-authored-by: Reza Yazdani <44502768+RezaYazdaniAminabadi@users.noreply.github.com>
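A sketch of the fallback described in the replace_module.py item above, assuming a Hugging Face style config object; 1e-12 matches the stated default, the helper name is illustrative.

```python
def get_layer_norm_eps(hf_config):
    # fall back to 1e-12 when the model config does not define layer_norm_eps
    return getattr(hf_config, "layer_norm_eps", 1e-12)
```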
* Fix typos in docs/
* Fix typos in code comments and output strings
* Fix typos in the code itself
* Fix typos in tests/
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
* fix inference API for FP32 and non-masking GPT-based models
* use a dummy tensor if input_mask is None (see the sketch after this block)
* fix input_mask
* minor fix
* send input_mask to the compute_attn function for checking
* fix links for inference tutorial
* Fix automatic injection. Add the local-attention for GPT-Neo
* fix the inference for generation of large sequences (>1K & <32K)
* fix format
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
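A minimal sketch of the dummy input_mask handling from the bullets above; the shapes, function name, and downstream use are illustrative.

```python
import torch

def prepare_attention_inputs(hidden_states, input_mask=None):
    # non-masking GPT-style models may pass no mask at all; substitute a dummy
    # all-zeros tensor so compute_attn-style code can check it uniformly
    if input_mask is None:
        batch, seq_len = hidden_states.shape[:2]
        input_mask = torch.zeros(batch, seq_len,
                                 dtype=hidden_states.dtype,
                                 device=hidden_states.device)
    return hidden_states, input_mask
```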
* fix the bias-add precision and indexing, and add layer-norm-eps as a configurable parameter for the transformer
* add ACC_HALF config
* use defined() to check whether ACC_HALF is defined