onnxruntime-extensions

Commit Graph

Author	SHA1	Message	Date
cao lei	95d65e4ec0	sync to flash attention kernel 2.5.9 and add document of how to write custom op (#757) * sync to flash attention kernel 2.5.9 * support users to overload GetMayInplace and ReleaseMayInplace * Undo the change for pybind11 dependency	2024-07-10 07:09:40 -07:00
Wenbing Li	97ee9eb56f	Refactor OrtxStatus to be header-only implmentation. (#720)	2024-05-17 15:40:11 -07:00
Tang, Cheng	3b889fc42f	update custom op v2 struct to be able to invoke from eager mode (#700) Co-authored-by: Cheng Tang <chenta@a100.crj0ad2y1kku1j4yxl4sj10o4e.gx.internal.cloudapp.net> Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>	2024-04-30 13:53:39 -07:00
Wenbing Li	f9290e8bac	Add a status class for future tokenizer API implementation (#690) * Add a status class for future API implementation * Update bpe_kernels.cc * fix the ios package pipeline * update mistral test model name	2024-04-18 21:12:14 -07:00
Wenbing Li	646462790b	Refactor the header file directory and integrate the eager tensor implementation (#689) * refactor the header file in include folder * fix the basic-token eager unit test case * a more flexible way to handle string tensor shape. * fix the unit test path issue * remove the multi-inherits to avoid issue during pointer casting * add api cmake build support * undo some temporary changes * code refinement * fix variadic arg * only expose the context for ort version >= 17 * fix a shape bug * fix the cuda build issue * change ifdef condition of GetAllocator * finalize the ort c abi wrapper file name * fix the iOS build break * align gtest version with triton * Update ext_apple_framework.cmake for iOS header files --------- Co-authored-by: Cheng Tang <chenta@a100.crj0ad2y1kku1j4yxl4sj10o4e.gx.internal.cloudapp.net>	2024-04-17 12:58:19 -07:00

5 Commits