onnxruntime-extensions

Commit graph

Author	SHA1	Message	Date
Wenbing Li	aa2c82fa67	Add the MLlama Imaging Processing Support (#823 ) * initial checkins for mllama image process * fix some tests * some fixings * add more image * More test assertions * parity test passed * code clean up * code refinement	2024-10-22 14:24:09 -07:00
Chester Liu	e424838708	Added support for native image decoding (#808 ) This added support for native image decoding on Windows & Apple platforms. This helps us remove libpng & libjpeg completely on these platforms, and in the meantime support more image formats thanks to OS vendors,	2024-09-26 09:17:55 +08:00
Wenbing Li	6b94f4d7a5	Fix the Unicode code discrepency on CLIP model (#814 ) * refine the code structure * more fixing on unicode * fix the codepoint 304 * add the clip tokenizer data files abck	2024-09-23 16:49:24 -07:00
Wenbing Li	176c1d0138	Support the Unigram tokenizer kind from sentencepiece library (#811 ) * initial commit * Ugm vocab loaded is good * test passed * fixes unit test on win32 * finish the parity check * code refinement * code refinement for review	2024-09-19 15:46:13 -07:00
Wenbing Li	1b80794903	Remove OpenCV dependency from C_API mode (#800 ) * Remove OpenCV dependency from C_API model * fix build on Windows * switch ci build flag * try to fix the macOS build issue * more fixing * fix the macOS build issue * list jpeg source * verified on MacOS * update the pp_api too * avoid the codecs library conflicts * Add the unit tests * move the codec test * add the missing dl lib for extensions test * refine the code * a smaller fixing for Windows Python	2024-09-04 16:50:05 -07:00
Sayan Shaw	7851b51ee3	Add initial tiktoken and Phi3SmallTokenizer support (#729 ) * add initial tiktoken support * add vector hash and equal for bpe ranks map * change lambda comparator * move phi-3-small files * final changes * move tiktoken files from data2 to data * add unit test * add tokenizer module * merge json and tiktoken impl * fix tiktoken encoding problem * address comments * remove dummy tokens --------- Co-authored-by: Sayan Shaw <sayanshaw@microsoft.com> Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>	2024-08-02 10:24:02 -07:00
Wenbing Li	c3145b8f52	add the decoder_prompt_id for whisper tokenizer (#775 ) * add the decoder_prompt_id for whisper tokenizer * temporarily disable android prebuilt * disable the prebuilt for android * disable the prebuilt for android 2 * Add a unit test * correct test ids	2024-07-29 14:21:17 -07:00
Wenbing Li	620050fbe0	reimplement resize cpu kernel for image processing (#768 ) * reimplement resize cpu kernel for image processing * accuracy fixing and code refinement * fix the build issues * fix Linux build issue * more fixings * Fix the pipeline issue * fix the ci script * try to fix CUDA machine pool	2024-07-23 15:40:52 -07:00
Wenbing Li	38a3d85f8f	switch cmake cmp0169 flag to new (#762 ) * switch cmake cmp0169 flag to new * the missing spm code. * more refinement on cmake build targets * Update ci.yml * Update ci.yml * update the jpg files after using libjpeg instead of libjpeg-turbo * exclude cutlass too * upgrade the protobuf library to be consistent with ORT * update the protoc generated files * use the right patch name * Update cutlass.cmake	2024-07-15 23:28:49 -07:00
Wenbing Li	8153bc1a3a	Feature extraction C API for whipser model (#755 ) * Feature extraction C API for whipser model * Update the docs * Update the docs2 * refine the code * fix some issues * fix the Linux build * fix more data consistency issue * More code refinements	2024-07-11 11:20:36 -07:00
Wenbing Li	cbed8fd575	Add a generic image processor and its C API (#745 ) * Add a generic image processor * add more tests * Fix the test failures * Update runner.hpp	2024-06-20 10:53:49 -07:00
Wenbing Li	474540d8a5	Fix the image processing output data discrepancy (#722 ) * some data calc fixing * Update image_transforms.hpp * really split the images * Update image_transforms.hpp	2024-05-20 12:44:48 -07:00
Wenbing Li	311dd35401	Add ImageProcessor for Multimodel model Pre-processing (#715 ) * only keep the image decoder from opencv * initial build * refine the code * Add clear functions * Update CMakeLists.txt * Update opencv.cmake * change the output type to float * get the result * align image-process with original Python * move the LoadRawImages into library * fix the calculation error * fix the pipeline build issue * fix the build breaks in ci pipeline * support json configuration file and refactor the code.	2024-05-15 14:35:14 -07:00
Wenbing Li	c58c930739	Ignore all streaming output of invalid utf-8 string (#704 ) * Ignore all streaming output of invalid utf-8 string * Update bpe_streaming.hpp * add the phi-3 tokenizer test * add a streaming test for phi-3 model * fix the utf-8 validation * fix the utf-8 validation 2 * fix the utf-8 validation 3 * fix the utf-8 validation 4	2024-05-06 16:46:55 -07:00
Wenbing Li	a8bce4328b	Add the tokenizer C ABI (#693 ) * initial checkins * fix the selectedops build failures * add the tokenization implementation * update the windows DEF file for c abi in cmake file * fix the build on linux * fix some warnings and remove the unused code * initial import of unit tests from tfmtok * add streaming API support * fix the merges loading issues * complete export from tfmtok - needs input id fixing * fix the unit test failures. * fix all unit test failure * refactor streaming code * remove the unused code --------- Co-authored-by: Sayan Shaw <sayanshaw@microsoft.com>	2024-04-29 16:45:49 -07:00
Sayan Shaw	a03eded71e	Add initial CUDA native UT (#625 ) * Add initial CUDA native UT * fix the build issue * fix other build error * add 30 mins to android packaging pipeline timeout due to early timing out * undo android pipeline timeout change - move to other PR * revert ifdef for testing ci * add if def for cuda * update ci ORT linux package name * update the package extraction path * Update ci.yml * Update ci.yml --------- Co-authored-by: Sayan Shaw <sayanshaw@microsoft.com> Co-authored-by: Wenbing Li <wenbingl@outlook.com> Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>	2024-01-13 15:34:16 -08:00
Rachel Guo	73e69bb779	Make `token_type_ids` optional output for HFBertTokenizer op (#583 ) * add initial changes for optional outpout for berttokenizer * add test tokenizer onnx model * update * fix * remove commented out code * fix extend list elements * fix * minor update * test * return optional in outputcharacteristic * update * update * debug * fix flag * remove print msg * minor update * update comments * update pr comments * minor update * minor update * fix --------- Co-authored-by: rachguo <rachguo@rachguos-Mac-mini.local> Co-authored-by: Sayan Shaw <52221015+sayanshaw24@users.noreply.github.com> Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net> Co-authored-by: Scott McKay <skottmckay@gmail.com>	2023-11-09 18:38:50 -08:00
Wenbing Li	e899da29d2	add more hf models into converter APIs (#562 )	2023-09-18 14:38:32 -07:00
Scott McKay	c81981b74c	Enable using system certs on Android. (#543 ) Disable in-memory cert store and loading certs from model. - TBD if it will be needed - need to know how reliable using the Android system certs will be and whether any scenarios need to have custom cert management.	2023-08-24 12:17:07 +10:00
Scott McKay	613c5c0c9d	Make Azure ops work on Android (#532 ) * Use in-memory certs for curl on Android - could not get curl+openssl to be able to use the system ones * Use static build for curl and openssl - smaller binary size * Cache the x509 certificate store so we don't need to re-create for every request. * Read certs from node attribute for now. --------- Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>	2023-08-23 12:34:40 +10:00
Scott McKay	eb5aef38fb	Make Azure op timeout an attribute (#539 ) * Add `timeout_seconds` attribute for per-node timeout. Defaults to (arbitrary) value of 15 seconds. * Fix datatype - onnx only has int64_t attributes. Update test model to validate timeout is read correctly.	2023-08-23 08:21:37 +10:00
RandySheriffH	d853d31fc1	Document azure ops. (#530 ) * doc ops (#529) * Try and make CIs pass with Azure ops enabled by default. Misc. other cleanups * Fix some CI issues. Cleanups some bits and pieces. * Fix a couple of issues. * Fix arg to build.bat * Increase warning in triton client build to make binskim happy (hopefully). * Try patching the warning level in the triton grpc branch as well. Shouldn't matter but... * Run triton patch command for windows as well. * Add patch.exe directly so windows builds work. * override auth gen for AOAI * fix build * switch to windows-static * update model for azure chat * document triton invoker * doc chat endpoint * document triton invoker * format * format * format --------- Co-authored-by: Scott McKay <Scott.McKay@microsoft.com> Co-authored-by: Randy Shuai <rashuai@microsoft.com> * address comments * move doc sect * typo * typo --------- Co-authored-by: Scott McKay <Scott.McKay@microsoft.com> Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-08-17 14:12:02 -07:00
Scott McKay	3b947b5580	Minor build and test setup fixes (#523 ) * Build fixes - zlib needs to come from vcpkg if azures ops are being built and opencv isn't enabled - set the IR version to 8 for some of the azure ops test models so they can be tested when ORT 1.14 is used - pass through new ort version value so that a consistent version is used to a) pull the ORT package for the c++ unit tests and b) disable azure ops if ORT version is too old. * Update to automatically chain package to avoid build errors during the install if cmake runs commands in parallel * Define simplified ORT_FILE for older ORT versions	2023-08-17 15:09:31 +10:00
Scott McKay	f77a3b8ad2	Update domain in triton test models (#519 ) * Update domain in triton test models * Use 'model_name' everywhere. Test py and model/op were inconsistent.	2023-08-12 12:40:21 +10:00
Scott McKay	2bde82fce9	Refactor setup for Azure ops. Add Android support. (#507 ) * Refactor setup for Azure ops to try and make common things more re-usable, and for the actual ops to simply layer in the specific input/output constraints for that type of request. Currently builds on Linux, Windows (x64 only) and Android Android requires a manual pre-build of openssl and curl. Linux requires a manual pre-install of openssl. Windows currently only works for x64. Other targets need the triplet adjusted. * Address PR comments * Fix could of android build warnings. * Update .gitignore to remove old path * Fix build break from merge	2023-08-08 19:54:30 +10:00
Sayan Shaw	997e9ee007	Add Falcon-7b and Falcon-40b tokenizer support (#510 ) * Add Falcon-7b and Falcon-40b tokenizer support * fix alignment and add tokenizer file in test/data to speed up compute --------- Co-authored-by: Sayan Shaw <sayanshaw@microsoft.com>	2023-08-07 14:37:57 -07:00
RandySheriffH	9e7f8e5b1d	Add UT for Azure Ops during packaging (#502 ) * set before-test * test cmd * clean in yml * restore toml * add ut for triton endpoints * reset working path * rename suffix * install ort * pip install * make env * add extra env * make executable * set dir for linux * add switch * set env default * skip tests * simplify env * clean env for official --------- Co-authored-by: Randy Shuai <rashuai@microsoft.com> Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>	2023-08-02 17:01:09 -07:00
Scott McKay	64f20828ce	Handle ONNX 1.14 in test scripts (#435 ) * Calculate and specify ir_version so we use the oldest possible for maximum compatibility * Don't use `ignore_unknown` in call to `find_min_ir_version_for` as it's only supported in the most recent ONNX release.	2023-05-12 07:13:37 +10:00
Wenbing Li	2fa0b710ea	Adding down-sampling and stereo mixing features for AudioDecoder (#420 ) * initial draft * second * third * polishing * fix the M_PI name in LINUX platform * fix bessel function issue * add a unit test case * fix the unit test name	2023-05-04 13:30:10 -07:00
JiCheng	db87dc416d	[object detection ppp] YoLo as example (#397 ) * object detection * Unit test add e2e fastestdet model test --------- Co-authored-by: Changming Sun <chasun@microsoft.com> Co-authored-by: Scott McKay <skottmckay@gmail.com>	2023-04-20 13:34:11 +08:00
JiCheng	154ead35a3	built-in bounding box op (#382 ) * built-in bounding box op * update boundary check * assert policy * more boundary test and check * XYXY--> X horizon --------- Co-authored-by: Scott McKay <skottmckay@gmail.com>	2023-04-12 19:35:53 +08:00
Wenbing Li	b5dce955f0	Add an audio decoder custom op for whisper end-to-end processing (#385 ) * evaluate the audio decoder library * MP3 Decoder * rename it to test_audio_codec * add the audio decoder to whisper model * whisper end-to-end draft * fix the mp3 decoder * Running with ONNX models * Add more audio format supports * refine the end-to-end script * Update operators/audio/audio_decoder.hpp Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com> * Update operators/audio/audio_decoder.hpp Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com> * Update operators/audio/audio_decoder.hpp Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com> * some fixings of comments and more test cases. * changes for review comments. * Update audio_decoder.hpp * Update audio_decoder.hpp * code refinement * Update operators/audio/audio_decoder.hpp Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com> --------- Co-authored-by: Sayan Shaw <52221015+sayanshaw24@users.noreply.github.com> Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>	2023-04-11 14:47:10 -07:00
Wenbing Li	9cd1284da8	Pre and Post processing example for openAI Whisper model (#380 ) * add a stft-norm custom op for log-mel spectrum. * undo the debug change * Support ONNX standard STFT op signature. * Add a unit test onnx STFT compatible mode. * add whisper pre-/post- processing example * Update dlib.cmake * undo test code changes * Update setup.cfg * update the end2end example with STFT op	2023-03-30 13:44:50 -07:00
Scott McKay	5e44a7c3c9	Add ability to prevent exception propagation if building as part of ORT when ORT has exceptions disabled (#368 ) * Add ability to prevent exception propagation with top level try/catch hander macros. If combined build with ORT has exceptions disabled in ORT but ort-ext has an operator that requires exceptions, we enable exceptions in ort-ext but prevent them propagating up via try/catch in the entry points that ORT can call - RegisterCustomOps - CustomOpBase constructor and Compute Removed some places in CustomOpApi that threw is OpKernelInfo* was nullptr but standardizing all kernels to store the OpKernelInfo provided in the ctor. Added unit tests - need to validate on more platforms and add CI for build where we don't want to allow exceptions to propagate * Update pyop * Update CMakeLists.txt Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com> * Update includes/exceptions.h Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com> * Update includes/exceptions.h Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com> * Update includes/onnxruntime_customop.hpp Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com> * Merge with main and update Address PR comments Fix some issues. * Delete local file * Fix pyop update * Add CI Address PR comments --------- Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com> Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>	2023-02-27 10:31:44 -08:00
JiCheng	b375cb57e6	support mobilebert_ppp (#354 ) * support mobilebert_ppp * renaming IOEntryValuePreserver * generalize argmax step --------- Co-authored-by: Scott McKay <Scott.McKay@microsoft.com>	2023-02-27 18:53:37 +08:00
Scott McKay	f3654e5bac	Fix opset 18 issues and bug due to ORT Resize issue (#362 ) * - Fix Split(18) requiring num_outputs. - Calculate `sizes` in Resize instead of using the simpler `scales` - ORT implementation does not round correctly when applying scales - Update center crop to use float so we are more accurate in choosing the crop area. - Fix minor issue with Debug step by only adding values that are altered to the renaming graph inputs. - Update unit tests expected output due to the change in Resize using sizes instead of scales. - Crop e2e example input so before/after image covers same area. * Simplify. CenteredCrop doesn't need to use float as it's dividing by 2 (so using float + floor gives the same result). Remove Resize impl using scales - we most likely will never go back to it. Address PR comments Update doc	2023-02-18 06:37:05 +10:00
Scott McKay	cd5ea11aaa	Move the pre/post processing scripts into the python module. (#349 ) * Move the pre/post processing scripts into the python module. Update usage/examples. * Use better version parsing. * Update tests, docs, * Address PR comments. Remove global Settings and pass onnx opset around directly where needed. Make PrePostProcessor the owner of the checker context.	2023-01-26 08:30:21 +10:00
Scott McKay	5d53f91f11	Minor opencv tweaks (#316 ) * Separate ops in operators/cv2 that do and do not require codecs so they're easier to include/exclude from a build. Remove jpeg2000 from opencv file formats. It costs 1MB and is (afaict) not a common format. Add ability to enable/disable cv2 ops to gen_selectedops.py. * Remove super resolution pre/post process ops that are no longer needed. * Replace super resolution e2e tutorial Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>	2022-12-20 13:47:17 +10:00
Scott McKay	230f0201d1	Updates to PPP model update scripts (#331 ) * - Fix incorrect weights matrix for conversion from YCbCr to BGR - Fix Resize when input is HWC or CHW by converting to NHWC/NCHW so bilinear interpolation is inferred. trilinear is inferred otherwise. - Not aware of a use case where we'd want to use trilinear - if one exists the temporary addition of the batch dim would need to be made explicit. - Allow Resize to use antialias if ONNX opset is 18 or higher - Skip adding unnecessary Mul if FloatToImageBytes multiplier is 1.0. - Make Debug step available at the top level Validated opset 18 Resize vs PT for super resolution and the results are now a really good match. * Update test data to use image that shows diffs better. Make super resolution output format configurable Address PR comments. * Adjust test for diffs on macos	2022-12-17 09:32:04 +10:00
Scott McKay	1cab9711ff	Starter changes for supporting pre/post processing for vision models. (#312 ) * Initial changes for supporting mobilenet and superresolution. - Script to update model with pre/post processing - custom ops for decode/encode - user just has to provide jpg or png bytes - superresolution can return the updated image in jpg or png - models for testing Updated cmake setup to enable building of the vision pre/post processing ops - opencv2 is treated as an internal dependency rather than the mechansim for selecting which operators to include. * Add extra check in decode.	2022-11-24 07:40:56 +10:00
shaahji	78d8dd5705	OpenCV Image Decoder & SuperResolution CustomOps	2022-09-30 12:08:38 -07:00
Adrian Lizarraga	a11c8128b2	Issue #279 : Fix bug that causes TryGetAttribute<std::string> to fail for valid inputs (#287 )	2022-09-08 17:14:11 -07:00
shaahji	abb0cf726f	Issue #230 , #226 : Add tests for BertTokenizer & HfBertTokenizer customop Includes a new onnx model for testing the HfBertTokenizer that accepts two strings as input and returns output compatible with inputs for pretrained HuggingFace models.	2022-09-06 15:08:16 -07:00
Wenbing Li	9bd453e0f1	initial checkins for onnxcompose (#185 ) * initial checkins for onnxcompose * update ci pipeline for the test. * add the missing quotes * Switch to looseVersion for torch version. * testif * padding_length * skip the gpt2 * add onnxruntime 1.9 test package. * fix a memory bug on pyop.	2021-11-22 21:02:39 -08:00
Mojimi	abdd5b1bd8	Add MaskedFill (#182 ) * add StringRemove * update Co-authored-by: Ze Tao <zetao@microsoft.com>	2021-11-04 09:29:17 +08:00
Mojimi	b5a8a1abd9	add test (#180 ) Co-authored-by: Ze Tao <zetao@microsoft.com>	2021-11-01 10:08:09 +08:00
Mojimi	537c492219	Add tests for mapping-related operators (#179 ) * init * finish vector_to_string * add more test Co-authored-by: Ze Tao <zetao@microsoft.com>	2021-10-29 09:00:06 +08:00
Zuwei Zhao	05f7ded825	Add check for empty input in StringJoin operator and fix empty string input error in BlingFire sentence breaker. (#175 ) * Add test cases and fix empty string error in BlingFire sentence breaker. * Throw error if input text to join is empty array. * Fix scalar support and access violation. * Resolve comments. * Resolve comments. Co-authored-by: Zuwei Zhao <zuzhao@microsoft.com>	2021-10-27 20:21:16 +08:00
Mojimi	64c972fb02	Add test for StringECMARegexReplace (#176 ) * add test for ECMARegex * add empty input test case Co-authored-by: Ze Tao <zetao@microsoft.com>	2021-10-26 10:28:06 +08:00
Mojimi	46d096f1af	Fix ::tolower error when locale is not 'C' (#174 ) * add test and implement tolower * fix locale * fix locale Co-authored-by: Ze Tao <zetao@microsoft.com>	2021-10-20 20:59:29 -07:00


63 commits