* Unify the image operations in extensions library
* fix the build configuration issue
* More build fixes
* Fix the native image codec
* fix encode_image
* Add bgr/rgb conversion for encoding image
* parity check
* build break
* update PNG encoding parameters
* build break on Linux
* using MSE to compare images
* fix the discrepancy between Linux and Windows
* final code refinement
* one more change
* fix the C++ warnings
---------
Co-authored-by: Sayan Shaw <52221015+sayanshaw24@users.noreply.github.com>
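For context on the BGR/RGB conversion and MSE-based parity check mentioned above, here is a minimal sketch in plain Python. It is not the code from this change: the function names, the interleaved 3-channel byte layout, and the threshold value are all assumptions made for illustration.

```python
# Illustrative sketch only: names, pixel layout and threshold are assumptions,
# not the implementation used in the extensions library.

def swap_red_blue(pixels: bytes) -> bytes:
    """Convert interleaved 3-channel pixels between BGR and RGB order."""
    if len(pixels) % 3 != 0:
        raise ValueError("expected interleaved 3-channel pixel data")
    out = bytearray(pixels)
    out[0::3] = pixels[2::3]  # first channel takes the old third channel
    out[2::3] = pixels[0::3]  # third channel takes the old first channel
    return bytes(out)


def mean_squared_error(a: bytes, b: bytes) -> float:
    """Mean of squared per-byte differences between two equally sized buffers."""
    if len(a) != len(b) or not a:
        raise ValueError("buffers must be non-empty and the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def images_match(a: bytes, b: bytes, threshold: float = 1.0) -> bool:
    """Treat two decoded images as equal if their MSE is within a tolerance.

    The threshold is a hypothetical value; a tolerance is useful because
    different codecs can decode lossy formats to slightly different pixels.
    """
    return mean_squared_error(a, b) <= threshold
```

Comparing with a tolerance rather than byte-for-byte equality is one way to keep a parity test stable when platforms decode the same file with different codecs.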
* upgrade ci matrix
* typo
* revert python version
* update python and ort range
* update python range
* update for macos and linux too
---------
Co-authored-by: Sayan Shaw <sayanshaw@microsoft.com>
* initial check-ins for mllama image processing
* fix some tests
* some fixes
* add more images
* More test assertions
* parity test passed
* code clean up
* code refinement
* Add general regex support
* add case 5 support instead of replacing with s+
* add more test cases
* address comments
* add back gpt2 and llama regex methods for efficiency
---------
Co-authored-by: Sayan Shaw <sayanshaw@microsoft.com>
* add(tutorials): exporting yolo world model
This allows us to export the YOLO-World ONNX model, which can later be used for mobile inference.
* add(tutorial): make classes optional
---------
Co-authored-by: Scott McKay <skottmckay@gmail.com>
* add compatibility docs
* continue updating the doc
* updating doc 2
* support sentence-piece add_dummy_prefix for all models
* revert the flag
* initialize the add_dummy_prefix for llama model
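As a rough illustration of what the SentencePiece `add_dummy_prefix` option does, the sketch below prepends the whitespace marker to the input before tokenization. It is a simplified model, not the code from this change: the function name is hypothetical, and other normalization steps (such as whitespace collapsing) are deliberately ignored.

```python
# Simplified sketch; apply_dummy_prefix is a hypothetical helper, and real
# SentencePiece normalization does more than this.

SPACE_MARKER = "\u2581"  # the "lower one eighth block" used as the space symbol


def apply_dummy_prefix(text: str, add_dummy_prefix: bool = True) -> str:
    """Escape spaces and optionally prepend the dummy whitespace prefix."""
    escaped = text.replace(" ", SPACE_MARKER)
    if add_dummy_prefix:
        # With the prefix, a word at the start of the text is encoded the
        # same way as the same word following a space.
        escaped = SPACE_MARKER + escaped
    return escaped
```

With the flag enabled, "hello world" becomes "▁hello▁world"; with it disabled, the leading marker is omitted.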
This adds support for native image decoding on Windows and Apple platforms.
It lets us remove libpng and libjpeg completely on these platforms, and
at the same time support more image formats through the OS-provided codecs.
* initial commit
* Ugm vocab loading works
* test passed
* fixes unit test on win32
* finish the parity check
* code refinement
* code refinement for review
* add C++ standard library regex support for GPT2 case
* reorder regex handling
* try without STL
* missing case
* add llama3 regex support
* add custom regex impl
* change regex based on model
* modify tests, add docs, and code cleanup
* add regex test and const strings
---------
Co-authored-by: Sayan Shaw <sayanshaw@microsoft.com>
* Remove OpenCV dependency from C_API model
* fix build on Windows
* switch ci build flag
* try to fix the macOS build issue
* more fixes
* fix the macOS build issue
* list jpeg source
* verified on macOS
* update the pp_api too
* avoid the codecs library conflicts
* Add the unit tests
* move the codec test
* add the missing dl lib for extensions test
* refine the code
* a small fix for Windows Python
* optimize the tokenizer for efficiency
* fix the unit test failures.
* fix the api test case failures
* removed the unused code.
* More test case fixes
* One more fix
* fix macOS build issues
* refine the test
* add more diagnosis info.
* fix unit test in CI Linux
* fix the pp_api test failure
* initial api for tokenizer
* More fixes and test data refinement
* add a simple wrapper for pre-processing APIs
* fix the test issues
* test if the tokenizer is spm based
* fix the failed test cases
* json pointer does not work
* support tokenizer build only in C API mode
* fix the python build.
* fix the selectedops build
---------
Co-authored-by: Sayan Shaw <52221015+sayanshaw24@users.noreply.github.com>
* Upgrade ESRP signing task from v2 to v5
* Upgrade ESRP signing task from v2 to v5 in win
---------
Co-authored-by: Sayan Shaw <52221015+sayanshaw24@users.noreply.github.com>
* add the decoder_prompt_id for whisper tokenizer
* temporarily disable android prebuilt
* disable the prebuilt for android
* disable the prebuilt for android 2
* Add a unit test
* correct test ids
* reimplement resize cpu kernel for image processing
* accuracy fix and code refinement
* fix the build issues
* fix Linux build issue
* more fixes
* Fix the pipeline issue
* fix the ci script
* try to fix CUDA machine pool