This adds support for native image decoding on Windows and Apple platforms.
This lets us remove libpng and libjpeg completely on these platforms and,
at the same time, support more image formats thanks to the OS vendors.
* initial commit
* UGM vocab loading works
* test passed
* fixes unit test on win32
* finish the parity check
* code refinement
* code refinement for review
* add C++ standard library regex support for GPT2 case
* reorder regex handling
* try without STL
* missing case
* add llama3 regex support
* add custom regex impl
* change regex based on model
* modify tests, add docs, and code cleanup
* add regex test and const strings
---------
Co-authored-by: Sayan Shaw <sayanshaw@microsoft.com>
* Remove OpenCV dependency from C_API model
* fix build on Windows
* switch ci build flag
* try to fix the macOS build issue
* more fixes
* fix the macOS build issue
* list jpeg source
* verified on MacOS
* update the pp_api too
* avoid the codecs library conflicts
* Add the unit tests
* move the codec test
* add the missing dl lib for extensions test
* refine the code
* a small fix for Windows Python
* optimize the tokenizer for efficiency
* fix the unit test failures.
* fix the api test case failures
* removed the unused code.
* More test case fixes
* One more fix
* fix macOS build issues
* refine the test
* add more diagnosis info.
* fix unit test in CI Linux
* fix the pp_api test failure
* initial api for tokenizer
* More fixes and test data refinement
* add a simple wrapper for pre-processing APIs
* fix the test issues
* test if the tokenizer is spm based
* fix the failed test cases
* json pointer does not work
* support tokenizer build only in C API mode
* fix the python build.
* fix the selectedops build
---------
Co-authored-by: Sayan Shaw <52221015+sayanshaw24@users.noreply.github.com>
* add the decoder_prompt_id for whisper tokenizer
* temporarily disable android prebuilt
* disable the prebuilt for android
* disable the prebuilt for android 2
* Add a unit test
* correct test ids
* reimplement resize cpu kernel for image processing
* accuracy fix and code refinement
* fix the build issues
* fix Linux build issue
* more fixes
* Fix the pipeline issue
* fix the ci script
* try to fix CUDA machine pool
* switch cmake cmp0169 flag to new
* the missing spm code.
* more refinement on cmake build targets
* Update ci.yml
* Update ci.yml
* update the jpg files after using libjpeg instead of libjpeg-turbo
* exclude cutlass too
* upgrade the protobuf library to be consistent with ORT
* update the protoc generated files
* use the right patch name
* Update cutlass.cmake
* Feature extraction C API for whisper model
* Update the docs
* Update the docs2
* refine the code
* fix some issues
* fix the Linux build
* fix more data consistency issues
* More code refinements
* first draft
* clang
* Draft for ScatterNDOfShape
* fix build
* disable test when cuda is missing
* fix implementation
* update test
* add MaskedScatterNdOfShape
* fix merge conflicts
* first draft for NegXPlus1
* complete
* fix unit test
* rename one test
* remove test if not cuda
---------
Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>
* only keep the image decoder from opencv
* initial build
* refine the code
* Add clear functions
* Update CMakeLists.txt
* Update opencv.cmake
* change the output type to float
* get the result
* align image-process with original Python
* move the LoadRawImages into library
* fix the calculation error
* fix the pipeline build issue
* fix the build breaks in ci pipeline
* support json configuration file and refactor the code.
* Ignore all streaming output of invalid utf-8 string
* Update bpe_streaming.hpp
* add the phi-3 tokenizer test
* add a streaming test for phi-3 model
* fix the utf-8 validation
* fix the utf-8 validation 2
* fix the utf-8 validation 3
* fix the utf-8 validation 4
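
The UTF-8 commits above deal with streaming output, where a chunk boundary can fall in the middle of a multi-byte character. The following is a minimal Python sketch of that general idea, not the code in `bpe_streaming.hpp`: incomplete sequences are held back until the remaining bytes arrive, and bytes that can never form valid UTF-8 are dropped. The `stream_decode` helper is a hypothetical name used only for illustration.

```python
import codecs


def stream_decode(chunks):
    """Yield the text decoded from each byte chunk.

    Hypothetical helper: an incremental decoder buffers an incomplete
    multi-byte sequence until later chunks complete it, and
    errors="ignore" silently drops bytes that are invalid UTF-8.
    """
    decoder = codecs.getincrementaldecoder("utf-8")(errors="ignore")
    for chunk in chunks:
        # final=False tells the decoder that more bytes may follow.
        yield decoder.decode(chunk, final=False)
```

With this sketch, a three-byte character split across two chunks yields an empty string for the first chunk and the full character for the second, so a consumer never sees a partial character.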
* initial checkins
* fix the selectedops build failures
* add the tokenization implementation
* update the windows DEF file for c abi in cmake file
* fix the build on linux
* fix some warnings and remove the unused code
* initial import of unit tests from tfmtok
* add streaming API support
* fix the merges loading issues
* complete export from tfmtok - needs input id fixing
* fix the unit test failures.
* fix all unit test failures
* refactor streaming code
* remove the unused code
---------
Co-authored-by: Sayan Shaw <sayanshaw@microsoft.com>
* add UT for neg_pos_cuda in eager mode and fix build break in Windows
* fix Linux build break
* adjust argument and path
* remove old cudaContext
* add ort cuda test back
* fix cuda tests
* undo debug code
* undo useless change
---------
Co-authored-by: jslhcl <jslhcl@gmail.com>
Co-authored-by: Cheng Tang <chenta@a100.crj0ad2y1kku1j4yxl4sj10o4e.gx.internal.cloudapp.net>
Co-authored-by: Sayan Shaw <52221015+sayanshaw24@users.noreply.github.com>
* refactor the header file in include folder
* fix the basic-token eager unit test case
* a more flexible way to handle string tensor shape.
* fix the unit test path issue
* remove multiple inheritance to avoid issues during pointer casting
* add api cmake build support
* undo some temporary changes
* code refinement
* fix variadic arg
* only expose the context for ort version >= 17
* fix a shape bug
* fix the cuda build issue
* change ifdef condition of GetAllocator
* finalize the ort c abi wrapper file name
* fix the iOS build break
* align gtest version with triton
* Update ext_apple_framework.cmake for iOS header files
---------
Co-authored-by: Cheng Tang <chenta@a100.crj0ad2y1kku1j4yxl4sj10o4e.gx.internal.cloudapp.net>
* Unify the spm/bpe tokenizers
* fix the build error
* fix the decoding issue
* add model name in exported onnx
* fixing the unit tests
* revert the unnecessary file format changes
* Add initial CUDA native UT
* fix the build issue
* fix other build error
* add 30 mins to android packaging pipeline timeout due to early timing out
* undo android pipeline timeout change - move to other PR
* revert ifdef for testing ci
* add if def for cuda
* update ci ORT linux package name
* update the package extraction path
* Update ci.yml
* Update ci.yml
---------
Co-authored-by: Sayan Shaw <sayanshaw@microsoft.com>
Co-authored-by: Wenbing Li <wenbingl@outlook.com>
Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>
* Add support for YOLO v8 Pose post-processing.
The output has additional values in the 'mask' data for the keypoints.
- Update the post processing steps to support extracting and scaling the keypoints.
- Simplify the existing step to split out the boxes and scores by using a basic Split operator if there is no confidence score for a bounding box to apply to the class scores.
  - A confidence score for a bounding box is only present in YOLO versions prior to 8.
- Update existing tests
TODO: Add unit tests for new Steps. They have been manually validated with the real model for now.
* Changes to support pre-decoded input.
Needs cleanup.
* Support an overall max number of detections as well as per-class detections.
* Expand Identity to support multiple inputs
Fix issue with incorrect score being selected by NMS (was the max score and not the score for the selected class)
Fix TopK usage so result ordering is consistent when it is not used
Add unit tests.
* Update docs and some cleanups
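
The NMS score fix above can be illustrated with a small Python sketch. It is not the post-processing step from this change. It assumes a simplified layout in which `class_scores[b][c]` holds the score of class `c` for box `b`, and NMS returns `(box, class)` pairs. Both function names are hypothetical.

```python
def max_scores(class_scores, selected):
    """Behavior described as the bug: report each box's best score,
    regardless of which class NMS actually selected it for."""
    return [max(class_scores[box]) for box, _ in selected]


def selected_class_scores(class_scores, selected):
    """Behavior described as the fix: report the score of the class
    that NMS selected for each box."""
    return [class_scores[box][cls] for box, cls in selected]


# Box 0 is selected for class 1, although class 0 scores higher on it.
class_scores = [[0.9, 0.6], [0.2, 0.7]]
selected = [(0, 1), (1, 1)]
```

For box 0 the two functions disagree: the max-based version reports 0.9, which belongs to class 0, while the selection was made for class 1 with a score of 0.6.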
* Use Union
* support float16
* add ut for float16
* support bfloat16
* refactor
* fetch prop
* ifdef f16
* remove header
* ifdef cuda
* typename mapped type
---------
Co-authored-by: Randy Shuai <rashuai@microsoft.com>
* add an experimental CUDA python unit test pipeline
* typo
* in ci.yml?
* winpycuda
* move it in optional
* enable cuda pytest in linuxbuild
* build in docker
* add the cuda pytest for windows
* cuda flag fix
* minor fix
* typo
---------
Co-authored-by: Yi Zhang <zhanyi@microsoft.com>
* cmake fast gelu
* bridge func and cuda kernel
* tune ut
* fix build warning
* fix format
* tune ut
* drop OCOS_ENABLE_CONTRIB
* tune cmake
---------
Co-authored-by: Randy Shuai <rashuai@microsoft.com>
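
For reference, a minimal Python sketch of what a fast GELU computes, assuming the kernel uses the widely used tanh approximation of GELU. The commits above do not state the formula, so treat this as an assumption and not as a description of the CUDA kernel. The `fast_gelu` name is illustrative.

```python
import math


def fast_gelu(x: float) -> float:
    """Tanh approximation of GELU (assumed formula, see note above):
    0.5 * x * (1 + tanh(sqrt(2 / pi) * (x + 0.044715 * x^3)))
    """
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))
```

A scalar reference like this can be handy when tuning unit tests, since kernel outputs can be compared against it within a tolerance.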