* initial checkins for mllama image process
* fix some tests
* some fixes
* add more image
* More test assertions
* parity test passed
* code clean up
* code refinement
This adds support for native image decoding on Windows and Apple platforms.
It lets us remove libpng and libjpeg completely on these platforms and,
at the same time, support more image formats through the OS-provided decoders.
* initial commit
* Ugm vocab loads correctly
* test passed
* fixes unit test on win32
* finish the parity check
* code refinement
* code refinement for review
* Remove OpenCV dependency from C_API model
* fix build on Windows
* switch ci build flag
* try to fix the macOS build issue
* more fixes
* fix the macOS build issue
* list jpeg source
* verified on MacOS
* update the pp_api too
* avoid the codecs library conflicts
* Add the unit tests
* move the codec test
* add the missing dl lib for extensions test
* refine the code
* a small fix for Windows Python
* add the decoder_prompt_id for whisper tokenizer
* temporarily disable android prebuilt
* disable the prebuilt for android
* disable the prebuilt for android 2
* Add a unit test
* correct test ids
* reimplement resize cpu kernel for image processing
* accuracy fixes and code refinement
* fix the build issues
* fix Linux build issue
* more fixes
* Fix the pipeline issue
* fix the ci script
* try to fix CUDA machine pool
* switch cmake cmp0169 flag to new
* the missing spm code.
* more refinement on cmake build targets
* Update ci.yml
* Update ci.yml
* update the jpg files after using libjpeg instead of libjpeg-turbo
* exclude cutlass too
* upgrade the protobuf library to be consistent with ORT
* update the protoc generated files
* use the right patch name
* Update cutlass.cmake
* Feature extraction C API for Whisper model
* Update the docs
* Update the docs2
* refine the code
* fix some issues
* fix the Linux build
* fix more data consistency issues
* More code refinements
* only keep the image decoder from opencv
* initial build
* refine the code
* Add clear functions
* Update CMakeLists.txt
* Update opencv.cmake
* change the output type to float
* get the result
* align image-process with original Python
* move the LoadRawImages into library
* fix the calculation error
* fix the pipeline build issue
* fix the build breaks in ci pipeline
* support json configuration file and refactor the code.
* Ignore all streaming output of invalid utf-8 string
* Update bpe_streaming.hpp
* add the phi-3 tokenizer test
* add a streaming test for phi-3 model
* fix the utf-8 validation
* fix the utf-8 validation 2
* fix the utf-8 validation 3
* fix the utf-8 validation 4
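Illustration only, not the code in bpe_streaming.hpp: a minimal Python sketch of the idea behind the streaming UTF-8 commits above, assuming the goal is to emit text only once the accumulated token bytes form valid UTF-8 and to drop bytes that can never become valid. The class name `StreamingUtf8Filter` and its `feed` method are hypothetical.

```python
import codecs


class StreamingUtf8Filter:
    """Hold back bytes until they form valid UTF-8; drop bytes that never will."""

    def __init__(self):
        # errors="ignore" discards invalid bytes; the incremental decoder
        # buffers an incomplete multi-byte sequence between calls.
        self._decoder = codecs.getincrementaldecoder("utf-8")(errors="ignore")

    def feed(self, token_bytes: bytes) -> str:
        """Return the text that is safe to emit after this token (may be '')."""
        return self._decoder.decode(token_bytes, final=False)


stream = StreamingUtf8Filter()
# The first two chunks together encode one 3-byte character; 0xFF is never valid.
pieces = [stream.feed(chunk) for chunk in (b"\xe4\xbd", b"\xa0", b"\xffok")]
```

Here the first call emits nothing because the character is still incomplete, and the invalid byte in the last chunk is skipped rather than surfaced to the caller.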
* initial checkins
* fix the selectedops build failures
* add the tokenization implementation
* update the windows DEF file for c abi in cmake file
* fix the build on linux
* fix some warnings and remove the unused code
* initial import of unit tests from tfmtok
* add streaming API support
* fix the merges loading issues
* complete export from tfmtok - needs input id fixing
* fix the unit test failures.
* fix all unit test failures
* refactor streaming code
* remove the unused code
---------
Co-authored-by: Sayan Shaw <sayanshaw@microsoft.com>
* Add initial CUDA native UT
* fix the build issue
* fix other build error
* add 30 mins to android packaging pipeline timeout due to early timing out
* undo android pipeline timeout change - move to other PR
* revert ifdef for testing ci
* add if def for cuda
* update ci ORT linux package name
* update the package extraction path
* Update ci.yml
* Update ci.yml
---------
Co-authored-by: Sayan Shaw <sayanshaw@microsoft.com>
Co-authored-by: Wenbing Li <wenbingl@outlook.com>
Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>
Disable in-memory cert store and loading certs from model.
- TBD if it will be needed - need to know how reliable using the Android system certs will be and whether any scenarios need to have custom cert management.
* Use in-memory certs for curl on Android
- could not get curl+openssl to be able to use the system ones
* Use static build for curl and openssl
- smaller binary size
* Cache the x509 certificate store so we don't need to re-create for every request.
* Read certs from node attribute for now.
---------
Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>
* Add `timeout_seconds` attribute for per-node timeout. Defaults to (arbitrary) value of 15 seconds.
* Fix datatype - onnx only has int64_t attributes.
Update test model to validate timeout is read correctly.
* doc ops (#529)
* Try and make CIs pass with Azure ops enabled by default.
Misc. other cleanups
* Fix some CI issues.
Clean up some bits and pieces.
* Fix a couple of issues.
* Fix arg to build.bat
* Increase warning in triton client build to make binskim happy (hopefully).
* Try patching the warning level in the triton grpc branch as well. Shouldn't matter but...
* Run triton patch command for windows as well.
* Add patch.exe directly so windows builds work.
* override auth gen for AOAI
* fix build
* switch to windows-static
* update model for azure chat
* document triton invoker
* doc chat endpoint
* document triton invoker
* format
* format
* format
---------
Co-authored-by: Scott McKay <Scott.McKay@microsoft.com>
Co-authored-by: Randy Shuai <rashuai@microsoft.com>
* address comments
* move doc sect
* typo
* typo
---------
Co-authored-by: Scott McKay <Scott.McKay@microsoft.com>
Co-authored-by: Randy Shuai <rashuai@microsoft.com>
* Build fixes
- zlib needs to come from vcpkg if Azure ops are being built and opencv isn't enabled
- set the IR version to 8 for some of the azure ops test models so they can be tested when ORT 1.14 is used
- pass through new ort version value so that a consistent version is used to a) pull the ORT package for the c++ unit tests and b) disable azure ops if ORT version is too old.
* Update to automatically chain package to avoid build errors during the install if cmake runs commands in parallel
* Define simplified ORT_FILE for older ORT versions
* Refactor setup for Azure ops to try and make common things more re-usable, and for the actual ops to simply layer in the specific input/output constraints for that type of request.
Currently builds on Linux, Windows (x64 only) and Android
Android requires a manual pre-build of openssl and curl.
Linux requires a manual pre-install of openssl.
Windows currently only works for x64. Other targets need the triplet adjusted.
* Address PR comments
* Fix a couple of Android build warnings.
* Update .gitignore to remove old path
* Fix build break from merge
* Add Falcon-7b and Falcon-40b tokenizer support
* fix alignment and add tokenizer file in test/data to speed up compute
---------
Co-authored-by: Sayan Shaw <sayanshaw@microsoft.com>
* set before-test
* test cmd
* clean in yml
* restore toml
* add ut for triton endpoints
* reset working path
* rename suffix
* install ort
* pip install
* make env
* add extra env
* make executable
* set dir for linux
* add switch
* set env default
* skip tests
* simplify env
* clean env for official
---------
Co-authored-by: Randy Shuai <rashuai@microsoft.com>
Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>
* Calculate and specify ir_version so we use the oldest possible for maximum compatibility
* Don't use `ignore_unknown` in call to `find_min_ir_version_for` as it's only supported in the most recent ONNX release.
* initial draft
* second
* third
* polishing
* fix the M_PI name in LINUX platform
* fix bessel function issue
* add a unit test case
* fix the unit test name
* object detection
* Unit test
add e2e fastestdet model test
---------
Co-authored-by: Changming Sun <chasun@microsoft.com>
Co-authored-by: Scott McKay <skottmckay@gmail.com>
* built-in bounding box op
* update boundary check
* assert policy
* more boundary test and check
* XYXY--> X horizon
---------
Co-authored-by: Scott McKay <skottmckay@gmail.com>
* evaluate the audio decoder library
* MP3 Decoder
* rename it to test_audio_codec
* add the audio decoder to whisper model
* whisper end-to-end draft
* fix the mp3 decoder
* Running with ONNX models
* Add support for more audio formats
* refine the end-to-end script
* Update operators/audio/audio_decoder.hpp
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
* Update operators/audio/audio_decoder.hpp
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
* Update operators/audio/audio_decoder.hpp
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
* some fixes for comments and more test cases.
* changes for review comments.
* Update audio_decoder.hpp
* Update audio_decoder.hpp
* code refinement
* Update operators/audio/audio_decoder.hpp
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
---------
Co-authored-by: Sayan Shaw <52221015+sayanshaw24@users.noreply.github.com>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
* add a stft-norm custom op for log-mel spectrum.
* undo the debug change
* Support ONNX standard STFT op signature.
* Add a unit test for ONNX STFT compatible mode.
* add whisper pre-/post- processing example
* Update dlib.cmake
* undo test code changes
* Update setup.cfg
* update the end2end example with STFT op
* Add ability to prevent exception propagation with top-level try/catch handler macros.
If a combined build with ORT has exceptions disabled in ORT but ort-ext has an operator that requires exceptions, we enable exceptions in ort-ext but prevent them from propagating up via try/catch in the entry points that ORT can call:
- RegisterCustomOps
- CustomOpBase constructor and Compute
Removed some places in CustomOpApi that threw if OpKernelInfo* was nullptr by standardizing all kernels to store the OpKernelInfo provided in the ctor.
Added unit tests
- need to validate on more platforms and add CI for build where we don't want to allow exceptions to propagate
* Update pyop
* Update CMakeLists.txt
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
* Update includes/exceptions.h
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
* Update includes/exceptions.h
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
* Update includes/onnxruntime_customop.hpp
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
* Merge with main and update
Address PR comments
Fix some issues.
* Delete local file
* Fix pyop update
* Add CI
Address PR comments
---------
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>
* - Fix Split(18) requiring num_outputs.
- Calculate `sizes` in Resize instead of using the simpler `scales`
- ORT implementation does not round correctly when applying scales
- Update center crop to use float so we are more accurate in choosing the crop area.
- Fix minor issue with Debug step by only adding values that are altered to the renaming graph inputs.
- Update unit tests expected output due to the change in Resize using sizes instead of scales.
- Crop e2e example input so before/after image covers same area.
* Simplify.
CenteredCrop doesn't need to use float as it's dividing by 2 (so using float + floor gives the same result).
Remove Resize impl using scales - we most likely will never go back to it.
Address PR comments
Update doc
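Illustration only: a minimal Python sketch of the two calculations described above, assuming the Resize step targets a fixed length for the shorter side and rounds half up. The function names and the rounding mode are assumptions for this sketch, not the project's code.

```python
import math


def resize_sizes(height: int, width: int, target_short_side: int) -> tuple:
    """Compute explicit output sizes instead of handing a scale to Resize."""
    scale = target_short_side / min(height, width)
    # Round half up explicitly so the result does not depend on how the
    # runtime rounds when it applies `scales` itself.
    new_h = int(math.floor(height * scale + 0.5))
    new_w = int(math.floor(width * scale + 0.5))
    return new_h, new_w


def centered_crop_start(size: int, crop: int) -> int:
    """Start offset of a centered crop; integer division is sufficient."""
    return (size - crop) // 2


new_h, new_w = resize_sizes(480, 640, 256)
left = centered_crop_start(new_w, 224)
```

Because the crop offset is a division by 2 of a non-negative integer, float division followed by floor gives the same value as integer division, which is why the float path could be dropped.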
* Move the pre/post processing scripts into the python module.
Update usage/examples.
* Use better version parsing.
* Update tests and docs.
* Address PR comments.
Remove global Settings and pass onnx opset around directly where needed. Make PrePostProcessor the owner of the checker context.
* Separate ops in operators/cv2 that do and do not require codecs so they're easier to include/exclude from a build.
Remove jpeg2000 from opencv file formats. It costs 1MB and is (afaict) not a common format.
Add ability to enable/disable cv2 ops to gen_selectedops.py.
* Remove super resolution pre/post process ops that are no longer needed.
* Replace super resolution e2e tutorial
Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>
* - Fix incorrect weights matrix for conversion from YCbCr to BGR
- Fix Resize when input is HWC or CHW by converting to NHWC/NCHW so bilinear interpolation is inferred. Trilinear is inferred otherwise.
- Not aware of a use case where we'd want to use trilinear - if one exists the temporary addition of the batch dim would need to be made explicit.
- Allow Resize to use antialias if ONNX opset is 18 or higher
- Skip adding unnecessary Mul if FloatToImageBytes multiplier is 1.0.
- Make Debug step available at the top level
Validated opset 18 Resize vs PT for super resolution and the results are now a really good match.
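Illustration only: a per-pixel Python sketch of a YCbCr to BGR conversion, assuming the full-range BT.601 coefficients used by JFIF/JPEG. The commit does not state which matrix the fix uses, so the coefficients and the function name `ycbcr_to_bgr` are assumptions.

```python
def ycbcr_to_bgr(y: float, cb: float, cr: float) -> tuple:
    """Convert one full-range YCbCr pixel to 8-bit BGR (assumed JFIF BT.601)."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)

    def clamp(v: float) -> int:
        # Round to nearest and keep the value inside the uint8 range.
        return max(0, min(255, int(round(v))))

    # Note the channel order: blue first, as BGR consumers expect.
    return clamp(b), clamp(g), clamp(r)


gray = ycbcr_to_bgr(128, 128, 128)
reddish = ycbcr_to_bgr(76, 85, 255)
```

A neutral chroma value (Cb = Cr = 128) leaves all three channels equal to Y, which is a quick sanity check for any weights matrix.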
* Update test data to use image that shows diffs better.
Make super resolution output format configurable
Address PR comments.
* Adjust test for diffs on macos
* Initial changes for supporting mobilenet and superresolution.
- Script to update model with pre/post processing
- custom ops for decode/encode
- user just has to provide jpg or png bytes
- superresolution can return the updated image in jpg or png
- models for testing
Updated cmake setup to enable building of the vision pre/post processing ops
- opencv2 is treated as an internal dependency rather than the mechanism for selecting which operators to include.
* Add extra check in decode.
Includes a new onnx model for testing the HfBertTokenizer that accepts
two strings as input and returns output compatible with inputs for
pretrained HuggingFace models.
* initial checkins for onnxcompose
* update ci pipeline for the test.
* add the missing quotes
* Switch to LooseVersion for torch version.
* testif
* padding_length
* skip the gpt2
* add onnxruntime 1.9 test package.
* fix a memory bug on pyop.
* Add test cases and fix empty string error in BlingFire sentence breaker.
* Throw error if input text to join is empty array.
* Fix scalar support and access violation.
* Resolve comments.
* Resolve comments.
Co-authored-by: Zuwei Zhao <zuzhao@microsoft.com>