* Add `timeout_seconds` attribute for per-node timeout. Defaults to an (arbitrary) value of 15 seconds.
* Fix datatype - onnx only has int64_t attributes.
Update test model to validate timeout is read correctly.
* Update ci.yml for Azure Pipelines
* Fix the command lines
* is requirements-dev.txt
* switch to windows
* Update windows task
* Update ci.yml for Azure Pipelines
* add cmake in path on windows
* Update ci.yml for Azure Pipelines
* add explicit azure python build flag
---------
Co-authored-by: Sayan Shaw <52221015+sayanshaw24@users.noreply.github.com>
* doc ops (#529)
* Try and make CIs pass with Azure ops enabled by default.
Misc. other cleanups
* Fix some CI issues.
Clean up some bits and pieces.
* Fix a couple of issues.
* Fix arg to build.bat
* Increase warning in triton client build to make binskim happy (hopefully).
* Try patching the warning level in the triton grpc branch as well. Shouldn't matter but...
* Run triton patch command for windows as well.
* Add patch.exe directly so windows builds work.
* override auth gen for AOAI
* fix build
* switch to windows-static
* update model for azure chat
* document triton invoker
* doc chat endpoint
* document triton invoker
* format
* format
* format
---------
Co-authored-by: Scott McKay <Scott.McKay@microsoft.com>
Co-authored-by: Randy Shuai <rashuai@microsoft.com>
* address comments
* move doc sect
* typo
* typo
---------
Co-authored-by: Scott McKay <Scott.McKay@microsoft.com>
Co-authored-by: Randy Shuai <rashuai@microsoft.com>
* Build fixes
- zlib needs to come from vcpkg if Azure ops are being built and opencv isn't enabled
- set the IR version to 8 for some of the azure ops test models so they can be tested when ORT 1.14 is used
- pass through new ort version value so that a consistent version is used to a) pull the ORT package for the c++ unit tests and b) disable azure ops if ORT version is too old.
* Update to automatically chain package to avoid build errors during the install if cmake runs commands in parallel
* Define simplified ORT_FILE for older ORT versions
* - Ensure we log an error message before throwing on Android
    - message in exception will be lost due to how the shared libraries are built (both onnxruntime and extensions use static libc++ so there are no shared exception types between them)
  - support static or dynamic build of curl/openssl on Android
    - TBD which we want to use.
  - add infra for anything deriving from BaseKernel to log messages using the ORT logger
    - ensures messages from custom kernels end up in the same place as messages from ORT
* Fix GPT2 and Falcon tokenizer cvt for AutoTokenizer imp
* fix fast tokenizer issue
* small fix
* use slow tokenizer in test script
---------
Co-authored-by: Sayan Shaw <sayanshaw@microsoft.com>
* Fix LNK4098 warning from sentencepiece forcibly changing the build flags.
> LINK : warning LNK4098: defaultlib 'LIBCMT' conflicts with use of other libs; use /NODEFAULTLIB:library
* Use CMAKE_MSVC_RUNTIME_LIBRARY to determine whether /MT should be used.
* Add TrieTokenizer for RWKV-like LLM models
* add more tests
* fix the windows build
* download the vocab file instead of checking it in
* a small bug fix
* Refactor setup for Azure ops to try and make common things more re-usable, and for the actual ops to simply layer in the specific input/output constraints for that type of request.
Currently builds on Linux, Windows (x64 only) and Android
Android requires a manual pre-build of openssl and curl.
Linux requires a manual pre-install of openssl.
Windows currently only works for x64. Other targets need the triplet adjusted.
* Address PR comments
* Fix a couple of Android build warnings.
* Update .gitignore to remove old path
* Fix build break from merge
* Add Falcon-7b and Falcon-40b tokenizer support
* fix alignment and add tokenizer file in test/data to speed up compute
---------
Co-authored-by: Sayan Shaw <sayanshaw@microsoft.com>
- ifdef out some test code that requires RE2 if RE2 is not enabled
- add ability to plugin custom output validator for C++ unit tests
- OpenAI responses can have different punctuation, so exact matching is not reliable. Used in the new unit tests that will be in the refactoring PR
* set before-test
* test cmd
* clean in yml
* restore toml
* add ut for triton endpoints
* reset working path
* rename suffix
* install ort
* pip install
* make env
* add extra env
* make executable
* set dir for linux
* add switch
* set env default
* skip tests
* simplify env
* clean env for official
---------
Co-authored-by: Randy Shuai <rashuai@microsoft.com>
Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>
* Update ci.yml for Azure Pipelines
Fix the macOS CI pipeline.
* Update ci.yml for Azure Pipelines
* Update ci.yml for Azure Pipelines
* drop Python 3.8 support on macOS due to ADO
* fix macos wheel pipeline
* revert the change to add 3.9 back.
* Nodes can be called concurrently, so Compute needs to be stateless.
Update the kernels to make Compute const.
* Fix test that uses ustring.h.
Would be better to not have duplicate declarations for GetTensorMutableDataString and FillTensorDataString in ustring.h and string_tensor.h.
* initial checkins
* test pass
* basic impl
* first unit test pass
* merge error
* refine a little bit
* add more unit test
* fix unit test
* Fix the unit test.
* add one more whisper audiodecoder test case
* update the docs
* More updates
* upgrade gradle wrapper to unblock android CI
* Revert "upgrade gradle wrapper to unblock android CI"
This reverts commit 4df4f253aa.
* set timeout of gradlew
* "re-update it"
* Update gradle-wrapper.properties
* using gradle 8.0.1
* reformat the files
* fix the build.gradle issue
* one more fix
* delay gradle starts
* rerun the pipeline
* add perf changes for CLIP and Roberta
* add perf improvement for BERT
* remove global var
---------
Co-authored-by: Sayan Shaw <sayanshaw@microsoft.com>
* use the latest ORT header instead of the minimum compatible headers
* Update ext_ortlib.cmake
* Update ortcustomops.def
* change the default ORT API version value
Pass `-gpu swiftshader_indirect` emulator args as a workaround to get the Android emulator running on the macOS 13 hosted agents.
See https://github.com/actions/runner-images/issues/7671
Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>