Disable in-memory cert store and loading certs from model.
- TBD if it will be needed - need to know how reliable using the Android system certs will be and whether any scenarios need to have custom cert management.
* Use in-memory certs for curl on Android
- could not get curl+openssl to use the Android system certs
* Use static build for curl and openssl
- smaller binary size
* Cache the x509 certificate store so we don't need to re-create it for every request.
* Read certs from node attribute for now.
---------
Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>
* Add `timeout_seconds` attribute for per-node timeout. Defaults to an (arbitrary) value of 15 seconds.
* Fix datatype - ONNX integer attributes are always int64_t.
Update test model to validate timeout is read correctly.
* doc ops (#529)
* Try and make CIs pass with Azure ops enabled by default.
Misc. other cleanups
* Fix some CI issues.
Clean up some bits and pieces.
* Fix a couple of issues.
* Fix arg to build.bat
* Increase warning level in triton client build to make binskim happy (hopefully).
* Try patching the warning level in the triton grpc branch as well. Shouldn't matter but...
* Run triton patch command for windows as well.
* Add patch.exe directly so windows builds work.
* override auth gen for AOAI
* fix build
* switch to windows-static
* update model for azure chat
* document triton invoker
* doc chat endpoint
* document triton invoker
* format
* format
* format
---------
Co-authored-by: Scott McKay <Scott.McKay@microsoft.com>
Co-authored-by: Randy Shuai <rashuai@microsoft.com>
* address comments
* move doc sect
* typo
* typo
---------
Co-authored-by: Scott McKay <Scott.McKay@microsoft.com>
Co-authored-by: Randy Shuai <rashuai@microsoft.com>
* Ensure we log an error message before throwing on Android
- message in exception will be lost due to how the shared libraries are built (both onnxruntime and extensions use static libc++ so there are no shared exception types between them)
* Support static or dynamic build of curl/openssl on Android
- TBD which we want to use.
* Add infra for anything deriving from BaseKernel to log messages using the ORT logger
- ensures messages from custom kernels end up in the same place as messages from ORT
* Refactor setup for Azure ops to try and make common things more re-usable, and for the actual ops to simply layer in the specific input/output constraints for that type of request.
Currently builds on Linux, Windows (x64 only) and Android
Android requires a manual pre-build of openssl and curl.
Linux requires a manual pre-install of openssl.
Windows currently only works for x64. Other targets need the triplet adjusted.
* Address PR comments
* Fix a couple of Android build warnings.
* Update .gitignore to remove old path
* Fix build break from merge
* set before-test
* test cmd
* clean in yml
* restore toml
* add ut for triton endpoints
* reset working path
* rename suffix
* install ort
* pip install
* make env
* add extra env
* make executable
* set dir for linux
* add switch
* set env default
* skip tests
* simplify env
* clean env for official
---------
Co-authored-by: Randy Shuai <rashuai@microsoft.com>
Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com>
* Nodes can be called concurrently, so Compute needs to be stateless.
Update the kernels to make Compute const.
* Fix test that uses ustring.h.
Would be better to not have duplicate declarations for GetTensorMutableDataString and FillTensorDataString in ustring.h and string_tensor.h.
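
Illustration of the stateless `Compute` change noted above. This is a minimal sketch only: `EchoKernel` and `RunConcurrently` are hypothetical names and are not the actual kernel classes or the ORT custom op API. It assumes configuration (such as a timeout) is read once at construction and that everything a call needs lives in locals, so one kernel instance can be shared across threads.

```cpp
#include <cstdint>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a custom op kernel (not the real extensions API).
class EchoKernel {
 public:
  explicit EchoKernel(int64_t timeout_seconds) : timeout_seconds_(timeout_seconds) {}

  // const: no member is modified, all per-call state is local,
  // so concurrent calls on the same instance cannot race.
  std::string Compute(const std::string& input) const {
    std::string response = input + " (timeout=" + std::to_string(timeout_seconds_) + "s)";
    return response;
  }

 private:
  const int64_t timeout_seconds_;  // read-only configuration set at construction
};

// Calls Compute from several threads on one shared kernel instance.
// Each thread writes only to its own pre-allocated slot in `results`.
std::vector<std::string> RunConcurrently(const EchoKernel& kernel, int num_threads) {
  std::vector<std::string> results(num_threads);
  std::vector<std::thread> threads;
  for (int i = 0; i < num_threads; ++i) {
    threads.emplace_back([&kernel, &results, i] {
      results[i] = kernel.Compute("request " + std::to_string(i));
    });
  }
  for (auto& t : threads) t.join();
  return results;
}
```

Marking `Compute` as const lets the compiler reject accidental writes to member state, which is the property the concurrent-call requirement depends on.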