Commit graph

158 Commits

Author SHA1 Message Date
Yi Zhang 8e8b62b8b5
Build CUDA and DML together (#22602)
### Description
We now need to build CUDA and DML in one package, but the CUDA EP and
DML EP can't run in the same process; doing so throws the exception
`the GPU device instance has been suspended`.
So the issue is that the CUDA EP and DML EP coexist at compile time but
can't coexist at run time.

This PR splits the CUDA EP tests and DML EP tests in all unit tests.
The solution is to use two environment variables, NO_CUDA_TEST and
NO_DML_TEST, in CI.

For example, if NO_CUDA_TEST is set, the DefaultCudaExecutionProvider
will be nullptr, and the test will not run with CUDA EP.
While debugging, the CUDAExecutionProvider is never invoked.
As long as CUDA functions, like cudaSetDevice, are not called, the
DML EP tests should pass.
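The gating described above can be sketched as follows (a minimal illustration; `EpTestGate` and `skipEp` are hypothetical names, not the actual test utilities):

```java
// Hypothetical helper showing how a CI environment variable can gate
// execution-provider tests.
public class EpTestGate {

    // Returns true when tests for an EP should be skipped; envValue is the
    // raw value of NO_CUDA_TEST or NO_DML_TEST (null when the variable is
    // unset).
    public static boolean skipEp(String envValue) {
        return envValue != null && !envValue.isEmpty();
    }

    public static void main(String[] args) {
        // In CI, setting NO_CUDA_TEST makes the CUDA provider factory return
        // null so the suite runs with the DML EP only, and vice versa.
        System.out.println(skipEp(System.getenv("NO_CUDA_TEST")));
    }
}
```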

Disabled the Java testDirectML test because it currently fails even
without the CUDA EP.
2024-10-31 15:51:13 -07:00
wejoncy 20a45dd67b
[CoreML ML Program] support accelerator selection (#22383)
### Description
Currently, CoreML only supports running mlmodels on CPU/ALL; however,
CPU_GPU can sometimes be much faster.

This PR adds an option to select different hardware to boost
performance.




---------

Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
2024-10-15 11:50:11 +08:00
sheetalarkadam c06ecd415c
RC releases to Maven for Android (#22391)
### Description
Allows alpha, beta, and RC version releases to Maven for Android
artifacts.



### Motivation and Context
Helpful for releasing RC versions or test artifacts to Maven for testing.
For example, a new QNN Android package is being released, and it will be
useful to test the RC version against its dependencies before the final
release.

## Future Work
Allow RC version for all Maven artifacts.
2024-10-11 08:58:02 -07:00
sheetalarkadam dd2ea8469e
Add qnn android package (#22296)
### Description
Pre-built QNN Android package


### Future Work
1. Setting up CI with BrowserStack: onnxruntime_tests and Android tests
2. ESRP Release to Maven
2024-10-10 10:37:22 -07:00
Yulong Wang c5d28cac4d
Initial WebGPU EP checkin (#22318)
### Description

This change introduces the WebGPU EP into ONNX Runtime.

To make the PR as simple as possible, this PR excluded the following:
- C API changes for WebGPU EP
- actual implementation of WebGPU EP. Currently in this PR, WebGPU is a
stub implementation that does not register any kernel.
- Python IO Binding update
- Node.js IO Binding update

This PR now contains only 43 file changes (while the working branch
contains 130+) and hopefully this makes it easier to review.

There will be separate PRs for each of the items mentioned above.

Current working branch: #21904
2024-10-08 16:10:46 -07:00
jingyanwangms 5036e63d58
Increase TensorRT tolerance from default 1e-5 to 1e-3 after TRT 10.4 (#22321)
### Description
Increase the TensorRT test tolerance from the default 1e-5 to 1e-3 after TRT 10.4.
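The effect of the relaxed tolerance can be illustrated with a small absolute-difference check (a sketch, not the actual test helper):

```java
public class Tolerance {

    // Absolute-tolerance comparison of the kind the affected tests use.
    public static boolean approxEquals(float expected, float actual, float atol) {
        return Math.abs(expected - actual) <= atol;
    }

    public static void main(String[] args) {
        float expected = 0.0102678f;
        float actual = 0.0104678f; // deviation of about 2e-4
        System.out.println(approxEquals(expected, actual, 1e-3f)); // passes at the new tolerance
        System.out.println(approxEquals(expected, actual, 1e-5f)); // fails at the old default
    }
}
```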

2024-10-05 18:06:06 +10:00
Adam Pocock 14d1bfc34b
[java] Multi-LoRA support (#22280)
### Description
Java parts of Multi-LoRA support - #22046.

### Motivation and Context
API equivalence with Python & C#.

---------

Co-authored-by: Dmitri Smirnov <dmitrism@microsoft.com>
2024-10-01 13:54:37 -07:00
Edward Chen c24e55b1f1
[Java] Add API for appending QNN EP (#22208)
- Add Java API for appending QNN EP
- Update Java unit test setup
  - Fix issues with setting system properties for tests
  - Unify Windows/non-Windows setup to simplify
2024-10-01 10:18:04 -07:00
Adam Pocock cfa45df6b5
[java] Migrate OnnxTensors created from arrays over to a backing Java buffer (#18556)
### Description
Following from #16578 and #16835 this migrates over
`OnnxTensor.createTensor(<array>)` to first instantiate a
`java.nio.Buffer` and then copy the array into that buffer in Java
before creating the tensor. It also changes the `OnnxTensor.getValue()`
method which returns a multidimensional array so it does the array
construction and value copy in Java. This allows the removal of some
unpleasant recursive C code which repeatedly calls into the JVM to
traverse Java's arrays. The equivalent Java code is still unpleasant and
recursive, but it's easier to reason about and memory safe. As a bonus,
more `OnnxTensor`s are now backed by buffers which allow users to pin
memory and reduce allocations by reusing them for same sized inputs.
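The Java-side copy can be sketched as a recursive flatten into a `FloatBuffer` (a simplified sketch, not the actual `OnnxTensor` code):

```java
import java.nio.FloatBuffer;

public class ArrayFlatten {

    // Recursively copy a (possibly nested) float array into a FloatBuffer,
    // replacing the old recursive JNI traversal with pure Java.
    public static void copyToBuffer(Object array, FloatBuffer buf) {
        if (array instanceof float[]) {
            buf.put((float[]) array);
        } else {
            for (Object sub : (Object[]) array) {
                copyToBuffer(sub, buf);
            }
        }
    }

    public static void main(String[] args) {
        float[][] input = {{1f, 2f}, {3f, 4f}};
        FloatBuffer buf = FloatBuffer.allocate(4);
        copyToBuffer(input, buf);
        buf.flip();
        System.out.println(buf.get(3)); // 4.0
    }
}
```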

Some of the JNI code which parses Java arrays still exists as it's used
by `OnnxMap`, removing that will be the target of a future refactor.
Strings are still processed in JNI as it is easier to work with String
tensors and UTF-8 arrays in C.

### Motivation and Context
Minimizing the amount of JNI code makes it easier to maintain and using
buffers in preference to arrays allows for fewer allocations.
2024-09-24 15:36:52 +10:00
Adam Pocock 6d7235ba5a
[Java] Exposing SessionOptions.SetDeterministicCompute (#18998)
### Description
Exposes `SetDeterministicCompute` in Java, added to the C API by #18944.

### Motivation and Context
Parity between C and Java APIs.
2024-09-16 11:55:38 +10:00
Adam Pocock 02e00dc023
[java] Adding ability to load a model from a memory mapped byte buffer (#20062)
### Description
Adds support for constructing an `OrtSession` from a
`java.nio.ByteBuffer`. These buffers can be memory mapped from files
which means there doesn't need to be copies of the model protobuf held
in Java, reducing peak memory usage during session construction.
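The memory-mapping side of this can be sketched with plain NIO; the commented-out session call assumes the `ByteBuffer` overload this PR describes, everything else is stdlib:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedModel {

    // Map a model file read-only; the mapped region never becomes a heap copy.
    public static MappedByteBuffer mapModel(Path modelPath) throws IOException {
        try (FileChannel ch = FileChannel.open(modelPath, StandardOpenOption.READ)) {
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("model", ".onnx"); // stand-in for a real model
        try {
            Files.write(path, new byte[] {0x08, 0x07});
            MappedByteBuffer model = mapModel(path);
            // With the ORT Java API on the classpath, the buffer can be handed
            // straight to session construction without copying the protobuf:
            //   OrtSession session = env.createSession(model, new OrtSession.SessionOptions());
            System.out.println(model.remaining()); // 2
        } finally {
            Files.deleteIfExists(path);
        }
    }
}
```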

### Motivation and Context
Reduces memory usage on model construction by not requiring as many
copies on the Java side. Should help with #19599.
2024-09-16 08:31:55 +10:00
Michael Tyler 904b850b44
Update Arm Compute Library Execution Provider (#22032)
### Description
This PR makes the following updates to the Arm Compute Library execution
provider:

- Target Arm Compute Library 24.07  
- Add support for the following operators: 
  - Conv (FP16) 
  - NhwcConv 
  - QLinearConv 
  - MatMul 
  - FusedMatMul 
  - MatMulIntegerToFloat 
- Optimize memory usage and performance
- Expose the enable_fast_math setting 
- Use the main runtime thread pool 



### Motivation and Context
These updates improve performance and memory usage, and enable use of a
more recent version of Arm Compute Library.

@microsoft-github-policy-service agree company="Arm Ltd"

---------

Signed-off-by: Michael Tyler <michael.tyler@arm.com>
2024-09-12 20:51:59 -07:00
Adam Pocock 22437b581b
[java] Fix for OnnxTensor creation when passing in a ByteBuffer containing elements of a different type (#21774)
### Description
Fixes a bug where the buffer offset and position were incorrectly
computed if the user supplied a `ByteBuffer` to `createTensor` but set
the type of the tensor to something other than `INT8`. This would be
more common if the user was trying to load the initializers from a
serialized representation and didn't want to bother with the type
information (which is the case in #21321).
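The core of the fix is scaling byte counts by the element width (a simplified sketch of the bookkeeping, not the actual binding code):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class BufferSizing {

    // A raw ByteBuffer holding elements of another type contains
    // remaining()/bytesPerElement elements, not remaining() elements.
    public static int elementCount(ByteBuffer data, int bytesPerElement) {
        return data.remaining() / bytesPerElement;
    }

    public static void main(String[] args) {
        ByteBuffer raw = ByteBuffer.allocate(16).order(ByteOrder.nativeOrder());
        // 16 bytes interpreted as FLOAT (4 bytes each) is 4 elements, not 16.
        System.out.println(elementCount(raw, Float.BYTES)); // 4
    }
}
```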

### Motivation and Context
Partial fix for #21321. The remainder of the fix is to add a helper
which allows users to load initializers out of an `onnx_data` file, but
that will require adding protobuf as a dependency for the Java API to
allow the parsing of an ONNX file separately from the native code. It
might be nicer to put that functionality into ORT's C API so it can
return the lengths & offsets of the initializers when provided with an
ONNX file containing external initializers. We hit this kind of thing in
Java more often than in other languages, as Java models can be supplied
as classpath resources, which we can easily read but not materialize on
disk for the ORT native library to read.
2024-09-13 12:38:17 +10:00
mindest 5b9369e93c
Fix typos according to reviewdog report. (#21335)
### Description
Fix typos based on reviewdog report but with some
exceptions/corrections.
2024-07-22 13:37:32 -07:00
Edward Chen 9c2b85ad58
Fix Android build on Windows (#21304)
- Pass a list of files instead of a path-separator-delimited string to project.files(). See this issue: https://github.com/gradle/gradle/issues/19817
- Check for host (instead of target) being Windows when using fallback patch program.
2024-07-15 12:29:02 -07:00
Jian Chen f81c0ec32a
Remove warning suppression from Java Packaging pipeline. (#21010)
### Description
Remove warning suppression from Java Packaging pipeline.


### Motivation and Context
We want the CI step not to produce warnings.
2024-06-24 16:46:21 -07:00
Edward Chen 981893c318
Remove deprecated "mobile" packages (#20941)
# Description

This PR removes the building of the ORT "mobile" packages and much of the associated infrastructure which is no longer needed.

Not removed yet - tools/ci_build/github/android/mobile_package.required_operators.config and the helper scripts that depend on it.

# Motivation and Context

The mobile packages were deprecated in 1.18. Users should use the full packages (Android - onnxruntime-android, iOS - onnxruntime-c/onnxruntime-objc) instead or do a custom build.
2024-06-07 16:20:32 -05:00
Jian Chen d32adb26f2
Refactor deprecated gradle syntax (#20922)
To replace deprecated APIs.
Should verify with the `Gradle cmakeCheck` step from
`Windows_Packaging_CPU_x64_default` stage from the Zip-Nuge-...
pipeline.
2024-06-07 11:08:52 -07:00
Jian Chen 228713f635
adding publishing stage to publish java CUDA 12 pkg to ado (#20834) 2024-05-29 16:24:23 -07:00
Adam Pocock a36692066d
[java] CUDA & TensorRT options fix (#20549)
### Description
I misunderstood how UpdateCUDAProviderOptions and
UpdateTensorRTProviderOptions work in the C API: I had assumed that they
updated the options struct, but they actually re-initialize the struct
to the defaults and then apply only the values in the update. I've
rewritten the Java bindings for those classes so that they aggregate all
the updates and apply them in one go. I also updated the C API
documentation to note that these classes have this behaviour. I've not
checked whether any of the other providers with an options struct behave
this way; we only expose CUDA and TensorRT's options in Java.
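The aggregate-then-apply pattern can be sketched like this (a hypothetical class for illustration, not the actual binding code):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AggregatedOptions {

    // The native update call resets the struct to defaults before applying
    // the supplied values, so every setting must be buffered and sent in one
    // go rather than sent one at a time.
    private final Map<String, String> pending = new LinkedHashMap<>();

    public void set(String key, String value) {
        pending.put(key, value);
    }

    // Everything accumulated so far, to pass to a single native update call.
    public Map<String, String> buildUpdate() {
        return new LinkedHashMap<>(pending);
    }

    public static void main(String[] args) {
        AggregatedOptions opts = new AggregatedOptions();
        opts.set("device_id", "0");
        opts.set("arena_extend_strategy", "kSameAsRequested");
        // Both keys survive: neither set() call wiped out the other.
        System.out.println(opts.buildUpdate().size()); // 2
    }
}
```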

There's a small unrelated update to add a private constructor to the
Fp16Conversions classes to remove a documentation warning (they
shouldn't be instantiated anyway as they are utility classes containing
static methods).

### Motivation and Context
Fixes #20544.
2024-05-05 00:16:55 -07:00
Adam Pocock 262b6bd3b7
[java][DML EP] Modifying dml_provider_factory.h so it can compile as a C header file (#20157)
### Description
The dml_provider_factory header file can't be used in C programs as it
defines C++ inline operators. This PR rearranges that header file so
that it looks like valid C when used from C, and also makes a couple of
small modifications to the Java code so it correctly binds to the DML EP
at build time.

I'm having some difficulty testing it as I think it's pulling in the old
version of DirectML on my computer and I can't figure out what the
library loading path is in Java to make it look at the recent version I
downloaded. So the test I added fails with:

```
InferenceTest > testDirectML() FAILED
    ai.onnxruntime.OrtException: Error code - ORT_RUNTIME_EXCEPTION - message: Exception during initialization: <path-to-ort>\onnxruntime\core\providers\dml\DmlExecutionProvider\src\AbiCustomRegistry.cpp(518)\onnxruntime.dll!00007FFF74819333: (caller: 00007FFF74793509) Exception(3) tid(4f58) 80070057 The parameter is incorrect.
        at app//ai.onnxruntime.OrtSession.createSession(Native Method)
        at app//ai.onnxruntime.OrtSession.<init>(OrtSession.java:74)
        at app//ai.onnxruntime.OrtEnvironment.createSession(OrtEnvironment.java:236)
        at app//ai.onnxruntime.OrtEnvironment.createSession(OrtEnvironment.java:221)
        at app//ai.onnxruntime.InferenceTest.openSessionSqueezeNet(InferenceTest.java:1961)
        at app//ai.onnxruntime.InferenceTest.runProvider(InferenceTest.java:665)
        at app//ai.onnxruntime.InferenceTest.testDirectML(InferenceTest.java:657)
```

But it does correctly compile, and this error seems very similar to
other issues with the DML provider when it doesn't like a model due to
the loaded library being old. The test is using the squeezenet file
that's been in the repo since 2019. If someone can help me figure out
how to get the right version of DML in the library path I can test it
more on my end. I tried adding the folder with the new version into the
system path, but I'm not very familiar with Windows' library loading
behaviour.

### Motivation and Context
Fixes #19656 to allow use of the DirectML EP from ORT Java.

cc @martinb35
2024-04-01 21:58:50 -07:00
Adam Pocock 2f82400b13
[java] Java 21 build support (#19876)
### Description
Bump spotless and the Gradle wrapper to 6.25.0 and 8.6 respectively to
allow compiling ORT on Java 21. The build still targets Java 8.

I'm not sure if there will be CI changes necessary to use this PR,
specifically for the Gradle version as I don't know if that is cached
somewhere earlier in the CI build process.

The new Gradle version adds a warning that using `--source` and
`--target` to select the Java language version is obsolete, which is
annoying; we can fix it if we decide to only allow building on newer
versions of Java while still supporting running on Java 8.

### Motivation and Context
Java 21 is the latest LTS release of Java and ORT should be able to
build on it.
2024-03-28 15:51:22 -07:00
Adam Pocock e5ce81ae84
[java] Adding ML program flag for CoreML (#19551)
### Description
Adds the new CoreML enum flags to enable ML Program support in Java.

### Motivation and Context
Adds support for #19347 to the Java API.
2024-02-21 12:24:41 -08:00
Tianlei Wu fbff99a432
Change Java Test Threshold (#19508)
### Description
Increase the threshold to 1e-5 to avoid test failures on CUDA when the
difference is slightly larger than 1e-6.
This may be because TF32 is used in those CUDA tests.

### Motivation and Context


https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=1291322&view=logs&j=f2f63060-d9d6-52d0-adee-b97db5a9ab91&t=28e21ca6-87a4-5e1e-0441-72b5e8326f2d

ProviderOptionsTest > testCUDAOptions() FAILED
org.opentest4j.AssertionFailedError: array contents differ at index [103], expected: <0.0102678> but was: <0.010266338>
        at app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
        at app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
        at app//org.junit.jupiter.api.AssertArrayEquals.failArraysNotEqual(AssertArrayEquals.java:440)
        at app//org.junit.jupiter.api.AssertArrayEquals.assertArrayEquals(AssertArrayEquals.java:290)
        at app//org.junit.jupiter.api.AssertArrayEquals.assertArrayEquals(AssertArrayEquals.java:123)
        at app//org.junit.jupiter.api.AssertArrayEquals.assertArrayEquals(AssertArrayEquals.java:119)
        at app//org.junit.jupiter.api.Assertions.assertArrayEquals(Assertions.java:1360)
        at app//ai.onnxruntime.providers.ProviderOptionsTest.runProvider(ProviderOptionsTest.java:99)
        at app//ai.onnxruntime.providers.ProviderOptionsTest.testCUDAOptions(ProviderOptionsTest.java:43)

https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=1293200&view=logs&jobId=f2f63060-d9d6-52d0-adee-b97db5a9ab91&j=f2f63060-d9d6-52d0-adee-b97db5a9ab91&t=28e21ca6-87a4-5e1e-0441-72b5e8326f2d
        
InferenceTest > testCUDA() FAILED
org.opentest4j.AssertionFailedError: array contents differ at index [103], expected: <0.0102678> but was: <0.010266337>
        at app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
        at app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
        at app//org.junit.jupiter.api.AssertArrayEquals.failArraysNotEqual(AssertArrayEquals.java:440)
        at app//org.junit.jupiter.api.AssertArrayEquals.assertArrayEquals(AssertArrayEquals.java:290)
        at app//org.junit.jupiter.api.AssertArrayEquals.assertArrayEquals(AssertArrayEquals.java:123)
        at app//org.junit.jupiter.api.AssertArrayEquals.assertArrayEquals(AssertArrayEquals.java:119)
        at app//org.junit.jupiter.api.Assertions.assertArrayEquals(Assertions.java:1360)
        at app//ai.onnxruntime.InferenceTest.runProvider(InferenceTest.java:676)
        at app//ai.onnxruntime.InferenceTest.testCUDA(InferenceTest.java:615)
2024-02-14 10:08:46 -08:00
Changming Sun a28abeb241
Change "#ifdef WIN32" to "#ifdef _WIN32" (#19254)
### Description
`_WIN32` is a standard macro listed at
https://learn.microsoft.com/en-us/cpp/preprocessor/predefined-macros?view=msvc-170
. But `WIN32` is not.
2024-01-24 14:35:44 -08:00
Heflin Stephen Raj 0ea48fc73e
Modified the condition to load the optimiser model (#18891) 2024-01-23 10:10:54 -08:00
Adam Pocock 191525301f
[java] Updating TensorInfo so it contains the named dimensions (#18962)
### Description
The Java `TensorInfo` object which is used to describe a tensor's shape,
along with the input and output placeholders for a model couldn't show
any symbolic/named dimensions in that tensor. Now this information is
stored in Java strings on construction and included in the toString.

### Motivation and Context
Setting symbolic dimensions required external information in Java, the
names were not discoverable from within the API.
2024-01-15 14:42:50 -08:00
Adam Pocock 71657d1eb8
[java] Fix double close (#19133)
### Description
The `OnnxValue` and `OrtProviderOptions` implementations now check to
see if they've been closed before accessing the native pointer, and also
before close is called.

### Motivation and Context
Before they could be closed twice which SIGSEGV'd the JVM. Fixes #19125.
2024-01-14 14:53:26 -08:00
Adam Pocock 3456831413
[java] Make the backing byte buffer in an OrtValue accessible (#16578)
### Description
Adds a method to access the backing direct byte buffer from a Java
`OnnxTensor` object, assuming it is backed by a direct byte buffer
(tensors created by ORT's run call or ones created in Java from
multidimensional arrays are not). Also adds a method to check if the
backing byte buffer was copied from the user's buffer supplied on
creation (this could be tested via a pointer comparison from the output
of `getBufferRef` and the user's input buffer, so I'm not sure if it's
necessary).
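The reuse pattern this enables can be sketched with plain NIO (only the buffer handling is shown; the tensor-creation and `getBufferRef` calls are omitted since they need the ORT library):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

public class PinnedInput {

    // One direct buffer, allocated once, reused for every same-sized input.
    public static FloatBuffer newDirectFloatBuffer(int capacity) {
        return ByteBuffer.allocateDirect(capacity * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
    }

    // Overwrite the buffer in place with the next input.
    public static void refill(FloatBuffer buf, float[] values) {
        buf.clear();
        buf.put(values);
        buf.flip();
    }

    public static void main(String[] args) {
        FloatBuffer input = newDirectFloatBuffer(4);
        refill(input, new float[] {0f, 1f, 2f, 3f});
        // A tensor created from this buffer keeps it as backing storage, so a
        // later refill is visible on the next run without a new allocation.
        refill(input, new float[] {4f, 5f, 6f, 7f});
        System.out.println(input.get(0)); // 4.0
    }
}
```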

### Motivation and Context
This is the first part of changes necessary to support output pinning in
Java OrtSession.run/OrtTrainingSession.run calls. I split it out from
the rest of the work as it's useful by itself (e.g. to allow users to
keep a single input tensor and rewrite it each time with new inputs
rather than allocate a fresh one) and the other change will be much more
involved so splitting it makes it easier to review.

cc @yuslepukhin
2023-10-17 10:03:49 -07:00
Chi Lo 569876fb16
[TensorRT EP] Refactor OrtTensorRTProviderOptions initialization and make it easy to add new field (#17617)
Two major modifications in this PR:

1. Refactor OrtTensorRTProviderOptions initialization and make it easy
to add new fields.
2. Make the Python API capable of using TensorRT plugins by adding a new
Python binding API, `register_tensorrt_plugins_as_custom_ops`. (It needs
to register the EP's custom op domain before model load. The C++ API is
slightly different: when calling
SessionOptionsAppendExecutionProvider_TensorRT_XX, it appends the custom
op domain to the session options, and ORT later registers the custom op
domain from the session options before model loading.)
2023-10-06 14:12:20 -07:00
Adam Pocock 522cc968e8
[java] Filling out the javadoc for the float8 types (#17694) 2023-09-27 10:52:11 -07:00
Adam Pocock aed43f429a
[java] Enable output pinning in OrtSession and OrtTrainingSession (#16835) 2023-09-26 01:49:13 -07:00
Adam Pocock 03c3e91b0d
[java] Relaxing CoreML test (#16777)
### Description
Reduces precision on the CoreML provider test as it returns slightly
different answers than the other tested providers. Checked on a 2020 13"
M1 MBP.

### Motivation and Context
Fixes Java CoreML test failure after #16763.
2023-08-09 11:43:05 -07:00
Adam Pocock 340f4ded73
[java] Fills out the javadoc so there are no more documentation warnings (#16776)
### Description
Adds javadoc for all protected and public members, methods and classes.

### Motivation and Context
The javadoc warnings were annoying me when running the builds. Also,
those types should have been documented.

---------

Co-authored-by: Scott McKay <Scott.McKay@microsoft.com>
2023-07-27 16:17:03 +10:00
Adam Pocock a1bb670536
[java] Fp16 fix for android/react native (#16832)
### Description
This PR splits out the FP16 conversions into a separate package we can
override in the android build with a version which works on old versions
of Android.

I'm not sure the android build system changes are correct as I haven't
got an android build environment configured on my workstation.
@YUNQIUGUO if the CI build fails we should follow up offline to get my
environment configured so I can iterate on it.

### Motivation and Context
Fixes the CI failure after #16703.
2023-07-25 12:31:32 -07:00
Adam Pocock a8e776b78b
[java] Adds support for fp16 and bf16 tensors (#16703)
### Description
The Java API currently only supports fp16 output tensors which it
automatically casts to floats on the way out. This PR adds support for
creating fp16 and bf16 tensors (from `java.nio.Buffer` objects or as the
output of models, creation from Java short arrays is not supported),
along with efficient methods for casting `FloatBuffer` into
`ShortBuffer` filled with fp16 or bf16 values and vice versa.

The fp16 conversions use a trick to pull in the efficient conversion
methods added to Java 20, falling back to ports of the MLAS methods
otherwise. The Java 20 methods can be special cased by the C2 JIT
compiler to emit the single instruction on x86 and ARM which converts
fp32<->fp16, or the vectorized versions thereof, so they should be quite
a bit faster than the MLAS ported one.
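For fp16 the Java 20 path uses `Float.floatToFloat16`/`Float.float16ToFloat`; the bf16 direction is simple enough to sketch directly (a truncating conversion for illustration; the production converters likely round to nearest even):

```java
public class Bf16 {

    // bf16 is the top 16 bits of an IEEE-754 fp32 value. This sketch
    // truncates the low mantissa bits instead of rounding.
    public static short floatToBf16(float f) {
        return (short) (Float.floatToIntBits(f) >>> 16);
    }

    public static float bf16ToFloat(short bits) {
        return Float.intBitsToFloat((bits & 0xFFFF) << 16);
    }

    public static void main(String[] args) {
        // 1.5f is exactly representable in bf16, so the round trip is lossless.
        System.out.println(bf16ToFloat(floatToBf16(1.5f))); // 1.5
    }
}
```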

### Motivation and Context
fp16 and bf16 are increasingly popular formats and we've had several
requests for this functionality. Fixes #7003.

cc @yuslepukhin  @cassiebreviu

---------

Co-authored-by: Scott McKay <Scott.McKay@microsoft.com>
2023-07-21 21:14:41 +10:00
Adam Pocock ba91457183
[java] Adding addExternalInitializers and addInitializer to OrtSession.SessionOptions (#16198)
### Description
Adds support for adding external initializers or overriding initializers
to a session options from Java.

### Motivation and Context
We want to instantiate large models from Java without filesystem access.

cc @yuslepukhin
2023-07-05 12:51:59 -07:00
Adam Pocock 13cc6192e5
[java] Adding native library loader to SessionOptions and RunOptions static init (#16435)
### Description
Unlike most ORT classes, `SessionOptions` and `RunOptions` don't trigger
native library loading of the JNI binding and ORT when the classes are
initialized (after class loading). This was initially because I thought
that loading an inner class would trigger the static initialization of
the outer class, but this is not true. So if you create a
`SessionOptions` instance before referencing `OrtEnvironment` then you
won't trigger library loading and you'll get an error saying it couldn't
link the native method that creates a `SessionOptions` object.
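The class-loading subtlety here is demonstrable in isolation: per JLS 12.4.1, initializing a nested class does not initialize its enclosing class. A self-contained sketch:

```java
public class StaticInitDemo {

    // Flags live in a separate holder so reading them doesn't itself
    // trigger Outer's initialization.
    static class Flags {
        static boolean outerRan = false;
        static boolean innerRan = false;
    }

    static class Outer {
        static { Flags.outerRan = true; }

        static class Inner {
            static { Flags.innerRan = true; }
            static void touch() {}
        }
    }

    public static void main(String[] args) {
        Outer.Inner.touch();
        // Initializing Inner does NOT initialize Outer (JLS 12.4.1) -- the
        // same reason loading OrtSession.SessionOptions didn't run
        // OrtSession's (or OrtEnvironment's) library-loading static init.
        System.out.println(Flags.innerRan + " " + Flags.outerRan); // true false
    }
}
```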

Note this doesn't prevent users from creating a `SessionOptions` and
modifying it before the `OrtEnvironment` is created, which can still
cause issues. It would be a breaking API change to modify the
`SessionOptions` constructor to take an environment, and it wouldn't
mirror the way it works in the C API which requires this by convention
rather than API design, but we can discuss making that modification
later.

### Motivation and Context
Reduces the occurrence of mysterious Java library loading errors. Helps
with #16434.
2023-07-03 15:59:03 -07:00
Baiju Meswani 10ba1e270c
Minimal Build for On-Device Training (#16326)
🛠️ __Changes in this pull request:__

This pull request introduces two significant changes to the project:

- Changing on device training checkpoint format: The current
implementation stores the on device training checkpoint as a sequence of
tensors in multiple files inside a checkpoint folder, which can be
inefficient in terms of storage and performance. In this PR, I have
modified the checkpoint format to utilize the flatbuffer table to save
the checkpoint to a single file, providing a more compact and efficient
representation. The changes around this are twofold:
- Add the checkpoint flatbuffer schema that will generate the necessary
checkpoint source files.
- Update the checkpoint saving and loading functionality to use the new
format.

- Adding support for onnxruntime minimal build: To support scenarios
where binary size is a constraint, I made changes to ensure that the
training build can work well with the minimal build.

🔍 __Open Issues:__
- In order to extract the optimizer type, the existing implementation
re-loaded the onnx optimizer model and parsed it. This is no longer
possible, since the model format can either be onnx or ort. One idea is
to do the same for ort format optimizer model. This needs some
investigation.
- Changes to the offline tooling to generate ort format training
artifacts.
- End-to-end training example showcasing the use of the minimal training
build.
- Add support for export model for inferencing in a minimal build.
2023-06-22 12:27:23 -07:00
Adam Pocock bca49d62a0
Fixing CoreML in Java (#16231)
### Description
The name of the flag we set when compiling the JNI binding to enable the CoreML EP changed at some point in the past. This PR fixes it by updating the flag in the JNI. I also added a quick smoke test for the CoreML provider to make sure it doesn't crash and can be enabled.

### Motivation and Context
All the EPs should work as expected in Java. Fixes #16230.
2023-06-07 12:24:57 -07:00
Adam Pocock 3c2a11f2f1
[java] Allow the creation of boolean tensors from ByteBuffer (#15556)
### Description
The tensor creation code now allows the creation of boolean tensors from
non-direct `ByteBuffer` instances. It previously only allowed them from
arrays and direct `ByteBuffer` instances and this fixes that
inconsistency. The boolean tensor test has been updated to cover all
three cases.

### Motivation and Context
Fixes #15509.
2023-06-05 09:58:50 -07:00
Xavier Dupré e726151b5c
Introduce float 8 types (#14731)
### Description
The PR implements FloatE4M3FN, FloatE5M2, FloatE4M3FNUZ, FloatE5M2FNUZ
as described in PR https://github.com/onnx/onnx/pull/4805. It uses the
CUDA API to cast float/half to float8 if CUDA>=11.8, and a custom
implementation if CUDA<11.8.

* It implements, Cast, QuantizeLinear, DequantizeLinear for all types on
CPU, only for types FloatE4M3FN, FloatE5M2 on CUDA.
* It extends the supported types for the control flow operators and for
Shape, Reshape, Identity, If, Loop, Scan
* It implements Equal(19).
* Cast, QuantizeLinear, DequantizeLinear operators now support a
parameter `saturate` only valid for float 8 types. It is true by
default. In that case, any value out of range is converted into the
maximum float 8 value. If false, it is infinite.
* QuantizeLinear, DequantizeLinear now supports multiple scales on CUDA
(and ROCm by extension), scale = 1D tensor with one scale per channel
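The `saturate` semantics can be sketched numerically (E4M3FN's maximum finite value of 448 is taken from the ONNX float8 definition; rounding to the nearest representable fp8 value is omitted):

```java
public class Float8Saturate {

    // Largest finite magnitude representable in float8 E4M3FN.
    static final float E4M3FN_MAX = 448.0f;

    // saturate=true clamps out-of-range values to the max float8 value;
    // saturate=false lets them become infinite, per the PR description.
    public static float castForE4m3fn(float x, boolean saturate) {
        if (Math.abs(x) <= E4M3FN_MAX) {
            return x; // rounding to the nearest fp8 value omitted for brevity
        }
        return saturate
                ? Math.copySign(E4M3FN_MAX, x)
                : Math.copySign(Float.POSITIVE_INFINITY, x);
    }

    public static void main(String[] args) {
        System.out.println(castForE4m3fn(1000f, true));  // 448.0
        System.out.println(castForE4m3fn(1000f, false)); // Infinity
    }
}
```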

### Motivation and Context
Supports latest onnx version.

Fixes
[AB#15395](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/15395)

---------

Co-authored-by: Xavier Dupre <xadupre@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
Co-authored-by: Randy Shuai <rashuai@microsoft.com>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
Co-authored-by: Scott McKay <Scott.McKay@microsoft.com>
2023-05-30 13:25:58 -07:00
Jian Chen ea7b2deffd
Removing C4090 warning suppression (#15994)
### Description
Removing the C4090 warning suppression after the Windows pipelines adopt VS2022.


2023-05-18 10:08:05 -07:00
Dmitri Smirnov 896a963492
Adjust GetVersionString() and GetBuildInfoString() signatures and move them to OrtApi (#15921)
### Description

This PR partially reverts changes introduced in
https://github.com/microsoft/onnxruntime/pull/15643

We make the two APIs always return std::string in UTF-8.

We also move the entry points from OrtApiBase to OrtApi to make them
versioned.

### Motivation and Context

`GetVersionString` always returns x.y.z numbers that are not subject to
internationalization.
`GetBuildInfoString` can hold international chars, but UTF-8 should be
fine to contain those.
We prefix them with u8"" in case the compiler default charset is not
UTF-8.
Furthermore, creating platform dependent APIs is discouraged.
`ORTCHAR_T` is platform dependent and was created for paths only.
On non-Unix platforms it would still produce a `std::string` that can
only contain UTF-8.

The API was introduced after the latest release, and can still be
adjusted.
2023-05-13 13:45:07 -07:00
RandySheriffH 7c4e8267e7
Implement OpenAI endpoint invoker for nuget (#15797)
Implement the OpenAI audio endpoint, and enable NuGet packaging.

---------

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2023-05-11 22:04:02 -07:00
Ashwini Khade 0ffae8073b
Creating Nuget and Android packages for Training (#15712)
### Description
This PR creates NuGet and Android packages for Training.


### Motivation and Context
These packages are intended to be released in ORT 1.15 to enable
On-Device Training Scenarios.

## Packaging Story for Learning On The Edge Release
### Nuget Packages:
1. New Native package -> **Microsoft.ML.OnnxRuntime.Training** (Native
package will contain binaries for: win-x86, win-x64, win-arm, win-arm64,
linux-x64, linux-arm64, android)
2. C# bindings will be added to existing package ->
**Microsoft.ML.OnnxRuntime.Managed**

### Android Package published to Maven:
1. New package for training (full build) ->
**onnxruntime-training-android-full-aar**

### Python Package published to PyPi:
1. Python bindings and offline tooling will be added to the existing ort
training package -> **onnxruntime-training**
2023-05-01 12:59:56 -07:00
Yuhong Guo 41dcf0d32e
Expose build information in dynamic lib (#15643)
### Description
<!-- Describe your changes. -->
1. Add a Build Info API to ONNX Runtime.
2. Fix a compile error while building onnxruntime_benchmark on macOS.


### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
1. When the ONNX Runtime lib is serving online, we need a way to detect
how the lib was built. This PR helps developers get the build
information, such as git branch, git commit id, build type, and CMake
CXX flags, using `strings`, as shown below.


![image](https://user-images.githubusercontent.com/19584326/233794371-b2f95a2c-27fb-4709-a6dd-bf4bb12b0b5b.png)


![image](https://user-images.githubusercontent.com/19584326/233794360-f96f5d2e-332c-405c-83f1-370ccc2b86f8.png)

If the build environment has no git, there will be no git-related info:


![image](https://user-images.githubusercontent.com/19584326/234558596-298c1b01-9a90-41bf-9372-7259a8f8e5be.png)


2. Fix the following compile error while building the benchmark on macOS.

![image](https://user-images.githubusercontent.com/19584326/233793571-c261ac1f-47b2-434d-a293-7e9edc6c8a66.png)

---------

Co-authored-by: Yuhong Guo <yuhong.gyh@antgroup.com>
2023-04-28 21:57:31 -07:00
Adam Pocock 8a1a40ac63
[Java] CheckpointState AddProperty & GetProperty support (#15730) 2023-04-28 09:52:52 -07:00
Ashwini Khade ccb2243ee7
Update build option for training in java to enable_training_api (#15638)
### Description
Updating the build option for enabling training in Java builds from
ENABLE_TRAINING to ENABLE_TRAINING_APIS.
In the native codebase, ENABLE_TRAINING enables full training and
ENABLE_TRAINING_APIS creates the LTE builds with the training APIs.
Making this change syncs the naming convention across all the language
bindings.

It was a bit confusing to see ENABLE_TRAINING when debugging the android
build failures for training. Making this change just to improve
readability of logs during debugging.

2023-04-24 11:53:08 -07:00
Baiju Meswani b5a1941835
C, C++, Python, C# API update for on device training (#15518) 2023-04-21 11:36:01 -07:00