ONNX Runtime JavaScript API
This directory contains multiple NPM projects:
- onnxruntime-common
- onnxruntime-node
- onnxruntime-web
- onnxruntime-react-native
Development
This folder contains a .vscode folder for Visual Studio Code workspace configs. Opening this folder in VSCode enables code-formatting and linting features for TypeScript and C/C++ source code inside this folder. The following files are used for code-formatting and linting features for developers:
- .vscode/**
- package.json
- package-lock.json
- .eslintrc.js
- .prettierignore
- .prettierrc
Please follow the steps described below to set up the development environment.
Prerequisites
- Node.js (16.0+): https://nodejs.org/
  - (Optional) Use nvm (Windows / Mac/Linux) to install Node.js
- Python (2.7 or 3.6+): https://www.python.org/downloads/
  - python should be added to the PATH environment variable
- Visual Studio Code: https://code.visualstudio.com/
  - required extension: ESLint
  - required extension: Prettier
  - required extension: JavaScript Debugger
- Chrome or Edge Browser
Setup TypeScript development environment
In <ORT_ROOT>/js, run:
npm ci
This will install Prettier and ESLint for code-formatting and linting features. This is a one-time setup unless a git clean is performed or folder <ORT_ROOT>/js/node_modules is removed manually.
Using VSCode:
Use VSCode to open folder <ORT_ROOT>/js.
Make sure to open the correct folder to allow VSCode to load the workspace configuration. Otherwise TypeScript and the code formatter may not work as expected.
To populate TypeScript type declarations, run npm ci in each project folder.
Run code formatter and linter manually
In <ORT_ROOT>/js, use npm run lint to run ESLint, and use npm run format to run the code formatter.
onnxruntime-common
language: typescript
dependency:
folder: <ORT_ROOT>/js/common
This project is designed to include all "common" code, that is, pure JavaScript that can run in both Node.js and browsers.
Requirements
Node.js v12+ (recommended v14+)
Build
Use the following command in folder <ORT_ROOT>/js/common to install NPM packages, build TypeScript files and generate bundles:
npm ci
Distribution
It should be consumable both by projects that use NPM packages (through a Node.js node_modules folder structure generated by npm install onnxruntime-common) and from a CDN service that serves a .min.js bundle file.
Features
The following features are included in onnxruntime-common:
- InferenceSession interfaces
- Tensor/OnnxValue interfaces, implementation and a set of utility functions
- Backend interfaces and a set of functions for backend registration
Generate API reference document
Use the following command in folder <ORT_ROOT>/js/common to generate the API reference document:
npx typedoc
The document will be generated in folder <ORT_ROOT>/js/common/docs.
onnxruntime-node
language: typescript/C++
dependency: onnxruntime-common, ONNXRuntime.dll
folder: <ORT_ROOT>/js/node
This project is designed to be used as an NPM package that lets Node.js users consume ONNX Runtime via the Node.js binding, in Node.js or any Node.js compatible environment.
Requirements
Node.js v12+ (recommended v14+)
Build
Build ONNX Runtime and Node.js binding
Follow instructions for building ONNX Runtime Node.js binding
Build Node.js binding only
Use the following command in folder <ORT_ROOT>/js/node to install NPM packages and build TypeScript files:
npm ci
This will download the latest pre-built ONNX Runtime binaries for the current platform.
Distribution
It should be consumable by projects that use NPM packages (through a Node.js node_modules folder structure generated by npm install onnxruntime-node).
onnxruntime-web
language: typescript
dependency: onnxruntime-common, ONNXRuntime WebAssembly
folder: <ORT_ROOT>/js/web
This project is a library for running ONNX models on browsers. It is the successor of ONNX.js.
Build
onnxruntime-web build instructions
Test
We use the commands npm test (test runner) and npm run test:e2e (E2E test) for tests in ONNXRuntime Web.
test runner
In folder <ORT_ROOT>/js/web,
- Run npm test -- --help for a full CLI instruction.
- Run npm test -- <your-args> --debug to run one or more test cases.
There are multiple levels of tests for ONNXRuntime Web:
- unit test: tests for individual components written in TypeScript. Launch unit test by:
  npm test -- unittest
- model test: run a single model. The model folder should contain one .onnx model file and one or more folders for test cases; each folder contains several input**.pb and output**.pb files as test data. Launch model test by:
  npm test -- model <model_folder>
- op test: test a single operator. An op test is described in a .jsonc file which specifies the operator type, its attributes and one or more test case(s), each including a list of expected input tensor(s) and output tensor(s). The .jsonc file is located at <ORT_ROOT>/js/web/test/data/ops. Launch op test by:
  npm test -- op <file_name>
- suite test: suite test includes unit test, a list of model tests and op tests. Launch suite test by:
  npm test
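The model folder layout described above can be sanity-checked before launching a model test. The following is only a sketch, not part of the test runner: the `ModelFolderListing` type and `checkModelFolder` function are hypothetical, the folder and file names in the usage lines are illustrative, and the check only requires at least one input and one output file per test case folder.

```typescript
// Hypothetical in-memory view of a model folder: the files directly inside it
// and, for each sub-folder, the files that sub-folder contains.
interface ModelFolderListing {
  files: string[];
  folders: { [name: string]: string[] };
}

// Returns a list of human-readable problems; an empty list means the layout
// matches the description above.
function checkModelFolder(listing: ModelFolderListing): string[] {
  const problems: string[] = [];

  // One .onnx model file is expected at the top level.
  const models = listing.files.filter((f) => /\.onnx$/.test(f));
  if (models.length !== 1) {
    problems.push(`expected one .onnx file, found ${models.length}`);
  }

  // At least one test case folder is expected.
  const caseNames = Object.keys(listing.folders);
  if (caseNames.length === 0) {
    problems.push("expected at least one test case folder");
  }

  // Each test case folder needs input*.pb and output*.pb files.
  for (const name of caseNames) {
    const entries = listing.folders[name] || [];
    if (!entries.some((f) => /^input.*\.pb$/.test(f))) {
      problems.push(`${name}: no input*.pb file`);
    }
    if (!entries.some((f) => /^output.*\.pb$/.test(f))) {
      problems.push(`${name}: no output*.pb file`);
    }
  }
  return problems;
}

// Illustrative names only; real folders may be named differently.
const layoutProblems = checkModelFolder({
  files: ["model.onnx"],
  folders: { test_data_set_0: ["input_0.pb", "output_0.pb"] },
});
console.log(layoutProblems.length === 0 ? "layout ok" : layoutProblems.join("\n"));
```

The function works on a plain listing rather than the file system so that it stays independent of any particular Node.js API; a caller would build the listing from the real folder.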
E2E test
E2E test is for testing end-to-end package consuming. In this test, NPM packages for onnxruntime-common and onnxruntime-web are generated and a clean folder is used for installing packages. Then a simple mocha test is performed to make sure the packages can be consumed correctly.
To launch E2E test:
npm run test:e2e
Debugging
Debugging TypeScript on Desktop/Chrome
To debug the code from test-runner on Chrome:
- Launch npm test -- <your_args> --debug. It opens an instance of Chrome browser.
- In the open Chrome browser, click the DEBUG button on the top-right of the page.
- In VSCode, click [side bar]->Run and Debug->select [Attach to Chrome]->click [Start Debugging] to attach.
- Put breakpoints in source code, and refresh the page to reload.
Debugging TypeScript on iOS/Safari
To debug on an Apple iOS device, please refer to the following steps:
- Install RemoteDebug iOS WebKit Adapter by following its instructions.
- Launch the adapter in commandline: remotedebug_ios_webkit_adapter --port=9000.
- In VSCode, select debug configuration Remote Browser via Webkit Adaptor.
- Follow the steps above to debug.
Debugging TypeScript on Android/Chrome
To debug on an Android device, please refer to the following steps:
- Install Android SDK Platform Tools and make sure adb is ready to use.
- Follow instructions in Remote Debugging on Android to launch adb. Make sure to use port 9000 so that the existing debug configuration works.
- In VSCode, select debug configuration Remote Browser via Webkit Adaptor.
- Follow the steps above to debug.
Debugging C/C++ for ONNX Runtime WebAssembly
To debug C/C++ code for ONNX Runtime WebAssembly, you need to build ONNX Runtime with debug info (see Build).
Debugging C/C++ code in WebAssembly is not yet supported in VSCode. Please follow this instruction to debug in browser devtools using the extension C/C++ DevTools Support (DWARF).
Generating Document
This section describes how to generate the latest document for ONNX Runtime Web.
The document contains information about operators WebGL backend supports. It should align with the operator resolve rules in code and spec definition from ONNX.
In folder <ORT_ROOT>/js/web, use command npm run build:doc to generate the latest documents.
Distribution
It should be consumable both by projects that use NPM packages (through a Node.js node_modules folder structure generated by npm install onnxruntime-web) and from a CDN service that serves an ort.min.js file and one or multiple .wasm file(s).
Reduced WebAssembly artifacts
By default, the WebAssembly artifacts from the onnxruntime-web package allow use of both standard ONNX models (.onnx) and ORT format models (.ort). There is an option to use a minimal build of ONNX Runtime to reduce the binary size, which only supports ORT format models. See also ORT format model for more information.
Reduced JavaScript bundle file size
By default, the main bundle file ort.all.min.js of ONNX Runtime Web contains all features. However, its size is over 500kB, and some scenarios that don't use all the features call for a smaller bundle file. The following table lists all available bundles with their support status of features (O: supported, X: not supported).
bundle file name | file size | file size (gzipped) | WebGL | WASM | WebGPU |
---|---|---|---|---|---|
ort.all.min.js | 682 KB | 166 KB | O | O | O |
ort.min.js | 434 KB | 102 KB | O | O | X |
ort.webgl.min.js | 411 KB | 93.6 KB | O | X | X |
ort.webgpu.min.js | 293 KB | 80.1 KB | X | O | O |
ort.wasm.min.js | 46 KB | 14.8 KB | X | O | X |
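As an illustration of how the table can be used, the sketch below picks the smallest bundle that covers a set of required backends. The sizes and support flags are copied from the table (reading O as supported and X as not supported); the `Bundle` type and the `pickSmallestBundle` function are hypothetical helpers, not part of onnxruntime-web.

```typescript
type Backend = "webgl" | "wasm" | "webgpu";

interface Bundle {
  name: string;
  sizeKB: number; // uncompressed size from the table
  backends: readonly Backend[];
}

// Rows of the table above.
const BUNDLES: readonly Bundle[] = [
  { name: "ort.all.min.js", sizeKB: 682, backends: ["webgl", "wasm", "webgpu"] },
  { name: "ort.min.js", sizeKB: 434, backends: ["webgl", "wasm"] },
  { name: "ort.webgl.min.js", sizeKB: 411, backends: ["webgl"] },
  { name: "ort.webgpu.min.js", sizeKB: 293, backends: ["wasm", "webgpu"] },
  { name: "ort.wasm.min.js", sizeKB: 46, backends: ["wasm"] },
];

// Returns the smallest bundle that supports every required backend,
// or undefined if no bundle in the table covers all of them.
function pickSmallestBundle(required: readonly Backend[]): Bundle | undefined {
  let best: Bundle | undefined = undefined;
  for (const bundle of BUNDLES) {
    const coversAll = required.every((r) => bundle.backends.indexOf(r) !== -1);
    if (coversAll && (best === undefined || bundle.sizeKB < best.sizeKB)) {
      best = bundle;
    }
  }
  return best;
}

const choice = pickSmallestBundle(["webgpu"]);
console.log(choice ? choice.name : "no matching bundle");
```

Needing both WebGL and WebGPU leaves only ort.all.min.js, which is why the full bundle remains the default.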
Build ONNX Runtime as a WebAssembly static library
When --build_wasm_static_lib is given instead of --build_wasm, it builds a WebAssembly static library of ONNX Runtime and creates a libonnxruntime_webassembly.a file in the build output directory. This build option is useful for developers who have their own C/C++ project and build it as WebAssembly with ONNX Runtime. This static library is not published by a pipeline, so a manual build is required if necessary.
onnxruntime-react-native
language: typescript, java, objective-c
dependency: onnxruntime-common
folder: <ORT_ROOT>/js/react_native
This project provides an ONNX Runtime React Native JavaScript library to run ONNX models in React Native Android and iOS apps.
Requirements
- Yarn
- Android SDK and NDK, which can be installed via Android Studio or sdkmanager command line tool
- A Mac computer with the latest macOS
- Xcode
- CMake
- Python 3
Models with ORT format
Prior to ORT v1.13, the ONNX Runtime React Native package utilized the ONNX Runtime Mobile package, which required an ONNX model to be converted to ORT format. Follow these instructions to convert ONNX model to ORT format. Note that the ONNX Runtime Mobile package includes a reduced set of operators and types, so not all models are supported. See here for the list of supported operators and types.
From ORT v1.13 onwards, the 'full' ONNX Runtime package is used. It supports both ONNX and ORT format models, and all operators and types.
From ORT v1.19 onwards, the ONNX Runtime Mobile packages are no longer published.
Build
- Install NPM packages for ONNX Runtime common JavaScript library and required React Native JavaScript libraries
  - in <ORT_ROOT>/js/, run npm ci.
  - in <ORT_ROOT>/js/common/, run npm ci.
  - in <ORT_ROOT>/js/react_native/, run yarn.
- Acquire or build the Android ONNX Runtime package
  1. To use a published Android ONNX Runtime package from Maven, go to step 5.
  2. Set up an Android build environment using these instructions. Note that the dependencies are quite convoluted, so using the specified JDK and Gradle versions is important.
  3. In <ORT_ROOT>, run the below python script to build the ONNX Runtime Android archive file. On a Windows machine, this requires an admin account to build.
     You can build a 'full' package that supports all operators and types, or a reduced size package that supports a limited set of operators and types based on your model/s to minimize the binary size. See here for information about how the reduced build works, including creating the configuration file using your model/s. The instructions here show how to build a 'full' package.
     python tools/ci_build/github/android/build_aar_package.py tools/ci_build/github/android/default_full_aar_build_settings.json --config Release --android_sdk_path <ANDROID_SDK_PATH> --android_ndk_path <ANDROID_NDK_PATH> --build_dir <BUILD_DIRECTORY>
  4. Move the generated ONNX Runtime Android archive file to <ORT_ROOT>/js/react_native/android/libs/.
     Copy <BUILD_DIRECTORY>/aar_out/Release/com/microsoft/onnxruntime/onnxruntime-android/<version>/onnxruntime-android-<version>.aar into the <ORT_ROOT>/js/react_native/android/libs directory.
  5. To verify, open the Android Emulator and run this command from <ORT_ROOT>/js/react_native/android:
     ./gradlew connectedDebugAndroidTest
- Build iOS ONNX Runtime package
  1. To use the published C/C++ ONNX Runtime package from CocoaPods, skip all steps below.
  2. Set up iOS build environment using these instructions.
  3. Build a fat ONNX Runtime Framework for iOS and iOS simulator from <ORT_ROOT> using this command:
     python tools/ci_build/github/apple/build_apple_framework.py tools/ci_build/github/apple/default_full_apple_framework_build_settings.json --config Release
     The build creates Headers, LICENSE, and onnxruntime.xcframework in the build/iOS_framework/framework_out directory. From the framework_out directory, create an archive file named onnxruntime-c.zip and copy it to the <ORT_ROOT>/js/react_native/local_pods directory.
     zip -r onnxruntime-c.zip .
  4. To verify, open the iOS Simulator and run the below commands from <ORT_ROOT>/js/react_native/ios. Change the destination argument as needed to specify a running iOS Simulator.
     pod install
     xcodebuild test -workspace OnnxruntimeModule.xcworkspace -scheme OnnxruntimeModuleTest -destination 'platform=iOS Simulator,OS=latest,name=iPhone 13'
- Test Android and iOS apps. In Windows, open Android Emulator first.
  debug.keystore must be generated ahead of time for the Android example.
  keytool -genkey -v -keystore <ORT_ROOT>/js/react_native/e2e/android/debug.keystore -alias androiddebugkey -storepass android -keypass android -keyalg RSA -keysize 2048 -validity 999999 -dname "CN=Android Debug,O=Android,C=US"
  From <ORT_ROOT>/js/react_native, run:
  yarn bootstrap
  When testing with a custom built ONNX Runtime Android package, copy <BUILD_DIRECTORY>/aar_out/MinSizeRel/com/microsoft/onnxruntime/onnxruntime-android/<version>/onnxruntime-android-<version>.aar into the <ORT_ROOT>/js/react_native/e2e/android/app/libs directory.
  When testing with a custom built ONNX Runtime iOS package, copy onnxruntime-c.zip into the <ORT_ROOT>/js/react_native/local_pods directory.
- Run E2E Testing with Detox framework
  When testing with integrated Detox framework for Android and iOS e2e apps:
  - Detox prerequisites:
    Install detox command line tools:
    yarn global add detox-cli
    Install applesimutils, which is required by Detox to work with iOS simulators. (Requires a macOS device)
    brew tap wix/brew
    brew install applesimutils
    Main Detox project files:
    - .detoxrc.js - Detox config file;
    - e2e/jest.config.js - Jest configuration;
    - e2e/OnnxruntimeModuleExample.test.js - initial react native onnxruntimemodule e2e detox test.
  - Build the detox e2e testing app.
    From <ORT_ROOT>/js/react_native/e2e, run the command to build the e2e testing app. Before that, ensure you have an android emulator/ios simulator started locally.
    iOS (Debug):
    detox build --configuration ios.sim.debug
    Android (Debug):
    detox build --configuration android.emu.debug
    - Note: If the names of local testing android/ios devices do not match the default setting in the .detoxrc.js file, modify the device name in the config files to match the local device name; otherwise the build will fail.
  - Run the detox e2e tests.
    In a debug configuration, you need to have the React Native packager running in parallel before you start Detox tests:
    npm start
    > react-native start
    From <ORT_ROOT>/js/react_native/e2e, run Detox tests using the following command:
    iOS (Debug):
    detox test --configuration ios.sim.debug
    Android (Debug):
    detox test --configuration android.emu.debug
    To record logs for testing results, add --record-logs. Output logs and test results will be produced in the e2e/artifacts/ folder. See: Detox/logger#artifacts
    yarn bootstrap changes the package.json and yarn.lock files. Once testing is done, restore these changes to avoid an unwanted commit.
- Run Android and iOS apps.
  yarn e2e android
  yarn e2e ios
NPM Packaging
- Update the version using npm version <version> from the <ORT_ROOT>/js/react_native folder. For a dev release, use npm version <version>-dev.<subversion>
- Run npm pack and verify the NPM package contents
- Run npm publish <tgz> --dry-run to see how it's going to be published
- Run npm publish <tgz> to publish to npmjs. For a dev release, add the flag --tag dev.
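The dev release steps above can be sketched as a small helper that assembles the version string and the commands to run. This is an illustration only: `devVersion` and `devReleaseCommands` are hypothetical names, the accepted formats for the version and subversion are assumptions, and the version, subversion and tarball name in the usage line are made up.

```typescript
// Builds the dev version string in the <version>-dev.<subversion> form.
function devVersion(version: string, subversion: string): string {
  // Assumption: <version> is a plain MAJOR.MINOR.PATCH string.
  if (!/^\d+\.\d+\.\d+$/.test(version)) {
    throw new Error(`not a MAJOR.MINOR.PATCH version: ${version}`);
  }
  // Assumption: <subversion> is made of digits, letters, dots or hyphens.
  if (!/^[0-9A-Za-z.-]+$/.test(subversion)) {
    throw new Error(`unsupported subversion: ${subversion}`);
  }
  return `${version}-dev.${subversion}`;
}

// Lists the commands from the steps above for a dev release, in order.
// Nothing is executed; the caller decides how to run them.
function devReleaseCommands(version: string, subversion: string, tgz: string): string[] {
  return [
    `npm version ${devVersion(version, subversion)}`,
    "npm pack",
    `npm publish ${tgz} --dry-run`,
    `npm publish ${tgz} --tag dev`,
  ];
}

// Illustrative values only.
console.log(
  devReleaseCommands(
    "1.19.0",
    "20240801",
    "onnxruntime-react-native-1.19.0-dev.20240801.tgz",
  ).join("\n"),
);
```

Returning the commands as strings keeps the dry run as a separate, reviewable step before the real publish.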
Distribution
It should be consumable by React Native projects that use Yarn packages through yarn add onnxruntime-react-native.