onnxruntime/js

ONNX Runtime JavaScript API

This directory contains multiple NPM projects:

  • onnxruntime-common
  • onnxruntime-node
  • onnxruntime-web
  • onnxruntime-react-native

Development

This folder contains a .vscode folder with Visual Studio Code workspace configs. Opening this folder in VSCode enables code formatting and linting for the TypeScript and C/C++ source code inside it. The following files are used for code formatting and linting:

  • .vscode/**
  • package.json
  • package-lock.json
  • .eslintrc.js
  • .prettierignore
  • .prettierrc

Please follow the steps below to set up the development environment.

Prerequisites

Setup TypeScript development environment

In <ORT_ROOT>/js, run:

npm ci

This will install Prettier and ESLint for code-formatting and linting features. This is a one-time setup unless a git clean is performed or folder <ORT_ROOT>/js/node_modules is removed manually.

Using VSCode:

Use VSCode to open folder <ORT_ROOT>/js.

Make sure to open the correct folder so that VSCode loads the workspace configuration. Otherwise TypeScript and the code formatter may not work as expected.

To populate TypeScript type declarations, run npm ci in each project folder.

Run code formatter and linter manually

In <ORT_ROOT>/js, use npm run lint to run ESLint, and use npm run format to run the code formatter.

onnxruntime-common

language: typescript

dependency:

folder: <ORT_ROOT>/js/common

This project is designed to include all "common" code, which is pure JavaScript that can run in both Node.js and browsers.

Requirements

Node.js v12+ (recommended v14+)

Build

Use the following command in folder <ORT_ROOT>/js/common to install NPM packages, build TypeScript files and generate bundles:

npm ci

Distribution

It should be consumable both from projects that use NPM packages (through the node_modules folder structure generated by npm install onnxruntime-common) and from a CDN service that serves a .min.js bundle file.

Features

The following features are included in onnxruntime-common:

  • InferenceSession interfaces
  • Tensor/OnnxValue interfaces, implementation and a set of utility functions
  • Backend interfaces and a set of functions for backend registration
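The backend registration idea can be illustrated with a small, self-contained sketch. This is not the onnxruntime-common implementation: ToyBackend, registerToyBackend and resolveToyBackend are hypothetical names, and the rule that a larger priority number wins is an assumption made for the example.

```typescript
// Hypothetical, simplified model of a backend registry.
// It does not reproduce the real onnxruntime-common code.
interface ToyBackend {
  // Returns true when the backend can run in the current environment.
  isAvailable(): boolean;
}

interface RegistryEntry {
  name: string;
  backend: ToyBackend;
  priority: number;
}

const registry: RegistryEntry[] = [];

// Register a backend under a name; registering the same name again replaces the old entry.
function registerToyBackend(name: string, backend: ToyBackend, priority: number): void {
  for (let i = registry.length - 1; i >= 0; i--) {
    if (registry[i].name === name) {
      registry.splice(i, 1);
    }
  }
  registry.push({ name, backend, priority });
  // Assumption for this sketch: a larger number means a more preferred backend.
  registry.sort((a, b) => b.priority - a.priority);
}

// Return the name of the most preferred backend that reports itself as available.
function resolveToyBackend(): string | undefined {
  for (const entry of registry) {
    if (entry.backend.isAvailable()) {
      return entry.name;
    }
  }
  return undefined;
}

registerToyBackend('cpu', { isAvailable: () => true }, 1);
registerToyBackend('gpu', { isAvailable: () => false }, 10);
console.log(resolveToyBackend()); // 'cpu': the unavailable 'gpu' entry is skipped
```

Keeping availability separate from priority lets a preferred backend fall back to a slower one without the caller handling the failure.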

Generate API reference document

Use the following command in folder <ORT_ROOT>/js/common to generate the API reference document:

npx typedoc

The document will be generated in folder <ORT_ROOT>/js/common/docs.

onnxruntime-node

language: typescript/C++

dependency: onnxruntime-common, ONNXRuntime.dll

folder: <ORT_ROOT>/js/node

This project is designed to be used as an NPM package that lets Node.js users consume ONNX Runtime via the Node.js binding, in Node.js or any Node.js compatible environment.

Requirements

Node.js v12+ (recommended v14+)

Build

Build ONNX Runtime and Node.js binding

Follow instructions for building ONNX Runtime Node.js binding

Build Node.js binding only

Use the following command in folder <ORT_ROOT>/js/node to install NPM packages and build TypeScript files:

npm ci

This will download the latest pre-built ONNX Runtime binaries for the current platform.

Distribution

It should be consumable from projects that use NPM packages (through the node_modules folder structure generated by npm install onnxruntime-node).

onnxruntime-web

language: typescript

dependency: onnxruntime-common, ONNXRuntime WebAssembly

folder: <ORT_ROOT>/js/web

This project is a library for running ONNX models in browsers. It is the successor of ONNX.js.

Build

onnxruntime-web build instructions

Test

We use the commands npm test (test runner) and npm run test:e2e (E2E test) to test ONNX Runtime Web.

test runner

In folder <ORT_ROOT>/js/web,

  • Run npm test -- --help for full CLI instructions.
  • Run npm test -- <your-args> --debug to run one or more test cases.

There are multiple levels of tests for ONNXRuntime Web:

  • unit test: tests for individual components written in TypeScript. Launch unit test by:

    npm test -- unittest
    
  • model test: runs a single model. The model folder should contain one .onnx model file and one or more test case folders, each containing several input_*.pb and output_*.pb files as test data. Launch model test by:

    npm test -- model <model_folder>
    
  • op test: tests a single operator. An op test is described in a .jsonc file that specifies the operator type, its attributes and one or more test cases, each of which includes a list of expected input and output tensors. The .jsonc files are located at <ORT_ROOT>/js/web/test/data/ops. Launch op test by:

    npm test -- op <file_name>
    
  • suite test: suite test includes unit test, a list of model tests and op tests. Launch suite test by:

    npm test
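The model test folder layout described in the list above can be checked with a short helper. This is a minimal sketch, not part of the test runner: checkModelTestLayout is a hypothetical name, the folder name test_data_set_0 in the usage line is an assumption, and the check only looks for file names that start with input or output and end with .pb.

```typescript
// Check a list of paths (relative to the model folder, '/' separated) against the
// described layout: one .onnx file at the top level, plus one or more test-case
// folders that each hold input and output .pb files.
// Returns a list of problems; an empty list means the layout looks right.
function checkModelTestLayout(paths: string[]): string[] {
  const problems: string[] = [];
  let modelCount = 0;
  // Files grouped by test-case folder (the first path segment).
  const cases: { [folder: string]: string[] } = {};

  for (const path of paths) {
    const slash = path.indexOf('/');
    if (slash < 0) {
      if (/\.onnx$/.test(path)) {
        modelCount++;
      }
      continue;
    }
    const folder = path.slice(0, slash);
    (cases[folder] = cases[folder] || []).push(path.slice(slash + 1));
  }

  if (modelCount !== 1) {
    problems.push('expected exactly one .onnx file, found ' + modelCount);
  }
  const folders = Object.keys(cases);
  if (folders.length === 0) {
    problems.push('expected at least one test-case folder');
  }
  for (const folder of folders) {
    const files = cases[folder];
    if (!files.some((f) => /^input.*\.pb$/.test(f))) {
      problems.push(folder + ': no input .pb file');
    }
    if (!files.some((f) => /^output.*\.pb$/.test(f))) {
      problems.push(folder + ': no output .pb file');
    }
  }
  return problems;
}

// Usage: a well-formed folder yields no problems.
console.log(
  checkModelTestLayout(['model.onnx', 'test_data_set_0/input_0.pb', 'test_data_set_0/output_0.pb']),
);
```

The helper takes a list of paths rather than reading the disk, so it can be fed from any directory walker.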
    

E2E test

The E2E test covers end-to-end package consumption. In this test, NPM packages for onnxruntime-common and onnxruntime-web are generated and installed into a clean folder. Then a simple mocha test is run to make sure the packages can be consumed correctly.

To launch E2E test:

npm run test:e2e

Debugging

Debugging TypeScript on Desktop/Chrome

To debug the code from test-runner on Chrome:

  • Launch npm test -- <your_args> --debug. It opens an instance of Chrome browser.
  • In the open Chrome browser, click the DEBUG button on the top-right of the page.
  • In VSCode, click [side bar]->Run and Debug->select [Attach to Chrome]->click [Start Debugging] to attach.
  • Put breakpoints in the source code, and refresh the page to reload.

Debugging TypeScript on iOS/Safari

To debug on an Apple iOS device, please refer to the following steps:

  • Install RemoteDebug iOS WebKit Adapter by following its instructions.
  • Launch the adapter from the command line: remotedebug_ios_webkit_adapter --port=9000.
  • In VSCode, select debug configuration Remote Browser via Webkit Adaptor.
  • Follow the steps above to debug.

Debugging TypeScript on Android/Chrome

To debug on an Android device, please refer to the following steps:

  • Install Android SDK Platform Tools and make sure adb is ready to use.
  • Follow instructions in Remote Debugging on Android to launch adb. Make sure to use port 9000 so that the existing debug configuration works.
  • In VSCode, select debug configuration Remote Browser via Webkit Adaptor.
  • Follow the steps above to debug.

Debugging C/C++ for ONNX Runtime WebAssembly

To debug C/C++ code for ONNX Runtime WebAssembly, you need to build ONNX Runtime with debug info (see Build).

Debugging C/C++ code in WebAssembly is not yet supported in VSCode. Please follow this instruction to debug in the browser devtools using the extension C/C++ DevTools Support (DWARF).

Generating Document

This section describes how to generate the latest document for ONNX Runtime Web.

The document contains information about the operators that the WebGL backend supports. It should align with the operator resolve rules in code and the spec definition from ONNX.

In folder <ORT_ROOT>/js/web, use command npm run build:doc to generate the latest documents.

Distribution

It should be consumable both from projects that use NPM packages (through the node_modules folder structure generated by npm install onnxruntime-web) and from a CDN service that serves an ort.min.js file and one or more .wasm files.

Reduced WebAssembly artifacts

By default, the WebAssembly artifacts from the onnxruntime-web package allow use of both standard ONNX models (.onnx) and ORT format models (.ort). There is an option to use a minimal build of ONNX Runtime to reduce the binary size, which only supports ORT format models. See also ORT format model for more information.
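The difference between the two kinds of WebAssembly artifacts can be written down as a small lookup. This is only a sketch: WasmBuild and canLoadModel are hypothetical names, and judging the model format by file extension is an assumption made to keep the example short.

```typescript
// Hypothetical labels for the two kinds of WebAssembly artifacts described above.
type WasmBuild = 'full' | 'minimal';

// Model formats each build can load, as stated in the text.
const SUPPORTED_EXTENSIONS: Record<WasmBuild, string[]> = {
  full: ['.onnx', '.ort'],
  minimal: ['.ort'],
};

// Assumption: the file extension reflects the model format.
function canLoadModel(build: WasmBuild, modelPath: string): boolean {
  const dot = modelPath.lastIndexOf('.');
  const extension = dot < 0 ? '' : modelPath.slice(dot).toLowerCase();
  return SUPPORTED_EXTENSIONS[build].indexOf(extension) >= 0;
}

console.log(canLoadModel('minimal', 'model.onnx')); // false: the minimal build needs an .ort model
```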

Reduced JavaScript bundle file size

By default, the main bundle file ort.all.min.js of ONNX Runtime Web contains all features. However, it is over 500 KB, and scenarios that do not use every feature may call for a smaller bundle. The following table lists all available bundles and the features each supports (O: supported, X: not supported).

bundle file name     file size   file size (gzipped)   WebGL   WASM   WebGPU
ort.all.min.js       682 KB      166 KB                O       O      O
ort.min.js           434 KB      102 KB                O       O      X
ort.webgl.min.js     411 KB      93.6 KB               O       X      X
ort.webgpu.min.js    293 KB      80.1 KB               X       O      O
ort.wasm.min.js      46 KB       14.8 KB               X       O      X
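Choosing a bundle from the table above amounts to picking the smallest file that covers the required features. The sketch below encodes the table as data; Feature, Bundle and smallestBundleFor are hypothetical names introduced for illustration, and the sizes are the uncompressed values copied from the table.

```typescript
type Feature = 'webgl' | 'wasm' | 'webgpu';

interface Bundle {
  file: string;
  sizeKB: number; // uncompressed size from the table
  features: Feature[];
}

// Values copied from the table above.
const BUNDLES: Bundle[] = [
  { file: 'ort.all.min.js', sizeKB: 682, features: ['webgl', 'wasm', 'webgpu'] },
  { file: 'ort.min.js', sizeKB: 434, features: ['webgl', 'wasm'] },
  { file: 'ort.webgl.min.js', sizeKB: 411, features: ['webgl'] },
  { file: 'ort.webgpu.min.js', sizeKB: 293, features: ['wasm', 'webgpu'] },
  { file: 'ort.wasm.min.js', sizeKB: 46, features: ['wasm'] },
];

// Return the smallest bundle that supports every required feature.
// ort.all.min.js supports everything, so there is always at least one candidate.
function smallestBundleFor(required: Feature[]): string {
  const candidates = BUNDLES.filter((bundle) =>
    required.every((feature) => bundle.features.indexOf(feature) >= 0),
  );
  candidates.sort((a, b) => a.sizeKB - b.sizeKB);
  return candidates[0].file;
}

console.log(smallestBundleFor(['webgpu'])); // ort.webgpu.min.js
console.log(smallestBundleFor(['webgl', 'webgpu'])); // ort.all.min.js
```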

Build ONNX Runtime as a WebAssembly static library

When --build_wasm_static_lib is given instead of --build_wasm, the build produces a WebAssembly static library of ONNX Runtime and creates a libonnxruntime_webassembly.a file in the build output directory. This option is useful for developers who have their own C/C++ project and want to build it as WebAssembly together with ONNX Runtime. The static library is not published by a pipeline, so it must be built manually when needed.

onnxruntime-react-native

language: typescript, java, objective-c

dependency: onnxruntime-common

folder: <ORT_ROOT>/js/react_native

This project provides an ONNX Runtime React Native JavaScript library to run ONNX models in React Native Android and iOS apps.

Requirements

  • Yarn
  • Android SDK and NDK, which can be installed via Android Studio or sdkmanager command line tool
  • A Mac computer with the latest macOS
  • Xcode
  • CMake
  • Python 3

Models with ORT format

Prior to ORT v1.13, the ONNX Runtime React Native package utilized the ONNX Runtime Mobile package, which required an ONNX model to be converted to ORT format. Follow these instructions to convert an ONNX model to ORT format. Note that the ONNX Runtime Mobile package includes a reduced set of operators and types, so not all models are supported. See here for the list of supported operators and types.

From ORT v1.13 onwards, the 'full' ONNX Runtime package is used. It supports both ONNX and ORT format models, and all operators and types.

From ORT v1.19 onwards, the ONNX Runtime Mobile packages are no longer published.
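The version thresholds above can be expressed as a small lookup. This is a sketch only: compareVersions and reactNativeModelSupport are hypothetical helpers, and the comparison assumes plain numeric versions such as 1.13.0 (no pre-release suffixes).

```typescript
// Compare dotted numeric versions segment by segment, so that '1.9.0' sorts before '1.13.0'.
// Returns -1, 0 or 1.
function compareVersions(a: string, b: string): number {
  const partsA = a.split('.').map(Number);
  const partsB = b.split('.').map(Number);
  const length = Math.max(partsA.length, partsB.length);
  for (let i = 0; i < length; i++) {
    const diff = (partsA[i] || 0) - (partsB[i] || 0);
    if (diff !== 0) {
      return diff < 0 ? -1 : 1;
    }
  }
  return 0;
}

interface ModelSupport {
  onnxFormat: boolean; // can standard .onnx models be used?
  ortFormat: boolean; // can ORT format models be used?
  mobilePackagesPublished: boolean; // are ONNX Runtime Mobile packages still published?
}

function reactNativeModelSupport(ortVersion: string): ModelSupport {
  return {
    // Before v1.13 the package relied on ONNX Runtime Mobile, which required ORT format models.
    onnxFormat: compareVersions(ortVersion, '1.13') >= 0,
    ortFormat: true,
    // From v1.19 onwards the ONNX Runtime Mobile packages are no longer published.
    mobilePackagesPublished: compareVersions(ortVersion, '1.19') < 0,
  };
}

console.log(reactNativeModelSupport('1.12.1').onnxFormat); // false
```

Comparing segments numerically rather than as strings matters here, because a string comparison would place 1.9 after 1.13.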

Build

  1. Install NPM packages for ONNX Runtime common JavaScript library and required React Native JavaScript libraries

    • in <ORT_ROOT>/js/, run npm ci.
    • in <ORT_ROOT>/js/common/, run npm ci.
    • in <ORT_ROOT>/js/react_native/, run yarn.
  2. Acquire or build the Android ONNX Runtime package

    1. To use a published Android ONNX Runtime package from Maven, go to step 5.

    2. Set up an Android build environment using these instructions. Note that the dependencies are quite convoluted, so using the specified JDK and Gradle versions is important.

    3. In <ORT_ROOT>, run the Python script below to build the ONNX Runtime Android archive file. On a Windows machine, this requires an admin account to build.

    You can build a 'full' package that supports all operators and types, or a reduced size package that supports a limited set of operators and types based on your model/s to minimize the binary size. See here for information about how the reduced build works, including creating the configuration file using your model/s. The instructions here show how to build a 'full' package.

    python tools/ci_build/github/android/build_aar_package.py tools/ci_build/github/android/default_full_aar_build_settings.json --config Release --android_sdk_path <ANDROID_SDK_PATH> --android_ndk_path <ANDROID_NDK_PATH> --build_dir <BUILD_DIRECTORY>
    
    4. Move the generated ONNX Runtime Android archive file to <ORT_ROOT>/js/react_native/android/libs/.

      Copy <BUILD_DIRECTORY>/aar_out/Release/com/microsoft/onnxruntime/onnxruntime-android/<version>/onnxruntime-android-<version>.aar into the <ORT_ROOT>/js/react_native/android/libs directory.

    5. To verify, open the Android Emulator and run this command from <ORT_ROOT>/js/react_native/android:

      ./gradlew connectedDebugAndroidTest
      
  3. Build iOS ONNX Runtime package

    1. To use the published C/C++ ONNX Runtime package from CocoaPods, skip all steps below.

    2. Set up iOS build environment using these instructions.

    3. Build a fat ONNX Runtime Framework for iOS and iOS simulator from <ORT_ROOT> using this command:

      python tools/ci_build/github/apple/build_apple_framework.py tools/ci_build/github/apple/default_full_apple_framework_build_settings.json --config Release
      

      The build creates Headers, LICENSE, and onnxruntime.xcframework in the build/iOS_framework/framework_out directory. From the framework_out directory, create an archive file named onnxruntime-c.zip and copy it to the <ORT_ROOT>/js/react_native/local_pods directory.

      zip -r onnxruntime-c.zip .
      
    4. To verify, open the iOS Simulator and run the commands below from <ORT_ROOT>/js/react_native/ios. Change the destination argument as needed to specify a running iOS Simulator.

      pod install
      xcodebuild test -workspace OnnxruntimeModule.xcworkspace -scheme OnnxruntimeModuleTest -destination 'platform=iOS Simulator,OS=latest,name=iPhone 13'
      
  4. Test Android and iOS apps. In Windows, open Android Emulator first.

    debug.keystore must be generated in advance for the Android example.

    keytool -genkey -v -keystore <ORT_ROOT>/js/react_native/e2e/android/debug.keystore -alias androiddebugkey -storepass android -keypass android -keyalg RSA -keysize 2048 -validity 999999 -dname "CN=Android Debug,O=Android,C=US"
    

    From <ORT_ROOT>/js/react_native, run:

    yarn bootstrap
    

    When testing with a custom built ONNX Runtime Android package, copy <BUILD_DIRECTORY>/aar_out/MinSizeRel/com/microsoft/onnxruntime/onnxruntime-android/<version>/onnxruntime-android-<version>.aar into the <ORT_ROOT>/js/react_native/e2e/android/app/libs directory.

    When testing with a custom built ONNX Runtime iOS package, copy onnxruntime-c.zip into the <ORT_ROOT>/js/react_native/local_pods directory.

  • Run E2E Testing with Detox framework

    When testing with the integrated Detox framework for Android and iOS e2e apps:

    • Detox prerequisites:

      Install detox command line tools:

      yarn global add detox-cli
      

      Install applesimutils, which Detox requires to work with iOS simulators. (Requires a macOS device)

      brew tap wix/brew
      brew install applesimutils
      

      Main Detox project files:

      • .detoxrc.js - Detox config file;
      • e2e/jest.config.js - Jest configuration;
      • e2e/OnnxruntimeModuleExample.test.js - initial react native onnxruntimemodule e2e detox test.
    • Build the detox e2e testing app.

      From <ORT_ROOT>/js/react_native/e2e, run the command to build the e2e testing app. Before that, make sure an Android emulator or iOS simulator is running locally.

      iOS (Debug):

      detox build --configuration ios.sim.debug
      

      Android (Debug):

      detox build --configuration android.emu.debug
      
      • Note: If the names of the local Android/iOS testing devices do not match the defaults in the .detoxrc.js file, change the device names in the config file to match the local devices. Otherwise the build will fail.
    • Run the detox e2e tests.

      In a debug configuration, you need to have React Native packager running in parallel before you start Detox tests:

      npm start
      
      > react-native start
      

      From <ORT_ROOT>/js/react_native/e2e, run Detox tests using the following command:

      iOS (Debug):

      detox test --configuration ios.sim.debug
      

      Android (Debug):

      detox test --configuration android.emu.debug
      

      To record logs for testing results, add --record-logs. Output logs and test results will be produced in the e2e/artifacts/ folder. See: Detox/logger#artifacts

      yarn bootstrap changes the package.json and yarn.lock files. Once testing is done, restore these files to avoid committing unwanted changes.

  5. Run Android and iOS apps.

    yarn e2e android
    yarn e2e ios
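The build steps above quote the location of the generated Android archive twice, once under Release and once under MinSizeRel. The sketch below builds that path from its parts so the configuration segment is explicit. androidAarPath and aarCopyTargets are hypothetical helpers, and treating the segment after aar_out as the build configuration is an assumption based on those two quoted paths.

```typescript
// Configurations that appear in the paths quoted in the build steps.
type BuildConfig = 'Release' | 'MinSizeRel';

// Location of the generated .aar inside the build directory, following the quoted pattern:
// <BUILD_DIRECTORY>/aar_out/<config>/com/microsoft/onnxruntime/onnxruntime-android/<version>/onnxruntime-android-<version>.aar
function androidAarPath(buildDir: string, config: BuildConfig, version: string): string {
  return [
    buildDir,
    'aar_out',
    config,
    'com/microsoft/onnxruntime/onnxruntime-android',
    version,
    'onnxruntime-android-' + version + '.aar',
  ].join('/');
}

// Directories the steps above copy the archive into.
function aarCopyTargets(ortRoot: string): { library: string; e2eApp: string } {
  return {
    library: ortRoot + '/js/react_native/android/libs',
    e2eApp: ortRoot + '/js/react_native/e2e/android/app/libs',
  };
}

// Usage with placeholder values; the version string is an example, not a required value.
console.log(androidAarPath('/tmp/build', 'Release', '1.21.0'));
console.log(aarCopyTargets('/src/onnxruntime').e2eApp);
```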
    

NPM Packaging

  1. Update the version using npm version <version> from the <ORT_ROOT>/js/react_native folder. For a dev release, use npm version <version>-dev.<subversion>

  2. Run npm pack and verify NPM package contents

  3. Run npm publish <tgz> --dry-run to see how it's going to be published

  4. Run npm publish <tgz> to publish to npmjs. For a dev release, add the flag --tag dev.
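The four packaging steps can be summarized as a function that produces the command sequence. This is a sketch, not a release script used by the project: planRelease and its parameters are hypothetical, and the tarball name is passed in rather than derived.

```typescript
interface ReleasePlan {
  version: string;
  commands: string[];
}

// Build the command sequence for the steps above.
// Pass devSubversion to get a dev release: <version>-dev.<subversion>, published with --tag dev.
function planRelease(version: string, tarball: string, devSubversion?: string): ReleasePlan {
  const isDev = devSubversion !== undefined;
  const fullVersion = isDev ? version + '-dev.' + devSubversion : version;
  return {
    version: fullVersion,
    commands: [
      'npm version ' + fullVersion,
      'npm pack',
      'npm publish ' + tarball + ' --dry-run',
      'npm publish ' + tarball + (isDev ? ' --tag dev' : ''),
    ],
  };
}

// Usage with placeholder values.
for (const command of planRelease('1.21.0', 'package.tgz', '20241023').commands) {
  console.log(command);
}
```

Returning the commands as strings keeps the sketch free of side effects, so the plan can be reviewed before anything is published.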

Distribution

It should be consumable by React Native projects that use Yarn packages, through yarn add onnxruntime-react-native.