onnxruntime/cmake
Yulong Wang 7a8fa12850
Add implementation of WebGPU EP (#22591)
### Description

This PR adds the actual implementation of the WebGPU EP based on
https://github.com/microsoft/onnxruntime/pull/22318.

This change includes the following:

<details>
<summary><b>core framework of WebGPU EP</b></summary>

  - WebGPU EP factory classes for:
    - handling WebGPU options
    - creating WebGPU EP instance
    - creating WebGPU context
  - WebGPU Execution Provider classes
    - GPU Buffer allocator
    - data transfer
  - Buffer management classes
    - Buffer Manager
    - BufferCacheManager
      - DisabledCacheManager
      - SimpleCacheManager
      - LazyReleaseCacheManager
      - BucketCacheManager
  - Program classes
    - Program (base)
    - Program Cache Key
    - Program Manager
  - Shader helper classes
    - Shader Helper
    - ShaderIndicesHelper
    - ShaderVariableHelper
  - Utils
    - GPU Query based profiler
    - compute context
    - string utils
  - Misc
    - Python binding webgpu support (basic)
 
</details>

<details>
<summary><b>Kernel implementation</b></summary>


  - onnx.ai (default opset):
    - Elementwise (math): Abs, Neg, Floor, Ceil, Reciprocal, Sqrt, Exp, Erf,
      Log, Sin, Cos, Tan, Asin, Acos, Atan, Sinh, Cosh, Asinh, Acosh, Atanh,
      Tanh, Not, Cast
    - Elementwise (activation): Sigmoid, HardSigmoid, Clip, Elu, Relu,
      LeakyRelu, ThresholdedRelu, Gelu
    - Binary (math): Add, Sub, Mul, Div, Pow, Equal, Greater,
      GreaterOrEqual, Less, LessOrEqual
    - (Tensors): Shape, Reshape, Squeeze, Unsqueeze
    - Where
    - Transpose
    - Concat
    - Expand
    - Gather
    - Tile
    - Range
    - LayerNormalization
  - com.microsoft
    - FastGelu
    - MatMulNBits
    - MultiHeadAttention
    - RotaryEmbedding
    - SkipLayerNormalization
    - LayerNormalization
    - SimplifiedLayerNormalization
    - SkipSimplifiedLayerNormalization

</details>

<details>
<summary><b>Build, test and CI pipeline integration</b></summary>

  - build works for Windows, macOS and iOS
  - supports `onnxruntime_test_all` and the Python node tests
  - added a new unit test for the `--use_external_dawn` build flag
  - updated the macOS pipeline to build with WebGPU support
  - added a new pipeline for WebGPU on Windows

</details>

This change does not include:

- Node.js binding support for WebGPU (will be a separate PR)
2024-10-29 18:29:40 -07:00
external Add implementation of WebGPU EP (#22591) 2024-10-29 18:29:40 -07:00
patches Add implementation of WebGPU EP (#22591) 2024-10-29 18:29:40 -07:00
tensorboard Improve dependency management (#13523) 2022-12-01 09:51:59 -08:00
CMakeLists.txt Add implementation of WebGPU EP (#22591) 2024-10-29 18:29:40 -07:00
CMakePresets.json Create CMake option `onnxruntime_USE_VCPKG` (#21348) 2024-09-10 16:39:27 -07:00
CMakeSettings.json Fork the WinML APIs into the Microsoft namespace (#3503) 2020-04-17 06:18:54 -07:00
EnableVisualStudioCodeAnalysis.props Fix SDL warnings in CPU EP (#9975) 2021-12-19 20:54:29 -08:00
Info.plist.in Enable build dynamic framework for macOS/iOS (#7343) 2021-04-15 16:47:53 -07:00
Sdl.ruleset Add a Github workflow for Prefast (#15763) 2023-05-03 11:42:51 -07:00
adjust_global_compile_flags.cmake [JS/WebGPU] Support WASM64 (#21836) 2024-10-24 20:21:51 -07:00
arm64x.cmake Dev/mookerem/arm64x update (#20536) 2024-05-07 12:50:38 -07:00
codeconv.runsettings CMake changes (#2961) 2020-02-03 19:33:14 -08:00
deps.txt Remove nsync (#20413) 2024-10-21 15:32:14 -07:00
deps_update_and_upload.py Update google benchmark to 1.8.3. (#19734) 2024-03-01 11:01:58 -08:00
gdk_toolchain.cmake Enable building with a GDK (#11126) 2022-04-07 15:06:31 -07:00
hip_fatbin_insert [MIGraphX EP/ ROCm EP] add gfx1200, gfx1201 to CMAKE_HIP_ARCHITECTURES (#22348) 2024-10-11 17:31:36 -07:00
libonnxruntime.pc.cmake.in cmake: support install target with generated pkg-config file (#7076) 2021-03-22 19:36:31 -07:00
linux_arm32_crosscompile_toolchain.cmake Add a build validation for Linux ARM64 cross-compile (#18200) 2023-11-08 13:03:18 -08:00
linux_arm64_crosscompile_toolchain.cmake Add a build validation for Linux ARM64 cross-compile (#18200) 2023-11-08 13:03:18 -08:00
maccatalyst_prepare_objects_for_prelink.py Support xcframework for mac catalyst builds. (#19534) 2024-03-20 10:55:19 -07:00
nuget_helpers.cmake Update nuget.exe used in WindowsAI nuget packaging so `readme` property is supported. (#22141) 2024-09-19 19:06:47 +10:00
onnxruntime.cmake Fix Xcode 16 iOS build issues (#22379) 2024-10-14 09:24:38 -07:00
onnxruntime_codegen_tvm.cmake Use target name for flatbuffers (#13991) 2022-12-20 11:44:02 -08:00
onnxruntime_common.cmake Enable QNN HTP support for Node (#20576) 2024-05-09 13:11:07 -07:00
onnxruntime_compile_triton_kernel.cmake [CUDA] Add SparseAttention operator for Phi-3-small (#20216) 2024-04-30 09:06:29 -07:00
onnxruntime_config.h.in Get build working on Xcode 16 (#22168) 2024-09-24 08:33:03 -07:00
onnxruntime_csharp.cmake Refactor training build options (#13964) 2023-01-03 13:28:16 -08:00
onnxruntime_flatbuffers.cmake Rework some external targets to ease building with `-DFETCHCONTENT_FULLY_DISCONNECTED=ON` (#15323) 2023-04-03 17:45:12 -07:00
onnxruntime_framework.cmake Adding CUDNN Frontend and use for CUDA NN Convolution (#19470) 2024-08-02 15:16:42 -07:00
onnxruntime_framework.natvis [C#, CPP] Introduce Float16/BFloat16 support and tests for C#, C++ (#16506) 2023-07-14 10:46:52 -07:00
onnxruntime_fuzz_test.cmake [Fuzzer] Add two new ORT libfuzzer (Linux clang support for now) (#22055) 2024-09-12 11:50:34 -07:00
onnxruntime_graph.cmake [Apple framework] Fix minimal build with training enabled. (#19858) 2024-03-12 11:33:30 -07:00
onnxruntime_ios.toolchain.cmake Support visionos build (#20365) 2024-04-23 18:15:07 -07:00
onnxruntime_java.cmake Remove deprecated "mobile" packages (#20941) 2024-06-07 16:20:32 -05:00
onnxruntime_java_unittests.cmake [Java] Add API for appending QNN EP (#22208) 2024-10-01 10:18:04 -07:00
onnxruntime_kernel_explorer.cmake [ROCm] prefer hip interfaces over roc during hipify (#22394) 2024-10-14 20:34:03 -07:00
onnxruntime_lora.cmake Multi-Lora support (#22046) 2024-09-30 15:59:07 -07:00
onnxruntime_mlas.cmake Remove nsync (#20413) 2024-10-21 15:32:14 -07:00
onnxruntime_nodejs.cmake Initial WebGPU EP checkin (#22318) 2024-10-08 16:10:46 -07:00
onnxruntime_objectivec.cmake Initial WebGPU EP checkin (#22318) 2024-10-08 16:10:46 -07:00
onnxruntime_opschema_lib.cmake Use target name for flatbuffers (#13991) 2022-12-20 11:44:02 -08:00
onnxruntime_optimizer.cmake Flash attention recompute (#20603) 2024-05-21 13:38:19 +08:00
onnxruntime_providers.cmake Initial WebGPU EP checkin (#22318) 2024-10-08 16:10:46 -07:00
onnxruntime_providers_acl.cmake Split onnxruntime_providers.cmake to multiple (#17853) 2023-10-09 20:33:44 -07:00
onnxruntime_providers_armnn.cmake Split onnxruntime_providers.cmake to multiple (#17853) 2023-10-09 20:33:44 -07:00
onnxruntime_providers_azure.cmake Split onnxruntime_providers.cmake to multiple (#17853) 2023-10-09 20:33:44 -07:00
onnxruntime_providers_cann.cmake Remove nsync (#20413) 2024-10-21 15:32:14 -07:00
onnxruntime_providers_coreml.cmake Fix Objective-C static analysis warnings. (#20417) 2024-04-24 11:48:29 -07:00
onnxruntime_providers_cpu.cmake Initial WebGPU EP checkin (#22318) 2024-10-08 16:10:46 -07:00
onnxruntime_providers_cuda.cmake Remove nsync (#20413) 2024-10-21 15:32:14 -07:00
onnxruntime_providers_dml.cmake Delay load dxcore.dll in addition to ext-ms-win-dxcore-l1-1-0.dll (#18913) 2023-12-26 12:33:42 -08:00
onnxruntime_providers_dnnl.cmake Remove nsync (#20413) 2024-10-21 15:32:14 -07:00
onnxruntime_providers_js.cmake Split onnxruntime_providers.cmake to multiple (#17853) 2023-10-09 20:33:44 -07:00
onnxruntime_providers_migraphx.cmake Remove nsync (#20413) 2024-10-21 15:32:14 -07:00
onnxruntime_providers_nnapi.cmake Make partitioning utils QDQ aware so it does not break up QDQ node units (#19723) 2024-03-12 10:55:49 +10:00
onnxruntime_providers_openvino.cmake Ovep develop lnl 1.2 (#22424) 2024-10-14 12:10:01 -07:00
onnxruntime_providers_qnn.cmake Make partitioning utils QDQ aware so it does not break up QDQ node units (#19723) 2024-03-12 10:55:49 +10:00
onnxruntime_providers_rknpu.cmake Split onnxruntime_providers.cmake to multiple (#17853) 2023-10-09 20:33:44 -07:00
onnxruntime_providers_rocm.cmake Remove nsync (#20413) 2024-10-21 15:32:14 -07:00
onnxruntime_providers_tensorrt.cmake Remove nsync (#20413) 2024-10-21 15:32:14 -07:00
onnxruntime_providers_tvm.cmake Split onnxruntime_providers.cmake to multiple (#17853) 2023-10-09 20:33:44 -07:00
onnxruntime_providers_vitisai.cmake [VitisAI] remove wrong error msg, required by Microsoft (#21715) 2024-08-21 21:10:28 -07:00
onnxruntime_providers_vsinpu.cmake Remove nsync (#20413) 2024-10-21 15:32:14 -07:00
onnxruntime_providers_webgpu.cmake Add implementation of WebGPU EP (#22591) 2024-10-29 18:29:40 -07:00
onnxruntime_providers_webnn.cmake Split onnxruntime_providers.cmake to multiple (#17853) 2023-10-09 20:33:44 -07:00
onnxruntime_providers_xnnpack.cmake Make partitioning utils QDQ aware so it does not break up QDQ node units (#19723) 2024-03-12 10:55:49 +10:00
onnxruntime_python.cmake Initial WebGPU EP checkin (#22318) 2024-10-08 16:10:46 -07:00
onnxruntime_rocm_hipify.cmake [ROCm] redo hipify of version controlled files (#22449) 2024-10-18 12:40:54 -07:00
onnxruntime_session.cmake Multi-Lora support (#22046) 2024-09-30 15:59:07 -07:00
onnxruntime_snpe_provider.cmake Use target name for flatbuffers (#13991) 2022-12-20 11:44:02 -08:00
onnxruntime_training.cmake Multi-Lora support (#22046) 2024-09-30 15:59:07 -07:00
onnxruntime_unittests.cmake Add implementation of WebGPU EP (#22591) 2024-10-29 18:29:40 -07:00
onnxruntime_util.cmake Improve dependency management (#13523) 2022-12-01 09:51:59 -08:00
onnxruntime_visionos.toolchain.cmake Support visionos build (#20365) 2024-04-23 18:15:07 -07:00
onnxruntime_webassembly.cmake [JS/WebGPU] Support WASM64 (#21836) 2024-10-24 20:21:51 -07:00
precompiled_header.cmake Fix Windows Store build (#8753) 2021-08-23 11:19:03 -07:00
riscv64.toolchain.cmake Enable RISC-V 64-bit Cross-Compiling Support for ONNX Runtime on Linux (#19238) 2024-01-24 16:27:05 -08:00
set_winapi_family_desktop.h Fix WCOS/Win32 linking bugs (#3126) 2020-03-19 08:52:40 -07:00
target_delayload.cmake Remove Windows Store specific code 2022-03-17 23:38:14 -07:00
uwp_stubs.h Run clang-format in CI (#15524) 2023-04-18 09:26:58 -07:00
vcpkg-configuration.json Auto regenerate LORA's fbs files (#22313) 2024-10-04 10:01:19 -07:00
vcpkg.json Create CMake option `onnxruntime_USE_VCPKG` (#21348) 2024-09-10 16:39:27 -07:00
wcos_rules_override.cmake Stop using apiset in OneCore build: use onecoreuap.lib instead of onecoreuap_apiset.lib (#19632) 2024-02-23 22:31:57 -08:00
winml.cmake Change libonnxruntime.so's SONAME: remove the minor and patch version. (#21339) 2024-07-15 14:21:34 -07:00
winml_cppwinrt.cmake Fix Windows Store build (#8753) 2021-08-23 11:19:03 -07:00
winml_sdk_helpers.cmake Merge windowsai (winml layering) into master (#2956) 2020-02-04 17:12:19 -08:00
winml_unittests.cmake Multi-Lora support (#22046) 2024-09-30 15:59:07 -07:00