onnxruntime

History

Tianlei Wu 72186bbb71 [CUDA] Build nhwc ops by default (#22648 )	2024-11-06 09:54:55 -08:00

### Description
* Build CUDA NHWC ops by default.
* Deprecate `--enable_cuda_nhwc_ops` in build.py and add a `--disable_cuda_nhwc_ops` option.

Note that this requires cuDNN 9.x. If you build with cuDNN 8, NHWC ops will be disabled automatically.

### Motivation and Context
In general, NHWC is faster than NCHW for convolution on NVIDIA GPUs with Tensor Cores, and this could improve performance for vision models. This is the first step toward preferring NHWC for CUDA in the 1.21 release. The next step is to run tests on popular vision models. If it helps on most models and devices, set `prefer_nhwc=1` as the default CUDA provider option.

..
external	Add implementation of WebGPU EP (#22591 )	2024-10-29 18:29:40 -07:00
patches	Add implementation of WebGPU EP (#22591 )	2024-10-29 18:29:40 -07:00
tensorboard	Improve dependency management (#13523 )	2022-12-01 09:51:59 -08:00
CMakeLists.txt	[CUDA] Build nhwc ops by default (#22648 )	2024-11-06 09:54:55 -08:00
CMakePresets.json	Create CMake option `onnxruntime_USE_VCPKG` (#21348 )	2024-09-10 16:39:27 -07:00
CMakeSettings.json	Fork the WinML APIs into the Microsoft namespace (#3503 )	2020-04-17 06:18:54 -07:00
EnableVisualStudioCodeAnalysis.props	Fix SDL warnings in CPU EP (#9975 )	2021-12-19 20:54:29 -08:00
Info.plist.in	Enable build dynamic framework for macOS/iOS (#7343 )	2021-04-15 16:47:53 -07:00
Sdl.ruleset	Add a Github workflow for Prefast (#15763 )	2023-05-03 11:42:51 -07:00
adjust_global_compile_flags.cmake	[JS/WebGPU] Support WASM64 (#21836 )	2024-10-24 20:21:51 -07:00
arm64x.cmake	Dev/mookerem/arm64x update (#20536 )	2024-05-07 12:50:38 -07:00
codeconv.runsettings	CMake changes (#2961 )	2020-02-03 19:33:14 -08:00
deps.txt	Remove nsync (#20413 )	2024-10-21 15:32:14 -07:00
deps_update_and_upload.py	Update google benchmark to 1.8.3. (#19734 )	2024-03-01 11:01:58 -08:00
gdk_toolchain.cmake	Enable building with a GDK (#11126 )	2022-04-07 15:06:31 -07:00
hip_fatbin_insert	[MIGraphX EP/ ROCm EP] add gfx1200, gfx1201 to CMAKE_HIP_ARCHITECTURES (#22348 )	2024-10-11 17:31:36 -07:00
libonnxruntime.pc.cmake.in	cmake: support install target with generated pkg-config file (#7076 )	2021-03-22 19:36:31 -07:00
linux_arm32_crosscompile_toolchain.cmake	Add a build validation for Linux ARM64 cross-compile (#18200 )	2023-11-08 13:03:18 -08:00
linux_arm64_crosscompile_toolchain.cmake	Add a build validation for Linux ARM64 cross-compile (#18200 )	2023-11-08 13:03:18 -08:00
maccatalyst_prepare_objects_for_prelink.py	Support xcframework for mac catalyst builds. (#19534 )	2024-03-20 10:55:19 -07:00
nuget_helpers.cmake	Update nuget.exe used in WindowsAI nuget packaging so `readme` property is supported. (#22141 )	2024-09-19 19:06:47 +10:00
onnxruntime.cmake	Refactor the cmake code that is related to delay loading (#22646 )	2024-11-04 16:30:50 -08:00
onnxruntime_codegen_tvm.cmake	Use target name for flatbuffers (#13991 )	2022-12-20 11:44:02 -08:00
onnxruntime_common.cmake	Enable QNN HTP support for Node (#20576 )	2024-05-09 13:11:07 -07:00
onnxruntime_compile_triton_kernel.cmake	[CUDA] Add SparseAttention operator for Phi-3-small (#20216 )	2024-04-30 09:06:29 -07:00
onnxruntime_config.h.in	Get build working on Xcode 16 (#22168 )	2024-09-24 08:33:03 -07:00
onnxruntime_csharp.cmake	Refactor training build options (#13964 )	2023-01-03 13:28:16 -08:00
onnxruntime_flatbuffers.cmake	Rework some external targets to ease building with `-DFETCHCONTENT_FULLY_DISCONNECTED=ON` (#15323 )	2023-04-03 17:45:12 -07:00
onnxruntime_framework.cmake	Adding CUDNN Frontend and use for CUDA NN Convolution (#19470 )	2024-08-02 15:16:42 -07:00
onnxruntime_framework.natvis	[C#, CPP] Introduce Float16/BFloat16 support and tests for C#, C++ (#16506 )	2023-07-14 10:46:52 -07:00
onnxruntime_fuzz_test.cmake	[Fuzzer] Add two new ORT libfuzzer (Linux clang support for now) (#22055 )	2024-09-12 11:50:34 -07:00
onnxruntime_graph.cmake	[Apple framework] Fix minimal build with training enabled. (#19858 )	2024-03-12 11:33:30 -07:00
onnxruntime_ios.toolchain.cmake	Support visionos build (#20365 )	2024-04-23 18:15:07 -07:00
onnxruntime_java.cmake	Remove deprecated "mobile" packages (#20941 )	2024-06-07 16:20:32 -05:00
onnxruntime_java_unittests.cmake	[Java] Add API for appending QNN EP (#22208 )	2024-10-01 10:18:04 -07:00
onnxruntime_kernel_explorer.cmake	[ROCm] prefer hip interfaces over roc during hipify (#22394 )	2024-10-14 20:34:03 -07:00
onnxruntime_lora.cmake	Multi-Lora support (#22046 )	2024-09-30 15:59:07 -07:00
onnxruntime_mlas.cmake	Remove nsync (#20413 )	2024-10-21 15:32:14 -07:00
onnxruntime_nodejs.cmake	Initial WebGPU EP checkin (#22318 )	2024-10-08 16:10:46 -07:00
onnxruntime_objectivec.cmake	Initial WebGPU EP checkin (#22318 )	2024-10-08 16:10:46 -07:00
onnxruntime_opschema_lib.cmake	Use target name for flatbuffers (#13991 )	2022-12-20 11:44:02 -08:00
onnxruntime_optimizer.cmake	Flash attention recompute (#20603 )	2024-05-21 13:38:19 +08:00
onnxruntime_providers.cmake	Initial WebGPU EP checkin (#22318 )	2024-10-08 16:10:46 -07:00
onnxruntime_providers_acl.cmake	Split onnxruntime_providers.cmake to multiple (#17853 )	2023-10-09 20:33:44 -07:00
onnxruntime_providers_armnn.cmake	Split onnxruntime_providers.cmake to multiple (#17853 )	2023-10-09 20:33:44 -07:00
onnxruntime_providers_azure.cmake	Split onnxruntime_providers.cmake to multiple (#17853 )	2023-10-09 20:33:44 -07:00
onnxruntime_providers_cann.cmake	Remove nsync (#20413 )	2024-10-21 15:32:14 -07:00
onnxruntime_providers_coreml.cmake	Fix Objective-C static analysis warnings. (#20417 )	2024-04-24 11:48:29 -07:00
onnxruntime_providers_cpu.cmake	Initial WebGPU EP checkin (#22318 )	2024-10-08 16:10:46 -07:00
onnxruntime_providers_cuda.cmake	Remove nsync (#20413 )	2024-10-21 15:32:14 -07:00
onnxruntime_providers_dml.cmake	Refactor the cmake code that is related to delay loading (#22646 )	2024-11-04 16:30:50 -08:00
onnxruntime_providers_dnnl.cmake	Remove nsync (#20413 )	2024-10-21 15:32:14 -07:00
onnxruntime_providers_js.cmake	Split onnxruntime_providers.cmake to multiple (#17853 )	2023-10-09 20:33:44 -07:00
onnxruntime_providers_migraphx.cmake	Remove nsync (#20413 )	2024-10-21 15:32:14 -07:00
onnxruntime_providers_nnapi.cmake	Make partitioning utils QDQ aware so it does not break up QDQ node units (#19723 )	2024-03-12 10:55:49 +10:00
onnxruntime_providers_openvino.cmake	Ovep develop lnl 1.2 (#22424 )	2024-10-14 12:10:01 -07:00
onnxruntime_providers_qnn.cmake	Make partitioning utils QDQ aware so it does not break up QDQ node units (#19723 )	2024-03-12 10:55:49 +10:00
onnxruntime_providers_rknpu.cmake	Split onnxruntime_providers.cmake to multiple (#17853 )	2023-10-09 20:33:44 -07:00
onnxruntime_providers_rocm.cmake	Remove nsync (#20413 )	2024-10-21 15:32:14 -07:00
onnxruntime_providers_tensorrt.cmake	Remove nsync (#20413 )	2024-10-21 15:32:14 -07:00
onnxruntime_providers_tvm.cmake	Split onnxruntime_providers.cmake to multiple (#17853 )	2023-10-09 20:33:44 -07:00
onnxruntime_providers_vitisai.cmake	[VitisAI] remove wrong error msg, required by Microsoft (#21715 )	2024-08-21 21:10:28 -07:00
onnxruntime_providers_vsinpu.cmake	Remove nsync (#20413 )	2024-10-21 15:32:14 -07:00
onnxruntime_providers_webgpu.cmake	Add implementation of WebGPU EP (#22591 )	2024-10-29 18:29:40 -07:00
onnxruntime_providers_webnn.cmake	Split onnxruntime_providers.cmake to multiple (#17853 )	2023-10-09 20:33:44 -07:00
onnxruntime_providers_xnnpack.cmake	Make partitioning utils QDQ aware so it does not break up QDQ node units (#19723 )	2024-03-12 10:55:49 +10:00
onnxruntime_python.cmake	Refactor the cmake code that is related to delay loading (#22646 )	2024-11-04 16:30:50 -08:00
onnxruntime_rocm_hipify.cmake	[ROCm] redo hipify of version controlled files (#22449 )	2024-10-18 12:40:54 -07:00
onnxruntime_session.cmake	Multi-Lora support (#22046 )	2024-09-30 15:59:07 -07:00
onnxruntime_snpe_provider.cmake	Use target name for flatbuffers (#13991 )	2022-12-20 11:44:02 -08:00
onnxruntime_training.cmake	Multi-Lora support (#22046 )	2024-09-30 15:59:07 -07:00
onnxruntime_unittests.cmake	Add implementation of WebGPU EP (#22591 )	2024-10-29 18:29:40 -07:00
onnxruntime_util.cmake	Improve dependency management (#13523 )	2022-12-01 09:51:59 -08:00
onnxruntime_visionos.toolchain.cmake	Support visionos build (#20365 )	2024-04-23 18:15:07 -07:00
onnxruntime_webassembly.cmake	[JS/WebGPU] Support WASM64 (#21836 )	2024-10-24 20:21:51 -07:00
precompiled_header.cmake	Fix Windows Store build (#8753 )	2021-08-23 11:19:03 -07:00
riscv64.toolchain.cmake	Enable RISC-V 64-bit Cross-Compiling Support for ONNX Runtime on Linux (#19238 )	2024-01-24 16:27:05 -08:00
set_winapi_family_desktop.h	Fix WCOS/Win32 linking bugs (#3126 )	2020-03-19 08:52:40 -07:00
target_delayload.cmake	Refactor the cmake code that is related to delay loading (#22646 )	2024-11-04 16:30:50 -08:00
uwp_stubs.h	Run clang-format in CI (#15524 )	2023-04-18 09:26:58 -07:00
vcpkg-configuration.json	Auto regenerate LORA's fbs files (#22313 )	2024-10-04 10:01:19 -07:00
vcpkg.json	Create CMake option `onnxruntime_USE_VCPKG` (#21348 )	2024-09-10 16:39:27 -07:00
wcos_rules_override.cmake	Stop using apiset in OneCore build: use onecoreuap.lib instead of onecoreuap_apiset.lib (#19632 )	2024-02-23 22:31:57 -08:00
winml.cmake	Change libonnxruntime.so's SONAME: remove the minor and patch version. (#21339 )	2024-07-15 14:21:34 -07:00
winml_cppwinrt.cmake	Fix Windows Store build (#8753 )	2021-08-23 11:19:03 -07:00
winml_sdk_helpers.cmake	Merge windowsai (winml layering) into master (#2956 )	2020-02-04 17:12:19 -08:00
winml_unittests.cmake	Multi-Lora support (#22046 )	2024-09-30 15:59:07 -07:00