### Description
Replace `memset(0)` with `std::fill(T{})`. This ensures that all types are
initialized in a portable way.
### Motivation and Context
Some platforms exhibit intermittent failures with NaN results.
Follow up to: https://github.com/microsoft/onnxruntime/pull/21525
Cc: @ranjitshs
### Description
Exclude cuDNN 9 and CUDA 12 DLLs from the manylinux wheel to reduce the
Python package size.
### Motivation and Context
The 1.20.0 ort-nightly-gpu Python wheels on Linux are suddenly > 800 MB
in size, while the wheels built on the 1.19 release branch are around
220 MB.
The size change is caused by
https://github.com/microsoft/onnxruntime/pull/19470.
### Description
See
454996d496
for the manual changes (auto-generated formatting changes are excluded).
### Why
The toolsets for the old clang-format are out of date, which reduces
development efficiency.
- The NPM package `clang-format` is in maintenance mode and has not been
updated for two years.
- The VS Code extension for clang-format has not been maintained for a
while, and a recent Node.js security update broke it entirely on
Windows.
No one in the community seems interested in fixing these issues.
Prettier was chosen because it is the most popular TS/JS formatter.
### How to merge
It's easy to break the build:
- Be careful of any new commits on main that are not included in this PR.
- Be careful that, after this PR is merged, other PRs that already passed
CI can still merge with the old formatting.
So, make sure there are no new commits before merging this one, and
invalidate JS PRs that already passed CI, forcing them to rebase onto the
latest main.
Bug: https://github.com/microsoft/onnxruntime/issues/21386
### Description
Pins the pytorch-lightning package to version 2.3.3, since versions
>= 2.4.0 require torch > 2.1.0, which is not compatible with cu118.
ORT 1.19 Release Preparation
### Description
This change addresses the case where we multiply two matrices whose
inner dimension is 0. numpy, and Eigen (which is used in our CPU EP
implementation), correctly handle this case and output an [M, N] matrix
filled with zeros.
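For illustration, numpy's behavior (which this change matches) looks like
this:
```python
import numpy as np

# multiplying [M, 0] by [0, N] yields an [M, N] matrix of zeros
a = np.ones((3, 0), dtype=np.float32)
b = np.ones((0, 4), dtype=np.float32)
c = a @ b
print(c.shape)         # (3, 4)
print(np.all(c == 0))  # True
```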
### Motivation and Context
This is required to support GenAI empty input Lora implementation.
Addresses: https://github.com/microsoft/onnxruntime/issues/21483
### Description
Replace jcenter. It's deprecated and not responding.
### Motivation and Context
Fix CIs
Fix two issues:
(1) The scale shall be fp32 instead of fp16.
(2) The Softmax program does not handle normalized dispatch group values, so if the sequence length is over 65535, the result of this program is incorrect.
### Description
### Motivation and Context
We couldn't get enough A100 agent time to finish the jobs today. This PR
makes the A100 job run only on the main branch, to unblock other PRs in
case capacity is not restored shortly.
### Description
Update DML runtime binary to 1.15.1
### Description
Fix address sanitizer and memory access bugs 1, 4, 5, 7, and 8 found in
security fuzz testing.
### Motivation and Context
We saw some models fail to run due to OOM; this can be fixed by
increasing trt_max_workspace_size.
This PR removes the size limitation by default (allowing up to the
maximum device memory), which is aligned with trtexec.
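For reference, a minimal sketch of setting the option explicitly through
the Python API when a cap is still wanted (the model path is
hypothetical):
```python
import onnxruntime as ort

# With this change, omitting trt_max_workspace_size means no limit
# (up to the max device memory), matching trtexec. Set it only to cap usage.
providers = [
    ("TensorrtExecutionProvider", {"trt_max_workspace_size": 2 * 1024**3}),
    "CUDAExecutionProvider",
]
session = ort.InferenceSession("model.onnx", providers=providers)
```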
### Description
Added the eval model buffer as an optional field in Module, so that you
can export for inference using the eval model stored as a buffer.
### Motivation and Context
- Resolves #21152
- The previous solution (PR #21422) produced an eval model that was
specific to the EPs used for training, because unavoidable runtime
optimizations changed the graph stored with the eval session.
### Description
This PR improves the range calculation for input to Relu/Clip nodes for
the symmetric quantization case.
### Motivation and Context
Currently, for the common scenario of Conv followed by Relu in the
symmetric quantization config, different scales could be assigned to the
tensors corresponding to the input and output of Relu.
The downside is that this may introduce noise due to multiple
re-quantizations, and it makes it difficult to fuse Conv-Relu nodes for
hardware accelerators that support fused Conv-Relu.
Instead, it is more efficient to assign the output range of Relu as the
input range of Relu (i.e., the output range of the upstream op) wherever
possible. This adjustment is currently only done for the asymmetric
quantization case.
For the scenario where the upstream op has multiple consumers, this
assumption could be incorrect. For this case we do not adjust the
ranges.
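A minimal sketch of the adjustment, assuming a `ranges` map from tensor
name to (rmin, rmax); the helper and names are hypothetical:
```python
def share_relu_range(ranges, relu_input, relu_output, upstream_consumer_count):
    """Assign Relu's output range to its input so both tensors share one scale.

    Skipped when the upstream op has multiple consumers, since the shared
    range could be incorrect for the other consumers.
    """
    if upstream_consumer_count == 1:
        ranges[relu_input] = ranges[relu_output]
```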
### Description
Fixes validation of per-channel quantization overrides by not
unnecessarily loading the external weights.
### Motivation and Context
`get_qnn_qdq_config()` explicitly loads models without external data
(i.e., `onnx.load_model(load_external_data=False)`). Afterwards, it
calls `tensor_proto_to_array()`, which expects the external weights to
be stored in the current working directory. If the external weights are
stored in a different directory, we get a crash.
Loading the actual weight values is unnecessary because we only need the
weight shape. This PR removes the unnecessary `tensor_proto_to_array()`
call.
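A minimal sketch of the idea: the shape is available on the TensorProto
itself, so the weight payload never needs to be loaded (the file name is
hypothetical):
```python
import onnx

# Load only the graph structure; external weight files are not touched.
model = onnx.load_model("model.onnx", load_external_data=False)

for initializer in model.graph.initializer:
    # The dims live in the proto, so no tensor_proto_to_array() is needed.
    print(initializer.name, list(initializer.dims))
```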
### Description
Added DequantizeLinear operator for JSEP.
### Description
Add a null-pointer check to avoid a crash when running a session that
previously failed to generate a trt_engine.
### Motivation and Context
Reported and verified by
https://github.com/microsoft/onnxruntime/issues/21567
### Description
This fix addresses the issue of handling multiple QLinear nodes as
outputs from the target node in OVEP. Previously, the stripping logic
only supported a single Q node, leading to incorrect stripping of
additional Q nodes.
### Motivation and Context
The OVEP stripping logic was limited to handling a single Q node as an
output from the target node. As a result, additional Q nodes were being
stripped, despite the stripping rules indicating they should be
retained.
With this fix, OVEP can now properly handle multiple Q nodes according
to the specified stripping rules, ensuring that the fate of each Q node
is correctly determined.
---------
Co-authored-by: sfatimar <sahar.fatima@intel.com>
### Description
Previously, when quantizing MatMul to DQ + MatMul using the 4-bit QDQ
tool chain, the opsets of the domains were not changed.
Now, when quantizing MatMul to DQ + MatMul in QDQ format, the onnx
domain is force-upgraded to opset 21.
### Motivation and Context
In QDQ format, DQ with int4 and blocked quantization is used, which
requires DQ with opset >= 21. Therefore, when quantizing MatMul to
DQ + MatMul, we force-upgrade the onnx domain to opset 21.
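A minimal sketch of the forced upgrade on a loaded ModelProto (the file
name is hypothetical):
```python
import onnx

model = onnx.load("model.onnx")

# int4/blocked DequantizeLinear requires the onnx domain at opset >= 21.
for opset in model.opset_import:
    if opset.domain in ("", "ai.onnx"):
        opset.version = max(opset.version, 21)
```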
### Description
Fix wrong per-tensor quantized weight type for matmul.
### Motivation and Context
Fix related bug as described in
https://github.com/microsoft/onnxruntime/issues/21346
### Description
Add a gather that supports block-quantized input data.
### Motivation and Context
To support web inference scenarios with quantized vocabulary embeddings.
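A rough numpy sketch of what a block-quantized gather computes; the
shapes and the block size of 32 along the embedding axis are
hypothetical:
```python
import numpy as np

vocab, hidden, block = 100, 64, 32
data = np.random.randint(-8, 8, size=(vocab, hidden)).astype(np.int8)  # quantized table
scales = np.random.rand(vocab, hidden // block).astype(np.float32)     # one scale per block
indices = np.array([3, 7])

# Gather the quantized rows and their block scales, then dequantize.
rows = data[indices].reshape(len(indices), hidden // block, block)
dequantized = (rows * scales[indices][:, :, None]).reshape(len(indices), hidden)
```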
### Description
This change enhances the existing Pad Fusion to fuse Pad even if a Cast
operator is present between Pad and Conv/MaxPool/AveragePool. It keeps
the Cast as it is.
<pre>
/*
* Before Fusion:
* Pad
* |
* Cast (Optional)
* |
* Conv/MaxPool/AveragePool
*
* After Fusion:
* Cast (Optional)
* |
* Conv/MaxPool/AveragePool
*/
</pre>
### Description
Update the build script for WebGPU to enable model dump by default.
Now, when using build_jsep.bat to build debug, model dump is enabled.
Setting
[`optimizedModelFilePath`](https://onnxruntime.ai/docs/api/js/interfaces/InferenceSession.SessionOptions.html#optimizedModelFilePath)
in the session options dumps the optimized model in the browser.
### Motivation and Context
Helps to debug or rule out problems that may be related to the model
optimizer.
### Description
This change allows matching an external data path like `a.data` to
`./a.data`.
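A minimal sketch of the matching logic via path normalization (the
helper name is hypothetical):
```python
import os

def same_external_data_path(a: str, b: str) -> bool:
    # "a.data" and "./a.data" normalize to the same path
    return os.path.normpath(a) == os.path.normpath(b)

assert same_external_data_path("a.data", "./a.data")
```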
### Description
The xcframework now uses symlinks to have the correct structure
according to Apple's requirements. Symlinks are not supported by NuGet
on Windows.
To work around that, we store a zip of the xcframeworks in the NuGet
package.
### Motivation and Context
Fix nuget packaging build break
### Description
* Fix the migraphx build error caused by
https://github.com/microsoft/onnxruntime/pull/21598:
add conditional compilation for the code block that depends on ROCm >= 6.2.
Note that the pipeline uses ROCm 6.0.
Unblock the orttraining-linux-gpu-ci-pipeline,
orttraining-ortmodule-distributed, and orttraining-amd-gpu-ci-pipeline
pipelines:
* Disable a model test in the Linux GPU training CI pipelines, caused by
https://github.com/microsoft/onnxruntime/pull/19470:
sometimes the cudnn frontend throws an exception that the cudnn graph
does not support a Conv node of the keras_lotus_resnet3D model on V100
GPUs.
Note that the same test does not throw an exception in other GPU
pipelines. The failure might be related to the cuDNN 8.9 and V100 GPUs
used in this pipeline (Ampere GPUs and cuDNN 9.x do not have the issue).
The actual fix requires fallback logic, which will take time to
implement, so we temporarily disable the test in training pipelines.
* Force-install torch for CUDA 11.8. (The Docker image has torch 2.4.0
for CUDA 12.1 to build the torch extension, which is not compatible with
CUDA 11.8.) Note that this is a temporary workaround. A more elegant fix
is to ensure the right torch version in the Docker build step, which
might require updating install_python_deps.sh and the corresponding
requirements.txt.
* Skip test_gradient_correctness_conv1d since it causes a segmentation
fault. The root cause needs more investigation (possibly the cudnn
frontend as well).
* Skip test_aten_attention since it causes an assert failure. The root
cause needs more investigation (possibly the torch version).
* Skip orttraining_ortmodule_distributed_tests.py since it fails with an
error that the compiler for the torch extension does not support C++17.
One possible fix is to set extra_compile_args['cxx'] = ['-std=c++17']
inside the setup.py of the fused_adam extension (see the sketch after
this list). However, due to the urgency of unblocking the pipelines, we
just disable the test for now.
* Skip test_softmax_bf16_large. For some reason,
torch.cuda.is_bf16_supported() returns True on V100 with torch 2.3.1, so
the test was run in CI, but V100 does not support bf16 natively.
* Fix a typo of "deterministic".
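For reference, a hedged sketch of that possible fix inside the
extension's setup.py (source file names are hypothetical; this is not
applied in this PR):
```python
from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CUDAExtension

setup(
    name="fused_adam",
    ext_modules=[
        CUDAExtension(
            name="fused_adam",
            sources=["fused_adam_frontend.cpp", "multi_tensor_adam.cu"],
            # Force a C++17-capable standard for the host compiler.
            extra_compile_args={"cxx": ["-std=c++17"], "nvcc": ["-std=c++17"]},
        )
    ],
    cmdclass={"build_ext": BuildExtension},
)
```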
### Description
Allow op tests to use the f16 type for inputs/outputs.
This PR introduces "@petamoriken/float16" as a Float16Array polyfill,
but restricts it to the test runner only.
### Description
Update to match #21627 and make the info for Split consistent.
Since a Split that doesn't split anything is a no-op, it doesn't seem
meaningful to call that limitation out in the docs.
### Description
Improve the speed of combining `per-channel` data by using a single
`np.concatenate` instead of multiple `np.concatenate` calls within a for
loop.
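A small illustration of the difference (array shapes are arbitrary):
```python
import numpy as np

chunks = [np.random.rand(8, 4).astype(np.float32) for _ in range(1000)]

# Before: np.concatenate in a loop re-copies the accumulated result
# on every iteration (quadratic total copying).
combined = chunks[0]
for c in chunks[1:]:
    combined = np.concatenate((combined, c), axis=0)

# After: a single call allocates once and copies each chunk once.
combined_fast = np.concatenate(chunks, axis=0)
assert np.array_equal(combined, combined_fast)
```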
### Motivation and Context
Fix the issue https://github.com/microsoft/onnxruntime/issues/21562
Signed-off-by: duansheng.liu <44742794+duanshengliu@users.noreply.github.com>
### Description
- Update pipelines to use QNN SDK 2.25 by default
- Update the ifdef condition that applies the workaround for the QNN
LayerNorm validation bug so it also covers QNN SDK 2.25 (as well as 2.24)
### Motivation and Context
Use the latest QNN SDK
### Description
Improve the docker commands so that docker image layer caching works.
This makes docker builds faster and more stable.
So far, the A100 pool's system disk is too small to use the docker
cache. We won't use the pipeline cache for docker images, and some
legacy code is removed.
### Motivation and Context
There are often an exception of
```
64.58 + curl https://nodejs.org/dist/v18.17.1/node-v18.17.1-linux-x64.tar.gz -sSL --retry 5 --retry-delay 30 --create-dirs -o /tmp/src/node-v18.17.1-linux-x64.tar.gz --fail
286.4 curl: (92) HTTP/2 stream 0 was not closed cleanly: INTERNAL_ERROR (err 2)
```
Because Onnxruntime pipeline have been sending too many requests to
download Nodejs in docker building.
Which is the major reason of pipeline failing now
In fact, docker image layer caching never works.
We can always see the scrips are still running
```
#9 [3/5] RUN cd /tmp/scripts && /tmp/scripts/install_centos.sh && /tmp/scripts/install_deps.sh && rm -rf /tmp/scripts
#9 0.234 /bin/sh: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
#9 0.235 /bin/sh: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
#9 0.235 /tmp/scripts/install_centos.sh: line 1: !/bin/bash: No such file or directory
#9 0.235 ++ '[' '!' -f /etc/yum.repos.d/microsoft-prod.repo ']'
#9 0.236 +++ tr -dc 0-9.
#9 0.236 +++ cut -d . -f1
#9 0.238 ++ os_major_version=8
....
#9 60.41 + curl https://nodejs.org/dist/v18.17.1/node-v18.17.1-linux-x64.tar.gz -sSL --retry 5 --retry-delay 30 --create-dirs -o /tmp/src/node-v18.17.1-linux-x64.tar.gz --fail
#9 60.59 + return 0
...
```
This PR improves the docker commands to make image layer caching work,
so CI won't send so many redundant requests to download Node.js.
```
#9 [2/5] ADD scripts /tmp/scripts
#9 CACHED
#10 [3/5] RUN cd /tmp/scripts && /tmp/scripts/install_centos.sh && /tmp/scripts/install_deps.sh && rm -rf /tmp/scripts
#10 CACHED
#11 [4/5] RUN adduser --uid 1000 onnxruntimedev
#11 CACHED
#12 [5/5] WORKDIR /home/onnxruntimedev
#12 CACHED
```
### Reference
https://docs.docker.com/build/drivers/
---------
Co-authored-by: Yi Zhang <your@email.com>
### Description
Update the script to use cmake 3.30 to unblock EP Perf.
### Description
Add the ability to test packaging without rebuilding every time.
Add the ability to comment out some platforms/architectures without
breaking the scripts that assemble the C/Obj-C packages.
Update a couple of commands to preserve symlinks.
### Motivation and Context
Makes debugging packaging issues faster.
Creates a correct package for mac-catalyst and doesn't require setting
up symlinks via a bash script.