Although DXC picked up the upstream change "Reassociate: add global reassociation algorithm" (llvm/llvm-project@b8a330c) in PR #6598 (https://github.com/microsoft/DirectXShaderCompiler/pull/6598), it can still overlook some obvious common factors.
One observed case is:
%Float4_0 = call %dx.types.CBufRet.f32 @dx.op.cbufferLoadLegacy.f32(i32 59, %dx.types.Handle %1, i32 1)
%Float4_0.w = extractvalue %dx.types.CBufRet.f32 %Float4_0, 3
%Float2_0 = call %dx.types.CBufRet.f32 @dx.op.cbufferLoadLegacy.f32(i32 59, %dx.types.Handle %1, i32 0)
%Float2_0.y = extractvalue %dx.types.CBufRet.f32 %Float2_0, 1
/* %Float4_1 is redundant with %Float4_0 since both invoke cbufferLoadLegacy with the same parameters */
%Float4_1 = call %dx.types.CBufRet.f32 @dx.op.cbufferLoadLegacy.f32(i32 59, %dx.types.Handle %1, i32 1)
/* %Float4_1.w is redundant with %Float4_0.w */
%Float4_1.w = extractvalue %dx.types.CBufRet.f32 %Float4_1, 3
/* %Float2_1 is redundant with %Float2_0 since both invoke cbufferLoadLegacy with the same parameters */
%Float2_1 = call %dx.types.CBufRet.f32 @dx.op.cbufferLoadLegacy.f32(i32 59, %dx.types.Handle %1, i32 0)
/* %Float2_1.y is redundant with %Float2_0.y */
%Float2_1.y = extractvalue %dx.types.CBufRet.f32 %Float2_1, 1
....
%11 = fmul fast float %Float4_0.w, %10
%12 = fmul fast float %11, %Float2_0.y
....
%14 = fmul fast float %Float4_1.w, %13
%15 = fmul fast float %14, %Float2_1.y
(%Float4_0.w * %Float2_0.y) equals (%Float4_1.w * %Float2_1.y), so the two products should be reassociated into a common factor.
The upstream change can't identify this common factor because, when the Reassociate pass runs, DXC doesn't yet know that (%Float4_0.w, %Float4_1.w) and (%Float2_0.y, %Float2_1.y) are redundant pairs. Those redundancies are only eliminated later, in the GVN pass.
So that DXC can identify more common factors, this PR aggressively runs the Reassociate pass again after the GVN pass, and then runs the GVN pass once more to remove the redundancies generated by that second Reassociate run.
Changing the order of floating-point operations can change the precision of the result. If a shader produces unexpected results because of this PR, use "-opt-disable aggressive-reassociation" to disable it and roll back.
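As a small illustration of why reordering matters, here is a sketch in plain C++ (not DXC code; both function names are made up for this example). It evaluates the same three-factor product with two different groupings. It assumes IEEE-754 doubles and no fast-math compiler flags.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helpers, for illustration only: the same product evaluated
// with two different association orders.
double mul_left_first(double a, double b, double c) { return (a * b) * c; }
double mul_right_first(double a, double b, double c) { return a * (b * c); }
```

With a = 1e200, b = 1e200 and c = 1e-200, `mul_left_first` overflows to infinity in the intermediate `a * b`, whereas `mul_right_first` computes `b * c` (about 1) first and returns a finite value. Real shaders usually show much smaller differences, typically in the last bits of the result.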
This is part 3 of the fix for #6593.
PR #6598 (https://github.com/microsoft/DirectXShaderCompiler/pull/6598)
pulled the upstream global reassociation algorithm change into DXC, and it
can noticeably reduce redundant calculations.
However, testing against a large offline suite of shaders showed that
some shaders got worse compilation results and did not benefit from
this upstream change.
This PR adds a flag for the upstream global reassociation change, so that
it is easier to roll back if a shader gets a worse compilation result
because of it.
This is part 2 of the fix for #6593.
This PR pulls the upstream change "Reassociate: add global reassociation
algorithm" (b8a330c42a) into DXC with minimal changes.
For the code below:
foo = (a * b) * c
bar = (a * d) * c
As the upstream change states, it can identify that a*c is a common
factor that would otherwise be computed redundantly.
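A minimal sketch of the effect in plain C++ (this is not the pass itself; the struct and function names are hypothetical). The form as written spends four multiplies, while the reassociated form computes `a * c` once and reuses it, for three multiplies in total.

```cpp
#include <cassert>

struct FooBar {
  float foo;
  float bar;
};

// As written in the source: four multiplies, with a and c used in both.
FooBar naive(float a, float b, float c, float d) {
  return {(a * b) * c, (a * d) * c};
}

// After reassociation: the common factor a * c is computed once and shared.
FooBar reassociated(float a, float b, float c, float d) {
  float ac = a * c; // shared factor
  return {ac * b, ac * d};
}
```

For small, exactly representable inputs the two forms agree. In general the results may differ in the last bits, because floating-point multiplication is not associative.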
This is part 1 of the fix for #6593.
This test was marked DISABLED_ in gtest at the time it was added in PR
#3155, so it appears that it was never passing. Specifically, the CHECK
for `DebugFunction [[func1]]` fails. I don't think it's a priority to
implement debug info for unreferenced functions at this point, so opting
to simply remove it.
This is the last test in the unsupported directory so it can now be
removed entirely.
Fixes #6616
The following tests only required minor test syntax changes to pass:
- tools/clang/test/CodeGenSPIRV/cast.2float.interlocked.hlsl
- tools/clang/test/CodeGenSPIRV/meshshading.nv.error.fncall.amplification.vulkan1.2.hlsl
  (+ replacing NV ext with EXT)
- tools/clang/test/CodeGenSPIRV/var.init.extvector.hlsl
Issue #6621 has been filed to track the failure of
tools/clang/test/CodeGenSPIRV/oo.class.static.member.hlsl.
Related to #6616
These tests were previously marked as "unsupported" because they were
misconfigured at the time they were added and never run. Minor changes
have been made to make them passing tests.
Note that rayquery_assign.cs.hlsl has been removed because it no longer
produces an error. A similar non-erroring check exists in
rayquery_init_expr.hlsl.
Related to #6616
This test is marked as unsupported (ignored), but it currently fails to
compile both for SPIR-V and DXIL with the same Sema error: "cannot
implicitly convert from 'SecondStruct' to 'FirstStruct'". If it would
have succeeded for SPIR-V at some point in the past it's no longer valid
anyways, so removing it.
Related to #6616
When SimplifySwitchOnSelect calls SimplifyTerminatorOnSelect, it holds
onto the select's condition value to use for the conditional branch it
replaces the switch with. When removing the switch's unused
predecessors, it must make sure not to delete PHIs in case one of them
is used by the condition value, otherwise the condition value itself may
get deleted, resulting in a use-after-free.
Note that this was fixed in LLVM as well:
dc3b67b4ca
Given a switch with a constant condition and all cases the same
(branching to the same successor), dxil-remove-dead-blocks would
incorrectly remove the switch when replacing it with a branch, by
forgetting to remove the N-1 incoming values to the PHIs in the
successor block.
The implementation of index_is_valid was incorrect: it returned false when
the input index was equal to Term->getNumSuccessors(). The end iterator
has exactly that index, and it is valid to construct such an iterator.
For example, assume we have a block with two successors:
```
BasicBlock* bb = ...;
SuccIterator b(bb); // Index 0
SuccIterator e(bb, true); // Index 2, for example
SuccIterator v = b;
b += 2; // Without this fix, this asserts
assert(b == e);
```
Note that this was also fixed upstream in
https://reviews.llvm.org/D47467
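The index check can be modelled with a toy iterator. This is a sketch only: `ToySuccIterator` is a hypothetical stand-in that tracks nothing but an index and a successor count, and it is not the LLVM class.

```cpp
#include <cassert>

// Toy stand-in for a successor iterator; NOT the LLVM class. It only models
// the index bookkeeping that the fix is about.
class ToySuccIterator {
  unsigned NumSuccessors;
  unsigned Idx;

  // Idx == NumSuccessors is the end position, so it must be accepted.
  bool index_is_valid(unsigned I) const { return I <= NumSuccessors; }

public:
  ToySuccIterator(unsigned NumSucc, bool AtEnd = false)
      : NumSuccessors(NumSucc), Idx(AtEnd ? NumSucc : 0) {}

  ToySuccIterator &operator+=(unsigned N) {
    unsigned NewIdx = Idx + N;
    assert(index_is_valid(NewIdx) && "iterator moved out of range");
    Idx = NewIdx;
    return *this;
  }

  bool operator==(const ToySuccIterator &Other) const {
    return NumSuccessors == Other.NumSuccessors && Idx == Other.Idx;
  }
};
```

With `<=` in `index_is_valid`, advancing a begin iterator over two successors by 2 lands on the end position without asserting. With a strict `<`, the same step would trip the assertion.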
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Natalie Chouinard <chouinard.nm@gmail.com>
This change copes with the AS->MS payload being placed in group-shared
by the application (and MSFT's samples do indeed do this). (TIL, thanks
to pow2clk, that the spec says that the payload counts against the
group-shared total, implying, if not explicitly stating, that at least
on some platforms, the payload will be in group-shared anyway.)
The MS pass needs to be given data from the AS about the AS's thread
group topology, and this is done by extending the payload struct to add
three uints. This can't be done when the payload is resident in
group-shared, of course, because that would change the layout of
group-shared memory.
So the new approach here is to copy the payload to a new alloca (in the
default address space) struct with the members of the base struct plus
the extended data the MS needs, and then to copy piece-wise because
llvm.memcpy isn't appropriate for group-shared-to-normal address space
copies.
A follow-up change will use the PartitionedExclusiveScanNV
GroupOperation, which requires that an additional operand is added to
all GroupNonUniformArithmetic instructions. This means that some of the
SPIR-V opcodes which are currently categorized as unary will become
either unary or binary depending on the GroupOp. Since the arity
distinctions between the OpGroupNonUniform* instructions were already
somewhat arbitrary, I'm prefacing that change by refactoring them into a
single SpirvGroupNonUniformOp instruction type for better reusability.
Follow up: #6608
Two intertwined issues:
1:
The shader-debug-instrumentation and pixel-hit-counter passes add
SV_VertexId+SV_InstanceId to the input sig for a VS if they are not
present, and SV_Position for the PS. The changes in PixPassHelpers.cpp
centralize the SV_Position case and fix up a few missing fields whose
absence was breaking PIX shader debugging on WARP. The caller now has
to pass the upstream shader's SV_Position row, if it exists. (PIX has
been updated to do this.)
2:
However, independent of these tweaks, adding these new values means that
the view id metadata would be incorrect. A long-standing assert would
fire, but is herein fixed, since the above work exacerbated this
problem. The assert in question is this one in
CopyViewIDStateForOutputToPSV that previously fired when a
debug-instrumented shader was subsequently wrapped into a container:
```
DXASSERT_NOMSG((InputScalars <= IOTable.InputVectors * 4) &&
               (IOTable.InputVectors * 4 - InputScalars < 4));
```
(InputScalars, which comes from the serialized state in PSV0, was too
small, since it did not include the added system values.)
To fix up these data, the caller now has to invoke the
generate-view-id-state pass, and follow that with the emit-metadata
pass, since emitting the metadata in the debug pass itself is now too
early (and that call has been deleted). (PIX has been updated to
invoke these passes.)
These changes together are sufficient to allow PIX shader debugging to
operate on WARP in those cases where these system-values were not
included by the application, with correct view-id metadata, and the
assert is happy.
RWByteAddressBuffer has overloads for InterlockedMin and InterlockedMax
for signed ints that were failing to compile due to mismatched types in
the generated SPIR-V instruction. This adds the missing cast if
necessary.
At the same time, some redundant code is removed from the
InterlockedMin/Max intrinsic non-member functions' codegen to modify the
opcode. If it was necessary in the past, the frontend has since been
fixed and it is no longer necessary. Tests to verify these combinations
and the necessary implicit casts have also been added.
Fixes #3196
Related to #4189, #6254, #5707
If a parameter annotated with `[[vk::ext_literal]]` is not lowered to a
SpirvConstant, make a best-effort attempt to evaluate it to a constant
literal value. If unsuccessful, fail gracefully rather than crash.
Fixes #6586
As tested in the accompanying test, the DebugLoc for a class method
contains a DICompositeType as part of its scope hierarchy. Lack of a
test for this was breaking PIX shader debugging for class methods.
PIX requires that all vertex information generated by these passes be
uniquely identified by vertex id and MS thread id.
This change fixes the MS thread id part in two places: the amplification
shader and the mesh shader.
To be unique across an entire DispatchMesh call, we must uniquify the AS
thread group, the AS thread, the MS thread group and the MS thread. This
is a lot of multiplying and adding, and there wasn't quite enough math
going on here before.
In the AS case, we now generate a unique "flat" thread id from the
flat-thread-id-in-group (the already-available system value) and the
"flat group id", which we synthesize by combining the group id
components with the DispatchMesh API's thread group counts. We multiply
the flat group id by the number of threads each AS group launches, then
add the flat-thread-id-in-group. (This flat id then goes into an
expanded version of the AS->MS payload, the code for which was
pre-existing.)
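The arithmetic can be sketched as follows. This is an illustration rather than the instrumentation code: the type and function names are hypothetical, and the x-fastest linearization of the group id is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>

struct UInt3 {
  uint32_t x, y, z;
};

// Linearize a 3D group id within the DispatchMesh grid (x varies fastest).
// The exact linearization order is an assumption of this sketch.
uint32_t flatGroupId(UInt3 groupId, UInt3 dispatchCounts) {
  return groupId.x + groupId.y * dispatchCounts.x +
         groupId.z * dispatchCounts.x * dispatchCounts.y;
}

// Id for one AS thread that is unique across the whole dispatch.
uint32_t flatThreadId(UInt3 groupId, UInt3 dispatchCounts,
                      uint32_t threadsPerGroup, uint32_t flatThreadInGroup) {
  return flatGroupId(groupId, dispatchCounts) * threadsPerGroup +
         flatThreadInGroup;
}
```

For a 4x3x1 dispatch with 32 threads per group, thread 5 of group (1, 2, 0) gets flat group id 9 and flat thread id 293. The last thread of the last group gets 383, one less than the 384 threads launched in total.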
When an AS is active, the MS will treat the incoming AS thread id as its
unique thread-group-within-the-whole-dispatch id.
If the AS is not active, the instrumentation herein will synthesize a
flat id in the same way as the AS did before it passed that id through
the payload, again from the DispatchMesh parameters (newly-added params
to that pass) and the flat-thread-in-group.
In addition to the new filecheck tests for this, there is also a new
filecheck test to cover coercion of non-i32 types to i32 before being
written to PIX's output UAV, which I happened to notice wasn't
adequately tested.
- This change adds a switch case for TK_AccelerationStructureNV in
lowerToDebugType
- Before, compiling a shader for vulkan containing an acceleration
structure and using -fspv-debug=vulkan-with-source would cause a crash.
Fixes #5113
Also, emit a better message when the desired pass does not exist (or is
mis-spelled). Before this change, there would only be a confusing
message about 'after' being None.
When indexing a swizzled bool vector, some HLSL-specific code in
EmitCXXMemberOrOperatorMemberCallExpr kicks in to handle the
HLSLVecType. In this case, we’re dealing with an ExtVectorElt because of
the swizzle, so this function creates a GEP, Load, and Store on the
vector. However, boolean scalars are returned as type i1 while the
store is storing to a bool, which is an i32, so we need to insert a cast
before the store.
Now that WARP is actually getting bugfixes and features, and has a way
for folks to get new versions without waiting years for an OS update,
stop hiding the places where it's failing. Having these hardcoded skips
in place makes it impossible to run the tests to see if it's still
failing. If we really need these skips, we at least need some kind of
command line option to override them for debugging and fixing the
failures.
This commit adds a new test that validates the help text output contains
the -Qsource_in_debug_module option recently enabled by a previous PR,
#6597.
This is a follow up commit for Issue: #6028
Co-authored-by: Cooper Partin <coopp@ntdev.microsoft.com>
This commit adds the compiler option (-Qsource_in_debug_module) to the
help text.
New help text output below:
```
Utility Options:
-dumpbin Load a binary file rather than compiling
-extractrootsignature Extract root signature from shader bytecode (must be used with /Fo <file>)
-getprivate <file> Save private data from shader blob
-link Link list of libraries provided in <inputs> argument separated by ';'
-P Preprocess to file
-Qembed_debug Embed PDB in shader container (must be used with /Zi)
-Qsource_in_debug_module
Embed source info in PDB
-Qstrip_debug Strip debug information from 4_0+ shader bytecode (must be used with /Fo <file>)
-Qstrip_priv Strip private data from shader bytecode (must be used with /Fo <file>)
-Qstrip_reflect Strip reflection data from shader bytecode (must be used with /Fo <file>)
-Qstrip_rootsignature Strip root signature data from shader bytecode (must be used with /Fo <file>)
-setprivate <file> Private data to add to compiled shader blob
-setrootsignature <file>
Attach root signature to shader bytecode
-verifyrootsignature <file>
Verify shader bytecode with root signature
```
Fixes #6028
---------
Co-authored-by: Cooper Partin <coopp@ntdev.microsoft.com>
This adds new literal warnings to identify integer literals that may
have breaking behavior changes between HLSL 2021 and 202x as a result of
the conforming literals change.
The spec update for this is in:
https://github.com/microsoft/hlsl-specs/pull/229
Resolves #6581
ReplaceConstantWithInst(C, V) replaces uses of C in the current function
with V. If such a user of C is an instruction I, then it replaces uses of C
in I with V. However, this function did not make sure to only perform
this replacement if V dominates I. As a result, it could end up replacing
uses of C in instructions before the definition of V.
The fix is to lazily compute the dominator tree in
ReplaceConstantWithInst so that we can guard the replacement with that
dominance check.
Adds support for the `WaveMatch()` intrinsic function from Shader Model
6.5 using the `OpGroupNonUniformPartitionNV` instruction from the
`SPV_NV_shader_subgroup_partitioned` extension.
SPIRV-Tools bumped to include:
https://github.com/KhronosGroup/SPIRV-Tools/pull/5648
Fixes #6545
This adds the Clang 3.8 `-Wdouble-promotion` warning group to DXC. This
is added as part of the mitigations for the HLSL 202x conforming
literals feature which (among other things) changes the conversion rank
of unsuffixed literal values.
Resolves #6571
This change removes the IsOverloadLegal check in OP::GetOpFunc.
It will permit the generation of illegal DXIL operations. Subsequently,
the validation should catch these illegal DXIL operations if they are
not optimized later in SimplifyDxilCall.
Fixes #6410
Bumps [black](https://github.com/psf/black) from 23.9.1 to 24.3.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/psf/black/releases">black's
releases</a>.</em></p>
<blockquote>
<h2>24.3.0</h2>
<h3>Highlights</h3>
<p>This release is a milestone: it fixes Black's first CVE security
vulnerability. If you
run Black on untrusted input, or if you habitually put thousands of
leading tab
characters in your docstrings, you are strongly encouraged to upgrade
immediately to fix
<a
href="https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2024-21503">CVE-2024-21503</a>.</p>
<p>This release also fixes a bug in Black's AST safety check that
allowed Black to make
incorrect changes to certain f-strings that are valid in Python 3.12 and
higher.</p>
<h3>Stable style</h3>
<ul>
<li>Don't move comments along with delimiters, which could cause crashes
(<a
href="https://redirect.github.com/psf/black/issues/4248">#4248</a>)</li>
<li>Strengthen AST safety check to catch more unsafe changes to strings.
Previous versions
of Black would incorrectly format the contents of certain unusual
f-strings containing
nested strings with the same quote type. Now, Black will crash on such
strings until
support for the new f-string syntax is implemented. (<a
href="https://redirect.github.com/psf/black/issues/4270">#4270</a>)</li>
<li>Fix a bug where line-ranges exceeding the last code line would not
work as expected
(<a
href="https://redirect.github.com/psf/black/issues/4273">#4273</a>)</li>
</ul>
<h3>Performance</h3>
<ul>
<li>Fix catastrophic performance on docstrings that contain large
numbers of leading tab
characters. This fixes
<a
href="https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2024-21503">CVE-2024-21503</a>.
(<a
href="https://redirect.github.com/psf/black/issues/4278">#4278</a>)</li>
</ul>
<h3>Documentation</h3>
<ul>
<li>Note what happens when <code>--check</code> is used with
<code>--quiet</code> (<a
href="https://redirect.github.com/psf/black/issues/4236">#4236</a>)</li>
</ul>
<h2>24.2.0</h2>
<h3>Stable style</h3>
<ul>
<li>Fixed a bug where comments where mistakenly removed along with
redundant parentheses
(<a
href="https://redirect.github.com/psf/black/issues/4218">#4218</a>)</li>
</ul>
<h3>Preview style</h3>
<ul>
<li>Move the <code>hug_parens_with_braces_and_square_brackets</code>
feature to the unstable style
due to an outstanding crash and proposed formatting tweaks (<a
href="https://redirect.github.com/psf/black/issues/4198">#4198</a>)</li>
<li>Fixed a bug where base expressions caused inconsistent formatting of
** in tenary
expression (<a
href="https://redirect.github.com/psf/black/issues/4154">#4154</a>)</li>
<li>Checking for newline before adding one on docstring that is almost
at the line limit
(<a
href="https://redirect.github.com/psf/black/issues/4185">#4185</a>)</li>
<li>Remove redundant parentheses in <code>case</code> statement
<code>if</code> guards (<a
href="https://redirect.github.com/psf/black/issues/4214">#4214</a>).</li>
</ul>
<h3>Configuration</h3>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="552baf8229"><code>552baf8</code></a>
Prepare release 24.3.0 (<a
href="https://redirect.github.com/psf/black/issues/4279">#4279</a>)</li>
<li><a
href="f000936726"><code>f000936</code></a>
Fix catastrophic performance in lines_with_leading_tabs_expanded() (<a
href="https://redirect.github.com/psf/black/issues/4278">#4278</a>)</li>
<li><a
href="7b5a657285"><code>7b5a657</code></a>
Fix --line-ranges behavior when ranges are at EOF (<a
href="https://redirect.github.com/psf/black/issues/4273">#4273</a>)</li>
<li><a
href="1abcffc818"><code>1abcffc</code></a>
Use regex where we ignore case on windows (<a
href="https://redirect.github.com/psf/black/issues/4252">#4252</a>)</li>
<li><a
href="719e67462c"><code>719e674</code></a>
Fix 4227: Improve documentation for --quiet --check (<a
href="https://redirect.github.com/psf/black/issues/4236">#4236</a>)</li>
<li><a
href="e5510afc06"><code>e5510af</code></a>
update plugin url for Thonny (<a
href="https://redirect.github.com/psf/black/issues/4259">#4259</a>)</li>
<li><a
href="6af7d11096"><code>6af7d11</code></a>
Fix AST safety check false negative (<a
href="https://redirect.github.com/psf/black/issues/4270">#4270</a>)</li>
<li><a
href="f03ee113c9"><code>f03ee11</code></a>
Ensure <code>blib2to3.pygram</code> is initialized before use (<a
href="https://redirect.github.com/psf/black/issues/4224">#4224</a>)</li>
<li><a
href="e4bfedbec2"><code>e4bfedb</code></a>
fix: Don't move comments while splitting delimiters (<a
href="https://redirect.github.com/psf/black/issues/4248">#4248</a>)</li>
<li><a
href="d0287e1f75"><code>d0287e1</code></a>
Make trailing comma logic more concise (<a
href="https://redirect.github.com/psf/black/issues/4202">#4202</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/psf/black/compare/23.9.1...24.3.0">compare
view</a></li>
</ul>
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
In FileCheckerTest.cpp, the -T target of a %dxc part is checked for
compatibility. However, if you use -Vd, it skips the version check for
the validator. This isn't quite right since the validator version will
be set from the attached validator, unless you explicitly set
-validator-version or use -select-validator internal. Also, if
-validator-version is explicitly set, we should check the validator
against that version, not just the one based on shader model.
This change fixes the initial check logic to only skip validator check
when using -select-validator internal or -validator-version. It also
checks against an explicitly requested -validator-version when an
external validator will be used.
Fixes #6367
Tests are split between one containing cases that could potentially be
translated to legacy barrier op in the future, and a test without
translatable barrier ops.
Part of #6538.
These flags had some effect on load code generation, but never worked
right and were not tested in the DXC compiler let alone with drivers.
This removes any notion of the flags beyond option processing and
produces a warning message that the flag is not supported and will be
ignored. The flags are both not listed in help anymore either. A simple
OptionsTest and lit test verify that the warning message is produced for either
flag and existing tests that used the flag are updated or removed.
Incidentally fixed the capitalization of an existing options warning message.
Fixes #6306
Check for mixing of packoffset and non-packoffset elements in a cbuffer
at Sema::ActOnFinishHLSLBuffer. Also check packoffset for aggregate
types.
Fixes #3724
Fixed a bug in `hlsl::RemoveUnstructuredLoopExits` where, when a new
exiting block is created from splitting, it was added only to the current
loop being processed, even though it could also be part of an inner loop. Not
adding the new block to the inner loops that it's part of makes those inner
loops malformed and causes a crash.
This fix adds the new block to the innermost loop that it should be
part of. It also adds the `StructurizeLoopExits` option to the `loop-unroll`
pass, which was missing before.
Bumps [idna](https://github.com/kjd/idna) from 3.4 to 3.7.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/kjd/idna/releases">idna's
releases</a>.</em></p>
<blockquote>
<h2>v3.7</h2>
<h2>What's Changed</h2>
<ul>
<li>Fix issue where specially crafted inputs to encode() could take
exceptionally long amount of time to process. [CVE-2024-3651]</li>
</ul>
<p>Thanks to Guido Vranken for reporting the issue.</p>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/kjd/idna/compare/v3.6...v3.7">https://github.com/kjd/idna/compare/v3.6...v3.7</a></p>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/kjd/idna/blob/master/HISTORY.rst">idna's
changelog</a>.</em></p>
<blockquote>
<p>3.7 (2024-04-11)
++++++++++++++++</p>
<ul>
<li>Fix issue where specially crafted inputs to encode() could
take exceptionally long amount of time to process. [CVE-2024-3651]</li>
</ul>
<p>Thanks to Guido Vranken for reporting the issue.</p>
<p>3.6 (2023-11-25)
++++++++++++++++</p>
<ul>
<li>Fix regression to include tests in source distribution.</li>
</ul>
<p>3.5 (2023-11-24)
++++++++++++++++</p>
<ul>
<li>Update to Unicode 15.1.0</li>
<li>String codec name is now "idna2008" as overriding the
system codec
"idna" was not working.</li>
<li>Fix typing error for codec encoding</li>
<li>"setup.cfg" has been added for this release due to some
downstream
lack of adherence to PEP 517. Should be removed in a future release
so please prepare accordingly.</li>
<li>Removed reliance on a symlink for the "idna-data" tool to
comport
with PEP 517 and the Python Packaging User Guide for sdist
archives.</li>
<li>Added security reporting protocol for project</li>
</ul>
<p>Thanks Jon Ribbens, Diogo Teles Sant'Anna, Wu Tingfeng for
contributions
to this release.</p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="1d365e17e1"><code>1d365e1</code></a>
Release v3.7</li>
<li><a
href="c1b3154939"><code>c1b3154</code></a>
Merge pull request <a
href="https://redirect.github.com/kjd/idna/issues/172">#172</a> from
kjd/optimize-contextj</li>
<li><a
href="0394ec76ff"><code>0394ec7</code></a>
Merge branch 'master' into optimize-contextj</li>
<li><a
href="cd58a23173"><code>cd58a23</code></a>
Merge pull request <a
href="https://redirect.github.com/kjd/idna/issues/152">#152</a> from
elliotwutingfeng/dev</li>
<li><a
href="5beb28b9dd"><code>5beb28b</code></a>
More efficient resolution of joiner contexts</li>
<li><a
href="1b121483ed"><code>1b12148</code></a>
Update ossf/scorecard-action to v2.3.1</li>
<li><a
href="d516b874c3"><code>d516b87</code></a>
Update Github actions/checkout to v4</li>
<li><a
href="c095c75943"><code>c095c75</code></a>
Merge branch 'master' into dev</li>
<li><a
href="60a0a4cb61"><code>60a0a4c</code></a>
Fix typo in GitHub Actions workflow key</li>
<li><a
href="5918a0ef80"><code>5918a0e</code></a>
Merge branch 'master' into dev</li>
<li>Additional commits viewable in <a
href="https://github.com/kjd/idna/compare/v3.4...v3.7">compare
view</a></li>
</ul>
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Wave intrinsics such as `WaveGetLaneIndex()` are invalidated at DXR
thread repacking points such as `CallShader()`. However, we were reusing
the result of `WaveGetLaneIndex()` across such points.
Fix this by marking it as `Readonly` instead of `Readnone`.
Add a test case that also covers other wave intrinsics, which are handled
correctly.
Fixes #5034.
---------
Co-authored-by: Jannik Silvanus <jasilvanus@users.noreply.github.com>
OpConvertFToU and FToS convert numerically from floating point to
integer, with rounding toward 0.0, matching the defined behavior of HLSL
without any need for an initial bitwidth conversion from floating point
to floating point, which can result in incorrect rounding behavior (see
#6501).
Note that behavior is undefined if the target type is not wide enough to
hold the converted value. However, this was also true with an initial
bitcast (because an N-bit FP can hold values outside the range of an
N-bit int), and I can't come up with any case where an initial truncation
would result in correct defined behavior where this straight conversion
would be undefined, since the precision will inevitably be lost anyway.
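The rounding difference can be reproduced in plain C++. This sketch shows one way an intermediate floating-point narrowing can misround; it is not necessarily the exact case from #6501, and the function names are hypothetical. C++ float-to-integer casts truncate toward zero, while the double-to-float narrowing rounds to nearest.

```cpp
#include <cassert>
#include <cstdint>

// Direct conversion: truncates toward zero.
int32_t convertDirect(double v) { return static_cast<int32_t>(v); }

// Conversion with an intermediate narrowing to 32-bit float first; the
// narrowing rounds to nearest, which can change the integer result.
int32_t convertViaFloat(double v) {
  float narrowed = static_cast<float>(v);
  return static_cast<int32_t>(narrowed);
}
```

For 16777219.0 (2^24 + 3), the direct conversion returns 16777219, while the detour through float returns 16777220, because 32-bit floats near 2^24 are spaced 2 apart and the tie rounds to the even significand. That result is above the input, which a conversion that rounds toward zero should never produce.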
Fixes #6501