The impetus for these changes was unexplained crashes in a display
driver while attempting to create a debug-instrumented shader for PIX.
The heart of it is the new test in pixtest.cpp: use the compiler to
generate a raw UAV, and then compare the generated DXIL with what PIX
generates for the same purpose.
Some of the PIX passes need only one UAV for a module, but some need two
or more. In the latter case, the previous code was a bit loose about
what it was doing with respect to adding the UAV resource, and creating
its handles for each interested function. Most of the actual changes
herein are to do with that.
Lastly, the PIX UAV is raw at the D3D API level, but the instrumentation
had been doing non-raw writes. No driver seemed to care, but I've fixed
it anyway.
(cherry picked from commit bf24b7a54d)
This fixes a bug where the offsets for elements in vectors with 16-bit
types doesn't take into account alignment bits and PIX wouldn't display
vector element values correctly in the shader debugger. Eg. if
`-enable-16bit-types` wasn't set, the offsets for a min16float4 would be
0, 16, 32, 48 instead of 0, 32, 64, 96.
Also removed the assert in PopulateAllocaMap_StructType that was
checking whether the calculated aligned offset matches the packed offset
(from SortedMembers) because it was false for members with sizes smaller
than the alignment size.
(cherry picked from commit 84c0a09557)
Co-authored-by: gracezhang72 <gracezhang@microsoft.com>
By default, the launch type should be set to ‘Broadcast’ when diagnosing
barriers. However, the current behavior sets the default launch type to
‘Invalid,’ resulting in warnings when the launch type is not explicitly
specified as an attribute.
To address this issue, we’ll adjust the default setting to ‘Broadcast’
and thereby resolve the problem.
Fixes#6836
---------
Co-authored-by: Damyan Pepper <damyanp@microsoft.com>
(cherry picked from commit ef043e90f3)
In the special code to handle the memcpy pattern where a constant buffer
contains a vector array that initializes a local (or static global)
scalar array for use by the shader, an invalid assumption was made that
if the memcpy dest was global, that the src is global as well.
This was not the case, and when expecting to generate constant
expressions to index the src, these generated orphaned instructions
instead, leading to invalid IR.
This fixes the issue by leveraging ReplaceConstantWithInst, and setting
the insertion point for the Builder. Now, replacement *could* fail, if
src instructions don't dominate replacement uses, so bool for replaced
all is returned from replaceScalarArrayWithVectorArray.
Another issue was that it would replace the dest for the original memcpy
with src along the way. Now, if we don't replace all uses, this turns
the memcpy into a no-op and any remaining uses are no longer coming from
src, but an undef dest instead. This was also fixed to skip this
replacement, then clean up this use if all other uses have been
successfully replaced.
Fixes#6510
(cherry picked from commit 5cfefc7d0b)
Co-authored-by: Tex Riddell <texr@microsoft.com>
The behavior was changed with #6317 so that regardless of spelling in
the shader, the include path will conform to the host OS style for the
purposes of the include handler. This just adds a release note for that
new behavior.
Fixes#6669
(cherry picked from commit 9fa4618de9)
Moving the source file for the README.md for github and nuget releases
into the DXC repo. It should be updated with relevant notes for the
upcoming release whenever they are made in main such that they will
already be available when the release is built.
Part of #6697
(cherry picked from commit 8fce06b03c)
This fixes the following compiler error with clang 18.1.6 with mingw-w64
toolchain.
microcom.h:190:43: error: unknown type name 'nullptr_t'; did you mean
'std::nullptr_t'?
190 | template <typename T> HRESULT AssignToOut(nullptr_t value, T
*pResult) {
| ^~~~~~~~~
| std::nullptr_t
microcom.h:207:43: error: unknown type name 'nullptr_t'; did you mean
'std::nullptr_t'?
207 | template <typename T> void AssignToOutOpt(nullptr_t value, T
*pResult) {
| ^~~~~~~~~
| std::nullptr_t
As we prepare to do a release, we generally create an issue that
contains instructions and checklists for the mechanics of the release.
This change adds a template for these issues.
This pass was attempting to compare different things. The return values
of GetDxilVersion are not shader models, but... dxil version. Since the
code is trying to upgrade the validator version, I changed this to
GetValidatorVersion, to pair with SetValidatorVersion.
The previous code was breaking the nvidia driver on workgraphs.
Internal build that has DXC as a submodule and that is built with a
different VC toolset version started failing after the pragma got moved
up in commit 0b9acdb75. Adding a duplicate pragma back at the original
location makes both compiler versions happy.
If an exception is thrown, don't block it in the TempOverloadPool
destructor. Allow it to propagate out as a user-visible error.
Explicitly clear the TempOverloadPool before returning from the
HLMatrixLowerPass::runOnModule. In the normal case, when no exception is
thrown, that will still verify that all the overloads actually have been
lowered, and will assert out if they aren't.
The first first fix in #5392 was not correct. It relied on the layout
rule for the address to be the correct layout rule, but that is not
always the case. The address is just an integer that could exist in any
storage class. The correct solution is to explicitly set the layout rule
for the BitCast operation when expanding the RawBuffer* functions. We
know that the result of the BitCast is a pointer to the physical storage
buffer storage class, so we know the layout need to be the storage
buffer layout.
Fixes#6554
Originally @lizhengxing's PR. Retargeting main.
This PR pulls the upstream change, Fix non-determinism in Reassociate
caused by address coincidences
(ef8761fd3b),
into DXC.
Here's the summary of the change:
Between building the pair map and querying it there are a few places
that erase and create Values. It's rare but the address of these newly
created Values is occasionally the same as a
just-erased Value that we already have in the pair map. These
coincidences should be accounted for to avoid non-determinism.
Thanks to Roman Tereshin for the test case.
This is part 6 (the last part) of the fix for #6659.
Co-authored-by: Zhengxing Li <zhengxingli@microsoft.com>
This PR pulls the following upstream changes into DXC:
[llc/opt] Add an option to run all passes twice
(04464cf731)
> Lately, I have submitted a number of patches to fix bugs that only
occurred when using the same pass manager to compile
> multiple modules (generally these bugs are failure to reset some
persistent state).
>
> Unfortunately I don't think there is currently a way to test that from
the command line. This adds a very simple flag to both
> llc and opt, under which the tools will simply re-run their respective
> pass pipelines using the same pass manager on (a clone of the same
module). Additionally, we verify that both outputs are
> bitwise the same.
>
> Reviewers: yaron.keren
[opt] Fix sanitizer complaints about r254774
(38707c45be)
> `Out` can be null if no output is requested, so move any access
> to it inside the conditional. Thanks to Justin Bogner for finding
> this.
This is for the test of this change
(ef8761fd3b)
to fix#6659.
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Adam Yang <31109344+adam-yang@users.noreply.github.com>
Originally @lizhengxing's PR. Retargeting main.
This PR pulls 2 upstream changes, Add a new WeakVH value handle; NFC
(f1c0eafd5b)
and Use a 2 bit pointer in ValueHandleBase::PrevPair; NFC
(b297bff1cc),
into DXC.
Here's the summary of the changes:
Add a new WeakVH value handle; NFC
> WeakVH nulls itself out if the value it was tracking gets deleted, but
it does not track RAUW.
>
> Reviewers: dblaikie, davide
>
> Subscribers: mcrosier, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D32267
Use a 2 bit pointer in ValueHandleBase::PrevPair; NFC
> This was an omission in r301813. I had made the supporting changes to
make this happen, but I forgot to actually update the
>
> PrevPair declaration.
This is part 4 and 5 of the fix for #6659.
Some of the types that have been added to the vk namespace were being
added to the default namespace when compiling for DXIL. The if
conditions were such that they would fall through to a default case.
The solution is to explicitly add code that we should skip adding those
builtin types when the vk namespace is not defined.
Fixes#6646.
This PR pulls the upstream change, Rename WeakVH to WeakTrackingVH; NFC
(e6bca0eecb),
into DXC.
Here's the summary of the change:
> I plan to use WeakVH to mean "nulls itself out on deletion, but does
not track RAUW" in a subsequent commit.
>
> Reviewers: dblaikie, davide
>
> Reviewed By: davide
>
> Subscribers: arsenm, mehdi_amini, mcrosier, mzolotukhin, jfb,
llvm-commits, nhaehnle
>
> Differential Revision: https://reviews.llvm.org/D32266
This is part 3 of the fix for #6659.
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
When processing the GetDimension member function for textures, we do not
emit an error if the output variable is not an l-value. This change will
add this error.
Fixes#6689
This commit bumps SPIR-V tools version, and re-add support for objects
debug instructions when using Vulkan's debug instructions.
Because OpenCL debug instructions are not a non-semantic set, the SPIR-V
spec would need to be modified, as today it does not allows forward
references.
Fixes#6691
---------
Signed-off-by: Nathan Gauër <brioche@google.com>
Induction variable simplification (indvars) tries to rewrite exit
values; these appear as phi nodes in loop exit blocks. If the
replacement for the phi is still in the loop, then that would break the
LCSSA property. Don't do that.
Add a test for this.
We will start issues a warning when `vk::offset` is not correctly
aligned to make it easier for users to understand why their spir-v will
not validate. Note that we do not treat this as an error because we want
to allow someone to have the flexibility to do other things. For
example, they could be targeting an API that does not follow any of
the existing rules, which is why they are using `vk::offset`.
Fixes#6171
Previously, if the latch exit was reachable from a different exit block
for the loop, then the pass would introduce a loop involving that exit
block and the latch exit. This is unwanted and unaccounted for.
- Add a test for shared exits.
- Add a test for non-dedicated latch exit
Before this change, OpConstantNull was emitted when an undef value was
required.
This causes an issue for some types which cannot have the OpConstantNull
value.
In addition, it mixed well-defined values with undefined values, which
prevents any kind of optimization/analysis later on.
Fixes#6653
---------
Signed-off-by: Nathan Gauër <brioche@google.com>
Update tolerance for ExecutionTest::UnaryHalfOpTest#AcosHalf test.
Enables implementations to calculate `acos` for fp16 type by converting
to fp32, doing fp32 math, and then converting back to fp16 using
round-to-nearest-even conversing (RTNE) per D3D11 spec. For more details
please see issue #6179.
As mentioned in the linked issue, for these floating point tests a fixed
point tolerance does not really make sense. It should vary based on the
magnitude of the expected value. But we are already using this approach
in many similar test cases and the simplest fix now is to update the
tolerance to accommodate the fp32-to-ft16 conversion.
Fixes#6179
The code that adds the input and output decoration in the entry points
inputs and outputs assumes that there is a single entry point in the
module. When using the `lib` profile that is not true.
This commit modifies the code so that it groups the stage variables by
entry point, and runs the current code on each group separably.
I hesitate to make this change because it will change the locations for
code that currently works, and will force users to update their
applications accordingly. Or they could modify their shaders
to use explicit locations attributes. Neither is great.
However, the advantage is that this allows the implicit locations to
match what would happen if the shader were compiled individually. It
also makes the locations more predictable because change in another
shader would change all shader after it. This is a better design, and
worth the breakage.
Fixes#6678Fixes#5213
[SPIR-V] Fix GroupNonUniform capabilities+ext
Fixes emission of GroupNonUniform capabilities and related extensions,
in particular SPV_NV_shader_subgroup_partitioned.
Since this PR bumps SPIR-V headers + tools, some test changes are
required due to opcode changes. Those are in a separate commit, but same
PR.
Fixes#6672
---------
Signed-off-by: Nathan Gauër <brioche@google.com>
This PR pulls the upstream change, Emulate TrackingVH using WeakVH
(8a6238201f),
into DXC.
Here's the summary of the change:
> This frees up one slot in the HandleBaseKind enum, which I will use
later to add a new kind of value handle. The size of the HandleBaseKind
enum is important because we store a HandleBaseKind in
> the low two bits of a (in the worst case) 4 byte aligned pointer.
>
> Reviewers: davide, chandlerc
>
> Subscribers: mcrosier, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D32634
This is part 2 of the fix for #6659.
When optimizing an overflow check of an add followed by a compare, the
new instruction was being inserted at the compare, and the add removed.
This produced invalid IR in cases where there were other uses of the
former add between it and the compare. This fix makes sure to insert the
new instruction at the old add location, rather than at the compare.
Note that this was also fixed in LLVM:
6f5dca70ed
In certain cases of unreachable code, SimplifyCFG could try to replace a
phi node with a select where the phi node itself was the select's
condition. This resulted in an ASAN use-after-free during SimplifyCFG.
The test case added isn't quite ideal because by the end of the
SimplifyCFG pass, the phi node is restored to its original state both
before and after this fix. However, an ASAN build of `dxopt` or
`check-clang-dxc` will identify a heap-use-after-free failure in the
intermediary steps of this test without this patch and succeeds with it.
This was also fixed in upstream LLVM:
602ab24833
This PR contains two changes:
1) Moves a pragma to disable a warning, which seems to be required by
the new compiler.
2) Adds a preprocessor define to workaround the crashes caused by the
runner image mismatching C++ runtime versions.
The second change we will want to revert once the runner images are
fixed. The issue tracking the runner images is:
https://github.com/actions/runner-images/issues/10004
Related #6668
Vulkan implementation can have different limits on the maximum value
used as an id in a SPIR-V binary. SPIRV-Tools generall assumes this
limit is 0x3FFFFF because all implementations must support at least that
value for an id. Since many implementations can support larger values,
the tools allows an option that will set a different limit. This commit
add an option to DXC to do the same.
Fixes#6636
When processing global values to determine when to flatten vectors, this
pass was only checking the immdiate users of the value for non-dynamic
indexing of the vector. But this would fail in the case of a dynamic
indexed GEP of a constant indexed GEP (e.g. h[0][a]) because the first
level GEP was constant indexed, but not the second. We fix this by
checking the full User tree of the value in `hasDynamicVectorIndexing`.
Before doing any major surgery on an exit from loop L, ensure that if an
exit edge from L goes to block X, then X is in L's parent loop or no
loop at all.
Add test cases:
- a reduced test case where the exiting block does not dominate its own
loop latch.
- a reduced test case where the exiting block is the latch for its own
loop. This reproduces the assert triggered by the original HLSL.
- the original HLSL that triggered this bug fix.
- the intermediate module from the original HLSL, taken just before the
attempt to remove unstructured loop exits.
This PR pulls the upstream change, Use accessors for ValueHandleBase::V;
NFC
(6f08789d30),
into DXC.
Here's the summary of the change:
> This changes code that touches ValueHandleBase::V to go through
getValPtr and (newly added) setValPtr. This functionality will be used
later, but also seemed like a generally good cleanup.
>
> I also renamed the field to Val, but that's just to make it obvious
that I fixed all the uses.
This is part 1 of the fix for #6659.
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
This is to fix compile error like
```
../../third_party/dxc/utils/TableGen/AsmWriterEmitter.cpp(974): error C2666: '`anonymous-namespace'::IAPrinter::operator ==': overloaded functions have similar conversions
../../third_party/dxc/utils/TableGen/AsmWriterEmitter.cpp(739): note: could be 'bool `anonymous-namespace'::IAPrinter::operator ==(const `anonymous-namespace'::IAPrinter &)'
../../third_party/dxc/utils/TableGen/AsmWriterEmitter.cpp(739): note: or 'bool `anonymous-namespace'::IAPrinter::operator ==(const `anonymous-namespace'::IAPrinter &)' [synthesized expression 'y == x']
../../third_party/dxc/utils/TableGen/AsmWriterEmitter.cpp(974): note: while trying to match the argument list '(`anonymous-namespace'::IAPrinter, `anonymous-namespace'::IAPrinter)'
```
in dawn project which is updating to newer VS toolchain.
This imports fix from
4ab57cd9ab.
more context is in https://issues.chromium.org/issues/341890053#comment2
Add a pass to run hlsl::RemoveUnstructuredLoopExits in isolation
Example: opt -dxil-r-u-l-e a.ll -S
Add some basic tests.
No functional change to the pass itself.
Sampler feedback resource types are not supported by the SPIR-V backend,
but they would previously fail silently until a function was called on
them. This change makes the error message more explicit on the type.
Related to #6614