This pull request introduces the open-source implementation of hashing
functionality for DXIL containers.
DxilHash.cpp: Implements DXBC/DXIL container hashing functions.
This is the first part of #6808.
This PR moves the DebugFunctionDefinition instruction from the SPIR-V
wrapper to the real function. This supplies debugger users with greater
accuracy.
When a struct contains opaque resources, we must legalize the SPIR-V to
move those resources out of the struct (Vulkan doesn't allow composites
to store opaque resources).
Before this change, any resource array would also be flattened due to
the pass flattening everything by default.
Patched SPIRV-Tools to allow some level of selection in what we flatten.
Not perfect, as we would flatten all arrays or all composites and cannot
pick only one variable to flatten, but for now this should be enough.
Fixes #6745
Signed-off-by: Nathan Gauër <brioche@google.com>
This change ensures that data validation occurs within the container
itself, rather than relying on the module, especially since the module
may be modified during container assembly.
Furthermore, simplifying the validator’s interface would be an added
benefit.
Add option LLVM_ASSERTIONS_NO_STRINGS
When defined, drop the stringized expression, __FILE__, and __FUNCTION__
strings passed to llvm_assert. This dramatically reduces the binary
size, which is useful when enabling assertions in a non-debug build.
Add option LLVM_ASSERTIONS_TRAP
When enabled, this forces asserts to always trap. Currently, on Windows
asserts call RaiseException, while they trap on non-Windows platforms.
This option makes the assertion behaviour consistent across platforms.
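For illustration only, here is a minimal sketch of how an assert macro can drop its strings when LLVM_ASSERTIONS_NO_STRINGS is defined. The names `demo_assert` and `demo_assert_fail` are hypothetical and are not the actual DXC definitions, and `std::abort` merely stands in for a platform trap.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical failure handler for this sketch; not the real llvm_assert.
[[noreturn]] inline void demo_assert_fail(const char *expr, const char *file,
                                          const char *func, int line) {
  std::fprintf(stderr, "Assertion failed: %s (%s:%d, %s)\n", expr, file, line,
               func);
  std::abort(); // stand-in for a trap instruction
}

#ifdef LLVM_ASSERTIONS_NO_STRINGS
// No expression, file, or function strings end up in the binary.
#define demo_assert(e)                                                         \
  ((e) ? (void)0 : demo_assert_fail("", "", "", __LINE__))
#else
#define demo_assert(e)                                                         \
  ((e) ? (void)0 : demo_assert_fail(#e, __FILE__, __func__, __LINE__))
#endif

// Small helper that uses the macro, to show a typical call site.
inline int checked_div(int a, int b) {
  demo_assert(b != 0);
  return a / b;
}
```

Every call site contributes its own expression string in the default configuration, which is why dropping the strings has a noticeable effect on binary size.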
When declaring a lambda with a value-capture default [=, ...], the this
pointer is implicitly captured by value as well. This results in
potentially-unintuitive behavior and has been deprecated in C++20. It
produces a warning in newer versions of clang
(https://reviews.llvm.org/D142639).
This PR makes the implicit captures explicit, preventing the warning. It
does not change the compiled code at all, since it's just removing some
syntactic sugar.
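A small sketch of the kind of rewrite involved, using a made-up `Scaler` type (not code from this PR). The commented-out lambda relies on `[=]` implicitly capturing `this`; the live one spells both captures out.

```cpp
#include <functional>

// Hypothetical type for illustration only.
struct Scaler {
  int factor = 3;

  // Before (implicit `this` capture, deprecated in C++20):
  //   return [=](int x) { return x * factor + bias; };

  // After: `this` and `bias` are captured explicitly. `factor` is still read
  // through the `this` pointer, exactly as it was with `[=]`.
  std::function<int(int)> make(int bias) {
    return [this, bias](int x) { return x * factor + bias; };
  }
};
```

Because `factor` is reached through `this` in both forms, a later change to `factor` is visible to the lambda either way, which is the behavior people tend to find unintuitive with `[=]`.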
In the middle of rewriting expressions like (A*B + A*C + D) to pull
common factor A out, the algorithm finds that there's actually only one
A. This is unexpected, and it fires an assertion.
This can occur when A is a constant, and constant -A also appears in the
terms somewhere else.
There is no harm in this situation, however, because the algorithm then
creates an addition-tree, but with a single element, and that's still
correct.
This bookkeeping issue was fixed later in LLVM, at 95abfa35d6.
Unfortunately the associated test doesn't translate cleanly to DXC-era
LLVM. I've added a test case reduced from our original case.
Fixed: #6829
The code that decides which global variables to include in the implicit
global cbuffer does not check for spec constants or push constants. They
end up being incorrectly included in it, causing problems.
The solution is to add those to the types of variables that should be
skipped.
Fixes #4542
By default, the launch type should be set to ‘Broadcast’ when diagnosing
barriers. However, the current behavior sets the default launch type to
‘Invalid,’ resulting in warnings when the launch type is not explicitly
specified as an attribute.
To address this issue, we’ll adjust the default setting to ‘Broadcast’
and thereby resolve the problem.
Fixes #6836
---------
Co-authored-by: Damyan Pepper <damyanp@microsoft.com>
This makes it possible to define how assert works on all platforms. The
header was already being included by String.cpp, and was already
designed to work for non-Windows platforms.
Also modify the non-Windows llvm_assert to emit the assertion message to
stderr and trap. We cannot call standard assert as we are overriding it
via include dirs, so there's no way to include the standard one and call
it.
If the index on a constant store into an array is negative or out of
bounds, that's an error, but shouldn't make the compiler index a vector
out of bounds.
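A minimal sketch of the guard, assuming a hypothetical helper `storeConstantElement` (the real code lives in the compiler's initializer handling and looks different):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: store `value` at a constant `index` only when the
// index is in range. Returns false so the caller can report the error
// instead of indexing the vector out of bounds.
inline bool storeConstantElement(std::vector<int> &elements, int64_t index,
                                 int value) {
  if (index < 0 || static_cast<uint64_t>(index) >= elements.size())
    return false;
  elements[static_cast<std::size_t>(index)] = value;
  return true;
}
```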
Fixed: #6824
Change the Python regexp ' clang\+\+ ' to ' clang\\+\\+'. We're trying
to match fixed strings like ` clang++ `, but `\+` is not a valid Python
string escape sequence. Use `\\+` so the regexp machinery sees `\+`.
A static lib dxcvalidator is added.
The validators in dxcompiler and dxrfallbackcompiler link against dxcvalidator.
Fixes #6790
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
In #6814, we modified the compiler to avoid generating bad code in some
cases for array initializers. However, this caused a crash in the case
where the initializer does not use a GEP expression for addressing
because the `GV` will be null.
I considered setting `GV` to the value in the `store` pointer operand,
but it looked like `GV` was also checked elsewhere for null and did not
want to modify the behavior of the code in other places.
The fix is to check if we found a global variable before validating the
array case.
When the OpenCL.DebugInfo.100 debug info was implemented, there was no
DebugTypeMatrix. Now that NonSemantic.Shader.DebugInfo.100 has been
merged, we should use DebugTypeMatrix. This PR corrects that oversight.
In CGHLSLMSFinishCodeGen's BuildImmInit, when initializing an array, if
the init value type doesn't match the array element type, we must bail
and instead, have it inject a call to the global ctor. Without this,
builds with asserts enabled would assert later with "Wrong type in array
element initializer". In non-assert builds, this invalid IR would be
removed, and valid code emitted.
See https://github.com/microsoft/DirectXShaderCompiler/issues/5294
This commit adds support for Sampler/Resource descriptor heaps in DXC.
Support for those heaps on the SPIR-V side requires no other extension
than SPV_EXT_descriptor_indexing.
On the Vulkan side, the VK_EXT_mutable_descriptor_type will be required
as multiple descriptor types must be allowed on the same binding.
When loading a type from a heap, DXC generates a new OpRuntimeArray of
the correct type, and binds it to `set=0,
binding=<BindingNumberOfTheHeap>`.
This means multiple OpRuntimeArrays will share the same binding. This is
why VK_EXT_mutable_descriptor_type is required.
This implementation uses at most 3 bindings:
- N OpRuntimeArray as binding A for the ResourceDescriptorHeap
- N OpRuntimeArray as binding B for the SamplerDescriptorHeap
- 1 OpRuntimeArray %counter_type for the ResourceDescriptorHeap
counters.
The bindings are only allocated if used. If only the
SamplerDescriptorHeap is used, a single binding is required.
The binding allocation logic is:
1. Allocate bindings for every resource, excluding heaps.
2. If ResourceDescriptorHeap is used, find the first unused binding in
set=0 and use it.
3. Same for the SamplerDescriptorHeap
4. Same for the counters.
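The allocation order above can be sketched as follows. This is an illustration under assumptions, not the actual DXC code: `HeapBindings`, `takeFirstUnused`, `allocateHeapBindings`, and `kNotAllocated` are hypothetical names, and `usedInSet0` stands for the bindings already taken in set=0 after step 1.

```cpp
#include <cstdint>
#include <set>

constexpr uint32_t kNotAllocated = UINT32_MAX; // marker: heap not used

struct HeapBindings {
  uint32_t resourceHeap = kNotAllocated;
  uint32_t samplerHeap = kNotAllocated;
  uint32_t counterHeap = kNotAllocated;
};

// Finds the lowest binding number not yet in `used` and reserves it.
inline uint32_t takeFirstUnused(std::set<uint32_t> &used) {
  uint32_t binding = 0;
  while (used.count(binding) != 0)
    ++binding;
  used.insert(binding);
  return binding;
}

// Steps 2-4: each heap gets a binding only if it is actually used.
inline HeapBindings allocateHeapBindings(std::set<uint32_t> usedInSet0,
                                         bool usesResourceHeap,
                                         bool usesSamplerHeap,
                                         bool usesCounters) {
  HeapBindings result;
  if (usesResourceHeap)
    result.resourceHeap = takeFirstUnused(usedInSet0);
  if (usesSamplerHeap)
    result.samplerHeap = takeFirstUnused(usedInSet0);
  if (usesCounters)
    result.counterHeap = takeFirstUnused(usedInSet0);
  return result;
}
```

With regular resources on bindings 0, 1, and 3, this sketch would hand out 2, 4, and 5 to the three heaps; with no regular resources and only the SamplerDescriptorHeap used, it hands out binding 0 alone.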
UAV counters are not always created, only if used.
When used, they are stored in an OpRuntimeArray. The index of a counter
in that array is equal to the index of the associated resource in its
own OpRuntimeArray.
```hlsl
RWStructuredBuffer a = ResourceDescriptorHeap[2];
a.IncrementCounter();
// buffer in descriptorSet 0, binding 0, OpRuntimeArray[index=2]
// counter in descriptorSet 0, binding 1, OpRuntimeArray[index=2]
```
As-is, this PR doesn't allow resource heaps to alias regular resources,
or to overlap.
A follow-up PR will add 3 flags to override each binding/set pair:
- 'fvk-bind-resource-heap <set> <binding>'
- 'fvk-bind-sampler-heap <set> <binding>'
- 'fvk-bind-counter-heap <set> <binding>'
---------
Signed-off-by: Nathan Gauër <brioche@google.com>
When the prototype of an entrypoint was defined, the codegen crashed
because it failed to filter this partial declaration when adding
functions to the work-queue.
Fixes #6750
Signed-off-by: Nathan Gauër <brioche@google.com>
We should use the pull_request_target option here so that the PR runs
from the pipeline in the target rather than the PR source branch. This
allows the action to run with reduced security implications.
This fixes a bug where the offsets for elements in vectors with 16-bit
types didn't take alignment bits into account, so PIX wouldn't display
vector element values correctly in the shader debugger. E.g. if
`-enable-16bit-types` wasn't set, the offsets for a min16float4 would be
0, 16, 32, 48 instead of 0, 32, 64, 96.
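The arithmetic behind those numbers, as a rough sketch. `elementOffsetsInBits` is a hypothetical helper, and the 16-bit size with 32-bit alignment is an assumption read off the example above rather than taken from the PIX pass code.

```cpp
#include <vector>

// Hypothetical helper: bit offset of each vector element when every element
// occupies its size rounded up to the alignment.
inline std::vector<unsigned> elementOffsetsInBits(unsigned count,
                                                  unsigned sizeInBits,
                                                  unsigned alignInBits) {
  // Round the element size up to the next multiple of the alignment.
  unsigned stride = (sizeInBits + alignInBits - 1) / alignInBits * alignInBits;
  std::vector<unsigned> offsets;
  for (unsigned i = 0; i < count; ++i)
    offsets.push_back(i * stride);
  return offsets;
}
```

Calling it with (4, 16, 32) gives the corrected 0, 32, 64, 96, while (4, 16, 16) reproduces the packed 0, 16, 32, 48 that was being reported before.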
Also removed the assert in PopulateAllocaMap_StructType that was
checking whether the calculated aligned offset matches the packed offset
(from SortedMembers) because it was false for members with sizes smaller
than the alignment size.
In the CalcResTypeSize function, the UNREFERENCED_PARAMETER macro is
used with a parameter which is referenced in the return statement. That
macro was added in 6ee4074a4b, but it was not removed when
"DxilModule &M" was referenced in 86073a3b0b.
This fixes the following compiler error with clang 18.1.8 with the
mingw-w64 toolchain.
DxilContainerReflection.cpp:1512:3: error: object of type 'DxilModule'
cannot be assigned because its copy assignment operator is implicitly
deleted
1512 | UNREFERENCED_PARAMETER(M);
| ^
winnt.h:1387:40: note: expanded from macro 'UNREFERENCED_PARAMETER'
1387 | #define UNREFERENCED_PARAMETER(P) {(P) = (P);}
| ^
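To show why that macro expansion fails, here is a small sketch with a stand-in type. `ModuleLike` and `resourceCount` are hypothetical; the copy operations are deleted explicitly here, whereas DxilModule's are implicitly deleted.

```cpp
#include <type_traits>

// Hypothetical stand-in for a non-copyable type such as DxilModule.
struct ModuleLike {
  int resourceCount = 0;
  ModuleLike() = default;
  ModuleLike(const ModuleLike &) = delete;
  ModuleLike &operator=(const ModuleLike &) = delete;
};

// The mingw-w64 expansion `{(P) = (P);}` needs copy assignment, which this
// type does not have, so the macro cannot compile when applied to it.
static_assert(!std::is_copy_assignable<ModuleLike>::value,
              "self-assignment would not compile for this type");

// The parameter is referenced, so no "unreferenced" marker is needed at all.
inline int resourceCount(ModuleLike &M) { return M.resourceCount; }
```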
The code that implements `RWByteAddressBuffer::Store` will iterate over
all of the fields in a struct to write each element in the struct.
However, it does not use the "SPIR-V fields", which account for
multiple fields being packed into the same bitfield. This is fixed by
using the `forEachSpirvField` function to make sure that bitfields
are correctly handled.
Fixes #6483
When a scalar variable is passed as the argument to an inout vector
parameter, the scalar is supposed to be splatted. After returning from
the function, we need to extract the first element from the parameter to
store back into the scalar.
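The intended copy-in/copy-out behavior, modeled in plain C++ as a sketch. `float4`, `addOne`, and `callWithScalar` are made-up names for illustration; the actual change is in the SPIR-V codegen.

```cpp
#include <array>

using float4 = std::array<float, 4>; // stand-in for the HLSL vector type

// Stand-in for a function with an `inout float4` parameter.
inline void addOne(float4 &v) {
  for (float &x : v)
    x += 1.0f;
}

// Models passing a scalar to the inout vector parameter.
inline void callWithScalar(float &scalar) {
  float4 tmp;
  tmp.fill(scalar); // copy-in: splat the scalar across the vector
  addOne(tmp);
  scalar = tmp[0]; // copy-out: only the first element is written back
}
```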
Fixes #6568
When the [branch] annotation is used, switches are converted into an
if/else tree.
Issues arise when declaring a variable in a switch case:
- when flattened, the same variable could be traversed twice in case of
a case fall-through.
In addition, there was a small bug in the flattening logic: only breaks
were stopping the handling of further statements, while early-returns
were ignored. This was not an important bug as it only added more
dead-code, but it was wrong.
Fixes #6718
Signed-off-by: Nathan Gauër <brioche@google.com>
IsNan returns a boolean, even if the input type is a float. This was
working in most cases except:
- if the layout was not Void
- if the input type was a matrix
The first bug is because a bool's memory layout/representation is not
specified, and shall never be exposed to externally accessible memory.
Hence, if we saw a layout rule != Void, we converted it to a UINT. When
calling isnan, the layout rule should not be propagated, as we lose any
layout info.
The second is because our codegen assumed matrix operations returned a
matrix with the same type as the input parameters. In the case of isnan,
this was just wrong.
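A sketch of the type relationship in plain C++: the result has the same shape as the input, but its element type is bool rather than float. `Matrix` and `isNanPerElement` are hypothetical names used only for this illustration.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

template <typename T, std::size_t R, std::size_t C>
using Matrix = std::array<std::array<T, C>, R>;

// Element-wise isnan: float matrix in, bool matrix of the same shape out.
template <std::size_t R, std::size_t C>
Matrix<bool, R, C> isNanPerElement(const Matrix<float, R, C> &m) {
  Matrix<bool, R, C> result{};
  for (std::size_t r = 0; r < R; ++r)
    for (std::size_t c = 0; c < C; ++c)
      result[r][c] = std::isnan(m[r][c]);
  return result;
}
```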
Fixes #6712
Signed-off-by: Nathan Gauër <brioche@google.com>
There is an extra "signature" in the Patch Constant signature part of
the disassembler output, causing the "Patch Constant signature
signature" to appear. This PR removes the extra "signature" from the
output.
Example HLSL input:
```hlsl
struct OutputConstantData {
float tessFactor[4] : SV_TessFactor;
float insideTessFactor[2] : SV_InsideTessFactor;
};
OutputConstantData HSConstant() {
OutputConstantData output;
return output;
}
[domain("quad")]
[partitioning("integer")]
[outputtopology("triangle_cw")]
[outputcontrolpoints(1)]
[patchconstantfunc("HSConstant")]
void HSMain() {}
```
Example Disassembler output:
```diff
...
;
- ; Patch Constant signature signature:
+ ; Patch Constant signature:
;
; Name Index Mask Register SysValue Format Used
; -------------------- ----- ------ -------- -------- ------- ------
; SV_TessFactor 0 w 0 QUADEDGE float w
; SV_TessFactor 1 w 1 QUADEDGE float w
; SV_TessFactor 2 w 2 QUADEDGE float w
; SV_TessFactor 3 w 3 QUADEDGE float w
; SV_InsideTessFactor 0 w 4 QUADINT float w
; SV_InsideTessFactor 1 w 5 QUADINT float w
;
...
```
This imports upstream commit 54bff1522f.
This fixes the following compiler error with clang 18.1.8 with the
mingw-w64 toolchain.
CFG.h:916:22: error: expected a qualified name after 'typename'
916 | template <typename CALLBACK>
| ^
minwindef.h:90:18: note: expanded from macro 'CALLBACK'
90 | #define CALLBACK __stdcall
| ^
This change adds a new CMake configuration option
`HLSL_DISABLE_SOURCE_GENERATION` which allows a user to disable
generating the in-tree sources which contribute to DXC's source
releases. This option should only be used by users building DXC and not
modifying it.
Resolves #6728
In the special code to handle the memcpy pattern where a constant buffer
contains a vector array that initializes a local (or static global)
scalar array for use by the shader, an invalid assumption was made that
if the memcpy dest was global, that the src is global as well.
This was not the case, and when expecting to generate constant
expressions to index the src, these generated orphaned instructions
instead, leading to invalid IR.
This fixes the issue by leveraging ReplaceConstantWithInst, and setting
the insertion point for the Builder. Now, replacement *could* fail, if
src instructions don't dominate replacement uses, so
replaceScalarArrayWithVectorArray returns a bool indicating whether all
uses were replaced.
Another issue was that it would replace the dest for the original memcpy
with src along the way. Now, if we don't replace all uses, this turns
the memcpy into a no-op and any remaining uses are no longer coming from
src, but an undef dest instead. This was also fixed to skip this
replacement, then clean up this use if all other uses have been
successfully replaced.
Fixes #6510
The documentation says that we use the HitTKHR builtin to implement
RayTCurrent. However, HitTKHR was renamed to RayTMaxKHR. We update
the documentation to reflect that change.
Fixes #6739
The behavior was changed with #6317 so that regardless of spelling in
the shader, the include path will conform to the host OS style for the
purposes of the include handler. This just adds a release note for that
new behavior.
Fixes #6669
Add guidance for how release notes should be documented at the time
of the change going in as well as some suggestions for how to format
that documentation.
Contributes to #6697
---------
Co-authored-by: Chris B <cbieneman@microsoft.com>
According to the HLSL semantics documentation, the valid semantic
indices for SV_Target[n] are 0 <= n <= 7:
https://learn.microsoft.com/en-us/windows/win32/direct3dhlsl/dx-graphics-hlsl-semantics
A check for this already exists in DXIL validation, but for large values
of n, crashes and/or buffer overruns may occur during compilation before
validation, so an earlier check is needed.
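The early check amounts to a simple range test. A minimal sketch, with `isValidSVTargetIndex` and `kMaxSVTargetIndex` as hypothetical names (the actual check and its diagnostic live in the compiler front end):

```cpp
// Highest semantic index allowed for SV_Target per the HLSL documentation.
constexpr unsigned kMaxSVTargetIndex = 7;

// Hypothetical helper: true when SV_Target[n] uses a valid index. Taking a
// wide unsigned type keeps very large indices from wrapping before the test.
inline bool isValidSVTargetIndex(unsigned long long n) {
  return n <= kMaxSVTargetIndex;
}
```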
Fixes #6115
When enabled, any hlsl::Exception thrown during code generation and
optimization will cause the process to trap.
---
edit: I've changed the implementation to generate a trap, instead of
std::abort
Moving the source file for the README.md for github and nuget releases
into the DXC repo. It should be updated with relevant notes for the
upcoming release whenever they are made in main such that they will
already be available when the release is built.
Part of #6697
Instructions in a BasicBlock are maintained in a doubly-linked list. The
links are "intrusive" in the sense that with inheritance tricks the
Next and Prev nodes are embedded in the Instruction object itself.
The linked list uses a sentinel object to mark the tail of the list.
Previously, the sentinel was an ilist_half_node<Instruction> which just
consists of an 'Instruction* Prev', but then it was cast to Instruction*.
This is flagged by the undefined behaviour sanitizer in some cases.
Also, it's weird and wrong.
Upstream LLVM has entirely reimplemented the intrusive list to avoid
such problems. But that's a massive change.
This change uses a real Instruction object as the sentinel. The
instruction is only used for its Next and Prev properties. I used an
UnreachableInst because it's small and simple.
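A simplified sketch of the idea, not the ilist code itself: the sentinel is a complete node of the same type as the list elements, so no cast from a partial object is ever needed. `Node` and `IntrusiveList` are hypothetical, and the circular layout is a simplification chosen for this example.

```cpp
#include <cstddef>

// The links live inside the element itself, which is what makes the list
// "intrusive".
struct Node {
  Node *Prev = nullptr;
  Node *Next = nullptr;
  int Value = 0;
};

class IntrusiveList {
  Node Sentinel; // a real Node; only its Prev and Next are ever used

public:
  IntrusiveList() { Sentinel.Prev = Sentinel.Next = &Sentinel; }
  IntrusiveList(const IntrusiveList &) = delete;
  IntrusiveList &operator=(const IntrusiveList &) = delete;

  Node *first() { return Sentinel.Next; }
  Node *sentinel() { return &Sentinel; } // marks the tail of the list

  // Links N in just before the sentinel. The list does not own N.
  void push_back(Node &N) {
    N.Prev = Sentinel.Prev;
    N.Next = &Sentinel;
    Sentinel.Prev->Next = &N;
    Sentinel.Prev = &N;
  }

  std::size_t size() const {
    std::size_t n = 0;
    for (const Node *p = Sentinel.Next; p != &Sentinel; p = p->Next)
      ++n;
    return n;
  }
};
```

The cost is that the sentinel carries unused payload, which is the reason for picking a small instruction type in the real change.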
Issue: #6446
The `ResElem` type doesn't exist, and the example here seemed to imply
that the similar `ResRet` type was used in resource metadata. This is
incorrect, so fix the examples to match types that would actually show
up in this metadata.
Note: In practice the metadata doesn't generally actually refer to a
variable, but just an `undef` of the right type. I've opted not to
change the examples to reflect that here to minimize the change, but it
might be nice to describe when/why this occurs.
Fixes #3411