The original payload access qualifiers change disallowed disabling them
on 6.8+ versions. This wasn't an intended feature of 6.8. We'd prefer to
maintain the ability to disable these when needed to avoid issues that
might crop up.
This change extends respect of the disable-payload-qualifiers flag for
all versions. It rewords the warning that is produced when they are
encountered when disabled to be more applicable. Tests are also added to
verify that the implicit enabling in 6.7+ works in the basic cases and
also that disabling in the 6.7+ cases works as well.
Fixes#6462
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
When deleting an unused memcpy, ScalarReplAggregatesHLSL was attempting
to delete both the target and the source of the memcpy without first
checking if they were both same, resulting in a double-delete.
According to the documentation: "A hull shader is implemented with an
HLSL function, and has the following properties: [...] The shader output
is between 1 and 32 control points".
https://learn.microsoft.com/en-us/windows/win32/direct3d11/direct3d-11-advanced-stages-tessellation#hull-shader-stage
This change adds a check to verify that the outputcontrolpoints
attribute is set on Hull shaders and has an argument in the valid range.
Fixes#3733 (by making the error consistent between backends)
When lowering an hl.cast, when the operand was an undef matrix, the pass
would insert a call to a mat2vec stub, but since the undef value is not
an alloca, it never gets handled, and the call to the temporary stub
remains. Since the stub FunctionVal gets deleted, when the instruction
is accessed in a future pass, it reads a dangling pointer.
The fix is to handle undef similarly to how constant 0 is handled, and
to return an undef vector from lowerHLCast.
---------
Signed-off-by: Chris Bieneman <cbieneman@microsoft.com>
Co-authored-by: Antonio Maiorano <amaiorano@google.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
If last arg lpUsedDefaultChar is non-null, the function must set this
value to true or false. WideToEncodedString conditionally passes in a
pointer to 'usedDefaultChar', which was not getting written to, and
causing ubsan to fail on reading an invalid boolean value:
```
.../lib/DxcSupport/Unicode.cpp:156:14: runtime error: load of value 208, which is not a valid value for type 'BOOL' (aka 'bool')
```
This commit adds a new optional -ftime-trace-granularity option that is
already implemented in llvm-project. This change is a surgical port of
an existing feature from the upstream llvm-project repo into the DXC
codebase.
The following commits in the llvm-project repo were copied and followed
for this change. Clean cherry picks were not possible due to the
differences in repos like change of file locations and other dependant
changes made in the repo.
*** Adds the granularity configuration setting ***
'Time profiler: small fixes and optimizations'
Commit: 26536728591d5fdac373ef535ae122b873f73292
*** Wires up the commandline option -ftime-trace-granularity to the
TraceProfiler code ***
'[Support] Fix -ftime-trace-granularity option'
Commit: 4fdcabf259c4ab94654e6cd5d95d0e0313159c70
Fixes#6372
---------
Signed-off-by: Cooper Partin <coopp@ntdev.microsoft.com>
Co-authored-by: Cooper Partin <coopp@ntdev.microsoft.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
This change adds NodeRecordType alignment field to RDAT to make it
possible to validate pointer and stride alignment in the runtime.
This includes a change to DXIL metadata to preserve the record alignment
without requiring recovery by looking for GetNodeRecordPtr.
Fixes#6270
AllocateRayQuery returns an int, but it's not really an int: it's a
handle to the query. Thus, it's not sensible for the PIX debugging
instrumentation to attempt to write it out to the debug-data-UAV. The
previous code did an explicit check against the type of the value to be
written, but if that value is actually a phi itself, then that check
would fail. So now we recursively run through the phi values looking to
see if any of its antecedents are a phi, and if so, refuse to send its
value to the UAV.
This change exposes the clang 3.7 `-fdiagnostics-format=` flag in DXC with supported values `clang`, `msvc`, `mdvc-fallback`, and `vi`.
This option allows having DXC output diagnostic messages in string formats that are recognized natively by Visual Studio.
Resolves#6160 & #1811, both of which had been closed as not planned.
ShaderCompatInfo identifies compatibility for functions that could be
called from an entry point.
Currently, this checking detects compatibility problems between entry
points and internal function calls that the validator otherwise misses.
This change adds a check for ShaderCompatInfo, recursing into called
functions looking for a source of conflict when not compatible with the
current entry point properties.
Errors are emitted for the entry point and at one source of each type of
conflict that ShaderCompatInfo detects.
A function is considered the source of a conflict when it has the
conflict but none of the functions called from this function have this
conflict.
Removed early exit for ShaderFlags validation when module is a library,
since these flags should be validated for libraries, and running
CollectShaderFlagsForModule fills in the ShaderCompatInfo data we need
for validation.
Also fixed tests for illegal barrier and derivative use, as well as
replacing the ignored MaxOutputRecords with the correct MaxRecords
attribute.
Fixes#6292.
It turns out that in the prior validator version, if a subobject
required DXR 1.1, the DXR 1.1 flag would be set on each function in RDAT
output, as well as the global flags. It didn't appear this was the case
through a D3DReflect test because that goes through disassembly to
assembly step, where subobjects are lost because they aren't in the llvm
IR. The previous change assumed this was not the case when the
subobjects were removed, but this removal occurs after the RDATWriter
constructor, which does the full RDAT serialization since size if
required right away.
This restores the detection code and hooks it into
DxilModule::ComputeShaderCompatInfo when validator version is in range
[1.5, 1.7]. DXR 1.1 was introduced in 1.5, and we no longer tag every
function as requiring DXR 1.1 based on subobjects in 1.8.
At the moment, there is no way to CHECK the subobject RDAT in D3DReflect
because they get stripped from the module (even reflection and debug
modules) before serialization, and the test path for D3DReflect goes
through a disassemble/re-assemble step between the prior stage and the
D3DReflect stage.
Fixes#6321 (really).
Fix barrier allowed ops and flags by shader kind
New barrier operations lacked validation and for RDAT info: had
incorrect min target and shader stage flags.
- Identify barrier DXIL operations with new `is_barrier` in `hctdb.py`
and generated `OP::IsDxilOpBarrier`.
- Identify when a barrier op requires shader stage with group
(compute-like stage), or when it requires node memory.
- Add new `OptFeatureInfo_RequiresGroup` to identify function only
compatible with a shader stage with a visible group for access to
groupshared memory or use of group sync.
- Translate to original `BarrierMode` when compatible; adds
`BarrierMode::Invalid` to identify invalid cases.
- Account for `DXIL::MemoryTypeFlags::AllMemory` being allowed and
auto-masked by driver.
- Properly set min shader model and compatible shader stage flags.
- Validate barrier for shader stage.
- Added new barriers to counters which were missing.
Adressing parts of: #6256 and #6292Fixes#6266
In validator version 1.7 and below, the RaytracingTier1_1 module flag
was set on every function if any StateObjectConfig subobject had the
AllowStateObjectAdditions flag set. This was incorrect, as the
requirement is validated in the runtime based on the use of the
subobject instead. Subobjects are removed from the module and placed in
RDAT during container serialization, so the requirement would be lost
when recomputing the flags in validation. This didn't break validation
because flag validation was completely disabled for libraries!
This change fixes this problem, and allows DxilValidation to validate
ShaderFlags because they will no longer mismatch due to this issue.
Running CollectShaderFlagsForModule is also necessary for collecting
per-function shader flags which will be used by call graph validation in
a subsequent change, so enabling flag validation unblocks that as well.
Fixes#6321
In EmitGetNodeRecordPtrAndUpdateUsers, the type will be mutated. And the
GEP user of the RecordPtr will be merged at same time. This make things
complex because the GEP index need to be updated since type is mutated.
To make things easier, merge the GepUse after mutate type.
Fixes#6223
PIX's code for parsing debug data operates at the module level. When the
same global is referenced by multiple functions in a module, that
variable is referred to by multiple dbg.value/dbg.declare statements,
and those are mapped (by the PIX passes) to multiple fake allocas using
its usual scheme. This code was written before libraries were a thing,
and wasn't expecting this duplication. A little more attention to the
variable's scope fixes the issue.
Also, the changed code's original "return false" broke the whole process
of discovering variables with the results that PIX's shader debugger
locals window was completely empty. Makes more sense to ignore the one
variable and keep going.
This change switches from instrumentation per instruction to
instrumentation per basic block. Furthermore, not every instruction in a
basic block needs to write debug data to the output UAV- it's enough to
know that the basic block was entered, as long as the calling
application can figure out what instructions were in that block. To
support that knowledge, the pass now emits a text "precis" of each basic
block.
Also, the previous branchless UAV bounds enforcement was replaced with
something similar that emits fewer instructions at the cost of a larger
UAV. This tradeoff is WELL worth it.
Additionally, the debug pass used to add extra blocks in order to
solidify the arguments to phi instructions. This work was unnecessary,
and added a lot of complexity to the resulting instrumented shader. The
debugger application is only interested in the value of the phi itself
and the actual value produced via the actual preceding edge.
Full details are in the comments in the code.
This change reduces driver-side compilation overhead from "overnight" to
2 minutes on a 160k-instruction shader.
Metadata loading for node record properties assumed fixed positions for
values in a (key,value) property list, skipping and ignoring the keys.
This was not correct, fragile, and not the intended implementation of
the key,value property pattern.
There is no functional change, but this makes the loader robust to
additional properties instead of crashing or loading the wrong thing.
Fixes#6276
Upon further discussion, the team has agreed that in certain degenerate
cases, the current diagnostics are insufficient.
In the case that min == max in the wave size range attribute, a
defaultError warning should be emitted. Additionally, there should be an
explicit way to handle the case where 0,0,0 is passed to the wavesize
range attribute.
This PR directly handles and tests both of these cases.
Fixes#6161
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Add ShaderKind::Last_1_8 for shader mask
Add shader model comments before flag groupings in DxilConstants.h and DxilShaderFlags.
Add missing flag checks for min shader model in RDAT function info. Move ShaderCompatInfo computation into DxilModule, propagate callee info.
Move computation of shader model requirements based on flags into DxilModule. Finalize requirements for entry functions in AdjustMinimumShaderModelAndFlags.
Fixes for function level flag tracking:
- DerivativesInMeshAndAmpShaders: use flag to track deriv use, then adjust for entry functions.
- hasUAVs: based on resource use in function instead of global resources.
- WriteableMSAATextures: based on use in function instead of global resources.
- Also catch cases for dynamic res from any use by looking at create/annotate handle, not just the TextureStoreSample op.
- RaytracingTier1_1: move module-level detection to CollectShaderFlagsForModule
- Marked deriv and quad ops as being supported in node.
- Fixed SampleCmpBias to be considered gradient op.
- Update RDAT definitions to dump more useful info for testing
Fixes#6218.
This option is not enabled by default. When enabled, this option adds
the MaximallyReconvergesKHR execution mode to the module, and the
associated extension.
Related to #6222
---------
Signed-off-by: Nathan Gauër <brioche@google.com>
Two related sets of changes:
1) Modify the value-to-declare pass to tolerate bitfields, which are
detected by their habit of being smaller than the type in which they are
contained. The pass produces dbg.declare statements that map each HLSL
variable to some storage. With bitfields, that mapping becomes
many-to-one, because many bitfields can be stored in the same basic
type. Most of the changes are to preserve the size of the bitfield as
the code descends the type hierarchy into the underlying basic type,
which is what motivates these new "type overrride" parameters to some
functions. Complications to watch out for are min16 types, which may or
may not be their own purpose-made (16-bit) types depending on whether or
not they are enabled.
2) Modify the PIX debug-info API to return enough data to PIX to allow
it to extract (by bit shifting) a bitfield from the underlying type.
This bit is also about preserving knowledge of the field's size as
opposed to its containing type's size, hence the new dxcapi
interface/method.
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
#5827 first requires validation at the DXIL level. The validation will
check for functions that have incompatible node launch types, and
intrinsics that represent certain system values.
This PR implements this validation and adds some tests that exercise
this validation.
Fixes#6104
Added DiagnoseUsedHLSLMethodCall which only report error when the
MethodCall is used.
DiagnoseUsedHLSLMethodCall is used in DiagnoseTranslationUnit.
In DiagnoseTranslationUnit, traverse all visited functions in call
graph, diagnose illegal intrinsic call inside these functions.
Shader kind is also checked in DiagnoseUsedHLSLMethodCall.
Fixes https://github.com/microsoft/DirectXShaderCompiler/issues/5855
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
The init of the dia symbol manager presents an unneccessary performance
hit for WinPix. It's quite simple to defer the init, so here we are.
(This dia implementation was originally added specifically for WinPix,
but was then superseded by a simpler API. It's entirely possible that
there are no other clients of the dia interface at all, but it has been
published, so we're probably bound to maintain it.)
(Note that there are other deferment opportunities here, but the larger
pending work is to disentangle the dia implementation from that of the
new "IDxcPix*" API that WinPix uses, which would obviate deferment
altogether, but that's way out of scope for me for now.)
PSV0 MaximumExpectedWaveLaneCount was incorrectly set to 0 for
non-range.
Create struct for WaveSize in DxilFunctionProps.h.
Centralize encoding and validation logic there.
Use validation logic in both SemaHLSL and DxilValidation.
Remove test requiring newer shader model in CodeGenHLSL.
Add comprehensive test for compute and node, cs and lib targets, SM 6.6
and 6.8, testing ast, metadata and RDAT.
Add PSV0 tests to catch incorrect runtime data.
Update validation rules and test for more cases.
The build is failing due to clang-format modifying these lines. I've
changed them to what clang-format wants even though it's a bit odd, and
updated the CMake script to make the error message more clear when this
type of failure occurs.
Wave Size should be able to take a range of possible wavesizes. This PR
aims to implement what is described in this spec:
https://github.com/microsoft/hlsl-specs/pull/149/files
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
New feature flag ShaderFeatureInfo_SampleCmpGradientOrBias = 0x80000000
and
ShaderFeatureInfo_ExtendedCommandInfo = 0x100000000 were added.
They will be set when SampleCmpBias/Grad startInstance/VertexLocation
exists.
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Semantic index assignment for flattened arguments would incorrectly
increment the argument index after an array of struct field was
encountered. It would misinterpret any array as a basic type because it
wasn't a struct type.
This fixes the bug by drilling into the array element before checking
the type.
Fixes#6121
In DxcPixLiveVariables.cpp,
UniqueScopeForInlinedFunctions::AscendScopeHierarchy tries to keep track
of a variable's scope and the scope at which its containing function was
inlined, if any.
This scope duo has to be the same in two places for things to work out:
the scopes inferred from dbg.declare, and then again when the scope
hierarchy is ascended in order to discover all variables accessible from
a given point in the program.
AscendScopeHierarchy was forgetting to update the inlined part of the
scope, resulting in, for example, PIX's debugger losing track of
variables outside of, for example, a for loop, even though the code can
see those variables at that point.
Just setting the inlined scope to be the same as the function scope
fixes this problem.
(In addition to these new tests, existing PIX tests exercise this
situation happily)
The IsRay() function defined in the ShaderModel class can be misleading.
It is included in a set of functions that are used to test whether or
not the shader in question was targeted to a specific shader stage.
However, it is impossible for a shader to be targeted to, say, the
"raygeneration" stage, or any other stages associated with the kinds
that the IsRay() function checks for. Like the "node" shader kind, a
shader can only have these shader kinds if the associated attribute is
found on the function declaration. To prevent confusion, and to keep the
Is*() functions inside the ShaderModel class restricted to only those
targetable shader stages, the IsRay() function can be removed. This PR
comes as a response to #6008
The ExtractRecordStructFromArray intrinsic is left over dead code that
should be removed. This PR removes the intrinsic from existence. In the
cases where this intrinsic was used, DefaultSubscript is used instead.
Moved from #5893 to target the main branch instead.
Fixes https://github.com/microsoft/DirectXShaderCompiler/issues/5351
The misplacement of that "return" in DxilDebugInstrumentation.cpp meant
that a thread would continue to the following call to
addStepDebugEntryValue, even if pix_dxil::PixDxilReg::FromInst had
failed (i.e. returned false), which means that RegNum is not valid
(although initialized to 0).
This meant that PIX was instrumenting a bunch of void-return DXIL
instructions that it shouldn't have.
Didn't think to test that it WASN'T instrumenting instructions, but
herein is added a test to do just that.
Got a couple reasons to do this: 1) so that users can step over the last
value assignment and actually see that value in the debugger and 2) so
that I can easily distinguish the final instruction in DXR traces
wherein two different function's traces might be intermingled in the
debug output data in PIX's UAV.
- In `DxilPipelineStateValidation.h`, add `PSVRuntimeInfo3` derived from
`PSVRuntimeInfo2`, containing one `uint32_t EntryFunctionName` member.
- Update `MAX_PSV_VERSION` and `RuntimeInfoSize()` for new version.
- Add pointer to `DxilPipelineStateValidation` and assign in
`ReadOrWrite` along with other versioned pointers.
- Add `DxilPipelineStateValidation::GetEntryFunctionName()` accessor to
look up name in string table.
- Update PSV0 versioning to key off validator version 1.8 in
`DxilContainerAssembler.cpp`
- Add entry name to string table and set `EntryFunctionName` to offset.
- Add initialization to DxilPSVWriter members
- Add asserts in DxilPSVWriter::write() to ensure buffer sizes not
changed after m_PSVInitInfo is set.
- Add test to DxilContainerTest, adding/updating common functions as
needed
Note: Testing the HS passthrough case revealed problems blocking that
scenario.
Filed issue #6001 to track this.
Fixes#5944
From the comment in one of the changed files:
// If a function is inlined, then the scope of variables within that
function
// will be that scope. Since many such callers may inline the function,
we
// actually need a "scope" that's unique to each "instance" of that
function.
// The caller's scope would be unique, but would have the undesirable
// side-effect of smushing all of the function's variables together with
the
// caller's variables, at least from the point of view of PIX. The pair
of
// scopes (the function's and the caller's) is both unique to that
particular
// inlining, and distinct from the caller's scope by itself.
For users of hlsl::ReplaceUsesForLoweredUDT, only
EmitGetNodeRecordPtrAndUpdateUsers will has V and NewV with different
address space. But there will be no addrspacecast for NodeRecordPtr from
clangCodeGen, because NodeRecordPtr doesn't have address space before
HLLowerOperation.
As a result, when hit addrspacecast in hlsl::ReplaceUsesForLoweredUDT,
the address space of V and NewV should match. So
InsertAddrSpaceCastIfRequired will always has AddrSpace !=
Ty->getPointerAddressSpace().
Removed InsertAddrSpaceCastIfRequired and add assert for address space
mismatch.
Fixes https://github.com/microsoft/DirectXShaderCompiler/issues/5695