An earlier version of the D3D runtime doesn't handle CheckFeatureSupport
for FEATURE_SHADER_MODEL when the corresponding struct is set to an
unrecognized shader model. Instead of returning the highest supported
shader model that is less than the one provided as documented, the call
fails. In practice, this only occurred for 6.6 tests where the SDK had
6.6 capability, but the installed runtime did not.
To work around this, we don't verify that the call succeeds. Instead,
a failing return result is interpreted the same way as a highest shader
model value that is lower than the one requested.
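A minimal sketch of that interpretation, not the actual test code. The enum, the `ShaderModelQuery` callback, and `SupportsShaderModel` are hypothetical stand-ins for the D3D12 types and the CheckFeatureSupport call, so the logic can be read without the Windows headers:

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for the D3D shader model enum (values are assumptions).
enum ShaderModel : unsigned { SM_6_5 = 0x65, SM_6_6 = 0x66 };

// Stand-in for CheckFeatureSupport(FEATURE_SHADER_MODEL): takes the requested
// model, writes back the highest supported one, and returns false on failure.
using ShaderModelQuery = std::function<bool(ShaderModel &highest)>;

// Returns true only when the runtime reports at least the requested model.
// A failing query is treated like a reported model below the request.
bool SupportsShaderModel(const ShaderModelQuery &query, ShaderModel requested) {
  ShaderModel highest = requested;
  if (!query(highest))
    return false; // Older runtime: an unrecognized value makes the call fail.
  return highest >= requested;
}

// Usage sketch: a runtime that rejects the unrecognized model outright.
inline void OldRuntimeExample() {
  ShaderModelQuery failing = [](ShaderModel &) { return false; };
  assert(!SupportsShaderModel(failing, SM_6_6));
}
```

Either way the caller sees "not supported" and can skip the 6.6 test instead of failing it.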
Because version 6 of the command list pointer was being created whenever
the available SDK supported it, sometimes the test would try to create
version 6 where it wasn't supported.
Instead, this just stores the base pointer type and casts it up where we
know support is available.
- VFS captures output files for duration of test, enabling:
- %dxl test IDxcLinker
- add -D to FileCheck args to supply defined variables
- report failing RUN command when not consumed by FileCheck or XFail
This extension adds qualifiers for payload structures, accompanied by semantic checks and code generation. This feature is opt-in for SM 6.6 libraries. The information added by the developer is stored in the DXIL type system and a new metadata node is emitted during code generation. The metadata is not necessary for correct translation of DXIL, so it may be safely ignored, but it provides hints to unlock potential optimizations in payload storage between DXR shader stages.
Currently, arrays of min16float etc. types don't get a RelaxedPrecision decoration for the OpVariable. This has some real performance implications since at least some hardware can fit more RelaxedPrecision variables into registers, and maybe also groupshared memory.
This pull request adds the missing RelaxedPrecision decoration for OpVariables of array types, in the same way that vector and matrix minimum precision types are already handled.
The mesh shader path for QuadReadTest expects at least one render target. Without one, the test framework attempts to clear a nonexistent RTV resource:
D3D12 ERROR: ID3D12CommandList::ClearRenderTargetView: Specified CPU descriptor handle ptr=0xCCCCCCCCCCCCCCCC does not refer to a location in a descriptor heap.
The commit in this PR simply adds an RTV so that the test proceeds through to completion.
Comparing two unbound arrays where one was initializing the other
resulted in infinitely incrementing both and never reaching a
conclusion.
This adds detection of that situation to FlattenedTypeIterator, whereupon
the comparison is terminated and the result returned.
A simple test of the assignment is added expecting the correct error.
Cycles in the dependency graph can result in phis and other instructions
failing to get a wave sensitivity result after analysis. This can only
happen when nothing in the loop has wave sensitive operations.
By collecting phis that are still unknown, marking them as non-sensitive
after analysis is complete, and rerunning it, we can correctly evaluate
their dependents too.
Because incompletely processed blocks might still be wave-sensitive,
unknown phis are only queued for reprocessing once all of their preds
have been processed fully.
In addition to preventing wave sensitivity determination on
dependent operations, cycles can prevent entering blocks that depend on
values with cyclic dependencies. Even when reprocessing all unknown
phis, this can leave some values without a determination.
To resolve this, all blocks are added to the block list from the start
to ensure that they are all processed eventually. Since instruction
processing is more intelligent and targeted, this change is accompanied
by a change that only processes a single block from the worklist before
attempting to address queued instructions again.
To catch wave-sensitive instructions sooner, when visiting an
instruction, all arguments are checked for wave sensitivity even after
an unknown argument is found, whereas before an unknown argument ended
the search.
Add some tricky loop tests that failed in ToT and various intermediate
stages of the change. Add a more aggressive assert to check for failures
to find the wave-sensitivity.
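The core idea can be sketched on a much simpler model than the real analysis. The sketch below is not the pass itself: it ignores blocks, preds, and the worklist, and the `Value` record and function names are hypothetical. It only shows why phis left unknown at a fixed point can be marked non-sensitive, and why scanning every operand matters:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class WaveState { Unknown, Sensitive, NonSensitive };

// Hypothetical value record: operands are indices into the value list.
struct Value {
  bool isWaveOp = false; // Directly wave-sensitive operation.
  bool isPhi = false;
  std::vector<std::size_t> operands;
};

// Every operand is inspected, so a sensitive operand is still found when an
// earlier operand is unknown.
static WaveState Visit(const Value &v, const std::vector<WaveState> &state) {
  if (v.isWaveOp)
    return WaveState::Sensitive;
  bool sawUnknown = false;
  for (std::size_t op : v.operands) {
    if (state[op] == WaveState::Sensitive)
      return WaveState::Sensitive;
    if (state[op] == WaveState::Unknown)
      sawUnknown = true; // Keep scanning the remaining operands.
  }
  return sawUnknown ? WaveState::Unknown : WaveState::NonSensitive;
}

// Revisit unknown values until nothing changes.
static void Propagate(const std::vector<Value> &values,
                      std::vector<WaveState> &state) {
  for (bool changed = true; changed;) {
    changed = false;
    for (std::size_t i = 0; i < values.size(); ++i) {
      if (state[i] != WaveState::Unknown)
        continue;
      state[i] = Visit(values[i], state);
      if (state[i] != WaveState::Unknown)
        changed = true;
    }
  }
}

std::vector<WaveState> Analyze(const std::vector<Value> &values) {
  std::vector<WaveState> state(values.size(), WaveState::Unknown);
  Propagate(values, state);
  // At the fixed point an unknown value has no sensitive operand, so unknown
  // phis sit on cycles without wave-sensitive operations.
  for (std::size_t i = 0; i < values.size(); ++i)
    if (values[i].isPhi && state[i] == WaveState::Unknown)
      state[i] = WaveState::NonSensitive;
  Propagate(values, state);
  // Aggressive check: assumes every cycle passes through a phi, as in SSA.
  for (WaveState s : state)
    assert(s != WaveState::Unknown);
  return state;
}
```

For a loop such as `v1 = phi(v0, v2); v2 = add(v1, v0)` with a constant `v0`, the first pass leaves `v1` and `v2` unknown and the second pass resolves both as non-sensitive.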
Fixed #2556 (again)
An unrelated earlier change mistakenly removed the return on a line
calling the Scatterer constructor for an argument value. As a result,
the constructor is called, but the resulting object which would insert
new extraction ops in the entry block is not used at all. Instead,
the default case is encountered where the vector PHI op is used as an
insertion point, which places the extractions after the new scalar PHIs
that use them. This is caught in the loop case by an assert because the
assumptions that caused LCSSA to be skipped for this BB are faulty.
ShaderOpTest::CreatePipelineState needs to check the return code from ID3D12Device2::CreatePipelineState so failures immediately stop here.
Otherwise we might get a failure in a subsequent call, such as when CommandListRefs::CreateForDevice() calls ID3D12Device::CreateCommandQueue(), which fails with return code 0x887a0005: The GPU device instance has been suspended. Use GetDeviceRemovedReason to determine the appropriate action.
Gather operations that take four separate offsets are meant to allow for
programmable, non-immediate values. This removes them from the immediate
and range validations by giving the immediate versions their own opcodes.
For these intrinsics and loads, errors are generated when offsets are non-literal
or out of range. These largely use the slightly altered validation paths that the
sample intrinsics use.
Reword error when texture access offsets are not immediates to be
more all encompassing and grammatical.
Incidentally remove duplicate shaders being used for the validation
test from the old directory while identical copies in the validation
directory went unused. Redirected validation test to the appropriate
copies. This is consistent with the test shader re-org's stated intent.
Sample operations have a maximum of three offset args, but gather has a maximum of
two since it always operates on 2D images. So gather operations only
pass in two offset args to the offset validation, which resulted in an
invalid access.
Rather than adding a specialized condition to evade this, just iterate
over the number of elements in the array. For sample it will be 3 and
for gather 2 and it will still check for expected undefined args
appropriately.
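A sketch of the loop shape described above, not the validator's actual code. `OffsetArg`, `ValidateOffsets`, and the [-8, 7] immediate range are assumptions made for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical offset argument: either undefined or an immediate value.
struct OffsetArg {
  bool defined;
  int value;
};

// Assumed legal range for immediate texture offsets.
constexpr int kMinOffset = -8;
constexpr int kMaxOffset = 7;

// The first `dims` offsets must be in-range immediates and any remaining
// ones must be undefined. The loop bound comes from the array itself, so a
// two-element gather array is never read at index 2.
bool ValidateOffsets(const std::vector<OffsetArg> &offsets, std::size_t dims) {
  assert(dims <= offsets.size());
  for (std::size_t i = 0; i < offsets.size(); ++i) {
    const OffsetArg &arg = offsets[i];
    if (i < dims) {
      if (!arg.defined || arg.value < kMinOffset || arg.value > kMaxOffset)
        return false;
    } else if (arg.defined) {
      return false; // Expected an undefined arg past the resource dimensions.
    }
  }
  return true;
}
```

The same function then serves a three-element sample array and a two-element gather array without a special case.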
For the offset legalization pass, the opcode is used to determine the
start and end of the offset args.
Only produce the loop unroll suggestion when within a loop
Base error line on call instruction instead of source of the offset
Sort by location in source when possible and remove duplicates
Adapt tests to verify and match these changes
Fixes #2590
Fixes #2713
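The sort-and-deduplicate step can be sketched as follows. This is an illustration only: the `Diag` record and `SortAndDedupe` are hypothetical, and the sketch assumes every diagnostic carries a source location, whereas the change above only sorts when a location is available.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical diagnostic record keyed by source location.
struct Diag {
  unsigned line;
  unsigned col;
  std::string message;
};

// Order diagnostics by source location, then drop exact duplicates.
void SortAndDedupe(std::vector<Diag> &diags) {
  auto less = [](const Diag &a, const Diag &b) {
    return std::tie(a.line, a.col, a.message) <
           std::tie(b.line, b.col, b.message);
  };
  auto same = [](const Diag &a, const Diag &b) {
    return std::tie(a.line, a.col, a.message) ==
           std::tie(b.line, b.col, b.message);
  };
  std::sort(diags.begin(), diags.end(), less);
  // Duplicates are adjacent after sorting, so std::unique can remove them.
  diags.erase(std::unique(diags.begin(), diags.end(), same), diags.end());
  assert(std::is_sorted(diags.begin(), diags.end(), less));
}
```

Sorting on the full (line, column, message) tuple keeps the output order stable across runs, which also makes the tests easier to match.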
To test the flag indicating support for 64-bit atomics on heap
resources, the groupshared and raw tests are altered here to use root
descriptors. A nearly identical raw buffer test is kept for heap
testing, and the typed resource test, which implicitly requires heap
support, is unaltered. 32-bit testing, encompassing typed resources and
all else, is kept in the typed variants.
Renamed a few things to make the names less confusing.
According to the Vulkan specification, when using `OpImageRead`/`OpImageWrite`, the `OpTypeImage` (`Buffers`, `RWBuffers`, `RWTextures`) must have a format that matches the format on the API side, unless the `StorageImageReadWithoutFormat`/`StorageImageWriteWithoutFormat` capability is added and `Unknown` is used as the format.
This pull request addresses #2498 for the format part by adding an attribute `[[vk::image_format("<image format as spelled in SPIR-V spec>")]]`. Example of the syntax:
```
[[vk::image_format("rgba8")]]
RWBuffer<float4> Buf;
[[vk::image_format("rg16f")]]
RWTexture2D<float2> Tex;
RWTexture2D<float2> Tex2; // Works like before
```
The `image_format` attribute only applies to **global variables** of type `Buffer`, `RWBuffer`, `RWTexture`. For local variables and function parameters it is propagated by the inlining pass in legalization. This required a small change to one of the passes in SPIRV-Tools, which should also be checked by someone more familiar with the codebase: https://github.com/KhronosGroup/SPIRV-Tools/pull/4126
Note that this does not fix the handling of an unspecified format (that case still works like before, using `R32f`, etc. based on the type in the shader), although it should still be fixed to add `StorageImageReadWithoutFormat` and/or `StorageImageWriteWithoutFormat` and use `Unknown`. But I think the ability to specify the format is more urgent.
Design note from Jaebaek:
Since the `image_format` attribute only applies to **global variables**, under the DXC architecture
only `DeclResultIdMapper` can check the attribute when it handles `VarDecl`s. It means
we have to pass the `image_format` information to `LowerTypeVisitor` because it cannot access
`VarDecl`. In order to pass the `image_format`, we use `SpirvContext`, which can be accessed by
`SpirvEmitter` and all visitors. We use a `SpirvVariable` to `spv::ImageFormat` mapping because the
attribute only applies to **global variables** (not to image types).
See how we use `llvm::DenseMap<const SpirvVariable *, spv::ImageFormat> spvVarToImageFormat`.
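A rough sketch of that side-table pattern, using `std::unordered_map` in place of `llvm::DenseMap` so it builds without LLVM. `ImageFormat`, `SpirvVariable`, `ContextSketch`, and its method names are hypothetical stand-ins, not the real DXC classes:

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical stand-ins for spv::ImageFormat and SpirvVariable.
enum class ImageFormat { Unknown, Rgba8, Rg16f };
struct SpirvVariable {
  const char *name;
};

// Shared context: the declaration mapper records the format when it sees the
// attribute, and a later visitor looks it up by variable.
class ContextSketch {
public:
  void registerImageFormat(const SpirvVariable *var, ImageFormat format) {
    assert(var != nullptr);
    spvVarToImageFormat[var] = format;
  }

  // Variables without the attribute fall back to Unknown in this sketch.
  ImageFormat getImageFormat(const SpirvVariable *var) const {
    auto it = spvVarToImageFormat.find(var);
    return it == spvVarToImageFormat.end() ? ImageFormat::Unknown : it->second;
  }

private:
  std::unordered_map<const SpirvVariable *, ImageFormat> spvVarToImageFormat;
};
```

Keying the table by variable pointer rather than by image type matches the note above: two variables of the same type can carry different formats.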
When a resource comes from a binding, we can't determine if it is on a
heap or not, so we set a flag on it to let the runtime sort it out.
When a resource comes from the heap directly, we know it's on the heap
so we set the shader flag to check for platform support.
Refactor Derivatives test to be clearer, more correct, and more in
keeping with the original intent. Center index is now nearer the center.
Conversion of 2D index to 1D is now correct for the quad layout.
Dispatch sizes are separated by CS, Mesh, and undefined results to make
testing easier and more complete.
Also adds the 2D quad formats while adapting 1D variants to be correct
with the latest spec.
Any extension function that includes the special string "$o" in its name
needs to have the $o replaced with the type name of the overload. Previously
we used a default heuristic to select the overload type from a function
argument.
This commit adds support for explicitly setting the argument to use for the
overload name by appending a ":<ArgIndex>" to the overload marker. For example,
using a name like "my_special_function.$o:3" would take the overload type from
the third function argument.
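A sketch of how such a marker might be parsed and replaced; it is not the code from this commit. `OverloadMarker`, `FindOverloadMarker`, `ReplaceOverloadMarker`, and the type name `f32` are hypothetical. The sketch only extracts the number after the colon and leaves the mapping from that number to a function argument to the caller:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical result of locating "$o" or "$o:<ArgIndex>" in a name.
struct OverloadMarker {
  std::size_t begin = 0;    // Position of '$'.
  std::size_t end = 0;      // One past the marker, including any ":<digits>".
  bool hasArgIndex = false; // False means the default heuristic applies.
  unsigned argIndex = 0;
};

bool FindOverloadMarker(const std::string &name, OverloadMarker &out) {
  const std::string marker = "$o";
  std::size_t pos = name.find(marker);
  if (pos == std::string::npos)
    return false;
  out = OverloadMarker();
  out.begin = pos;
  out.end = pos + marker.size();
  // Optional ":<ArgIndex>" suffix; a colon without digits is left untouched.
  if (out.end < name.size() && name[out.end] == ':') {
    std::size_t cur = out.end + 1;
    unsigned value = 0;
    while (cur < name.size() &&
           std::isdigit(static_cast<unsigned char>(name[cur]))) {
      value = value * 10 + static_cast<unsigned>(name[cur] - '0');
      ++cur;
    }
    if (cur > out.end + 1) {
      out.hasArgIndex = true;
      out.argIndex = value;
      out.end = cur;
    }
  }
  return true;
}

// Substitute the whole marker with the chosen overload type name.
std::string ReplaceOverloadMarker(const std::string &name,
                                  const OverloadMarker &m,
                                  const std::string &typeName) {
  assert(m.begin <= m.end && m.end <= name.size());
  return name.substr(0, m.begin) + typeName + name.substr(m.end);
}
```

With the name from the example above, the parse yields index 3, and replacing the marker with `f32` gives `my_special_function.f32`.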
- SROA was not skipping the handle type (now present with the dynamic resources
  changes) or ConstantBuffer<> / TextureBuffer<> types, as it should.
- Updated unit tests to add combinations of -Od and -Zi.
The order of globals() is inconsistent. Since this test depends on that
order, it fails sometimes. By removing the individual characteristics of
the errors, the issue is resolved.