1. Replace extension AMD_shader_explicit_vertex_parameter with
KHR_fragment_shading_barycentric, as they define the same semantics.
2. Support the new usage of SV_Barycentrics and PerVertexKHR-decorated
inputs.
3. Per-vertex inputs are used as arrays if a shader explicitly calls the
new intrinsic GetAttributeAtVertex.
4. Support mixed, deprecated usage of the `nointerpolation` qualifier.
When a non-interpolated input is used outside the new intrinsic, the
access is treated as reading the first provoking vertex: data[0]. The
input is still treated as an array if it is passed as a function call
argument.
5. Because mixed usage of these inputs is now allowed, add SPIR-V
instruction operand replacement in DXC. A new visitor replaces normal
accesses to those inputs with a new AccessChain instruction that targets
the first element.
6. For bool and bool vector types, reconstruct a new bool/boolVec
array in the entry function wrapper.
Adds support for
- `RasterizerOrderedBuffer`
- `RasterizerOrderedByteAddressBuffer`
- `RasterizerOrderedStructuredBuffer`
- `RasterizerOrderedTexture1D`
- `RasterizerOrderedTexture1DArray`
- `RasterizerOrderedTexture2D`
- `RasterizerOrderedTexture2DArray`
- `RasterizerOrderedTexture3D`
Each of these types is treated and lowered as its corresponding `RW`
type, with the addition that loads from and stores to its values are
wrapped with `OpBeginInvocationInterlockEXT` and
`OpEndInvocationInterlockEXT`. If loads from or stores to an ROV type are
present, one of the
- `SampleInterlockOrderedEXT`
- `PixelInterlockOrderedEXT`
- `ShadingRateInterlockOrderedEXT`
execution modes is added to the entry function, based on the semantics
input to the function.
ImageMSArray is only required if OpTypeImage has arrayed=1, MS=1 and
sampled=2. As far as I know, this is never the case for code emitted from
HLSL. No tests or shaders I know of required it, so removing it from DXC
should be OK.
---------
Signed-off-by: Nathan Gauër <brioche@google.com>
Add the `-fspv-preserve-interface` CLI option to prevent the DCE
optimization pass from compiling out interface variables, which happens
when these variables are unused.
The option may be useful when decompiling SPIR-V back to HLSL.
Personally, I need it to convert DX12 shaders to DX11 ones using
SPIRV-Cross as a tool for converting SPIR-V, produced by DXC, to the old
shader model HLSL.
SPIRV-Tools now has a parameter in `RegisterPerformancePasses()` and
`RegisterLegalizationPasses()` for this. This PR creates a new command
line option in DXC and passes it to the `spvtools::Optimizer`.
Closes #4567
* Enable lit by default.
* Update README for git user bin.
* Add DXC_DISABLE_LIT to replace DXC_ENABLE_LIT
* Set -DDXC_DISABLE_LIT=Off for appveyor.
* Keep original name for ClangSPIRVTests
* Set correct path for unit tests binary.
* Remove doc about DXC_DISABLE_LIT
* Remove extra space.
* Remove DxcOnUnix.rst.
Implement the latest `SV_Barycentrics` semantics in DXC, according to the DXC wiki spec.
BaryCoord inputs decorated with variant qualifiers are still mapped to Bary*AMD, because the KHR extension has no matching built-in.
Add support for using an input as the first parameter of GetAttributeAtVertex. This expands the input's dimension to meet the extension spec.
Merge some features from AMD_shader_explicit_vertex_parameter and the current extension for MSAA usage. Keep the generated SPIR-V code as similar to GLSL's as possible.
Out of concern for the HLSL/DXIL part, keep most type modifications and variable decoration changes in the SPIR-V translation unit.
Remove AMD_shader_explicit_vertex_parameter related features.
Support a struct member as a parameter of GetAttributeAtVertex.
Support using Decorator for MSAA features.
* [lit] Port Support %if ... %else syntax for RUN lines from upstream
Based on
1041a9642b
This syntax allows RUN lines to be modified based on the available
features. For example:
RUN: ... | FileCheck %s --check-prefix=%if windows %{CHECK-W%} %else %{CHECK-NON-W%}
CHECK-W: ...
CHECK-NON-W: ...
This is merged to allow the dxilver check to apply to each RUN line.
* hlsl, spirv: allow use of -ffinite-math-only
This option should match GCC's and Clang's behavior.
From the GCC documentation:
```
Allow optimizations for floating-point arithmetic that assume that arguments and results are not NaNs or +-Infs.
This option is not turned on by any -O option since it can result in incorrect output for programs that depend on an exact implementation of IEEE or ISO rules/specifications for math functions. It may, however, yield faster code for programs that do not require the guarantees of these specifications.
The default is -fno-finite-math-only.
```
This commit allows the flag to be used again, and makes SPIR-V min/max
intrinsics use FMin/FMax instead of NMin/NMax when enabled.
Fixes #4954
Signed-off-by: Nathan Gauër <brioche@google.com>
Co-authored-by: Laura Hermanns <laura.hermanns@epicgames.com>
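As a rough illustration of the min/max selection described in the finite-math-only change above, the choice of GLSL.std.450 instruction can be modeled as a small lookup. This is a Python sketch, not DXC's code; the function name `select_min_max_instruction` and its parameters are hypothetical.

```python
def select_min_max_instruction(intrinsic: str, finite_math_only: bool) -> str:
    """Return the GLSL.std.450 instruction name for the HLSL min/max intrinsics.

    Hypothetical helper: with finite-math-only enabled, NaN handling is not
    needed, so the F* variants are chosen; otherwise the NaN-aware N* variants.
    """
    if intrinsic not in ("min", "max"):
        raise ValueError(f"unsupported intrinsic: {intrinsic}")
    prefix = "F" if finite_math_only else "N"
    return prefix + intrinsic.capitalize()


# Default behavior keeps the NaN-aware instructions.
print(select_min_max_instruction("min", finite_math_only=False))
# With the flag enabled, the plain floating-point instructions are used.
print(select_min_max_instruction("max", finite_math_only=True))
```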
This document tries to capture the basics of a unified build and test
workflow for all platforms. It is complete enough to start using but not
yet ready to be the default build mode.
* Support VK_EXT_mesh_shader
* Fix errors when compiling with SPV_NV_mesh_shader
* Minor tweak to amplification shaders
* Fix pre-checkin failure
* Add amplification and mesh tests for EXT_mesh_shader
* Add some comments
* Fix primitive indices
Co-authored-by: Tianyuan <Tianyuan.Wu@amd.com>
* initial code stubbed in to get vk raw buffer store working
* adding in additional intrinsic to support custom alignment of store
* altering test, forgot to add template argument
* some changes to unit test, though not sure checks are correct... and then added support for bool
* updating unit test, correcting some mistakes with the checks
* more changes to unit test
* removing addr variable to see if that's causing tests to fail
* fixing tests
* Updating documentation
* Small typo fix in docs
Assume that matrices are stored in column-major order in raw
buffers, e.g., `ByteAddressBuffer` and `RWByteAddressBuffer`.
Add a new flag, `-fspv-use-legacy-buffer-matrix-order`, so that shaders
that depend on the previous matrix order (row-major) can opt out of this
change.
Fixes: https://github.com/microsoft/DirectXShaderCompiler/issues/3370
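To make the difference between the two orders concrete, here is a minimal Python sketch of how the same matrix would be laid out as a flat sequence of elements under each convention. It only models element order, not byte offsets or padding, and the helper name `flatten_matrix` is hypothetical.

```python
def flatten_matrix(matrix, column_major=True):
    """Flatten a matrix (list of rows) into the order its elements are stored.

    column_major=True models the new default; column_major=False models the
    legacy (row-major) order kept behind the opt-out flag.
    """
    rows, cols = len(matrix), len(matrix[0])
    if column_major:
        # Walk each column top to bottom, then move to the next column.
        return [matrix[r][c] for c in range(cols) for r in range(rows)]
    # Walk each row left to right, then move to the next row.
    return [matrix[r][c] for r in range(rows) for c in range(cols)]


m = [[1, 2, 3],
     [4, 5, 6]]  # a 2x3 matrix
print(flatten_matrix(m, column_major=True))   # [1, 4, 2, 5, 3, 6]
print(flatten_matrix(m, column_major=False))  # [1, 2, 3, 4, 5, 6]
```

A shader that computes element offsets by hand with the old layout in mind would read the wrong elements under the new default, which is the case the opt-out flag is meant for.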
When -fvk-bind-register is used, all resources must have corresponding
register bindings set, including the Globals cbuffer. This change
clarifies the error message when the latter is the cause of failure and
directs the user to specify -fvk-bind-globals.
Fixes #3781
The specification for the HLSL min and max intrinsic functions states
that if one of the values is NaN, the other will be given as the result,
which is correctly represented by the GLSL.std.450 instructions NMin and
NMax, respectively. By the semantics of the previously used FMin and
FMax instructions, this would be undefined behavior.
Fixes #3221
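The NaN behavior described above can be modeled in a few lines of Python. This is only a sketch of the semantics (if exactly one operand is NaN, the other is returned; if both are NaN, the result is NaN), and the names `nmin` and `nmax` are chosen here for illustration.

```python
import math


def nmin(x: float, y: float) -> float:
    """NaN-avoiding minimum: a NaN operand yields the other operand."""
    if math.isnan(x):
        return y
    if math.isnan(y):
        return x
    return min(x, y)


def nmax(x: float, y: float) -> float:
    """NaN-avoiding maximum: a NaN operand yields the other operand."""
    if math.isnan(x):
        return y
    if math.isnan(y):
        return x
    return max(x, y)


nan = float("nan")
print(nmin(nan, 2.0))  # 2.0
print(nmax(1.0, nan))  # 1.0
```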
The specification for the HLSL round intrinsic function states "Rounds
the specified value to the nearest integer. Halfway cases are rounded to
the nearest even.", so translate it to the RoundEven SPIR-V extended
instruction rather than Round.
Fixes #4368
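For reference, round-half-to-even can be sketched in Python as follows. The helper `round_even` is a hypothetical name and handles finite inputs only; it is meant to show the halfway-case rule quoted above, not how any driver implements the instruction.

```python
import math


def round_even(x: float) -> float:
    """Round to the nearest integer; exact halfway cases go to the even one."""
    lower = math.floor(x)
    diff = x - lower
    if diff < 0.5:
        return float(lower)
    if diff > 0.5:
        return float(lower + 1)
    # Exactly halfway: pick whichever neighbor is even.
    return float(lower if lower % 2 == 0 else lower + 1)


for value in (0.5, 1.5, 2.5, 3.5, -1.5):
    print(value, "->", round_even(value))
```

Here 0.5 and 2.5 round down to 0 and 2 while 1.5 and 3.5 round up to 2 and 4, whereas a round-half-away-from-zero rule would give 1, 2, 3 and 4.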
This change does a few things.
First and foremost, it disallows a number of illegal destination
parameters to atomic operations. This includes SRVs and other const
values, non-groupshared and non-resource variables, members of
typed resources, and bitfield members of any resource.
In addition, this correctly propagates const information so that const
destinations can be properly rejected and the information is reflected.
This involves the consolidation and earlier detection of a number of
these failures as well as an expansion of that detection.
Also adds validation checks that the targets are not const and either
UAVs or groupshared address space.
The const errors are produced in clang and manifest as overload
mismatches. The errors for member access of typed buffers, bitfields,
or non-groupshared and non-UAV destinations are generated
as part of the lowering operation. This is where they were before,
and it is easier to replicate them there, though they would be better in codegen.
In some cases, existing errors were moved earlier in the process
and the logic to generate them simplified.
Add compilation and verifier tests for all.
Also adds validation errors for const destination targets and other
invalid targets and tests for the same.
In order to make those tests, I had to give locations to the function
candidate notes for function builtins. This changed a single verifier
test that depended on the locationless notes.
These tests make use of const groupshared variables that are of
questionable utility, but are valid for now.
Incidentally enhances some existing atomics tests to use a more
complex sequence of references and subscripts.
Finally, this incidentally adds some foolproofing against mistakenly
comparing opcodes without taking into account the "table" or namespace
that they belong to, by changing some utility functions in Sema to
require the table as well as the opcode to produce results.
Fixes #4319
Fixes #4377
Fixes #4437
The VK_KHR_buffer_device_address extension supports a Vulkan feature to load data
from an arbitrary raw buffer address. We added
'uint vk::RawBufferLoad(uint64_t address)' before, but it is limited to loading data
of the unsigned integer type.
This commit adds
'T vk::RawBufferLoad<T = uint>(uint64_t address [, uint alignment = 4] )' that
allows us to load data with an arbitrary type.
Since we check for the "SPIR-V 1.4 (under Vulkan 1.1 semantics)"
minimum target environment for ray tracing extension use, this change
allows it to be now configured as a target-env via command line options.
Some out-of-date documentation on target environments is also updated,
and version checking is improved to use enums rather
than string comparisons.
Fixes #4313
Currently, DXC assigns new locations for each stage variable. When there
are lots of stage variables, the number of locations we use can exceed the
maximum location limit. This commit uses the `Component` decoration to pack
signatures and reduce the number of locations DXC uses.
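The idea of packing can be sketched with a simple greedy first-fit model in Python, where each location has four components and a stage variable occupies one to four of them. This is an assumption-laden illustration, not DXC's actual packing algorithm: the helper `pack_locations` is hypothetical and ignores element types, interpolation modes, and other rules that constrain which variables may share a location.

```python
def pack_locations(component_counts):
    """Assign (location, first_component) to each variable, first-fit.

    component_counts: number of components (1..4) of each stage variable.
    """
    used = []         # used[loc] = components already occupied in that location
    assignments = []
    for count in component_counts:
        for loc, occupied in enumerate(used):
            if occupied + count <= 4:
                # The variable fits in the remaining components of this location.
                assignments.append((loc, occupied))
                used[loc] = occupied + count
                break
        else:
            # No existing location has room: open a new one.
            assignments.append((len(used), 0))
            used.append(count)
    return assignments


# float2, float2, float3, float, float4
print(pack_locations([2, 2, 3, 1, 4]))
```

In this model the five variables occupy three locations instead of the five that one-location-per-variable assignment would use.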
Shader Model 6.7 is still a work in progress.
Thanks to multiple contributors at Microsoft for this work.
Added support for additional advanced texture operations, including:
Raw Gather, Programmable Offsets, SampleCmpLevel, and RWTexture2DMSAA.
Added new QuadAny/QuadAll intrinsics to help with writing quad-uniform control flow.
Added [WaveOpsIncludeHelperLanes] pixel shader entry attribute, changing the
behavior of wave ops to treat helper lanes as active lanes.
For Vulkan 1.2 or less, we must not use the HelperInvocation BuiltIn
decoration. Instead, for a variable with the `[[vk::HelperInvocation]]`
attribute, we have to use the `OpIsHelperInvocation` instruction:
- Create a variable (with Input storage class)
- In the module initialization, execute `OpIsHelperInvocation`
instruction and store the result in the variable
When command-line shift values are provided for binding numbers inferred
for register types, always use the t shift value for combined image
sampler resources, and ignore the s shift value.
Fixes #4166
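A minimal Python model of the rule above, under the assumption that a binding number is inferred as the register number plus the shift configured for a register type. The function `shifted_binding`, the `shifts` dictionary, and the example shift values are all hypothetical, and register spaces are left out for brevity.

```python
def shifted_binding(register_type: str, register_number: int,
                    shifts: dict, is_combined_image_sampler: bool = False) -> int:
    """Infer a binding number from a register and per-type shift values.

    Combined image samplers always take the "t" shift; the "s" shift is
    ignored for them, as described in the change above.
    """
    shift_key = "t" if is_combined_image_sampler else register_type
    return register_number + shifts.get(shift_key, 0)


shifts = {"t": 10, "s": 20}  # assumed shift values for illustration
print(shifted_binding("t", 1, shifts))  # plain texture: t shift
print(shifted_binding("s", 1, shifts))  # plain sampler: s shift
print(shifted_binding("t", 1, shifts, is_combined_image_sampler=True))  # t shift
```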
The existing code creates a temporary variable with Output storage class
for the parameter with OutputPatch type for the patch constant
functions. However, this adds an additional output stage variable and
consumes more locations. In addition, it can also change the rendering
result depending on the driver.
This commit removes the additional output stage variable and uses the
actual output stage variables for passing the OutputPatch argument.
This works correctly because the output stage variable
keeps the output value for every invocation ID, so we can simply reuse it.
* Only support raw buffer (i.e. HLSL ByteAddressBuffer) style loads of single uint-s
* Always use 4 byte alignment for loads (same as HLSL ByteAddressBuffer)
* Add PhysicalStorageBufferAddresses capability and KHR_physical_storage_buffer extension when needed
* Promote memory addressing model to PhysicalStorageBuffer64 when needed
* Add custom alignment support to SpirvLoad
* Add createUnaryOp() overload that takes SpirvType
* Add getPhysicalStorageBufferType()
* Add documentation and a test case for the new intrinsic
The current DXC does not separate the resources from a cbuffer, which
results in an OpTypeStruct that includes resources. We have to separate the
resources from a cbuffer.
Fixes #4019
CMake supports passing in CMake scripts via the `-C` command line
option, which can set CMake cache variables to initialize options
before the first CMakeLists file is processed. This is a portable and
shell-agnostic way of supporting what the
`cmake-predefined-config-params` file is used for.
Mostly reverts the change that added an error when offsets were not
immediate or out of range for gather operations. More research suggests
this error was not required and it broke some existing shaders.
Fixes #3738
Change the mapping of SV_ShadingRate from FragSizeEXT (VK_EXT_fragment_density_map) to PrimitiveShadingRateKHR/ShadingRateKHR (VK_KHR_fragment_shading_rate).
- Move validator/dxil version checks up-front
These should fail first rather than side effects of trying to validate
details of a version we don't support.
- Improve message for unsupported validator or dxil version
These errors are most likely if compiling separately from validation
and failing to override the validator version properly, or running on
an external validator that doesn't support a newer dxil.
- Use dxil version from metadata for DxilModule when loading,
rather than just setting it to minimum based on shader model.
- Remove TODO from validator messages that shouldn't be there
Gather operations that take four separate offsets are meant to allow for
programmable, non-immediate values. This removes them from the immediate
and range validations by giving the immediate versions their own opcodes.
For these intrinsics and loads, errors are generated when offsets are non-literal
or out of range. These largely use the slightly altered validation paths that the
sample intrinsics use.
Reword the error for texture access offsets that are not immediates to be
more all-encompassing and grammatical.
Incidentally remove duplicate shaders that were being used for the
validation test from the old directory while identical copies in the
validation directory went unused. Redirect the validation test to the
appropriate copies. This is consistent with the test shader re-org's stated intent.
Sample operations have a maximum of 3 offset args, but gather has a maximum of
two since it always operates on 2D images. So gather operations only
pass in two offset args to the offset validation, which resulted in an
invalid access.
Rather than adding a specialized condition to evade this, just iterate
over the number of elements in the array. For sample it will be 3 and
for gather 2 and it will still check for expected undefined args
appropriately.
For the offset legalization pass, the opcode is used to determine the
start and end of the offset args
Only produce the loop unroll suggestion when within a loop
Base error line on call instruction instead of source of the offset
Sort by location in source when possible and remove duplicates
Adapt tests to verify and match these changes
Fixes #2590
Fixes #2713
According to the Vulkan specification, when using `OpImageRead`/`OpImageWrite`, the `OpTypeImage` (`Buffers`, `RWBuffers`, `RWTextures`) must have a format that matches the format on the API side, unless the StorageImageReadWithoutFormat/StorageImageWriteWithoutFormat capability is added and `Unknown` is used as the format.
This pull request addresses #2498 for the format part by adding an attribute `[[vk::image_format("<image format as spelled in SPIR-V spec>")]]`. Example of the syntax:
```
[[vk::image_format("rgba8")]]
RWBuffer<float4> Buf;
[[vk::image_format("rg16f")]]
RWTexture2D<float2> Tex;
RWTexture2D<float2> Tex2; // Works like before
```
The `image_format` attribute only applies to **global variables** of type `Buffer`, `RWBuffer`, `RWTexture`. For variables and function parameters it is propagated by the inlining pass in legalization. This required a small change to one of the passes in SPIRV-Tools, which should also be checked by someone more familiar with the codebase: https://github.com/KhronosGroup/SPIRV-Tools/pull/4126
Note that this does not fix the handling of an unspecified format (that case still works like before, using `R32f`, etc. based on the type in the shader), although it should still be fixed to add the StorageImageReadWithoutFormat and/or StorageImageWriteWithoutFormat capability and use `Unknown`. But I think the ability to specify the format is more urgent.
Design note from Jaebaek:
Since the `image_format` attribute only applies to **global variables**, under the DXC architecture
only `DeclResultIdMapper` can check the attribute when it handles `VarDecl`s. It means
we have to pass the `image_format` information to `LowerTypeVisitor` because it cannot access
`VarDecl`. In order to pass the `image_format`, we use `SpirvContext`, which can be accessed by
`SpirvEmitter` and all visitors. We use `SpirvVariable` to `spv::ImageFormat` mapping because the
attribute only applies to **global variables** (not to image types).
See how we use `llvm::DenseMap<const SpirvVariable *, spv::ImageFormat> spvVarToImageFormat`.
groupshared variables shouldn't be allowed outside of compute and
compute-like shaders. This adds a validation error when such
variables are used.
Add shader mask where groupshared variables are used.
Correct several tests that used groupshared incorrectly.
Incidental fix to atomics tests where the special case initializations
might have allowed the threads that failed to write to proceed
before the write took place and read the improperly initialized values.
Changed up the assignments to avoid write collisions and be more
readable.
Fixes #2603, #2677
In DX12, `SV_InstanceID` always counts from 0. On the other hand,
Vulkan `gl_InstanceIndex` starts from the first instance. Thus it
doesn't emulate actual DX12 shader behavior. To make it equivalent,
we can set `SV_InstanceID = gl_InstanceIndex - gl_BaseInstance`.
However, it can break existing working shaders (i.e., a compatibility
issue). We want to provide an option for users to choose exactly what
they want for `SV_InstanceID`. This commit adds the
`-fvk-support-nonzero-base-instance` option. If it is enabled, DXC emits
`SV_InstanceID = gl_InstanceIndex - gl_BaseInstance`. Otherwise, it emits
`SV_InstanceID = gl_InstanceIndex`.
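The two formulas can be compared with a small Python model. The function `sv_instance_id` and its boolean parameter are hypothetical stand-ins for the generated shader code and the command line option.

```python
def sv_instance_id(instance_index: int, base_instance: int,
                   support_nonzero_base_instance: bool) -> int:
    """Model the value a shader sees for SV_InstanceID."""
    if support_nonzero_base_instance:
        # Option enabled: subtract the base so counting starts from 0.
        return instance_index - base_instance
    # Option disabled: pass gl_InstanceIndex through unchanged.
    return instance_index


# A draw of 3 instances whose first instance is 5.
base = 5
indices = [base + i for i in range(3)]
print([sv_instance_id(i, base, True) for i in indices])   # [0, 1, 2]
print([sv_instance_id(i, base, False) for i in indices])  # [5, 6, 7]
```

When the first instance is 0 both formulas agree, so only draws with a nonzero base instance observe a difference.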