This is the work of many contributors from Microsoft and our partners.
Thank you all!
Add support for 64-bit atomics, compute shader derivatives, dynamic
resources (from heap), 8-bit Packed Operations, and Wave Size.
All of these require compiling with 6_6 targets, with just a few
exceptions. Each of these features includes unit tests and execution
tests.
64-bit atomics add 64-bit variants of all Interlocked* intrinsic
operations. This involves changing some of the code that matches
intrinsic overloads to call instructions. Also adds a few float
intrinsics for compare and exchange interlocked ops which are available
for all shader models 6.0 and up.
Compute shader derivatives add dd[x|y], CalculateLevelOfDetail, and
Sample operations that require derivatives to compute. QuadRead
operations have been allowed in compute from 6.0+ and tests are added
for them here.
Dynamic resources introduce global arrays that represent the resource
and sampler heaps that can be indexed without requiring root signature
representations. This involves a new way of creating and annotating
resource handles.
8-bit packed operations introduce a set of intrinsics to pack and
unpack 8-bit values into and out of new 32-bit unsigned types that can
be trivially converted to and from uints.
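As a rough CPU-side illustration of the packing idea described above, the following C++ sketch stores four 8-bit lanes in one 32-bit unsigned word. The names `Packed4x8`, `packU8`, and `unpackU8` are hypothetical and are not the HLSL intrinsic names. Placing the first component in the lowest byte and truncating each input to its low 8 bits are assumptions made for this sketch.

```cpp
#include <array>
#include <cstdint>

// Hypothetical stand-in for a packed 32-bit type holding four 8-bit lanes.
using Packed4x8 = std::uint32_t;

// Pack four values, keeping only the low 8 bits of each one.
// Assumption: component 0 lands in the least significant byte.
inline Packed4x8 packU8(const std::array<std::uint32_t, 4> &v) {
  Packed4x8 result = 0;
  for (unsigned i = 0; i < 4; ++i)
    result |= (v[i] & 0xFFu) << (8 * i);
  return result;
}

// Unpack the four 8-bit lanes back into zero-extended 32-bit values.
inline std::array<std::uint32_t, 4> unpackU8(Packed4x8 p) {
  std::array<std::uint32_t, 4> v{};
  for (unsigned i = 0; i < 4; ++i)
    v[i] = (p >> (8 * i)) & 0xFFu;
  return v;
}
```

Because the packed value in this model is an ordinary 32-bit unsigned integer, converting it to and from a plain uint is a no-op, which mirrors the "trivially converted" property mentioned above.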
WaveSize introduces a shader attribute that declares the wave size the
shader depends on. If the runtime has a different wave size, trying to
create a pipeline with this shader will fail.
* [spirv] Introduce the implicit 'vk' namespace
If the 'vk' namespace is used for DXIL code generation, the compiler
will report an error about the unknown "vk" namespace.
* [spirv] Introduce vk::ReadClock intrinsic.
The following Vulkan specific intrinsic functions are added:
```hlsl
uint64_t vk::ReadClock(in uint scope);
```
Also the following Vulkan-specific implicit constants are added:
```
vk::CrossDeviceScope // defined as uint 0
vk::DeviceScope // defined as uint 1
vk::WorkgroupScope // defined as uint 2
vk::SubgroupScope // defined as uint 3
vk::InvocationScope // defined as uint 4
vk::QueueFamilyScope // defined as uint 5
```
Sample usage looks as follows:
```hlsl
uint64_t clock = vk::ReadClock(vk::WorkgroupScope);
```
If any of these are used for DXIL code generation, the compiler will
report an error about the unknown "vk" namespace.
* [spirv] Add documentation.
* Address code review comments.
* Test: Validate vk namespace is not allowed for dxil.
* Fix usage of DXASSERT.
* Move ValidateVkNamespaceNotAllowed test to HlslFileCheck.
* Add -fvk-auto-shift-bindings that applies -fvk-*-shift bindings to resources that do not have explicit register assignments. Closes #2956
* For resources that don't have an explicit register binding, first categorize their registerType using new method DeclResultIdMapper::getImplicitRegisterType.
* When finding automatic binding indices for implicit register bindings, pass an optional shift to useNextBinding.
* Change getNextBindingChunk() to handle finding the next binding range given a bindingShift
* Fix regression with binding numbers causing AppVeyor failure and also fix crash I ran into in my own testing in getImplicitRegisterType() if the getSpirvInstr() did not have an AstResultType.
* Address review comments.
* Update documentation
Co-authored-by: Ehsan Nasiri <ehsannas@gmail.com>
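The search for an automatic binding with an optional shift, as described in the bullets above, can be sketched roughly as follows. This is a minimal model and not the compiler's actual code: the class name `BindingSetSketch` and its internals are hypothetical, and the only idea borrowed from the description is that the search for a free range of bindings starts at the shift value for the resource's register type.

```cpp
#include <cstdint>
#include <set>

// Hypothetical model of a per-descriptor-set binding tracker.
class BindingSetSketch {
public:
  // Mark an explicitly assigned binding as taken.
  void useBinding(std::uint32_t binding) { used_.insert(binding); }

  // Reserve `count` consecutive free bindings, searching upward from `shift`.
  // Returns the first binding number of the reserved range.
  std::uint32_t useNextBinding(std::uint32_t count, std::uint32_t shift = 0) {
    std::uint32_t start = shift; // candidate start of the free range
    std::uint32_t run = 0;       // consecutive free slots found so far
    for (std::uint32_t b = shift; run < count; ++b) {
      if (used_.count(b)) {
        start = b + 1; // range is broken, restart after the used slot
        run = 0;
      } else {
        ++run;
      }
    }
    for (std::uint32_t b = start; b < start + count; ++b)
      used_.insert(b);
    return start;
  }

private:
  std::set<std::uint32_t> used_;
};
```

With a shift of 10 and binding 10 already taken, a single resource without an explicit register would land on binding 11 in this model, and a following two-element range would start at 12.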
The validation error message for when one or more of the components of a
storeBuffer (or similar) operation are undefined was terribly confusing
to anyone not developing the compiler. It mentioned mask mismatches when
one of the masks was generated based on which operands were undefined.
This changes the error message when the flaw results from the user under-
specifying the copied variable to simply state that the assignment involves
undefined values. If the masks mismatch because the type-based mask didn't
expect values that the user somehow defined, the original error message
results as this is likely a compiler error.
In addition, this consolidates some of the mask checking for rawbuffer,
texture, and generic buffer storage.
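The decision between the two messages can be pictured with a small sketch. The enum, the function name, and the use of 4-bit component masks are assumptions made for illustration; the validator's real code and message text are not reproduced here.

```cpp
#include <cstdint>

// Hypothetical classification of a store's component masks.
enum class StoreMaskDiag { Ok, UndefinedValues, MaskMismatch };

// typeMask:    components the stored type says should be written.
// definedMask: components whose operands are actually defined (not undef).
inline StoreMaskDiag classifyStoreMask(std::uint8_t typeMask,
                                       std::uint8_t definedMask) {
  if (typeMask == definedMask)
    return StoreMaskDiag::Ok;
  // Values are defined that the type-based mask did not expect. This is
  // more likely a compiler problem, so the original mask mismatch
  // message is kept for this case.
  if ((definedMask & ~typeMask) != 0)
    return StoreMaskDiag::MaskMismatch;
  // Otherwise some expected components are undefined: the user
  // under-specified the stored value.
  return StoreMaskDiag::UndefinedValues;
}
```

For example, a four-component store (`typeMask` 0xF) where only the first two components are defined (`definedMask` 0x3) falls into the "undefined values" case in this model.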
This strips the previous method of error reporting from DxilValidation
in favor of the DxilUtil methods. With these comes a new way to find
location information for those associated with functions or globals.
Instruction reporting is slightly changed in that the asm information,
including the instruction and its block, is added as a note so as to not
clutter up the line with the error and hlsl location.
For the most part, the existing EmitError* functions were retained, but
their implementation was changed to largely format inputs for the
dxilutil:: function and call it.
Errors that had clear associations with instructions, functions, or
globals had their emitting functions changed to variants that had this
information. The result is error messages with as much location
information as is available.
In some cases, functions that implicitly added format arguments were
removed in favor of explicit formatted error calls that are clearer.
* Avoid directory duplication in DIFile MD
Change #2030 prevented the duplication of the directory for filenames
with full paths. However the same problem exists for relative paths that
include directories. This results in duplicate DIFile entries and
confusing error message outputs for Global Variables.
This skips the original code which was meant to prepend the directory
name since our filenames always have that information already.
* Validation error message text changes
This includes a number of different changes to validation error
messages. Some correct grammar. Some add a format argument to the
message for better information. Many of them just add a period to the
end of the error message for consistency and to allow the "Use /Zi"
message to be tagged on the end without jumbling the output.
* Enhancements for Dxil DiagnosticInfo and DxilUtil
Introduce *OnContext diagnostic reporting for messages that don't
clearly adhere to any instruction, global, or function. Also includes
the ability to add a note to supplement an earlier diagnostic.
Add function to DiagnosticInfo. This provides a fallback option for
printing location information when debug info isn't available. At least
we can print the function where the issue occurs.
For GV errors, we can't depend on the module having a dxil module in it.
Debug modules don't get the dxil part cloned. The only use of it was the
potentially cached debuginfofinder. Lacking it, we can just create it on
the fly.
Corrected the format of the output for DiagnosticInfoDxil::print. In
addition to minor fixups, when the diagnostic is a remark or note, it
doesn't need the "Use /Zi" prompt.
DxilLoopUnroll has its own diagnostics reporting. This adds the
associated function for the messages.
* Add function location information for interp mode error
This satisfies a specific request for line information for this
particular error. It was left out of the original batch, but it's
important enough for a follow-on.
`@dx.op.bufferLoad` is for `BufferLoad`, not `RawBufferLoad`.
Similar to the previous, the appropriate function for `RawBufferStore` is `@dx.op.rawBufferStore`.
This matches what DxilOperations.cpp does.
Add GetResourceFromHeap for hlsl.
For Dxil, add CreateHandleFromHeap and AnnotateHandle.
All handles (createHandle, createHandleForLib, createHandleFromHeap) must be annotated with AnnotateHandle before use.
TODO: add AnnotateHandle for PIX passes.
Clean up resource-related code.
New Rules:
outer type may be: [ptr to][1 dim array of]( UDT struct | scalar )
inner type (UDT struct member) may be: [N dim array of]( UDT struct | scalar )
scalar type may be: ( float(16|32|64) | int(16|32|64) )
- Disallow pointers to pointers, and pointers in structs
- Disallow multi-dim arrays at top-level, but allow within struct
Add two test options, -Qstrip_reflect_from_dxil and -Qkeep_reflect_in_dxil,
for making tests work with reflection removed, since many tests are relying
on main module disassembly-reassembly between test phases and reflection
metadata will no longer be present there. The strip option is for the
few cases where tests don't want the reflection kept in DXIL by default.
Validator no longer requires function annotations for no reason.
Fix places where remove global hook was not being called when functions
were removed manually from the list.
StripReflection now deletes function annotations, unless targeting lib or
old validator that required them. Preserve global constructor list and
add annotation for 1.4 validator. The global hook fixes were required
here, otherwise annotations would refer to dead functions during linking.
Struct annotations may not be removed in library case when they still need
translation to legacy types.
Allow missing struct annotation when not necessary to upgrade the layout.
Preserve usage in reflection by upgrading the module, emitting metadata,
cloning for reflection, then restoring the validator version and
re-emitting metadata.
Fix size for 16-bit type for usage and reflected size.
Make various batch reflection tests require validator 1.5, since these
tests rely on module disassembly->assembly, which will not preserve extra
usage metadata for reflection in 1.4.
Include reflection part in IDxcAssembler, but don't strip from module,
since there are no options to prevent this from breaking a lot of tests.
Don't strip reflection from offline lib target.
* [spirv] Add option to flatten array of resources.
Current SPIR-V code generation uses 1 binding number for an array of
resources (e.g. an array of textures). However, the DX side uses one
binding number per array element. The newly added
'-fspv-flatten-resource-arrays' changes the SPIR-V backend behavior to
use one binding number per array element, and uses spirv-opt to flatten
the array.
TODO: Add a test where the array is passed around.
TODO: Test this works with steven's PR and proper results are produced.
* [spirv] Update tests to include array of samplers.
* [spirv] Take early exit condition out of the loop.
* [spirv] Add documentation for the new cmd option.
* [spirv] Invoke CreateDescriptorScalarReplacementPass when needed.
* [spirv] address code review comments.
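The difference between the two binding schemes described above can be illustrated with a small model. The function name `assignBindings` and the simple sequential numbering are assumptions for this sketch; the actual flattening is done by spirv-opt and is not modeled here.

```cpp
#include <cstdint>
#include <vector>

// arraySizes[i] is the element count of resource i (1 for a non-array).
// Returns the first binding number given to each resource.
// flatten == false: one binding number per resource array.
// flatten == true:  one binding number per array element.
inline std::vector<std::uint32_t>
assignBindings(const std::vector<std::uint32_t> &arraySizes, bool flatten) {
  std::vector<std::uint32_t> firstBinding;
  std::uint32_t next = 0;
  for (std::uint32_t n : arraySizes) {
    firstBinding.push_back(next);
    next += flatten ? n : 1u;
  }
  return firstBinding;
}
```

For resources with 3, 1, and 2 elements, this model gives starting bindings 0, 1, 2 without flattening and 0, 3, 4 with flattening, the latter matching the one-binding-per-element convention attributed to the DX side.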
- Compute mesh payload size before final object serialization
  - During CodeGen for MS based on payload parameter
  - During CollectShaderFlagsForModule for AS based on DispatchMesh call
- Store payload sizes in corresponding function properties, serializing
  these properly for HL and Dxil Modules
- Use payload sizes from function props for PSV0 data during serialization
- Validate measured and declared payload sizes, don't just fill in
  properties during validation
- Fix Wave/Quad allowed shader stages, enabling Quad* with CS-like models
- Rename payloadByteSize members to payloadSizeInBytes
- Add GetMinShaderModelAndMask overload taking CallInst for additional
  detail required to produce correct SM mask for Barrier operations
- Update the HLSL syntax from FeedbackTexture2DMinLod to FeedbackTexture2D<SAMPLER_FEEDBACK_MIN_MIP>
- Update DXIL to only have two UAV types for FeedbackTexture2D[Array] and use an extra metadata field to distinguish between the sampler feedback type.
- base of struct should always be aligned - or internal bug
- offset for array member must always be aligned - (new) validation error
- alloc and verify struct layouts even when not array field
- out of bound check would have missed OOB on last array element
* [spirv] Respect the -auto-binding-space option in SPIR-V backend.
* [spirv] Use default space if vk::binding doesn't provide dset.
* Address code review comments.
* Allow clip/cull elements to be declared as array [2]
- This approach fixes validation and packing to handle this case.
- There could be implications to runtime ViewID validation
- fix some issues found in packing related to rowsUsed result from Pack
functions. Make these return 0 on failure, instead of startRow.
- Split PackNext into FindNext and PackNext that uses it for greater
flexibility.
* Add support to convert DXR HLSL to SPV_NV_ray_tracing
* Fix multiple typos and clean up using clang-format.
* Update tests to verify transpose and custom matrix type
Update tests to add multiple entry functions of same shader stage
* Replace ExecutionModel class with ShaderModel::Kind
- This change removes ExecutionModel class and relies on ShaderModel::Kind to track current entry point shader stage
- Also instead of declaring it in SpirvEmitter, DeclResultIdMapper & GlPerVertex, we declare it only once in common object SpirvContext
* Don't create a stageVar for raytracing interface variables.
* Don't perform 'new' memory allocation for FunctionInfo object
This change also -
- removes invalid "SpirvEmitter::" from function declarations in SpirvEmitter class.
- fix build errors by adding a default constructor in FunctionInfo struct to allow functionInfoMap to allocate an empty object for no search results.
* Fix some more typos and formatting errors.
* Update RST with raytracing stage info
* In SpirvContext.h, replace unsigned by uint32_t
* Test add ascii art and fixup grammar mistakes.
* Use placement new to allocate FunctionInfo objects.
Also bundle the insertion into functionInfoMap and workQueue together.
* Remove outdated comment.
* Update RST with intrinsic mapping and typo fixes.
* Some more wording fixes to RST for raytracing.
* Fix table in RST accidentally broken by a missing '-'
* Add tick marks for supported stages in RST
* Final clang-format formatting fixes.
* Add missing labels to flowchart and spacing.
These new DXIL instructions are added to SM 6.5. The valid operations
for <Op> are:
- BitAnd
- BitOr
- BitXor
- CountBits
- Product
- Sum
In HLSL, these are exposed as:
uint4 WaveMatch(<type> val)
<type> WaveMultiPrefixBitAnd(<type> val, uint4 mask)
<type> WaveMultiPrefixBitOr(<type> val, uint4 mask)
<type> WaveMultiPrefixBitXor(<type> val, uint4 mask)
uint WaveMultiPrefixCountBits(bool val, uint4 mask)
<type> WaveMultiPrefixProduct(<type> val, uint4 mask)
<type> WaveMultiPrefixSum(<type> val, uint4 mask)
In DXIL, these are exposed as:
%dx.types.fouri32 @dx.op.waveMatch.T(i32 %opc, T %val)
[BitAnd,BitOr,BitXor,Product,Sum]
T @dx.op.waveMultiPrefixOp.T(i32 %opc, T %val, i32 %mask_x,
    i32 %mask_y, i32 %mask_z, i32 %mask_w,
    i8 %operation, i8 %signed)
[CountBits]
i32 @dx.op.waveMultiPrefixBitCount(i32 %opc, i1 %val, i32 %mask_x,
    i32 %mask_y, i32 %mask_z,
    i32 %mask_w)
Scalarization of vector types occurs as per the existing wave intrinsics.
For WaveMatch, the match is performed on each scalar and the results
are combined with bitwise AND. For WaveMultiPrefix, the operation is
performed on each scalar and combined into an aggregate.
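To make the scalar semantics more concrete, here is a small CPU-side reference model. It is only a sketch under stated assumptions: all lanes are active, the wave has at most 128 lanes, and the prefix is exclusive (each lane combines the values of lower-indexed lanes in its mask, as with the existing wave prefix intrinsics). The C++ names `Mask`, `waveMatch`, and `waveMultiPrefixSum` are hypothetical stand-ins and not compiler APIs.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Models a uint4 lane mask: 4 x 32 bits, one bit per lane.
using Mask = std::array<std::uint32_t, 4>;

inline bool testLane(const Mask &m, std::size_t lane) {
  return ((m[lane / 32] >> (lane % 32)) & 1u) != 0;
}

// For each lane, build the mask of all lanes holding the same value.
inline std::vector<Mask> waveMatch(const std::vector<int> &vals) {
  std::vector<Mask> out(vals.size(), Mask{});
  for (std::size_t i = 0; i < vals.size(); ++i)
    for (std::size_t j = 0; j < vals.size(); ++j)
      if (vals[i] == vals[j])
        out[i][j / 32] |= 1u << (j % 32);
  return out;
}

// Sum of vals over lanes that are in `mask` and have a lower index than
// `lane` (exclusive prefix; an assumption of this sketch).
inline int waveMultiPrefixSum(const std::vector<int> &vals, std::size_t lane,
                              const Mask &mask) {
  int sum = 0;
  for (std::size_t j = 0; j < lane; ++j)
    if (testLane(mask, j))
      sum += vals[j];
  return sum;
}
```

A typical pattern in this model is to call `waveMatch` on a key to partition the lanes, then pass each lane's resulting mask to `waveMultiPrefixSum` so that the prefix sum only accumulates within lanes sharing that key.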