[[vk::builtin("...")]] is introduced to support Vulkan-specific
builtins.
Two builtins are supported in this commit:
* gl_PointSize
* gl_HelperInvocation
Validating the usages of these two builtins is left for another
commit.
Uses the legalization pass recipe from SPIRV-Tools.
Also change -fcgl to disable both legalization and optimization.
Validation of some tests is turned off because of new image
instruction checking in SPIRV-Tools. Re-enabling it is a TODO.
-bump up dll version for fixed version
-show dxcompiler and dxil version number for dxc help message
-emit warnings for dxcompiler validator that container will not be signed
Right now we just do a load and then a store of the whole struct,
completely ignoring the storage class (and thus layout). This is
surely invalid SPIR-V, but we need this right now to enable some
workloads. Hopefully SPIRV-Tools will be able to remove these
loads/stores.
Updating associated counters for certain types is a TODO.
Previously we relied on Expr::isGLValue() to check whether an
expression is an lvalue. That required lots of ad-hoc fixes in
the CodeGen for missing/wrong lvalue/rvalue.
Now we use SpirvEvalInfo to calculate the lvalueness together
with the <result-id>. It is much cleaner.
Fixes https://github.com/Microsoft/DirectXShaderCompiler/issues/819
If no return value is specified in the source code for a
certain control flow path, just return the null value.
Previously we always used OpReturn to terminate basic blocks without
an explicit return statement in the source code, which could cause
illegal SPIR-V to be generated.
This commit adds checks for three HS/GS rules:
* HS entry point should have the patchconstantfunc attribute
* GS entry point should have the maxvertexcount attribute
* GS stream-output objects should be inout parameters
This change removes the new Load variations (LoadHalf, LoadFloat, etc.) for RWByteAddressBuffer/ByteAddressBuffer methods and adds templated Load intrinsics. The templated Load only works on scalar or vector types; its variations (Load2, Load3, Load4, etc.) do not work with templates and behave as they did before (uint only). For the Store operation, you can store any scalar or vector type of up to 16 bytes, whereas prior to 2018 Store only supported uint scalars.
[[vk::push_constant]] can be attached to a global variable of
struct type to put that variable in the PushConstant storage
class.
PushConstant should be of OpTypeStruct type with std430 layout.
This change is to add fixed width scalar types in HLSL 2018.
These types are: int16_t, uint16_t, int32_t, uint32_t, float16_t, float32_t, and float64_t.
Prior HLSL versions do not support these fixed width types.
This change also renames the /no-min-precision command line option to /enable-16bit-types. int16_t and uint16_t are only allowed in this mode.
* Translation of GroupMemoryBarrier(WithGroupSync)
* Translation of DeviceMemoryBarrier(WithGroupSync)
* Translation of AllMemoryBarrier(WithGroupSync)
This change introduces native precision types for shader model 6.2. Along with native half (float16) types, we will introduce native int16/uint16 types under the same command line option. This means that one option will disable all min precision types. For now, int16/uint16 types will be exposed at the HLSL level as int16_t and uint16_t.
This change also includes some constant buffer fixes for 16 bit types, as well as int64 types.
This commit supports .Append() and .RestartStrip() method calls
on stream-output objects, which will be translated into SPIR-V
OpEmitVertex and OpEndPrimitive, respectively.
For each .Append() call, all affected stage output variables
will be flushed.
This change is an extension of float16 support. We are adding LoadHalf, LoadFloat, and LoadDouble methods to byte address buffers so that users can access data in a byte address buffer as these types. Also, starting with shader model 6.2, we map byte address buffer and structured buffer load/store operations to RawBufferLoad/Store to differentiate raw buffer loads from typed buffer loads. Unlike BufferLoad for typed buffers, RawBufferLoad for min precision types does not use the min precision type as its return type, but the actual scalar size in the buffer (i.e., rawBufferLoad.i32 for min16int and rawBufferLoad.f32 for min16float). RawBufferLoad/Store take additional parameters: a mask, which is required for correct status behavior with CheckAccessFullyMapped, and an alignment, which gives the relative alignment for potential future benefit to the backend.
Also make parameter names clear that they are for patch constant
function (PCF), not generally patch constant, which also involves
some domain shader inputs.
As per the Vulkan spec requirement:
Any variable decorated with Position must be declared as a
four-component vector of 32-bit floating-point values.
But for HS/DS/GS, we actually have an extra arrayness. If we
generate a stand-alone Position builtin variable, it will be
an array of float4, which does not comply with the spec.
Similarly for the type requirements on ClipDistance and
CullDistance.
The spec could have a problem on this issue, but the GLSL way
is to emit a gl_PerVertex that contains Position, ClipDistance,
and CullDistance. That satisfies the current Vulkan spec.
This commit converts VS output, HS/DS input and output, GS
input to emit the gl_PerVertex struct. It also splits arrays
of structs into arrays of the fields for HS/DS/GS input/output.
ClipDistance/CullDistance is also supported in this commit,
which requires quite some non-trivial handling.
* byte offset, not element offset => multiply by sizeof(dword)
* set uav to r/w
* Re-emit meta-data becomes delete then emit
* update test
* re-compute view id state
This change corrects the target data layout for DXIL. Now that we have scalar types smaller than a dword, we need to print the data layout string correctly to determine non-packed structure layout.
This change also fixes alignment issues with ConstantBuffer, which took a different CodeGen path from cbuffer.
A new attribute [[vk::counter_binding(X)]] can be used to specify
the binding number for associated counters for RW/append/consume
structured buffers.
Also added support for .IncrementCounter() and .DecrementCounter()
for RWStructuredBuffer.
Also fixed the type error of OpAtomicI{Add|Sub}.
Added necessary functionality to handle domain shader translation.
Also added Bezier domain shader as test.
Also added Patch decoration to the hull shader patch constant outputs and
the domain shader patch constant inputs.
Added the following four command line options to shift register
number for Vulkan:
* -fvk-b-shift
* -fvk-t-shift
* -fvk-s-shift
* -fvk-u-shift
These options are used to avoid assigning the same binding
number to more than one resource.
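
As a rough illustration of why the shifts avoid collisions, the sketch
below models a binding number as the HLSL register number plus a shift
chosen per register type. The names vulkan_binding and find_collisions,
the shift values, and the purely additive model are assumptions made for
illustration; the exact option syntax and the compiler's assignment logic
are not described in this message.

```python
# Hypothetical model: binding = register number + shift for that register type.
# The shift values below are made up for illustration.
SHIFTS = {"b": 0, "t": 10, "s": 20, "u": 30}


def vulkan_binding(register, shifts):
    """Map an HLSL register such as "t0" to a binding number."""
    kind, number = register[0], int(register[1:])
    return number + shifts[kind]


def find_collisions(registers, shifts):
    """Return the binding numbers assigned to more than one register."""
    seen = {}
    for reg in registers:
        seen.setdefault(vulkan_binding(reg, shifts), []).append(reg)
    return {b: regs for b, regs in seen.items() if len(regs) > 1}


registers = ["b0", "t0", "s0", "u0"]
no_shift = {"b": 0, "t": 0, "s": 0, "u": 0}
print(find_collisions(registers, no_shift))  # every register lands on binding 0
print(find_collisions(registers, SHIFTS))    # empty: no shared bindings
```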
Structured buffer alignment change for float16. Starting in shader model 6.2, the structure layout will match that of #pragma pack(8) alignment in C++.
Remove padding for structured buffers. We will not have any padding in HLSL structures, since load/store operations already handle offsets as if padding exists in #pragma pack(8) mode in C++.
Disable legacy alignment for non-min-precision mode, as it was only needed for min precision.
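
To make the pack(8) rule concrete, here is a minimal sketch of how member
offsets would be computed for scalar members, assuming each member is
aligned to the smaller of its own size and 8 bytes. The type table and
the function name member_offsets are hypothetical, and vectors, arrays
and nested structs are not modelled.

```python
# Sketch of #pragma pack(8)-style offsets for scalar struct members.
# Sizes are in bytes; the set of type names is illustrative only.
SIZES = {"half": 2, "int16_t": 2, "float": 4, "int": 4, "double": 8}


def member_offsets(members, pack=8):
    """Return the byte offset of each member, in declaration order."""
    offsets = []
    offset = 0
    for member in members:
        size = SIZES[member]
        align = min(size, pack)
        # Round the running offset up to the member's alignment.
        offset = (offset + align - 1) // align * align
        offsets.append(offset)
        offset += size
    return offsets


print(member_offsets(["half", "float", "half", "double"]))  # [0, 4, 8, 16]
```

The struct type itself carries no explicit padding members in this model;
only the offsets used by loads and stores reflect the alignment.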
If we are returning some struct value not in the Function storage
class, we need to decompose it and write each component to the
corresponding component of a temporary variable and then return
that temporary variable.
Also fixed a crash regarding implicitly generated constructors
and destructors for structs.
* Adds support for pause/resume in pipeline.
* Adds 0.0 as the validator version to indicate no validation should occur.
* Moves module parsing from dxcompiler down to HLSL library.
* Adds support for optimizer to process a DXIL part.
* Adds a form with designer support for the interactive optimizer.
See the comment in DxilDebugInstrumentation.cpp for a summary of how this all works. Basically: instrument all instructions to output "trace" info to a UAV so that a debugger can reconstruct the execution of the shader and offer a debugger-like experience.
Fix static data member const issue. For now we give static data members internal linkage, simply because the current module passes do not remove the global variable for a static data member unless it has internal linkage (LowerStaticGlobalIntoAlloca), and we do not allow instructions that use global variables. This can be revisited in the future if we want static data members to keep external linkage, by adding an extra pass to handle them.
This allows us to convey more information about the evaluation
(constness, storage class, layout rule, etc.) than before
(only result id).
This sets the foundation for propagating relaxed precision info
later.
When seeing opaque types within structs in function parameter,
function return, and variable definition, invoke SPIRV-Tools
legalization passes.
Also refreshed external projects
* Add SV_Position index parameter, various minor tweaks to UAV setup
* Convert to raw buffer
* fix up unit tests, add a few names to variables
* CR feedback: check for pre-existing type; typo
Since this feature is planned to be supported in a future release, the test will fail on current OS versions.
This test case should be run once there is driver support for the float16 option.
This change enforces the new constraint on signature packing: pack signature elements by data width. Before the fp16 type was introduced, every element was assumed to occupy 32 bits. Since we are introducing a new 16-bit data type, we need a new way to enforce signature rules.
After discussions we decided that it would be nice to pack elements based on data width. However, we are still enforcing the rule that each row contains up to 4 elements, regardless of their size. This way, depending on hardware support, drivers can optimize signature packing, while at the DXIL level we maintain the assumption that there are 4 elements per row. We are also still constraining the total number of rows to 32 for now. This can be changed in the future if people find this limit to be an issue.
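
One possible reading of these constraints can be sketched as a first-fit packer. The sketch assumes that "elements per row" means scalar components, that a row only holds components of one data width, and that a signature element never straddles rows; none of these details are stated above, and the name pack_signature is hypothetical.

```python
MAX_ROWS = 32       # total row limit stated in the message
SLOTS_PER_ROW = 4   # up to 4 components per row, regardless of width


def pack_signature(elements):
    """First-fit packing of (name, component_count, bit_width) tuples.

    Assumption: a row only holds components of a single bit width.
    """
    rows = []
    for name, count, width in elements:
        if not 1 <= count <= SLOTS_PER_ROW:
            raise ValueError(f"{name}: component count must be 1..4")
        for row in rows:
            if row["width"] == width and row["used"] + count <= SLOTS_PER_ROW:
                row["names"].append(name)
                row["used"] += count
                break
        else:
            # No existing row fits, so open a new one.
            rows.append({"width": width, "used": count, "names": [name]})
    if len(rows) > MAX_ROWS:
        raise ValueError("signature exceeds 32 rows")
    return rows


rows = pack_signature([
    ("pos", 4, 32),
    ("uv", 2, 16),
    ("color", 2, 16),
    ("w", 1, 32),
])
for index, row in enumerate(rows):
    print(index, row["width"], row["names"])
```

Note that two 4-component 16-bit elements still take two rows in this model: the narrower width does not raise the 4-per-row limit.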
* [spirv] OpImageRead can read a vector of any size.
* [spirv] Add support for Load() of RWTexture types.
* [spirv] Texture2DMS and Texture2DMSArray types.
* [spirv] Capability for 1D images without a sampler
* [spirv] Update documentation of Texture2DMS(Array)
* [spirv] Support {Append|Consume}StructuredBuffer
* Supported the {Append|Consume}StructuredBuffer type
* Supported .Append() and .Consume()
{Append|Consume}StructuredBuffer<S> is implemented as OpTypeStruct
with an OpTypeRuntimeArray of S. And each such array will have
an associated counter, whose type is OpTypeStruct of 32-bit integer.
The index for .Append()/.Consume() is fetched by doing atomic
operations on the associated counter.
* Use subtraction of one instead of addition of negative one
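
The counter scheme can be pictured with a small single-threaded model.
This is only a sketch: the class name CounterBuffer is hypothetical, the
"atomic" helpers are ordinary Python and merely mimic an operation that
returns the value held before the update, and the exact index derivation
(old value for append, old value minus one for consume) is an assumption
consistent with the notes above.

```python
class CounterBuffer:
    """Toy model of an append/consume buffer with an associated counter."""

    def __init__(self, capacity):
        self.data = [None] * capacity  # stands in for the runtime array of S
        self.counter = 0               # stands in for the associated counter

    def _atomic_add(self, delta):
        # Mimics an atomic add that returns the value before the update.
        old = self.counter
        self.counter += delta
        return old

    def _atomic_sub(self, delta):
        # Mimics an atomic subtract that returns the value before the update.
        old = self.counter
        self.counter -= delta
        return old

    def append(self, value):
        index = self._atomic_add(1)      # old counter value is the free slot
        self.data[index] = value

    def consume(self):
        index = self._atomic_sub(1) - 1  # subtract one from the old value
        return self.data[index]


buf = CounterBuffer(capacity=4)
buf.append(10)
buf.append(20)
print(buf.counter)    # counter after two appends
print(buf.consume())  # most recently appended value
```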
For these objects, if the lhs and rhs are of different storage
class, we may need to do the assignment recursively at the
non-composite level since OpStore requires that the pointer's
type operand must be the same as the type of object.
[spirv] Refactor Buffer and Texture load functions
Also fixes 2 bugs:
1- RWBuffer must use OpImageRead rather than OpImageFetch.
2- OpImageFetch/Read take a parameter whose type is an image. We were
incorrectly passing the *variable* (which is of type pointer-to-image).
This change also allows TextureXX<type> to be used with types other than
float4. Tests were added for such cases.
* Supported (RW)StructuredBuffer types
* Supported accessing elements in (RW)StructuredBuffer
* Supported the Load() method
* Supported std430 layout
Also optimized OpAccessChain CodeGen. This is necessary right
now because otherwise we will have pointers to intermediate
composite objects like structs and arrays. Structs/arrays are
different types with and without layout decorations. To have the
correct struct/array pointer type for OpAccessChain, we need to
always pass in the correct layout rules when translating the
types, which is not easy. Optimizing intermediate OpAccessChain
away solves the problem because scalar/vector/matrix types are
unique given the same parameter.
* [spirv] Support Load function for Buffer/RWBuffer.
* [spirv] Operator[] for Buffer/RWBuffer.
Also some improvement for handling of vec2 and vec3 cases:
use OpVectorShuffle instead of OpCompositeExtract&OpCompositeConstruct.
* Address code review comments.
This change is to support fp16 on constant buffer.
We are still keeping a row pitch of 4 dwords. So we can fit up to 8 halfs in one row.
Dword data types will be aligned to dword address space for efficiency. We are also introducing a new flag, "use native low precision", set when a low precision type is present and the /no-min-precision option is enabled.
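
The row arithmetic above can be checked with a small sketch, assuming a 16-byte row (4 dwords), 2-byte halfs aligned to 2 bytes, and dword types aligned to 4 bytes. Only scalars are modelled, and the function name cbuffer_offsets and the type table are hypothetical.

```python
ROW_BYTES = 16  # row pitch of 4 dwords
SIZES = {"half": 2, "float": 4, "int": 4, "uint": 4}  # sizes in bytes


def cbuffer_offsets(members):
    """Return (row, byte_in_row) for each scalar member, in order."""
    placements = []
    offset = 0
    for member in members:
        size = SIZES[member]
        # Align each scalar to its own size (2 for half, 4 for dword types).
        offset = (offset + size - 1) // size * size
        placements.append(divmod(offset, ROW_BYTES))
        offset += size
    return placements


# Eight halfs fill row 0; the ninth starts row 1.
print(cbuffer_offsets(["half"] * 9))
# A float following a single half is pushed to the next dword boundary.
print(cbuffer_offsets(["half", "float"]))
```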