Fixing a bunch of SM 6.2 Tests
- BinaryHalfOp tests reading input2 instead of input1
- Fixed converting signed zero from float to float16
- Fixed compiler options for float16 min,max test
- Fixed expected results for following float16 operations
- cos(0), sin(314), sin(-314)
- frc(-7.39)
- rsqrt(65504)
- int16 subtraction
- int16 unsigned multiplication
- sqrt, rsqrt, round_pi, round_ni, fmad treat denorm as it is
Two new Vulkan specific intrinsic resource types, SubpassInput and
SubpassInputMS, are introduced to provide a way to use subpass inputs
in Vulkan. Their signatures:
template<typename T>
class SubpassInput {
T SubpassLoad();
};
template<typename T>
class SubpassInputMS {
T SubpassLoad(int sample);
};
T can be a scalar or vector type.
If compiled without ENABLE_SPIRV_CODEGEN, the compiler will report
unknown type name for SubpassInput(MS). If compiled with SPIR-V
CodeGen and trying to use SubpassInput(MS) in DXIL CodeGen, the
compiler will report an error saying the type is Vulkan-specific.
A new Vulkan-specific attribute, vk::input_attachment_index is
introduced to provide a way to set the input attachment index for
SubpassInput(MS) objects.
- TAEF is unable to recognize negative values in hex representation in x64 released build. So use decimal values instead
- Add 'Shl', 'LShr', 'And', 'Or', 'Xor' execution tests
- Update Warp Version for test cases known to fail
- Cleaning up execution test
- Generating xml test file from the script and hctdb
- Fixing hctdb categories for unsigned int
- Adding more execution tests (Fadd, FSub, FMul, FDiv, Add, Sub, SDiv)
This change removes new Load variations (LoadHalf, LoadFloat, etc) for RWByteAddressBuffer/ByteAddressBuffer methods and add templated Load intrinsics. This templated Load only works on scalar or vector types, and its variations (Load2, Load3, Load4, etc) do not work with templates and work as it did before (only storing uints). For Store operation, you can store any scalar or vector types of up to 16 bytes, while prior to 2018 Store only supported storing uint scalar.
This change is to add fixed width scalar types in HLSL 2018.
These types are: int16_t, uint16_t, int32_t, uint32_t, float16_t, float32_t, and float64_t.
Prior HLSL versions do not support these fixed width types.
This change also renames /no-min-precision command line option to /enable-16bit-types. int16_t and uint16_t are only allowed with this mode.
This change is to introduce native precision types for shader model 6.2. With native half (float16) types, we will introduce native int16/uint16 types with same command line option. This means that one option will disable all min precision types. For now, int16/uint16 types will be exposed on HLSL level as int16_t and uint16_t.
This change also includes some constant buffer fixes for 16 bit types, as well as int64 types.
This change is an extension of float16 support. We are adding LoadHalf, LoadFloat, and LoadDouble method to byte address buffer so that users can access data from byte address buffer by these types. Also starting shader model 6.2, we are mapping byte address buffer and structure buffer load/store operations to RawBufferLoad/Store to differentiate raw buffer load from typed buffer load. Unlike BufferLoad for typed buffers, RawBufferLoad for min precision types will not have its min precision values as its return types, but their actual scalar size in buffer (i.e rawBufferLoad.i32 for min16int and rawBufferLoad.f32 for min16float). RawBufferLoad/Store contains additional parameters, where mask was required for correct status behavior for CheckAccessFullyMapped, and alignment is for relative alignment for future potential benefit for backend.
* Adds support for pause/resume in pipeline.
* Adds 0.0 as the validator version to indicate no validation should occur.
* Moves module parsing from dxcompiler down to HLSL library.
* Adds support for optimizer to process a DXIL part.
* Adds a form with designer support for the interactive optimizer.
See the comment in DxilDebugInstrumentation.cpp for a summary of how this-all works. Basically: instrument all instructions to output "trace" info to a UAV so that a debugger can reconstruct the execution of the shader and offer a debugger-like experience.
* Add SV_Position index parameter, various minor tweaks to UAV setup
* Convert to raw buffer
* fix up unit tests, add a few names to variables
* CR feedback: check for pre-existing type; typo
This change is to enforce the new constraint on signature packing: pack signature elements by data width. Before we introduce fp16 type, every element was assumed to reserve 32 bits. Since we are introducing a new 16 bit data type, we need a new way to enforce signature rules.
After discussions we decided that it would be nice to pack elements based on data width. However, we are still enforcing the rule that each row contains up to 4 elements, regardless of the size. This way, depending on the hardware support drivers can optimize packing signatures, while on DXIL level we maintain the assumption that there are 4 elements per row. We are also still constraining on the total number of rows to be 32 for now. This can be changed in the future if people find this limit to be an issue.
* Fix ViewID state for flow control and consistent ViewID detection
1. Different ways of determining ViewID usage led to assert or crash.
Use DxilModule flag to guarantee consistency. But this required
splitting DxilEmitMetadata pass into two passes, one that finalizes
the module and computes flags run before ComputeViewIdState, and one
that simply writes the metadata.
2. storeOutput in flow control with values not dependent on that flow,
such as literals, would not pick up control flow dependence.
Fix by picking up control flow dependence on storeOutput instructions,
not just the value being written.
* Fix control depenence computation
There was a bug in the implementation of the control dependence algorithm.
We were incorrectly iterating over all descendents in the postdom tree where
the algorithm should only iterate over immediate children.
This change is to support preserve denorm operations. We do this by creating dxc options and propagate them all the way to dxil metadata so that drivers can see how it should handle floats.
Currently we support three modes: any(default), preserve, and ftz
Denorm option will be of per function flag. We store this information at function annotation metadata. This means that for DXIL 1.2 we have a new structure for function annotation. The change in structure is documented on DXIL.rst
The pass implements functionality for PIX for the "pixel cost", "depth complexity" and "overdraw" visualizers. You can probably infer what the pass does from the names "overdraw" and "depth complexity": For each pixel rendered it increments a corresponding counter in a UAV of a buffer that is the same size as the render target. The "pixel cost" pass does the same thing, only the increment is a weight value calculated from the total cost of the draw call, as derived from PIX's GPU-side profiling system.