* Adding test fixes and more support for SM6.5 WaveMultiPrefix functions
1) Ensure wave lane-id’s are sorted when accumulating result
2) Fix made for HLSL in ShaderOpArithTable.xml to ensure the input aligns with what is being tested
3) Added support for the “UBit” version for all of these tests.
Two test options, -Qstrip_reflect_from_dxil and -Qkeep_reflect_in_dxil
for making tests work with reflection removed, since many tests are relying
on main module disassembly-reassembly between test phases and reflection
metadata will no longer be present there. The strip option is for the
few cases where tests don't want the reflection kept in DXIL by default.
Validator no longer requires function annotations for no reason.
Fix places where remove global hook was not being called when functions
were removed manually from the list.
StripReflection now deletes function annotations, unless targeting lib or
old validator that required them. Preserve global constructor list and
add annotation for 1.4 validator. The global hook fixes were required
here, otherwise annotations would refer to dead functions during linking.
Struct annotations may not be removed in library case when they still need
translation to legacy types.
Allow missing struct annotation when not necessary to upgrade the layout.
Preserve usage in reflection by upgrading the module, emitting metadata,
cloning for reflection, then restoring validator version and re-emit
metadata.
Fix size for 16-bit type for usage and reflected size.
Make various batch reflection tests require validator 1.5, since these
tests rely on module disassembly->assembly, which will not preserve extra
usage metadata for reflection in 1.4.
Include reflection part in IDxcAssembler, but don't strip from module,
since there are no options to prevent this from breaking a lot of tests.
Don't strip reflection from offline lib target.
- Align coord dimensions with Sample for future flexibility and alignment
- Fix ddx and ddy arguments to support the correct number of dimensions
- Rewrite lowering, using SamplerHelper
- Clean up SampleHelper a bit, adding additional asserts/checks
- Set components to zero for default offset, not undef
- Compute mesh payload size before final object serialization
- During CodeGen for MS based on payload parameter
- During CollecShaderFlagsForModule for AS based on DispatchMesh call
- Store payload sizes in corresponding funtion properties, serializing
these properly for HL and Dxil Modules
- Use payload sizes from function props for PSV0 data during serialization
- Validate measured and declared payload sizes, don't just fill in
properties during validation
- Fix Wave/Quad allowed shader stages, enabling Quad* with CS-like models
- rename payloadByteSize members to payloadSizeInBytes
- Add GetMinShaderModelAndMask overload taking CallInst for additional
detail required to produce correct SM mask for Barrier operations
- Update the HLSL syntax from FeedbackTexture2DMinLod to FeedbackTexture2D<SAMPLER_FEEDBACK_MIN_MIP>
- Update DXIL to only have two UAV types for FeedbackTexture2D[Array] and use an extra metadata field to distinguish between the sampler feedback type.
The tests were manually added to ShaderOpArithTable.xml in PR #1867,
but ShaderOpArithTable.xml should be generated by hctdb_test.py script.
No actual changes to the test data, just the order of the tests is different
because of rearranging done by the script XML processing.
Added a check for shader model 6.5 so the tests will be skipped on lower
shader models.
- base of struct should always be aligned - or internal bug
- offset for array member must always be aligned - (new) validation error
- alloc and verify struct layouts even when not array field
- out of bound check would have missed OOB on last array element
Added generating of new version for each DX Compiler build.
There are 3 kinds of version:
1. **Official build**
Built by using `hctbuild -official`. The version is based on the current DXIL version, latest official release and a number of commits since then. The format is `dxil_major.dxil_minor.release_no.commit_count`. For example a current official version would be something like `1.5.1905.42`. The latest release information is read from `utils\version\latest-release.json`. The `1905` corresponds to `dxil-2019-05-16` release branch and `42` is the number of commits since that release branch was created. For master branch the `commit_count` will be incremented by 10000 to distinguish it from stabilized official release branch builds. So the current official version of master would be someting like `1.5.1905.10042`.
2. **Dev build**
Build by using `hctbuild` with no other version-related option. The format is `dxil_major.dxil_minor.0.commit_count` where commit_count is the number of total commits since the beginning of the project.
3. **Fixed version build**
Build by using `hctbuild -fv`. Enables overriding of the version information. The fixed version is read from `utils\version\version.inc`. Location of the version file can be overriden by `-fvloc` option on `hctbuild`.
In addition to the numbered version the product version string on the binaries will also include branch name and last commit sha - `"1.5.1905.10042 (master, 47e31c8a)"`. This product version string is included in `dxc -?` output.
Adds support for templatized RWRawByteBuffer.Store<T>. To avoid SROA making us lose the original layout of any struct arguments, a new pass runs before SROA and breaks down such cases into per-element stores. So better be careful with the likes of buf.Store(0, (int[65536])0);...
-Qstrip_reflect would reserialize the root signature, leading to
validation failure #2162. Fixed by moving root sig to writer to clear
from module and prevent re-serialization to metadata.
Fixed -Qstrip_debug with -Zi and no output location still embeding
debug module.
- New -Qembed_debug is required to embed PDB in shader container
- -Zi used without -Qembed_debug will not embed debug info anymore,
and will issue a warning from CompileWithDebug().
- When compiling with Compile() and -Zi, -Qembed_debug is assumed
for compatibility reasons (lots of breaks without it)
- In dxc and CompileWithDebug() -Fd implies -Qstrip_debug
- Debug name is based on -Fd, unless path ends with '\', meaning you
want auto-naming and file written under the specified directory
- Debug name always embedded when debug info used, or -Fd used
- -Fd without -Zi just embeds debug name for CompileWithDebug(),
still error with dxc, since it can't write to your file.
- If not embedding debug info, it doesn't get written to the container,
only to be stripped out again.
- Fix padding for alignment in DebugName part.
- Default to DebugNameForBinary instead of DebugNameForSource if no
DebugInfo enabled
- Also fixed missing dependency on table gen options from libclang
* Allow clip/cull elements to be declared as array [2]
- This approach fixes validation and packing to handle this case.
- There could be implications to runtime ViewID validation
- fix some issues found in packing related to rowsUsed result from Pack
functions. Make these return 0 on failure, instead of startRow.
- Split PackNext into FindNext and PackNext that uses it for greater
flexibility.
Some large shaders exhibit a behavior where phi nodes are created for resources, in which one of the possible incoming values is undef. This gets cleaned up later such that there are no undef resources left. However the fail-undef-resources pass has already failed compilation by that point. The fix is to change that pass to invalidate-undef-resources and replace the undefs with a special invalid value, such that we can produce an error later if still necessary, when optimization passes have been run such that temporary undef resources have been eliminated.
Those tests reach into HLSL, CodeGenHLSL, Samples and even quick-test directories to find the files they need. With this change, I move or copy files around to make sure that these test files are attributable to their consumer. I copied rather than moved files only in the case where a test in code explicitly reached into quick-test or Samples, because that essentially means that the file serves two different tests since they are also run as batch.
These new DXIL instructions are added to SM 6.5. The valid operations
for <Op> are:
- BitAnd
- BitOr
- BitXor
- CountBits
- Product
- Sum
In HLSL, these are exposed as:
uint4 WaveMatch(<type> val)
<type> WaveMultiPrefixBitAnd(<type> val, uint4 mask)
<type> WaveMultiPrefixBitOr(<type> val, uint4 mask)
<type> WaveMultiPrefixBitXor(<type> val, uint4 mask)
uint WaveMultiPrefixCountBits(bool val, uint4 mask)
<type> WaveMultiPrefixProduct(<type> val, uint4 mask)
<type> WaveMultiPrefixSum(<type> val, uint4 mask)
In DXIL, these are exposed as:
[BitAnd,BitOr,BitXor,Product,Sum]
%dx.types.fouri32 @dx.op.waveMatch.T(i32 %opc, T %val)
T @dx.op.waveMultiPrefixOp.T(i32 %opc, T %val, i32 %mask_x,
i32 %mask_y, i32 %mask_y, i32 %mask_z,
i8 %operation, i8 %signed)
[CountBits]
i32 @dx.op.waveMultiPrefixBitCount(i32 %opc, i1 %val, i32 %mask_x,
i32 %mask_y, i32 %mask_y,
i32 %mask_z)
Scalarization of vector types occur as per the existing wave intrinsics.
For WaveMatch, the match is performed on each scalar and the results
are combined with bitwise AND. For WaveMultiPrefix, the operation is
performed on each scalar and combined into an aggregate.
- add addrspace stress test and addrspace inst test
- add ll test for cleanup pass targeting paths that may be difficult to
hit from HLSL
- fix handling of base-class bitcast cast for structured buffer
- handle addrspacecast in SimplifyBitCast during CodeGen
The old test values and precision was the same as the 32-bit variants of the tests. This does not work for 16-bit floating point numbers and especially for trigonometric operations.
The ShaderCompatSuite and Unroll tests randomly supported being used to batch run file-check on directories. This change rather updates hcttest file-check to support directories as well as individual files.
* Add root signature flags
- Add LOCAL_ROOT_SIGNATURE flag in RootFlags
- Add DESCRIPTORS_STATIC_KEEPING_BUFFER_BOUNDS_CHECKS flag in DescriptorTable
- New DxilRootSignatureCompilationFlags argument on Compile/ParseRootSignature
to specify whether the root signature is local or global.
* Fix root signature flag issues, add cmd test for root sig from define
- Fix improper bool early promotion to int in CheckBinOpForHLSL
- Modify VerifierTest.cpp to re-enable parsing by VerifierHelper.py
- Update/fix VerifierHelper.py
- Update/fix various tests, including adding expected fxc comments
- Fix source locations for ValidateCast uses
- ValidateCast will use expression location if no loc provided
- TODO: Fix error on explicit truncation cast that should be ok, as long
as it's not an l-value "cannot truncate lvalue vector/matrix"
See binop-dims.hlsl
- TODO: Fix L-Value casts in the future
Remove bad optimization for reusing loads in LegalizeResourceUse
- also update non-uniform detection to catch when the handle is marked,
not just the load.
Validate raytracing shader properties and RDAT blob part
- no signatures for ray tracing shader functions
- payload/params/attribute sizes are >= argument type allocation sizes
- RDAT is bit-identical to RDAT generated
- Minor fix for CS: should not have input signature elements either.
- update val version comment
- Prevent strange behavior with library target and entry point not empty.
- remove DXASSERT in LoadDxilMetadata (validation should catch this case)
Disallow ignore/accept hit intrinsics in lib export functions
- still need to add general validator part to restrict intrinsics for
library functions and shaders.
Eliminate phi on resources/handles for library
- ensure all handles use only one global resource
- replace resource path with indices into global resource
- replace resource alloca with index allocas
- AddCreateHandleForPhiNodeAndSelect now unnecessary, deleted
- error on resources in library function params or return
- mark entry clones as internal linkage since they could contain params
that are not allowed for libraries.
- fix MarkUavUpdateCounter - handle nested GEP
- disallow static resource use from exported library functions
- add/update tests
- detect undef resource paths that would otherwise be lost (FailUndefResource)
- set internal linkage for HL intrinsic function bodies so they aren't processed with user functions
- Rename legalize [static] resource use passes to DxilPromote(Local|Static)Resources
- Remove various dead code paths for resources, such as:
GenerateParamDxilResourceHandles, RemoveLocalDxilResourceAllocas,
lower handle cast, non-promotable error (may translate to index allocas)
- force resource GVs to external constant with no initializer
This change disables the build targets that are not supported on Unix
platforms when compiling on those platforms.
We were also ignoring DXCompiler.cpp when compilation in the past. We
can now bring it back.
We have also added cmake-predefined-config-params which contains all the
cmake configurations needed to build on Unix platforms. It will make it
easy for developers to build.
Add shader mask and min SM to RDAT
- add OP::GetMinShaderModelAndMask to compute mask and min SM
- account for ops that can be translated during linking
- update PS ops that could be allowed in library functions
- update min SM with newer optional feature flags
Fix AV in TranslateHLSubscript for typed buffer with struct element
- add OP::GetMinShaderModelAndMask to compute mask and min SM
- account for ops that can be translated during linking
- update PS ops that could be allowed in library functions
- update min SM with newer optional feature flags
This will give users a way to know the commit from which a DXC
library/binary was built from, so that they can compare different
libraries/binaries and report bugs easily.
dxc.exe now also prints the commit info using this new interface
method. It uses the following format:
(dev;<commit-count>-<commit-hash>)
In this CL, we have introduced the "WinAdapter.h" header file.
This file includes declarations and definitions that will enable compilation of the code base on non-Windows platforms. These include:
* Disabling SAL annotations for non-Windows platforms.
* Defining some Windows-specifc macros that are not defined on other platforms.
* Defining Windows-specific types that are used throughout the codebase for non-Windows platforms.
This file is currently not complete. As we go forward, we will add more classes/structs/type definitions to this file for specific purposes. The contents of this file is completely disabled (#ifdef'd away) for Windows, and do not affect Windows compilation in any way. For the most part, this is our way of enabling compilation of the codebase on non-Windows platforms without making invasive changes to the existing cpp files, and without hindering any future development. Moreover, I have added the #include of this file for all the locations that will need it for successful Linux compilation.
Finalize OpCode changes for Dxil 1.3 / SM 6.3
- Rename CreateHandleFromResourceStructForLib to CreateHandleForLib
- Add PrimitiveIndex
- Add final NumOp[Codes|Classes]_Dxil_1_3 values
- Fix legal shader stage set for PrimitiveID
hctstart.cmd and hctbuild.cmd now support ARM64 builds.
Note that building ARM64 build on x86/x64 machine needs a location
of x86/x64 version of TableGen tools (clang-tblgen and llvm-tblgen).
Before starting ARM64 build set the BUILD_TBLGEN_PATH to point to
the TableGen binaries or use the -tblgen option on hctbuild.cmd.
Also had to rearrange control flow around the cmake --build call
in hctbuild.cmd due to a flaky batch behavior around brackets.
Primarily if not exclusively due to the massive carveouts of the
original LLVM source base as part of the HLSL adaptation, many
variables and functions are left unused. In keeping with the
practice of commenting or ifdef-ing out unused portions of this
code and marking every such exclusion as an HLSL change, this adds
few comments and moves a lot of preprocessor conditionals around to
encompass the portions left unused as a consequence of the earlier
exclusions.
Fixes 450 clang and 442 gcc warnings.
Several #included files were capitalized incorrectly in the case of
files pertaining to the project, or inconsistently in the case of
Windows headers which are typically only relevant where capitalization
doesn't matter. In spite of that, consistent capitalization can be
useful for cross platform stubbing purposes.
This corrects capitalization of a few Dxil* headers and forces
capitalization of Windows-specific headers to be consistent with
the overwhelming majority of other cases.
No functional change, just cross-platform facilitation
Add -auto-binding-space option enabling auto-binding for lib targets
- Added DxilAllocateResourcesForLib pass to allow auto-binding any unbound
resources in a DxcOptimizer pass.
- Added AutoCleanupPerThreadFileSystem for calling cleanup on scope exit,
useful for command line users.
- Use global singleton and ensure cleanup on exit if not called
- Put setup/cleanup in VerifierTest.cpp where the TestModule[Setup|Cleanup]
methods live.
- Remove GlobalPerThreadSys from FileCheckForTest, since this has to happen
at the module level instead.
- Added Setup/Cleanup to all the appropriate places
- Fixing device support for > SM 6.0
- updating dxexp
- Also fixing the following tests:
ExecutionTest::DenormBinaryFloatOpTest#FDivDenormFTZ
ExecutionTest::DenormBinaryFloatOpTest#FAddDenormAny
ExecutionTest::DenormBinaryFloatOpTest#FMulDenormAny
ExecutionTest::DenormBinaryFloatOpTest#FDivDenormAny
ExecutionTest::DenormBinaryFloatOpTest#FMulDenormFTZ
ExecutionTest::DenormBinaryFloatOpTest#FAddDenormFTZ
ExecutionTest::DenormBinaryFloatOpTest#FSubDenormFTZ
ExecutionTest::DenormBinaryFloatOpTest#FSubDenormAny
ExecutionTest::DenormTertiaryFloatOpTest#FMadDenormAny
ExecutionTest::DenormTertiaryFloatOpTest#FMadDenormFTZ
2. Not SROA on resource.
3. avoid unpack for resource.
4. Use value type for createHandleForLib.
5. Move all scalar global resource load to the beginning of function.
This will stop other pass do optimizations on the loads of scalar global resource.
6. Not remove local resource for lib.
Fixing a bunch of SM 6.2 Tests
- BinaryHalfOp tests reading input2 instead of input1
- Fixed converting signed zero from float to float16
- Fixed compiler options for float16 min,max test
- Fixed expected results for following float16 operations
- cos(0), sin(314), sin(-314)
- frc(-7.39)
- rsqrt(65504)
- int16 subtraction
- int16 unsigned multiplication
- sqrt, rsqrt, round_pi, round_ni, fmad treat denorm as it is
Two new Vulkan specific intrinsic resource types, SubpassInput and
SubpassInputMS, are introduced to provide a way to use subpass inputs
in Vulkan. Their signatures:
template<typename T>
class SubpassInput {
T SubpassLoad();
};
template<typename T>
class SubpassInputMS {
T SubpassLoad(int sample);
};
T can be a scalar or vector type.
If compiled without ENABLE_SPIRV_CODEGEN, the compiler will report
unknown type name for SubpassInput(MS). If compiled with SPIR-V
CodeGen and trying to use SubpassInput(MS) in DXIL CodeGen, the
compiler will report an error saying the type is Vulkan-specific.
A new Vulkan-specific attribute, vk::input_attachment_index is
introduced to provide a way to set the input attachment index for
SubpassInput(MS) objects.
- TAEF is unable to recognize negative values in hex representation in x64 released build. So use decimal values instead
- Add 'Shl', 'LShr', 'And', 'Or', 'Xor' execution tests
- Update Warp Version for test cases known to fail
- Cleaning up execution test
- Generating xml test file from the script and hctdb
- Fixing hctdb categories for unsigned int
- Adding more execution tests (Fadd, FSub, FMul, FDiv, Add, Sub, SDiv)