- add addrspace stress test and addrspace inst test
- add ll test for cleanup pass targeting paths that may be difficult to
hit from HLSL
- fix handling of base-class bitcast for structured buffer
- handle addrspacecast in SimplifyBitCast during CodeGen
- set address space for groupshared QualType and fix downstream effects
Fix:
- double emission of the LValue expression for `in` aggregate arguments
- `in` aggregate parameter modifying the caller's value instead of a copy
- groupshared matrix support in HLMatrixLower
- groupshared base class member access
- groupshared matrix member casting in class method
Still in need of more fixes and tests:
- incomplete array and auto dimensions from initializer
Argument handling still needs an overhaul. This fix retains the old behavior,
even where it is not quite correct, to avoid worse regressions. Objects are not
copied in, and an aggregate LValueToRValue cast expression is emitted as an
LValue instead of an RValue, because the RValue emit path lacks the HLSL
changes needed to handle certain cases, such as derived-to-base casts.
Updates the HLMatrixLower pass logic to eliminate the nondeterminism and randomly bogus codegen that came from iterating over a pointer-keyed map and modifying that map during the iteration. The new approach does a single pass over every instruction consuming or producing matrices, replacing it with its vector equivalent. Any consumed matrix is replaced by a temporary mat-to-vec translation stub, and any formerly produced matrix is emitted as a temporary vec-to-mat translation stub. Stubs get cleared as both ends of a consumer-producer dependency get lowered.
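The single-pass idea can be sketched on a toy IR. Everything below (`Inst`, `Lowered`, `lowerInOrder`, the `stub:` prefix) is hypothetical and is not the pass's real API. The sketch only illustrates walking instructions in program order, leaving a stub for an operand whose producer is not lowered yet, and clearing the stub once both ends are lowered. As a simplification, stubs are cleared in a final sweep rather than incrementally.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical toy instruction: produces one matrix value, consumes others.
struct Inst {
  std::string name;
  std::vector<std::string> uses;
};

// Hypothetical lowered form: vector-typed value with vector-typed operands.
struct Lowered {
  std::string name;
  std::vector<std::string> operands;
};

std::vector<Lowered> lowerInOrder(const std::vector<Inst>& prog) {
  // Used for lookup only, never iterated, so hashing order cannot leak
  // into the output.
  std::unordered_map<std::string, std::string> vecOf;
  std::vector<Lowered> out;
  for (const Inst& inst : prog) {  // program order, not pointer order
    Lowered l;
    l.name = inst.name + ".vec";
    for (const std::string& use : inst.uses) {
      auto it = vecOf.find(use);
      // Producer not lowered yet: leave a temporary stub.
      l.operands.push_back(it != vecOf.end() ? it->second : "stub:" + use);
    }
    vecOf[inst.name] = l.name;
    out.push_back(l);
  }
  // Clear stubs whose producer has been lowered in the meantime.
  for (Lowered& l : out)
    for (std::string& op : l.operands)
      if (op.compare(0, 5, "stub:") == 0) {
        auto it = vecOf.find(op.substr(5));
        if (it != vecOf.end()) op = it->second;
      }
  return out;
}
```

Because the output depends only on the order of `prog`, two runs over the same input produce the same result regardless of where the objects happen to be allocated.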
Lots of different code in CGHLSLMS.cpp did numerical conversions in slightly different ways, most of them forgetting to take bools into account. This change introduces a single ConvertScalarOrVector function which handles both scalars and vectors, and updates the code to use it. This implied a partial rewrite of the const init list handling code, because it lost the QualType information necessary to cast properly. That code was also not handling register/memory representations properly, so that got fixed at the same time.
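Why bools need their own case can be shown with a minimal sketch. The helpers below (`ScalarConv`, `convertScalar`, `convertVector`) are hypothetical and are not the ConvertScalarOrVector function from CGHLSLMS.cpp. They only illustrate that a conversion to bool has to compare against zero, whereas truncating an integer to one bit would turn 2 into false.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: generic numeric conversion.
template <typename To>
struct ScalarConv {
  template <typename From>
  static To apply(From v) { return static_cast<To>(v); }
};

// Conversion to bool compares against zero instead of truncating.
template <>
struct ScalarConv<bool> {
  template <typename From>
  static bool apply(From v) { return v != From(0); }
};

template <typename To, typename From>
To convertScalar(From v) { return ScalarConv<To>::apply(v); }

// The vector case reuses the scalar rule element by element.
template <typename To, typename From>
std::vector<To> convertVector(const std::vector<From>& v) {
  std::vector<To> out;
  out.reserve(v.size());
  for (From e : v) out.push_back(convertScalar<To>(e));
  return out;
}

// What a naive one-bit truncation would compute, for comparison.
inline bool truncToBit(int v) { return (v & 1) != 0; }
```

Routing every conversion through one entry point means the bool rule is written once, for scalars and vectors alike.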
llvm::Twines are dangerous. When concatenated, they store pointers to their operands to defer the actual string concatenation as long as possible. Hence every Twine being concatenated must be kept alive. FormatMessageWithoutLocation took a Twine by value, concatenated, and returned the result. This means that the concatenation pointed to the by-value argument, which gets destroyed when the function returns, leaving the returned Twine dangling.
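The lifetime hazard can be illustrated without LLVM. `LazyConcat` below is a hypothetical stand-in that, like a Twine, only remembers where its operands live; it is not llvm::Twine, and `formatMessage` is not the real FormatMessageWithoutLocation. The sketch shows one way to avoid the problem, which is to materialize the string while the operands are still alive.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a lazily concatenated string (NOT llvm::Twine).
// It stores pointers to its operands and concatenates only on demand.
class LazyConcat {
  const std::string* lhs_;
  const std::string* rhs_;

public:
  LazyConcat(const std::string& l, const std::string& r)
      : lhs_(&l), rhs_(&r) {}
  std::string str() const { return *lhs_ + *rhs_; }
};

// Unsafe shape, shown only as a comment: the returned object would point at
// the by-value parameter, which dies when the function returns.
//   LazyConcat formatUnsafe(std::string msg) { return LazyConcat(prefix, msg); }

// Safe shape: build the real string before any operand goes out of scope.
std::string formatMessage(const std::string& msg) {
  static const std::string prefix = "error: ";
  return LazyConcat(prefix, msg).str();
}
```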
The parameter SROA's LowerMemcpy function deletes the memcpy and bitcast instructions after it has processed them. However, higher up the stack we created an IRBuilder pointing to the first non-alloca instruction, which happens to be the bitcast being deleted. Hence any further use of that IRBuilder will crash.
The solution is to delay creating the IRBuilder until after LowerMemcpy, since the rest of the code is well-behaved with respect to instruction deletion (i.e. it defers deletions to the end).
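The ordering fix can be sketched with a list standing in for a basic block. The names here (`Block`, `lowerMemcpy`, `firstNonAlloca`, `lowerThenPosition`) are hypothetical and do not use the LLVM API. The point is only that the insertion point is computed after the deleting step has run, so it can never refer to a removed instruction.

```cpp
#include <cassert>
#include <list>
#include <string>

// Hypothetical stand-in for a basic block: a list of instruction names.
typedef std::list<std::string> Block;

// Stand-in for the lowering step: it erases the instructions it processed.
void lowerMemcpy(Block& bb) {
  bb.remove_if([](const std::string& s) {
    return s == "bitcast" || s == "memcpy";
  });
}

// Where a builder would be positioned: the first non-alloca instruction.
Block::iterator firstNonAlloca(Block& bb) {
  Block::iterator it = bb.begin();
  while (it != bb.end() && *it == "alloca") ++it;
  return it;
}

Block::iterator lowerThenPosition(Block& bb) {
  lowerMemcpy(bb);            // deletions happen first
  return firstNonAlloca(bb);  // only now pick the insertion point
}
```

Computing `firstNonAlloca` before `lowerMemcpy` would have returned an iterator to the `bitcast` element, which `remove_if` then erases.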
This change fixes an inconsistency between scalar and vector bools, where the former were stored as i32s in memory (as opposed to i1s in registers) while the latter remained <n x i1>s. This led to a number of crashes in more complex bool scenarios where scalar and vector element types were expected to match.
The solution is to extend Clang's strategy for scalar bools to vectors of bools. Clang relies on three key functions to deal with differences between register and memory type representations, which are special-cased for bools and now also for bool vectors. Several HLSL code additions did not properly leverage those functions and this was addressed wherever possible, in some cases removing the need for special cases for bools.
To deal with matrices, a similar concept of register/memory representation was introduced in the matrix lowering code and the lowering passes were updated accordingly.
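The register/memory split can be sketched in plain C++. The helpers `toMemory` and `fromMemory` are hypothetical names, not the Clang functions mentioned above. The sketch assumes a register element is a `bool` (standing in for i1), a memory element is a 32-bit integer (standing in for i32), and any nonzero memory value reads back as true.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Register form -> memory form: widen each bool element to 32 bits.
template <std::size_t N>
std::array<std::uint32_t, N> toMemory(const std::array<bool, N>& reg) {
  std::array<std::uint32_t, N> mem;
  for (std::size_t i = 0; i < N; ++i) mem[i] = reg[i] ? 1u : 0u;
  return mem;
}

// Memory form -> register form: narrow each 32-bit element back to a bool.
// Treating any nonzero value as true is an assumption of this sketch.
template <std::size_t N>
std::array<bool, N> fromMemory(const std::array<std::uint32_t, N>& mem) {
  std::array<bool, N> reg;
  for (std::size_t i = 0; i < N; ++i) reg[i] = mem[i] != 0u;
  return reg;
}
```

Applying the same pair of conversions to scalars and vectors keeps their element types in agreement on both sides of a load or store.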
Added front-end `-flegacy-resource-reservation` to control behavior (added automatically for SM <= 5.0)
Added an error for unbounded resources used with `-flegacy-resource-reservation`
Added a mechanism for `DxilModule` to preserve intermediate options during its passes but not in the final DXIL
Changed `HLModule` to not remove unreferenced `DxilResources` but rather make their symbol UndefValue
Fixed assumptions of `DxilResources` having a valid `GlobalVariable`
Added reserved resource range gather pass before elimination of unused resources (if SM <= 5.0)
Changed resource allocation to account for reserved ranges
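Allocation around reserved ranges can be sketched as follows. `Range` and `allocate` are hypothetical and do not reflect the actual allocator in the compiler. The sketch assumes inclusive register ranges, a size of at least 1, and no overflow of the register index.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical inclusive register range [lo, hi].
struct Range {
  std::uint32_t lo;
  std::uint32_t hi;
};

// Returns the lowest start such that [start, start + size - 1] overlaps no
// reserved range. Assumes size >= 1 and ignores index overflow.
std::uint32_t allocate(std::vector<Range> reserved, std::uint32_t size) {
  std::sort(reserved.begin(), reserved.end(),
            [](const Range& a, const Range& b) { return a.lo < b.lo; });
  std::uint32_t start = 0;
  for (const Range& r : reserved) {
    if (start + size <= r.lo) break;      // fits in the gap before r
    if (r.hi >= start) start = r.hi + 1;  // skip past the reserved range
  }
  return start;
}
```

With registers 0 to 3 and register 5 reserved, a one-register resource lands at 4, while a two-register resource has to go to 6.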
* Fix subobject raw bytes storage; don't point into RDAT for reflection
DxilSubobject:
- combine string and raw bytes storage in m_BytesStorage
- use StringRef as key, since memcmp is used with explicit length
DxilRuntimeReflection:
- add m_BytesMap for local unique copy of raw bytes in RDAT
to prevent pointing into RDAT blob for root sig
- clean up some insertion patterns
* Rename Get[SubobjectString/RawBytes] to Intern[String/RawBytes]
- re-serialize module when changed by removing either subobjects or
root signature metadata
- improve error detection/handling for loading/creating subobjects
- load subobjects from RDAT if not in DxilModule for RDAT comparison
- fix subobjects missing from desc in DxilRuntimeReflection
- default ReducibilityAnalysis to pass so it doesn't fail when no
functions are present
- report error on subobject if validator (< 1.4) will not accept
them, so they will not be captured
- container validation for required/disallowed parts for libraries
- Replace HitGroup subobject by TriangleHitGroup and ProceduralPrimitiveHitGroup
- Add internal HitGroupType enum that maps to D3D12_HIT_GROUP_TYPE
- Add corresponding HitGroupType field to metadata, RDAT and reflection
* Add root signature flags
- Add LOCAL_ROOT_SIGNATURE flag in RootFlags
- Add DESCRIPTORS_STATIC_KEEPING_BUFFER_BOUNDS_CHECKS flag in DescriptorTable
- New DxilRootSignatureCompilationFlags argument on Compile/ParseRootSignature
to specify whether the root signature is local or global.
* Fix root signature flag issues, add cmd test for root sig from define
- Use one callback for global removal - Function and GlobalValue use this
- Share callback for HLModule and DxilModule, transfer is handled by
setting callback on construction and checking pointer before clearing
- Add HLSL subobject classes to clang and enable their initialization
via initialization lists.
- Translate declared subobjects and add to DXIL module.
- Fix bugs in subobject metadata serialization and disasm.
- Preserve original input string in root signature subobject.
- Remove unused IsHlslObjectType method from CGHLSLRuntime.
- Remove unnecessary collection of global strings in CGHLSLRuntime
- Change type printer to report 'literal string' for constant char arrays.
Fix crash caused by NonUniformSet being invalidated before use
Instead of using NonUniformSet, defer lowering of
NonUniformResourceIndex until last, then mark all GEPs that use
the incoming value.
This will mark all uses of the value used in NonUniformResourceIndex
as non-uniform, including indexing that was not marked non-uniform,
if it ends up being the same index. This matches the prior behavior,
and avoids loss of non-uniform metadata when merging GEPs, but could
be considered not quite correct, although the difference in indexing
would probably not have been intentional.
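The marking rule described above can be sketched on a toy model. `Gep` and `markNonUniform` are hypothetical and do not use LLVM types. Each GEP records the id of the value it indexes with, and every GEP using a value that went through NonUniformResourceIndex gets marked, including ones the source never marked explicitly.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical toy GEP: which value indexes it, and whether it is marked.
struct Gep {
  int indexValue;
  bool nonUniform;
};

// Marks every GEP whose index is one of the values that were passed to
// NonUniformResourceIndex, regardless of which use carried the marker.
void markNonUniform(std::vector<Gep>& geps,
                    const std::vector<int>& markedValues) {
  for (Gep& g : geps)
    if (std::find(markedValues.begin(), markedValues.end(), g.indexValue) !=
        markedValues.end())
      g.nonUniform = true;
}
```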
* Fix string interning for Subobjects
* DxilRuntimeReflection: Remove dependency on llvm/ADT/STLExtras.h
- This file is meant to be included in other projects that do not include
llvm headers or libs, so it must avoid those dependencies.
* Avoid unnecessary strlen operations in GetSubobjectString
* StringStorage/RawBytesStorage: use pair of unique_ptr and size
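Interning with explicit lengths can be sketched as below. `BytesInterner` is a hypothetical class, not the storage used by DxilSubobject or DxilRuntimeReflection, and it uses a linear search where the real code uses a keyed map. It keeps each blob as a pair of `unique_ptr` and size, compares with `memcmp` over the explicit length so embedded NUL bytes are handled, and always returns a pointer to its own copy instead of the caller's buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical interner: one local, uniquely owned copy per distinct blob.
class BytesInterner {
  typedef std::pair<std::unique_ptr<char[]>, std::size_t> Entry;
  std::vector<Entry> storage_;

public:
  // Returns a pointer to the interned copy of [data, data + size).
  const char* intern(const void* data, std::size_t size) {
    for (const Entry& e : storage_)
      if (e.second == size &&
          (size == 0 || std::memcmp(e.first.get(), data, size) == 0))
        return e.first.get();  // already stored: reuse the existing copy
    std::unique_ptr<char[]> copy(new char[size + 1]);
    if (size != 0) std::memcpy(copy.get(), data, size);
    copy[size] = '\0';  // convenience terminator, not part of the blob
    storage_.emplace_back(std::move(copy), size);
    return storage_.back().first.get();
  }

  std::size_t count() const { return storage_.size(); }
};
```

The returned pointers stay valid when the vector grows, because each entry owns a separate heap allocation that moves with its `unique_ptr`.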