Adds crash tests for bugs #2077 and #2061.
Removes crash test for closed bug #1882.
Adds a bunch of tests for the "precise" attribute, whose effects I just learned about. This led to filing bugs #2078, #2079 and #2080 due to divergences in behavior from FXC.
This is a partial solution to preserving snorm and unorm for Buffer and other resource types. In the clang type system, the canonical type of Buffer<snorm float> is a RecordType for the template specialization, which is Buffer<float>, so the information is lost. However, the type of the variable itself has a sugar node of type TemplateSpecializationType which provides the exact type arguments that were provided to the template class to select its specialization. Hence we can retrieve the information if we have access to the VarDecl's type.
This is a partial solution, because the snorm/unorm is still not encoded in the canonical type of the variable, which means that it is still legal to assign resources that differ in their use of the attribute. However, it should unblock the use of the feature with backends that consume this metadata.
In a few places, llvm uses malloc instead of operator new, especially with collections that want to avoid calling constructors. In those cases, the allocation won't go through our custom allocator and will crash rather than throw on out-of-memory conditions. This fixes the biggest offenders.
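As a minimal sketch of one way to get throwing behavior for constructor-free storage (the helper names are hypothetical, not LLVM or DXC functions): raw memory obtained through ::operator new runs no constructors, throws std::bad_alloc on failure, and goes through a replaced global operator new if one is installed.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical helpers for illustration only.
// Returns raw storage for 'count' objects of T without running constructors.
// Unlike malloc, ::operator new throws std::bad_alloc when memory runs out.
// Assumes T is not over-aligned.
template <typename T>
T *AllocateUninitialized(std::size_t count) {
  assert(count <= static_cast<std::size_t>(-1) / sizeof(T) && "size overflow");
  return static_cast<T *>(::operator new(count * sizeof(T)));
}

// Releases storage obtained from AllocateUninitialized; runs no destructors.
template <typename T>
void FreeUninitialized(T *p) noexcept {
  ::operator delete(p);
}
```

The caller remains responsible for constructing and destroying objects in the returned storage, exactly as with malloc.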
We declare CHeapPtr<wchar_t> DebugBlobName;, where CHeapPtr uses CCRTAllocator by default, but initialize it with Unicode::UTF8BufferToUTF16ComHeap(pDebugName, &DebugBlobName), which allocates through a CComHeapPtr<wchar_t> p; and therefore uses the COM allocator.
If we don't hit any exceptions, things are fine because we end up doing *ppDebugBlobName = DebugBlobName.Detach();, so the destructor is a no-op. If an exception is thrown before that point, the destructor frees the COM-allocated buffer with the CRT allocator.
- New -Qembed_debug is required to embed PDB in shader container
- -Zi used without -Qembed_debug will not embed debug info anymore,
and will issue a warning from CompileWithDebug().
- When compiling with Compile() and -Zi, -Qembed_debug is assumed
for compatibility reasons (lots of breaks without it)
- In dxc and CompileWithDebug() -Fd implies -Qstrip_debug
- Debug name is based on -Fd, unless path ends with '\', meaning you
want auto-naming and file written under the specified directory
- Debug name always embedded when debug info used, or -Fd used
- -Fd without -Zi just embeds debug name for CompileWithDebug(),
still an error with dxc, since it can't write to your file.
- If not embedding debug info, it is no longer written to the container
only to be stripped out again.
- Fix padding for alignment in DebugName part.
- Default to DebugNameForBinary instead of DebugNameForSource if no
DebugInfo enabled
- Also fixed missing dependency on table gen options from libclang
DxcCreateBlobOnHeapCopy allocated using the COM allocator and freed using the default thread allocator, leading to crashes.
But more generally, some DxcCreateBlob functions would take ownership of the passed-in buffer without also having the caller explicitly specify what IMalloc to use for deallocation, which is an error-prone pattern, so I reworked these methods a bit.
This also led me to find some memory leaks and general memory mismanagement in the rewriter.
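To illustrate the ownership pattern described above, here is a minimal portable sketch in which a buffer owner is always constructed together with the allocator that must free it. Allocator, CrtAllocator and OwnedBlob are hypothetical stand-ins, not the IMalloc interface or the DxcCreateBlob functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for an IMalloc-like interface.
struct Allocator {
  virtual void *Alloc(std::size_t bytes) = 0;
  virtual void Free(void *p) = 0;
  virtual ~Allocator() = default;
};

// Allocator backed by malloc/free that counts live blocks for checking.
struct CrtAllocator final : Allocator {
  int liveBlocks = 0;
  void *Alloc(std::size_t bytes) override {
    void *p = std::malloc(bytes);
    if (p != nullptr)
      ++liveBlocks;
    return p;
  }
  void Free(void *p) override {
    if (p != nullptr)
      --liveBlocks;
    std::free(p);
  }
};

// Takes ownership of a buffer and remembers which allocator must release it,
// so the deallocation can never silently use a different allocator.
class OwnedBlob {
public:
  OwnedBlob(void *data, std::size_t size, Allocator &owner)
      : data_(data), size_(size), owner_(&owner) {
    assert(data != nullptr || size == 0);
  }
  OwnedBlob(const OwnedBlob &) = delete;
  OwnedBlob &operator=(const OwnedBlob &) = delete;
  ~OwnedBlob() { owner_->Free(data_); }

  void *data() const { return data_; }
  std::size_t size() const { return size_; }

private:
  void *data_;
  std::size_t size_;
  Allocator *owner_;
};
```

Requiring the allocator as a constructor argument makes the caller state explicitly how the buffer will be freed, which is the point of the rework.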
Clang implicitly adds an i8 to empty structs. Subsequent empty struct fields in a larger structure get packed differently by clang (all in one dword) than by the llvm DataLayout (one per dword), causing inconsistent debug metadata. With this change, we stop adding implicit i8s, such that empty structs are zero-sized and we don't have to ignore their dummy field, which fixes debug info inconsistencies.
The size used to be reported as the next member's offset minus this member's
offset, which would include cb padding in the size. This is not correct.
Now we calculate the size based on the packing rules for the type.
- basic types are:
    (((elements * rows) - 1) * row stride) + cols * element size
  where row stride is 16, or 32 if 64-bit and cols > 2.
- for struct:
    struct size = offset of last field + size of last field
    struct size += (elements - 1) * 16-byte aligned struct size
Turned on Size check in reflection tests since it should be correct now.
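The two formulas above can be sketched as follows. The function names are hypothetical, and the sketch assumes that "elements" is the array element count (1 for non-arrays), that "rows" is the number of register rows one element occupies, and that sizes are in bytes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers; the arithmetic mirrors the formulas in the text.

// Size of a scalar, vector or matrix, or an array of them.
inline std::uint32_t BasicTypeSize(std::uint32_t elements, std::uint32_t rows,
                                   std::uint32_t cols,
                                   std::uint32_t elementSize) {
  assert(elements >= 1 && rows >= 1 && cols >= 1);
  // Row stride is 16 bytes, or 32 for 64-bit components with cols > 2.
  const std::uint32_t rowStride = (elementSize == 8 && cols > 2) ? 32 : 16;
  // Every row but the last takes a full stride; the last row is unpadded.
  return ((elements * rows) - 1) * rowStride + cols * elementSize;
}

inline std::uint32_t AlignTo16(std::uint32_t n) { return (n + 15u) & ~15u; }

// Size of a struct, or an array of structs, from its last field.
inline std::uint32_t StructSize(std::uint32_t lastFieldOffset,
                                std::uint32_t lastFieldSize,
                                std::uint32_t elements) {
  assert(elements >= 1);
  const std::uint32_t single = lastFieldOffset + lastFieldSize;
  // Every element but the last is padded to a 16-byte boundary.
  return single + (elements - 1) * AlignTo16(single);
}
```

Under these assumptions a float3 is 12 bytes, a float[4] array is 52 bytes, and an array of three structs whose last field is a float at offset 16 is 84 bytes.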
The front-end has taken care of annotating the AST nodes with
the right attribute on QualTypes. We need to read that information
properly and use it in the SPIR-V backend.
* Allow clip/cull elements to be declared as array [2]
- This approach fixes validation and packing to handle this case.
- There could be implications to runtime ViewID validation
- Fix some issues found in packing related to the rowsUsed result from Pack
functions. Make these return 0 on failure, instead of startRow.
- Split PackNext into FindNext and PackNext that uses it for greater
flexibility.
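A simplified sketch of the FindNext/PackNext split and the "return 0 on failure" convention follows. RowPacker, Element and both method signatures are hypothetical and much simpler than the actual packing code: each row has four columns filled left to right, and a multi-row element must start at the same column in every row.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified model; not the DXC packing code.
struct Element {
  unsigned rows; // number of rows the element spans
  unsigned cols; // columns used in each of those rows
};

class RowPacker {
public:
  explicit RowPacker(unsigned numRows) : used_(numRows, 0u) {}

  // Search only: leaves the packer unchanged. Returns the number of rows the
  // element would use and sets foundRow, or returns 0 if it does not fit.
  unsigned FindNext(const Element &e, unsigned startRow,
                    unsigned &foundRow) const {
    if (e.rows == 0 || e.cols == 0 || e.cols > kCols)
      return 0;
    for (unsigned row = startRow; row + e.rows <= used_.size(); ++row) {
      const unsigned col = used_[row];
      bool fits = col + e.cols <= kCols;
      for (unsigned r = row + 1; fits && r < row + e.rows; ++r)
        fits = used_[r] == col; // all rows must start at the same column
      if (fits) {
        foundRow = row;
        return e.rows;
      }
    }
    return 0;
  }

  // Finds a slot with FindNext and commits it. Returns rows used, 0 on failure.
  unsigned PackNext(const Element &e, unsigned startRow) {
    unsigned row = 0;
    const unsigned rowsUsed = FindNext(e, startRow, row);
    assert(rowsUsed == 0 || row + rowsUsed <= used_.size());
    for (unsigned r = row; r < row + rowsUsed; ++r)
      used_[r] += e.cols;
    return rowsUsed;
  }

private:
  static constexpr unsigned kCols = 4;
  std::vector<unsigned> used_; // columns already used in each row
};
```

Returning 0 on failure keeps the result unambiguous, since a successful pack always uses at least one row, whereas a returned startRow could be mistaken for a row count.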
Pull request #2086 lies about the size of resources to ensure that backends can always expect 32 bits in the debug info. However, this causes warnings/errors in code containing structs of a resource type and a numerical type, because SROA uses the LLVM size rather than the clang size, and those are expected to be in sync. So the dwarf is going to say that the resource is 32 bits, but LLVM will think it is 64 bits (given our current definition of the struct).
When multiple hull shaders point to a single patch constant function, we would iterate over the hull shaders, during which we would apply a transformation on the patch constant function and erase it. This means that subsequent hull shaders would be pointing to an already-deleted patch constant function. If the pointer got reused by a new function, we got into weird situations where we would remove a newer function while trying to clean an invalidated pointer.
The solution is to bin hull shaders by their common patch constant function. We then transform the patch constant function without deleting the original, perform the replacement on every hull shader function that uses it, and only then delete the original function.
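The ordering described above can be sketched with a toy module. Function, Module and TransformPatchConstantFunctions are hypothetical stand-ins for the real IR types and pass, and the "transformation" here only creates a renamed function.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for IR functions and their owning module.
struct Function {
  std::string name;
  Function *patchConstantFn = nullptr; // set on hull shaders only
};

struct Module {
  std::vector<std::unique_ptr<Function>> functions;

  Function *Create(const std::string &name) {
    functions.push_back(std::make_unique<Function>());
    functions.back()->name = name;
    return functions.back().get();
  }

  void Erase(Function *f) {
    for (auto it = functions.begin(); it != functions.end(); ++it) {
      if (it->get() == f) {
        functions.erase(it);
        return;
      }
    }
  }
};

void TransformPatchConstantFunctions(Module &m,
                                     const std::vector<Function *> &hullShaders) {
  // 1. Bin hull shaders by the patch constant function they share.
  //    (Pointer-keyed map: iteration order is not deterministic, which is
  //    acceptable for this sketch.)
  std::map<Function *, std::vector<Function *>> bins;
  for (Function *hs : hullShaders) {
    assert(hs != nullptr && hs->patchConstantFn != nullptr);
    bins[hs->patchConstantFn].push_back(hs);
  }

  for (auto &bin : bins) {
    Function *original = bin.first;
    // 2. Transform into a new function; the original stays alive.
    Function *replacement = m.Create(original->name + ".transformed");
    // 3. Repoint every hull shader that used the original.
    for (Function *hs : bin.second)
      hs->patchConstantFn = replacement;
    // 4. Only now is it safe to delete the original.
    m.Erase(original);
  }
}
```

Because each original is erased exactly once, after all of its users have been updated, no hull shader is ever left holding a dangling pointer.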
After loading a vector from a cbuffer, we attempt to move the debug info upstream, to the individually loaded elements, assuming that they got converted into a vector by a sequence of insertelement instructions. This is not the case for bool vectors, because the insertelement sequence will be followed by a mem-to-reg conversion. Made the code more resilient to such unexpected patterns (better to lose debug info than to crash compilation).
- Hijack dwarf debug info to present matrices as having _11 to _44 fields, similar to vectors
- Better preservation of debug info for constant buffer loads
- Improve and rename resource loading tests
- Improve and reorganize vector members tests, adding matrix members test
- Hide the resource members in the debug information as they are internal implementation details.
- Force the size of resources to be reported as 32 bits (sizeof(intptr)). It's not useful for resources to have a size based on the internal fields we stick in there, it's more useful for backends to be able to expect a fixed resource size, even though what they end up generating might be bigger.
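As a small illustration of the matrix field naming in the first item above, the following sketch generates the one-based _RC member names for a matrix. MatrixFieldNames is a hypothetical helper, and the row-by-row ordering is an assumption, not necessarily the order used in the emitted debug info.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: one-based element names (_11 .. _44) for a
// rows x cols matrix, listed row by row (assumed order).
std::vector<std::string> MatrixFieldNames(unsigned rows, unsigned cols) {
  assert(rows >= 1 && rows <= 4 && cols >= 1 && cols <= 4);
  std::vector<std::string> names;
  for (unsigned r = 1; r <= rows; ++r)
    for (unsigned c = 1; c <= cols; ++c)
      names.push_back("_" + std::to_string(r) + std::to_string(c));
  return names;
}
```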
One more small step in reorganizing the CodeGenHLSL tests. Most batch-run tests are moved under CodeGenHLSL\batch such that there is only one TAEF test referencing this root folder, and we can freely move and reorganize files within there without risking breaking any code.
Note that:
- I left the quick-test directory (now empty except for a single placeholder test), in case other people have pull requests which add tests there. This is just a staging measure against build breaks.
- I didn't move the debug subfolder because I have a number of pull requests modifying its contents
- I couldn't move "Samples" because some tests reach into it to run specific files, in addition to the whole thing being batch run
No tests were changed, only files moved. CompilerTest.cpp was updated to remove the old batch-run directories in favor of a single test for batch-running the "batch" subfolder.
* [spirv] Use QualType in SpirvConstantBoolean constructor.
We switched constant instruction constructors to use QualType, but had
forgotten about SpirvConstantBoolean. This makes things consistent.
* [spirv] Only use QualType for creating SpirvConstantNull.
In the general case, bit_pieces might cover only part of a register or register range. For example, a half in a VGPR, or an SGPRx4 for which we only have debug info for the first register.
FXC considers arrays the same as structures for implicit conversions, so int[2] should be implicitly convertible to struct { int x; float y; }, for example.