This change is the initial step for supporting fp16.
Support the half type with the /no-min-precision option. This switch can be moved under another master switch in the future as we refine our spec for half support.
min16float will be treated as half with warnings.
Fix compiler test to take additional arguments, and fix regression tests.
TODO: we need to make decisions on what triggers low precision instead of min precision. Also need to change signature packing and constant buffers for half.
1. Support empty struct as call arg.
2. Define _ATL_DECLSPEC_ALLOCATOR for older atlbase.h
3. Fix typo in SROA_Parameter_HLSL::preprocessArgUsedInCall that used AllocaBuilder where RetBuilder was needed.
1. Don't check hasMulticomponentUAVLoads for lib.
2. Skip mem2reg for unpromotable handle alloca for lib.
3. Support find resource attribute from function call.
4. Avoid unpack for dxil types.
5. Add resource attribute for fieldAnnotation.
* Fix ViewID state for flow control and consistent ViewID detection
1. Different ways of determining ViewID usage led to assert or crash.
Use DxilModule flag to guarantee consistency. But this required
splitting DxilEmitMetadata pass into two passes, one that finalizes
the module and computes flags run before ComputeViewIdState, and one
that simply writes the metadata.
2. storeOutput in flow control with values not dependent on that flow,
such as literals, would not pick up control flow dependence.
Fix by picking up control flow dependence on storeOutput instructions,
not just the value being written.
* Fix control dependence computation
There was a bug in the implementation of the control dependence algorithm.
We were incorrectly iterating over all descendants in the postdom tree where
the algorithm should only iterate over immediate children.
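The fix can be illustrated with a minimal sketch. It assumes the standard post-dominance-frontier formulation of control dependence, which the message above does not name explicitly. `PostDomInfo`, `computeFrontier`, `computeControlDependence` and `makeDiamond` are hypothetical names for illustration, not the compiler's actual data structures.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical, minimal model: blocks are numbered 0..N-1.
struct PostDomInfo {
  std::vector<std::vector<int>> cfgPreds;  // CFG predecessors of each block
  std::vector<int> ipdom;                  // immediate post-dominator (-1 for the exit)
  std::vector<std::vector<int>> children;  // immediate children in the postdom tree
};

// Fills pdf[x] with the post-dominance frontier of x, visiting the tree bottom-up.
static void computeFrontier(const PostDomInfo &info, int x,
                            std::vector<std::set<int>> &pdf) {
  // Recurse first so every child's frontier is complete before it is read.
  for (int child : info.children[x])
    computeFrontier(info, child, pdf);

  // Local part: CFG predecessors that x does not immediately post-dominate.
  for (int pred : info.cfgPreds[x])
    if (info.ipdom[pred] != x)
      pdf[x].insert(pred);

  // Up part: merge the frontiers of IMMEDIATE children only. Deeper
  // descendants were already folded into their parents by the recursion.
  for (int child : info.children[x])
    for (int y : pdf[child])
      if (info.ipdom[y] != x)
        pdf[x].insert(y);
}

// Returns, for each block, the set of blocks it is control dependent on.
std::vector<std::set<int>> computeControlDependence(const PostDomInfo &info,
                                                    int exitBlock) {
  assert(info.cfgPreds.size() == info.ipdom.size());
  assert(info.children.size() == info.ipdom.size());
  std::vector<std::set<int>> pdf(info.ipdom.size());
  computeFrontier(info, exitBlock, pdf);
  return pdf;
}

// Diamond CFG: 0 -> {1, 2}, 1 -> 3, 2 -> 3; block 3 is the exit.
PostDomInfo makeDiamond() {
  PostDomInfo info;
  info.cfgPreds = {{}, {0}, {0}, {1, 2}};
  info.ipdom = {3, 3, 3, -1};
  info.children = {{}, {}, {}, {0, 1, 2}};
  return info;
}
```

In the diamond, blocks 1 and 2 come out control dependent on block 0, while blocks 0 and 3 depend on nothing.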
This change adds support for preserving denorm operations. We do this by creating dxc options and propagating them all the way to dxil metadata so that drivers can see how they should handle floats.
Currently we support three modes: any (default), preserve, and ftz.
The denorm option is a per-function flag. We store this information in the function annotation metadata. This means that for DXIL 1.2 we have a new structure for function annotation. The change in structure is documented in DXIL.rst.
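As a rough sketch of how the three modes could be parsed and carried per function: only the mode names `any`, `preserve` and `ftz` come from the text above. `DenormMode`, `parseDenormMode`, `denormModeName` and `FunctionAnnotationSketch` are hypothetical names, and the real metadata layout is the one documented in DXIL.rst, not this struct.

```cpp
#include <cassert>
#include <string>

// Hypothetical enum; the three mode names come from the commit message.
enum class DenormMode { Any, Preserve, Ftz };

// Parses the option text. Returns false and leaves `mode` untouched
// when the text is not one of the three known modes.
bool parseDenormMode(const std::string &text, DenormMode &mode) {
  if (text == "any") { mode = DenormMode::Any; return true; }
  if (text == "preserve") { mode = DenormMode::Preserve; return true; }
  if (text == "ftz") { mode = DenormMode::Ftz; return true; }
  return false;
}

// Maps a mode back to its option spelling.
const char *denormModeName(DenormMode mode) {
  switch (mode) {
  case DenormMode::Any: return "any";
  case DenormMode::Preserve: return "preserve";
  case DenormMode::Ftz: return "ftz";
  }
  assert(false && "unknown denorm mode");
  return "any";
}

// Hypothetical per-function record: the mode travels with each function
// rather than with the whole module.
struct FunctionAnnotationSketch {
  std::string functionName;
  DenormMode denorm = DenormMode::Any;  // "any" is the default mode
};
```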
The pass implements functionality for PIX for the "pixel cost", "depth complexity" and "overdraw" visualizers. You can probably infer what the pass does from the names "overdraw" and "depth complexity": For each pixel rendered it increments a corresponding counter in a UAV of a buffer that is the same size as the render target. The "pixel cost" pass does the same thing, only the increment is a weight value calculated from the total cost of the draw call, as derived from PIX's GPU-side profiling system.
This pass will be used in PIX for "Dr. PIX experiments", wherein a workload is re-rendered with MSAA turned off in order to get an approximate understanding of the performance impact of MSAA.
* maybe?
* Add a "mode" that inserts CB references too
* Add tests
* Xiang's helper to recreate meta-data
* Create CB properly
* Use actual defined constant
* CR feedback: incorrect "one or the other" assert. Unnecessary sethandle. Not accounting for CB id > 0. Test CB id > 0.
In particular, out-of-memory exceptions thrown during string construction in the context of certain move constructors (in this case, that of std::pair) would cause the runtime to terminate the process.
The problem is that a noexcept specification was added to some STL move constructors incorrectly, and when the runtime finds such a frame while unwinding, it will invoke std::terminate(). See the comments and static assertions for more details.
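The effect of the noexcept specification can be shown with a small sketch. `ThrowingMove` is a hypothetical stand-in for a type whose move constructor can fail under memory pressure; the checks below are illustrative and are not the static assertions from the actual change.

```cpp
#include <cassert>
#include <new>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a type whose move constructor can run out of memory.
struct ThrowingMove {
  ThrowingMove() {}
  // Deliberately NOT noexcept: the body can throw.
  ThrowingMove(ThrowingMove &&) { throw std::bad_alloc(); }
};

// True if the library claims that moving the pair cannot throw. A conforming
// std::pair derives this from its members, so the answer should be false here.
bool pairMoveClaimsNoexcept() {
  return std::is_nothrow_move_constructible<
      std::pair<ThrowingMove, int>>::value;
}

// Because the move constructor is not marked noexcept, the exception unwinds
// normally and the caller can recover. Had the constructor been marked
// noexcept, the runtime would call std::terminate() instead.
bool moveFailureIsRecoverable() {
  try {
    ThrowingMove source;
    ThrowingMove target(std::move(source));
    (void)target;
  } catch (const std::bad_alloc &) {
    return true;
  }
  return false;
}

// Usage: both properties must hold for out-of-memory recovery to work.
void checkMoveAssumptions() {
  assert(!pairMoveClaimsNoexcept());
  assert(moveFailureIsRecoverable());
}
```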
Fixes the Realloc implementation in the test memory allocator.
Supporting a custom allocator for dxcompiler.
Adds recovery for exceptions and out-of-memory handling.
Add custom allocator support to linker.
Fix for release-only test failure.
Removes assertion about presence of command-line option registration
* Add DxilD3DCompile and DxilD3DCompile2 as examples of library use.
It will compile the shader into a library first, then link to generate the final dxil.
If includes are used, it will compile each included file into a separate lib and link all libs together.
* Add remove-discards pass
* explicit lfs
* Change pass name, remove copy-pasted member enum, reduced include set
* CR feedback: better way to delete instructions, stale comment, end iterator could go stale, CHECK-NOT is a good idea.
2. For non-library profiles, remove unused functions except the entry and patch constant functions.
3. Fix issue caused by lost instanceCount for GS function props.
1. Save signatures for each entry.
2. Lower signatures and strip parameters for each entry.
3. Don't save function props for patch constant function.
4. Erase value from vectorEltsMap once it is used in SROA_Parameter_HLSL::replaceCastArgument and SROA_Parameter_HLSL::replaceCastParameter.
5. Remove unused member of DxilGenerationPass.
6. Fix typo when clearing the semantic for cloned return annotation.
1. Add Function props for entry.
2. Create DxilEntrySignature to group DxilSignature.
3. Move generation of loadinput/storeoutput into HLSignatureLower.
2. Clone shader entry to be called by other functions.
3. Skip ComputeViewIdState for library profile.
4. Change HLFunctionProps to DxilFunctionProps for DxilModule to use.
5. Don't add .flat to flattened function name.
This change fixes a bug in the multi-component UAV load flag check. The check now iterates over the UAV resources in a DxilModule and tests whether a given resource is 1) either a texture or a typed buffer and 2) a multi-component resource. This change also moves the shader flag collection stage so that it runs after all passes needed to lower from higher-level DXIL to DXIL are complete.
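The two-part test described above can be sketched as a simple predicate over a resource list. `ResourceKindSketch`, `UavSketch` and both function names are hypothetical, and the component count is assumed to be available on each resource; the real check works on DxilModule resources.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified resource model for illustration only.
enum class ResourceKindSketch { Texture, TypedBuffer, RawBuffer, StructuredBuffer };

struct UavSketch {
  ResourceKindSketch kind;
  unsigned componentCount;  // assumed: number of components in the element type
};

// Condition 1: texture or typed buffer. Condition 2: more than one component.
bool isMulticomponentTypedUav(const UavSketch &uav) {
  const bool typedKind = uav.kind == ResourceKindSketch::Texture ||
                         uav.kind == ResourceKindSketch::TypedBuffer;
  return typedKind && uav.componentCount > 1;
}

// The flag is needed as soon as any UAV in the module satisfies both conditions.
bool needsMulticomponentLoadFlag(const std::vector<UavSketch> &uavs) {
  for (const UavSketch &uav : uavs)
    if (isMulticomponentTypedUav(uav))
      return true;
  return false;
}

// Usage: a raw buffer never triggers the flag, whatever its width.
void checkRawBufferIgnored() {
  std::vector<UavSketch> uavs;
  uavs.push_back({ResourceKindSketch::RawBuffer, 4});
  assert(!needsMulticomponentLoadFlag(uavs));
}
```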
This commit adds a new pass that will transform all the storeOutput
instructions in the program. For each element in the output
signature we create an alloca to hold writes to the output.
We rewrite each storeOutput to store to the alloca'd location
instead. Then at each function return point we store to
each element in the output signature by reading the latest
value from the alloca'd location.
The above transformation provides the following properties
for outputs after it is run.
1. Remove all dynamic indexing on output writes
2. Remove multiple writes to same output location
3. Ensure all output locations in the signature are written
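The effect of the transformation can be modeled outside the compiler with a short sketch. It simulates one path through a function: writes land in per-element slots that stand in for the allocas, and a single store per element is emitted at the return point. `OutputWrite` and `rewriteOutputWrites` are hypothetical names, and the 0.0f initial value for never-written elements is an assumption of this sketch, not something the commit specifies.

```cpp
#include <cassert>
#include <vector>

// One write to an output signature element (hypothetical model).
struct OutputWrite {
  unsigned element;  // index into the output signature, possibly dynamic
  float value;
};

// Replays the writes seen along one path and returns the stores that would
// be emitted at the return point: exactly one per signature element.
std::vector<OutputWrite> rewriteOutputWrites(
    unsigned elementCount, const std::vector<OutputWrite> &writes) {
  // One slot per element plays the role of the alloca.
  // Assumption: never-written elements read back as 0.0f.
  std::vector<float> slots(elementCount, 0.0f);

  // Each original storeOutput becomes a store to its slot; later writes
  // to the same element overwrite earlier ones.
  for (const OutputWrite &w : writes) {
    assert(w.element < elementCount);
    slots[w.element] = w.value;
  }

  // At the return point, store every element once with a constant index.
  std::vector<OutputWrite> finalStores;
  finalStores.reserve(elementCount);
  for (unsigned i = 0; i < elementCount; ++i)
    finalStores.push_back({i, slots[i]});
  return finalStores;
}
```

For three elements and the writes (1, 2.0), (1, 3.0), (0, 5.0), the result is one store per element: element 0 gets 5.0, element 1 gets the later value 3.0, and element 2 gets the assumed initial value.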
This commit updates the `IsPrecise` helper to also check for the fast math
flags on an instruction. Now, the function will return the correct value
for any instruction. Before it only worked for intrinsic calls.
We also added three helpers for working with fast math flags.
We provide getters and setters for precise fast math flags and
a helper to say if fast math flags are preserved through serialization
and deserialization.
Finally, we added a unit test for DxilModule that we can use to ensure
that the `IsPrecise` helper returns the correct value.