Avoid inserting other insts before allocas in several passes.
- Improve code that creates allocas/non-allocas to keep allocas first
- Add helper functions to dxilutil
- Use name AllocaBuilder for CreateAlloca for clarity
GLSL as a shading language does not define the layout rules for
std140/std430; the OpenGL graphics environment defines that.
This is also more consistent with -fvk-use-dx-layout.
DxilRuntimeReflection uses RDATTableReader to obtain information about the library. InitFromRDAT call will allocate all memory required to get DXIL_LIBRARY_DESC.
* [spirv] Add -fspv-target-env command line option.
The valid values for this option currently are:
vulkan1.0
vulkan1.1
If no target environment is specified, vulkan1.0 is used as default.
Added FeatureManager to record all extensions specified from
the command-line and emit error if trying to use one not permitted.
Added command-line option -fspv-extension= to specify
whitelisted extensions.
This commit uses the HlslCounterBufferGOOGLE decoration to link
the main RW/Append/Consume StructuredBuffer with its associated
counter buffer. It also uses HLSLSemanticGOOGLE to decorate
stage IO variables with their semantic strings from the source code.
- Improve code that creates allocas/non-allocas to keep allocas first
- Add helper functions to dxilutil
- Use name AllocaBuilder for CreateAlloca for clarity
Based on GLSL std140/std430 layout rules, relaxed layout allows
using vector's element type's alignment as the vector types's
alignment, so that we can pack a float value and a float3 value
tightly. This is the default right now.
Also add an option, -fvk-use-glsl-layout, to turn off the relaxed
layout for vectors and use conventional GLSL std140/std430 layout
rules.
The fxc compiler evaluates both operands before performing the
token-pasting operation. But the default Clang way is to follow
C standard, which token-pastes operands as-is, without any
pre-expanding.
This commit adds support for the fxc behavior via the command
line option: -flegacy-macro-expansion.
Fixes https://github.com/Microsoft/DirectXShaderCompiler/issues/1005
2. Not SROA on resource.
3. avoid unpack for resource.
4. Use value type for createHandleForLib.
5. Move all scalar global resource load to the beginning of function.
This will stop other pass do optimizations on the loads of scalar global resource.
6. Not remove local resource for lib.
- Keep track of patch constant functions for later identification
- functions that require input/output signature processing identified
with IsEntryThatUsesSignatures
- update lib_rt.hlsl intrinsics and naming
Two new Vulkan specific intrinsic resource types, SubpassInput and
SubpassInputMS, are introduced to provide a way to use subpass inputs
in Vulkan. Their signatures:
template<typename T>
class SubpassInput {
T SubpassLoad();
};
template<typename T>
class SubpassInputMS {
T SubpassLoad(int sample);
};
T can be a scalar or vector type.
If compiled without ENABLE_SPIRV_CODEGEN, the compiler will report
unknown type name for SubpassInput(MS). If compiled with SPIR-V
CodeGen and trying to use SubpassInput(MS) in DXIL CodeGen, the
compiler will report an error saying the type is Vulkan-specific.
A new Vulkan-specific attribute, vk::input_attachment_index is
introduced to provide a way to set the input attachment index for
SubpassInput(MS) objects.
- fix microcom.h macros to allow for custom constructor and other issues
- fix 32/64-bit build issues in ShaderOpTest
- add missing .rc files for dxl, dxlib_sample, and dxc_batch
- Cleaning up execution test
- Generating xml test file from the script and hctdb
- Fixing hctdb categories for unsigned int
- Adding more execution tests (Fadd, FSub, FMul, FDiv, Add, Sub, SDiv)
Added a new command line option -fvk-ignore-unused-resources
to avoid emitting SPIR-V code for resources defined but not statically
referenced by the call tree of the entry point in question.
Uses the legalization pass recipe from SPIRV-Tools.
Also change -fcgl to disable both legalization and optimization.
Validation on some tests are turned off because of new image
instruction checking in SPIRV-Tools. Enabling them is a TODO.
This change removes new Load variations (LoadHalf, LoadFloat, etc) for RWByteAddressBuffer/ByteAddressBuffer methods and add templated Load intrinsics. This templated Load only works on scalar or vector types, and its variations (Load2, Load3, Load4, etc) do not work with templates and work as it did before (only storing uints). For Store operation, you can store any scalar or vector types of up to 16 bytes, while prior to 2018 Store only supported storing uint scalar.
This change is to add fixed width scalar types in HLSL 2018.
These types are: int16_t, uint16_t, int32_t, uint32_t, float16_t, float32_t, and float64_t.
Prior HLSL versions do not support these fixed width types.
This change also renames /no-min-precision command line option to /enable-16bit-types. int16_t and uint16_t are only allowed with this mode.
This change is an extension of float16 support. We are adding LoadHalf, LoadFloat, and LoadDouble method to byte address buffer so that users can access data from byte address buffer by these types. Also starting shader model 6.2, we are mapping byte address buffer and structure buffer load/store operations to RawBufferLoad/Store to differentiate raw buffer load from typed buffer load. Unlike BufferLoad for typed buffers, RawBufferLoad for min precision types will not have its min precision values as its return types, but their actual scalar size in buffer (i.e rawBufferLoad.i32 for min16int and rawBufferLoad.f32 for min16float). RawBufferLoad/Store contains additional parameters, where mask was required for correct status behavior for CheckAccessFullyMapped, and alignment is for relative alignment for future potential benefit for backend.
This change is to update correct target data layout for DXIL. Now that we have a scalar type of size less than dwords, we need to correctly print string data layout to determine non-packed structure layout.
This change also fixes alignments issues with ConstantBuffer as it was having different CG path from cbuffer.
Added the following four command line options to shift register
number for Vulkan:
* -fvk-b-shift
* -fvk-t-shift
* -fvk-s-shift
* -fvk-u-shift
These options are used to avoid assigning the same binding
number to more than one resources.
* Adds support for pause/resume in pipeline.
* Adds 0.0 as the validator version to indicate no validation should occur.
* Moves module parsing from dxcompiler down to HLSL library.
* Adds support for optimizer to process a DXIL part.
* Adds a form with designer support for the interactive optimizer.
See the comment in DxilDebugInstrumentation.cpp for a summary of how this-all works. Basically: instrument all instructions to output "trace" info to a UAV so that a debugger can reconstruct the execution of the shader and offer a debugger-like experience.
This change is to enforce the new constraint on signature packing: pack signature elements by data width. Before we introduce fp16 type, every element was assumed to reserve 32 bits. Since we are introducing a new 16 bit data type, we need a new way to enforce signature rules.
After discussions we decided that it would be nice to pack elements based on data width. However, we are still enforcing the rule that each row contains up to 4 elements, regardless of the size. This way, depending on the hardware support drivers can optimize packing signatures, while on DXIL level we maintain the assumption that there are 4 elements per row. We are also still constraining on the total number of rows to be 32 for now. This can be changed in the future if people find this limit to be an issue.
This change is to support fp16 on constant buffer.
We are still keeping a row pitch of 4 dwords. So we can fit up to 8 halfs in one row.
Dword data types will be aligned to dword address space for efficiency. We are also introducing new flags "use native low precision" if we have low precision type present and /no-min-precision option is enabled.
A new CL option -fvk-stage-io-order={alpha|decl} is added to
control the order for assigning stage I/O location numbers.
The default is also changed to declaration order (decl) instead
of alphabetical order (alpha).
Also extended testing fixtures to support additional CL options.
This change is the initial step for supporting fp16.
Support half type with /no-min-precision option. This switch can be moved to other master switch in the future as we refine our spec for half support.
min16float will be treated as half with warnings.
Fix compiler test to take additional arguments, and fix regression tests.
TODO: we need to make decisions on what triggers low precision instead of min precision. Also need to change signature packing and constant buffers for half.
1. Support empty struct as call arg.
2. Define _ATL_DECLSPEC_ALLOCATOR for older atlbase.h
3. Fix typo in SROA_Parameter_HLSL::preprocessArgUsedInCall for use AllocaBuilder when need RetBuilder.
1. Don't check hasMulticomponentUAVLoads for lib.
2. Skip mem2reg for unpromotable handle alloca for lib.
3. Support find resource attribute from function call.
4. Avoid unpack for dxil types.
5. Add resource attribute for fieldAnnotation.
* Fix ViewID state for flow control and consistent ViewID detection
1. Different ways of determining ViewID usage led to assert or crash.
Use DxilModule flag to guarantee consistency. But this required
splitting DxilEmitMetadata pass into two passes, one that finalizes
the module and computes flags run before ComputeViewIdState, and one
that simply writes the metadata.
2. storeOutput in flow control with values not dependent on that flow,
such as literals, would not pick up control flow dependence.
Fix by picking up control flow dependence on storeOutput instructions,
not just the value being written.
* Fix control depenence computation
There was a bug in the implementation of the control dependence algorithm.
We were incorrectly iterating over all descendents in the postdom tree where
the algorithm should only iterate over immediate children.
This change is to support preserve denorm operations. We do this by creating dxc options and propagate them all the way to dxil metadata so that drivers can see how it should handle floats.
Currently we support three modes: any(default), preserve, and ftz
Denorm option will be of per function flag. We store this information at function annotation metadata. This means that for DXIL 1.2 we have a new structure for function annotation. The change in structure is documented on DXIL.rst
The pass implements functionality for PIX for the "pixel cost", "depth complexity" and "overdraw" visualizers. You can probably infer what the pass does from the names "overdraw" and "depth complexity": For each pixel rendered it increments a corresponding counter in a UAV of a buffer that is the same size as the render target. The "pixel cost" pass does the same thing, only the increment is a weight value calculated from the total cost of the draw call, as derived from PIX's GPU-side profiling system.
This pass will be used in PIX for "Dr. PIX experiments", wherein a workload is re-rendered with MSAA turned off in order to get an approximate understanding of the performance impact of MSAA.
* maybe?
* Add a "mode" that inserts CB references too
* Add tests
* Xiang's helper to recreate meta-data
* Create CB properly
* Use actual defined constant
* CR feedback: incorrect "one or the other" assert. Unnecessary sethandle. Not accounting for CB id > 0. Test CB id > 0.
Supporting a custom allocator for dxcompiler.
Adds recovery for exceptions and out-of-memory handling.
Add custom allocator support to linker.
Fix for release-only test failure.
Removes assertion about presence of command-line option registration
* Add DxilD3DCompile and DxilD3DCompile2 as examle of library use.
It will compile shader into library first, then link to generate the final dxil.
If there's include used, it will compile included file into seperate lib and link all lib together.
* Add remove-discards pass
* explicit lfs
* Change pass name, remove copy-pasted member enum, reduced include set
* CR feedback: better way to delete instructions, stale comment, end iterator could go stale, CHECK-NOT is a good idea.
2. For none library profile, remove unused functions except entry and patchconstant function.
3. Fix issue caused by lost instanceCount for GS function props.
1. Save signatures for each entry.
2. Lower signatures and strip parameters for each entry.
3. Don't save function props for patch constant function.
4. Erase value from vectorEltsMap once it is used in SROA_Parameter_HLSL::replaceCastArgument and SROA_Parameter_HLSL::replaceCastParameter.
5. Remove unused member of DxilGenerationPass.
6. Fix typo when clear semantic for cloned return annotation.
1. Add Function props for entry.
2. Create DxilEntrySignature to group DxilSignature.
3. Move generation of loadinput/storeoutput into HLSignatureLower.
2. Clone shader entry to be called by other functions.
3. Skip ComputeViewIdState for library profile.
4. Change HLFunctionProps to DxilFunctionProps for DxilModule to use.
5. Don't add .flat to flattened function name.
This change is to fix the bug on fixing multi component UAV Load Flag Check by iterating over UAV resources in a DxilModule and check if a given resource is 1) either texture or typed buffer and 2) multi component resource. This change also changes shader flag collection stage after all necessary passes are complete from higher level DXIL to DXIL.
This commit adds a new pass that will transform all the storeOutput
instructions in the program. For each element in the output
signature we create an alloca to hold writes to the output.
We rewrite each storeOutput to store to the alloca'd location
instead. Then at each function return point we store to
each element in the output signature by reading the latest
value from the alloca'd location.
The above transformation provides the following properties
for outputs after it is run.
1. Remove all dynamic indexing on output writes
2. Remove multiple writes to same output location
3. Ensure all output locations in the signature are written
This commit updates the `IsPrecise` helper to also check for the fast math
flags on an instruction. Now, the function will return the correct value
for any instruction. Before it only worked for intrinsic calls.
We also added three helpers for working with fast math flags.
We provide getters and setters for precise fast math flags and
a helper to say if fast math flags are preserved through serialization
and deserialization.
Finally, we added a unit test for DxilModule that we can use to ensure
that the `IsPrecise` helper returns the correct value.
We can now expand the following intrinsics:
Acos
Asin
Atan
Hcos
Hsin
Htan
The expansion uses the same approximation algorithms used by the d3d compiler.
- bFinalStage indicates a non-PS stage is the final one and validation should be performed on direct outputs, rather than waiting for the next stage to merge outputs with inputs.
- Extra validation for PSV
- Treat tessfactors as dynamically indexed to prevent breaking them up
- Update PSV data for GS multi stream out and other improvements
- Add ViewIDPipelineValidation.inl
- Refactor DxilSigPoint to .inl for easy external use
- Add Barycentrics feature flag
* Change to DxilProgramSigSemantic::Barycentrics for runtime alignment
* Integrate os change e4aba062897e
- fix in Hashing.h for thread unsafe static
* Change to DxilProgramSigSemantic::Barycentrics for runtime alignment
* Disable ViewID Debug output and initialize ViewID feature flag
* Clean up some CMakeLists.txt lib changes
1. Generate memcpy in EmitHLSLFlatConversionAggregateCopy if type is match.
2. If value is used as Src of memcpy, mark load instead of store.
3. Do not do replace on GEP and bitcast.
4. Delete inst instead of push into DeadInsts when safe.
This will prevent same inst added into DeadInsts more than once.
* Make validator version a first-class citizen of DxilModule
* Add validator version checks to DxilContainerAssembler
- eliminates need to check validator version in dxcompilerobj.cpp
* Copy dynamic index mask and ViewID state to PSV data
- Make const version of DxilViewIDState::GetSerialized() that just returns serialized state
* Update PSV with ViewID data structures
* Update PSV with signature element data and gate PSV1 on SM 6.1
* Unwrap array on primitive when assigning semantic indexes
- fix semantic index assignment for matrix and primitives
- fix some tests
- disable two tests due to AssembleToContainer fail:
ValidationTest::SigOutOfRangeFail
ValidationTest::SimpleGs1Fail
unlike other system value semantics, pixel shader is allowed to declare at most two input attributes with SV_Barycentrics, with one declaration uses perspective interpolation type while other one uses noperspective interpolation type.
* Add documentation with source-level debugging with HLSL and DXIL.
* Fix trailing underscore in generate .rst documentation.
* Add container support for debug name part.
* Bump validator version to 1.1.
* Implement debug stripping in dxc, including /Fd dir-named behavior.
* Implement IDxcCompiler2 and CompileWithDebug.
1. Add ResourceToHandle pass to lower resource to handle.
2. Don't performPromotion in SROA_HLSL.
3. Add Option support for DynamicIndexingVectorToArray.
4. Fix bug in GetPassOption.
5. Replace GEP ptr, 0 with ptr in DynamicIndexingVectorToArray::ReplaceStaticIndexingOnVector.
6. Run mem2reg before DxilGenerationPass.
7.Legalize EvalOperations before change vector to array.
* [spirv] Add libclangSPIRV and skeleton FrontendAction for SPIR-V
* [spirv] Add -spirv into dxc for invoking EmitSPIRVAction
* [spirv] Build SPIR-V codegen conditionally
Added ENABLE_SPIRV_CODEGEN in CMake config to control the
building of SPIR-V component and wrap up SPIR-V code using it.
Also add -spirv to hctbuild to turn it on.
Make DxcCreateInstance load validator from dxil.dll when available
Add DxcVersionInfoFlags_Internal flag to indicate internal validator
Add DxcValidatorFlags_ModuleOnly to indicate absense of full container for validation, requiring explicit use. This will not succeed when using dxil.dll validator.
Bump Validator version
Implement IDxcVersionInfo in DxcCompiler to indicate DXIL highest version supported and debug flag for compiler separately from validator
Refactor version detection code for tests, adding dxil and validator versions
Make SystemValueTest version aware
Add version check to AttributeAtVertex and SV_Barycentrics tests.
Fix line endings for HlslTestUtils.h
1. Adding SV_Barycentric and removing barycentric intrinsics/dxilops
2. GetAttributeAtVertex only thakes no_interpolation attribute
3. SV_Barycentric can take any interpolation modifier except nointerpolation
4. SV_Barycentric can only have float3 type
- Add SV_ViewID loaded from intrinsic in Dxil,
for input to all graphics shader stages
- hctdb: rename shader_models to shader_stages,
add shader_model for min required shader model
- Validator: validate dxil version required for shader model
- DxilModule: Add GetDxilVersion
- Set Barycentric intrinsics to SM 6.1
- Update SystemValueTest for SM 6.1
This change is the initial step to support Barycentric coordinates for shader model 6.1 / DXIL 1.1
- Add GetBarycentrics, GetAttributeAtVertex intrinsics and their corresponding dxil ops
- Lowering intrinsics to dxil ops
- Adding codegen tests for these operations
1. Keep major for matrix pointers.
2. Change matrix values to row major to match hlsl.
Only ColMatLoad, RowMatrixToColMatrix and col matrix value parameter for entry function are col major matrix value.
And should only used by ColMatStore and ColMatrixToRowMatrix.
1. Add resource attribute to FieldAnnotation.
2. Find resource arribute from Argument in FindCreateHandleResourceBase.
3. Remove limitation handle must be instruction.
4. Add HandleToResCast to help lower resource parameter to handle parameter.
This change is to add another execution test case for partial derivative operations specific to pixel shaders.
The way we did this is by passing in the texture resource to the shader and taking the partial derivatives of these texture values.
Test is assuming the arithmetic precision of 1 ulp for derivative operations.
This change also has some changes on existing execution test
- enable passing in Texture2D resource type as a default heap
- reading primitive topology from XML file
- clean up data driven tests: use DirectX Math structures and removing unused structures
This commit adds the 'dxil' strategy for lowering extensions. This
strategy will change the extension call into a call to a dxil intrinsic.
This is useful for targeting dxil intrinsics that are not exposed in hlsl.
This change is to add execution tests for individual DXIL arithmetic operations.
Using TAEF's data-driven model to group operations by the type/number of operands for each instructions. (unary/binary operators for float/int/uint types)
Updating DXIL.rst for these instructions.
Update hctdb.py to enable internal links for DXIL instructions.
* Move unused global tracing flags into single usage, closes#157.
Normalizes API-provided names to include at least a current-relative
directory, simplifying file lookups. There are still restrictions on
how expressive include handlers enable the virtual file system to be,
these should be documented on the project wiki.
Handle some errors in BeginSourceFile paths where include handlers may
cause early errors, before compilation execution can begin.
Removes the prior scheme for directory handles, where the same handle
would be returned for any parent directory.
Future improvements would be to expand dxc paths before calling into
dxcompiler, and/or to fully implement OS API calls from console apps.
Improve hcttest to describe build config better and not override env var.
This change modifies dxil constant folding to use the opcode class when
deciding if a dxil function can be constant folded. We now require a
DxilModule to be available when constant folding dxil functions.
To ensure that the dxil module is available we add a new pass that loads a
dxil module from metadata if it does not exist. We use the new pass in the
dxopt tests for constant folding.
* Ensure the cached Function->OpCodeClass map is updated
The original goal of this change was to use opcode class for deciding when we
can perform constant folding on a function.
We maintain a mapping from Function* to OpCodeClass inside the OP class.
We wanted to use this map in constant folding to decide if we can constant
fold a function to avoid string comparison on the function names.
However, it turns out that the DxilModule is not always available during
constant folding of dxil calls so we cannot use the map inside of OP. The
change contains a few bug fixes and improvements that came out of trying
to get opcode class working inside constant folding.
1. Use opcode class in dxil constant folding where possible.
2. Make sure the opcode cache is refreshed properly.
3. Remove 64-bit test for bfi.
4. Add equality comparison for the ShaderModel class.
When switching to use the opcode class for constant folding, we discovered
that our test for 64-bit bfi is invalid. There is no 64-bit overload for
bfi in dxil, so the test we had written was not legal dxil. This change
removes the 64-bit test for bfi constant prop.
This commit adds the ability to constant fold dxil intrinsics when
all inputs are constant. We reuse the llvm constant folding
infrastructure and add special cases for calls to dxil intrinsics.
* Lower resource to createHandle at clang code gen.
1. A HL createHandle will have 1 or 2 parameters.
For uav/srv/sampler, 1 parameter. 1 is the resource load from resource ptr.
For cbuffer, 2 parameter. 1 is the same. 2 is for dynamic indexing on array of cbuffers.
uav/srv/sampler don't have 2 is because all the use of them is on builtin methods. Resource on methods is scalar.
createHandle function will have metadata to save the resource information like class/kind/type/...
2. Added 2 more passes DxilLegalizeResourceUsePass DxilLegalizeStaticResourceUsePass to remove load/store on local/static resource.
Also make sure HL createHandle don't have phi operand.
3. For DxilGenerationPass, Dxil createHandle will be generated after GenerateDxilOperations.
And HLObjectOperationLowerHelper now get RK/RC from MetadataAsValue argument of HL createHandle.
* Remove clang header generation from the build
These headers aren't used by the compiler nor by HLSL program.
* Remove references to clang-headers and redundant discard test.
* Fix a few leaks to make tests pass with AppVerifier.
Previously, If an extension uses the replication lowering strategy
and a non-vector overload was chosen we would return the function
un-modified. This change makes sure that we still use the
custom lowering name for the extension if one was specified.
This commit adds a standalone class for expanding macros and uses it
to for root signature and semantic define processing. Previously,
we were using the raw macro value instead of the expanded value
so something like:
#define MYRS "DescriptorTable(SRV(t0))"
#define RS MYRS
would fail to compile the root signature with an entry point of
RS because it would get a root signature value of MYRS instead
of the expanded value. Similarly, for semantic defines we were
saving the non-expanded value into the bitcode.
This commit also adds a new flag that is used to specify that
the root signature should be read from a #define. So compiling
with `-rootsig-define RS` will read the root signature from
a `#define RS ...` in the source. The -roosig-define flag
takes precedence over a root signature annotation in the souce.
* Dxc fix verifyrootsignature option
- distinguish stream type (stdout, stderr) when printing to console
- verifyrootsignature option returns correct errorlevel depending on its result
- adding more test cases to hcttest for verifyrootsignature option
- enable verifyrootsignature option when dxil.dll not present
* Fixes from review
- Change parameters for writing to consoles from File* to DWORD
- Change comment when dxil.dll does not exist
* Clean up dxc options:
setrootsignature, setprivate :
Allow replacing existing rootsignature or private data and make a new container with new parts
(e.g dxc /dumpbin private.dxil /setprivate private.data /Fo private.new.dxil)
extractrootsignature:
Make it back compatible with fxc by returning dxil container with RTS0 part only and having the user to provide /Fo option.
Other Options:
For unimplemented options that was from fxc, ignore those options and proceed given operation.
For unimplemented options that was not from fxc, remove them for now.
Add more test cases for dxc command line operations.
Fix ISenseOption flag to have valid HLSL version check for dxc (disable 2015)
* Fix Preprocess option for dxc
Fix hcttestcmd for testing invalid rootsignature
- Implement and centralize container validation components in DxilValidation
- Strip RootSignature from module metadata before serializing to container
- Use existing DxilModule when serializing rather than constructing new one
- Add DxilModule::TryGetDxilModule for capturing diagnostics on metadata load
- Expose DxilPartWriters/DxilContainerWriter for use elsewhere (such as in validation)
- Update command line options of dxc
(Qstrip_debug, Qstrip_priv, Qstrip_rootsignature, setrootsignature, getprivate, setprivate)
- Add IDxcContainerBuilder API and its implementation
- Resolve lifetime of dxil.dll on dxcompiler.dll to guarantee access of memory generated from dxil.dll
This commit adds support for extension intrinsics that work as
methods on resources. For example, we could have an extension
on buffers called `MyBufferOp`
Buffer<float2> buf;
float2 val = buf.MyBufferOp(int2(1, 2))
To support extension methods we add a new resource lowering strategy
that does three transformations to the intrinsic call
1. Expand vectors in place in the call arguments.
2. Convert non-void return value to dx.types.ResRet.
3. Convert resource parameter to dx.types.Handle value.
For example, assuming that MyBufferOp has opcode 138. The resource
lowering strategy would convert the call as HL-dxil to dxil as
follows
call <2 x float> MyBufferOp(i32 138, %class.Buffer %3, <2 x i32> <1 , 2> )
==>
call %dx.types.ResRet.f32 MyBufferOp(i32 138, %dx.types.Handle %buf, i32 1, i32 2 )
- Add DxilSignatureAllocator for signature packing
- Fix signature validation. Add more validation.
- Fix and add validation tests.
- Fix codegen for inout params with SV like SV_Coverage
- fix m_SemanticStartIndex on DxilSignatureElement::Initialize
- fix DxilSignatureElement::GetColsAsMask for start col == 2
- Add diags for signature allocation failures
- Use Regex in ValidationTest
This commit adds initial support for extension intrinsics and defines, including:
-Support for recognizing hlsl extensions as new intrinsic functions.
-Support for requesting lowering of extensions.
-Support for preserving semantic defines.
-Support for validating semantic defines.
This commit adds support for hlsl extensions in the form of additional
intrisic functions and semantic defines.
We now allow a dxcompiler instance to register that it can handle
additional intrinsic functions beyond the standard hlsl intrinsics. These
new intrinsics are called extensions. For each extension, the compiler
supplies a lowering strategy that is used by the dxcompiler to translate
from the source level intrinsic to a high-level dxil intrinsic.
We initially support the two lowering strategies: replicate and pack.
The replicate strategy will scalarize the vector elements in the call
and replicate the function call once for each element in the vector.
The pack strategy changes the vector arguments into literal struct
arguments.
We also now include support for "semantic defines". A semantic define is
a source level define whose value is preserved in the final dxil as
metatdata. The source level define can come as either a #define in
the hlsl source or a /D define from the command line.
We provide a hook to validate that a semantic define has a legal value
(for example to ensure that the value is an integer). Validation failures
can produce warnings and errors that will be emitted through the standard
clang diagnostic mechanism.
This code was originally written by marcelolr and modified by dmpots to
support packed lowering of intriniscs and validation of semantic defines.