Commit Graph

1226 Commits

Author SHA1 Message Date
Greg Roth 2ec5dd9c62
Handle more complicated dx.break circumstances (#2825)
This includes a number of different changes to allow the dx.break
artificial branch conditional to function properly in different
situations that might arise as a result of optimization passes.

First of all, the dx.break usage is eliminated entirely for any function
found to not have need of it during code gen finishing. The dependencies
are not all clear at this stage, so it only determines if the function
makes use of any wave operations.

During the original dx.break elimination pass, the code allows for any
kind of instruction. This requires a bit more information about the user
of dx.break since the immediate user might not be a branch. So first the
dx.break users are traversed the same way that the wave users are. Then
the wave users are used to cull the list of dx.break users. Then the
remaining dx.break users, which are not wave sensitive, have their branch
conditionals removed by traversing the users until the branch is found.

An incidental optimization in the finding of wave sensitive functions is
used which requires depth first traversal. Since certain checks are
performed on a per-block basis and the user chain can be expected to
stay in the same block at least for a bit, by using depth first search,
we can skip when the new block matches the one just processed.
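
The depth-first shortcut described above can be sketched on a toy model. This is an illustration only, not the compiler's code: `Inst`, `countBlockChecks`, and the integer block IDs are hypothetical stand-ins for LLVM instructions and basic blocks.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an instruction: the block it lives in and its users.
struct Inst {
  int block;               // ID of the containing basic block
  std::vector<int> users;  // indices of instructions that use this one
};

// Walks the user chain depth first from `root` and returns how many times the
// (simulated) per-block check had to run.
int countBlockChecks(const std::vector<Inst> &insts, int root) {
  assert(root >= 0 && static_cast<std::size_t>(root) < insts.size());
  std::vector<bool> visited(insts.size(), false);
  std::vector<int> stack{root};
  int lastBlock = -1;  // block handled by the most recent check
  int checks = 0;
  while (!stack.empty()) {
    int cur = stack.back();
    stack.pop_back();
    if (visited[cur])
      continue;
    visited[cur] = true;
    // Skip the per-block work when this user is in the block just processed.
    if (insts[cur].block != lastBlock) {
      ++checks;
      lastBlock = insts[cur].block;
    }
    for (int u : insts[cur].users)
      stack.push_back(u);
  }
  return checks;
}
```

Because a depth-first walk tends to follow one chain through a block before moving on, consecutive visits often share a block and the check is skipped.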

There is a bit of consolidation of logic here. Instead of repeatedly
checking for the pattern that indicates a dx.break modified branch, it
is done once to collect all such blocks and thereafter, its presence
there is taken to mean that it qualifies.

A test is added to verify that this solves the original motivation for
this change, a circumstance where merging of conditionals resulted in
the dx.break call instruction not being immediately used by the branch.

Add additional early exit for waveless compiles

If the shader has no wave ops at all, there's no reason to create the
function at all.
2020-04-29 09:22:07 -07:00
Greg Roth a0196bcc22
Use Attribute to designate wave-sensitive intrinsics (#2853)
* Use Attribute to designate wave-sensitive intrinsics

This adds an intrinsic attribute to indicate wave-sensitivity that can
be indicated in gen_intrin_main.txt. This and other attributes are
passed along through function representations and lowerings. The
wave-sensitivity needs to be maintained specifically through SROA passes
since it is used by the CleanupDxbreak pass that comes after them.

Specifically this is done to allow extension intrinsics to indicate
wave-sensitivity, but the same mechanism is now used for builtin
intrinsics.

Most intrinsics get a mostly nameless function to represent them between
Codegen and dxilgen. This allows any that have the same prototype to
share the same function. For wave-sensitive intrinsics, they need a
different function or else the attribute would be similarly shared with
intrinsics matching the prototype. So a minor change is made to their
function names to prevent this.

Adds testing for all these ops and a dummy extension one.
2020-04-28 18:19:51 -07:00
Xiang Li 49876c21b4
Not copy GEP arg when the ptr is alloca/arg/const global. (#2855)
* Not copy GEP arg when the ptr is alloca/arg/const global.
2020-04-28 14:40:27 -07:00
Adam Yang 5c64108bcc
Optimized bitcode loading. Added function to only materialize named MD. (#2854) 2020-04-28 14:04:19 -07:00
Greg Roth 9840380ef8
Correct DomTree usage in memcpy lowering (#2839)
* Correct DomTree usage in memcpy lowering

By copying the code meant for memcpy used on global variables, I forced
recalculation of the DominatorTree for each and every memcpy. This
was pretty slow and entirely unnecessary. The global variables are
invoked from SROA_HLSL_Parameters, which is a module pass that doesn't
really know which function it's dealing with until it addresses the
actual memcpy. SROA_HLSL already depends on the domination tree analysis
pass which allows the pass to draw from that for the dominator tree.

In addition, copying from the original code made use of a Post Dominator
Tree. This was a mistake. What we really want to know is if the source
dominates the uses of the destination. In practice, rarely would
anything be found to post dominate unless they shared a block. In the
contrived case where a memcpy post dominated its destination, the way
dxilgen dealt with cbuffers made it irrelevant because they were all
replaced with the result of its initialization in the entry.

Finally, as Xiang suggested, this now uses the source to determine
domination.
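
The question "does the source dominate the uses of the destination" can be sketched at block granularity. This is a toy model under stated assumptions, not the DominatorTree code the pass uses: `computeDominators` and `dominatesAllUses` are hypothetical names, the CFG is a list of predecessor lists with block 0 as the entry, and ordering inside a block is ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns dom[b] as a bitmask of the blocks that dominate block b.
// Simple iterative data-flow solution; supports at most 32 blocks.
std::vector<std::uint32_t>
computeDominators(const std::vector<std::vector<int>> &preds) {
  const std::size_t n = preds.size();
  assert(n > 0 && n <= 32);
  const std::uint32_t all = (n == 32) ? 0xFFFFFFFFu : ((1u << n) - 1u);
  std::vector<std::uint32_t> dom(n, all);
  dom[0] = 1u;  // the entry block is dominated only by itself
  bool changed = true;
  while (changed) {
    changed = false;
    for (std::size_t b = 1; b < n; ++b) {
      std::uint32_t meet = all;
      for (int p : preds[b])
        meet &= dom[p];  // a dominator must dominate every predecessor
      const std::uint32_t next = meet | (1u << b);
      if (next != dom[b]) {
        dom[b] = next;
        changed = true;
      }
    }
  }
  return dom;
}

// True when `srcBlock` dominates every block in `useBlocks`.
bool dominatesAllUses(const std::vector<std::uint32_t> &dom, int srcBlock,
                      const std::vector<int> &useBlocks) {
  for (int u : useBlocks)
    if ((dom[u] & (1u << srcBlock)) == 0)
      return false;
  return true;
}
```

In a diamond-shaped CFG, the entry dominates the join block but neither arm does, so a source defined in one arm cannot safely replace a use at the join.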

The result of these is faster compiles and faster code produced. No
incorrect behavior should have resulted from the previous code. So this
doesn't correct any.

Incidental removal of unused method in SROA_HLSL

* Avoid more problem memcpy replaces

The previous location for memcpy replace was too narrow. It only handled one
way that memcpy replacement might be triggered. What's more, it was done
regardless of other checks, potentially quicker to perform, that might
prevent the replacement. By moving the check into ReplaceMemcpy, it is
checked every time it needs to be and never when it doesn't.

Additionally, ReplaceMemcpy was called on memcpys that were created as
part of a memcpy split. This isn't a problem if they get replaced by
loads and stores, but if all uses of the dest are replaced by the src,
the memcpy is essentially removed. sub-memcpy params are only used
directly by their memcpy. So this effectively removes the sub memcpy
entirely. By detecting when the only users of the memcpy parameters are
that memcpy and refusing to replace it entirely, we evade this however
it arises.

This eliminates the earlier fix that disabled subelement memcpy when the
domination check failed. The effect is largely the same in many cases,
but the check is now made more directly.
2020-04-28 05:45:16 -07:00
Minmin Gong 002ed9737e
Fix the "unused function" warning caused by UuidStrHash (#2848) 2020-04-25 12:26:28 -05:00
Xiang Li 49310e2b2c
Skip copy-in/copy-out for constant global variables. (#2836)
* Skip copy-in/copy-out for constant global variables.

* Enable copy for noinline.
TODO: analysis for global variable alias with parameter.

* Use SetVector and skip resource when copy.

* Disable mayAliasWithGlobal because the optimization already covers the case of not replacing when there is an alias.
When a const global replaced with a normal value has a store, mark it non-constant.
2020-04-23 19:08:03 -07:00
Adam Yang 1053ca45ab
Merge debug info for global variables lowered into local allocas. (#2846) 2020-04-23 12:32:15 -07:00
Tex Riddell d3d9b19ec8
Fix WriteSamplerFeedback and Gather as gradient (or not) (#2845)
* Gather should not be considered a gradient operation.
* Mark WriteSamplerFeedback[Bias] as gradient
* Remove SampleCmpLevelZero from DxilConvergent
* Add WriteSamplerFeedback[Bias] to DxilConvergent
2020-04-22 18:37:37 -07:00
Jeff Noyle ed3d0ee0bf
Missed a commit from previous PR: Check for not-an-int (#2844)
Protect against null for subroutine params (dwarf type deref)
Fix pixtest build
2020-04-22 14:40:08 -07:00
Jeff Noyle 625c98fba3
PIX: Annotate structs for shader debugging of MS->AS payloads (#2826)
Three classes of fixes:

First a couple of trivial fixes in the DxcPix storage class, for >1-d arrays and embedded types.

Second, the addition of a new fragment iterator (now renamed "member iterator") that knows how to traverse the full structure of a struct, enabling the DxcPix* code to know bit offset and size for contained types.

Third, fixes to the numbering pass to know how to find alloca offsets via the various GetElementPtr statements that result when addressing struct members.

Lastly, a whole bunch of unit tests. Cuz this was really hard.

Remaining to do, as noted in the tests: the presence of a pointer-to-pointer type like floatNxM results in dbg.declare instructions not being emitted for structs. These tests (and perhaps some of the code) will need to be revisited if/when this is fixed.
2020-04-22 10:00:21 -07:00
Minmin Gong 7dc56492a0
Fix a couple warnings in code (#2746)
1. A warning about unused "group" variable
2. A CMake warning about unmatched if and endif
2020-04-20 00:54:34 -07:00
Jeff Noyle 35769dabe1
Pix source-to-instruction offsets (#2835)
This is a direct implementation of a new API for PIX that shortcuts the large and unwieldy (and not-really-working-well) DIA equivalent to find the set of instructions that correspond to a given source location.
2020-04-17 17:21:42 -07:00
Xiang Li 8b92463c32
Add -precise-output (#2827) 2020-04-13 22:24:09 -07:00
Greg Roth 7d665725a8
Add simplify to DxilValueCache for dx.break() (#2803)
* Add simplify to DxilValueCache for dx.break()

To allow loop unrolling, call instructions that reference dx.break()
are given a constant true boolean simplification in DxilValueCache.

This required making the constant string available by moving it into
Analysis.

As an incidental change, I corrected the spelling of simplify in a
couple cases.
2020-04-10 00:40:14 -07:00
Tex Riddell 6ef33dcec6
Fix assert/corrupt PDB when NumDirectoryBlocks > 128 (#2818) 2020-04-08 13:21:44 -07:00
Tex Riddell d58a0cd227
Handle phi and select when collecting cbuffer usage (#2807) 2020-04-03 17:24:59 -07:00
Tex Riddell ef73f1da8a
Disable optimization in instcombine that produces trunc to odd bit size (#2805) 2020-04-02 17:09:14 -07:00
Greg Roth 9ecca8cfbb
Don't assume branch user of dx.break (#2806)
The previous lowering of dx.break() assumed that the user was still a
branch as it was when it was set up. However, optimizations can switch
things around so this assumption is wrong.

By allowing for other instructions, the replacement takes place just the
same regardless of whether it's a branch.

Various code simplifications too.
2020-04-02 17:03:31 -07:00
Greg Roth 9007354c61
Account for non-one user of dx.break call (#2802)
* Account for non-one user of dx.break call

The original code expected that optimizations wouldn't remove any of the
users of a call to dx.break(). In loop unrolling cases, simplifyCFG may
do just that, which caused the original assert-and-assume-we're-right
method to break.

Now we assert we're in the confines of what is now known to be possible,
but we don't assume anything. There's a loop over the users of the call
instruction that I only ever expect to be executed once or never, but
it's there just in case.
2020-03-31 18:15:52 -07:00
Greg Roth d3af7f1237
Conditionalize breaks to keep them in loops (#2795)
* Conditionalize breaks to keep them in loops

This introduces dx.break, a temporary builtin that is applied as a
condition to any unconditional break in order to keep the basic block
inside the loop. Because it remains in the loop, operations that depend
on wave operations inside the loop will be able to get the right values.

Such builtins have to be added at finishcodegen time or else clang
throws an error for an undefined function. Consequently, the creation of
these is split in two. First the branch is created with just a
constant true conditional. Then at finishcodegen, it is converted to the
result of the dx.break() builtin.

By using the result of a temporary builtin function, the optimization
passes don't touch the false conditional like they might if we started
with the global constant. 

Normal break blocks don't need this conditional, but we don't know that
at code generation. So a later pass identifies break blocks with wave
sensitive operations that depend on wave ops that are inside the loop they
are breaking out of and preserves those conditionals while removing all
the rest.

As part of dxil finalization, the dx.break() function is removed and all
branches that depended on it are made to depend on function-local
loads and compares of a global variable dx.break.cond.

The DxBreak Fixup pass depends on dxil-mem2reg. It is placed immediately
after to allow as many optimizations to go as they would without this
change in shaders that don't have any wave ops.
2020-03-30 19:02:04 -07:00
Greg Roth a3ebb138ac
Prevent repeated memcpy splitting (#2789)
When splitting a memcpy because of failed dependency checks, any memcpy
created to copy part of the original is going to have the same
characteristics that caused the first to be split. Instead of splitting
repeatedly, just use the alternative code generation built into splitCpy.
2020-03-24 15:43:23 -07:00
Tex Riddell bd88e666ad
Prevent FoldCmpLoadFromIndexedGlobal from introducing i64 use (#2787)
In future, we need to be more intentional for whether i64 use is desired, and produce i64 pattern if so.
2020-03-24 11:45:36 -07:00
Greg Roth 78b58851fd
Print Lines or "add -Zi" on error messages (#2782)
* Print Lines or "add -Zi" on error messages

This change adds to the existing EmitErrorOnInstruction functionality in
dxilutil to include EmitErrorOnFunction and EmitErrorOnGlobalVariable.
Each derives line number information from the metadata of these objects
if available. If it isn't available, the error or warning messages are
still printed, but an additional suggestion to add a -Zi flag is added.
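
The fallback behavior can be illustrated with a small formatter. This is a sketch only: `SourceLine` and `formatDiagnostic` are hypothetical, and the message wording is an assumption rather than the text dxilutil actually prints.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the debug location attached to an instruction,
// function, or global variable.
struct SourceLine {
  bool known;     // false when the module carries no debug info
  unsigned line;  // meaningful only when `known` is true
};

// Builds the diagnostic text. The exact wording here is an assumption.
std::string formatDiagnostic(const std::string &message, SourceLine loc) {
  assert(!message.empty());
  if (loc.known)
    return "line " + std::to_string(loc.line) + ": " + message;
  // No line information: still report, and suggest how to get it.
  return message + " (use -Zi for source location)";
}
```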
2020-03-23 22:04:01 -07:00
Tex Riddell cbb49f0ea4
Fix assert with empty llvm.global_ctors targeting valver 1.4 (#2785) 2020-03-23 19:50:07 -07:00
Tex Riddell 9729946be7
Fix static global resource arrays. (#2786)
Allow static global resource arrays to be lowered to allocas
Note: allowing other aggregate types here leads to problems due to
not properly translating aggregate initializers.
2020-03-22 18:01:20 -07:00
Xiang Li 729307c4d9
Create x*x for pow(x,2). (#2777) 2020-03-19 11:48:47 -07:00
Adam Yang 43122fc20b
Special ICB variables for nops are now arrays for driver compat. (#2774) 2020-03-18 16:01:00 -07:00
Greg Roth 606d13b606
Fix memcpy replace with undominated users (#2766)
When the only use of an alloca is as the destination of a memcpy
operation, the compiler will try to replace all uses of the alloca with
the source of the memcpy. That's fine when all the uses are in locations
where the source of the memcpy is sure to be defined, otherwise, a lot
of complicated PHI insertion is needed to get things right. Lacking it
made LCSSA checks fail. When LCSSA did succeed in inserting PHIs, it did
it wrong which caused other problems.

Rather than do the complicated PHI insertion at memcpy replace, this
change borrows from the previous work for global variables in similar
circumstances and recharacterizes the use of the pointer to be stored
only, but not uniquely. The result of this is the memcpy is replaced
with a load/store pair. There is a good chance that pair will go on to
be replaced by the mem2reg pass, which eliminates the alloca in the end.
That pass has excellent PHI insertion logic that gets everything right.

The added test includes a number of different conditional locations for
memcpy that each produced unique issues in investigating this issue.
2020-03-18 14:37:00 -07:00
Adam Yang 6fe0776524
Separated cond mem2reg and scalarizing precise vector allocas (#2765) 2020-03-16 16:05:12 -07:00
Tex Riddell 79cb431936
Fix invalid sample count for typed buffers in struct (#2763)
* Fix invalid sample count for typed buffers in struct
* Decode resource prop in disasm, update tests
2020-03-11 22:31:49 -07:00
Jeff Noyle 816f3141d4
PIX: Add new API to retrieve info about compilation options (macros, target profile etc) (#2758)
This lets PIX call a much better-matched API than DIA to retrieve these things.
2020-03-10 13:11:55 -07:00
Xiang Li bc7b65cbe9
Run SROA multiple times. (#2754) 2020-03-08 18:25:20 -07:00
Jeff Noyle af79d8b431
Add tracerayinline to access tracking (#2751)
Co-authored-by: Jeff Noyle <jeffno@ntdev.microsoft.com>
2020-03-06 06:53:14 -08:00
Tex Riddell 39efc1c762
Fix lib assert/hang passing different struct with same layout to function (#2745) 2020-03-05 14:52:01 -08:00
Tex Riddell ccd45ba40b
Rewriter: improvements plus extract uniforms to global scope (#2730)
- Added rewriter functions for extracting uniforms into global scope
- Will not work if name collides (namespaces have bugs)
- Values under cbuffer _Params, resources outside (more bugs)
- Added rewriter HLSLOptions and support all through RewriteWithOptions
- Refactored dxr.exe to use HLSLOptions and RewriteWithOptions
- Exposed Write*VersionInfo functions from dxclib
- Fixed issue with External[Lib/Fn] for version printing.
2020-03-04 21:27:07 -08:00
Tex Riddell cd48b34db0
Allow Wave intrinsics in DXR shader stages. (#2742) 2020-03-04 21:21:22 -08:00
Adam Yang 9bf9701ca7
Separated dead block removal to its own pass. Fixed bug where dead resources are not removed (#2744) 2020-03-04 19:04:27 -08:00
Greg Roth 08f3100f25
Increase scan limit for DSE, add option (#2725)
* Increase scan limit for DSE, add option

Due to the large number of fields in a struct passed to an exported
function, the number of useless loads and stores exceeded the builtin
limit, which left a lot of them in. This increases the default limit
from 100 to 500 and adds a hidden parameter -memdep-block-scan-limit to
set it to whatever is needed for future workarounds.

The Dead Store Elimination pass examines stores to see if they are
unneeded. If it finds no uses between the store and the original load,
it eliminates both, but if it has to exceed the instruction limit to get
there, it gives up and leaves it in just in case.
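
The effect of the scan limit can be shown on a toy instruction list. This is a simplified model, not the real Dead Store Elimination or memory dependence code: `Op`, `Scan`, and `scanBackFromStore` are hypothetical, and a single memory location is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Op { Load, Use, Other, Store };
enum class Scan { Removable, Needed, GaveUp };

// Scans backward from the store at `storeIdx` looking for the original load.
// Gives up conservatively once more than `limit` instructions were examined.
Scan scanBackFromStore(const std::vector<Op> &block, std::size_t storeIdx,
                       unsigned limit) {
  assert(storeIdx < block.size() && block[storeIdx] == Op::Store);
  unsigned scanned = 0;
  for (std::size_t i = storeIdx; i-- > 0;) {
    if (++scanned > limit)
      return Scan::GaveUp;     // too far: leave the store in, just in case
    if (block[i] == Op::Use)
      return Scan::Needed;     // the value is used in between
    if (block[i] == Op::Load)
      return Scan::Removable;  // nothing in between: load and store can go
  }
  return Scan::GaveUp;         // no load found in this block
}
```

With 150 unrelated instructions between the load and the store, a limit of 100 gives up while a limit of 500 finds the pair, which mirrors the motivation for raising the default.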
2020-03-04 15:50:01 -08:00
Adam Yang 8d79e009a5
Printing out line num and debug variable and expression when disassembling. (#2738) 2020-03-03 16:33:01 -08:00
Xiang Li 73e5b381c7
fix crash when has sampler as return type. (#2736) 2020-03-03 15:47:41 -08:00
Adam Yang 52a54200c5
Faster printing when validation fails with really big shaders (#2737) 2020-03-03 15:19:25 -08:00
Adam Yang 227c8e6f5a
Added instructions to preserve intermediate values of computations. (#2721) 2020-03-03 00:29:42 -08:00
Jeff Noyle be3f3fa2ee
User/jeffnn/pix dontoverwriteoffsetcounter (#2729)
Fix for overflow case
Overloads for StoreVertexOutput
Reformat to remove curly-on-end
2020-03-02 09:23:26 -08:00
Tex Riddell 713c80ce4e
Rename 'module' to 'hModule' for C++20 compat (#2667)
* Rename 'module' to 'hModule' for C++20 compat
* Replace a bunch of unintended uses of new keyword 'module' with 'mod'
2020-03-01 18:12:31 -08:00
John Porto 63a3b45067
Adds the DxcPixDxilDebugInfo interface and friends. (#2715)
* Adds the DxcPixDxilDebugInfo interface and friends.

* Modifies the entrypoints to require InParam/OutParam for pointers, as well as CheckNotNull them

* Removes S_FALSE for happier Jeff

* Fixes broken test

* returns E_POINTER for nullptrs

* Returns S_FALSE from UnAlias for non-aliasing types.

* fails GetName for arrays

* Addresses CR comments
2020-02-25 17:50:23 -08:00
Jeff Noyle eb33030b03
Pix mesh shader output instrumentation (#2709)
This is a pass for PIX that adds instructions to write mesh shader output (vertices and indices) to a UAV for later ingestion by PIX in order to present a view of that output.
2020-02-21 10:25:34 -08:00
John Porto 6eb0e070fb
Adds pass for converting calls to dbg.value to dbg.declare (#2706)
* Adds pass for converting calls to dbg.value to dbg.declare

* fixes travis build breaks

* addresses CR comments
2020-02-20 09:13:07 -08:00
Xiang Li 4098229583
Code cleanup. (#2708)
* Code cleanup.
2020-02-19 16:41:11 -08:00
Tex Riddell 5a1fd7ebf5
Update DxilMDHelper's SM for correct MinValVersion info. (#2701)
Bug triggered DXASSERT, but otherwise harmless on fre builds.
2020-02-18 11:02:09 -08:00
Greg Roth c207a9a8d3
Correct reflection stripping detection (#2696)
When stripping reflection data, structures are searched for any member
or submember scalars that are smaller than 32 bits. When one is found
and min precision is being used, the structure is identified as not
being able to be stripped.

In the case of a shader that has a resource that contains a struct which
contains an array which contains a struct, the struct was not being
processed as a struct, but rather as a scalar, which would cause it to
return a size of zero, which was found to be less than 32 bits. This
isn't right if the innermost struct contained a 32 bit integer. Either
way, this resulted in inconsistent results for the parent struct and the
innermost struct. As a result, the annotations were stripped from the
innermost, but left on the outermost. Because the parent struct had
annotations, validation was attempted, but ultimately hit an assert or
caused validation failures because of the innermost struct being
stripped.

By reorganizing the function that searches for 16 bit scalars to look
for a struct only after processing the arrays, the struct is properly
traversed and all annotations are stripped, avoiding the failures.
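
The ordering fix (peel arrays first, then look for a struct) can be sketched on a toy type tree. This is an illustration, not the DXC implementation: `Type`, `hasSmallScalar`, and the helper constructors are hypothetical.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical type tree: scalars carry a bit width, arrays have one child
// (the element type), structs have one child per member.
struct Type {
  enum Kind { Scalar, Array, Struct } kind;
  unsigned bits;
  std::vector<Type> children;
};

Type scalar(unsigned bits) { return Type{Type::Scalar, bits, {}}; }
Type arrayOf(Type elem) { return Type{Type::Array, 0, {std::move(elem)}}; }
Type structOf(std::vector<Type> members) {
  return Type{Type::Struct, 0, std::move(members)};
}

// True when the type contains any scalar narrower than 32 bits.
bool hasSmallScalar(const Type &t) {
  const Type *cur = &t;
  // Peel every array level first so that an array of structs reaches the
  // struct branch below instead of being mistaken for a scalar.
  while (cur->kind == Type::Array) {
    assert(!cur->children.empty());
    cur = &cur->children.front();
  }
  if (cur->kind == Type::Struct) {
    for (const Type &member : cur->children)
      if (hasSmallScalar(member))
        return true;
    return false;
  }
  return cur->bits < 32;
}
```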

Rather than determining all aspects of keeping or stripping reflection
annotation based on the individual struct and its contents, all structs
contained in a struct are collected so that they can be excluded from
stripping should their parent struct be found to need its
annotations.

Thanks to @tex3d for suggesting (and mostly implementing) this addition
2020-02-18 10:34:12 -07:00
Xiang Li 8156151a58
Update name and fix nonuniformindex. (#2697)
* Add nonUniformIndex for CreateHandleFromHeap and change name to CreateResourceFromHeap.

* Check group when lower CreateResourceFromHeap.
2020-02-14 15:24:36 -08:00
Adam Yang bfb8143b27
hlslFlags is now a null-terminator separated list of compiler options (excluding target, entry, and defines) (#2698) 2020-02-13 19:50:17 -08:00
Adam Yang b33846a03c
Using DxilValueCache for unroll. (#2694) 2020-02-13 15:46:00 -08:00
Xiang Li 91fe12f0eb
Add GetResourceFromHeap. (#2691)
Add GetResourceFromHeap for hlsl.
For Dxil, add CreateHandleFromHeap and AnnotateHandle.

All handles ( createHandle, createHandleForLib, createHandleFromHeap ) must be annotated with AnnotateHandle before use.

TODO: add AnnotateHandle for pix passes;
clean up code about resource.
2020-02-12 21:50:02 -08:00
Greg Roth e26d9e3c22
Avoid validation failure when missing annotation (#2687)
When validating CBuffer ranges, the validator may find itself with a field
with an empty EltSize. This may be due to partial reflection stripping which
removes some of the annotations, but not all. This allows the validation to
begin, but runs into problems with the missing inner fields. Specifically, this
has been found to occur when using a library stage with a cbuffer containing
an array of structs. This is a bug with the compiler that this change works
around by bailing on validation late.

This can also occur when a cbuffer contains an array of empty structs. Not
a terribly useful case, but it should be handled correctly.

This also adds tests for each of the above described conditions.
2020-02-11 14:23:02 -07:00
Helena Kotas a42ffbf491
DXBC to DXIL Converter + unit tests (#2685)
Includes dxilconv-specific DXIL optimization passes added to opt.exe tool.
2020-02-11 12:07:26 -08:00
Adam Yang 76e34cc27d
Fixed lack of constant folding sometimes putting instructions before PHI's. (#2683) 2020-02-10 18:00:15 -08:00
Helena Kotas 5d741a0279
HLSL test infrastructure and other refactoring and helper classes (#2682)
* HLSL test infrastructure and other refactoring

Refactor common test infrastructure code into HLSLTestLib
Enable invocation of fxc and other executables via // RUN: commands in test files
Add latest d3dx12.h to include/dxc/Support and remove two other outdated copies
Improve DXIL container header validation on load
New helper classes DxilContainerReader and FixedSizeMemoryStream
Move LoadSubobjectsFromRDAT to DxilSubobjects.cpp

Co-authored-by: Greg Roth <grroth@microsoft.com>
2020-02-06 21:49:21 -08:00
Tex Riddell 45deef90b0
Add missing UTF8 for source and output filenames (#2674)
* Add missing UTF8 for source and output filenames

- These need to be converted back to UTF-16 after argument processing for
  a couple APIs and CA2W constructors were missing CP_UTF8 for the input
  encoding from the utf-8 argument strings.

* Add CP_UTF8 to more CW2A and CA2W uses
2020-02-06 18:23:31 -08:00
Xiang Li 9e88903d42
Check hasName before getName for StructType. (#2678) 2020-02-04 16:02:31 -08:00
Adam Yang 7a8c4a20da
Fixed incorrect PHI value in DxilValueCache. Also using it in a few more places. (#2668) 2020-01-30 17:27:22 -08:00
Xiang Li c0d82459f4
Merge fix for undef in first_iteration case from upstream. (#2672) 2020-01-30 14:56:32 -08:00
Greg Roth c9b1465954
Fix crash when using precise matrix with -Od (#2670)
* Fix crash when using precise matrix with -Od

To flag a variable as precise, a dx.attribute.precise call is inserted.
In the case of matrices, this becomes a problem if it persists too long
due to validation errors because the matrix is never lowered. When we
reach the HLMatrixLowerPass, the matrix can be lowered to vector. This
change detects when the precise call is applied to a matrix, lowers the
parameters and replaces the call with one taking a vector instead.
The call is necessary to keep the precise information across function
calls.

Adds variants of precise tests. precise/matrix.hlsl is modified by
matrix_od.hlsl to take the -Od parameter.
precise/propagate_to_producers_interproc.hlsl is modified to use
matrices since an earlier fix for this bug caused a regression when this
alteration was made.

Fixes #2189
2020-01-29 22:33:38 -07:00
Tex Riddell 32168eac84
Add pragma control for diagnostics and option printing (#2656)
* Add pragma control for diagnostics and option printing
* Test converting warning to error with #pragma dxc diagnostic error "..."
* Support options: -f[no-]diagnostics-show-option and -W[no-]<warning>
2020-01-27 18:07:03 -08:00
Xiang Li 0532d8cd08
Fix crash caused by precise on gs output. (#2650) 2020-01-16 10:49:26 -08:00
Adam Yang 9501f39cc8
Replaced llvm.donothing with loading from a special ICB (#2644)
* Fixed test and dxil prepare

* Removed unused vars and auto

* Stray newline

* No longer giving the loads a name

* Fixed another test
2020-01-10 14:40:10 -08:00
Adam Yang cd7ae78e57
Global variables lowered to local variables are classified as arg_variables (#2637) 2020-01-10 13:54:32 -08:00
Xiang Li 2f440d0462
Move wave sensitive check from validation into DxilValidateWaveSensit… (#2640)
* Move wave sensitive check from validation into DxilValidateWaveSensitivity pass.
2020-01-10 13:43:01 -08:00
Tex Riddell 925a79f8a8
Translate resource poison value to undef index instead of assert (#2621) 2020-01-07 14:40:27 -08:00
Tex Riddell f38f449ea7
Remove llvm.donothing for dxil <= 1.5 instead of using validator version (#2634) 2020-01-07 14:39:55 -08:00
Vishal Sharma 82b1625bb5
Don't emit discard when input is non-negative (#2628) 2020-01-06 14:46:47 -08:00
Xiang Li 9e5bd8b870
Bump to shader model 6.6 (#2631) 2020-01-06 11:04:44 -08:00
Xiang Li 9c89a1c2c6
Make shader model related code generated. (#2629)
* Make shader model related code generated.
2019-12-30 12:37:07 -08:00
Xiang Li e8422a4f4e
Fix build error for vs2019. (#2626) 2019-12-19 14:32:05 -08:00
Jeff Noyle c5df9d33e6
PIX shader debugging: debug phis by adding new blocks that write out relevant values (#2612)
This checkin adds new basic blocks as new preceding blocks between blocks containing Phis and the previous preceding blocks. The new blocks output PIX debug instrumentation for the value named by the Phi for that edge.
2019-12-04 08:59:15 -08:00
Jeff Noyle 2dcb1efd6b John's PR feedback: Move counter out of loop! 2019-12-03 17:11:31 -08:00
Xiang Li 3defb835f1
Set alignment for element global variable. (#2615) 2019-12-03 15:54:18 -08:00
Xiang Li 0867be4b4e
Keep alignment when LowerType. (#2614) 2019-12-02 16:33:23 -08:00
Jeff Noyle a95b0bc4ed Merge branch 'master' into PIX_Debug_Phis 2019-12-02 11:24:04 -08:00
Jeff Noyle d4c31404c3
Just run clang format on PIX passes (#2611)
Zero code change here, except moving the #include for DxilOperations.h to be on its own in a few files, in order to cope with our clang-format's habit of ordering includes alphabetically.
This is in preparation for a real change coming next.
2019-12-02 11:22:54 -08:00
Jeff Noyle 4efedeedf4 Yay that works 2019-11-27 16:33:13 -08:00
Xiang Li 815358229d
Use TAG_auto_variable for local variable which converted from global … (#2602)
* Use TAG_auto_variable for local variable which converted from global variable.
2019-11-25 23:10:24 -08:00
Jeff Noyle 0f075532c1 checkpoint 2019-11-22 12:46:33 -08:00
Jeff Noyle 399227c175 checkpoint 2019-11-22 11:08:49 -08:00
Jeff Noyle dafa0cf7b9 Merge branch 'master' into PIX_Debug_Phis 2019-11-21 13:46:41 -08:00
Jeff Noyle e70fa4bdad checkpoint 2019-11-21 12:44:16 -08:00
Tex Riddell 00c6d87f68 Switch metadata generation on shader model instead of validator version.
- MinVal[Major|Minor] tracks shader model.
- Account for reflection by special casing for version 0.0 (no validation)
- Tolerate additions to non-critical metadata for future version
- Keep track of unrecognized non-critical metadata for validation
- Update validator to detect this case.
2019-11-20 17:12:42 -08:00
Tex Riddell f4965b71dd Integrate dxcapi v2 and other changes from internal (#2575)
* Integrate changes from internal.

- dxcapi v2
- new dxc options
- DxilValueCache
- PDB and NoOpt improvements
- noop / llvm::donothing() support

* Update dxrfallbacklayer for dxcapi internal changes

* Reorder diag block based on whether pDiag is set first.

* llvm::donothing() requires dxil 1.6 / SM 6.6 for now, lib as well.

* Fixes for spir-v, non-VC compiler and non-Windows builds

- DEFINE_CROSS_PLATFORM_UUIDOF for new interfaces
- add SAL annotations
- turn output argument validation for -P into warning
- handle warnings without concatenating them to main output
- update spirv preprocessing and compilation paths
- return E_NOTIMPL from IDxcUtils::CreateReflection
- cleanup: DxcContainerBuilder back to utf8, DxcTestUtils: remove comment

* Fix some warnings from clang/gcc.

* Fix unicode conversion problems on linux, where sizeof(wchar_t) == 4

Note this is an intermediate fix.
On linux, what we are calling utf16 is actually a wide string
that's probably utf32.  This change fixes issues introduced by
the new interface changes so things are consistent and pass tests.

A future fix should correct the encodings so they are correctly labeled
on platforms where wchar_t doesn't mean UTF16.
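
The platform difference is easy to observe directly. The sketch below is not taken from the change; `wideUnitsForNonBmpChar` and `likelyWideEncoding` are hypothetical helpers, and the encoding name is a guess based only on the width of `wchar_t`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Number of wchar_t code units needed for one character outside the Basic
// Multilingual Plane: two (a surrogate pair) with a 16-bit wchar_t, one with
// a 32-bit wchar_t.
std::size_t wideUnitsForNonBmpChar() {
  const std::wstring s = L"\U0001F600";
  return s.size();
}

// Assumption: a 2-byte wchar_t holds UTF-16 and a 4-byte one holds UTF-32.
const char *likelyWideEncoding() {
  return sizeof(wchar_t) == 2 ? "UTF-16" : "UTF-32";
}
```

Code that labels every wide string "utf16" therefore holds a different encoding on platforms where `sizeof(wchar_t) == 4`.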

* Return false for IsBufferNullTerminated when CP_ACP.

One test for Disassembler was crashing because it created a pinned blob
with a size of 1 << 31 + 1 without actual memory backing this.  The
IsBufferNullTerminated would attempt to see if this was null terminated,
causing AV.

This change also removes CP_UTF8 from this test when it was creating
binary blobs, not UTF8 text blobs.
2019-11-13 16:16:45 -08:00
Vishal Sharma 48d3b17b2b
Fix DXIL linker issue (#2588) 2019-11-13 12:59:01 -08:00
Adam Yang 3c20b0d808
Fixed dia requiring zero termination in binary bitcode and crashing when emitting warning. 2019-11-12 22:24:07 -08:00
Xiang Li 94460c988b
Support register binding on resource in cbuffer. (#2582) 2019-11-11 12:03:06 -08:00
Xiang Li 462253a263
Clear register binding for resource in cbuffer. (#2580)
TODO: use correct register binding if exist.
2019-11-08 12:56:44 -08:00
amarpMSFT 42a511cb77
Add missing intrinsics to query InstanceContributionToHitGroupIndex via RayQuery (#2578)
* Add Candidate_ and Committed_InstanceContributionToHitGroupIndex() intrinsics to RayQuery object
2019-11-07 13:16:36 -08:00
Xiang Li f09333ba88
Support case memcpy has multiple overloads. (#2573) 2019-11-06 14:58:29 -08:00
Xiang Li 2b296a784e
Not hoist when has different operand. (#2569) 2019-11-05 21:53:45 -08:00
Jeff Noyle dfda2ee12d
Use DWORD-aligned raw access only! (#2559) 2019-10-29 16:36:45 -07:00
Adam Yang 3a870f2c08
Cleaning up unused alloca. (#2558)
* Cleaning up unused alloca

* Slight improvements

* Use single list. Added memcpy

* Made everything simpler
2019-10-28 22:20:22 -07:00
Xiang Li 0d24872386
Take care resource in cbuffer when not_use_legacy_cbuf_load. (#2543) 2019-10-26 13:05:24 -07:00
Tex Riddell 0513758aee
Fix header/reflection for RTAS/FeedbackTexture (#2549)
- Make header no longer conflict with officially added shader input types
- Replace uses with last enum +1/+2 to make compatible with prev SDKs
- Add reflection Test for FeedbackTexture2D[Array]
2019-10-24 17:27:27 -07:00
Tex Riddell 35e3e3105a
Reflection: Remove hack preventing more than one SV_ClipDistance (#2544)
No idea why this was here, but it was very old (predates GitHub repo),
and no tests failed when I removed it, so I think it's fine to remove.
2019-10-24 17:13:15 -07:00
Tex Riddell f032e2bce5
Fix precise on matrix with matrix subscript. (#2545) 2019-10-24 17:08:05 -07:00
Jeff Noyle 8ea40f8604
PIX DIA: Use min/max RVA of variable's uses as live range (#2548) 2019-10-24 15:54:57 -07:00
Adam Yang 755aed5536
Fixed an infinite loop in DxilEraseDeadRegion (#2542)
* Fixed infinite loop in erase dead region

* Slightly simpler

* Added test
2019-10-24 14:11:46 -07:00
Xiang Li b21fc88a55
Disable scalarization of const_static vector arrays. (#2546) 2019-10-24 11:58:38 -07:00
Helena Kotas 4391420da4
Integration from OS repo (#2541)
Includes OS PR #2130654: C++17 readiness: Enable /std:c++17 globally: Fixes for C2039 Errors
2019-10-22 15:43:26 -07:00
Tex Riddell 244670ec9e Pass DxilTypeSystem to GetLoweredUDT for type annotations 2019-10-22 09:48:19 -07:00
Tex Riddell 6eb541244a Lower vector/matrix early for UDT ptrs used directly such as Payload 2019-10-22 09:48:19 -07:00
Tex Riddell 6357448a38 Update type validation to support legal UDT case.
New Rules:
outer type may be: [ptr to][1 dim array of]( UDT struct | scalar )
inner type (UDT struct member) may be: [N dim array of]( UDT struct | scalar )
scalar type may be: ( float(16|32|64) | int(16|32|64) )

- Disallow pointers to pointers, and pointers in structs
- Disallow multi-dim arrays at top-level, but allow within struct
2019-10-22 09:48:19 -07:00
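The "New Rules" listed in the type validation commit above can be read as two small recursive checks. The sketch below uses a toy type model, not LLVM types; `Kind`, `Type`, `isLegalInner`, and `isLegalOuter` are hypothetical names, and the scalar bit-width rule (16, 32, or 64 bits) is left out for brevity.

```cpp
#include <cassert>
#include <vector>

enum class Kind { Scalar, Struct, Array, Pointer };

// Toy type node. `elem` is used by Array and Pointer, `fields` by Struct.
struct Type {
  Kind kind;
  const Type *elem;
  std::vector<const Type *> fields;
};

// Inner rule: [N dim array of]( UDT struct | scalar ), no pointers.
bool isLegalInner(const Type &t) {
  const Type *p = &t;
  while (p->kind == Kind::Array)
    p = p->elem;  // any number of array dimensions
  if (p->kind == Kind::Scalar)
    return true;
  if (p->kind != Kind::Struct)
    return false;  // a pointer inside a struct is rejected
  for (const Type *f : p->fields)
    if (!isLegalInner(*f))
      return false;
  return true;
}

// Outer rule: [ptr to][1 dim array of]( UDT struct | scalar ).
bool isLegalOuter(const Type &t) {
  const Type *p = &t;
  if (p->kind == Kind::Pointer)
    p = p->elem;  // at most one pointer level
  if (p->kind == Kind::Array)
    p = p->elem;  // at most one array dimension at top level
  if (p->kind == Kind::Array || p->kind == Kind::Pointer)
    return false;  // multi-dim array or pointer to pointer
  return isLegalInner(*p);
}
```

Under this reading, a pointer to a struct holding a two-dimensional array is accepted, while the same two-dimensional array at top level is rejected.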
Tex Riddell a9fe6e19f5 Validation: Remove flawed special case for MeshShader Payload type 2019-10-22 09:48:19 -07:00
Tex Riddell 74480d1e72 Validation: Allow inner constant GEP for GEP and BitCast instructions 2019-10-22 09:48:19 -07:00
Tex Riddell f383b21be9 Add HLLowerUDT: early UDT ptr lowering for direct use in final Dxil
- DispatchMesh and other intrinsics that use UDT ptr directly need
  this to prevent copying and reconstruction, as well as prevent vectors
  and HL matrices from being used in final Dxil.
2019-10-22 09:48:19 -07:00
Tex Riddell 8a54ceba7c HLMatrixSubscript lowering: support lowering directly to array 2019-10-22 09:48:19 -07:00
Tex Riddell 5e06769ac8 MultiDimArrayToOneDimArray: support multi-dim array of struct
- Required for native structure ptr used directly by certain intrinsics,
  such as DispatchMesh.
2019-10-22 09:48:19 -07:00
Tex Riddell 698d0bec6b HLMatrixLowerPass Fix: Memory representation for constant init 2019-10-22 09:48:19 -07:00
Tex Riddell f5e9661afa Handle groupshared addrspace cast in ComputeViewIDState 2019-10-22 09:48:19 -07:00
Tex Riddell dd3af11b88 Fix DispatchMesh with groupshared payload. 2019-10-22 09:48:19 -07:00
Helena Kotas 57a8a6c8db
Fix validation failure for 16-bit float resource types (#2524)
Fix validation failure for 16-bit float resource types, add fp16 test coverage for Texture2D.Sample*
2019-10-21 11:26:37 -07:00
Tex Riddell a6a28a34bc
Fix lowering for all TextureCube[Array] Sample* and Gather* overloads (#2535)
* Fix lowering for all TextureCube[Array] Sample* and Gather* overloads

* Fix offsets for Gather test for change to i32 0 from i32 undef

- When not specifying immediate offsets when they can apply,
  we used to provide undef, but we should have been using zero.
2019-10-19 13:28:54 -07:00
Tex Riddell f5a75b4e4c
Update PSVSemanticKind for SV_ShadingRate and SV_CullPrimitive. (#2532) 2019-10-17 15:38:56 -07:00
Xiang Li 0bd0afe693
const folding on dxil.convergent.marker. (#2523) 2019-10-16 16:56:03 -07:00
Jeff Noyle 73e2229bf3
PIX: (DIA front-end): Use module entry point as backup iteration start #2527 2019-10-16 13:56:00 -07:00
Tex Riddell ee534ef3a1
Fix library cbuffer reflection for matrix array Elements count (#2513) 2019-10-16 09:34:05 -07:00
Jeff Noyle 779f2f878e
PIX: Add Amplification/Mesh shader to shader debugging instrumentation (#2521) 2019-10-14 16:14:38 -07:00
Xiang Li 77afe15b0d
Not fold gep 0, ... 0 for HLSL. (#2519)
not fold gep 0, ... 0 for HLSL.
2019-10-11 17:46:39 -07:00
Adam Yang 97ec60accd
Added pass to remove regions with no escaping values or side effects. (#2508)
* Erase dead region

* Pass dependencies

* Simpler heuristic, only checking that Begin dominates End and End post dominates Begin

* Small cleanups. No longer iterating whole block to find PHIs

* A few optimizations. Fixed infinite loops caused by self-loops
2019-10-08 15:45:22 -07:00
Tex Riddell 2a01c58f73
Fix RayQuery allocation for CSE, DCE, statics, arrays, and lifetimes (#2469)
Fixes problems like:
- extra AllocateRayQuery calls, or improper location (for lifetime)
- proper array support
- static global RayQuery

This RayQuery allocation changes:
- Add a constructor to RayQuery
- Set init sequence to use constructor in InitializeInitSequenceForHLSL, just for RayQuery
- For array: modify EmitCXXAggrConstructorCall to
  - loop over index instead of pointer to allow SROA of RayQuery struct
  - mark the loop as HlslForceUnroll
- Add hidden flag for HL intrinsics to allow internal intrinsic not produced
  by HLSL directly - mangle name so it can't be matched during parse.
- Add hidden HL AllocateRayQuery intrinsic
- Translate constructor call on ptr to HL AllocateRayQuery intrinsic call producing handle i32 during FinishCodeGen
- Translate RayQuery ptr to load i32 handle value for intrinsic methods during SROA_HLSL
- Flatten RayDesc for TraceRayInline
  (otherwise /Od fails validation since RayDesc type may still be present)
- No longer skip RayQuery for SROA_HLSL
- Update lowering for AllocateRayQuery, i32 handle, and flattened RayDesc
- Remove ReadNone attribute from AllocateRayQuery to prevent incorrect CSE optimizations
- Manually cleanup unused RayQuery allocations
2019-09-27 12:50:43 -07:00
Adam Yang 85b366764e
Fixed cfg related hash failures (#2491) 2019-09-26 21:20:58 -07:00
Tex Riddell 8a65c18d93
Refactor useful code from DxilTranslateRawBuffer into dxilutil (#2487) 2019-09-25 09:08:00 -07:00
Adam Yang da5f591736
Fixed very long loop when subrange is negative (#2486) 2019-09-24 17:58:06 -07:00
Tex Riddell b1fea1f075
Fix Amplification Shader (AS) payload size in runtime data (#2484) 2019-09-24 13:24:39 -07:00
Vishal Sharma da9a66f768
Fix compilation failures involving mul intrinsic overloads (#2470) 2019-09-23 12:46:57 -07:00
Tex Riddell 8fdb31fa36
Always write shader hash when supported, don't tie to debug name (#2468)
* Always write shader hash when supported, don't tie to debug name

* Stop trimming quotes from pdb name

This is the responsibility of cmd.exe, not the API.
The API user should not be adding extra quotes to the pdb name string
if they don't want them.
2019-09-19 17:36:18 -07:00
Adam Yang 56222b94be
Fixed a failure with LexicalBlockFile (#2453) 2019-09-09 09:34:39 -07:00
Jeff Noyle d7e448455c
calculate offsets for embedded arrays (#2444) 2019-09-06 15:28:46 -07:00
Tex Riddell 2bf23d3edf
Fix SampleLevel lowering for Cube/CubeArray (#2449) 2019-09-05 14:05:06 -07:00
Jeff Noyle 25ee2654d9
huh. (#2442) 2019-09-04 15:59:02 -07:00
amarpMSFT 63cbac780c
Mesh shader output size calculation fix (#2440)
missed align32 validation for mesh outputs
2019-09-03 17:40:00 -07:00
amarpMSFT 05d1ad532a
Acceleration structures should be classified as raw srv buffers, not typed srv buffers (#2425)
* acceleration structures should be classified as raw buffers
2019-08-22 16:35:47 -07:00
Tex Riddell 8f92b56130 Merge branch 'sep-reflect-merge' into sep-reflect 2019-08-22 10:50:29 -07:00
Tex Riddell 0d6d1f1773 Fix mesh shader signature element usage masks 2019-08-22 10:45:14 -07:00
Tex Riddell 171b98ff05 Merge branch 'sep-reflect-merge' into sep-reflect 2019-08-20 17:39:42 -07:00
Tex Riddell 16bb46f69d -validator-version: use '.', fix typo, complete linker support 2019-08-20 17:38:46 -07:00
Tex Riddell 0f23b6946c Merge remote-tracking branch 'ms/master' into sep-reflect 2019-08-19 00:43:25 -07:00
Tex Riddell 892765cc4b Default to stripping reflection from DXIL and fix a bunch of fallout.
Two test options, -Qstrip_reflect_from_dxil and -Qkeep_reflect_in_dxil
for making tests work with reflection removed, since many tests are relying
on main module disassembly-reassembly between test phases and reflection
metadata will no longer be present there.  The strip option is for the
few cases where tests don't want the reflection kept in DXIL by default.

Validator no longer requires function annotations for no reason.

Fix places where remove global hook was not being called when functions
were removed manually from the list.

StripReflection now deletes function annotations, unless targeting lib or
old validator that required them.  Preserve global constructor list and
add annotation for 1.4 validator.  The global hook fixes were required
here, otherwise annotations would refer to dead functions during linking.
Struct annotations may not be removed in library case when they still need
translation to legacy types.

Allow missing struct annotation when not necessary to upgrade the layout.

Preserve usage in reflection by upgrading the module, emitting metadata,
cloning for reflection, then restoring validator version and re-emit
metadata.

Fix size for 16-bit type for usage and reflected size.

Make various batch reflection tests require validator 1.5, since these
tests rely on module disassembly->assembly, which will not preserve extra
usage metadata for reflection in 1.4.

Include reflection part in IDxcAssembler, but don't strip from module,
since there are no options to prevent this from breaking a lot of tests.

Don't strip reflection from offline lib target.
2019-08-19 00:39:39 -07:00
Tex Riddell 4ac7d8d584 Fix -validator-version issue, add tests
- default should be UINT_MAX, not zero, since zero is used
- compare validator version against target profile minimum for error
- add back implicit -Vd for lib_6_1/2 in CompileWithDebug since
  target profile is supplied outside options, so option validation
  would not know to fail without -Vd.
- add tests for various cases
2019-08-16 20:52:23 -07:00
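The first two bullets of the -validator-version fix above describe a sentinel default and a minimum-version comparison. A minimal sketch, assuming a plain major/minor pair; `ValVer` and the helper names are hypothetical, and the real option handling (including how -Vd interacts with it) is not modeled.

```cpp
#include <cassert>
#include <climits>

// Hypothetical major/minor pair for a validator version.
struct ValVer {
  unsigned majorVer;
  unsigned minorVer;
};

// UINT_MAX marks "not specified". Zero cannot serve as the sentinel
// because zero is itself a version value that callers may pass.
constexpr unsigned kUnset = UINT_MAX;

inline ValVer unsetValVer() { return ValVer{kUnset, kUnset}; }
inline bool isSet(ValVer v) { return v.majorVer != kUnset; }

inline bool atLeast(ValVer v, ValVer minimum) {
  return v.majorVer > minimum.majorVer ||
         (v.majorVer == minimum.majorVer && v.minorVer >= minimum.minorVer);
}

// An override is acceptable when absent, or when it is not older than
// the minimum the target profile requires.
inline bool overrideIsValid(ValVer requested, ValVer profileMinimum) {
  if (!isSet(requested))
    return true;
  return atLeast(requested, profileMinimum);
}
```

With this shape, an explicit `0.0` is distinguishable from "no override given", which is the property the commit asks for.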
Tex Riddell f48650865e Add part header and RDAT to separate reflection output 2019-08-14 20:06:09 -07:00
Ehsan 922ef652fa
[spirv] Add option to flatten array of resources. (#2397)
* [spirv] Add option to flatten array of resources.

Current SPIR-V code generation uses 1 binding number for an array of
resources (e.g. an array of textures). However, the DX side uses one
binding number per array element. The newly added
'-fspv-flatten-resource-arrays' changes the SPIR-V backend behavior to
use one binding number per array element, and uses spirv-opt to flatten
the array.

TODO: Add a test where the array is passed around.
TODO: Test this works with steven's PR and proper results are produced.

* [spirv] Update tests to include array of samplers.

* [spirv] Take early exit condition out of the loop.

* [spirv] Add documentation for the new cmd option.

* [spirv] Invoke CreateDescriptorScalarReplacementPass when needed.

* [spirv] address code review comments.
2019-08-14 09:45:01 -04:00
Tex Riddell aeea959d88 Add default to switch to remove compiler warning. 2019-08-12 22:49:24 -07:00
Tex Riddell 5aec592757 Add separate reflection part (STAT), add -Qstrip_reflect_from_dxil
- Put separate reflection in STAT part for now.
- Separate reflection is the module with deleted function bodies.
- Use new -Qstrip_reflect_from_dxil to drive stripping of reflection
  metadata from DXIL part, since now -Qstrip_reflect means strip
  the STAT reflection part, or don't include it in the first place.
- Update disassembler to use STAT part if available for reflecting
  resource bindings, buffer descriptions, and ViewID state.
- Put some Qstrip_* flags under DriverOption as well as CoreOption.
2019-08-12 18:15:19 -07:00
Tex Riddell 5894e7ab66 return bool changed from DxilModule::StripReflection 2019-08-12 16:15:52 -07:00
Tex Riddell bd929a7129 Reflection: optionally use new metadata+RDAT instead of instructions
- Makes it compatible with a module stripped of function bodies.
- Signature masks are improved, making it necessary to switch expected
  masks in DxilContainerTest.
2019-08-12 15:57:08 -07:00
Tex Riddell 4234a9ae53 Add CB Usage to metadata, compute in hlsl-dxil-lower-handle-for-lib 2019-08-12 15:32:17 -07:00
Tex Riddell eea0c94c08 Add Sig element usage masks to metadata, compute in hlsl-dxilfinalize 2019-08-12 15:30:01 -07:00
Tex Riddell 153edf8f5e RDAT: Fix shader stage and feature masks for ValVer 1.4 compat 2019-08-12 15:22:59 -07:00
Tex Riddell d9f134530d Fix Validator 1.4 compat break for bool signature element. 2019-08-12 15:20:51 -07:00
Tex Riddell b637b8ebab Fix metadata load for a Type template arg
- This would cause an assert when you store, write to bitcode, load from
  bitcode, store to bitcode again, then load again, when using the
  Type template arg metadata.
2019-08-12 15:20:51 -07:00
Tex Riddell f0bab7f861 Make MetadataHelper validator version aware
- use current validator version for adding template type metadata,
  rather than min for shader model.
2019-08-12 15:20:51 -07:00
Tex Riddell 406f537b49 Add -validator-version override
- remove auto-Vd on lib < 6.3, since this is already required to be
  explicit by option parsing.
2019-08-12 15:20:51 -07:00
Tex Riddell bc216664e5 DxilValidation: Restore original wording for Patch Constant Signature 2019-08-12 15:20:50 -07:00
Tex Riddell e5fc71ec2f Update GetMinValidatorVersion for DXR 1_1 (1.5) and Subobjects (1.4) 2019-08-12 14:40:31 -07:00
Tex Riddell 28a00c2c31 ShaderFlags: Set DXR 1.1 on RaytracingPipelineConfig1, new StateObjectFlag 2019-08-12 14:38:54 -07:00
Xiang Li 7c7bb98f0f
Use SExt instead of ZExt for frexp. (#2395)
* Use SExt instead of ZExt for frexp.
2019-08-06 13:58:35 -07:00
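Why sign extension matters for the frexp change above: the exponent returned by frexp can be negative, and widening a negative value with zero extension produces a large positive number. The sketch below illustrates this with a 16-bit to 32-bit widening; the widths and helper names are assumptions for illustration, since the commit message does not state them.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Widen by reinterpreting the bits as unsigned first (what ZExt does).
inline std::int32_t zeroExtend(std::int16_t v) {
  return static_cast<std::int32_t>(static_cast<std::uint16_t>(v));
}

// Widen while preserving the sign (what SExt does).
inline std::int32_t signExtend(std::int16_t v) {
  return static_cast<std::int32_t>(v);
}

// frexp splits x into mantissa * 2^exponent; small inputs give a
// negative exponent, e.g. 0.125 == 0.5 * 2^-2.
inline int exponentOf(double x) {
  int e = 0;
  std::frexp(x, &e);
  return e;
}
```

For non-negative exponents both widenings agree, so the bug only shows up for inputs with magnitude below 0.5.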
Adam Yang f5b8c0967e
Fixed memcpy lowering where instruction gets generated before its indices. (#2394) 2019-08-06 12:19:21 -07:00
Tex Riddell c8f7a6c970
Allow [Get/Set]NumThreads on Mesh/Amplification shaders (#2393) 2019-08-05 19:45:24 -07:00
czw831024 5fca2b49e1 add DXIL tests to verify mesh shader's output size and payload plus output size 2019-08-01 12:21:45 -07:00
Tex Riddell c012b4d0f5
Fix intrinsic arguments for WriteSamplerFeedback operations (#2387)
- Align coord dimensions with Sample for future flexibility and alignment
- Fix ddx and ddy arguments to support the correct number of dimensions
- Rewrite lowering, using SampleHelper
- Clean up SampleHelper a bit, adding additional asserts/checks
- Set components to zero for default offset, not undef
2019-07-30 18:34:19 -07:00
Tristan Labelle 1156209ad6
Fix nondeterminism sources in the linker. (#2383) 2019-07-29 14:33:33 -07:00
amarpMSFT f9c973536e
flesh out SV_CullPrimitive support (#2373)
* flesh out SV_CullPrimitive support (and fill in some missing SV_ShadingRate entries)

* fix build break

* moved #defines and added comment

* removed dependency on adding new entries to OS header d3dcommon.h

* test fixes
2019-07-26 17:51:09 -07:00
Xiang Li f63e5dab7e
Only add nonuniform on resource ptr. (#2369) 2019-07-26 16:52:44 -07:00
Tristan Labelle 02ac47c26d
Replace std::make_unique with llvm::make_unique (#2367) 2019-07-25 18:32:10 -07:00
Tristan Labelle 146ce5ba0b
Improve validation error message formatting (#2301)
Chiefly, prints the instruction causing the validation failure rather than its address.
2019-07-25 12:29:37 -07:00
Greg Roth c619f91173 Remove lambda parameter shadow (#2366)
Later versions of clang fail to compile a lambda function that includes
a parameter that shadows an explicit capture. Such a one was introduced
to HLOperationLower recently. This preserves the existing behavior while
removing the shadow.
2019-07-25 08:58:41 -07:00
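A small illustration of the lambda issue fixed above, assuming the usual form of the problem: a lambda parameter with the same name as an explicitly captured variable. The function and variable names are hypothetical and unrelated to HLOperationLower.

```cpp
#include <cassert>
#include <vector>

int sumScaled(const std::vector<int> &values, int scale) {
  int total = 0;
  // Rejected by newer compilers: the parameter would shadow the capture.
  //   auto addScaled = [scale, &total](int scale) { total += scale; };
  // Renaming the parameter keeps the behavior and removes the shadow.
  auto addScaled = [scale, &total](int value) { total += value * scale; };
  for (int v : values)
    addScaled(v);
  return total;
}
```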
Tex Riddell f6ff322db3
Add mesh shader support to RootSignature parsing/validation and fix PSV (#2363)
* Add mesh shader support to RootSignature parsing/validation
* Fix PSV so MS output topology doesn't overlap SigInputVectors
* fix PSV version code to use latest when validation disabled
2019-07-24 16:39:34 -07:00
Tex Riddell 2facceae0b
Store mesh payload in function props, fix Wave/Quad/Barrier validation (#2361)
- Compute mesh payload size before final object serialization
  - During CodeGen for MS based on payload parameter
  - During CollectShaderFlagsForModule for AS based on DispatchMesh call
- Store payload sizes in corresponding function properties, serializing
  these properly for HL and Dxil Modules
- Use payload sizes from function props for PSV0 data during serialization
- Validate measured and declared payload sizes, don't just fill in
  properties during validation
- Fix Wave/Quad allowed shader stages, enabling Quad* with CS-like models
- rename payloadByteSize members to payloadSizeInBytes
- Add GetMinShaderModelAndMask overload taking CallInst for additional
  detail required to produce correct SM mask for Barrier operations
2019-07-24 10:22:28 -07:00
amarpMSFT 37acf90723 add payload size to Amplification Shader metadata to mirror MS metadata (#2359) 2019-07-23 17:27:42 -07:00
Jeff Noyle b54cbf0ad7
PIX: Change shader access tracking pass to use non-atomic stores (#2360) 2019-07-23 17:26:31 -07:00
Tristan Labelle 0332b4bd90
Update FeedbackTexture2D types to be templated (#2347)
- Update the HLSL syntax from FeedbackTexture2DMinLod to FeedbackTexture2D<SAMPLER_FEEDBACK_MIN_MIP>
- Update DXIL to only have two UAV types for FeedbackTexture2D[Array] and use an extra metadata field to distinguish between the sampler feedback type.
2019-07-23 12:27:05 -07:00
Tristan Labelle c7ccf50539
Fix crash with mesh shader output signatures (#2355)
The primitive signature was being used with vertexStoreOutput instructions. It would crash if the vertex signature was bigger than the primitive signature.
2019-07-19 07:37:20 -07:00
Tristan Labelle 49a5dd63c2
Remove globalopt dependency on the name of the entrypoint (#2350)
GlobalOpt has an explicit test for a function called "main" (eww...) to enable an optimization. This updates the test to match the entry point, independent of its name.
2019-07-17 14:31:26 -07:00
Xiang Li 7264e48bcb
Disable alloca merge when inline to make remove alloca easy. (#2349)
* Disable alloca merge when inline to make remove alloca easy.
2019-07-17 10:52:05 -07:00
Vishal Sharma c1d78e5116
Fix implementation of d3dcolortoubyte4 (#2345)
* Fix implementation of d3dcolortoubyte4

* Address CR comment

* Remove lang version flag

* Add relevant excerpt from staco
2019-07-16 16:51:27 -07:00
amarpMSFT c41606c737 Add RaytracingPipelineConfig1 subobject to DXR (#2342) 2019-07-16 01:34:45 -07:00
Tex Riddell 7d29e947b5 Misc cleanup 2019-07-13 19:30:58 -07:00
Tex Riddell 397a67082e Misc cleanup during review 2019-07-12 19:19:14 -07:00
Tex Riddell 7a085056f8 Rework PIX access tracking to remove hard coded table.
Instead of hard-coding properties here, constructing every overload of
every resource method we know about, then iterating users, we:
- iterate through used intrinsic functions, collecting the ones that
  have handle parameters (resource accesses).  This will pick up all
  used overloads.
- Instead of hard coding read/write access in a table here, we use the
  function attribute which carries this same information for main resource
  operations.
- Exceptions to matching function attribute:
  GetDimensions: DescriptorRead, but commented to match prior behavior
  BufferUpdateCounter: Counter
  TraceRay[Inline]: Read, but commented to match prior behavior for now
- Then, this type of resource access is emitted for the first handle arg
- Subsequent handles are considered DescriptorRead operations,
- DescriptorRead currently just aliases to Read, but could be useful in
  the future if we want to differentiate or support GetDimensions.
2019-07-12 17:06:10 -07:00
Tex Riddell dc835b8892 Set AllocateRayQuery to ReadNone to allow DCE on unused RayQuery alloc 2019-07-12 12:11:52 -07:00
Tex Riddell 02d4ba3095 Fix SROA of RayQuery alloca, update test
Skip RayQuery alloca for SROA_HLSL, but don't treat as HLSLObjectType elsewhere
2019-07-12 11:58:20 -07:00
Tex Riddell afb0482332 Fix opcodes in tests and metadata versioning to handle early serialization
- Fix metadata version logic to handle no shader model for early HLModule.
- Modify validation tests that change validator version to 1.3 to also
  modify the struct annotation to conform to that version of the type
  annotation metadata.
- Fix opcodes from merge.
- Add CHECKs for RayQuery intrinsics in tryAllOps, disabled until handle
  alloca can be properly eliminated.
2019-07-11 21:04:32 -07:00
Tex Riddell 6b4c3b4710 Merge remote-tracking branch 'rt/mesh-master' into merge-dxil-1-5 2019-07-11 17:25:02 -07:00
Sahil Parmar 17d6045f62 Merged PR 123: Fix SpirEgg/DXIL build issues from PR #116
Fix SpirEgg/DXIL build issues from PR #116
2019-07-12 00:24:44 +00:00
Tex Riddell afbe50930c Merge rayquery into merge-dxil-1-5 2019-07-11 17:20:38 -07:00
Amar Patel (GRAPHICS) 4b4d5ca5f6 Merged PR 122: Add GeometryIndex() to any hit, closest hit, and intersection shaders, with raytracing_tier_1_1 feature bit 2019-07-12 00:14:59 +00:00
Tex Riddell cd9fee2291 Merge rayquery into merge-dxil-1-5 2019-07-11 17:06:38 -07:00
Tristan Labelle b7868f8081 This change implements the `FeedbackTexture2D[Array](MinLOD|Tiled)` types in HLSL and the backing `WriteSamplerFeedback[Bias|Level|Grad]` DXIL intrinsics. 2019-07-11 16:51:16 -07:00
Sahil Parmar 968fe41136 Merged PR 116: Add support for HLSL Meshlets
This PR adds support for new HLSL mesh and amplification shaders.
2019-07-11 20:19:23 +00:00
Amar Patel (GRAPHICS) 6bac49fd65 Merged PR 120: Adding RayQuery intrinsics 2019-07-11 13:49:03 +00:00
John Porto 3440e968fe
DxilDia bugfixes: (#2327)
1) avoids leaking exceptions during symbol initialization
2) returns E_FAIL when the scope for a local variable is not found
3) uses getInlinedAtScope when searching for the local variable's scopes.
2019-07-10 10:55:54 -07:00
Tex Riddell 2209844cda Add RayQuery object, TraceRayInline method + template arg annotations 2019-07-09 18:55:55 -07:00
Helena Kotas e3cf6b6ed9
Fix external validator crash on empty struct (#2323)
Fixes #2322
2019-07-09 13:17:07 -07:00
Tex Riddell 03ecd7d3d7 Fix more: array[1] and unbounded cases, address feedback
- Array of [1] would crash in UpdateStructTypeForLegacyLayout
  since IsResourceArray would be false.
- Unbounded array has weird quirk in D3DCompiler reflection:
  all resources report BindCount == 0, except cbuffer reports UINT_MAX.
  Added simple unbounded cases for other resource types to verify.
- Fxc has some auto-binding bugs.
  Worked around in the test with explicit binding.
- Added suggested helper functions for strip/wrap array types.
  Used in code that's part of this change, but didn't search for other
  places to replace yet.
2019-07-06 12:49:19 -07:00
Tex Riddell 2165224f0e Fix reflection of ConstantBuffer/StructuredBuffer arrays, nested arrays
Bugs for arrays:
- Size incorrect for ConstantBuffer type
  (should not be divided by array size)
- Size incorrect for StructuredBuffer element
  (was calculated with array dimensions multiplied in)
- Nested arrays would assert (ConstantBuffer) or
  fail to count/unwrap all resource dims (StructuredBuffer)

Match fxc weirdness:
- ConstantBuffer array has Elements == 1 for inner element type
  instead of 0, which would be consistent with everything else.
- StructuredBuffer array has "[0]" appended to the Name
  for each array dimension of the buffer...
2019-07-05 19:17:48 -07:00
Tex Riddell b6d67a3851 Fix validation for legacy cbuffer layout
- base of struct should always be aligned - or internal bug
- offset for array member must always be aligned - (new) validation error
- alloc and verify struct layouts even when not array field
- out of bound check would have missed OOB on last array element
2019-07-05 18:27:50 -07:00
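Two of the checks in the legacy cbuffer validation fix above (aligned array offsets and the bounds check on the last array element) can be sketched as follows. This assumes the usual legacy packing rule that each array element starts on a 16-byte row while the last element is not padded out; the helper names are hypothetical and the real validator tracks more state.

```cpp
#include <cassert>

constexpr unsigned kRowBytes = 16;  // one float4 row in legacy cbuffer layout

inline unsigned alignToRow(unsigned n) {
  return (n + kRowBytes - 1) / kRowBytes * kRowBytes;
}

// Bytes spanned by an array: full rows for all but the last element,
// plus the unpadded size of the last element.
inline unsigned legacyArraySize(unsigned elemSize, unsigned count) {
  if (count == 0)
    return 0;
  return alignToRow(elemSize) * (count - 1) + elemSize;
}

// The array must start on a row boundary, and its last element must
// still end inside the buffer.
inline bool arrayFits(unsigned offset, unsigned elemSize, unsigned count,
                      unsigned cbufferSize) {
  if (offset % kRowBytes != 0)
    return false;
  return offset + legacyArraySize(elemSize, count) <= cbufferSize;
}
```

For a `float3[4]` (12-byte elements) this gives 3 * 16 + 12 = 60 bytes, so a check that stops one element early would accept a buffer that is too small.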
Tex Riddell 9a28817c2e Fix bugs in legacy cbuffer layout for matrix
- would translate all matrices into arrays, but all arrays have to be
  row-aligned, when only multi-row matrices are row-aligned by fxc.
- would force matrix cols=4 if IsCBuf, but:
  a) this is not the correct way to align matrices that should be aligned.
  b) IsCBuf was true if resource was array (which should not impact
     layout of elements inside)
- now we only translate matrix to array if rows > 1, because that's the
  only case in which they must be aligned like an array.  And all arrays
  must be aligned (even elements == 1).
2019-07-05 18:27:18 -07:00
Tex Riddell f7c0da0d27 Fix type in cbuffer annotation when converting to legacy layout
Copy of original annotation would copy original struct type,
instead of one for legacy layout.  This leads to internal validation
using a different type when validating cbuffer layout than external
validation would use, since this member would be synthesized by the
key, rather than saved and restored directly.
2019-07-05 18:00:31 -07:00
Xiang Li e48c086f14
Check flatten attribute in SimplifyCFG. (#2311)
* Check flatten attribute in SimplifyCFG.
2019-07-02 16:59:13 -07:00
Tristan Labelle c0cf2018e6
Enable validation of rawbufferload/rawbufferstore (#2300)
The validation code was written for these but never ran due to this omission.
2019-07-01 18:45:53 -07:00
Adam Yang 47e31c8a4c
Removed quotations from debug name (#2303) 2019-06-26 11:52:24 -07:00
Tristan Labelle a0c95cd98b
Support non-int32 types with SV_***ID semantics (#2286)
It seemed overkill to request backends to support a @dx.op.threadId.i16 intrinsic so I'm always calling the i32 version and then truncating or zextending to the final type.
2019-06-25 16:21:18 -07:00
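The approach described above (always read the 32-bit ID, then truncate or zero-extend to the declared type) amounts to a single unsigned conversion. A minimal sketch with a hypothetical helper name, restricted to unsigned targets so that both directions are well defined:

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Convert a 32-bit system-value ID to the declared unsigned type:
// narrower types keep the low bits (trunc), wider types are padded
// with zeros (zext).
template <typename T>
T convertId(std::uint32_t id32) {
  static_assert(std::is_unsigned<T>::value,
                "sketch covers unsigned targets only");
  return static_cast<T>(id32);
}
```

Usage: `convertId<std::uint16_t>(0x12345u)` keeps `0x2345`, and `convertId<std::uint64_t>(0xFFFFFFFFu)` stays `0xFFFFFFFF` rather than becoming negative.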
Tristan Labelle 33d5f9cecf
Fix memcpy size calculation in ScalarReplAggregatesHLSL::ReplaceMemcpy (#2298)
Multiplying by the alignment made no sense. This also fixes a case where the alignment itself made no sense.
2019-06-25 15:32:11 -07:00
Xiang Li cd0e61e77b
Fix validation error when element size for array of struct is not aligned (#2295)
* Fix validation error when element size for array of struct is not 16 aligned.

* Remove outdated comment.
2019-06-25 11:09:12 -07:00
Adam Yang 5ac1b30c2b
Fixed a bug where max unroll iteration attempt is cached. (#2291) 2019-06-24 15:49:59 -07:00
Tristan Labelle 9799235a6a
Fixed InterlockedCompareExchange failing validation (#2284)
The code handled instruction addrspacecasts but not constant expression ones. Refactored into a common method between two code paths.
2019-06-24 10:34:38 -07:00
Tex Riddell 4610aa7cbb
Merge pull request #2285 from tex3d/refl-offsets
- Added test with a number of apparently untested cases
- Updated some tests used for the more exhaustive reflection test
  which were failing to compile on fxc due to utf-8 bom and bool `|=`.
- Don't ignore dxc failures with ignore dxbc fails flag.
2019-06-20 22:14:51 -07:00
Tex Riddell c5985f483c Fix CBuffer and StructuredBuffer offsets for packing and reflection.
- Added test with a number of apparently untested cases
- Updated some tests used for the more exhaustive reflection test
  which were failing to compile on fxc due to utf-8 bom and bool `|=`.
- Don't ignore dxc failures with ignore dxbc fails flag.
2019-06-20 20:11:15 -07:00
Xiang Li 365e6e8aa9
Support case for memcpy float3[1], float3[1][1]. (#2280) 2019-06-20 15:34:13 -07:00
Tristan Labelle 4f16293686
Fix memcpy mishandling in parameter SROA. (#2281)
LowerMemcpy tries to see if a variable is only written to by a single memcpy. There is a special case for llvm::Arguments, since arguments have an implicit first write for their initialization in addition to whatever memcpy they might have. During parameter SROA, aggregate arguments get broken down into their constituent elements as placeholder AllocaInsts, and at the end of the pass they'll get promoted back into llvm::Arguments of the SROA'd function. In the intermediate steps, however, a test for isa<Argument> will fail to detect that an AllocaInst is a placeholder for a future argument, and so its implicit initialization will not be taken into account. This would cause a bad ReplaceMemcpy, yielding reads from uninitialized locals, causing undefs and all of that fun stuff.
2019-06-20 15:33:48 -07:00
Tex Riddell fccc807a5a
Fix StructuredBuffer field offsets in reflection (#2278) 2019-06-20 11:14:41 -07:00
Adam Yang e68a18756c
Fixed a bug where dbg.value are missing for PHIs (#2275) 2019-06-19 17:32:11 -07:00
Tex Riddell dacb21040a
Fix PDB generation to not require embedded hash inside DxilContainer (#2272)
This would break when official DXIL.dll was used, since it would exclude
the shader hash part that the validator wouldn't recognize, and is not
required for a valid shader.  This prevented shader signing for the
output of newer compiler versions.
2019-06-19 16:03:12 -07:00
Tristan Labelle e233221247
Fix off-by-one in assert. (#2266) 2019-06-15 14:42:50 -07:00
Tristan Labelle 12678272d3
Fixed offset being lost when lowering Buffer<Struct>[i].Field (#2261)
Buffer<Struct>[i].Field was handled by the structuredbuffer code, which emitted a rawBufferLoad intrinsic. However, typed buffers cannot use rawBufferLoad, so some further code patched it back into a bufferLoad. The problem is that bufferLoad cannot take a byte offset into the element to be loaded; it has to load the entire thing, so we always ended up loading the first struct field.

This fix makes us emit a bufferLoad directly, and use the offset to determine which ResRet components to return, falling back to using a temporary array for dynamic indexing if needed.
2019-06-14 15:54:47 -07:00
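One way to picture the fix above: the whole element is loaded as a four-component result, and the field's byte offset selects which components to return. The sketch assumes 32-bit components and a constant offset; the function name and the use of `std::array` in place of the DXIL ResRet type are illustrative only, and the dynamic-indexing fallback is not shown.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Pick `numComps` components starting at `byteOffset` from a loaded
// four-component element. Returns an empty vector when the request is
// misaligned or does not fit inside the element.
std::vector<std::uint32_t>
extractField(const std::array<std::uint32_t, 4> &loaded, unsigned byteOffset,
             unsigned numComps) {
  const unsigned kCompBytes = 4;  // assumption: 32-bit components
  if (byteOffset % kCompBytes != 0)
    return {};
  unsigned first = byteOffset / kCompBytes;
  if (first + numComps > loaded.size())
    return {};
  return std::vector<std::uint32_t>(loaded.begin() + first,
                                    loaded.begin() + first + numComps);
}
```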
Adam Yang 4070005bfc
Fixed a random crash because of invalid pointer (#2262) 2019-06-14 15:40:18 -07:00
Tristan Labelle f3489bc99b
Workaround "legalize sample offset" pass introducing illegal alloca i8s. (#2265)
The legalize sample offset pass runs a bunch of optimizations in the hope to turn the offset argument into a constant value. This includes running the vanilla "SROA" pass, which differs from the HLSL one and will promote alloca i1s to alloca i8s, which is invalid dxil. This change modifies the "SROA" pass to leave alloca i1s intact as it is the solution least likely to introduce regressions.
2019-06-14 13:30:25 -07:00
Adam Yang 3bf5ad44d3
Fixed hash difference between debug and non-debug by sorting named value serialization. (#2259) 2019-06-13 16:10:42 -07:00
Adam Yang 298076a941
An invalid dbg declare doesn't automatically fail everything. (#2252) 2019-06-11 16:23:54 -07:00
Adam Yang e32833cac7
Loop count for all trivial form loops. Mem2Reg only when necessary. (#2250)
* Comments and small adjustments

* Deleting loops from scalar evolution correctly

* Better error/warning message
2019-06-07 18:40:43 -07:00
Tristan Labelle f749543a76
Fix buffer overflow in debug printing. (#2248) 2019-06-07 11:58:36 -07:00
Adam Yang bbf44f7a22
Fixed a crash with passing struct array element to functions (#2245) 2019-06-05 17:09:25 -07:00
Adam Yang dc6203ad5b
unroll(n) can now override loop bound. unroll(negative) now fails correctly (#2241) 2019-06-05 13:34:17 -07:00
Adam Yang 8da9314cb1
Fixed a compiler warning with unsigned/signed mismatch (#2221) 2019-05-30 15:58:39 -07:00
Adam Yang 2dec1cd0df
Putting debug info in PDB container (#2215) 2019-05-29 16:29:58 -07:00
Greg Roth 8ebb86399d Hide Unix symbols by default (#2213)
* Hide Unix symbols by default

Using the -fvisibility=hidden by default only exposes symbols that have
the __attribute__((visibility("default"))) attribute applied to them.
There are only a couple of functions that are meant to be exposed from
libdxcompiler. To expose the proper functions, the macros used to create
the UUIDs for class types are redefined with the attribute and
DXC_API_IMPORT is defined with the attribute above for non-windows
platforms.

Since we're hiding everything now, the portions that were explicitly
hidden to make MacOS work are removed.

This exposed a number of missing dependencies of libraries and unit
tests. Namely clang-hlsl-tests, clang-spirv-tests, and libclangCodeGen.

Resolves #2203

* Remove explicit marking of DxcThreadMalloc hidden

This was a workaround that is no longer necessary since all symbols are
hidden by default on Unix platforms now.
2019-05-29 12:56:02 -04:00
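The visibility scheme described in the commit above (#2213) can be sketched as follows. This is an illustration only, assuming a GCC or Clang build with -fvisibility=hidden; `EXAMPLE_API`, `ExampleCreateInstance`, and `InternalHelper` are hypothetical names and do not correspond to the actual DXC macros or entry points.

```cpp
// Assumption: compiled with GCC or Clang using -fvisibility=hidden, so every
// symbol is hidden unless it is explicitly marked with default visibility.
#if defined(_WIN32)
#define EXAMPLE_API __declspec(dllexport)
#else
#define EXAMPLE_API __attribute__((visibility("default")))
#endif

// Exported: remains visible to users of the shared library.
extern "C" EXAMPLE_API int ExampleCreateInstance() { return 0; }

// Not annotated: hidden from the shared library's dynamic symbol table when
// -fvisibility=hidden is in effect, but still callable inside the library.
int InternalHelper() { return 42; }
```

On Linux, one way to check the result is to list the exported symbols of the built shared library with `nm -D`; under the stated assumptions only the annotated function should appear.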
Tex Riddell 1d72beb39c
Add shader hash blob part (#2212)
- Change default to hash/name based on binary, not source
- Allow inclusion of hash/name without debug info (/Zi)
2019-05-24 17:17:12 -07:00
Tex Riddell af50891d4c
Build: Fix break with clang/gcc (#2210) 2019-05-24 14:33:24 -07:00
Tex Riddell 04fb1d6468 Merge remote-tracking branch 'ms/master' into dxil-1-5 2019-05-22 15:34:19 -07:00
Tex Riddell dfae866e19 Build: Update NumOpCodes static_assert in PIX pass 2019-05-22 15:30:47 -07:00
John Porto c9b9676c5d
User/joporto/dxcmem refactor (#2196)
* Moves the implementation of DxcThreadMalloc to dxcmem.cpp; removes DxcSwapThreadMalloc

* Remove DxcSetThreadMalloc(OrDefault) -- they were always used for installing the default allocator

* Deletes copy and move ctors and assignment

* stores the allocator in the TLS slot.

* DxcSwapThreadMalloc should be able to install a null allocator

* Marks the DxcThreadMalloc members as hidden (linux only)
2019-05-20 20:52:10 -07:00
Tex Riddell ba668cf13e Merge remote-tracking branch 'ms/master' into dxil-1-5 2019-05-20 11:49:53 -07:00
Adam Yang f240070cf1
Fixed issues that change codegen when there's debug info. (#2190) 2019-05-17 13:35:08 -07:00
Tristan Labelle 347c04d51c
Split DxilGenerationPass.cpp into several files for its several passes (#2180) 2019-05-15 08:59:16 -07:00
Adam Yang 716a113650
Fixed a bug where necessary code gets deleted with library shaders. (#2185)
Deleted some old code where we manually delete users of resource GV's while stripping debug modules. This is normally unnecessary but harmless, because we are guaranteed to have reduced all resource variables, and any failure to do so would result in a validation failure. However, library shaders can have resource variables. Removing the instructions here would result in incorrect code.
2019-05-14 11:10:46 -07:00
Adam Yang 24e71b12e5
Fixed a validation failure with stripping reflect with library shaders. (#2182) 2019-05-14 08:00:08 -07:00
John Porto 0e704e1ea2
DxilDia uses factories (not lambdas) for symbol creation (#2177)
* Modifies DxilDia to use factories instead of lambdas for creating symbols

DxilDia used to hold a vector of lambdas -- constructors -- for the symbols
it knew about. When the user requested a symbol, said constructor was to be
invoked, creating a brand new symbol.

C++ lambdas can be hard to debug (i.e., how does one inspect the lambda's
state?), so the library is modified to hold a vector of factories -- i.e.,
objects with fields and a CreateSymbol method.

* fixes typo when creating Typedef symbols

* caches the VarInfo when creating the LocalVariable factories
2019-05-10 11:00:24 -07:00
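The lambda-to-factory change described in the commit above (#2177) might look roughly like the sketch below. Only the idea of "objects with fields and a CreateSymbol method" comes from the commit message; `Symbol`, `SymbolFactory`, and `TypedefFactory` are hypothetical types and are not the DxilDia classes.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical symbol type for illustration.
struct Symbol {
  std::string name;
  unsigned id;
};

// A factory is an object with fields and a CreateSymbol method. Unlike a
// lambda's captures, its state lives in named members that a debugger can
// display directly.
class SymbolFactory {
public:
  virtual ~SymbolFactory() = default;
  virtual std::unique_ptr<Symbol> CreateSymbol() const = 0;
};

class TypedefFactory : public SymbolFactory {
public:
  TypedefFactory(std::string name, unsigned id)
      : m_name(std::move(name)), m_id(id) {}

  // Each call builds a brand new symbol from the stored fields.
  std::unique_ptr<Symbol> CreateSymbol() const override {
    return std::unique_ptr<Symbol>(new Symbol{m_name, m_id});
  }

private:
  std::string m_name;
  unsigned m_id;
};

// The table holds factories instead of a vector of lambdas.
using SymbolTable = std::vector<std::unique_ptr<SymbolFactory>>;
```

A caller would push a `TypedefFactory` into the `SymbolTable` and invoke `CreateSymbol()` on the entry when the symbol is requested.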
Tristan Labelle 17624a864b
Support space-only resource register annotations (#2163)
This change adds support for : register(spaceN), where a space is specified but no register, such that the resource allocator runs on the given space.

For backwards compatibility, -auto-binding-space applies to resources without register bindings but not to those with an explicit register assignment like : register(t0) - those always go in space 0. To support this, I had to add state to represent "unspecified space". In the UnusualAnnotation, it's an Optional<uint32_t>. In the DxilResource, it's UINT_MAX, like we do for unspecified registers.
2019-05-09 12:15:05 -07:00
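One reading of the space-selection rule in the commit above (#2163) is sketched below. It models only the UINT_MAX sentinel mentioned for DxilResource and does not model the Optional<uint32_t> used in the annotation. `kUnspecifiedSpace` and `ResolveSpace` are hypothetical names, and the precedence shown is an interpretation of the commit message rather than the actual allocator code.

```cpp
#include <climits>
#include <cstdint>

// Sentinel meaning "no space was written in the register annotation".
constexpr uint32_t kUnspecifiedSpace = UINT_MAX;

// Hypothetical helper: choose the register space for a resource.
//  - An explicit space, as in register(spaceN) or register(t0, spaceN), wins.
//  - An explicit register without a space stays in space 0 for backwards
//    compatibility.
//  - Otherwise the -auto-binding-space value applies.
uint32_t ResolveSpace(uint32_t annotatedSpace, bool hasExplicitRegister,
                      uint32_t autoBindingSpace) {
  if (annotatedSpace != kUnspecifiedSpace)
    return annotatedSpace;
  if (hasExplicitRegister)
    return 0;
  return autoBindingSpace;
}
```

For example, under this sketch a resource with no annotation and an auto-binding space of 3 resolves to space 3, while : register(t0) with the same setting resolves to space 0.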
Tristan Labelle f89e3b7150
Add support for RWRawByteBuffer.Store<T> (#2176)
Adds support for templatized RWRawByteBuffer.Store<T>. To avoid SROA making us lose the original layout of any struct arguments, a new pass runs before SROA and breaks down such cases into per-element stores. So better be careful with the likes of buf.Store(0, (int[65536])0);...
2019-05-09 11:51:38 -07:00
Tristan Labelle 3a7e6d38de
Add support for generalized templated byte address buffer loads (#2159)
Implements [RW]ByteAddressBuffer.Load<T> for all numerical or aggregate of numerical Ts, leveraging the same code from StructuredBuffer<T>.Load.
2019-05-08 11:39:25 -07:00
Tristan Labelle e4ceb30434
Convert more mallocs/frees to news/deletes (#2157)
This change replaces more malloc/calloc/realloc/free calls with new/delete. This creates several new out-of-memory exception paths (especially due to changing MallocAllocator), which exposed many memory mismanagement bugs that I found using the OOM test. Hence this grew to be a pretty big change.
2019-05-07 07:46:09 -07:00
Jeff Noyle 6679de6506
PIX shader access tracking: SM6.3 resource access overloads (#2172)
(finally) updating PIX's shader access tracking code to add the SM6.3 overloads of the various resource-access opcodes.
The new test passes on debug and release.
2019-05-06 15:03:05 -07:00
Tristan Labelle 7d1a397e4f
Fix double-delete when failing to load llvm bitcode (#2169)
We added an exception when the bitcode reader reports an error; however, the code higher up in the call stack isn't fully exception-safe, and it turns out that two unique_ptrs over the same MemoryBuffer are alive at the same time, hence the double-delete and, more often than not, a crash.
2019-05-06 08:29:43 -07:00
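The hazard described in the commit above (#2169) arises whenever two unique_ptrs are built over the same raw pointer: if an exception unwinds while both are alive, each destructor deletes the object. The commit message does not show the actual code, so the sketch below only illustrates the general exception-safe alternative, in which ownership is transferred with std::move and exactly one owner exists at any time. `Buffer`, `ParseOrThrow`, and `TryLoad` are hypothetical stand-ins, not LLVM or DXC APIs.

```cpp
#include <memory>
#include <stdexcept>
#include <utility>

// Hypothetical stand-in for a memory buffer; counts live instances so that
// leaks and double-deletes would show up as a nonzero or negative count.
struct Buffer {
  static int liveCount;
  Buffer() { ++liveCount; }
  ~Buffer() { --liveCount; }
};
int Buffer::liveCount = 0;

// Takes ownership by value. If it throws, the parameter's destructor frees
// the buffer exactly once during stack unwinding.
void ParseOrThrow(std::unique_ptr<Buffer> buf, bool fail) {
  if (fail)
    throw std::runtime_error("invalid bitcode");
  (void)buf; // a real parser would consume the buffer here
}

// The caller gives up ownership with std::move, so no second owner remains
// alive to delete the same object again.
bool TryLoad(bool fail) {
  std::unique_ptr<Buffer> buf(new Buffer());
  try {
    ParseOrThrow(std::move(buf), fail);
  } catch (const std::runtime_error &) {
    return false;
  }
  return true;
}
```

In this sketch the buffer is released exactly once on both the success path and the error path.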