Commit graph

686 commits

Author SHA1 Message Date
Grace Jennings 0c2c998326
Add --version flag to print dxc version info (#3912)
* basic -version flag from -? flag

* now --version and added to -? text
2021-08-24 09:53:13 -07:00
Jeff Noyle 5d63d48f44
PIX: Add thread ID to amplification shader payload (#3904)
This change is required to enable PIX to reconstruct mesh shader output data correctly. Without knowing the thread id of the AS thread that launched the mesh shader threads, we don't know which probably-a-mesh to attribute each MS's output to.
2021-08-13 14:04:46 -07:00
Tex Riddell 9c504e3754
Add -enable-short-circuit option for operators &&, ||, ?: (#3892) 2021-08-06 20:59:13 -07:00
Tex Riddell cb6b85c6f9
-strict-udt-casting prevents implicit casts between unrelated structs (#3891) 2021-08-02 18:38:52 -07:00
Greg Roth 6d2d19a179 Merge remote-tracking branch 'origin/master' into hlsl2021 2021-06-17 12:39:25 -07:00
Jeff Noyle ec7e332308
PIX: Check SM66 handle types for dynamic indexing (#3819)
While working on another issue I noticed that I had forgotten these cases.
(PIX uses this pass to detect non-constant indexing into resource arrays in order to decide if it should run the whole "shader access tracking" pass to check for out-of-range access etc. So it was failing to detect dynamic indexing and so could choose not to run the SAT pass.)
2021-06-11 14:50:24 -07:00
Atte Seppälä 12b81409bb
Remove execinfo.h include in WinAdapter.h (#3797)
This small change makes it possible to build and use
DirectXShaderCompiler on Android, at least on arm64-v8a.

execinfo.h does not exist in Android toolchain. However it is
only used by clang-hlsl-tests which can be disabled from the build.

CaptureStackBackTrace is not used in any files
compiled in non-Windows builds. This is because CompilerTest.cpp,
where it was used, is disabled from non-Windows build at the moment.

Co-authored-by: Atte Seppälä <atte.seppala@ul.com>
2021-05-27 15:51:23 -07:00
Tex Riddell e6ba792e2d
Fold bitcast-to-base into GEP in MergeGepUse, plus refactor and resource fixes (#3801)
* Fold bitcast-to-base into GEP in MergeGepUse, plus refactor

  - Fixes case where bitcast isn't eliminated under -Od.
  - Fix issue with constant GEP path that was unintentionally using GEP
    from failed prior branch.
  - Change to worklist pattern, which simplifies things.
  - One case wasn't handled, which was going to complicate code:
    GEP -> bitcast -> GEP would combine bitcast -> GEP, but not combine
    outer GEP with the new one.
  - Add dxilutil::TryReplaceBaseCastWithGep that simply replaces if it matches.

* Fix cbuffer layouts when resources are in structs

  - keep track of resources in type annotation information
  - fix cbuffer packing locations for things around resources
  - when translating struct for legacy:
    - remove resource fields from struct types
    - properly replace HLSL type
  - remove unused structs and annotations
  - don't produce type annotations for silly things
  - fix other random bugs found along the way

* Fix ValidateVersionNotAllowed test issue with external validator

  It turns out the test wasn't correct in assuming dxil version reported was
  the one it should expect to be supported by the validator, since that
  represents the compiler version.  The validator version constrains the
  dxil version supported by the validator, not the compiler version.
  See comments in the test for details.

* Fix error reporting in ValidationTest::RewriteAssemblyToText

* Fix tests for translated structs and hostlayout name change.
2021-05-26 20:17:50 -07:00
Greg Roth 6999a51fe9
Update legacy structs during compile (#3659)
Instead of bailing out when compiling libraries without full 16-bit
support enabled, we update the struct type for legacy layouts and
replace the associated ops to the new types where necessary.

This results in converted types named with a leading "dx." prefix. The
converted type is an array type, which validation often doesn't expect
where it newly shows up, so validation is changed to permit these types.

As an incidental, remove function annotations when appropriate
regardless of full 16-bit enabling. There was no reason to make this
depend on that setting.

Adapt tests to the new types. Library targets may now produce a
dx.alignment* type instead of the user-specified type.

The prefix was confusing code that assumed it meant it was a builtin
type. Removing it allows reverting some of the hacky changes to evade
the problems that caused.

also fixes a couple mistakes in DxilLowerCreateHandleForLib and corrects
redundant inclusion of dxilmodule and module in params to
UpdateStructTypeForLegacyLayout

alignment.legacy wasn't really describing the current intent of the
modified type. Since the type is altered to match the layout of the
host, it's here renamed "hostlayout."

Fixes #3058
2021-05-25 13:12:26 -07:00
Tex Riddell 002f2e1db5
Fix bugs with ConstantBuffer resource object usage (#3796)
ConstantBuffer<> and TextureBuffer<> resource objects were not being
handled when the objects themselves were used instead of just reading from
them.  Examples are passing to/from functions, but also simply an extra
layer of ParenExpr.  For simple ParenExpr case, it's easy to just drill
through, but this isn't possible for other cases.  More temps need to be
potentially marked, so MarkRetTemp/MarkCallArgumentTemp is unified into
MarkPotentialResourceTemp which is also used for temp created during
EmitAggExprToLValue.  CBuffer init code in FinishCodeGen also needed
fixing.  When replacing resource users with subscript operator, users of
the actual resource object need the original HL resource type, which is
constructed through the HL HandleToResCast and stored to an alloca for
pointer users.

Fixed MutateResourceToHandle pass for CBV/TBV, and for HandleToResCast
case.

Fixed PatchTBufferLoad issues with AnnotateHandle and library cases.

Also added a pass to cleanup doubled AnnotateHandle calls leftover after
inlining.

Along the way, noticed a global init translation bug potentially impacting
the library case, but don't yet have fix implemented, or a specific test
case hitting this.

Also noticed that there's a whole special case for empty cbuffer that gets
created, but I don't think it's actually necessary.  Other code may need
adjusting, but it should be fine to skip emitting the special opaque-
typed global value for the empty, unused cbuffer.  Added some FIXME
comments for this, but have not tried to make this change yet.
2021-05-24 12:55:03 -07:00
Vishal Sharma 742657ade4
Add option to override semantic defines (#3766) 2021-05-11 21:43:15 -07:00
Greg Roth 61bd18526d
Allow variable offsets to gathers (#3764)
Mostly reverts the change that added an error when offsets were not
immediate or out of range for gather operations. More research suggests
this error was not required and it broke some existing shaders

Fixes #3738
2021-05-11 17:04:58 -07:00
Adam Yang 9a8816ba73
Added a PDB writer overload that doesn't take any input blobs (#3715) 2021-04-28 15:13:17 -07:00
Helena Kotas af088513ab Merge of branch 'master' of https://github.com/Microsoft/DirectXShaderCompiler into merge-master-hlsl-2021 2021-04-21 17:25:59 -07:00
Xiang Li bd38504d5e
Add -dumprs to dxa. (#3703) 2021-04-21 15:31:44 -07:00
Xiang Li 6775e94ef1
Bump shader model to 6.7 (#3679) 2021-04-10 12:24:49 -07:00
Xiang Li 5fa34d1fd6
Save root sig for entry for lib. (#3669)
* Save root sig for entry for lib.
2021-04-07 21:32:17 -07:00
Vishal Sharma 3daa429525
Update BaseAlignLog2 field in ResourceProperties for StructuredBuffer (#3652) 2021-04-01 04:55:31 -07:00
Greg Roth cf135fa88b
Allow removal of trivially dead convergent marker (#3640)
Failing to remove this because it is marked as having side effects so it
can prevent unwanted code movement resulted in trivially dead code being
retained unnecessarily because the marker isn't removed until after dead
code elimination. By allowing its removal when the operation that needed
it has been removed so it has no users, this dead code can be
eliminated.
2021-03-29 17:54:28 -07:00
Helena Kotas 555ca24474
Add load library error to dxc exception handler; add header with exception error codes (#3639)
* Add load library error to dxc exception handler
* Add header with exception error codes
2021-03-26 23:37:41 -07:00
Greg Roth 6b44d611f7
convert recoverable exceptions to c++ (#3636)
Instead of raising structured exceptions for unreachable and fatal
errors, raising c++ exceptions allows returning an error code and
getting a useful message instead of requiring a structured exception
handler to catch it.

Add cast failure assert
2021-03-26 19:56:42 -07:00
Tex Riddell 6244ab8337
Validator/Dxil version error improvements (#3623)
- Move validator/dxil version checks up-front
  These should fail first rather than side effects of trying to validate
  details of a version we don't support.
- Improve message for unsupported validator or dxil version
  These errors are most likely if compiling separately from validation
  and failing to override the validator version properly, or running on
  an external validator that doesn't support a newer dxil.
- Use dxil version from metadata for DxilModule when loading,
  rather than just setting it to minimum based on shader model.
- Remove TODO from validator messages that shouldn't be there
2021-03-25 14:03:31 -07:00
Helena Kotas 53b2ad9ffd
Add d3d12TokenizedProgramFormat.hpp header from the Windows Driver kit (#3617)
* Add d3d12TokenizedProgramFormat.hpp header from the Windows Driver kit.

The header is now open source under the University of Illinois Open Source License.
This change reduces the DX compiler projects dependencies on WDK to TAEF testing
framework only. That means the project can be built without WDK if tests are
excluded from the build (HLSL_INCLUDE_TESTS cmake option to OFF).

Fixes #2965
2021-03-25 12:19:38 -07:00
Adam Yang 86104f415f
Internal validator error messages don't need /Zi anymore. (#3606) 2021-03-20 19:21:26 -07:00
Greg Roth 9b475a7fa3
Add dxc exception handler (#3604)
The most common cause of internal compiler errors are access violations
or stack overflows. This registers an exception handler in dxc.exe for
these cases that are otherwise unhandled. It prints a simple message
for these errors and passes the exception along.

In case this is unwanted for some reason, a hidden disabling flag is
added as well.

Adds LLVM builtin exceptions for assert, fatal, and unreachable. Adds a
default message for exceptions not explicitly addressed.

Alters behavior of llvm_unreachable so it always raises an exception
regardless of compiler support for unreachable hints.

Reports errors using fputs instead of std::cerr to ensure that no
allocation is necessary. Custom output is performed in a static array
that is output with fputs.
2021-03-18 20:37:21 -07:00
Adam Yang e8372b9b9a
Fixed arg pairs not correct for old source in module pdbs (#3599) 2021-03-18 01:33:06 -07:00
Adam Yang 640c9af748
Added way for caller to replace args in PDB utils (#3595) 2021-03-17 12:38:11 -07:00
Xiang Li de09119850
Add -link to dxc. (#3577)
* Add -link to dxc.
2021-03-15 09:09:58 -07:00
Tex Riddell 0b686f347b
FileChecker improvements: VFS, %dxl, FileCheck -D, error reporting (#3576)
- VFS captures output files for duration of test, enabling:
- %dxl test IDxcLinker
- add -D to FileCheck args to supply defined variables
- report failing RUN command when not consumed by FileCheck or XFail
2021-03-14 22:13:26 -07:00
Michael Haidl 1db765ea9c
DXC extension for DXR Payload Access Qualifiers (#3171)
This extension adds qualifiers for payload structures accompanied with semantic checks and code generation. This feature is opt-in for SM 6.6 libraries.  The information added by the developer is stored in the DXIL type system and a new metadata node is emitted during code generation. The metadata is not necessary for correct translation of DXIL, so it may be safely ignored, but it provides hints to unlock potential optimizations in payload storage between DXR shader stages.
2021-03-14 18:31:40 -07:00
Adam Yang 92e3f2a6be
Fixed a bug reading version string from PDB. Implemented IDxcVersionInfo3 for DxcCompiler. (#3570) 2021-03-12 19:35:15 -08:00
Xiang Li 48b2c4611a
Support -export-shaders-only for linker. (#3571)
* Support -export-shaders-only for linker.
2021-03-11 23:49:28 -08:00
Adam Yang 95850daf8f
Made -Zs the flag for slim PDB. Added -Qpdb_in_private. (#3541) 2021-03-09 19:06:52 -08:00
Greg Roth 226a660a15
Merge pull request #3538 from microsoft/master
merge branch 'master' into hlsl-2021
2021-03-04 15:32:56 -07:00
Tim Corringham a342872a17
Template support (#3533)
Template support for HLSL
- can be enabled using command line option -enable-templates
- supports both function templates and aggregate templates
- support is more or less equivalent to that in C++98
- use of template features and syntax introduced by C++11 or later may
  not be handled gracefully

Co-authored-by: Tim Corringham <tcorring@amd.com>
2021-03-03 11:16:53 -08:00
Tex Riddell 5bdf3574bc
Revert "Report error for unsupported types of SV semantics (#3043)" (#3532)
This reverts commit bece3d4faf.

Revert system value type checking code due to regressions.
Will re-merge once it's fully verified fixed.
2021-03-02 21:24:25 -08:00
Greg Roth 3bd5f9ccfa
Errors on non-immediate load/gather offsets (#3283)
Gather operations that take four separate offsets are meant to allow for
programmable, non-immediate values. This removes them from the immediate
and range validations by giving the immediate versions their own opcodes.
For these intrinsics and loads, errors are generated when offsets are non-literal
or out of range. These largely use the slightly altered validation paths that the
sample intrinsics use.

Reword error when texture access offsets are not immediates to be
more all encompassing and grammatical.

Incidentally remove duplicate shaders being used for the validation
test from the old directory while identical copies in the validation
directory went unused. Redirected validation test to the appropriate
copies. This is consistent with the test shader re-org's stated intent

Sample operations have a maximum of 3 args, but gather has a maximum of
two since it always operates on 2D images. So gather operations only
pass in two offset args to the offset validation, which resulted in an
invalid access.

Rather than adding a specialized condition to evade this, just iterate
over the number of elements in the array. For sample it will be 3 and
for gather 2 and it will still check for expected undefined args
appropriately.
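
The idea of iterating over the actual number of offset arguments, rather than assuming three, might look like the sketch below. Everything here is hypothetical: `OffsetArg`, `offsetsInRange`, and the modeling of an undefined argument as an empty `std::optional` are illustration only, and the range of -8 to 7 is an assumption for this sketch rather than something stated in the commit.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical model: a defined immediate offset, or nothing for an undef arg.
using OffsetArg = std::optional<int>;

// Assumed immediate offset range for this sketch.
constexpr int kMinImmOffset = -8;
constexpr int kMaxImmOffset = 7;

// Checks however many offsets the caller passes (3 for sample, 2 for gather)
// instead of indexing a fixed three elements.
bool offsetsInRange(const std::vector<OffsetArg> &offsets) {
  assert(offsets.size() <= 3 && "at most three offset components");
  for (const OffsetArg &offset : offsets) {
    if (!offset) {
      continue;  // undefined argument: nothing to range-check
    }
    if (*offset < kMinImmOffset || *offset > kMaxImmOffset) {
      return false;
    }
  }
  return true;
}
```

Because the loop bound comes from the container, a two-element gather array is never read past its end.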

For the offset legalization pass, the opcode is used to determine the
start and end of the offset args

Only produce the loop unroll suggestion when within a loop

Base error line on call instruction instead of source of the offset

Sort by location in source when possible and remove duplicates

Adapt tests to verify and match these changes

Fixes #2590
Fixes #2713
2021-03-02 18:42:09 -08:00
Tex Riddell 19360a8fa6
Validate CBuffer size to max of 65536 bytes (#3507) 2021-03-02 16:34:35 -08:00
Adam Yang a7770e66eb
Added IDxcVersionInfo3 for custom version string (#3517) 2021-03-02 15:19:04 -08:00
Adam Yang c7d8ac5793
Added recompile function on PdbUtils that just hands back compilation result. (#3529) 2021-03-02 15:18:44 -08:00
Greg Roth 1ca779732f
Correct atomics on heap usage for libs (#3525)
The previous shader flag code failed to account for the library-specific
handle creation function
2021-02-28 16:02:44 -08:00
Greg Roth 62f390cb58
Set shader flags for dynamic resources (#3521)
Set shader flags for resource and sampler descriptor heap indexing based
on the CreateHandleFromHeap call and its ShaderHeap argument
2021-02-27 10:37:57 -07:00
Greg Roth 3f9db446e7
Correct 64-bit atomics on heap resource flags (#3520)
Left a few out
2021-02-26 22:58:33 -08:00
Greg Roth c3f9997709
64-bit atomics usage resource flags (#3518)
When a resource comes from a binding, we can't determine if it is on a
heap or not, so we set a flag on it to let the runtime sort it out.

When a resource comes from the heap directly, we know it's on the heap
so we set the shader flag to check for platform support.
2021-02-26 19:55:59 -08:00
Greg Roth deacf03358
Add used by atomics resource flag (#3513)
To identify resources that are used in 64-bit atomics operations in order to catch invalid use of heap resources
2021-02-26 11:25:28 -08:00
Vishal Sharma bece3d4faf
Report error for unsupported types of SV semantics (#3043) 2021-02-25 16:38:12 -08:00
Greg Roth 5fbaf73466
Error for groupshared outside of compute (#3472)
groupshared variables shouldn't be allowed outside of compute and
compute-like shaders. This adds a validation error when such
variables are used.

Add shader mask where groupshared are used

Correct several tests that used groupshared incorrectly

Incidental fix to atomics tests where the special case initializations
might have allowed the threads that failed to write to proceed
before the write took place and read the improperly initialized values

Changed up the assignments to avoid write collisions and be more
readable

Fixes #2603, #2677
2021-02-24 15:22:29 -08:00
Adam Yang b63b179587
Fixed pdbutils recompilation failure when the main file is in a relative path. (#3486) 2021-02-23 00:10:41 -08:00
Greg Roth 72e46b3709
Fix crashing from null elements producing errors (#3487)
Under certain rare circumstances, it is possible for errors to be
reported for function or global variables that are null.

This eliminates the absolute necessity for a non-null function or GV by
passing in the context needed to report any error. It also avoids some
of the cases where trying to get a debug function would give a null
which would replace the non-null function.
2021-02-22 14:56:54 -08:00
Jaebaek Seo f913bde7d0
[spirv] support optional use of base vertex instance (#3478)
In DX12, `SV_InstanceID` always counts from 0. On the other hand,
Vulkan `gl_InstanceIndex` starts from the first instance. Thus it
doesn't emulate actual DX12 shader behavior. To make it equivalent,
we can set `SV_InstanceID = gl_InstanceIndex - gl_BaseInstance`.
However, it can break the existing working shaders (i.e., compatibility
issue). We want to provide an option for users to choose what they
exactly want by `SV_InstanceID`. This commit adds
`-fvk-support-nonzero-base-instance` option. If it is enabled, it emits
`SV_InstanceID = gl_InstanceIndex - gl_BaseInstance`. Otherwise,
`SV_InstanceID = gl_InstanceIndex`.
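
As a rough model of the two behaviors, the following sketch computes the value that `SV_InstanceID` would take with and without the option. The function `svInstanceId` and its parameters are hypothetical names for illustration; the actual change is in SPIR-V code generation, not in a helper like this.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not DXC code: models the value of SV_InstanceID
// depending on whether -fvk-support-nonzero-base-instance is enabled.
std::int32_t svInstanceId(std::int32_t glInstanceIndex,
                          std::int32_t glBaseInstance,
                          bool supportNonzeroBaseInstance) {
  // gl_InstanceIndex already includes the base instance.
  assert(glInstanceIndex >= glBaseInstance);
  if (supportNonzeroBaseInstance) {
    return glInstanceIndex - glBaseInstance;  // counts from 0, as in DX12
  }
  return glInstanceIndex;  // default: previous behavior is kept
}
```

With a base instance of 5, the third instance has `gl_InstanceIndex` 7, so the option yields 2 while the default yields 7.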
2021-02-19 14:30:39 -05:00
Adam Yang 98ddd699de
Added shader reflection part to source-only PDB. (#3467) 2021-02-18 13:32:13 -08:00
Adam Yang 38e4ec9593
Added an internal interface to set a compiler on PdbUtils. (#3452) 2021-02-11 12:39:37 -08:00
Helena Kotas 6e4e1f8b93
Enable and fix warnings on execution tests (#3424)
* Move warning enabling pragma to cpp files before includes and fixed warning that popped up
Whitespace change in ShaderOpArithTable.xml

* Enable and fix warnings on execution tests

* One more warning
2021-02-03 11:46:10 -08:00
Adam Yang 01135442bb
Fixed the parameter for QuadReadLaneAt incorrectly being const (#3421) 2021-02-01 15:44:12 -08:00
Tex Riddell 3b99af518d
Implement fallback path for IsHelper on SM < 6.6 (#3408) 2021-01-29 11:05:05 -08:00
Greg Roth f68dfb5092
Add hidden option to disable lifetime markers (#3406)
Intended for debugging and workarounds
2021-01-28 19:54:16 -08:00
Xiang Li 48a661e490
Remove DxilCondenseResourcesPass. (#3386) 2021-01-22 23:49:34 -08:00
Tex Riddell b0f129cd09
Add ResourceKind/NumThreads to PSV0 (DxilPipelineStateValidation) (#3233)
- Increment MAX_PSV_VERSION to 2
- Clean up versioning logic
- Add NumThreads to PSV0 v2
2021-01-22 16:41:40 -08:00
Tex Riddell 7c9e487afd
Implement IsHelperLane() (#3382) 2021-01-22 12:45:18 -08:00
Xiang Li 797d4b8adc
Mutate resource to handle for shader model 6.6+. (#3374)
* Mutate resource to handle for sm_6_6+.

* Use bitcast to save global symbol and HLSL type for DxilResourceBase.
2021-01-21 17:16:06 -08:00
Adam Yang 2b8ef5b57e
Removed source from debug module. Support slim PDB. (#3348) 2021-01-20 12:25:28 -08:00
Xiang Li ac60ae1f4d
Add annotateHandle when link lower shader model to shader model 6.6. (#3368)
* Add annotateHandle when link lower shader model to shader model 6.6.
2021-01-19 18:15:58 -08:00
Xiang Li 83e538ecc3
Use IsEntry when LowerStaticGlobalIntoAlloca. (#3353) 2021-01-13 16:14:09 -08:00
Xiang Li 4dd6ff6b83
Add getElementStride to DxilResourceProperties. (#3312)
* Add getElementStride to DxilResourceProperties.
2020-12-12 17:21:28 -08:00
Vishal Sharma b0820241c1
Update validation rule info in hctdb.py (#3311) 2020-12-10 17:27:07 -08:00
Adam Yang 4d40670ef3
Fixed writing to RWTexture generating a load and store for each component (#3304) 2020-12-07 22:51:53 -08:00
Jesse Natalie 555fad0bc6
Add UINT8 typedef to WinAdapter.h (#3302) 2020-12-04 19:20:14 -08:00
Greg Roth d574d27c1f
Implement Shader Model 6.6 (#3293)
This is the work of many contributors from Microsoft and our partners.
Thank you all!

Add support for 64-bit atomics, compute shader derivatives, dynamic
resources (from heap), 8-bit Packed Operations, and Wave Size.
All of these require compiling with 6_6 targets with just a few
exceptions. Each of these features include unittests and Execution
tests.

64-bit atomics add 64-bit variants of all Interlocked* intrinsic
operations. This involves changing some of the code that matches
intrinsic overloads to call instructions. Also adds a few float
intrinsics for compare and exchange interlocked ops which are available
for all shader models 6.0 and up.

Compute shader derivatives adds dd[x|y], CalculateLevelOfDetail, and
Sample operations that require derivatives to compute. QuadRead
operations have been allowed in compute from 6.0+ and tests are added
for them here.

Dynamic resources introduce global arrays that represent the resource
and sampler heaps that can be indexed without requiring root signature
representations. This involves a new way of creating and annotating
resource handles.

8-bit Packed operations introduces a set of intrinsics to pack and
unpack 8-bit values into and out of new 32-bit unsigned types that can
be trivially converted to and from uints.
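
The packing idea can be illustrated with plain integer arithmetic. The helpers below are hypothetical and are not the HLSL intrinsics; the byte order (first element in the lowest byte) is an assumption made for this sketch.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: packs four 8-bit values into one 32-bit word,
// with element 0 in the lowest byte (assumed ordering).
std::uint32_t pack4x8(const std::array<std::uint8_t, 4> &v) {
  return static_cast<std::uint32_t>(v[0]) |
         (static_cast<std::uint32_t>(v[1]) << 8) |
         (static_cast<std::uint32_t>(v[2]) << 16) |
         (static_cast<std::uint32_t>(v[3]) << 24);
}

// Hypothetical helper: the inverse of pack4x8.
std::array<std::uint8_t, 4> unpack4x8(std::uint32_t packed) {
  return {static_cast<std::uint8_t>(packed & 0xFFu),
          static_cast<std::uint8_t>((packed >> 8) & 0xFFu),
          static_cast<std::uint8_t>((packed >> 16) & 0xFFu),
          static_cast<std::uint8_t>((packed >> 24) & 0xFFu)};
}
```

The packed word is an ordinary unsigned 32-bit integer, which matches the note that the new types convert trivially to and from uints.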

WaveSize introduces a shader attribute that indicates what size the
shader depends on the wave being. If the runtime has a different wave
size, trying to create a pipeline with this size will fail.
2020-12-02 21:10:44 -08:00
Adam Yang 747ee519eb
Added new PDB interface (#3288) 2020-12-02 14:25:09 -08:00
Ehsan d7bb9ee170
Introduce the implicit 'vk' namespace (#3133)
* [spirv] Introduce the implicit 'vk' namespace

If any of these are used for DXIL code generation, the compiler will
report an error about the unknown "vk" namespace.

* [spirv] Introduce vk::ReadClock intrinsic.

The following Vulkan specific intrinsic functions are added:

```hlsl
uint64_t vk::ReadClock(in uint32 scope);
```

Also the following Vulkan-specific implicit constants are added:

```
vk::CrossDeviceScope; // defined as uint 0
vk::DeviceScope       // defined as uint 1
vk::WorkgroupScope    // defined as uint 2
vk::SubgroupScope     // defined as uint 3
vk::InvocationScope   // defined as uint 4
vk::QueueFamilyScope  // defined as uint 5
```

Sample usage looks as follows:

```hlsl
uint64_t clock = vk::ReadClock(vk::WorkgroupScope);
```

If any of these are used for DXIL code generation, the compiler will
report an error about the unknown "vk" namespace.

* [spirv] Add documentation.

* Address code review comments.

* Test: Validate vk namespace is not allowed for dxil.

* Fix usage of DXASSERT.

* Move ValidateVkNamespaceNotAllowed test to HlslFileCheck.
2020-12-01 14:04:14 -06:00
Marijn Suijten af14220b45
[linux-port] Support full IID comparison on GCC (#3062)
The fallback layer for GCC currently uses pointer comparison to match
IIDs as uuid tagging for types is not supported.
Thus, when an external program dlopens libdxcompiler and calls
with a proper REFIID these will not match internal pointers and result
in E_NOINTERFACE.

Note that the reverted patches before this commit alleviated the
situation somewhat by replacing the pointer with a hash of the interface
name. While this again grants a stable value to be used when called from
external binaries these (usually FFI wrappers) now need to understand
whether the library has been compiled with full IID support (as a binary
compiled with Clang has no such limitations), and then pass the hash
anywhere they'd otherwise pass an IID.

WIP: Not all IIDs are added yet, which requires the extra "empty-GUID"
check in IsEqualIID.
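
The difference between pointer comparison and full IID comparison can be sketched as below. `SketchGuid`, `sameIidPointer`, and `sameIidValue` are hypothetical stand-ins and not the WinAdapter definitions; the extra "empty-GUID" check mentioned above is not modeled.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical 128-bit interface ID; layout chosen to have no padding.
struct SketchGuid {
  std::uint32_t data1;
  std::uint16_t data2;
  std::uint16_t data3;
  std::uint8_t data4[8];
};
static_assert(sizeof(SketchGuid) == 16, "memcmp below relies on no padding");

// Pointer comparison: only matches when both sides refer to the same object,
// which fails for an IID supplied by an external binary.
bool sameIidPointer(const SketchGuid *a, const SketchGuid *b) {
  return a == b;
}

// Full comparison: matches whenever the 16 bytes are equal.
bool sameIidValue(const SketchGuid &a, const SketchGuid &b) {
  return std::memcmp(&a, &b, sizeof(SketchGuid)) == 0;
}
```

Two copies of the same IID living in different binaries compare unequal by pointer but equal by value, which is the `E_NOINTERFACE` situation the commit describes.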
2020-11-20 15:18:05 -08:00
Marijn Suijten 11e04d8d2b
Remove unnecessary whitespace in files touched by #3062 (#3272) 2020-11-19 14:22:42 -08:00
rkarrenberg 489c2e4d32
Move force-zero-store-lifetimes flag to help-hidden, remove accidental check from test. (#3264) 2020-11-17 04:56:52 -07:00
rkarrenberg eaa7f95d07
Enable generation of llvm.lifetime.start/.end intrinsics (#3034)
* Enable generation of llvm.lifetime.start/.end intrinsics.

- Remove HLSL change from CGDecl.cpp::EmitLifetimeStart() that disabled
  generation of lifetime markers in the front end.
- Enable generation of lifetime intrinsics when inlining functions
  (PassManagerBuilder.cpp).
- Both of these cover a different set of situations that can lead to
  inefficient code without lifetime intrinsics (see examples below):
  - Assume a struct is created inside a loop but some or all of its
    fields are only initialized conditionally before the struct is being
    used. If the alloca of that struct does not receive lifetime intrinsics
    before being lowered to SSA its definition will effectively be hoisted
    out of the loop, which changes the original semantics: Since the
    initialization is conditional, the correct SSA form for this code
    requires a phi node in the loop header that persists the value of the
    struct field throughout different iterations because the compiler
    doesn't know anymore that the field can't be initialized in a different
    iteration than when it is used.
  - If the lifetime of an alloca in a function is the entire function it
    doesn't need lifetime intrinsics. However, when inlining that function,
    the alloca's lifetime will then suddenly span the entire caller, causing
    similar issues as described above.
- For backwards compatibility, replace lifetime.start/.end intrinsics
  with a store of undef in DxilPreparePasses.cpp, or, for validator
  version < 1.6, with a store of 0 (undef store is disallowed). This is
  slightly inconvenient but achieves the same goal as the lifetime
  intrinsics. The zero initialization is actually the current manual
  workaround for developers that hit one of the above issues.
- Allow lifetime intrinsics to pass DXIL validation.
- Allow undef stores to pass DXIL validation.
- Allow bitcast to i8* to pass DXIL validation.
- Make various places in the code aware of lifetime intrinsics and their
  related bitcasts to i8*.
- Adjust ScalarReplAggregatesHLSL so it generates new intrinsics for
  each element once a structure is broken up. Also make sure that lifetime
  intrinsics are removed when replacing one pointer by another upon seeing
  a memcpy. This is required to prevent a pointer accidentally
  "inheriting" wrong lifetimes.
- Adjust PromoteMemoryToRegister to treat an existing lifetime.start
  intrinsic as a definition.
- Since lifetime intrinsics require a cleanup, the logic in
  CGStmt.cpp:EmitBreakStmt() had to be changed: EmitHLSLCondBreak() now
  returns the generated BranchInst. That branch is then passed into
  EmitBranchThroughCleanup(), which uses it instead of creating a new one.
  This way, the cleanup is generated correctly and the wave handling also
  still works as intended.
- Adjust a number of tests that now behave slightly differently.
  memcpy_preuser.hlsl was actually exhibiting exactly the situation
  explained above and relied on the struct definition of "oStruct" to be
  hoisted out to produce the desired IR. And entry_memcpy() in
  cbuf_memcpy_replace.hlsl required an explicit initialization: With
  lifetime intrinsics, the original code correctly collapsed to returning
  undef. Without lifetime intrinsics, the compiler could not prove this.
  With proper initialization, the test now has the intended effect, even
  though the collapsing to undef could be a desirable test for lifetime
  intrinsics.

Example 1:

Original code:
for( ;; ) {
  func();
  MyStruct s;
  if( c ) {
    s.x = ...;
    ... = s.x;
  }
  ... = s.x;
}

Without lifetime intrinsics, this is equivalent to:
MyStruct s;
for( ;; ) {
  func();
  if( c ) {
    s.x = ...;
    ... = s.x;
  }
  ... = s.x;
}

After SROA, we now have a value live across the function call, which will cause a spill:
for( ;; ) {
  x_p = phi( undef, x_p2 );
  func();
  if( c ) {
    x1 = ...;
    ... = x1;
  }
  x_p2 = phi( x_p, x1 );
  ... = x_p2;
}

Example 2:

void consume(in Data data);
void expensiveComputation();

bool produce(out Data data) {
    if (condition) {
        data = ...; // <-- conditional assignment of out-qualified parameter
        return true;
    }
    return false; // <-- out-qualified parameter left uninitialized
}
void foo(int N) {
    for (int i=0; i<N; ++i) {
        Data data;
        bool valid = produce(data); // <-- generates a phi to prior iteration's value when inlined. There should be none
        if (valid)
            consume(data);
        expensiveComputation(); // <-- said phi is alive here, inflating register pressure
    }
}

* Implement lifetime intrinsic execution test.

- Test SM 6.0, 6.3, and 6.5. The 6.5 test behaves exactly the same way
  as 6.3, it is meant as a placeholder for 6.6.
- Test validator versions 1.5 and 1.6.
- Abstract a few things in the ExecutionTest infrastructure to enable
  better code sharing, e.g. for lib compilation.

* Make memcpy replacement conservative by removing lifetimes of both src and dst. Add regression test for this case.

* Allow to force replacing lifetime intrinsics by zeroinitializer stores via compile option.

* Fix regression where lifetimes caused code that was not cleaned up properly.

- Add SROA and Jump Threading passes as early in the optimization
  pipeline as possible without interfering with lowering. These two are
  required to fully remove redundant code due to insertion of cleanup blocks
  for lifetimes. Previously, SROA ran much too late, and Jump Threading
  was disabled everywhere.
- A side effect of this is that we can now have unstructured control
  flow more often. This also breaks one test that was originally written
  when a part of SimplifyCFG that could also create unstructured control
  flow was disabled. That part is still disabled, but jump threading has
  the same effect. I don't know why unstructured control flow is a
  problem for the optimization pipeline.
- Add a regression test that requires the two phases to be cleaned up properly.
- Disable the simplifycfg test which now fails even though simplifycfg
  still does what it should.

* Disable lifetime intrinsics for SM < 6.6, add flag to enable explicitly.

- Add missing default value for unrelated option StructurizeLoopExitsForUnroll.
- Re-enable simplify cfg test disabled in a previous commit.
2020-11-16 15:48:05 -08:00
Marijn Suijten 2ade6f84d6
Linux: Implement prefix-counted BSTR allocation in SysAllocStringLen (#3250)
While this prefix may be an implementation detail the BSTR type
guarantees to be able to store NULL values without accidentally
truncating the string, hence its length must be available to replace the
usual NULL terminator.  This string length in bytes is stored in the
four bytes preceding a BSTR pointer [1], whose value can be retrieved
using SysStringLen.

This commit implements the type in the same way in WinAdapter such that
users of the functions returning a BSTR (currently only in dxcisense)
can expect and use the type in the same way on Linux as they would on
Windows.
While this function and type currently only seem to be involved in names
and logging where NULLs are unlikely, it is good practice to mirror the
Windows type exactly for future-proofing.

Besides, SysAllocStringLen does not specify anything about taking
ownership of strIn so the realloc here is wrong either way.

[1]: https://docs.microsoft.com/en-us/previous-versions/windows/desktop/automat/bstr
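
As a rough illustration of the prefix-counted layout described above, the following is a minimal sketch and not the WinAdapter code. The names `Char`, `AllocPrefixed`, `PrefixedLen`, and `FreePrefixed` are hypothetical, and a 16-bit character type is assumed. The input is copied rather than reallocated, so the caller keeps ownership of it.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical names for illustration; not the WinAdapter code.
using Char = char16_t;

// Copies `len` characters of `in` into a new buffer laid out as
// [4-byte byte count][characters][terminator] and returns a pointer
// to the first character. If `in` is null the characters are zeroed.
Char *AllocPrefixed(const Char *in, std::uint32_t len) {
  assert(len < UINT32_MAX / sizeof(Char)); // byte count must fit the prefix
  const std::uint32_t bytes = len * static_cast<std::uint32_t>(sizeof(Char));
  auto *raw = static_cast<unsigned char *>(
      std::malloc(sizeof(std::uint32_t) + bytes + sizeof(Char)));
  if (!raw)
    return nullptr;
  std::memcpy(raw, &bytes, sizeof(std::uint32_t));
  Char *str = reinterpret_cast<Char *>(raw + sizeof(std::uint32_t));
  if (in)
    std::memcpy(str, in, bytes);
  else
    std::memset(str, 0, bytes);
  str[len] = 0; // trailing terminator after the counted characters
  return str;
}

// Reads the character count back from the prefix; null yields 0.
std::uint32_t PrefixedLen(const Char *str) {
  if (!str)
    return 0;
  std::uint32_t bytes;
  std::memcpy(&bytes,
              reinterpret_cast<const unsigned char *>(str) -
                  sizeof(std::uint32_t),
              sizeof(std::uint32_t));
  return static_cast<std::uint32_t>(bytes / sizeof(Char));
}

// Frees the whole allocation, starting at the prefix.
void FreePrefixed(Char *str) {
  if (str)
    std::free(reinterpret_cast<unsigned char *>(str) - sizeof(std::uint32_t));
}
```

Because the length comes from the prefix, a string with an embedded NULL character reports its full length instead of stopping at the first zero.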
2020-11-12 11:01:41 -06:00
Adam Yang 3be547b67c
Fixing store undef later in the compilation. (#3212) 2020-10-20 21:15:49 -07:00
Adam Yang e6b313b6ab
remove-dead-blocks handles switch. Split resource array before finalize. (#3184) 2020-10-15 13:05:39 -07:00
Adam Yang d8dec0efd7
Fixed some cases where O0 fails compilation (#3205)
- Fixed value tracking for dxil intrinsics
- Fixed some selects holding on to invalid resource uses
- Fixed some cases where unused globals hold on to invalid resource uses
- Fixed some cases where stores of undefs stick around
2020-10-14 22:41:03 -07:00
Jeff Noyle b85f1affc6
impl (#3200)
Co-authored-by: Jeff Noyle <jeffno@ntdev.microsoft.com>
2020-10-13 17:09:09 -07:00
Helena Kotas c565b3d20b
Cleanup and minor changes to improve integration with internal projects (#3199)
* Cleanup and minor changes to improve integration with internal projects

Adds -Debug option to hcttest.cmd

* Case insensitive compare for -Debug and -Release flags in hctbuild and hcttest
2020-10-13 16:22:50 -07:00
Greg Roth 754e99b405
Revert "Optimize compile times by not skipping allocas (#3168)" (#3183)
This reverts commit 9459577e8f.
2020-10-05 16:23:12 -07:00
Greg Roth 9459577e8f
Optimize compile times by not skipping allocas (#3168)
Instead of skipping past allocas whenever inserting a new instruction,
which ate up a lot of compilation time, they are inserted at the default
insertion point.

The result is that allocas that would have coalesced just after the
global load and input loads are dispersed throughout the commands. So as
part of dxil finalization, the allocas are moved to the beginning of the
entry block of each function. This results in some minor changes to a
couple tests due to the allocas preceding the loads.
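
The finalization step described above can be pictured with a toy model. This is only a sketch under stated assumptions: `Inst` and `HoistAllocas` are hypothetical stand-ins, not LLVM or DXC types, and the entry block is modeled as a plain vector.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for an IR instruction; not an LLVM type.
struct Inst {
  std::string name;
  bool isAlloca;
};

// Moves every alloca to the front of the entry block while keeping the
// relative order of the allocas and of all other instructions.
void HoistAllocas(std::vector<Inst> &entryBlock) {
  auto isAlloca = [](const Inst &i) { return i.isAlloca; };
  std::stable_partition(entryBlock.begin(), entryBlock.end(), isAlloca);
  assert(std::is_partitioned(entryBlock.begin(), entryBlock.end(), isAlloca));
}
```

A stable partition is used so that the allocas end up ahead of the loads without otherwise reordering the block, which matches the kind of minor test differences the message mentions.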
2020-09-29 18:11:28 -07:00
Jaebaek Seo 7f985ff472
[spirv] generate OpenCL.DebugInfo.100 instructions (#3155)
OpenCL.DebugInfo.100
is a SPIR-V extended instruction set that provides DWARF style debug information.
This PR allows DXC SPIR-V backend to generate the rich HLSL debug information
using OpenCL.DebugInfo.100 instructions.
2020-09-22 16:50:29 -04:00
Tex Riddell c3fa785435
Remove assert.h include from DxilPipelineStateValidation.h (#3129)
Also:
- fixed size_t to uint32_t error when building with other tools.
- removed DxilPipelineStateValidation.h where unused.
2020-09-16 23:15:08 -07:00
Tex Riddell b3df129d1e
Refactor PSV code for improved maintainability and error handling (#3115) 2020-09-08 19:14:12 -07:00
Tex Riddell bba76d6619
Fix vs2019 build warnings for custom build items (#3114)
This should fix the warnings that have this pattern:

warning MSB8065: Custom build for item "...\tmpdxcetw.h.rule" succeeded,
but specified output "...\include\dxc\tracing\dxc\tracing\tmpdxcetw.h"
has not been created. This may cause incremental build to work incorrectly.
2020-09-04 16:21:53 -07:00
Adam Yang 5d6674ad0a
Hoist exits out of loops when unrolling to make code structured. (#3103) 2020-09-04 13:57:36 -07:00
David Peixotto 55b9194ccc
Modify the extension mechanism to handle custom lowering for resource… (#3081)
* Modify the extension mechanism to handle custom lowering for resource methods

The goal is to allow resource extension intrinsics to be handled the same way
as the core hlsl resource intrinsics.

Specifically, we want to support:

 1. Multiple hlsl overloads map to a single dxil intrinsic
 2. The hlsl overloads can take different parameters for a given resource type
 3. The hlsl overloads are not consistent across different resource types

To achieve these goals we need a more complex mechanism for describing how
to translate the high-level arguments to arguments for a dxil function.

This commit implements custom lowering by allowing the mapping from high-level
args to dxil args to be specified as extra information along with the
lowering strategy.

The custom lowering info describes this lowering using the following format.

* Add missing virtual destructors
2020-08-12 14:13:44 -07:00
Greg Roth efc14a6c40
Sundry code cleanups (#3073)
refactor usage of resourceparams to just use RP

create common way of extracting res properties

make getoverloadtype static

so it can be used to determine version
requirements

Disallow literals where not included

Previously, whether an intrinsic parameter type included an entry for
literals or not, a literal was permitted to be cast to it.

By moving up the explicit per-param check to before this cast takes
place, these implicit casts won't be added
2020-08-12 08:16:12 -07:00
Adam Yang c684cc0421
Added mechanism for extra file outputs (#3060) 2020-08-03 13:25:19 -07:00
Greg Roth cb0191d8db
Correct codegenoption assigning from semantic defines (#3061)
For structurize-returns, the codegenopts assigned were a copy of those
the compiler possesses. This is by design in clang as explicitly stated
in comments. As a result, when the compiler codegenopts are updated, the
codegenmodule's are not.

By adding the ability to query the optimization enables from the hlsl
extension helper, these options can be reflected.

Additionally, the structurize-returns flag would have required a define
with a '-' in it, which isn't possible. So now it converts between '_'
for defines and '-' for options, which is consistent with how other
flags are interpreted.

Adds a unit test similar to that of disable gvn that makes a simple
check to verify that the return structurization is enabled.
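
The '_' to '-' conversion mentioned above might look roughly like the sketch below. `DefineToOptionName` is a hypothetical helper, not a DXC function, and the lowercasing step is an assumption made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: maps a semantic-define suffix such as
// "STRUCTURIZE_RETURNS" to an option name such as "structurize-returns".
// Lowercasing is an assumption of this sketch.
std::string DefineToOptionName(std::string name) {
  std::transform(name.begin(), name.end(), name.begin(), [](unsigned char c) {
    return c == '_' ? '-' : static_cast<char>(std::tolower(c));
  });
  return name;
}
```

Keeping the mapping in one small function means a define and the matching command-line option resolve to the same setting name.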
2020-08-01 19:53:24 -06:00
Greg Roth 6971d25e1e
Restrict parameters to Interlocked* Intrinsics (#3057)
Because the defined types of Interlocked* Parameters do not restrict
type casting, they can result in late errors that cause crashes due to
reference params with casts that aren't actually respected by dxil
generation.

By constraining the parameters as they should be, instead of crashes, we
get errors.

Fixes #2077
Fixes #2483
2020-07-28 12:56:50 -07:00
Greg Roth 41e5ba4448
Modify structurize returns option to use opt-enable (#3052)
This conforms to the established -opt-enable -opt-disable flag standard
to enable this optimization as well as tying in with the semantic
defines mechanism.
2020-07-27 22:19:45 -07:00
Adam Yang 9db52a7f01
Added hidden option to print after all passes (#3051) 2020-07-23 11:09:54 -07:00
Greg Roth 92328d817a
Propagate CodeGen Options and Semantic Defines (#3040)
Adds the -opt-enable and -opt-select flags alongside the -opt-disable flag and
broadens the utility of all by storing the settings as strings instead
of booleans. -opt-select requires a parameter indicating the setting
name and another indicating the value.

Options are represented as lowercase until they are converted into
semantic defines in metadata. Presence of semantic define prefixes
is no longer assumed when trying to write semantic defines to
metadata.

Semantic defines with the user-specified prefix will be propagated to
the corresponding -opt-<enable|disable|select> CodeGen option. 
This allows either way of specifying these options to have the
same effect.

Adds two tests to verify that semantic defines will be
detectable as codegen flags. Another verifies that correct errors are
produced for contradictory flags.
2020-07-21 08:50:42 -07:00
Dan Ginsburg e93ab88f7d
Add -fvk-auto-shift-bindings that applies -fvk-*-shift bindings to re… (#2959)
* Add -fvk-auto-shift-bindings that applies -fvk-*-shift bindings to resources that do not have explicit register assignments.  Closes #2956
* For resources that don't have an explicit register binding, first categorize their registerType using new method DeclResultIdMapper::getImplicitRegisterType.
* When finding automatic binding indices for implicit register bindings, pass an optional shift to useNextBinding.
* Change getNextBindingChunk() to handle finding the next binding range given a bindingShift

* Fix regression with binding numbers causing AppVeyor failure and also fix crash I ran into in my own testing in getImplicitRegisterType() if the getSpirvInstr() did not have an AstResultType.

* Address review comments.

* Update documentation

Co-authored-by: Ehsan Nasiri <ehsannas@gmail.com>
2020-07-13 09:49:27 -05:00
Xiang Li 4211802b9a
Support -line-directive in dxr. (#3027)
* Support -line-directive in dxr.
2020-07-08 15:07:23 -07:00
Greg Roth a5843199c5
Enable warnings as errors for clang builds (#3012)
This fixes up the last of the clang errors that are unique to clang 10.
These take two forms. First, indentation that might indicate the mistaken
impression that a line is within an unbraced conditional block produces
a warning and is corrected in a few places. Second, a warning is
produced when a class has one of copy constructor or assignment operator
explicitly defined and the other left to implicit definition. The
solution is generally to add a definition with the = default assignment.

Finally, this adds the -Werror flag to all clang builds to keep the
build warnings clean and detect flaws that can affect all builds and
platforms.
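
The second kind of fix can be sketched as follows. `Counter` is a hypothetical class, not one from the codebase. It has a user-provided copy constructor, so the copy assignment operator is declared explicitly with `= default` instead of being left implicit.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class for illustration only.
class Counter {
public:
  explicit Counter(std::size_t n) : n_(n) {}

  // User-provided copy constructor.
  Counter(const Counter &other) : n_(other.n_) {}

  // Declared explicitly as defaulted so that both copy operations are
  // spelled out rather than one being implicitly generated.
  Counter &operator=(const Counter &) = default;

  std::size_t value() const { return n_; }

private:
  std::size_t n_;
};
```

Both copy operations behave as before; only the declaration changes, which is why this kind of fix is safe to apply mechanically.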
2020-07-02 08:28:23 -07:00
David Peixotto b27351da66
Preserve precise metadata when replacing instructions in transformations (#3010)
We were not correctly preserving precise metadata when replacing instructions
in various transformations. This caused precise metadata to be dropped
when a precise instruction was replaced with a non-precise instruction.

For example, given the two equivalent mul expressions

    float x = mul(a, b);
    precise float y = mul(a, b);
    float z = x + y;

GVN would replace `y` with `x` in the computation of `z`. This caused x to
be computed in non-precise mode.

The fix is to make sure we combine known dxil metadata when replacing instructions.
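
The idea of combining metadata on replacement can be shown with a toy model. This is a sketch only: `Inst` and `CombinePreciseOnReplace` are hypothetical names, and real dxil metadata is richer than a single flag.

```cpp
#include <cassert>

// Toy instruction carrying only the flag of interest; not an LLVM type.
struct Inst {
  bool precise;
};

// Called when `from` is about to be replaced by `to`: the surviving
// instruction keeps the union of the two precise markers.
void CombinePreciseOnReplace(Inst &to, const Inst &from) {
  to.precise = to.precise || from.precise;
}
```

In the example from the message, replacing `y` with `x` would then mark `x` as precise instead of silently dropping the requirement.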
2020-06-30 09:57:08 -07:00
Greg Roth 29759a8942
Fix remaining Clang warnings (#3008)
Many of these take the form of removing unused member variables and
unused functions. Where they were truly unused, I removed them. Where
they were only used in asserts, I hid them for the release builds. Where
the code in question was original to LLVM, I didn't remove so much as
comment out the offending code. In most cases these changes are
counterparts to already excluded code for HLSL.

A few cases concern indexing or type matches. Where relevant, I lifted
the same solutions in the current source of the llvm project.

I altered the fix made recently to LinkAllPasses.h. Instead of just
disabling the warning, I used the temporary variables that the current
llvm project uses to avoid having to cast null pointers.
2020-06-29 19:24:52 -07:00