This commit generally only adds those instructions that intersect with
OpenCL.DebugInfo.100, although it does also add generation of
new DebugFunctionDefinition to tie DebugFunction to function.
Co-authored-by: baldurk <baldurk@baldurk.org>
Enables the associated feature flags when 2021 is selected. Also adds a
feature flag for bitfields.
Add to all feature tests a variant that uses -HV 2021
Made some small changes to a template test that relied on vector
operands with binary logical operators
Allow calling DxcInitThreadMalloc again after calling DxcCleanupThreadMalloc
Without this change, the following sequence cannot be called:
    DxcInitThreadMalloc()
    DxcCleanupThreadMalloc()
    ...
    DxcInitThreadMalloc() // <- This will trigger an assert
The second call hits an assert that assumes the default malloc pointer is nullptr.
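To illustrate the failure mode, here is a minimal sketch of an init/cleanup pair that can be called repeatedly. All names (`SketchAllocator`, `SketchInitThreadMalloc`, and so on) are hypothetical stand-ins, not the dxcmem.cpp code; the only point is that cleanup must reset the pointer that init asserts on.

```cpp
#include <cassert>

// Hypothetical stand-in for the default allocator object.
struct SketchAllocator {};

static SketchAllocator g_SketchDefaultAllocator;
static SketchAllocator *g_pSketchDefaultMalloc = nullptr;

// Installs the default allocator. The assert mirrors the assumption
// described above: the pointer must be nullptr on entry.
inline void SketchInitThreadMalloc() {
  assert(g_pSketchDefaultMalloc == nullptr && "already initialized");
  g_pSketchDefaultMalloc = &g_SketchDefaultAllocator;
}

// Resets the pointer so that a later init call passes the assert again.
inline void SketchCleanupThreadMalloc() {
  g_pSketchDefaultMalloc = nullptr;
}

inline bool SketchIsThreadMallocInstalled() {
  return g_pSketchDefaultMalloc != nullptr;
}
```

If cleanup left the pointer untouched, the second init call in the sequence above would trip the assert.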
Loading a large object increases memory pressure, so reducing the load
size can improve performance. This is particularly useful for mobile
GPUs. `-fspv-reduce-load-size` removes OpLoad/OpCompositeExtract of
struct/array types by running the spirv-opt --reduce-load-size pass.
Fixes #3889
* Stop creating binary blobs with UTF-8 encoding
* Strip BOM and fix some related issues
- detect and strip BOM only when interpreting as text:
- DxcCreateBlob: only when encoding is known and not CP_ACP
- DxcGetBlobAsUtf[8|16]: when encoding not known or CP_ACP
- strncmp in BOM detection would misfire (should have been memcmp)
- ambiguous BOM case not handled/noted
- empty blob case fixes (handle null-term empty, don't leak)
Note, there are still problems in the unicode handling:
- No actual support for UTF-32 or big endian input encodings.
- For Linux: "UTF-16" isn't, and bad assumptions around "wide" strings.
* Add defines for CP_UTF*; improve error for unsupported utf conversions
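The strncmp problem noted above can be sketched as follows. strncmp stops once both inputs reach a zero byte, so the bytes FF FE 00 41 (a UTF-16 LE BOM followed by another code unit) would compare equal to the UTF-32 LE BOM FF FE 00 00. memcmp compares every byte. The enum and function names are illustrative assumptions, not the DXC API, and the way the ambiguous case is resolved here is just one possible choice.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Illustrative result type; not part of the DXC interface.
enum class SketchBom { None, Utf8, Utf16LE, Utf16BE, Utf32LE, Utf32BE };

// True if the buffer starts with the given bytes. memcmp compares all n
// bytes, including embedded 0x00 bytes, where strncmp would stop early.
static bool SketchStartsWith(const unsigned char *data, std::size_t size,
                             const unsigned char *bom, std::size_t n) {
  return size >= n && std::memcmp(data, bom, n) == 0;
}

inline SketchBom SketchDetectBom(const unsigned char *data, std::size_t size) {
  static const unsigned char kUtf8[] = {0xEF, 0xBB, 0xBF};
  static const unsigned char kUtf32LE[] = {0xFF, 0xFE, 0x00, 0x00};
  static const unsigned char kUtf32BE[] = {0x00, 0x00, 0xFE, 0xFF};
  static const unsigned char kUtf16LE[] = {0xFF, 0xFE};
  static const unsigned char kUtf16BE[] = {0xFE, 0xFF};

  // 4-byte BOMs are checked first. FF FE 00 00 is ambiguous: it is either a
  // UTF-32 LE BOM or a UTF-16 LE BOM followed by U+0000. This sketch picks
  // UTF-32 LE; that choice is an assumption.
  if (SketchStartsWith(data, size, kUtf32LE, 4)) return SketchBom::Utf32LE;
  if (SketchStartsWith(data, size, kUtf32BE, 4)) return SketchBom::Utf32BE;
  if (SketchStartsWith(data, size, kUtf8, 3)) return SketchBom::Utf8;
  if (SketchStartsWith(data, size, kUtf16LE, 2)) return SketchBom::Utf16LE;
  if (SketchStartsWith(data, size, kUtf16BE, 2)) return SketchBom::Utf16BE;
  return SketchBom::None;
}
```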
The test to determine if an -opt-disable or -opt-enable flag was
contradictory did not take into account the possibility that the
previous inclusion was also an enable, which wouldn't be a
contradiction.
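A minimal sketch of the corrected check, assuming the flags are tracked in a simple name-to-state map (the `SketchOptToggles` type is hypothetical and not the HLSLOptions code): a repeated flag is only a contradiction when the earlier entry holds the opposite state.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical container for -opt-enable / -opt-disable settings.
struct SketchOptToggles {
  std::map<std::string, bool> toggles;

  // Records -opt-enable (enable == true) or -opt-disable (enable == false)
  // for `name`. Returns false only when an earlier flag set the opposite
  // state; repeating the same state is accepted.
  bool Add(const std::string &name, bool enable) {
    auto it = toggles.find(name);
    if (it != toggles.end())
      return it->second == enable;
    toggles[name] = enable;
    return true;
  }
};
```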
Original LLVM changes included in this change:
* DxcSupport/dxcmem: Fix clang dtor-name warning
../lib/DxcSupport/dxcmem.cpp:46:55: warning: ISO C++ requires the name after '::~' to be found in the same scope as the name before '::~' [-Wdtor-name]
g_ThreadMallocTls->llvm::sys::ThreadLocal<IMalloc>::~ThreadLocal();
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
::ThreadLocal
* Fix a few g++ 8 warnings about non-obvious object copy operations
Reviewers: dblaikie, dexonsmith
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D50296
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339367 91177308-0d34-0410-b5e6-96231b3b80d8
Picked from: 425788d8b5.patch
* Disable GCC 9's -Winit-list-lifetime warning in ArrayRef
Summary:
This is a new warning which fires when one stores a reference to the
initializer_list contents in a way which may outlive the
initializer_list which it came from. In llvm this warning is triggered
whenever someone uses the initializer_list ArrayRef constructor.
This is indeed a dangerous thing to do (I myself was bitten by that at
least once), but it is not more dangerous than calling other ArrayRef
constructors with temporary objects -- something which we are used to
and have accepted as a tradeoff for ArrayRef's efficiency.
Currently, this warning generates so much output that it completely
obscures any actionable warnings, so this patch disables it.
Reviewers: rnk, aaron.ballman
Subscribers: mgorny, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70122
Picked from: 6c2151bf4c.patch
Co-authored-by: David Carlier <devnexen@gmail.com>
Co-authored-by: Pavel Labath <pavel@labath.sk>
The most common causes of internal compiler errors are access violations
or stack overflows. This registers an exception handler in dxc.exe for
these cases that are otherwise unhandled. It prints a simple message
for these errors and passes the exception along.
In case this is unwanted for some reason, a hidden disabling flag is
added as well.
Adds LLVM builtin exceptions for assert, fatal, and unreachable. Adds a
default message for exceptions not explicitly addressed.
Alters behavior of llvm_unreachable so it always raises an exception
regardless of compiler support for unreachable hints.
Reports errors using fputs instead of std::cerr to ensure that no
allocation is necessary. Custom output is performed in a static array
that is output with fputs.
This extension adds qualifiers for payload structures accompanied with semantic checks and code generation. This feature is opt-in for SM 6.6 libraries. The information added by the developer is stored in the DXIL type system and a new metadata node is emitted during code generation. The metadata is not necessary for correct translation of DXIL, so it may be safely ignored, but it provides hints to unlock potential optimizations in payload storage between DXR shader stages.
Template support for HLSL
- can be enabled using command line option -enable-templates
- supports both function templates and aggregate templates
- support is more or less equivalent to that in C++98
- use of template features and syntax introduced by C++11 or later may
not be handled gracefully
Co-authored-by: Tim Corringham <tcorring@amd.com>
In DX12, `SV_InstanceID` always counts from 0. On the other hand,
Vulkan `gl_InstanceIndex` starts from the first instance. Thus it
doesn't emulate actual DX12 shader behavior. To make it equivalent,
we can set `SV_InstanceID = gl_InstanceIndex - gl_BaseInstance`.
However, it can break the existing working shaders (i.e., compatibility
issue). We want to provide an option for users to choose what they
exactly want by `SV_InstanceID`. This commit adds
`-fvk-support-nonzero-base-instance` option. If it is enabled, it emits
`SV_InstanceID = gl_InstanceIndex - gl_BaseInstance`. Otherwise,
`SV_InstanceID = gl_InstanceIndex`.
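The two behaviors reduce to one line of arithmetic. The sketch below models them on the CPU; the function name and the boolean parameter are illustrative assumptions standing in for the `-fvk-support-nonzero-base-instance` option.

```cpp
#include <cassert>
#include <cstdint>

// Models the value a shader would see for SV_InstanceID.
// supportNonzeroBaseInstance stands in for the command-line option.
inline std::int32_t SketchInstanceID(std::int32_t instanceIndex,
                                     std::int32_t baseInstance,
                                     bool supportNonzeroBaseInstance) {
  return supportNonzeroBaseInstance ? instanceIndex - baseInstance
                                    : instanceIndex;
}
```

For a draw with base instance 5, the third instance has `gl_InstanceIndex` 7: with the option enabled the shader sees 2, without it the shader sees 7.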
The fallback layer for GCC currently uses pointer comparison to match
IIDs as uuid tagging for types is not supported.
Thus, when an external program dlopens libdxcompiler and calls
with a proper REFIID these will not match internal pointers and result
in E_NOINTERFACE.
Note that the reverted patches before this commit alleviated the
situation somewhat by replacing the pointer with a hash of the interface
name. While this again grants a stable value to be used when called from
external binaries these (usually FFI wrappers) now need to understand
whether the library has been compiled with full IID support (as a binary
compiled with Clang has no such limitations), and then pass the hash
anywhere they'd otherwise pass an IID.
WIP: Not all IIDs are added yet, which requires the extra "empty-GUID"
check in IsEqualIID.
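The difference between pointer and value comparison can be sketched as below. `SketchGuid` and both comparison functions are hypothetical, the GUID value in the usage note is made up, and the "empty-GUID" check mentioned above is not modeled.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical 16-byte GUID layout (4 + 2 + 2 + 8 bytes, no padding).
struct SketchGuid {
  std::uint32_t Data1;
  std::uint16_t Data2;
  std::uint16_t Data3;
  std::uint8_t Data4[8];
};

// Pointer comparison: only matches when both sides refer to the very same
// object, which an external caller's copy never does.
inline bool SketchIsEqualIIDByPointer(const SketchGuid &a,
                                      const SketchGuid &b) {
  return &a == &b;
}

// Value comparison: matches any copy that carries the same 16 bytes.
inline bool SketchIsEqualIIDByValue(const SketchGuid &a, const SketchGuid &b) {
  return std::memcmp(&a, &b, sizeof(SketchGuid)) == 0;
}
```

Two separate `SketchGuid` objects holding the same bytes, for example one inside the library and one in an FFI wrapper, compare equal by value but not by pointer.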
* Enable generation of llvm.lifetime.start/.end intrinsics.
- Remove HLSL change from CGDecl.cpp::EmitLifetimeStart() that disabled
generation of lifetime markers in the front end.
- Enable generation of lifetime intrinsics when inlining functions
(PassManagerBuilder.cpp).
- Both of these cover a different set of situations that can lead to
inefficient code without lifetime intrinsics (see examples below):
- Assume a struct is created inside a loop but some or all of its
fields are only initialized conditionally before the struct is being
used. If the alloca of that struct does not receive lifetime intrinsics
before being lowered to SSA its definition will effectively be hoisted
out of the loop, which changes the original semantics: Since the
initialization is conditional, the correct SSA form for this code
requires a phi node in the loop header that persists the value of the
struct field throughout different iterations because the compiler
doesn't know anymore that the field can't be initialized in a different
iteration than when it is used.
- If the lifetime of an alloca in a function is the entire function it
doesn't need lifetime intrinsics. However, when inlining that function,
the alloca's lifetime will then suddenly span the entire caller, causing
similar issues as described above.
- For backwards compatibility, replace lifetime.start/.end intrinsics
with a store of undef in DxilPreparePasses.cpp, or, for validator
version < 1.6, with a store of 0 (undef store is disallowed). This is
slightly inconvenient but achieves the same goal as the lifetime
intrinsics. The zero initialization is actually the current manual
workaround for developers that hit one of the above issues.
- Allow lifetime intrinsics to pass DXIL validation.
- Allow undef stores to pass DXIL validation.
- Allow bitcast to i8* to pass DXIL validation.
- Make various places in the code aware of lifetime intrinsics and their
related bitcasts to i8*.
- Adjust ScalarReplAggregatesHLSL so it generates new intrinsics for
each element once a structure is broken up. Also make sure that lifetime
intrinsics are removed when replacing one pointer by another upon seeing
a memcpy. This is required to prevent a pointer accidentally
"inheriting" wrong lifetimes.
- Adjust PromoteMemoryToRegister to treat an existing lifetime.start
intrinsic as a definition.
- Since lifetime intrinsics require a cleanup, the logic in
CGStmt.cpp:EmitBreakStmt() had to be changed: EmitHLSLCondBreak() now
returns the generated BranchInst. That branch is then passed into
EmitBranchThroughCleanup(), which uses it instead of creating a new one.
This way, the cleanup is generated correctly and the wave handling also
still works as intended.
- Adjust a number of tests that now behave slightly differently.
memcpy_preuser.hlsl was actually exhibiting exactly the situation
explained above and relied on the struct definition of "oStruct" to be
hoisted out to produce the desired IR. And entry_memcpy() in
cbuf_memcpy_replace.hlsl required an explicit initialization: With
lifetime intrinsics, the original code correctly collapsed to returning
undef. Without lifetime intrinsics, the compiler could not prove this.
With proper initialization, the test now has the intended effect, even
though the collapsing to undef could be a desirable test for lifetime
intrinsics.
Example 1:
Original code:
    for( ;; ) {
        func();
        MyStruct s;
        if( c ) {
            s.x = ...;
            ... = s.x;
        }
        ... = s.x;
    }
Without lifetime intrinsics, this is equivalent to:
    MyStruct s;
    for( ;; ) {
        func();
        if( c ) {
            s.x = ...;
            ... = s.x;
        }
        ... = s.x;
    }
After SROA, we now have a value live across the function call, which will cause a spill:
    for( ;; ) {
        x_p = phi( undef, x_p2 );
        func();
        if( c ) {
            x1 = ...;
            ... = x1;
        }
        x_p2 = phi( x_p, x1 );
        ... = x_p2;
    }
Example 2:
    void consume(in Data data);
    void expensiveComputation();
    bool produce(out Data data) {
        if (condition) {
            data = ...; // <-- conditional assignment of out-qualified parameter
            return true;
        }
        return false; // <-- out-qualified parameter left uninitialized
    }
    void foo(int N) {
        for (int i=0; i<N; ++i) {
            Data data;
            bool valid = produce(data); // <-- generates a phi to prior iteration's value when inlined. There should be none
            if (valid)
                consume(data);
            expensiveComputation(); // <-- said phi is alive here, inflating register pressure
        }
    }
* Implement lifetime intrinsic execution test.
- Test SM 6.0, 6.3, and 6.5. The 6.5 test behaves exactly the same way
as 6.3; it is meant as a placeholder for 6.6.
- Test validator versions 1.5 and 1.6.
- Abstract a few things in the ExecutionTest infrastructure to enable
better code sharing, e.g. for lib compilation.
* Make memcpy replacement conservative by removing lifetimes of both src and dst. Add regression test for this case.
* Allow forcing the replacement of lifetime intrinsics by zeroinitializer stores via a compile option.
* Fix regression where lifetimes caused code that was not cleaned up properly.
- Add SROA and Jump Threading passes as early in the optimization
pipeline as possible without interfering with lowering. These two are
required to fully remove redundant code due to insertion of cleanup blocks
for lifetimes. Previously, SROA ran much too late, and Jump Threading
was disabled everywhere.
- A side effect of this is that we can now have unstructured control
flow more often. This also breaks one test that was originally written
when a part of SimplifyCFG that could also create unstructured control
flow was disabled. That part is still disabled, but jump threading has
the same effect. I don't know why unstructured control flow is a
problem for the optimization pipeline.
- Add a regression test that requires the two phases to be cleaned up properly.
- Disable the simplifycfg test which now fails even though simplifycfg
still does what it should.
* Disable lifetime intrinsics for SM < 6.6, add flag to enable explicitly.
- Add missing default value for unrelated option StructurizeLoopExitsForUnroll.
- Re-enable simplify cfg test disabled in a previous commit.
While this prefix may be an implementation detail the BSTR type
guarantees to be able to store NULL values without accidentally
truncating the string, hence its length must be available to replace the
usual NULL terminator. This string length in bytes is stored in the
four bytes preceding a BSTR pointer [1], whose value can be retrieved
using SysStringLen.
This commit implements the type in the same way in WinAdapter such that
users of the functions returning a BSTR (currently only in dxcisense)
can expect and use the type in the same way on Linux as they would on
Windows.
While this function and type currently only seem to be involved in names
and logging where NULLs are unlikely, it is good practice to mirror the
Windows type exactly for future-proofing.
Besides, SysAllocStringLen does not specify anything about taking
ownership of strIn so the realloc here is wrong either way.
[1]: https://docs.microsoft.com/en-us/previous-versions/windows/desktop/automat/bstr
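The layout described above can be sketched as follows. The `Sketch*` names are hypothetical and this is not the WinAdapter implementation. It assumes a 4-byte prefix holding the byte length, followed by the characters and a terminating null. SysStringLen reports the length in characters, so the sketch divides the stored byte count by the character size. The input is copied rather than reallocated, so the caller keeps ownership of it.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Allocates [4-byte length in bytes][len characters][L'\0'] and returns a
// pointer to the first character. The source is copied, never taken over.
inline wchar_t *SketchAllocStringLen(const wchar_t *src, std::uint32_t len) {
  const std::uint32_t byteLen =
      static_cast<std::uint32_t>(len * sizeof(wchar_t));
  char *block = static_cast<char *>(
      std::malloc(sizeof(std::uint32_t) + byteLen + sizeof(wchar_t)));
  if (!block)
    return nullptr;
  std::memcpy(block, &byteLen, sizeof(byteLen));
  wchar_t *str = reinterpret_cast<wchar_t *>(block + sizeof(std::uint32_t));
  if (src)
    std::memcpy(str, src, byteLen); // embedded nulls are copied as well
  else
    std::memset(str, 0, byteLen);   // zero-fill is a choice of this sketch
  str[len] = L'\0';
  return str;
}

// Reads the prefix stored in the four bytes before the string pointer and
// converts it to a character count.
inline std::uint32_t SketchStringLen(const wchar_t *str) {
  if (!str)
    return 0;
  std::uint32_t byteLen = 0;
  std::memcpy(&byteLen,
              reinterpret_cast<const char *>(str) - sizeof(std::uint32_t),
              sizeof(byteLen));
  return static_cast<std::uint32_t>(byteLen / sizeof(wchar_t));
}

// Frees the whole block, starting at the length prefix.
inline void SketchFreeString(wchar_t *str) {
  if (str)
    std::free(reinterpret_cast<char *>(str) - sizeof(std::uint32_t));
}
```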
When dxilconv initialization fails in InitMaybeFail (for example under
memory pressure) the malloc references are null yet DxcClearThreadMalloc
is called expecting non-null references.
Adds nullptr checks to DxcClearThreadMalloc before erasing and releasing
the allocators.
OpenCL.DebugInfo.100 is a SPIR-V extended instruction set that provides
DWARF-style debug information.
This PR allows DXC SPIR-V backend to generate the rich HLSL debug information
using OpenCL.DebugInfo.100 instructions.
This conforms to the established -opt-enable -opt-disable flag standard
to enable this optimization as well as tying in with the semantic
defines mechanism.
Adds the -opt-enable and -opt-select flags alongside the -opt-disable
flag and broadens the utility of all three by storing the settings as
strings instead of booleans. -opt-select requires a parameter indicating
the setting name and another indicating the value.
Options are represented as lowercase until they are converted into
semantic defines in metadata. The presence of semantic define prefixes
is no longer assumed when trying to write semantic defines to
metadata.
Semantic defines with the user-specified prefix will be propagated to
the corresponding -opt-<enable|disable|select> CodeGen option.
This allows either way of specifying these options to have the
same effect.
Adds two tests to verify that semantic defines will be
detectable as codegen flags. Another verifies that correct errors are
produced for contradictory flags.
* Add -fvk-auto-shift-bindings that applies -fvk-*-shift bindings to resources that do not have explicit register assignments. Closes #2956
* For resources that don't have an explicit register binding, first categorize their registerType using new method DeclResultIdMapper::getImplicitRegisterType.
* When finding automatic binding indices for implicit register bindings, pass an optional shift to useNextBinding.
* Change getNextBindingChunk() to handle finding the next binding range given a bindingShift
* Fix regression with binding numbers causing AppVeyor failure and also fix crash I ran into in my own testing in getImplicitRegisterType() if the getSpirvInstr() did not have an AstResultType.
* Address review comments.
* Update documentation
Co-authored-by: Ehsan Nasiri <ehsannas@gmail.com>
Many of these take the form of removing unused member variables and
unused functions. Where they were truly unused, I removed them. Where
they were only used in asserts, I hid them for the release builds. Where
the code in question was original to LLVM, I didn't remove so much as
comment out the offending code. In most cases these changes are
counterparts to already excluded code for HLSL.
A few cases concern indexing or type matches. Where relevant, I lifted
the same solutions in the current source of the llvm project.
I altered the fix made recently to LinkAllPasses.h. Instead of just
disabling the warning, I used the temporary variables that the current
llvm project uses to avoid having to cast null pointers.
It can be helpful to disable an optimization pass or set of passes in
select circumstances. This adds the ability to disable the gvn pass
only, but introduces a way to disable various passes just by accepting
the chosen string to represent them in HLSLOptions and using the
DisablePasses values to determine when a pass needs to be left out in
PassManagerBuilder.
- Added rewriter functions for extracting uniforms into global scope
- Will not work if name collides (namespaces have bugs)
- Values under cbuffer _Params, resources outside (more bugs)
- Added rewriter HLSLOptions and support all through RewriteWithOptions
- Refactored dxr.exe to use HLSLOptions and RewriteWithOptions
- Exposed Write*VersionInfo functions from dxclib
- Fixed issue with External[Lib/Fn] for version printing.
* Increase scan limit for DSE, add option
Due to the large number of fields in a struct passed to an exported
function, the number of useless loads and stores exceeded the builtin
limit, which left a lot of them in. This increases the default limit
from 100 to 500 and adds a hidden parameter -memdep-block-scan-limit to
set it to whatever is needed for future workarounds.
The Dead Store Elimination pass examines stores to see if they are
unneeded. If it finds no uses between the store and the original load,
it eliminates both, but if it has to exceed the instruction limit to get
there, it gives up and leaves it in just in case.
* HLSL test infrastructure and other refactoring
Refactor common test infrastructure code into HLSLTestLib
Enable invocation of fxc and other executables via // RUN: commands in test files
Add latest d3dx12.h to include/dxc/Support and remove two other outdated copies
Improve DXIL container header validation on load
New helper classes DxilContainerReader and FixedSizeMemoryStream
Move LoadSubobjectsFromRDAT to DxilSubobjects.cpp
Co-authored-by: Greg Roth <grroth@microsoft.com>
* Add pragma control for diagnostics and option printing
* Test converting warning to error with #pragma dxc diagnostic error "..."
* Support options: -f[no-]diagnostics-show-option and -W[no-]<warning>
* Integrate changes from internal.
- dxcapi v2
- new dxc options
- DxilValueCache
- PDB and NoOpt improvements
- noop / llvm::donothing() support
* Update dxrfallbacklayer for dxcapi internal changes
* Reorder diag block based on whether pDiag is set first.
* llvm::donothing() requires dxil 1.6 / SM 6.6 for now, lib as well.
* Fixes for spir-v, non-VC compiler and non-Windows builds
- DEFINE_CROSS_PLATFORM_UUIDOF for new interfaces
- add SAL annotations
- turn output argument validation for -P into warning
- handle warnings without concatenating them to main output
- update spirv preprocessing and compilation paths
- return E_NOTIMPL from IDxcUtils::CreateReflection
- cleanup: DxcContainerBuilder back to utf8, DxcTestUtils: remove comment
* Fix some warnings from clang/gcc.
* Fix unicode conversion problems on linux, where sizeof(wchar_t) == 4
Note this is an intermediate fix.
On linux, what we are calling utf16 is actually a wide string
that's probably utf32. This change fixes issues introduced by
the new interface changes so things are consistent and pass tests.
A future fix should correct the encodings so they are correctly labeled
on platforms where wchar_t doesn't mean UTF16.
* Return false for IsBufferNullTerminated when CP_ACP.
One test for Disassembler was crashing because it created a pinned blob
with a size of 1 << 31 + 1 without actual memory backing this. The
IsBufferNullTerminated would attempt to see if this was null terminated,
causing AV.
This change also removes CP_UTF8 from this test when it was creating
binary blobs, not UTF8 text blobs.
Two test options, -Qstrip_reflect_from_dxil and -Qkeep_reflect_in_dxil
for making tests work with reflection removed, since many tests are relying
on main module disassembly-reassembly between test phases and reflection
metadata will no longer be present there. The strip option is for the
few cases where tests don't want the reflection kept in DXIL by default.
Validator no longer requires function annotations for no reason.
Fix places where remove global hook was not being called when functions
were removed manually from the list.
StripReflection now deletes function annotations, unless targeting lib or
old validator that required them. Preserve global constructor list and
add annotation for 1.4 validator. The global hook fixes were required
here, otherwise annotations would refer to dead functions during linking.
Struct annotations may not be removed in library case when they still need
translation to legacy types.
Allow missing struct annotation when not necessary to upgrade the layout.
Preserve usage in reflection by upgrading the module, emitting metadata,
cloning for reflection, then restoring validator version and re-emit
metadata.
Fix size for 16-bit type for usage and reflected size.
Make various batch reflection tests require validator 1.5, since these
tests rely on module disassembly->assembly, which will not preserve extra
usage metadata for reflection in 1.4.
Include reflection part in IDxcAssembler, but don't strip from module,
since there are no options to prevent this from breaking a lot of tests.
Don't strip reflection from offline lib target.
- default should be UINT_MAX, not zero, since zero is used
- compare validator version against target profile minimum for error
- add back implicit -Vd for lib_6_1/2 in CompileWithDebug since
target profile is supplied outside options, so option validation
would not know to fail without -Vd.
- add tests for various cases
* [spirv] Add option to flatten array of resources.
Current SPIR-V code generation uses 1 binding number for an array of
resources (e.g. an array of textures). However, the DX side uses one
binding number per array element. The newly added
'-fspv-flatten-resource-arrays' changes the SPIR-V backend behavior to
use one binding number per array element, and uses spirv-opt to flatten
the array.
TODO: Add a test where the array is passed around.
TODO: Test this works with steven's PR and proper results are produced.
* [spirv] Update tests to include array of samplers.
* [spirv] Take early exit condition out of the loop.
* [spirv] Add documentation for the new cmd option.
* [spirv] Invoke CreateDescriptorScalarReplacementPass when needed.
* [spirv] address code review comments.
- Put separate reflection in STAT part for now.
- Separate reflection is the module with deleted function bodies.
- Use new -Qstrip_reflect_from_dxil to drive stripping of reflection
metadata from DXIL part, since now -Qstrip_reflect means strip
the STAT reflection part, or don't include it in the first place.
- Update disassembler to use STAT part if available for reflecting
resource bindings, buffer descriptions, and ViewID state.
- Put some Qstrip_* flags under DriverOption as well as CoreOption.
* Hide Unix symbols by default
Using the -fvisibility=hidden by default only exposes symbols that have
the __attribute__((visibility("default"))) attribute applied to them.
There are only a couple of functions that are meant to be exposed from
libdxcompiler. To expose the proper functions, the macros used to create
the UUIDs for class types are redefined with the attribute and
DXC_API_IMPORT is defined with the attribute above for non-windows
platforms.
Since we're hiding everything now, the portions that were explicitly
hidden to make MacOS work are removed.
This exposed a number of missing dependencies of libraries and unit
tests. Namely clang-hlsl-tests, clang-spirv-tests, and libclangCodeGen.
Resolves #2203
* Remove explicit marking of DxcThreadMalloc hidden
This was a workaround that is no longer necessary since all symbols are
hidden by default on Unix platforms now.
* Moves the implementation of DxcThreadMalloc to dxcmem.cpp; removes DxcSwapThreadMalloc
* Remove DxcSetThreadMalloc(OrDefault) -- they were only ever used for installing the default allocator
* Deletes copy and move ctors and assignment
* stores the allocator in the TLS slot.
* DxcSwapThreadMalloc should be able to install a null allocator
* Marks the DxcThreadMalloc members as hidden (linux only)
- New -Qembed_debug is required to embed PDB in shader container
- -Zi used without -Qembed_debug will not embed debug info anymore,
and will issue a warning from CompileWithDebug().
- When compiling with Compile() and -Zi, -Qembed_debug is assumed
for compatibility reasons (lots of breaks without it)
- In dxc and CompileWithDebug() -Fd implies -Qstrip_debug
- Debug name is based on -Fd, unless path ends with '\', meaning you
want auto-naming and file written under the specified directory
- Debug name always embedded when debug info used, or -Fd used
- -Fd without -Zi just embeds debug name for CompileWithDebug(),
still error with dxc, since it can't write to your file.
- If not embedding debug info, it doesn't get written to the container,
only to be stripped out again.
- Fix padding for alignment in DebugName part.
- Default to DebugNameForBinary instead of DebugNameForSource if no
DebugInfo enabled
- Also fixed missing dependency on table gen options from libclang
DxcCreateBlobOnHeapCopy allocated using the COM allocator and freed using the default thread allocator, leading to crashes.
But more generally, some DxcCreateBlob functions would take ownership of the passed-in buffer without also having the caller explicitly specify what IMalloc to use for deallocation, which is an error-prone pattern, so I reworked these methods a bit.
This also led me to find some memory leaks and general memory mismanagement in the rewriter.
Added front-end `-flegacy-resource-reservation` to control behavior (added automatically for SM <= 5.0)
Added an error for unbounded resources used with `-flegacy-resource-reservation`
Added a mechanism for `DxilModule` to preserve intermediate options during its passes but not in the final DXIL
Changed `HLModule` to not remove unreferenced `DxilResources` but rather make their symbol UndefValue
Fixed assumptions of `DxilResources` having a valid `GlobalVariable`
Added reserved resource range gather pass before elimination of unused resources (if SM <= 5.0)
Changed resource allocation to account for reserved ranges
Some HLSL features do not have a direct mapping to SPIR-V; instead
we emulate them using many SPIR-V instructions. Because they
can cause a performance hit, the compiler warns by default. This
commit adds an option, -Wno-vk-emulated-features, to let developers
turn off such warnings.
Added a new command-line option: -fspv-debug=<category>, where
category can be file, source, line, and tool, to give developers
fine-grained control of what debug information they want in
the generated SPIR-V code.
Conversions on Apple didn't work. That's because the locales are
formatted a bit differently. This alters the codepage to locale
conversion in WinAdapter on Apple to conform to their practices.
format: -fvk-bind-register <type-number> <space> <binding> <set>
Also created a short alias for it: -vkbr.
This option gives the ultimate manual control of descriptor
assignment. It requires:
* All resources are annotated with :register() in the source code
* -fvk-bind-register is specified for every resource
It overrules all other mechanisms.
It cannot be used together with -fvk-{u|b|s|t}-shift.
Windows char conversion functions differ from the std methods in
a few ways: Windows ignores null terminators when lengths are
provided, and length limits represent character lengths, not byte sizes.
In addition, the current Linux implementation ignores codepages, but the
files used in many HLSL tests are in ISO 8859-1, not UTF-8. So some
differentiation is needed.
This solves all but one of the above problems. To limit conversion
by source characters instead of solely the size of the output
buffer, where necessary, a temporary buffer is created to hold the
source string that corresponds to the provided character length and
a null terminator is appended to make the std:: conversion stop there.
If there is a null terminator there already, this is skipped for
performance purposes. Such is often the case.
To respect the codepage parameter, it is converted to the corresponding
locale string and setlocale is used to temporarily set the locale
accordingly and set it back afterwards.
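The set-and-restore step can be sketched with a small RAII guard. The class name is hypothetical, and the mapping from a Windows codepage to a locale string is deliberately left out since it is platform specific. The saved locale name is copied because later setlocale calls may overwrite the returned buffer.

```cpp
#include <cassert>
#include <clocale>
#include <cstring>
#include <string>

// Temporarily switches one locale category and restores it on scope exit.
class SketchScopedLocale {
public:
  SketchScopedLocale(int category, const char *locale) : category_(category) {
    const char *current = std::setlocale(category, nullptr);
    if (current)
      saved_ = current; // copy: the returned buffer may be reused later
    ok_ = std::setlocale(category, locale) != nullptr;
  }
  ~SketchScopedLocale() {
    if (!saved_.empty())
      std::setlocale(category_, saved_.c_str());
  }
  SketchScopedLocale(const SketchScopedLocale &) = delete;
  SketchScopedLocale &operator=(const SketchScopedLocale &) = delete;

  // False if the requested locale is not available on this system.
  bool ok() const { return ok_; }

private:
  int category_;
  std::string saved_;
  bool ok_ = false;
};
```

A conversion routine would create the guard, check `ok()`, run the std:: conversion, and let the destructor restore the previous locale. Note that setlocale affects the whole process, so this pattern is not thread-safe on its own.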
The aspect that is missed is that the Windows conversion functions
will happily blow past a null terminating character if the length
specified encompasses it. This is rare enough in general, and unlikely
in the way these functions are used in the source base.
Corrects one errant non-breaking space character that found its way
into an HLSL file that the std:: conversion choked on. It didn't seem
to serve any purpose anyway.
Recent changes to FileIOHelper enabled HeapMalloc, which extends
IMalloc, but didn't mark the overridden functions with override.
This produces warnings on some compilers. Added a little clang-
formatting to the affected region too.
This option reciprocates (multiplicatively inverts) SV_Position.w
after reading it from stage input in PS. This is used to accommodate
the difference between Vulkan and DirectX.
Heap allocation functions were previously partially implemented via
a few macros that provided some of the functionality.
This replaces those macros with a more expansive implementation of
the interface as defined in Microsoft documents. Simple structs are
used to store pointer and size implementation, enabling queries of
allocation sizes from pointers, putting an upper limit on the total
amount of memory allocated by the given heap, retrieving a
default heap for the process, and freeing of all heap memory at the
time of heap destruction.
Most flags are unsupported save ZERO_MEMORY. Heap creation includes
an initial size that is meant to preallocate memory and make later
allocations more efficient. This does not support that. Still for
the usages in the code base, this should appear externally to be
a fully functional implementation.
Includes a macro for CaptureStackBackTrace as well for CompilerTest
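A rough sketch of the bookkeeping described above: each allocation's size is recorded against its pointer, a total limit is enforced, a zero-memory request is honored, and everything still outstanding is freed when the heap is destroyed. `SketchHeap` is a hypothetical class, not the WinAdapter code, and returning `(size_t)-1` for unknown pointers is a choice made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <unordered_map>

class SketchHeap {
public:
  // maxSize == 0 means no upper limit.
  explicit SketchHeap(std::size_t maxSize) : maxSize_(maxSize) {}
  ~SketchHeap() {
    for (auto &entry : sizes_)
      std::free(entry.first); // release everything still outstanding
  }
  SketchHeap(const SketchHeap &) = delete;
  SketchHeap &operator=(const SketchHeap &) = delete;

  // Returns nullptr if the request would exceed the heap's limit.
  void *Alloc(std::size_t size, bool zeroMemory) {
    if (maxSize_ != 0 && (size > maxSize_ || used_ > maxSize_ - size))
      return nullptr;
    void *p = zeroMemory ? std::calloc(1, size) : std::malloc(size);
    if (!p)
      return nullptr;
    sizes_[p] = size;
    used_ += size;
    return p;
  }

  // Size recorded for p, or (size_t)-1 if p was not allocated here.
  std::size_t Size(void *p) const {
    auto it = sizes_.find(p);
    return it == sizes_.end() ? static_cast<std::size_t>(-1) : it->second;
  }

  // Frees p and returns true, or returns false if p is unknown.
  bool Free(void *p) {
    auto it = sizes_.find(p);
    if (it == sizes_.end())
      return false;
    used_ -= it->second;
    sizes_.erase(it);
    std::free(p);
    return true;
  }

private:
  std::size_t maxSize_;
  std::size_t used_ = 0;
  std::unordered_map<void *, std::size_t> sizes_;
};
```

As in the description above, the sketch ignores any initial-size hint and does no preallocation.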
* [linux-port] Win adapter changes for HLSL tests
A number of Windows-specific features are present in the HLSL tests
as well as driver features that had to be enabled to get those tests
running, notably intellisense classes.
This change adds to the existing WinAdapter and WinFunctions files
to include the features needed for these additions to driver and tests.
They cover a lot of ground, but mostly consist of memory allocation
and string changes, because you can never have enough ways to
allocate memory and manipulate strings.
An incidental change is to the existing CreateFileW implementation
to retry a file if the error indicated it was interrupted. This was
taken from the LLVM Path.inc implementation for Unix.
The StringCchPrintfA implementation for Unix systems mistakenly
used the wrong printf variants. Instead of accessing the variadic
arguments, it was just using the va_list variable as a format
parameter. Additionally, reusing va_list has undefined results.
So a copy needs to be made using va_copy.
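A minimal sketch of the corrected pattern, assuming a simplified bool return instead of an HRESULT and a hypothetical function name: the va_list is passed to the `v` variant of printf, and a va_copy is taken before the list is consumed so that it can be walked a second time.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Formats into dest. Returns true on success and false on a formatting
// error or when the output did not fit (dest is still null terminated).
inline bool SketchStringPrintf(char *dest, std::size_t destSize,
                               const char *format, ...) {
  if (!dest || destSize == 0)
    return false;
  va_list args;
  va_start(args, format);
  va_list argsCopy;
  va_copy(argsCopy, args); // a va_list must not be reused after one pass

  // First pass measures the required length and consumes `args`.
  const int needed = std::vsnprintf(nullptr, 0, format, args);
  // Second pass writes the output using the untouched copy.
  const int written = std::vsnprintf(dest, destSize, format, argsCopy);

  va_end(argsCopy);
  va_end(args);
  return needed >= 0 && written >= 0 &&
         static_cast<std::size_t>(written) < destSize;
}
```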
MultiByteToWideChar and WideCharToMultiByte were returning -1 on
failure, which caused failures to be ignored. When successful, they
were returning the number of characters converted excluding the
terminating null character because that's what the std:: functions
used for the implementation do. Both are inconsistent with the
original Windows implementations.
Simply adding one to the return value works in all cases save those
where the number of characters converted was limited by the parameter.
In this case, the number of characters converted doesn't include
the terminating null, so no addition is warranted.
Added additional errors in keeping with the function descriptions.
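The return-value rule can be sketched on top of std::mbstowcs. This is a simplified, hypothetical signature with only a destination capacity (no codepage, flags, or source length), so it shows the counting rule only and is not the WinAdapter function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Returns 0 on failure. Otherwise returns the number of wide characters
// written, counting the terminating null only when it was actually written.
inline int SketchMultiByteToWide(const char *src, wchar_t *dst, int dstCount) {
  if (!src || !dst || dstCount <= 0)
    return 0;
  const std::size_t n =
      std::mbstowcs(dst, src, static_cast<std::size_t>(dstCount));
  if (n == static_cast<std::size_t>(-1))
    return 0; // invalid multibyte sequence: report failure, not -1
  if (n == static_cast<std::size_t>(dstCount))
    return dstCount; // limited by the size parameter: no terminator written
  return static_cast<int>(n) + 1; // terminator written, so count it
}
```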
The CW2A conversion class left out the terminating null character
when calculating the max size of the converted string.
The previous implementation of CreateFileW for Unix didn't set any permission
flags for the new files it created. This sets the default file
permissions granting read and write permissions to the owner and
read permissions to everyone else. This perpetuates the behavior
before the offending change.
Also adds a few simple smoke tests to verify the code produced by
dxc as well as file creation behavior and other basic smoke-level
functionality of dxc.