This introduces parallel types for IO-type containing aggregates used as
non-entry point function parameters or return types, or declared as variables.
Further uses of the same original type will share the same sanitized deep
structure.
This is intended to be used with the wrap-entry-point branch.
Previously, a type graph would turn into a type tree. That is,
a deep node that is shared would have multiple copies made.
This is important when creating IO and non-IO versions of deep types.
This needs some render testing, but is destined to be part of master.
This also leads to a variety of other simplifications.
- IO are global symbols, so only need one list of linkage nodes (deferred)
- no longer need parse-context-wide 'inEntryPoint' state, entry-point is localized
- several parts of splitting/flattening are now localized
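The graph-versus-tree point above can be sketched as a memoized deep copy. This is only an illustration under stated assumptions: `TypeNode`, `NodePtr`, and `sanitize` are hypothetical stand-ins, not glslang's actual types or functions.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a type node; this is not glslang's TType.
struct TypeNode {
    std::string name;
    bool isIo = false;                               // carries IO qualification?
    std::vector<std::shared_ptr<TypeNode>> members;  // may be shared by several parents
};

using NodePtr = std::shared_ptr<TypeNode>;

// Build a non-IO ("sanitized") copy of a type graph. The memo table makes a
// node that is reachable along several paths get copied exactly once, so the
// result is still a graph rather than a tree with duplicated deep nodes.
NodePtr sanitize(const NodePtr& node, std::map<const TypeNode*, NodePtr>& memo)
{
    assert(node != nullptr);
    auto found = memo.find(node.get());
    if (found != memo.end())
        return found->second;        // reuse the copy made earlier

    auto copy = std::make_shared<TypeNode>();
    copy->name = node->name;
    copy->isIo = false;              // strip the IO qualification
    memo[node.get()] = copy;         // register before recursing
    for (const auto& member : node->members)
        copy->members.push_back(sanitize(member, memo));
    return copy;
}
```

Reusing the same memo table for later calls is what lets further uses of one original type share a single sanitized structure.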
Makes it easier to include glslang in a larger CMake project---instead
of having to call `target_link_libraries(glslang OSDependent OGLCompiler
HLSL)`, for example, you only need to call
`target_link_libraries(glslang)` and it will pull in the helpers it
needs.
This is also better in terms of cleaning up the "public interface",
of sorts, for building glslang: end-users probably shouldn't need to
know or be explicitly dependent on internal targets.
- Add support for invocation functions with "InclusiveScan" and
"ExclusiveScan" modes.
- Add support for invocation functions taking int64/uint64/double/float16
as inout data types.
Since EOpMatrixSwizzle is a new op, existing back-ends only work when the
front end first decomposes it to other operations. So far, this is only
being done for simple assignment into matrix swizzles.
This partially addresses issue #670, for when the matrix swizzle
degenerates to a component or column: m[c], m[c][r] (where HLSL
swaps rows and columns for user's view).
An error message is given for the arbitrary cases not covered.
These cases will work for arbitrary use of l-values.
Future work will handle more arbitrary swizzles, which might
not work as arbitrary l-values.
This encapsulates where the string could overflow, removing 40 lines
of fragile code. It also improves handling of numbers that are too long.
There are a couple of open issues that could be related to this function
being more rational (locale dependence, 1.#INF).
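One way to picture "encapsulating where the string could overflow" is a small wrapper that owns the length check. This is a sketch only; `BoundedToken` and its members are hypothetical names, not the code touched by this change.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: every append goes through one bounds check, so the
// callers never index into a fixed-size buffer themselves.
class BoundedToken {
public:
    explicit BoundedToken(std::size_t maxLength) : maxLength_(maxLength) {}

    // Appends c if there is room; otherwise drops c and records the overflow.
    bool append(char c)
    {
        if (text_.size() >= maxLength_) {
            overflowed_ = true;
            return false;
        }
        text_.push_back(c);
        return true;
    }

    const std::string& text() const { return text_; }
    bool overflowed() const { return overflowed_; }

private:
    std::size_t maxLength_;
    std::string text_;
    bool overflowed_ = false;
};
```

A caller scanning a too-long number can keep consuming characters and report a single error afterwards by checking `overflowed()`.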
(Still adding tests: do not commit)
This fixes PR #632 so that:
(a) The 4 PerVertex builtins are added to an interface block for all stages except fragment.
(b) Other builtin qualified variables are added as "loose" linkage members.
(c) Arrayness from the PerVertex builtins is moved to the PerVertex block.
(d) Sometimes, two PerVertex blocks are created, one for in, one for out (e.g, for some GS that
both reads and writes a Position)
Any previous use would only be for "", which would probably mean changing
include(...) -> includeLocal(...)
See comments about includeLocal() being an additional search over
includeSystem(), not a superset search.
This also removed ForbidIncluder, as
- the message in ForbidIncluder was redundant: error results were
already returned to the caller, which then gives the error it
wants to
- there is a trivial default implementation that a subclass can
override any subset of (I still like abstract base classes though)
- trying to get less implementation out of the interface file anyway
- fixed ParseHelper.cpp newlines (crlf -> lf)
- removed trailing white space in most source files
- fix some spelling issues
- extra blank lines
- tabs to spaces
- replace #include comment about no location
- some paths didn't release 'res'
- token is always '\n' after proper acceptance of the directive itself,
so no need to test it, change it to '\n', etc.
- assuming setCurrentColumn(0) is not needed unless there are header tokens,
but not clear why it is ever needed
Note: much of the simplified code read as if the included header tokens had
actually been processed, versus queued up for processing; maybe that explains
some things.
This PR adds support for default function parameters in the following cases:
1. Simple constants, such as void fn(int x, float myparam = 3)
2. Expressions that can be const folded, such as ... myparam = sin(some_const)
3. Initializer lists that can be const folded, such as ... float2 myparam = {1,2}
New tests are added: hlsl.params.default.frag and hlsl.params.default.err.frag
(for testing error situations, such as ambiguity or non-const-foldable).
In order to avoid sampler method ambiguity, the hlsl better() lambda now
considers sampler matches. Previously, all sampler types looked identical
since only the basic type of EbtSampler was considered.
This commit adds support for copying nested hierarchical types of split
types. E.g, a struct of a struct containing both user and builtin interstage
IO variables.
When copying split types, if any subtree does NOT contain builtin interstage
IO, we can copy the whole subtree with one assignment, which saves a bunch
of AST verbosity for memberwise copies of that subtree.
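The subtree rule can be sketched as a small planner that lists which access paths get their own assignment. The `Member` struct and both functions are hypothetical illustrations, not the AST-building code in this commit.

```cpp
#include <string>
#include <vector>

// Hypothetical description of a (possibly nested) struct member.
struct Member {
    std::string name;
    bool isBuiltin = false;       // builtin interstage IO leaf?
    std::vector<Member> members;  // empty for non-struct members
};

// True if this member or anything nested inside it is builtin IO.
bool containsBuiltin(const Member& m)
{
    if (m.isBuiltin)
        return true;
    for (const Member& child : m.members)
        if (containsBuiltin(child))
            return true;
    return false;
}

// Collect one access path per assignment. A subtree with no builtin IO is
// copied whole; only subtrees that contain builtin IO are walked member-wise.
void planCopies(const Member& m, const std::string& path, std::vector<std::string>& out)
{
    if (!containsBuiltin(m) || m.members.empty()) {
        out.push_back(path);
        return;
    }
    for (const Member& child : m.members)
        planCopies(child, path + "." + child.name, out);
}
```

For a struct holding a user-only sub-struct, a builtin position, and a nested struct mixing both, the plan has one entry for the whole user-only sub-struct and one entry per member of the mixed one.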
This adds structure splitting, which among other things will enable GS support where input structs
are passed, and thus become input arrays of structs in the GS inputs. That is a common GS case.
The salient points of this PR are:
* Structure splitting has been changed from "always between stages" to "only into the VS and out of
the PS". It had previously happened between stages because it's not legal to pass a struct
containing a builtin IO variable.
* Structs passed between stages are now split into a struct containing ONLY user types, and a
collection of loose builtin IO variables, if any. The user-part is passed as a normal struct
between stages, which is valid SPIR-V now that the builtin IO is removed.
* Internal to the shader, a sanitized struct (with IO qualifiers removed) is used, so that e.g,
functions can work unmodified.
* If a builtin IO such as Position occurs in an arrayed struct, for example as an input to a GS,
the array reference is moved to the split-off loose variable, which is given the array dimension
itself.
When passing things around inside the shader, such as over a function call, the original type
is used in a sanitized form that removes the builtIn qualifications and makes them temporaries.
This means internal function calls do not have to change. However, the type when returned from
the shader will be member-wise copied from the internal sanitized one to the external type.
The sanitized type is used in variable declarations.
When copying split types and unsplit, if a sub-struct contains only user variables, it is copied
as a single entity to avoid more AST verbosity.
The above strategy was arrived at through talks with @johnkslang.
This is a big complex change. I'm inclined to leave it as a WIP until it can get some exposure to
real world cases.
Also, eliminate the 'atom' field of TPpToken.
Parsing a real 300 line shader, through to making the AST, is about 10% faster.
Memory is slightly reduced (< 1%).
The whole google-test suite, inclusive of all testing overhead, SPIR-V generation,
etc., runs 3% faster.
Since this is a code *simplification* that leads to perf. improvement, I'm not
going to invest too much more in measuring the perf. than this. The PP code is
simply now in a better state to see how to further rationalize/improve it.
Removed the preprocessor memory pool.
Removed extra copies and unnecessary allocations of objects related to the ones
that were using the pool.
Replaced some allocated pointers with objects instead, generally using more
modern techniques. There end up being fewer memory allocations/deletions to get right.
Overall combined effect of all changes is to use slightly less memory and
run slightly faster (< 1% for both, but noticeable).
As part of simplifying the code base, this change makes it easier to see
PP symbol tracking, which I suspect has an even bigger run-time simplification
to make.
Implement token pasting as per the C++ specification, within the current
style of the PP code.
Pasting non-identifiers (turning 12 ## 10 into the numeral 1210) is not yet covered;
it should be a simple incremental change built on this one.
Addresses issue #255.
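As a rough sketch of the identifier-only case described here: paste the two spellings and accept the result only if it is still an identifier. `isIdentifier` and `pasteTokens` are hypothetical helpers, not the PP code itself.

```cpp
#include <cctype>
#include <string>

// True if s is spelled like an identifier: a letter or underscore followed by
// letters, digits, or underscores.
bool isIdentifier(const std::string& s)
{
    if (s.empty())
        return false;
    if (!(std::isalpha(static_cast<unsigned char>(s[0])) || s[0] == '_'))
        return false;
    for (char c : s)
        if (!(std::isalnum(static_cast<unsigned char>(c)) || c == '_'))
            return false;
    return true;
}

// Sketch of `left ## right` restricted to results that are identifiers.
// Returns false and leaves `result` untouched for anything else, such as
// pasting two numerals.
bool pasteTokens(const std::string& left, const std::string& right, std::string& result)
{
    const std::string joined = left + right;
    if (!isIdentifier(joined))
        return false;
    result = joined;
    return true;
}
```

With this sketch, `foo ## bar` yields `foobar`, while `12 ## 10` is rejected, matching the stated limitation.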
This change is helpful for integration with Chromium, which recently
added a compiler option to warn when compiling any source files which
use extended characters. In this case the offending character was a
single unicode dash in a comment.
This wasn't needed until the recent generalization of "main" to "entry point",
so makes some HLSL-specific code be generic now, for GLSL functional correctness.
In file included from C:/Projects/glslang/glslang/MachineIndependent/glslang.y:59:0:
glslang/MachineIndependent/ParseHelper.h:276:24: error: 'va_list' has not been declared
va_list args);
^~~~~~~
PR #577 addresses most but not all of the intrinsic promotion problems.
This PR resolves all known cases in the remainder.
Interlocked ops need special promotion rules because at the time
of function selection, the first argument has not been converted
to a buffer object. It's just an int or uint, but you don't want
to convert THAT argument, because that implies converting the
buffer object itself. Rather, you can convert other arguments,
but want to stay in the same "family" of functions. E.g, if
the first interlocked arg is a uint, use only the uint family,
never the int family; the other args can be converted as you please.
This PR allows making such opcode and arg specific choices by
passing the op and arg to the convertible lambda. The code in
the new test "hlsl.promote.atomic.frag" would not compile without
this change, but it must compile.
Also, it provides better handling of downconversions (to "worse"
types), which are permitted in HLSL. The existing method of
selecting upconversions is unchanged, but if that doesn't find
any valid ones, then it will allow downconversions. In effect
this always uses an upconversion if there is one.
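The "upconversion first, downconversion only as a fallback" rule can be sketched as follows. The type ranking and the function name are assumptions made for illustration; the real selection logic is more involved.

```cpp
#include <vector>

// Assumed ranking for illustration only: a larger value is a "better" type.
enum BasicType { Bool = 0, Int = 1, Uint = 2, Float = 3, Double = 4 };

// Pick the index of the candidate type that `from` should convert to.
// Prefers the nearest upconversion (or exact match); falls back to the
// nearest downconversion only when no upconversion exists.
// Returns -1 if there are no candidates.
int selectConversion(BasicType from, const std::vector<BasicType>& candidates)
{
    int bestUp = -1;
    int bestDown = -1;
    for (int i = 0; i < static_cast<int>(candidates.size()); ++i) {
        const BasicType to = candidates[i];
        if (to >= from) {
            if (bestUp < 0 || to < candidates[bestUp])
                bestUp = i;       // smallest widening so far
        } else {
            if (bestDown < 0 || to > candidates[bestDown])
                bestDown = i;     // least narrowing so far
        }
    }
    return bestUp >= 0 ? bestUp : bestDown;
}
```

Because the downconversion result is consulted only when `bestUp` was never set, an available upconversion always wins.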
Use "--source-entrypoint name" on the command line, or the
TShader::setSourceEntryPoint(char*) API.
When the name given to the above interfaces is detected in the
shader source, it will be renamed to the entry point name supplied
to the -e option or the TShader::setEntryPoint() method.
This PR handles implicit promotions for intrinsics when there is no exact match,
such as for example clamp(int, bool, float). In this case the int and bool will
be promoted to a float, and the clamp(float, float, float) form used.
These promotions can be mixed with shape conversions, e.g, clamp(int, bool2, float2).
Output conversions are handled either via the existing addOutputArgumentConversion
function, which this PR generalizes to handle either aggregates or unaries, or by
intrinsic decomposition. If there are methods or intrinsics to be decomposed,
then decomposition is responsible for any output conversions, which turns out to
happen automatically in all current cases. This can be revisited once inout
conversions are in place.
Some cases of actual ambiguity were fixed in several tests, e.g, spv.register.autoassign.*
Some intrinsics with only uint versions were expanded to signed ints natively, where the
underlying AST and SPIR-V supports that. E.g, countbits. This avoids extraneous
conversion nodes.
A new function promoteAggregate is added, and used by findFunction. This is essentially
a generalization of the "promote 1st or 2nd arg" algorithm in promoteBinary.
The actual selection proceeds in three steps, as described in the comments in
hlslParseContext::findFunction:
1. Attempt an exact match. If found, use it.
2. If not, obtain the operator from step 1, and promote arguments.
3. Re-select the intrinsic overload from the results of step 2.
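The three steps can be sketched on a toy overload table. This is a simplified illustration, assuming that promotion means converting every argument to the widest argument type; the names below are hypothetical and do not mirror hlslParseContext::findFunction.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Assumed ordering for illustration: later enumerators are "wider".
enum BasicType { Bool, Int, Uint, Float };
using Signature = std::vector<BasicType>;

// Steps 1 and 3 share this: index of the overload matching args exactly, or -1.
int findExact(const std::vector<Signature>& overloads, const Signature& args)
{
    for (std::size_t i = 0; i < overloads.size(); ++i)
        if (overloads[i] == args)
            return static_cast<int>(i);
    return -1;
}

int selectOverload(const std::vector<Signature>& overloads, Signature args)
{
    // Step 1: attempt an exact match.
    const int exact = findExact(overloads, args);
    if (exact >= 0)
        return exact;
    if (args.empty())
        return -1;

    // Step 2: promote all arguments to the widest argument type.
    const BasicType widest = *std::max_element(args.begin(), args.end());
    std::fill(args.begin(), args.end(), widest);

    // Step 3: re-select using the promoted argument types.
    return findExact(overloads, args);
}
```

With overloads for (int, int, int) and (float, float, float), a call with (int, bool, float) arguments resolves to the float form, as in the clamp example.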
This PR adds a CreateParseContext() fn analogous to CreateBuiltInParseables(),
to create a language specific built in parser. (This code was present before
but not encapsulated in a fn). This can now be used to create a source language
specific parser for builtins.
Along with this, the code creating HLSL intrinsic prototypes can now produce
them in HLSL syntax, rather than GLSL syntax. This relaxes certain prior
restrictions at the parser level. Lower layers (e.g, SPIR-V) may still have
such restrictions, such as around Nx1 matrices: this code does not impact
that.
This PR also fleshes out matrix types for bools and ints, both of which were
partially in place before. This was easier than maintaining the restrictions
in the HLSL prototype generator to avoid creating prototypes with those types.
Many tests change because the result type from intrinsics moves from "global"
to "temp".
Several new tests are added for the new types.
Previously, an error was thrown when assigning a float1 to a scalar float,
or similar for other basic types. This allows that.
Also, this allows calling functions accepting scalars with float1 params,
so for example sin(float1) will work. This is a minor change in
HlslParseContext::findFunction().
Rationalizes the entire tracking of the linker object nodes, affecting
GLSL, HLSL, and SPIR-V, to allow tracked objects to be fully edited before
their type snapshot for linker objects.
Should only affect things when the rest of the AST contained no reference to
the symbol, because normal AST nodes were not stale. Also will only affect such
objects when their types were edited.
This PR adds:
1. The "u" register class for RW* objects.
2. --shift-image-bindings (== --sib), analogous to --shift-texture-bindings etc.
3. Case insensitive reg classes.
4. Tests for above.
- add optional callback to handle mapping of uniform variables in linking phase
- if no resolver is provided, it uses the internal default resolver with all shifts and auto bind settings
Change-Id: Icfe38a9eabe8bfc8f8bb6d8150c06f7ed38bb762
This PR only changes a few lines of code, but is subtle.
In HLSL, comparison operators (<,>,<=,>=,==,!=) operate component-wise
when given a vector operand. If a whole vector equality or inequality is
desired, then all() or any() can be used on the resulting bool vector.
This PR enables this change. Existing shape conversion is used when
one of the two arguments is a vector and one is a scalar.
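The semantics can be mimicked in C++ as a sketch: the comparison yields a vector of bools, and `all()` or `any()` reduces it to a single bool. These templates are illustrations, not part of glslang.

```cpp
#include <array>
#include <cstddef>

// Component-wise equality: one bool per component, like HLSL's vec == vec.
template <std::size_t N>
std::array<bool, N> equalComponents(const std::array<float, N>& a,
                                    const std::array<float, N>& b)
{
    std::array<bool, N> result{};
    for (std::size_t i = 0; i < N; ++i)
        result[i] = (a[i] == b[i]);
    return result;
}

// True only if every component is true (whole-vector equality).
template <std::size_t N>
bool all(const std::array<bool, N>& v)
{
    for (bool component : v)
        if (!component)
            return false;
    return true;
}

// True if at least one component is true.
template <std::size_t N>
bool any(const std::array<bool, N>& v)
{
    for (bool component : v)
        if (component)
            return true;
    return false;
}
```

Whole-vector equality is then `all(equalComponents(a, b))`; whole-vector inequality is the negation of that, which is the same as applying `any()` to a component-wise `!=`.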
Some existing HLSL tests had assumed == and != meant vector-wise
instead of component-wise comparisons. These tests have been changed
to add an explicit any() or all() to the test source. This verifiably
does not change the final SPIR-V binary relative to the old behavior
for == and !=. The AST does change for the (now explicit, formerly
implicit) any() and all(). Also, a few tests change where they
previously had the return type wrong, e.g, from a vec < vec comparison
in hlsl.shapeConv.frag.
Promotion of comparison opcodes to vector forms
(EOpEqual->EOpVectorEqual) is handled in promoteBinary(), as is setting
the proper vector type of the result.
EOpVectorEqual and EOpVectorNotEqual are now accepted as either
aggregate or binary nodes, similar to how the other operators are
handled. Partial support already existed for this: it has been
fleshed out in the printing functions in intermOut.cpp.
There is an existing defect around shape conversion with 1-vectors, but
that is orthogonal to this PR and not addressed by it.
I happened upon numArgs while hunting for unused variables. I suspect
the intent was to apply it as shown in this patch. However, I am not a
compiler dude. Someone more appropriate should grok this change.
A need arose to use capabilities from TIntermediate during
node promotion. These methods have been moved from virtual
methods on the TIntermUnary and TIntermBinary nodes to methods
on TIntermediate, so it is easy for them to construct new nodes
and so on.
This is done as a separate commit to verify that no test results
are changed as a result.
This fixes defects as follows:
1. handleLvalue could be called on a non-L-value, and it shouldn't be.
2. HLSL allows unary negation on non-bool values. TUnaryOperator::promote
can now promote other types (e.g, int, float) to bool for this op.
3. HLSL allows binary logical operations (&&, ||) on arbitrary types, similar
to (2).
4. HLSL allows mod operation on arbitrary types, which will be promoted.
E.g, int % float -> float % float.
This PR sets the TQualifier layoutFormat according to the HLSL image type.
For instance:
RWTexture1D <float2> g_tTex1df2;
becomes ElfRg32f. Similar on Buffers, e.g, Buffer<float4> mybuffer;
The return type for image and buffer loads is now taken from the storage format.
Also, the qualifier for the return type is now (properly) a temp, not a global.
- hlsl.struct.frag variable changed to static, assignment replaced.
- Created new low level functions addBinaryNode and addUnaryNode. These are
used by higher level functions such as addAssignment, and do not do any
argument promotion or conversion of any sort.
- Two functions above are now used in RWTexture lvalue conversions. Also,
other direct creations of unary or binary nodes now use them, e.g, addIndex.
This cleans up some existing code.
- removed handling of EOpVectorTimesScalar from promote()
- removed comment from ParseHelper.cpp
This commit splits lValueErrorCheck into language-dependent and language-independent
parts. The GLSL form in TParseContext inherits from and invokes the
language-independent part in TParseContextBase. The base form checks language
independent things. This split does not change the set of errors tested
for: the test results are identical.
The new base class interface is now used from the HLSL FE to test lvalues.
There was one test diff due to this, where the test was writing to a uniform.
It still does the same indirections, but does not attempt a uniform write.
This commit adds l-value support for RW texture and buffer objects.
Supported are:
- pre and post inc/decrement
- function out parameters
- op-assignments, such as *=, +=, etc.
- result values from op-assignments. e.g, val=(MyRwTex[loc] *= 2);
Not supported are:
- Function inout parameters
- multiple post-inc/decrement operators. E.g, MyRWTex[loc]++++;
If a member-wise assignment from a non-flattened struct to a flattened struct sees a complex R-value
(not a symbol), it now creates a temporary to hold that value, to avoid repeating the R-value.
This avoids, e.g, duplicating a whole function call. Also, it avoids re-using the AST node, making a
new one for each member inside the member loop.
The latter (re-use of AST node) was also an issue in the GetDimensions intrinsic decomposition,
so this PR fixes that one too.
- Add new queries: TProgram::getUniformTType and getUniformBlockTType,
which return a const TType*, or nullptr on a bad index. These are valid for
any source language.
- Interface name for HLSL cbuffers is taken from the (only) available declaration name,
whereas before it was always an empty string, which caused some troubles with reflection
mapping them all to the same index slot. This also makes it appear in the SPIR-V binary
instead of an empty string.
- Print the binding as part of the reflection textual dump.
- TType::clone becomes const. Needed to call it from a const method, and anyway it doesn't
change the object it's called on.
- Because the TObjectReflection constructor is called with a TType *reference* (not pointer)
so that it's guaranteed to pass in a type, and the "badReflection" value should use a nullptr
there, that now has a dedicated static method to obtain the bad value. It uses a private
constructor, so external users can't create one with a nullptr type.
Previously, the binding auto-mapping facility was free to use any unused
binding. This change makes auto-bindings use the same offset value as
explicit bindings.
Fix for two defects as follows:
- The IO mapping traverser was not setting inVisit, and would skip some AST nodes.
Depending on the order of nodes, this could have prevented the binding from
showing up in the generated SPIR-V.
- If a uniform array was flattened, each of the flattened scalars from the array
is still a (now-scalar) uniform. It was being converted to a temporary.
This checkin adds a --flatten-uniform-arrays option which can break
uniform arrays of samplers, textures, or UBOs up into individual
scalars named (e.g) myarray[0], myarray[1], etc. These appear as
individual linkage objects.
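The naming scheme for the flattened elements can be sketched like this. `flattenArrayNames` is a hypothetical helper used only to show the resulting names, not the flattening code itself.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Produce one name per array element, in the form name[index], mirroring
// how each element becomes its own linkage object.
std::vector<std::string> flattenArrayNames(const std::string& name, int size)
{
    assert(size >= 0);
    std::vector<std::string> names;
    for (int i = 0; i < size; ++i)
        names.push_back(name + "[" + std::to_string(i) + "]");
    return names;
}
```

For an array named myarray with three elements, this gives myarray[0], myarray[1], and myarray[2].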
Code notes:
- shouldFlatten internally calls shouldFlattenIO, and shouldFlattenUniform,
but is the only flattening query directly called.
- flattenVariable will handle structs or arrays (but not yet arrayed structs;
this is tested and an error is generated).
- There's some error checking around unhandled situations. E.g, flattening
uniform arrays with initializer lists is not implemented.
- This piggybacks on as much of the existing mechanism for struct flattening
as it can. E.g, it uses the same flattenMap, and the same
flattenAccess() method.
- handleAssign() has been generalized to cope with either structs or arrays.
- Extended test infrastructure to test flattening ability.
This PR adds the ability to offset sampler, texture, and UBO bindings
from provided base bindings, and to auto-number bindings that are not
provided with explicit register numbers. The mechanism works as
follows:
- Offsets may be given on the command line for all stages, or
individually for one or more single stages, in which case the
offset will be auto-selected according to the stage being
compiled. There is also an API to set them. The new command line
options are --shift-sampler-binding, --shift-texture-binding, and
--shift-UBO-binding.
- Uniforms which are not given explicit bindings in the source code
are auto-numbered if and only if they are in live code as
determined by the algorithm used to build the reflection
database, and the --auto-map-bindings option is given. This auto-numbering
avoids using any binding slots which were explicitly provided in
the code, whether or not that explicit use was live. E.g, "uniform
Texture1D foo : register(t3);" with --shift-texture-binding 10 will
reserve binding 13, whether or not foo is used in live code.
- Shorter synonyms for the command line options are available. See
the --help output.
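The mechanism above can be sketched for a single register class. This is an illustration under assumptions: the `Uniform` struct and `assignBindings` are hypothetical, and starting the auto-numbering at the shift value is an assumption rather than something this description states.

```cpp
#include <set>
#include <vector>

// Hypothetical view of one uniform in a single register class.
struct Uniform {
    int explicitRegister;  // register number from the source, or -1 if none
    bool live;             // referenced from live code?
};

// Returns the final binding for each uniform, or -1 where none is assigned.
std::vector<int> assignBindings(const std::vector<Uniform>& uniforms, int shift, bool autoMap)
{
    // Explicit registers reserve their shifted slot whether or not they are live.
    std::set<int> reserved;
    for (const Uniform& u : uniforms)
        if (u.explicitRegister >= 0)
            reserved.insert(u.explicitRegister + shift);

    std::vector<int> bindings;
    int next = shift;  // assumption: auto-numbering starts at the shift value
    for (const Uniform& u : uniforms) {
        if (u.explicitRegister >= 0) {
            bindings.push_back(u.explicitRegister + shift);
        } else if (autoMap && u.live) {
            while (reserved.count(next) != 0)
                ++next;               // skip slots that are already taken
            bindings.push_back(next);
            reserved.insert(next);
        } else {
            bindings.push_back(-1);   // dead, or auto-mapping not requested
        }
    }
    return bindings;
}
```

With a shift of 10, a dead uniform declared at register 3 still reserves binding 13, and live uniforms without registers are numbered around the reserved slots.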
The testing infrastructure is slightly extended to allow use of the
binding offset API, and two new tests spv.register.(no)autoassign.frag are
added for comparing the resulting SPIR-V.
This PR factors out the code that knows how to walk just the live parts of the AST.
The traverser in reflect.cpp is renamed to TReflectionTraverser, and inherits from
TLiveTraverser, which will also be used by a future binding offset PR.
The code is now smart about the entry point name (no longer hardcoded to "main").
There is an option to traverse all code (live+dead), because a consumer of the
class may wish to use it for both purposes without wanting a whole separate
class hierarchy.
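The idea of walking only live code from a named entry point can be sketched at the function level. This is a simplification: the call-graph representation and `liveFunctions` are hypothetical, and the sketch ignores liveness inside function bodies.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical call graph: function name -> names of the functions it calls.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Collect every function reachable from the given entry point. The entry
// point is a parameter rather than a hardcoded "main".
std::set<std::string> liveFunctions(const CallGraph& calls, const std::string& entryPoint)
{
    std::set<std::string> live;
    std::vector<std::string> pending;
    pending.push_back(entryPoint);
    while (!pending.empty()) {
        const std::string fn = pending.back();
        pending.pop_back();
        if (!live.insert(fn).second)
            continue;                 // already visited
        const auto it = calls.find(fn);
        if (it == calls.end())
            continue;                 // no recorded callees
        for (const std::string& callee : it->second)
            pending.push_back(callee);
    }
    return live;
}
```

A traverse-everything mode would simply visit every key in the graph instead of starting from the entry point.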
Addresses issue #304 and issue #307 by replacing unmatched type OpStores with
per-member copies. Covers assignment statements and most argument passing, but
does not yet cover r-value-based argument passing.
Takes some pressure off of issue #304.
Structures don't inherit locations and then explicitly decorate
members with them, so removed this reason to have another instance
of a structure type.
Code using atEndOfFile was dead; instead, do something useful with
the scanner's atEndOfInput(). This allows a better error message
for early termination of cascading errors.
This also enables vecN -> vec1 shape conversions for all places doing shape
conversions.
For signature selection, makes shape changes worse than any other comparison
when deciding what conversions are better than others.
This is part of the change to have desktop shaders respect precision
qualifiers on Vulkan, but since the defaults are all highp, and that's
different from ES fragment shaders, detect likely cases and warn about
them (but being careful to not be too noisy if it's unlikely to be a
problem).
Sets highp defaults for the appropriate types, for all stages,
and turns on precision qualifiers for non-ES shaders. Required
fixing some qualifier orders for desktop built-in declarations
for pre-420 shaders.
Use the new function selector for #version 400 and above,
parameterized for the GLSL #version 400 selection rules.
This can be used for both GLSL and HLSL, and other languages
as well.
When preprocessing only, some tokens were emitted as <bad token>.
This fixes them to preserve their original content.
This supplants PR #182, with a correction and test results.
From the ES spec + Bugzilla 15931 and GL_KHR_vulkan_glsl:
- Update precision qualifiers for all built-in function prototypes.
- Implement the new algorithm used to distinguish built-in function
operation precisions from result precisions.
Also add tracking of separate result and operation precisions, and
use that in generating SPIR-V.
(SPIR-V cares about precision of operation, while the front-end
cares about precision of result, for propagation.)
The sequence
#define m()
int m"
creates a token of no length (a string of 0 size). Protect
against a string of 0 size as well as the existing protection
against a null string.
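The double guard can be sketched as below. `hasContent` and `lastChar` are hypothetical helpers chosen to show why the empty case needs its own check, not the preprocessor code that was changed.

```cpp
#include <cstring>

// True only when there is at least one character to look at: the pointer is
// non-null and the string is not empty.
bool hasContent(const char* s)
{
    return s != nullptr && s[0] != '\0';
}

// Last character of s, or '\0' when s is null or empty. Without the empty
// check, strlen(s) - 1 would wrap around and index far out of bounds.
char lastChar(const char* s)
{
    if (!hasContent(s))
        return '\0';
    return s[std::strlen(s) - 1];
}
```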