This patch breaks linear gradients into two primitives:
- LinearGradient is always rendered via the brush shader. It is only used with SWGL.
- CachedLinearGradient is always rendered via a cached render task and the brush image shader. Its implementation is very similar to conic and radial gradients, with the addition of a fast-path for axis aligned gradients with two stops.
In addition the following changes are made:
- The gradient fast path is simpler (only deals with two gradient stops).
- Decomposing axis-aligned gradients into parts eligible for the fast path happens during scene building instead of frame building.
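The decomposition mentioned in the last item can be pictured as splitting a stop list into consecutive two-stop pieces. The following is a minimal sketch of that idea only, not the code from this patch; the `Stop` and `Segment` types and the `decompose` function are hypothetical names introduced for illustration, and stops are assumed to be sorted by offset.

```rust
/// A gradient stop: position along the gradient axis and an RGBA color.
/// (Hypothetical type, for illustration only.)
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Stop {
    pub offset: f32,
    pub color: [f32; 4],
}

/// One two-stop piece of a decomposed gradient. (Hypothetical type.)
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Segment {
    pub start: Stop,
    pub end: Stop,
}

/// Split a list of stops, assumed sorted by offset, into consecutive
/// two-stop segments. Zero-length pairs (hard stops) cover no area and
/// are skipped.
pub fn decompose(stops: &[Stop]) -> Vec<Segment> {
    stops
        .windows(2)
        .filter(|pair| pair[1].offset > pair[0].offset)
        .map(|pair| Segment { start: pair[0], end: pair[1] })
        .collect()
}
```

Each resulting segment has exactly two stops, which is the shape the simplified fast path expects.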
Differential Revision: https://phabricator.services.mozilla.com/D112018
On android video data is provided to webrender via EGL external
images. Webrender currently relies on the extension
GL_OES_EGL_image_external_essl3 to render these images. Unfortunately,
however, there are a number of devices which support GLES 3, but do
not support GL_OES_EGL_image_external_essl3.
This means that we must use the older GL_OES_EGL_image_external
(non-essl3) extension to render video on such devices. This requires
shaders to be written in ESSL1 rather than ESSL3.
Most of webrender's shaders use too many modern GLSL features to be
compatible with ESSL1. For that reason, this patch implements ESSL1
compatible variants of just the composite and cs_scale shaders, as
they are both relatively simple.
In the happy path, videos are promoted to compositor surfaces and we
simply use this new composite shader variant. In other cases, this
patch makes it so that we use a render task to perform a copy of the
video frame using the cs_scale shader, then the output of that render
task can be rendered using the regular ESSL3 TEXTURE_2D variant of
whatever shader is required. The extra copy is unfortunate, but
rewriting every shader to be ESSL1 compatible is unrealistic.
Differential Revision: https://phabricator.services.mozilla.com/D108908
On low powered android devices it has been observed that we are GPU
bound on many pages. The composite shader, despite being relatively
simple, can account for a large proportion of these cycles due to the
large number of fragments it touches.
On Mali-T GPUs, the composite fragment shader is bound by loading the
varyings (for example, this takes 3 cycles on a Mali-T830). This patch
adds a fast path variant of the shader which removes the vColor and
vUVBounds varyings, reducing the number of cycles per fragment to 1 on
this GPU. This variant can only be used where the shader does not need
to modulate the output by a color (ie aColor is white), and when the
UV coordinates do not need to be clamped (eg because the entire
texture is being composited). Fortunately both of these conditions are
true in the common case of compositing picture cache tiles.
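The two conditions above can be expressed as a simple predicate. This is a rough sketch under stated assumptions rather than the renderer's actual check: `Rect`, `WHITE`, and `can_use_fast_path` are hypothetical names, and "no clamping needed" is approximated here as "the UV rect equals the full texture".

```rust
/// Axis-aligned rectangle in texel coordinates. (Hypothetical type.)
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Rect {
    pub x0: f32,
    pub y0: f32,
    pub x1: f32,
    pub y1: f32,
}

/// Opaque white: modulating by this color leaves the output unchanged.
pub const WHITE: [f32; 4] = [1.0, 1.0, 1.0, 1.0];

/// Returns true when the varying-free shader variant would be sufficient:
/// the color is white (no modulation) and the whole texture is composited
/// (so UVs never need clamping).
pub fn can_use_fast_path(color: [f32; 4], uv_rect: Rect, texture_size: (f32, f32)) -> bool {
    let no_modulation = color == WHITE;
    let full_texture = uv_rect
        == Rect { x0: 0.0, y0: 0.0, x1: texture_size.0, y1: texture_size.1 };
    no_modulation && full_texture
}
```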
Differential Revision: https://phabricator.services.mozilla.com/D108686
It is the cs_* shader used to cache linear gradients in the simple/fast path (more complex linear gradients are rendered with a brush shader). The goal of this patch series is to move all gradients to cached render tasks that will be composited with brush_image, so a followup will introduce cs_linear_gradient to cover the general case, as well as cs_radial_gradient and cs_conic_gradient to replace the brush equivalents.
Differential Revision: https://phabricator.services.mozilla.com/D107640
On some Mali devices we have encountered driver crashes caused by
calling textureSize(samplerExternalOES) in a shader without also
potentially sampling from the texture in the shader. ARM's suggested
workaround was to trick the driver into thinking that the texture may
be sampled from (ie by sampling in a branch which is never dynamically
taken).
This is done by checking the value of a dummy uniform, and sampling
the texture if the value is non-default. Using a constant expression
did not work because the compiler would optimize the condition (and
therefore the sample) away.
Also re-enable webrender on Mali-72 and G76 devices, as it was blocked
due to this bug.
Differential Revision: https://phabricator.services.mozilla.com/D105493
With this change, all color/alpha intermediate surfaces are individual
textures, rather than texture arrays.
This can in theory cause more batch breaks in some cases, but this
is likely to be very rare in practice.
Benefits:
- No more allocating the array at the size of the largest task / slice.
- Remove a source of many driver bugs on android devices.
- Simplify integration of future patches with render task graph.
Much of the render target array texture code is still present, since
picture cache tiles in the Draw compositor still make use of texture
arrays. However, once these are switched to normal textures, we can
remove most of the slice layer, blit workaround functionality etc.
Remove the default feature setting for selecting the image sampler
kind. Instead, this must be explicitly specified by the shader or
a dynamic feature define, which makes sampler selection less error prone.
Differential Revision: https://phabricator.services.mozilla.com/D99006
The pixel-local-storage functionality was an experiment for faster
drawing of clip masks on low end tiled GPUs. However, it's never
reached a point where it was shippable and showing clear performance
wins.
This patch removes the experimental PLS support - we can always
revive it from git history if we ever want to consider it again.
Differential Revision: https://phabricator.services.mozilla.com/D98290
This change adds a code path to avoid instancing, enabled (if supported) on non-Intel GPUs.
Side note: we still need a plan for devices that support neither base-instance nor SSBO.
Differential Revision: https://phabricator.services.mozilla.com/D87826
The patch ended up more complicated than I anticipated due to a lot of places in webrender assuming texture arrays unless specified otherwise.
The patch also merges TextureTarget into ImageBufferKind, and removes the realloc code path in the texture cache (which is supposed to be dead code because of performance issues on windows+intel).
Differential Revision: https://phabricator.services.mozilla.com/D95562
I found it hard to understand the code that builds shader features,
and even harder to extend it with a new feature. This PR refactors the shader
building code by removing the macro and introducing an internal FeatureList abstraction.
It also sorts the features on both ends (alternatively, we could use a set).
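To illustrate why sorting matters, here is a minimal sketch of what such a list abstraction could look like. It is an assumption-laden illustration, not the code from this PR: the method names (`add`, `finish`) and the comma-joined output are chosen for the example.

```rust
/// A list of shader feature names that yields a canonical, sorted string,
/// so the same set of features always maps to the same key regardless of
/// the order in which features were added. (Illustrative sketch.)
pub struct FeatureList<'a> {
    list: Vec<&'a str>,
}

impl<'a> FeatureList<'a> {
    pub fn new() -> Self {
        FeatureList { list: Vec::new() }
    }

    /// Add a feature; empty names are ignored.
    pub fn add(&mut self, feature: &'a str) {
        if !feature.is_empty() {
            self.list.push(feature);
        }
    }

    /// Sort the features and join them into a single comma-separated key.
    pub fn finish(&mut self) -> String {
        self.list.sort_unstable();
        self.list.join(",")
    }
}
```

Because both the producer and the consumer of the key sort before joining, two lists built in different orders compare equal. A set would give the same guarantee, as the message notes.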
Differential Revision: https://phabricator.services.mozilla.com/D83455
Part 1 - support RGB external surfaces for promotion to compositor
surfaces; add new shader permutations to handle all buffer kinds.
Set the promotion flag when the pixel format has no alpha, or when the
texture provider can guarantee all-solid alpha values.
Differential Revision: https://phabricator.services.mozilla.com/D71120
The list of pre-optimized shaders was being generated in a
non-deterministic order, causing large build time regressions. This
sorts the list of shaders before writing them to the shaders.rs file.
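A minimal sketch of the idea, assuming the shaders are held in a HashMap (whose iteration order is not stable between runs); the function name and the emitted line format are hypothetical and do not reflect the real shaders.rs layout:

```rust
use std::collections::HashMap;
use std::fmt::Write;

/// Render a name -> source table in a stable order. Iterating the HashMap
/// directly would produce a different file on each build; sorting the keys
/// first makes the generated output byte-for-byte reproducible.
pub fn render_shader_table(shaders: &HashMap<String, String>) -> String {
    let mut names: Vec<&String> = shaders.keys().collect();
    names.sort();

    let mut out = String::new();
    for name in names {
        // Writing to a String cannot fail, so unwrap is safe here.
        writeln!(out, "{:?} => {:?},", name, shaders[name]).unwrap();
    }
    out
}
```

A reproducible generated file lets the build system skip recompiling everything that depends on it when nothing actually changed.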
Differential Revision: https://phabricator.services.mozilla.com/D71905
Move more shader parsing code to webrender_build, so it can be used
both at runtime and build time.
At build time optimize a set of shaders and feature flag combinations,
using glslopt. Some features are skipped because they are not
supported by the gl version, because the optimizer does not support
them, or because webrender does not need them currently.
Use build-parallel to ensure the optimization is performed in parallel
using the make jobserver. Write the optimized shader source to a
hashmap to be used at runtime, in addition to the unoptimized source.
Differential Revision: https://phabricator.services.mozilla.com/D70032
This is an experiment with only image and solid to see what the infrastructure can be like.
If it works out I'll extend it with more brush types. More work will be needed to get text rendering in there as well.
The multi-brush shader includes all brushes that it potentially needs support for. Which brushes actually get compiled in is then specified via WR_FEATURE defines.
Since brushes can't have the same names for their entry points, they specify the function to use via macros (WR_BRUSH_VS_FUNCTION and WR_BRUSH_FS_FUNCTION).
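As a rough illustration of how feature defines select what gets compiled in, the sketch below builds a preamble of defines from a feature list. The `WR_FEATURE_` prefix format, the `build_prefix` helper, and the brush feature names used in the usage note are assumptions for the example, not taken from this patch.

```rust
/// Build the block of `#define` lines that would be prepended to the shader
/// source. Each enabled feature becomes one define, which the shader can
/// test with `#ifdef` to include or skip a brush. (Illustrative sketch.)
pub fn build_prefix(features: &[&str]) -> String {
    let mut prefix = String::new();
    for feature in features {
        prefix.push_str("#define WR_FEATURE_");
        prefix.push_str(feature);
        prefix.push('\n');
    }
    prefix
}
```

Calling `build_prefix(&["IMAGE_BRUSH", "SOLID_BRUSH"])` with these hypothetical feature names would yield two define lines, one per brush to compile in.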
Differential Revision: https://phabricator.services.mozilla.com/D53725
This changes the shader parsing code to inject #included shader sources only once (the first time) if they are included multiple times.
This will allow some extra flexibility needed by the multi-brush shader.
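The include-once behaviour can be sketched as a recursive expansion that remembers which sources it has already emitted. This is a simplified illustration, not the parser from this patch: the `expand` function, the in-memory `sources` map, and the one-name-per-directive `#include name` syntax are assumptions.

```rust
use std::collections::{HashMap, HashSet};

/// Recursively expand `#include name` lines, emitting each source at most
/// once. `seen` records every source already injected, so a second include
/// of the same name is silently skipped. (Illustrative sketch.)
pub fn expand(
    name: &str,
    sources: &HashMap<&str, &str>,
    seen: &mut HashSet<String>,
    out: &mut String,
) {
    // `insert` returns false if the name was already present.
    if !seen.insert(name.to_string()) {
        return;
    }
    let source = match sources.get(name) {
        Some(s) => s,
        None => return, // unknown include: ignored in this sketch
    };
    for line in source.lines() {
        if let Some(included) = line.trim().strip_prefix("#include ") {
            expand(included.trim(), sources, seen, out);
        } else {
            out.push_str(line);
            out.push('\n');
        }
    }
}
```

With this approach a shader that includes two brushes, each of which includes the same shared helper file, ends up with a single copy of the helper at the point of first inclusion.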
Differential Revision: https://phabricator.services.mozilla.com/D53651