We were locking composite surfaces, which were tracked within surfaces in frame_surfaces.
frame_surfaces was subsequently clipped and pruned. We then crawled frame_surfaces for
composite surfaces to unlock, which were unintentionally pruned by that point.
The solution to this mess is to just track composite surfaces that are locked independently
of frame_surfaces, so no more issues with clipping.
Differential Revision: https://phabricator.services.mozilla.com/D180609
We were locking composite surfaces, which were tracked within surfaces in frame_surfaces.
frame_surfaces was subsequently clipped and pruned. We then crawled frame_surfaces for
composite surfaces to unlock, which were unintentionally pruned by that point.
The solution to this mess is to just track composite surfaces that are locked independently
of frame_surfaces, so no more issues with clipping.
Differential Revision: https://phabricator.services.mozilla.com/D180609
Contrary to ld64, lld doesn't ignore libraries it's given on the command
line, and -lLLVM-14 ends up as a link error when using lld.
So instead of relying on the flags llvm-config outputs to be kind of
ignored, we replace them at the source by wrapping llvm-config itself.
Differential Revision: https://phabricator.services.mozilla.com/D178900
The old version that is currently used doesn't support lld for mac, and
tries to pass it flags for BFD ld (e.g. --as-needed).
For some reason meson checks fail because of some CFLAGS not being used
during the checks being reported as an error, so we make those errors
quiet.
Differential Revision: https://phabricator.services.mozilla.com/D178899
We have seen reports that various websites, including twitter, perform
poorly on older Adreno devices due, to backdrop filter. We previously
encountered similar on Mali-G710 devices in bug 1809738, and it
appeared to be due to having to copy the contents of large
framebuffers--required to render the backdrop filter--to and from the
GPU's tile memory. On Mali we were able to avoid this penalty by
ensuring we performed an unscissored clear immediately after binding
the framebuffer, allowing the driver to omit initalizing the contents
of tile memory prior to rendering.
It's plausible that older Adreno drivers are not clever enough to be
able to make this optimization. However, there exists an extension
QCOM_tiled_rendering, which allows us to explicitly tell the driver
which subregion of a render target we are rendering too, and whether
it must be pre-initilized or post-resolved.
This patch makes use of this extension when rendering to color and
picture cache targets. In both cases we supply the region that is
being rendered and must only resolve the color attachment back to main
memory. In most cases we can additionally avoid initializing tile
memory prior to rendering, with the exception being in
draw_color_target() when we do not perform an initial clear, in which
case we must initialize the color attachment.
Unfortunately there appears to be a driver bug with this extension on
a variety of Adreno 5xx and 6xx devices, specifically on driver
version V@0490, resulting in incorrect rendering. We therefore block
usage of the extension on affected drivers, but use it elsewhere when
supported.
This results in a significant performance improvement on twitter when
tested on a Nexus 5 (Adreno 330) device.
Differential Revision: https://phabricator.services.mozilla.com/D176154
We have a driver-specific regression related to the shader optimizer
and also a small performance regression in a couple of tests. Rather
than back out the (large) patch set, disable the new path for now
to provide time to investigate and fix those issues.
Differential Revision: https://phabricator.services.mozilla.com/D176796
We have seen reports that various websites, including twitter, perform
poorly on older Adreno devices due, to backdrop filter. We previously
encountered similar on Mali-G710 devices in bug 1809738, and it
appeared to be due to having to copy the contents of large
framebuffers--required to render the backdrop filter--to and from the
GPU's tile memory. On Mali we were able to avoid this penalty by
ensuring we performed an unscissored clear immediately after binding
the framebuffer, allowing the driver to omit initalizing the contents
of tile memory prior to rendering.
It's plausible that older Adreno drivers are not clever enough to be
able to make this optimization. However, there exists an extension
QCOM_tiled_rendering, which allows us to explicitly tell the driver
which subregion of a render target we are rendering too, and whether
it must be pre-initilized or post-resolved.
This patch makes use of this extension when rendering to color and
picture cache targets. In both cases we supply the region that is
being rendered and must only resolve the color attachment back to main
memory. In most cases we can additionally avoid initializing tile
memory prior to rendering, with the exception being in
draw_color_target() when we do not perform an initial clear, in which
case we must initialize the color attachment.
This results in a significant performance improvement on twitter when
tested on a Nexus 5 (Adreno 330) device.
Differential Revision: https://phabricator.services.mozilla.com/D176154
We have seen reports that various websites, including twitter, perform
poorly on older Adreno devices due, to backdrop filter. We previously
encountered similar on Mali-G710 devices in bug 1809738, and it
appeared to be due to having to copy the contents of large
framebuffers--required to render the backdrop filter--to and from the
GPU's tile memory. On Mali we were able to avoid this penalty by
ensuring we performed an unscissored clear immediately after binding
the framebuffer, allowing the driver to omit initalizing the contents
of tile memory prior to rendering.
It's plausible that older Adreno drivers are not clever enough to be
able to make this optimization. However, there exists an extension
QCOM_tiled_rendering, which allows us to explicitly tell the driver
which subregion of a render target we are rendering too, and whether
it must be pre-initilized or post-resolved.
This patch makes use of this extension when rendering to color and
picture cache targets. In both cases we supply the region that is
being rendered and must only resolve the color attachment back to main
memory. In most cases we can additionally avoid initializing tile
memory prior to rendering, with the exception being in
draw_color_target() when we do not perform an initial clear, in which
case we must initialize the color attachment.
This results in a significant performance improvement on twitter when
tested on a Nexus 5 (Adreno 330) device.
Differential Revision: https://phabricator.services.mozilla.com/D176154
We have seen reports that various websites, including twitter, perform
poorly on older Adreno devices due, to backdrop filter. We previously
encountered similar on Mali-G710 devices in bug 1809738, and it
appeared to be due to having to copy the contents of large
framebuffers--required to render the backdrop filter--to and from the
GPU's tile memory. On Mali we were able to avoid this penalty by
ensuring we performed an unscissored clear immediately after binding
the framebuffer, allowing the driver to omit initalizing the contents
of tile memory prior to rendering.
It's plausible that older Adreno drivers are not clever enough to be
able to make this optimization. However, there exists an extension
QCOM_tiled_rendering, which allows us to explicitly tell the driver
which subregion of a render target we are rendering too, and whether
it must be pre-initilized or post-resolved.
This patch makes use of this extension when rendering to color and
picture cache targets. In both cases we supply the region that is
being rendered and must only resolve the color attachment back to main
memory. In most cases we can additionally avoid initializing tile
memory prior to rendering, with the exception being in
draw_color_target() when we do not perform an initial clear, in which
case we must initialize the color attachment.
This results in a significant performance improvement on twitter when
tested on a Nexus 5 (Adreno 330) device.
Depends on D176153
Differential Revision: https://phabricator.services.mozilla.com/D176154
The size of the surface is irrelevant for external surfaces, since it is
handled by the native compositor with each call to AttachExternalImage. By
removing the size from the hash key, this patch ensures that real or
imagined changes to surface sizes for external surfaces will no longer
call create_compositor_external_surface, thrashing the resources of the
native compositor.
Differential Revision: https://phabricator.services.mozilla.com/D175495
This adds the new infrastructure for rendering masked primitives
and uses it for simple rectangle primitives. Follow up patches
will port other primitives to it (and transformed rectangles).
Instead of rendering an alpha mask and then applying that during
picture cache rendering of content, the underlying content is
drawn to an off-screen surface, and the mask is applied on
top of that via multiplicative blending.
This is particularly helpful for applying masks to dynamically
rendered pictures in future, as we can apply the mask over the
already rendered picture without allocating an extra surface.
Since the content and mask is rendered together to a surface,
we can take advantage of this in future by caching the result
in the texture cache, rather than a temporary render target.
This means we don't need to redraw clip masks for this content
each time the surrounding area is invalidated.
Since the clip-mask is rendered in to the off-screen surface,
it is cheaper and simpler to composite the content in to the
main scene, avoiding an extra texture fetch and some tricky
fragment shader logic to sample the correct part of the mask.
To reduce the number of off-screen pixels that get drawn, the
system supports splitting the content up in to a series of
segments. This can either be a 9-patch, for the simple and
common case of a single rounded clip, or a tile grid across
the primitive. The tile grid can make it much faster to apply
large image masks, where there are often large areas that we
can determine are not affected by the mask image.
Differential Revision: https://phabricator.services.mozilla.com/D173095
This adds the new infrastructure for rendering masked primitives
and uses it for simple rectangle primitives. Follow up patches
will port other primitives to it (and transformed rectangles).
Instead of rendering an alpha mask and then applying that during
picture cache rendering of content, the underlying content is
drawn to an off-screen surface, and the mask is applied on
top of that via multiplicative blending.
This is particularly helpful for applying masks to dynamically
rendered pictures in future, as we can apply the mask over the
already rendered picture without allocating an extra surface.
Since the content and mask is rendered together to a surface,
we can take advantage of this in future by caching the result
in the texture cache, rather than a temporary render target.
This means we don't need to redraw clip masks for this content
each time the surrounding area is invalidated.
Since the clip-mask is rendered in to the off-screen surface,
it is cheaper and simpler to composite the content in to the
main scene, avoiding an extra texture fetch and some tricky
fragment shader logic to sample the correct part of the mask.
To reduce the number of off-screen pixels that get drawn, the
system supports splitting the content up in to a series of
segments. This can either be a 9-patch, for the simple and
common case of a single rounded clip, or a tile grid across
the primitive. The tile grid can make it much faster to apply
large image masks, where there are often large areas that we
can determine are not affected by the mask image.
Differential Revision: https://phabricator.services.mozilla.com/D173095
Since landing bug 1823411 we have been receiving crash reports on a
variety of Mali-T devices when attempting to compile the brush_blend
shader. This appears to be due to changing v_color_mat to mediump,
though the reason why that crashes is currently unknown. This patch
reverts it to highp to avoid the crash.
This is being landed as-is due to being so late in the cycle, in order
to prevent crashes making it to beta. Further work should be to
determine precisely what conditions cause the crash, and add a test to
ensure we do not encounter it again.
Differential Revision: https://phabricator.services.mozilla.com/D174722