Note: Sorry in advance that this patch is so big. Unfortunately
splitting it up would create lots of redundant changes.
This should be the last big refactoring for the Wayland compositor
backend for now.
Up until now SurfacePoolWayland was a pool of actual `wl_surface`s,
as before bug 1718569 we had no direct access to `wl_buffer`s when
using EGL. However, the way `SurfacePoolCA` manages native surfaces
is closer to what in Wayland terminology would be a buffer pool:
buffers are heavy-weight and expensive to allocate, while
`wl_surface` objects are cheap to recreate.
So instead of having a pool of surfaces, each of them having its
own pool of buffers, make `wl_surface`s part of tiles and make
`SurfacePoolWayland` manage `wl_buffer`s (in the form of SHM- or
DMABuf buffers). This will allow us to share buffers (especially
depth buffers) more efficiently, reducing VRAM usage and allocation
times.
Apart from that it will also simplify our tile management logic.
Most importantly, we'll need to reorder `wl_surface`s less often and
in simpler ways (no `place_below` the parent surface), and can also drop
reattaching subsurfaces to compositors. Especially the former will
likely decrease CPU time in compositors.
Overall this patch makes `NativeLayerWayland` behave more like
`NativeLayerCA` while taking in lessons learned from
`WindowSurfaceWaylandMB`.
Differential Revision: https://phabricator.services.mozilla.com/D119993
The pending primitives have already been snapped, so this only
has an effect if the shadow offset is fractional. If that's the
case, it's probably reasonable to either (a) not snap at all,
or (b) round the shadow offset earlier.
Differential Revision: https://phabricator.services.mozilla.com/D121500
Currently each VAO owns its own VBO for per-instance array data. Prior
to submitting each draw call that instance buffer is orphaned and a
new one allocated at exactly the right size to contain the batch's
data. This frequent reallocation appears to cause a rendering glitch
on some Adreno 4xx devices, likely due to a bug in the driver's
internal allocation/recycling code. It can also be expensive on some
devices, as the drivers struggle with the overheads of allocating many
small buffers much more than the cost of actually transferring the
data.
This patch makes it so that the VAOs no longer own an instance buffer
each. Instead, they share a pool of buffers. This pool always
allocates buffers large enough to hold the largest batch we
allow. update_vao_instances() now writes data to the pool's current
buffer until that buffer is full, and then either an old one will be
recycled or a new one allocated. After writing to the buffer it must
be bound to the VAO at the correct offset, and then the draw call is
submitted.
By manually managing the buffer lifecycles ourselves we avoid the
glitches on Adreno 4xx, and also improve performance on some devices.
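The pooling scheme described above can be sketched without any GL calls. This is a minimal illustration, not the patch's code: `PoolBuffer`, `BatchLocation`, `InstanceBufferPool`, `write` and `end_frame` are hypothetical names, a `Vec<u8>` stands in for a VBO, and recycling is modelled as simply reusing buffers from the start of the pool each frame (a real implementation also has to know the GPU is done with a buffer before reusing it).

```rust
/// Hypothetical stand-in for a GPU buffer; real code would hold a VBO id.
struct PoolBuffer {
    data: Vec<u8>,
}

/// Where a batch ended up: which buffer, and at what byte offset.
/// The VAO would be bound to this buffer at this offset before drawing.
#[derive(Debug, PartialEq)]
struct BatchLocation {
    buffer_index: usize,
    offset: usize,
    len: usize,
}

struct InstanceBufferPool {
    /// Every buffer has this size: the largest batch we allow.
    buffer_size: usize,
    buffers: Vec<PoolBuffer>,
    /// Index of the buffer currently being filled.
    current: usize,
    /// Bytes already used in the current buffer.
    used: usize,
}

impl InstanceBufferPool {
    fn new(buffer_size: usize) -> Self {
        InstanceBufferPool { buffer_size, buffers: Vec::new(), current: 0, used: 0 }
    }

    /// Append a batch to the current buffer. When it does not fit, move on
    /// to the next buffer, reusing an old one if the pool has one and
    /// allocating a new one otherwise.
    fn write(&mut self, batch: &[u8]) -> BatchLocation {
        assert!(batch.len() <= self.buffer_size, "batch exceeds largest allowed size");
        if self.buffers.is_empty() {
            self.buffers.push(PoolBuffer { data: vec![0; self.buffer_size] });
        } else if self.used + batch.len() > self.buffer_size {
            self.current += 1;
            self.used = 0;
            if self.current == self.buffers.len() {
                self.buffers.push(PoolBuffer { data: vec![0; self.buffer_size] });
            }
        }
        let offset = self.used;
        self.buffers[self.current].data[offset..offset + batch.len()].copy_from_slice(batch);
        self.used += batch.len();
        BatchLocation { buffer_index: self.current, offset, len: batch.len() }
    }

    /// Start filling from the first buffer again, so existing buffers are
    /// reused instead of being reallocated per draw call.
    fn end_frame(&mut self) {
        self.current = 0;
        self.used = 0;
    }
}
```

The point of the fixed buffer size is that the driver only ever sees a small number of same-sized allocations, rather than one exactly-sized allocation per draw call.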
Differential Revision: https://phabricator.services.mozilla.com/D121104
It's unused on mozilla-central, and Thunderbird can just use the canvas
frame as regular (X)HTML documents do, so just use a canvas frame instead
of an nsRootBoxFrame for XUL as well.
nsRootBoxFrame was needed because of various XUL-specific things like
tooltips and so on lived there. But with the move away from XUL, that
functionality has been added to nsCanvasFrame already, behind a
principal check instead.
This also allows simplifying our background propagation setup, which was
only half-working for XUL documents (this bug is a consequence of that).
With this, most of the callers of nsCSSRendering::IsCanvasFrame can go.
There are only two frames that would return true for that and
actually paint backgrounds (nsCanvasFrame and nsRootBoxFrame), so the
codepaths in display list building and painting can just check
frame->IsCanvasFrame() instead.
The remaining caller to that function is
nsContainerFrame::SyncWindowProperties, and the change is also legit, in
the sense that the only thing SyncWindowProperties() really cares about
is propagating the max/min-width constraints from the root element's
style to the view/widget, and the only frame that would return true from
IsCanvasFrame and have a view is the viewport frame which is the root of
the frame tree.
Differential Revision: https://phabricator.services.mozilla.com/D90846
We know from testing that the device valid rect itself can sometimes
be empty which produces the symptom that the compositor valid rect
is empty after quantization to i32. So it is sufficient to just check
if that device rect is empty, rather than having to use get_surface_rect
to reproduce the math downstream that produces the compositor valid
rect.
Differential Revision: https://phabricator.services.mozilla.com/D121453
This accomplishes 2 things:
1. Allows us to directly fetch the layersId of the process that is
autoscrolling, which avoids having to fetch it in AutoScrollChild and pass it
around. This fixes autoscrolling out-of-process frames with Fission enabled.
2. Makes it easier to handle autoscrolling of in-process documents, since that
can't happen through PBrowser.
Differential Revision: https://phabricator.services.mozilla.com/D120766
The mpsc channel used for reporting glyph rasterization results does not implement
Sync/Send, so the recommended usage pattern is to wrap the channel endpoints in a
Mutex to safeguard them. This, however, turns out to be slow in some cases. Instead,
just always use a crossbeam channel which is thread-safe to begin with and does not
require a Mutex.
Differential Revision: https://phabricator.services.mozilla.com/D121244
Some of the defines are outdated and never actually set, and the
remaining ones can be easily set through existing or easily added
ifdefs.
Differential Revision: https://phabricator.services.mozilla.com/D121064
It has been unsupported since bug 1323303, more than 4 years ago.
This removes MOZ_ENABLE_SKIA but keeps USE_SKIA for moz2d for now.
Differential Revision: https://phabricator.services.mozilla.com/D120933
If we encounter a device reset in
RenderCompositorD3D11SWGL::TileD3D11::Map, we should fail the call, and
rely upon the device reset checks at the end of a render pass to
recreate our compositor sessions.
Differential Revision: https://phabricator.services.mozilla.com/D121251
Some clang/rustc build setups seem to result in rounding scenarios where the
valid rect, once converted to i32, is somehow empty. This can cause downstream
problems if we actually try to render those tiles. Instead, just cull it before
it becomes a problem.
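A minimal sketch of how a non-empty float rect can turn empty once converted to i32, and of culling such tiles. The rect types, `quantize` and `visible_tiles` are hypothetical, and rounding each edge to the nearest integer is an assumption made for illustration; the actual conversion in the code may differ.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
struct RectF {
    x0: f32,
    y0: f32,
    x1: f32,
    y1: f32,
}

#[derive(Clone, Copy, Debug, PartialEq)]
struct RectI {
    x0: i32,
    y0: i32,
    x1: i32,
    y1: i32,
}

impl RectI {
    fn is_empty(&self) -> bool {
        self.x1 <= self.x0 || self.y1 <= self.y0
    }
}

/// Assumed quantization: round every edge to the nearest integer.
fn quantize(r: RectF) -> RectI {
    RectI {
        x0: r.x0.round() as i32,
        y0: r.y0.round() as i32,
        x1: r.x1.round() as i32,
        y1: r.y1.round() as i32,
    }
}

/// Keep only the tiles whose valid rect is still non-empty as i32.
fn visible_tiles(valid_rects: &[RectF]) -> Vec<RectI> {
    valid_rects
        .iter()
        .map(|r| quantize(*r))
        .filter(|r| !r.is_empty())
        .collect()
}
```

A rect spanning x = 10.2 to 10.4 has a positive float width, but both edges round to 10, so the integer rect has zero width and the tile is dropped.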
Differential Revision: https://phabricator.services.mozilla.com/D121144
Although this currently only contains the existing data u8 vec of
serialized bytes, it will allow us to expand this structure in
future to contain other byte arrays.
This will be used for storing other payload information, such as
interned primitive types or data updates.
Differential Revision: https://phabricator.services.mozilla.com/D121160
We have in the past encountered a driver bug on Adreno 3xx devices
where using a flat scalar varying causes the fragment shader output to
be incorrect. See bugs 1630356 and 1705433.
We have now encountered this in cs_border_solid too, due to
vMixColors. As before, packing the varying into a vector works around
the bug.
Differential Revision: https://phabricator.services.mozilla.com/D121211
While running SandboxTest on Windows/Debug setup we can encounter a GPU
process hanging at shutdown. This is due to a destruction race between
GPUParent and VideoBridgeParent, and we end up with one
CompositorThreadHolder reference that is not properly released, making
the GPU process wait for it.
This change aims at doing the release earlier to prevent such a
destruction race.
Differential Revision: https://phabricator.services.mozilla.com/D121045
This may be another source of failing OOP iframe document prints. I will try to
see whether it can be reproduced locally later.
Differential Revision: https://phabricator.services.mozilla.com/D121075
Before this patch the clear tiles were drawn last, which meant they would always be drawn on top of all transparent content. That was fine under the assumption that nothing is ever rendered on top of clear tiles. However, it turns out to be an issue with the Proton window modal darkening, which renders some semi-transparent black on top of the whole window (including the window controls on Windows 8, which use clear tiles). Since the occlusion culling treats clear tiles as opaque, they will cut through anything under them regardless of the drawing order. So we can render them before transparent tiles to get the correct result, which is to overwrite what's under the clear tile while still being able to render semi-transparent content on top.
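The resulting draw order can be sketched as a stable sort by tile kind. `TileKind` and `sort_for_drawing` are hypothetical names, and drawing opaque tiles first is an assumption of this sketch; the message only states that clear tiles now come before transparent ones.

```rust
/// Variant order doubles as draw order because `Ord` is derived.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum TileKind {
    Opaque,
    Clear,
    Transparent,
}

/// Sort (kind, tile id) pairs so clear tiles are drawn before transparent
/// ones. The sort is stable, so tiles of the same kind keep their
/// relative order.
fn sort_for_drawing(tiles: &mut Vec<(TileKind, u32)>) {
    tiles.sort_by_key(|&(kind, _)| kind);
}
```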
Differential Revision: https://phabricator.services.mozilla.com/D120844
Maybe<T*> is a redundant representation if there is no semantic
difference between Nothing() and Some(nullptr), and just makes
for cumbersome access syntax.
Differential Revision: https://phabricator.services.mozilla.com/D120568
Transforms stored in WebRenderLayerScrollData::mAncestorTransform would
always end up on the topmost hit-testing tree node generated by that
WebRenderLayerScrollData node. That topmost node may not correspond
to the ASR of the transform item(s) which are the source of the
mAncestorTransform. The transform being in the wrong place in the
hit-testing tree can in turn violate APZ's assumptions about different
nodes that scroll together having the same ancestor transform.
To resolve this, this patch stores the transform item's ASR in the
WebRenderLayerScrollData, and applies it to the matching hit-testing
tree node.
Differential Revision: https://phabricator.services.mozilla.com/D120567
This *mostly* gets us the latest WebIDL API of WebGPU. There are a few limits we are missing, and maybe some things I didn't notice.
But it gets us the new `GPUCanvasContext`, `GPUSupportedLimits`, and `GPUVertexStepMode`.
Differential Revision: https://phabricator.services.mozilla.com/D120764
When changing a reserve/push loop to a resize_with during review
I accidentally removed the clear. This can mean that we end up
with duplicate entries in the pass update arrays.
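A minimal sketch of why the clear matters. The function names and the `Vec<Vec<u32>>` shape are hypothetical, not the actual data structure: `resize_with` only creates the elements that are missing, so without a preceding `clear()` the entries left over from the previous use survive.

```rust
/// Reset the per-pass arrays correctly: drop old contents, then size.
fn reset_pass_updates(updates: &mut Vec<Vec<u32>>, pass_count: usize) {
    updates.clear();
    updates.resize_with(pass_count, Vec::new);
}

/// The buggy variant: existing inner vectors keep their stale entries.
fn reset_pass_updates_without_clear(updates: &mut Vec<Vec<u32>>, pass_count: usize) {
    updates.resize_with(pass_count, Vec::new);
}
```

With the buggy variant, pushing this frame's entries onto a pass that still holds last frame's entries is what yields the duplicates.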
Differential Revision: https://phabricator.services.mozilla.com/D120690
This seems to be what Chrome does on Windows, and should do the right
thing in other OSes.
On macOS we can do something better and look up the language-dependent
font. That requires further work, but probably shouldn't block this (as
long as system-ui is turned off by default at least).
Differential Revision: https://phabricator.services.mozilla.com/D120714
Alias -apple-system to it, and put it behind a pref for now. This is
pretty boring (read: uncontroversial hopefully) code. The follow-up work
is modifying StaticPresData to look up the fonts using system APIs,
probably. Maybe a bit more work if on macOS they can't be named.
Differential Revision: https://phabricator.services.mozilla.com/D119984
If invalidate_tile() is called outside of an enclosing begin_frame()/end_frame() pair, the tile's invalid flag can still be set to true when the next begin_frame()/end_frame() pair runs, which causes the problem. The flag needs to be cleared in begin_frame().
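A minimal sketch of the fix, with hypothetical `Tile` and `TileCompositor` types; only the three method names come from the message above, and having end_frame() return the number of invalid tiles is an assumption made so the effect is observable.

```rust
struct Tile {
    invalid: bool,
}

struct TileCompositor {
    tiles: Vec<Tile>,
}

impl TileCompositor {
    fn new(tile_count: usize) -> Self {
        TileCompositor {
            tiles: (0..tile_count).map(|_| Tile { invalid: false }).collect(),
        }
    }

    /// Clear every invalid flag so that nothing set outside a frame
    /// leaks into this one.
    fn begin_frame(&mut self) {
        for tile in &mut self.tiles {
            tile.invalid = false;
        }
    }

    fn invalidate_tile(&mut self, index: usize) {
        self.tiles[index].invalid = true;
    }

    /// Returns how many tiles were invalidated during this frame.
    fn end_frame(&self) -> usize {
        self.tiles.iter().filter(|tile| tile.invalid).count()
    }
}
```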
Differential Revision: https://phabricator.services.mozilla.com/D120823
Rayon's collect primitive is somehow establishing recursive dependencies on
waiting for task completion, such that if a lot of glyph jobs are submitted
all at once, this can result in huge recursive stack chains. This simplifies
the glyph job queuing to just send everything immediately over the result
channel, rather than waiting for all jobs in the batch (via collect). We then
rely upon sorting upon receipt to put everything back in a sane order.
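The send-immediately, sort-on-receipt pattern can be sketched as follows. This is an illustration under stated assumptions: plain `std::thread` workers and a `std::sync::mpsc` channel stand in for the Rayon jobs and the real result channel, `rasterize_all` is a hypothetical name, and doubling the glyph id stands in for rasterization.

```rust
use std::sync::mpsc;
use std::thread;

/// Run one job per glyph. Each job sends its result as soon as it is
/// done, tagged with its submission index; the receiver sorts by that
/// index to restore a deterministic order.
fn rasterize_all(glyph_ids: &[u32]) -> Vec<(usize, u32)> {
    let (tx, rx) = mpsc::channel();
    let mut workers = Vec::new();

    for (index, &glyph) in glyph_ids.iter().enumerate() {
        let tx = tx.clone();
        workers.push(thread::spawn(move || {
            let rasterized = glyph * 2; // placeholder for the real work
            tx.send((index, rasterized)).unwrap();
        }));
    }
    // Drop the original sender so the receiving loop ends once every
    // worker has sent its result and dropped its clone.
    drop(tx);

    let mut results: Vec<(usize, u32)> = rx.iter().collect();
    results.sort_by_key(|&(index, _)| index);

    for worker in workers {
        worker.join().unwrap();
    }
    results
}
```

No job waits on any other job here, so there is no chain of nested waits; ordering is handled entirely on the receiving side.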
Differential Revision: https://phabricator.services.mozilla.com/D120520
Move plane splitter outside the picture state, meaning it doesn't
need to be retained in the picture primitive and passed around
during prepare pass. Another small step to simplifying the prepare
step (interim goal is to remove the recursion here, and then
introduce other prim types, such as a 3d rendering context, that
are able to "contain" child primitives).
Differential Revision: https://phabricator.services.mozilla.com/D120423
In future, we want to persist a lot more of the information about
primitive instances between display lists. This will allow us to
implement a number of optimizations and improvements to the scene
and frame building code (such as reducing per-frame per-prim work
that is typically mostly redundant, supporting prims other than
pictures that can contain child primitives, unifying and optimizing
primitive dependency updates).
As a step towards that, this patch introduces a picture graph which
is very similar to the idea of the render task graph builder. With
this in place, we're able to run picture updates without using
recursion, and ensuring we update pictures in the correct order
that the pass requires. This will allow us to un-tangle a lot of
the existing scene building, visibility, prep and batching pass
complexity, in order to be able to implement the above.
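One way to order picture updates without recursion is to treat the pictures as a dependency graph and walk it with an explicit work list. The sketch below is not the patch's implementation: `update_order` is a hypothetical function, pictures are plain indices, and it assumes for illustration that a pass needs children updated before their parents.

```rust
/// `children[i]` lists the child pictures of picture `i`.
/// Returns an order in which every picture appears after all of its
/// children, computed iteratively (Kahn's algorithm).
fn update_order(children: &[Vec<usize>]) -> Vec<usize> {
    let count = children.len();

    // Number of not-yet-updated children per picture.
    let mut pending: Vec<usize> = children.iter().map(|c| c.len()).collect();

    // Reverse edges: for each picture, the pictures that contain it.
    let mut parents: Vec<Vec<usize>> = vec![Vec::new(); count];
    for (parent, child_list) in children.iter().enumerate() {
        for &child in child_list {
            parents[child].push(parent);
        }
    }

    // Leaf pictures have nothing to wait for.
    let mut ready: Vec<usize> = (0..count).filter(|&i| pending[i] == 0).collect();
    let mut order = Vec::with_capacity(count);

    while let Some(picture) = ready.pop() {
        order.push(picture);
        for &parent in &parents[picture] {
            pending[parent] -= 1;
            if pending[parent] == 0 {
                ready.push(parent);
            }
        }
    }
    order
}
```

Because the work list is an ordinary `Vec`, the depth of the picture tree no longer affects stack usage.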
Differential Revision: https://phabricator.services.mozilla.com/D120191
On some Adreno 4xx devices with older drivers we have seen cases where
rendering of masks to alpha targets appears to have no effect. This
results in stale mask data being used for clipping resulting in, for
example, rounded corners appearing as rectangles or strange shapes
rather than solid colours.
This patch performs an unscissored glClear of an alpha target prior to
rendering to it on affected devices, which works around the bug.
Differential Revision: https://phabricator.services.mozilla.com/D120465
It leads to OpenGL errors, as we must initialize a webrender Device in
order to initialize the shaders, and this Device attempts to use
OpenGL commands not supported by the GL context. For example the GL
context may only support GLES 2, hence why we are using SWGL in the
first place.
Differential Revision: https://phabricator.services.mozilla.com/D120356
In the cs_blur fragment shader there is a for loop with a number of
iterations determined by a flat varying. On some old Adreno drivers
this causes severe issues including hangs and crashes. These can be
avoided by clamping the number of iterations to a statically known
value.
Differential Revision: https://phabricator.services.mozilla.com/D120281