This patch removes more main thread dependencies from the content side
of WebGPU. Instead of issuing a resource update for an external image,
we now use an async image pipeline in conjunction with
CompositableInProcessManager from part 1. This allows us to update the
HTMLCanvasElement bound to the WebGPU device without having to go
through the main thread, or even the content process after the swap
chain update / readback has been requested.
Differential Revision: https://phabricator.services.mozilla.com/D138887
For WebGPU, we produce the textures in the compositor process and the
content process doesn't need to be that involved except for hooking up
the texture to the display list. Currently this is done via an external
image ID.
Given that WebGPU needs to work with OffscreenCanvas, it would be best
if its display pipeline was consistent whether it was gotten from an
HTMLCanvasElement, OffscreenCanvas on the main thread, or on a worker
thread. As such, using an AsyncImagePipeline would be best.
However there is no real need to bounce the handles across process
boundaries. Hence this patch which adds CompositableInProcessManager.
This static class is responsible for collecting WebRenderImageHost
objects backed by TextureHost objects which do not leave the compositor
process. This will allow WebGPUParent to schedule compositions directly
in future patches.
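
As a rough illustration of the idea only, a static registry that keeps host objects inside one process could look like the sketch below. `ImageHost`, `InProcessRegistry` and the 64-bit handle are hypothetical stand-ins, not the real WebRenderImageHost or CompositableInProcessManager interfaces.

```cpp
#include <cstdint>
#include <memory>
#include <mutex>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a host object that never leaves the process.
struct ImageHost {
  int width = 0;
  int height = 0;
};

// Hypothetical static registry: every member is static, so any thread in
// the process can look up a host by handle without IPC.
class InProcessRegistry {
 public:
  // Registers aHost under aHandle. Returns false if the handle is taken.
  static bool Add(uint64_t aHandle, std::shared_ptr<ImageHost> aHost) {
    State& state = Get();
    std::lock_guard<std::mutex> lock(state.mutex);
    return state.hosts.emplace(aHandle, std::move(aHost)).second;
  }

  // Returns the host registered under aHandle, or nullptr if there is none.
  static std::shared_ptr<ImageHost> Find(uint64_t aHandle) {
    State& state = Get();
    std::lock_guard<std::mutex> lock(state.mutex);
    auto it = state.hosts.find(aHandle);
    if (it == state.hosts.end()) {
      return nullptr;
    }
    return it->second;
  }

  // Drops the registry's reference to the host, if any.
  static void Release(uint64_t aHandle) {
    State& state = Get();
    std::lock_guard<std::mutex> lock(state.mutex);
    state.hosts.erase(aHandle);
  }

 private:
  struct State {
    std::mutex mutex;
    std::unordered_map<uint64_t, std::shared_ptr<ImageHost>> hosts;
  };
  static State& Get() {
    static State sState;
    return sState;
  }
};
```

A producer would call `Add()` once it has a host, and a consumer on another thread would call `Find()` with the same handle.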
Differential Revision: https://phabricator.services.mozilla.com/D138588
Most of the support for presenting a WebGLFramebuffer to a swap chain existed as part of the
mechanism for opaque WebXR framebuffer support. However, such "opaque" framebuffers are meant
to be opaque in the sense that their attachments can't be inspected or changed, which does
not provide the requisite level of control for efficiently implementing Canvas2D snapshots.
To this end, the existing Present mechanism is slightly extended to allow presenting to the
swap chain already present in WebGLFramebuffer without the existence of a corresponding
MozFramebuffer.
This also fixes a bug: AsWebgl() was no longer being used in CanvasRenderer, so a new
mechanism that routes GetFrontBuffer() was needed to fix the code rot.
There are also some efforts to remove a couple of redundant copies I noticed in profiles
along the way.
Differential Revision: https://phabricator.services.mozilla.com/D138119
This patch ensures that we only update the external image resource for
WebGPU when there has been an actual change for the resource. In order
to guarantee this, we wait for the present to complete, and only then
issue the update. WebRenderBridgeChild::SendResourceUpdates will also
trigger a frame generation if any resources were changed, which means we
don't need to trigger a paint on the frame itself anymore.
Note that we still have a race condition when we write into the
MemoryTextureHost while in PresentCallback, and the renderer thread may
be accessing the pixel data to upload to the GPU.
Differential Revision: https://phabricator.services.mozilla.com/D138349
The current GPUVideoTextureHost, which wraps BufferTextureHost, does not create additional copies of the texture data, so we can remove the call to DisableExternalTextures() in GPUVideoTextureHost::EnsureWrappedTextureHost().
IsWrapingBufferTextureHost() is added to handle the case where a BufferTextureHost is wrapped by GPUVideoTextureHost.
Differential Revision: https://phabricator.services.mozilla.com/D138227
This is a mechanical change which was performed by a script based on the
contents of direct_call.py, and then manually checked over to fix
various rewriting bugs caused by my glorified sed script. See the
previous part for more context on the change.
Differential Revision: https://phabricator.services.mozilla.com/D137227
This simplifies the logic around descriptors significantly, which is
especially useful considering how few places use the type. There is a
small change required on Windows to create the NamedPipe directly and
transfer around each end's handle, rather than connecting between
processes after the fact.
A named pipe has to be used, rather than an anonymous pipe, as
bidirectional communication is required.
Differential Revision: https://phabricator.services.mozilla.com/D130381
Bug 1742985 added the test helper_zoom_after_gpu_process_restart.html,
but it doesn't actually get run on any platform with the GPU process
enabled. (Due to bug 1495580 on Windows, and because the GPU process
isn't yet enabled on Android.)
The test kills the GPU process, then tries to wait for it to be
restarted before proceeding. However, the function
ensureGPUProcessReadyForTests doesn't always work as intended, as the
GPUProcessManager may not have yet noticed that the process has been
killed, and therefore may return immediately from EnsureGPUReady.
This patch removes the buggy ensureGPUProcessReadyForTests function,
and instead makes the test wait for the "compositor-reinitialized"
topic to be observed.
Differential Revision: https://phabricator.services.mozilla.com/D138125
Before the fix for Bug 1713276, the BufferTextureData of ffmpeg-decoded video was recycled.
ShmemTextureData::CropYCbCrPlanes() changed the BufferDescriptor, which blocked the BufferTextureData from being recycled. The BufferDescriptor needs to be restored to its original value before recycling.
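
The restore-before-recycle step can be sketched as follows. This is a minimal illustration that assumes recycling compares the current descriptor against the requested one. `PlaneDescriptor` and `RecyclableBuffer` are hypothetical names, not the real BufferDescriptor or BufferTextureData types.

```cpp
// Hypothetical, simplified descriptor: only the plane size is modelled.
struct PlaneDescriptor {
  int width;
  int height;
};

// Hypothetical buffer that remembers the descriptor it was allocated with.
class RecyclableBuffer {
 public:
  explicit RecyclableBuffer(PlaneDescriptor aAllocated)
      : mAllocated(aAllocated), mCurrent(aAllocated) {}

  // Cropping changes the descriptor that consumers see.
  void Crop(int aWidth, int aHeight) {
    mCurrent = PlaneDescriptor{aWidth, aHeight};
  }

  // Assumption: a buffer is reused only if its descriptor matches the request.
  bool CanRecycleFor(const PlaneDescriptor& aRequest) const {
    return mCurrent.width == aRequest.width &&
           mCurrent.height == aRequest.height;
  }

  // Undo the crop so the buffer matches allocation requests again.
  void RestoreBeforeRecycle() { mCurrent = mAllocated; }

 private:
  PlaneDescriptor mAllocated;
  PlaneDescriptor mCurrent;
};
```

In this model a cropped buffer fails the match until `RestoreBeforeRecycle()` is called, which mirrors the recycling failure described above.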
Differential Revision: https://phabricator.services.mozilla.com/D137829
For all purposes, this is the same as devicePixelRatio. It was meant to
skip the resistFingerprinting check the devicePixelRatio getter does,
but we do that now using CallerType in WebIDL, so if we cared about that
for these tests (which we don't) we could just do
SpecialPowers.wrap(window).devicePixelRatio.
As a follow-up we could move the NoOverride to window for symmetry. But
it's only used by devtools touch simulation, so I'm not sure it's worth it.
Differential Revision: https://phabricator.services.mozilla.com/D138021
Some versions of Mesa, including what we currently use on CI, need more
stack memory than what we currently provide on the compositor thread in
order to pass the WebGL test suite in out-of-process mode. This patch
increases the limit and re-enables the previously broken tests.
(Testing on Try found that the threshold was somewhere between 384k and
448k, but because this depends on a third-party library and content
controlled input, I'm giving it 512k so we have some safety margin.)
Differential Revision: https://phabricator.services.mozilla.com/D137787
Similar to PWebGL, we want PCanvasManager to manage the PWebGPU
protocol. This will allow us to reuse the machinery that works for both
the main thread, and arbitrary worker threads to create PWebGPU
protocols.
For now, the only owner is still the main thread, so it should work much
as it does with PCompositorBridge.
This patch also introduces some quality of life changes, such as making
the protocol ref-counted, and avoiding reinventing the wheel for
CanSend() for IPDL actors.
Differential Revision: https://phabricator.services.mozilla.com/D134097
This change mitigates the gap between the external_scroll_offset reported by
the main thread and the scroll_offset reported by APZ.
Some wrench reftests for this change are in the next commit.
Differential Revision: https://phabricator.services.mozilla.com/D133444
The reason why the global ScrollGeneration::sCounter doesn't work for the APZ
case is that there is one APZ sampler thread per top-level browser window, so
if there are multiple browser windows at the same time, the sCounter will be
mutated from different sampler threads.
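
One way to avoid sharing a single counter between sampler threads is to give each window its own. The sketch below assumes that approach. `WindowGenerationCounter` is a hypothetical name and does not reflect the real ScrollGeneration API.

```cpp
#include <cstdint>

// Hypothetical per-window generation source. Each instance is only ever
// touched by the sampler thread of the window that owns it, so no
// cross-thread synchronization is modelled here.
class WindowGenerationCounter {
 public:
  // Returns a value strictly greater than any value this instance has
  // returned before.
  uint64_t Next() { return ++mCounter; }

 private:
  uint64_t mCounter = 0;
};
```

With one instance per window, the generations handed out for one window are unaffected by scrolling in another.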
Differential Revision: https://phabricator.services.mozilla.com/D133439
This was originally implemented in D111662 to work around a crash
that otherwise would get triggered in Gnome/Mutter.
The fix for that crash has been available for a couple of months now,
so remove the workaround again.
This may help prevent flickering for menus in certain situations.
Differential Revision: https://phabricator.services.mozilla.com/D137251
Rounding the values to integers appears to be too aggressive in some
situations. The rounding was introduced to work around floating
point errors under the assumption that we'd always end up with integer
values. However, apparently we sometimes compute values `<0.5`, making
us end up with `0`, triggering a protocol error.
Replace that with an intersection with the actual buffer size. This
will slightly increase wire communication, but should avoid the
crashes.
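
To make the failure mode concrete, the sketch below contrasts nearest-integer rounding with an intersection against the buffer bounds. Rounding outward before intersecting is an assumption of this sketch, and `RectF`, `RectI`, `RoundToNearest` and `ClampToBuffer` are hypothetical names rather than the code touched by the patch.

```cpp
#include <algorithm>
#include <cmath>

struct RectF { double x, y, w, h; };
struct RectI { int x, y, w, h; };

// Nearest-integer rounding: a size below 0.5 collapses to 0.
RectI RoundToNearest(const RectF& aRect) {
  return RectI{static_cast<int>(std::lround(aRect.x)),
               static_cast<int>(std::lround(aRect.y)),
               static_cast<int>(std::lround(aRect.w)),
               static_cast<int>(std::lround(aRect.h))};
}

// Alternative: grow the rect to whole pixels, then intersect it with the
// buffer bounds [0, aBufW) x [0, aBufH). A non-empty rect that overlaps the
// buffer always keeps a size of at least 1.
RectI ClampToBuffer(const RectF& aRect, int aBufW, int aBufH) {
  int x0 = std::max(0, static_cast<int>(std::floor(aRect.x)));
  int y0 = std::max(0, static_cast<int>(std::floor(aRect.y)));
  int x1 = std::min(aBufW, static_cast<int>(std::ceil(aRect.x + aRect.w)));
  int y1 = std::min(aBufH, static_cast<int>(std::ceil(aRect.y + aRect.h)));
  return RectI{x0, y0, std::max(0, x1 - x0), std::max(0, y1 - y0)};
}
```

The clamped rect may cover slightly more area than the rounded one, which matches the small increase in wire communication mentioned above.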
Differential Revision: https://phabricator.services.mozilla.com/D137249
It's valid to move their computations before the recursion because
their inputs (the current display item and the stacking context's
deferred transform item) do not change over the course of the
recursive call.
Depends on D136827
Differential Revision: https://phabricator.services.mozilla.com/D136828
We are moving its computation before a balanced pair of
mAsrStack.push_back() and mAsrStack.pop() calls, so its
value remains the same.
Depends on D136826
Differential Revision: https://phabricator.services.mozilla.com/D136827
Previously the process startup timestamp was computed lazily but this caused
some issues with some of our static analysis infra (see bug 1678152). This
moves computing the timestamp early during process startup and makes it happen
unconditionally.
Differential Revision: https://phabricator.services.mozilla.com/D136406
WebRender retains about 50 MB of memory for every window. When switching
between lots of tabs on Android (where 1 tab = 1 window), this can cause
problems as the app will consume a significant amount of memory.
To avoid this problem, we send a memory pressure event whenever a session is
deactivated, which signals to WebRender that it should deallocate the memory.
This message is sent on a delay of 10s to avoid interfering with tab switching,
and we cancel the message if the tab becomes active again before we fire the
memory pressure event.
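
The delay-and-cancel behaviour can be sketched with an explicit clock, as below. `DelayedMemoryPressure` and its methods are hypothetical names, and the caller is assumed to poll with the current time in milliseconds. The real implementation may schedule the message differently.

```cpp
#include <cstdint>

// Hypothetical helper that decides when a delayed memory pressure event
// should fire for one session.
class DelayedMemoryPressure {
 public:
  explicit DelayedMemoryPressure(int64_t aDelayMs) : mDelayMs(aDelayMs) {}

  // The session was deactivated: arm the timer.
  void OnDeactivated(int64_t aNowMs) {
    mPending = true;
    mDeadlineMs = aNowMs + mDelayMs;
  }

  // The session became active again: cancel any pending event.
  void OnActivated() { mPending = false; }

  // Returns true exactly once per deactivation, when the delay has elapsed
  // without the session being reactivated.
  bool Poll(int64_t aNowMs) {
    if (!mPending || aNowMs < mDeadlineMs) {
      return false;
    }
    mPending = false;
    return true;
  }

 private:
  int64_t mDelayMs;
  int64_t mDeadlineMs = 0;
  bool mPending = false;
};
```

Constructing the helper with a delay of 10000 ms matches the 10s delay described above.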
Co-Authored-By: Cathy Lu <calu@mozilla.com>
Co-Authored-By: Jonathan Almeida [:jonalmeida] <jonalmeida942@gmail.com>
Differential Revision: https://phabricator.services.mozilla.com/D136965
Confirmed that it fails without the patch with:
```
Expected at least one hit result in the APZTestData
SimpleTest.ok@SimpleTest/SimpleTest.js:417:16
runSubtestsSeriallyInFreshWindows/</advanceSubtestExecution/spawnTest/w.ok@gfx/layers/apz/test/mochitest/apz_test_utils.js:479:32
hitTest@gfx/layers/apz/test/mochitest/apz_test_utils.js:833:5
test@gfx/layers/apz/test/mochitest/helper_hittest_fixed_item_over_oop_iframe.html:48:18
```
Differential Revision: https://phabricator.services.mozilla.com/D136912
This is no longer necessary as the Quantum DOM project is no longer
happening, and removing support simplifies various components inside of
IPDL.
As some code used the support to get a `nsISerialEventTarget` for an
actor's worker thread, that method was replaced with a method which
instead pulls the nsISerialEventTarget from the MessageChannel and
should work on all actors.
Differential Revision: https://phabricator.services.mozilla.com/D135411
Currently, DMABuf modifiers are not correctly implemented for YUV surfaces: when a modifier is present, we use it
for all planes to create the EGLImage.
In this patch we import modifiers for all used YUV planes and use the matching modifier for each plane.
Differential Revision: https://phabricator.services.mozilla.com/D136782
This implements a custom buffer allocator for the ffmpeg decoder that lets it store decoded data in shmem, so the decoded data can be shared with the compositor process without an extra copy.
As the ffmpeg decoder needs a special alignment, the allocated buffer is larger than the actual image, so we need to crop the planes by telling the plane descriptor the correct plane size in order to display the image correctly.
Otherwise, showing the larger image causes a visibly incorrect border on the right and bottom of the actual image.
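
The relationship between the aligned allocation and the displayed size can be sketched as below. The alignment value is supplied by the caller, and the 32 used in the usage note is only an example, not the decoder's actual requirement. `PlaneLayout` and `LayoutForDecoder` are hypothetical names.

```cpp
// Rounds aValue up to the next multiple of aAlign (aAlign must be > 0).
constexpr int AlignUp(int aValue, int aAlign) {
  return (aValue + aAlign - 1) / aAlign * aAlign;
}

// Hypothetical description of one plane: the size that was allocated for
// the decoder and the size that should actually be displayed.
struct PlaneLayout {
  int allocWidth;
  int allocHeight;
  int displayWidth;
  int displayHeight;
};

// The decoder writes into the aligned allocation, while the descriptor
// reports the cropped display size so the padding is never shown.
PlaneLayout LayoutForDecoder(int aWidth, int aHeight, int aAlign) {
  return PlaneLayout{AlignUp(aWidth, aAlign), AlignUp(aHeight, aAlign),
                     aWidth, aHeight};
}
```

For a 1920x1080 image with an example alignment of 32, the allocation is 1920x1088 and the extra 8 rows are the padding that would otherwise appear at the bottom of the image.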
This will help improve the performance of software decoding with ffmpeg and ffvpx, which covers h264 and vpx on Linux, and vpx on Windows and macOS.
In addition, here is a result [1] showing how much using shmem helps.
[1] https://bit.ly/3dy4rya
Differential Revision: https://phabricator.services.mozilla.com/D130220
Make VideoFramePool thread safe to avoid multiple access during software decode to DMABuf:
- Create Mutex for VideoFramePool access
- Mark surface as used when it's provided by VideoFramePool to avoid race conditions.
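
The two points above can be sketched as a pool that hands out surfaces under a lock and marks them used before the lock is released. `Surface` and `FramePool` are hypothetical stand-ins for the real VideoFramePool types.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical surface with a flag recording whether it is handed out.
struct Surface {
  bool used = false;
};

class FramePool {
 public:
  explicit FramePool(size_t aCount) {
    for (size_t i = 0; i < aCount; ++i) {
      mSurfaces.push_back(std::make_shared<Surface>());
    }
  }

  // Returns a free surface, or nullptr if all are in use. The surface is
  // marked used while the lock is still held, so two threads can never be
  // handed the same surface.
  std::shared_ptr<Surface> Acquire() {
    std::lock_guard<std::mutex> lock(mMutex);
    for (auto& surface : mSurfaces) {
      if (!surface->used) {
        surface->used = true;
        return surface;
      }
    }
    return nullptr;
  }

  // Marks the surface as free again.
  void Release(const std::shared_ptr<Surface>& aSurface) {
    std::lock_guard<std::mutex> lock(mMutex);
    aSurface->used = false;
  }

 private:
  std::mutex mMutex;
  std::vector<std::shared_ptr<Surface>> mSurfaces;
};
```

If the flag were set only after `Acquire()` returned, a second thread could pick the same surface in between, which is the race the second bullet guards against.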
Differential Revision: https://phabricator.services.mozilla.com/D135557