This patch just adds the plumbing to allow for baked in blocklist rules
or the downloadable blocklist to prevent certain configurations from
getting WebGPU. It does not add any rules.
It also changes us from allowing WebGPU only in nightly, including
tests, to not release and not beta. This allows try to run the WebGPU
tests as expected, since even try builds forked from mozilla-central are
not considered nightly builds by CI (or so it seems).
Differential Revision: https://phabricator.services.mozilla.com/D141682
A better solution would check against the same value as reported by gfxConfig, which takes the pref as well as whether webgpu was blocked (for example due to buggy drivers) into account, but this still is good sanity check and easy to uplift.
Differential Revision: https://phabricator.services.mozilla.com/D140535
A better solution would check against the same value as reported by gfxConfig, which takes the pref as well as whether webgpu was blocked (for example due to buggy drivers) into account, but this still is good sanity check and easy to uplift.
Differential Revision: https://phabricator.services.mozilla.com/D140535
SimulateDeviceReset() works differently from actual device reset handling. It seems better to make SimulateDeviceReset() more similar to actual device reset handling.
Differential Revision: https://phabricator.services.mozilla.com/D140161
Automatically generated rewrites of all ParamTraits and IPDLParamTraits
implementations in-tree to use IPC::Message{Reader,Writer}.
Differential Revision: https://phabricator.services.mozilla.com/D140004
We noticed a cold_view_nav_start regression on Fenix from enabling the
GPU process, and profiles showed time spent synchronously waiting for
the GPU process to launch. This occured because the compositor was
being created in nsWindow::Create, and as it requires the GPU process
to be running it had to block until launch completed. The process is
launched when the gfxPlatform is first initialized, but that was only
occuring immediately prior to creating the compositor, which did not
give it enough time to complete asynchronously.
This patch makes it so that we initialize the gfxPlatform slightly
earlier, and importantly delay creating the compositor until it is
actually required. This gives the process enough time to launch
asynchronously meaning we do not have to block.
We started deliberately creating the compositor early on Android
because of bug 1453501, to avoid a race condition where the compositor
didn't exist when RemoteLayerTreeOwner::Initialize was called, causing
us to use a basic layer manager. However, since bug 1741156 landed we
now create the compositor on-demand, meaning this is no longer a
possibility.
Delaying compositor creation can, however, uncover another race
condition. If the UICompositorControllerChild is opened on the UI
thread before the main thread is able to set its pointer to the
widget, then the java GeckoSession will never be notified that the
compositor has been opened, and composition will never be
resumed. This patch fixes this issue by setting the
UiCompositorControllerChild's widget pointer in its constructor rather
than immediately afterwards.
Differential Revision: https://phabricator.services.mozilla.com/D139842
We noticed a cold_view_nav_start regression on Fenix from enabling the
GPU process, and profiles showed time spent synchronously waiting for
the GPU process to launch. This occured because the compositor was
being created in nsWindow::Create, and as it requires the GPU process
to be running it had to block until launch completed. The process is
launched when the gfxPlatform is first initialized, but that was only
occuring immediately prior to creating the compositor, which did not
give it enough time to complete asynchronously.
This patch makes it so that we initialize the gfxPlatform slightly
earlier, and importantly delay creating the compositor until it is
actually required. This gives the process enough time to launch
asynchronously meaning we do not have to block.
We started deliberately creating the compositor early on Android
because of bug 1453501, to avoid a race condition where the compositor
didn't exist when RemoteLayerTreeOwner::Initialize was called, causing
us to use a basic layer manager. However, since bug 1741156 landed we
now create the compositor on-demand, meaning this is no longer a
possibility.
Delaying compositor creation can, however, uncover another race
condition. If the UICompositorControllerChild is opened on the UI
thread before the main thread is able to set its pointer to the
widget, then the java GeckoSession will never be notified that the
compositor has been opened, and composition will never be
resumed. This patch fixes this issue by setting the
UiCompositorControllerChild's widget pointer in its constructor rather
than immediately afterwards.
Differential Revision: https://phabricator.services.mozilla.com/D139842
The changes in this patch stack will change how IPDL generated files are built,
such that these issues are now surfaced, especially during non-unified builds.
Differential Revision: https://phabricator.services.mozilla.com/D137165
Now that we send everything (except sometimes the user value
is sanitized) we should no longer perform this check.
This is also good because it eliminates security code you
have to have (and thus accidently omitting it is a
vulnerability) and changes it to security code that happens
automatically, and is enforced by the compiler (via mandatory
ctor argument.)
Differential Revision: https://phabricator.services.mozilla.com/D138684
This simplifies the number of negations needed,
and makes things easy to understand. I think
anyway; I know that without renaming it I made
several annoying-to-diagnose negation errors...
Differential Revision: https://phabricator.services.mozilla.com/D138682
While we do so, also add a boolean argument to indicate
if we are in a _Content_ process or some other type of
subprocess, which I expect we will need later.
Differential Revision: https://phabricator.services.mozilla.com/D138681
We want to eventually crash if a blocklisted preference
is accessed. In order to do this, we do need to
populate the preference in the pref hashmap of the
subprocess; otherwise when we look up a blocklisted
pref we will just not find anything. We could try to
put the blocklist check at that point; but this won't
work for StaticPrefs; we'd also need to put the blocklist
check there.
Performing a list iteration and string comparison on
every Static Pref call is not acceptable when we can
just populate a bit and check it.
Differential Revision: https://phabricator.services.mozilla.com/D138679
If the android system kills the GPU process to free memory while the
app is in the background, then we want to avoid immediately restarting
the GPU process.
To achieve this, we make GPUProcessManager keep track of whether it is
in the foreground or background. If HandleProcessLost() gets called
while in the background then we destroy the existing compositor
sessions as before, but return early instead of immediately
relaunching the process. If the process has not been launched when the
app later gets foregrounded then we do so then.
The final part of HandleProcessLost(), which reinitializes the content
bridges and emits the "compositor-reinitialized" signal, has been
moved to a new function ReinitializeRendering(). If the GPU process
has been disabled, this gets called as-before at the end of
HandleProcessLost(). When the GPU process is enabled, however, we now
call it from OnProcessLaunchComplete(), so that it gets called
regardless of whether the process is launched immediately or after a
delay.
While we're here, rename the functions RebuildRemoteSessions() and
RebuildInProcessSessions() to DestroyRemoteCompositorSessions() and
DestroyInProcessCompositorSessions(), to better reflect what they
actually do: the "rebuilding" part occurs later on. Also update the
mega-comment documenting the restart sequence, as it was somewhat
outdated.
In case a caller of EnsureGPUReady() gets called before the foreground
signal arrives (eg in nsBaseWidget::CreateCompositorSession() due to a
refresh tick paint), make EnsureGPUReady() launch the GPU process
itself if the GPU process is enabled but not yet launched. As a
consequence, to avoid launching the GPU process unnecessarily, change
a couple callers of EnsureGPUReady() to simply check whether the
process is enabled instead.
Additionally, guard against a null pointer deref if the compositor has
been destroyed when the widget receives a memory pressure event. This
is now more likely to occur as there may be a gap between the
compositor being destroyed and recreated.
Differential Revision: https://phabricator.services.mozilla.com/D139042
This patch removes more main thread dependencies from the content side
of WebGPU. Instead of issuing a resource update for an external image,
we now use an async image pipeline in conjunction with
CompositableInProcessManager from part 1. This allows us to update the
HTMLCanvasElement bound to the WebGPU device without having to go
through the main thread, or even the content process after the swap
chain update / readback has been requested.
Differential Revision: https://phabricator.services.mozilla.com/D138887
For WebGPU, we produce the textures in the compositor process and the
content process doesn't need to be that involved except for hooking up
the texture to the display list. Currently this is done via an external
image ID.
Given that WebGPU needs to work with OffscreenCanvas, it would be best
if its display pipeline was consistent whether it was gotten from an
HTMLCanvasElement, OffscreenCanvas on the main thread, or on a worker
thread. As such, using an AsyncImagePipeline would be best.
However there is no real need to bounce the handles across process
boundaries. Hence this patch which adds CompositableInProcessManager.
This static class is responsible for collecting WebRenderImageHost
objects backed by TextureHost objects which do not leave the compositor
process. This will allow WebGPUParent to schedule compositions directly
in future patches.
Differential Revision: https://phabricator.services.mozilla.com/D138588
If the android system kills the GPU process to free memory while the
app is in the background, then we want to avoid immediately restarting
the GPU process.
To achieve this, we make GPUProcessManager keep track of whether it is
in the foreground or background. If HandleProcessLost() gets called
while in the background then we destroy the existing compositor
sessions as before, but instead of immediately relaunching the process
then reinitializing content bridges, just set a flag saying we need to
do so. If the flag is set when the app later gets foregrounded then we
continue where we left off.
While we're here, rename the functions RebuildRemoteSessions() and
RebuildInProcessSessions() to DestroyRemoteCompositorSessions() and
DestroyInProcessCompositorSessions(), to better reflect what they
actually do: the "rebuilding" part occurs later on. Also update the
mega-comment documenting the restart sequence, as it was somewhat
outdated.
In case a caller of EnsureGPUReady() gets called before the foreground
signal arrives (eg in nsBaseWidget::CreateCompositorSession() due to a
refresh tick paint), make EnsureGPUReady() launch the GPU process
itself if the GPU process is enabled but not yet launched.
Additionally, guard against a null pointer deref if the compositor has
been destroyed when the widget receives a memory pressure event. This
is now more likely to occur as there may be a gap between the
compositor being destroyed and recreated.
Differential Revision: https://phabricator.services.mozilla.com/D139042
This patch removes more main thread dependencies from the content side
of WebGPU. Instead of issuing a resource update for an external image,
we now use an async image pipeline in conjunction with
CompositableInProcessManager from part 1. This allows us to update the
HTMLCanvasElement bound to the WebGPU device without having to go
through the main thread, or even the content process after the swap
chain update / readback has been requested.
Differential Revision: https://phabricator.services.mozilla.com/D138887
For WebGPU, we produce the textures in the compositor process and the
content process doesn't need to be that involved except for hooking up
the texture to the display list. Currently this is done via an external
image ID.
Given that WebGPU needs to work with OffscreenCanvas, it would be best
if its display pipeline was consistent whether it was gotten from an
HTMLCanvasElement, OffscreenCanvas on the main thread, or on a worker
thread. As such, using an AsyncImagePipeline would be best.
However there is no real need to bounce the handles across process
boundaries. Hence this patch which adds CompositableInProcessManager.
This static class is responsible for collecting WebRenderImageHost
objects backed by TextureHost objects which do not leave the compositor
process. This will allow WebGPUParent to schedule compositions directly
in future patches.
Differential Revision: https://phabricator.services.mozilla.com/D138588
When the (off-by-default) pref webgl.enable-ahardwarebuffer is
enabled, we use AHardwareBuffers rather than SurfaceTextures for webgl
on Android. Some users have enabled this pref and their browser is now
crashing since the GPU process was enabled.
The crash occurs because we have not initialized the
AndroidHardwareBufferApi instance to load the NDK function
pointers. This is performed in gfxPlatform in the parent process, but
because the GPU process does not have a gfxPlatform we must do this in
GPUParent as well. We must also initialize the
AndroidHardwareBufferManager, as is done by gfxPlatform.
Differential Revision: https://phabricator.services.mozilla.com/D138463
This simplifies the logic around descriptors significantly, which is
especially useful considering how few places use the type. There is a
small change required on Windows to create the NamedPipe directly and
transfer around each end's handle, rather than connecting between
processes after the fact.
A named pipe has to be used, rather than an anonymous pipe, as
bidirectional communication is required.
Differential Revision: https://phabricator.services.mozilla.com/D130381
Bug 1742985 added the test helper_zoom_after_gpu_process_restart.html,
but it doesn't actually get run on any platform with the GPU process
enabled. (Due to bug 1495580 on windows, and because the GPU process
isn't yet enabled on android.)
The test kills the GPU process, then tries to wait for it to be
restarted before proceeding. However, the function
ensureGPUProcessReadyForTests doesn't always work as intended, as the
GPUProcessManager may not have yet noticed that the process has been
killed, and therefore may return immediately from EnsureGPUReady.
This patch removes the buggy ensureGPUProcessReadyForTests function,
and instead makes the test wait for the "compositor-reinitialized"
topic to be observed.
Differential Revision: https://phabricator.services.mozilla.com/D138125
`base::KillProcess`, with the `wait` parameter set to true, does a
bounded blocking wait for the process to exit by polling and sleeping in
a loop, with ad-hoc parameters. The only user of that case is the Gecko
Media Plugin code, which doesn't actually need it as discussed in bug
(comments 4-6); also, currently it's blocking the IPC I/O thread in the
parent process, which is not good for browser responsiveness.
Accordingly, this patch deletes that code and removes the parameter.
Differential Revision: https://phabricator.services.mozilla.com/D136662
Similar to PWebGL, we want PCanvasManager to manage the PWebGPU
protocol. This will allow us to reuse the machinery that works for both
the main thread, and arbitrary worker threads to create PWebGPU
protocols.
For now, the only owner is still the main thread, so it should work very
similarly as to how it does with PCompositorBridge.
This patch also introduces some quality of life changes, such as making
the protocol ref-counted, and avoiding respinning the wheel for
CanSend() for IPDL actors.
Differential Revision: https://phabricator.services.mozilla.com/D134097
On the path that we need to read the pixels from the canvas for display
purposes, instead of using a platform buffer handle, we need to take
into account the need to y-flip for WebGL. This patch ensures that we do
so.
Differential Revision: https://phabricator.services.mozilla.com/D136504
This simplifies the logic around descriptors significantly, which is
especially useful considering how few places use the type. There is a
small change required on Windows to create the NamedPipe directly and
transfer around each end's handle, rather than connecting between
processes after the fact.
A named pipe has to be used, rather than an anonymous pipe, as
bidirectional communication is required.
Differential Revision: https://phabricator.services.mozilla.com/D130381
On the path that we need to read the pixels from the canvas for display
purposes, instead of using a platform buffer handle, we need to take
into account the need to y-flip for WebGL. This patch ensures that we do
so.
Differential Revision: https://phabricator.services.mozilla.com/D136504
On the path that we need to read the pixels from the canvas for display
purposes, instead of using a platform buffer handle, we need to take
into account the need to y-flip for WebGL. This patch ensures that we do
so.
Differential Revision: https://phabricator.services.mozilla.com/D136504
Add KillGPUProcessForTests, which kills the GPU process without
generating a crash dump (unlike the existing CrashGPUProcessForTests).
Additionally add EnsureGPUProcessReadyForTests, which returns a
promise that resolves to true when the GPU process is enabled and
ready, and false if it is disabled. If called while the GPU process is
being (re)started, it will not resolve until it has finished launching
(or was disabled due to error).
Finally, make GPUProcessHost::IsConnected check whether the process
handle is valid. This ensures it returns false immediately following a
call to KillProcess but prior to the GPUChild being destroyed. This
means tests can call EnsureGPUProcessReadyForTests immediately after
KillGPUProcessForTests or CrashGPUProcessForTests, and it will
reliably wait for the new process to launch, as intended.
Depends on D135207
Differential Revision: https://phabricator.services.mozilla.com/D135328
Following a GPU process restart ZoomConstraints do not currently get
set for the newly recreated APZCTreeManagers, meaning it is no longer
possible to asynchronously zoom pages.
To solve this, we make ZoomConstraintsClient observe a new
"compositor-reinitialized" topic. We send this notification in
GPUProcessManager::HandleProcessLost() to notify
ZoomConstraintsClients for parent process documents, and in
ContentChild::RecvReinitRendering() for documents in their respective
content processes. This must be performed after the compositor has
been reinitialized so that the APZCTreeManagerChild is able to send
the constraints to the APZCTreeManagerParent in the compositor
process.
Differential Revision: https://phabricator.services.mozilla.com/D135207
Right now, if we wanted to get a snapshot of an OffscreenCanvas context
on a worker thread from the main thread, we would need to block the main
thread on the worker thread, and from the worker thread, issue a sync
IPC call get the front buffer snapshot.
This patch adds an alternative which allows the main thread to go
directly to the canvas owning thread in the compositor process to get
the snapshot, thereby bypassing the worker thread in content process
entirely. All it needs as the unique ID of the CanvasManagerChild
instance, and the protocol ID of the WebGLChild instance.
This will be used for Firefox screenshots, New Tab tiles, and printing.
Differential Revision: https://phabricator.services.mozilla.com/D130785
VsyncChild is main thread only, and we would like to reuse PVsync on the
worker threads via PBackgroundChild which already implements it. This
patch does the necessary refactoring to have multiple implementations of
PVsyncChild.
Differential Revision: https://phabricator.services.mozilla.com/D130264
Right now, if we wanted to get a snapshot of an OffscreenCanvas context
on a worker thread from the main thread, we would need to block the main
thread on the worker thread, and from the worker thread, issue a sync
IPC call get the front buffer snapshot.
This patch adds an alternative which allows the main thread to go
directly to the canvas owning thread in the compositor process to get
the snapshot, thereby bypassing the worker thread in content process
entirely. All it needs as the unique ID of the CanvasManagerChild
instance, and the protocol ID of the WebGLChild instance.
This will be used for Firefox screenshots, New Tab tiles, and printing.
Differential Revision: https://phabricator.services.mozilla.com/D130785
VsyncChild is main thread only, and we would like to reuse PVsync on the
worker threads via PBackgroundChild which already implements it. This
patch does the necessary refactoring to have multiple implementations of
PVsyncChild.
Differential Revision: https://phabricator.services.mozilla.com/D130264
Add a function to GPUProcessManager to force the GPU process to crash,
and expose it through gfxInfo. Expose this to geckoview tests via the
test-support webextension.
Add a junit test GpuCrashTest, which triggers a GPU process crash and
ensures the crash reporter was notified.
Additionally, ensure the TestCrashHandler service is stopped in
between tests, as otherwise only the first crash test to run will be
notified of the crash.
Differential Revision: https://phabricator.services.mozilla.com/D132812
GPU process crash reports are handled by calling GenerateCrashReport()
in GPUChild::ActorDestroy() if the reason is AbnormalShutdown. This
ensures we only create crash report if the process actually crashed,
and not when it was deliberately stopped.
However, sometimes actors other than GPUChild are the first to be
destroyed immediately after a crash, for example CompositorBridgeChild
or UiCompositorControllerChild. If such an actor receives an
ActorDestroy message with AbnormalShutdown as the reason, they will
call GPUProcessManager::NotifyRemoteActorDestroyed(), which leads to
GPUProcessHost::Shutdown(), which will close the PGPU channel. This
creates a race condition after a GPU process crash, where sometimes
the channel gets closed gracefully and ActorDestroy will receive a
NormalShutdown reason rather than AbnormalShutdown.
This patch adds a flag to GPUProcessHost::Shutdown() indicating
whether it is being called in response to an unexpected shutdown being
detected by another actor. If set, it sets a flag on the
GPUChild. When GPUChild::ActorDestroy() eventually gets called, it
knows to act in response to a crash if either the reason is
AbnormalShutdown or this flag has been set.
Differential Revision: https://phabricator.services.mozilla.com/D132811
Rename ContentCrashHandler.jsm to ChildCrashHandler.jsm as it is now
responsible for all types of child process crashes. Have it observe
"compositor:process-aborted" in addition to
"ipc:content-shutdown". Additionally, rename the
"GeckoView:ContentCrashReport" event it sends to
"GeckoView:ChildCrashReport".
In GPUChild::ActorDestroy, provide an out variable for
GenerateCrashReport to return the dump ID, and stuff this in to a
property bag, along with "abnormal: true", sent to
"compositor:process-aborted" observers.
In ChildCrashHandler, set the "processType" argument sent with the
GeckoView:ChildCrashReport event to BACKGROUND_CHILD for GPU process
crashes, and FOREGROUND_CHILD otherwise.
Differential Revision: https://phabricator.services.mozilla.com/D132810
In order to render text using Skia (as webrender does for blob images)
we must ensure that the Freetype library has been initialized. In the
parent process this is done by gfxPlatform, but the GPU process does
not have a gfxPlatform so we should do so in GPUParent instead. We
already did this on Gtk, but this patch makes us do so on Android too.
Differential Revision: https://phabricator.services.mozilla.com/D131233
This patch ensures that, following a GPU process crash, we
re-initialize the compositor and resume painting on Android.
nsWindow::GetWindowRenderer() is made to always reinitialize the
window renderer if there is none, like on other platforms. We
therefore no longer need to track whether webrender is being disabled,
as this is no longer a special case.
Previously we started the compositor as initially paused in
nsBaseWidget::CreateCompositorSession only if the widget did not yet
have a surface. Now we must unconditionally (re)start it as initially
paused, as even though the widget in the parent process may have a
surface, we will not have been able to send it to the GPU process
yet. We will send the surface to the compositor once control flow
returns to nsWindow::CreateLayerManager, where we will also now resume
the compositor if required.
Finally, we must ensure that we manually trigger a paint, both in the
parent and content processes. On other platforms this occurs
automatically following a GPU process loss through various refresh
driver events. On Android, however, nothing causes the refresh driver
to paint by itself, and we cannot receive input without first
initializing our APZ controllers, which does not happen until the
compositor receives a display list. We therefore must manually
schedule a paint. We do so from nsWindow::NotifyCompositorSessionLost
for the parent process, and BrowserChild::ReinitRendering for content
processes.
Differential Revision: https://phabricator.services.mozilla.com/D131232
Declare a GPU process and corresponding Service in the
AndroidManifest. This is of a new class GeckoServiceGpuProcess which
inherits from GeckoServiceChildProcess, and provides a binder
interface ICompositorSurfaceManager which allows the parent process to
set the compositor Surface for a given widget ID, and the compositor
in the GPU process to look up the Surface for a widget ID. The
ICompositorSurfaceManager interface is exposed to the parent process
through a new method getCompositorSurfaceManager() in the
IChildProcess interface.
Add a new connection type for GPU processes to GeckoProcessManager,
along with a function to look up the GPU process connection and fetch
the ICompositorSurfaceManager binder. When the GPU process is launched
we store the returned binder in the GPUProcessHost, and when each
widget's compositor is created we store a reference to the binder in
the UiCompositorControllerChild.
Each nsWindow is given a unique ID, and whenever the Surface changes
due to an Android resume event, it sends the new surface for that ID
to the GPU process (if enabled) by calling
ICompositorSurfaceManager.onSurfaceChanged().
Stop inheriting AndroidCompositorWidget from InProcessCompositorWidget
and instead inherit from CompositorWidget directly. This class holds a
reference to the Surface that will be rendered in to. The
CompositorBridgeParent notifies the CompositorWidget whenever it has
been resumed, allowing it to fetch the new Surface. For the
cross-process CompositorWidgetParent implementation it fetches that
Surface from the CompositorSurfaceManagerService, whereas the
InProcessAndroidCompositorWidget can read it directly from the real
widget.
AndroidCompositorWidget::GetClientSize() can now calculate its size
from the Surface, rather than racily reading the value from the
nsWindow. This means RenderCompositorEGL and RenderCompositorOGLSWGL
can now use GetClientSize() again rather than querying their own size
from the Surface.
With this patch, setting layers.gpu-process.enabled to true will cause
us to launch a GPU process and render from it. We do not yet
gracefully recover from a GPU process crash, nor can we render
anything using SurfaceTextures (eg video or webgl). Those will come in
future patches.
Differential Revision: https://phabricator.services.mozilla.com/D131231
On Android the APZ controller thread is the android UI thread, rather
than the Gecko main thread as on other platforms. There some places
where the main thread requires to call IAPZCTreeManager functions that
must run on the controller thread. Currently we use the function
DispatchToControllerThread() prior to calling various IAPZCTreeManager
APIs to achieve this.
This works just fine for now, as there is no GPU process on
Android. However, once we do add a GPU process we will encounter
issues:
Firstly, there will now be a cross-process APZInputBridge rather than
using an in-process APZCTreeManager. The PAPZInputBridge protocol is
managed by PGPU, and therefore must run on the main thread in the
parent process. The input we require to send over the bridge, however,
originates from the UI thread.
To solve this we can convert PAPZInputBridge to a top-level protocol,
and bind it to the UI thread on Android. We can then send input
directly from the UI thread without issues.
Secondly, the PAPZCTreeManager protocol must also run from the main
thread in the parent process, as it is managed by
PCompositorBridge. Unlike PAPZInputBridge we cannot convert
PAPZCTreeManager in to a top level protocol, as it relies on the
ordering guarantees with PCompositorBridge.
We must therefore ensure that we only dispatch IAPZCTreeManager calls
to the controller thread when using an in-process
APZCTreeManager. Out-of-process calls, on the other hand, must be
dispatched to the main thread where we can send IPDL commands from. To
do this, we move the dispatch logic away from the callsites of
IAPZCTreeManager APIs, and in to the APZCTreeManager and
APZCTreeManagerChild implementations themselves.
Differential Revision: https://phabricator.services.mozilla.com/D131120
Current code is not explicit about device recreation before session re-creation. It is actually done by nsWindow::OnPaint() before OnInProcessDeviceReset() call. But it is not explicit.
gfxWindowsPlatform::HandleDeviceReset() does d3d device re-creation if it is necessary.
Differential Revision: https://phabricator.services.mozilla.com/D130957
If GL is threadsafe, we can run on the compositor thread. This appears
to have performance benefits, possibly because the renderer thread is
too busy. If GL is not threadsafe, we must run the WebGL OOP instances
on the renderer thread.
At the time of writing, only the nouveau drivers on Linux are considered
to be not threadsafe, so most users will see WebGL running on the
compositor thread. This patch also adds prefers to override the
blocklist to either assume GL is threadsafe
(webgl.threadsafe-gl.force-enabled) and not threadsafe
(webgl.threadsafe-gl.force-disabled).
Differential Revision: https://phabricator.services.mozilla.com/D130634
This patch adds the necessary IPDL plumbing to allow us to create WebGL
instances off the main thread in the content process, and to execute
them on the Renderer thread in the compositor process.
Differential Revision: https://phabricator.services.mozilla.com/D127839
This may be another source of failing OOP iframe document print. I will try to
see whether it can be reproducible locally later.
Differential Revision: https://phabricator.services.mozilla.com/D121075
+ Begin to add video tests to ensure we ratchet towards correctness.
+ Test rec709 x (yuv420p, yuv420p10, gbrp) x (tv, pc) x codecs.
+ Just mark fuzziness for now. Better would be e.g. 16_127_233 'bad
references'.
Differential Revision: https://phabricator.services.mozilla.com/D115298
+ Begin to add video tests to ensure we ratchet towards correctness.
+ Test rec709 x (yuv420p, yuv420p10, gbrp) x (tv, pc) x codecs.
+ Just mark fuzziness for now. Better would be e.g. 16_127_233 'bad
references'.
Differential Revision: https://phabricator.services.mozilla.com/D115298
This extends on the changes in part 12a and consumes the new PortRef-based API
in all existing process types other than the fork server. The IPDL C++ unit
tests were already broken before this change, and were not updated.
Differential Revision: https://phabricator.services.mozilla.com/D112777
This extends on the changes in part 12a and consumes the new PortRef-based API
in all existing process types other than the fork server. The IPDL C++ unit
tests were already broken before this change, and were not updated.
Differential Revision: https://phabricator.services.mozilla.com/D112777
+ Begin to add video tests to ensure we ratchet towards correctness.
+ Test rec709 x (yuv420p, yuv420p10, gbrp) x (tv, pc) x codecs.
+ Just mark fuzziness for now. Better would be e.g. 16_127_233 'bad
references'.
Differential Revision: https://phabricator.services.mozilla.com/D115298
+ Begin to add video tests to ensure we ratchet towards correctness.
+ Test rec709 x (yuv420p, yuv420p10, gbrp) x (tv, pc) x codecs.
+ Just mark fuzziness for now. Better would be e.g. 16_127_233 'bad
references'.
Differential Revision: https://phabricator.services.mozilla.com/D115298
This is just cleanup while I was looking at why the test still fails
sometimes locally even with my patch. It is a bit racy because
BrowserChild's visible rect is not set by the time we do the
cross-process paint, so we clip everything...
This doesn't affect users and I have some upcoming PTO though, so will
look into it when I'm back.
Differential Revision: https://phabricator.services.mozilla.com/D116924
Originally, we would restart the GPU process a fixed number of attempts
based on the layers.gpu-process.max_restarts pref. With this patch, we
now use this pref to control how many "unstable" restarts we allow. A
restart is "stable" if and only if the process uptime exceeds the pref
layers.gpu-process.stable.min-uptime-ts and if the process renders a
total number of frames exceeding the pref
layers.gpu-process.stable.frame-threshold. This allows users to keep the
GPU process for a lot longer if they are encountering infrequent
crashes. Should the user experience the GPU process crashing quickly
and/or without rendering many frames, we will disable it as before after
a few attempts and move into the parent process.
Differential Revision: https://phabricator.services.mozilla.com/D114531
If a user is able to get D3D11, and Software WebRender hasn't been
forced on (either by the Fission experiment or our pref), then we prefer
D3D11 in late beta and release. This will allow users who start with
D3D11 in the GPU process, to fallback to Software WebRender in the GPU
process.
Differential Revision: https://phabricator.services.mozilla.com/D114286
Removes the PPluginSurface actor used for windowed plugins, as part of removing all of NPAPI plugin support. SharedDIB is then unused and is also removed.
Differential Revision: https://phabricator.services.mozilla.com/D107140
Removes the PPluginSurface actor used for windowed plugins, as part of removing all of NPAPI plugin support. SharedDIB is then unused and is also removed.
Differential Revision: https://phabricator.services.mozilla.com/D107140
Before we start a resource update, we should check if we are virtual
memory pressured (32-bit Windows only). If so, pre-emptively unmap
shared surfaces until the pressure is relieved to try to avoid OOMs
elsewhere. This only applies to the GPU process because the parent
process actively watches its own memory pressure and dispatches a
low-memory event which our expiration tracker is an observer for.
Differential Revision: https://phabricator.services.mozilla.com/D109441
Before we start a resource update, we should check if we are virtual
memory pressured (32-bit Windows only). If so, pre-emptively unmap
shared surfaces until the pressure is relieved to try to avoid OOMs
elsewhere. This only applies to the GPU process because the parent
process actively watches its own memory pressure and dispatches a
low-memory event which our expiration tracker is an observer for.
Differential Revision: https://phabricator.services.mozilla.com/D109441
Before we start a resource update, we should check if we are virtual
memory pressured (32-bit Windows only). If so, pre-emptively unmap
shared surfaces until the pressure is relieved to try to avoid OOMs
elsewhere. This only applies to the GPU process because the parent
process actively watches its own memory pressure and dispatches a
low-memory event which our expiration tracker is an observer for.
Differential Revision: https://phabricator.services.mozilla.com/D109441
Note that this patch only transforms the use of the nsDataHashtable type alias
to a directly equivalent use of nsTHashMap. It does not change the specification
of the hash key type to make use of the key class deduction that nsTHashMap
allows for in some cases. That can be done in a separate step, but requires more
attention.
Differential Revision: https://phabricator.services.mozilla.com/D106008
Inversed logic has been proven to be more difficult to read,
so use the simple positive variant.
Also add a simple sanity check for `WAYLAND_DISPLAY` so if people
set `MOZ_ENABLE_WAYLAND` in a X11 session don't get undesired behavior.
While on it, change a check for `XDG_SESSION_TYPE` to also use
`WAYLAND_DISPLAY`, improving behavior when launching FF from a TTY
or a TTY-launched session (e.g. via `weston-launch`).
`WAYLAND_DISPLAY` and `DISPLAY` are not expected to be set if
no Wayland or X11 server is available, so using them makes us behave
more predictable.
Differential Revision: https://phabricator.services.mozilla.com/D106726
GetValue is going to be removed in a subsequent patch. It is no longer needed,
because it can be replaced by functions already provided by nsBaseHashtable,
in particular Lookup and Contains.
Also, its name was confusing, since it specifically returns a pointer that
allows and is intended for modifying the entry within the hashtable, rather
than returning by-value. According to the naming rules to be set on
nsBaseHashtable, it would also needed to be renamed to "Lookup*. Removing
its uses saves this effort.
Differential Revision: https://phabricator.services.mozilla.com/D105476
This makes the naming more consistent with other functions called
Insert and/or Update. Also, it removes the ambiguity whether
Put expects that an entry already exists or not, in particular because
it differed from nsTHashtable::PutEntry in that regard.
Differential Revision: https://phabricator.services.mozilla.com/D105473
On 32-bit Windows, we see crashes related to running out of virtual
memory address space in the GPU process. Prior to this patch, we did not
report any memory status information from the GPU process to the parent,
only from the parent to the GPU process. Now if we go below the
threshold we request memory to be cleared in the parent/content
processes. This should trickle down to the GPU process by freeing shared
memory resources such as images, allowing us to unmap them out of the
GPU process sooner.
We will see similar problems on any 32-bit platform. The only other
target of note with sufficient numbers is 32-bit Android. There is no
GPU process on Android, however we only monitor the virtual memory
address space on Windows.
Differential Revision: https://phabricator.services.mozilla.com/D106286
The pref gfx.webrender.software.unaccelerated-widget.allow may be used
to allow software WebRender to be used with new windows/popups that have
transparency on Windows. Otherwise they would fallback to basic layers.
Similarly, the pref gfx.webrender.software.unaccelerated-widget.force
may be used to force software WebRender for all windows that would
fallback to basic layers.
Differential Revision: https://phabricator.services.mozilla.com/D104855
First, it should be called "Lookup" rather than "Get" because it returns
DataType (rather than UserDataType), but that would still be confusing,
since as opposed to other Lookup* methods, it does not return a DataType&
(and obviously, it can't). So "Extract" seems to be a better name, cf.
mozilla::Maybe::extract.
Differential Revision: https://phabricator.services.mozilla.com/D105471
The pref gfx.webrender.software.unaccelerated-widget.allow may be used
to allow software WebRender to be used with new windows/popups that have
transparency on Windows. Otherwise they would fallback to basic layers.
Similarly, the pref gfx.webrender.software.unaccelerated-widget.force
may be used to force software WebRender for all windows that would
fallback to basic layers.
Differential Revision: https://phabricator.services.mozilla.com/D104855
The pref gfx.webrender.software.unaccelerated-widget.allow may be used
to allow software WebRender to be used with new windows/popups that have
transparency on Windows. Otherwise they would fallback to basic layers.
Similarly, the pref gfx.webrender.software.unaccelerated-widget.force
may be used to force software WebRender for all windows that would
fallback to basic layers.
Differential Revision: https://phabricator.services.mozilla.com/D104855
We can disable WebRender because the GPU process crashed, or we
encountered a graceful runtime error in WebRender. This patch adds two
new prefs to control how that fallback works.
gfx.webrender.fallback.software-d3d11 controls if WebRender falls back
to Software WebRender + D3D11 compositing. If true, and the user is
allowed to get Software WebRender, we will fallback to Software
WebRender with the D3D11 compositor first.
gfx.webrender.fallback.software controls if WebRender falls back to
Software WebRender. If true, and the user is allowed to get Software
WebRender, we will fallback to Software WebRender without the D3D11
compositor.
gfx.webrender.fallback.basic controls if WebRender or Software
WebRender falls back to Basic. If true, it falls back to Basic.
Otherwise it continues to use Software WebRender without the D3D11
compositor. Note that this means OpenGL on Android.
This patch also means that gfx.webrender.all=true and MOZ_WEBRENDER=1
no longer disables Software WebRender. It will still prefer (Hardware)
WebRender but we want to allow fallback to Software WebRender for
configurations that forced WebRender on.
Differential Revision: https://phabricator.services.mozilla.com/D103491
We can disable WebRender because the GPU process crashed, or we
encountered a graceful runtime error in WebRender. This patch adds two
new prefs to control how that fallback works.
gfx.webrender.fallback.software-d3d11 controls if WebRender falls back
to Software WebRender + D3D11 compositing. If true, and the user is
allowed to get Software WebRender, we will fallback to Software
WebRender with the D3D11 compositor first.
gfx.webrender.fallback.software controls if WebRender falls back to
Software WebRender. If true, and the user is allowed to get Software
WebRender, we will fallback to Software WebRender without the D3D11
compositor.
gfx.webrender.fallback.basic controls if WebRender or Software
WebRender falls back to Basic. If true, it falls back to Basic.
Otherwise it continues to use Software WebRender without the D3D11
compositor. Note that this means OpenGL on Android.
This patch also means that gfx.webrender.all=true and MOZ_WEBRENDER=1
no longer disables Software WebRender. It will still prefer (Hardware)
WebRender but we want to allow fallback to Software WebRender for
configurations that forced WebRender on.
Differential Revision: https://phabricator.services.mozilla.com/D103491
We can disable WebRender because the GPU process crashed, or we
encountered a graceful runtime error in WebRender. This patch adds two
new prefs to control how that fallback works.
gfx.webrender.fallback.software-d3d11 controls if WebRender falls back
to Software WebRender + D3D11 compositing. If true, and the user is
allowed to get Software WebRender, we will fallback to Software
WebRender with the D3D11 compositor first.
gfx.webrender.fallback.software controls if WebRender falls back to
Software WebRender. If true, and the user is allowed to get Software
WebRender, we will fallback to Software WebRender without the D3D11
compositor.
gfx.webrender.fallback.basic controls if WebRender or Software
WebRender falls back to Basic. If true, it falls back to Basic.
Otherwise it continues to use Software WebRender without the D3D11
compositor. Note that this means OpenGL on Android.
This patch also means that gfx.webrender.all=true and MOZ_WEBRENDER=1
no longer disables Software WebRender. It will still prefer (Hardware)
WebRender but we want to allow fallback to Software WebRender for
configurations that forced WebRender on.
Differential Revision: https://phabricator.services.mozilla.com/D103491
On GPU Process gfxWindowsPlatform() is not created. Then some Windows specific memory reporters are not added in GPU Process.
Differential Revision: https://phabricator.services.mozilla.com/D102252
Lack of support of (hardware) WebRender should not block Software
WebRender support. This can happen when ANGLE isn't supported, for
example.
Differential Revision: https://phabricator.services.mozilla.com/D100445
There's JS running since we save the active status till we restore it,
so arbitrary things can happen, including receiving an IPC message from
the child saying that we're now really active.
If then we restore the wrong (old) status, stuff gets confused and
sadness ensues.
Screenshotting background tabs seems to work without this so it's not
clear to me why messing with the activeness state was necessary to begin
with.
Differential Revision: https://phabricator.services.mozilla.com/D101060
Not the most elegant, but reworking the PDMFactory to be fully re-initialized would be a significant change, as only the WMFDecoderModule requires it we take some shortcuts.
Differential Revision: https://phabricator.services.mozilla.com/D100308
Upon starting the RDD and GPU process; we check their decoding capabilities and send it to the parent process. It will then broadcast this information to all content children so that RemodeDecoderModule::Supports can return if a codec is supported or not.
Differential Revision: https://phabricator.services.mozilla.com/D100306
Not the most elegant, but reworking the PDMFactory to be fully re-initialized would be a significant change, as only the WMFDecoderModule requires it we take some shortcuts.
Differential Revision: https://phabricator.services.mozilla.com/D100308
Upon starting the RDD and GPU process; we check their decoding capabilities and send it to the parent process. It will then broadcast this information to all content children so that RemodeDecoderModule::Supports can return if a codec is supported or not.
Differential Revision: https://phabricator.services.mozilla.com/D100306
Lack of support of (hardware) WebRender should not block Software
WebRender support. This can happen when ANGLE isn't supported, for
example.
Differential Revision: https://phabricator.services.mozilla.com/D100445
And have it mirror in the parent process more automatically.
The docShellIsActive setter in the browser-custom-element side needs to
be there rather than in the usual DidSet() calls because the
AsyncTabSwitcher code relies on getting an exact amount of notifications
as response to that specific setter. Not pretty, but...
BrowserChild no longer sets IsActive() on the docshell itself for OOP
iframes. This fixes bug 1679521. PresShell activeness is used to
throttle rAF as well, which handles OOP iframes nicely as well.
Differential Revision: https://phabricator.services.mozilla.com/D96072
This moves parts of IPCMessageUtils.h to two new header files and adapts
the include directives as necessary. The new header files are:
- EnumSerializer.h, which defines the templates for enum serializers
- IPCMessageUtilsSpecializations.h, which defines template specializations
of ParamTraits with extra dependencies (building upon both IPCMessageUtils.h
and EnumSerializer.h)
This should minimize the dependencies pulled in by every consumer of
IPCMessageUtils.h
Differential Revision: https://phabricator.services.mozilla.com/D94459
Aside from on Windows, we do not appear to handle device resets properly
without the GPU process. This patch adds in the necessary plumbing to
handle the device reset properly. It also ensures that whenever we check
for a device reset reason, we handle all of the reasons (e.g. not just
the NV video memory purge reset reason) to ensure they are not lost, and
handles them all consistently in the same manner.
It also tracks the number of device resets for thresholding purposes
with an in process compositor. While it will only disable WebRender on
Linux at this time, it will put a note in the critical log if the
threshold was exceeded on all platforms. This may prove useful in
evaluating whether or not we should do the same everywhere.
Differential Revision: https://phabricator.services.mozilla.com/D98705
We use MOZ_WEBRENDER to force WebRender on in our testing
infrastructure. We may silently fallback to basic during our tests if we
encounter an error and the test may pass as a result. It would be best
if we asserted we don't lose WebRender while testing.
Differential Revision: https://phabricator.services.mozilla.com/D97004
If we don't disable hardware acceleration when we lose WebRender, we
will likely fallback to OpenGL compositing. This is not a supported
configuration. We already do this step if we decide against using
WebRender during configuration, but not if it was lost at runtime.
Differential Revision: https://phabricator.services.mozilla.com/D97355
Allow-list all Python code in tree for use with the black linter, and re-format all code in-tree accordingly.
To produce this patch I did all of the following:
1. Make changes to tools/lint/black.yml to remove include: stanza and update list of source extensions.
2. Run ./mach lint --linter black --fix
3. Make some ad-hoc manual updates to python/mozbuild/mozbuild/test/configure/test_configure.py -- it has some hard-coded line numbers that the reformat breaks.
4. Make some ad-hoc manual updates to `testing/marionette/client/setup.py`, `testing/marionette/harness/setup.py`, and `testing/firefox-ui/harness/setup.py`, which have hard-coded regexes that break after the reformat.
5. Add a set of exclusions to black.yml. These will be deleted in a follow-up bug (1672023).
# ignore-this-changeset
Differential Revision: https://phabricator.services.mozilla.com/D94045
Allow-list all Python code in tree for use with the black linter, and re-format all code in-tree accordingly.
To produce this patch I did all of the following:
1. Make changes to tools/lint/black.yml to remove include: stanza and update list of source extensions.
2. Run ./mach lint --linter black --fix
3. Make some ad-hoc manual updates to python/mozbuild/mozbuild/test/configure/test_configure.py -- it has some hard-coded line numbers that the reformat breaks.
4. Make some ad-hoc manual updates to `testing/marionette/client/setup.py`, `testing/marionette/harness/setup.py`, and `testing/firefox-ui/harness/setup.py`, which have hard-coded regexes that break after the reformat.
5. Add a set of exclusions to black.yml. These will be deleted in a follow-up bug (1672023).
# ignore-this-changeset
Differential Revision: https://phabricator.services.mozilla.com/D94045
Allow-list all Python code in tree for use with the black linter, and re-format all code in-tree accordingly.
To produce this patch I did all of the following:
1. Make changes to tools/lint/black.yml to remove include: stanza and update list of source extensions.
2. Run ./mach lint --linter black --fix
3. Make some ad-hoc manual updates to python/mozbuild/mozbuild/test/configure/test_configure.py -- it has some hard-coded line numbers that the reformat breaks.
4. Add a set of exclusions to black.yml. These will be deleted in a follow-up bug (1672023).
# ignore-this-changeset
Differential Revision: https://phabricator.services.mozilla.com/D94045
This moves the code for converting the set of recordings into a single bitmap into the static Start function, and will allow for other consumers to skip this.
Differential Revision: https://phabricator.services.mozilla.com/D90803
We unfortunately can't use the AsyncShutdownService in either the GPU or RDD process.
So we add a little utility class AsyncBlockers that will resolve its promise once all services have deregistered from it.
We use it to temporily suspend the RDDParent or GPUParent from killing the process, up to 10s.
This allows for cleaner shutdown as the parent process doesn't guarantee the order in which processes are killed (even though it should).
Differential Revision: https://phabricator.services.mozilla.com/D90487
We unfortunately can't use the AsyncShutdownService in either the GPU or RDD process.
So we add a little utility class AsyncBlockers that will resolve its promise once all services have deregistered from it.
We use it to temporily suspend the RDDParent or GPUParent from killing the process, up to 10s.
This allows for cleaner shutdown as the parent process doesn't guarantee the order in which processes are killed (even though it should).
Differential Revision: https://phabricator.services.mozilla.com/D90487
Use ID3D11VideoProcessor for video frame rendering.
WebRenderError::VIDEO_OVERLAY does not cause disabling WebRender. It just change gfxVars::UseWebRenderDCompVideoOverlayWin() to false.
Differential Revision: https://phabricator.services.mozilla.com/D88763
We don't know why we see initialization failures in the telemetry which
makes it hard to investigate why users aren't getting WebRender and
instead fallback to basic. Let's expose the detailed error message
WebRender already generates and puts in the critical log.
Differential Revision: https://phabricator.services.mozilla.com/D89185
By moving the few things that need to be exposed to other components
to APZPublicUtils.h, APZUtils.h becomes much less widely included
(and thus changing it triggers a quicker recompile) while retaining
most of its utilities.
Differential Revision: https://phabricator.services.mozilla.com/D87404
Out-of-process WebGL needs GfxInfo to exist in the composition
process (which is the GPU process if it exists and the parent process
otherwise). This patch enables the Linux version of that component in
the GPU process; the IPC currently used to give content processes copies
of the parent's GPU info is extended to also send it to the GPU process.
Differential Revision: https://phabricator.services.mozilla.com/D85443
Out-of-process WebGL needs GfxInfo to exist in the composition
process (which is the GPU process if it exists and the parent process
otherwise). This patch enables the Linux version of that component in
the GPU process; the IPC currently used to give content processes copies
of the parent's GPU info is extended to also send it to the GPU process.
Differential Revision: https://phabricator.services.mozilla.com/D85443
This patch uses IPDL's return feature to ensure that the memory
reporter manager won't wait for a report from a child process
that has already exited.
This fixes a memory reporter hang that can happen if a child process
exits during a memory report, when the parent half of the actor is
being held alive. (If the parent half of the actor is not being held
alive, then mMemoryReportRequest will be naturally cleared when it
goes away.)
This was happening frequently on Windows Fission AWSY because that test
does a minimize memory right before it attempts to get a memory report,
and the preallocated content process exits when it sees a message to
minimize memory.
Differential Revision: https://phabricator.services.mozilla.com/D85499
The next patch converts the memory reporting architecture to use the "returns"
feature of IPDL, and mozilla::ipc::RejectCallback does not have a return
type, so this patch removes the return value.
FinishReportingCallback::Callback() needs to remain an XPCOM method
that returns NS_OK because it is called from JS during testing.
Differential Revision: https://phabricator.services.mozilla.com/D85498
* Use clearer pref names.
* Default (and only support) IPDL dispatching.
* Make DispatchCommands async-only.
* Sync ipdl command per sync webgl entrypoint.
* Eat the boilerplate cost, since there's not too many.
* Run SerializedSize off same path as Serialize.
* All shmem uploads go through normal DispatchCommands.
* Defer pruning of dead code for now so we can iterate quickly.
* Use Read/Write(begin,end) instead of (begin,size).
* This would have prevented a bug where we read/wrote N*sizeof(T)*sizeof(T).
Differential Revision: https://phabricator.services.mozilla.com/D81495
Similar to ANGLE and WebGL, we should be checking if there is a device
reset after a render pass via the glGetGraphicsResetStatus API.
Additionally, we should allow for simulating a device reset on platforms
other than Windows when using WebRender.
Differential Revision: https://phabricator.services.mozilla.com/D83937
With these changes, on my Linux analysis with ClangBuildAnalyzer, the
top two expensive headers, DOMTypes.h and TabMessageUtils.h are no longer
among the 30 most expensive headers.
Differential Revision: https://phabricator.services.mozilla.com/D82935