This shouldn't change behavior at all, afaict, but by reverting to the original
code structure I'm hoping it may avoid the release-assert that has shown up
in a few crash reports.
Differential Revision: https://phabricator.services.mozilla.com/D140562
Previously, WR would recursively walk the entire picture
tree during each of the visibility, prepare and batching
passes. With this change, the prepare pass builds a tightly
packed command buffer of draw commands for each child surface
and/or picture cache tile. The batching pass now reads the
packed command buffers, meaning that no traversal of the
picture tree is required during batching.
This serves two purposes:
(1) It makes it possible to direct primitives within a
single picture to different render tasks. This isn't
yet used, but is a straightforward addition to how
command buffers work. This functionality will be used
to split primitives across render tasks where there
are readback boundaries (e.g. backdrop-filter and also
mix-blend-mode). The existing picture graph code ensures
that we then optimally allocate, batch and draw pictures
that contain these barrier style primitives.
(2) In the common case, where only a small portion of the
scene has changed during a frame, it's a significant
optimization to avoid another picture tree traversal.
Instead, the batching code will only end up iterating
small lists of the primitives that overlap the dirty
region for the given tile / child surface.
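A minimal sketch of the idea, in Python rather than WebRender's Rust. The names `DrawCommand`, `prepare_pass`, `batching_pass` and `dirty_targets_for` are hypothetical and only model the description above: prepare records commands per target, and batching reads those lists without touching the picture tree.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DrawCommand:
    # Hypothetical command: just an index into a flat primitive list.
    prim_index: int


def prepare_pass(prims, dirty_targets_for):
    """Build one packed command list per target (tile or child surface)."""
    buffers = defaultdict(list)
    for prim_index, prim in enumerate(prims):
        # A primitive may be recorded in several targets, or in none.
        for target in dirty_targets_for(prim):
            buffers[target].append(DrawCommand(prim_index))
    return dict(buffers)


def batching_pass(buffers, prims):
    """Consume the command buffers only; no tree traversal happens here."""
    return {
        target: [prims[cmd.prim_index] for cmd in cmds]
        for target, cmds in buffers.items()
    }


prims = ["rect", "text", "image"]
# Assumed overlap data: "image" does not touch any dirty region.
overlaps = {"rect": ["tile0"], "text": ["tile0", "tile1"], "image": []}
buffers = prepare_pass(prims, lambda prim: overlaps[prim])
batches = batching_pass(buffers, prims)
```

In this model a primitive that overlaps no dirty region never appears in any buffer, so the batching step does no work for it.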
Differential Revision: https://phabricator.services.mozilla.com/D140159
A better solution would check against the same value as reported by gfxConfig, which takes into account the pref as well as whether WebGPU was blocked (for example due to buggy drivers), but this is still a good sanity check and easy to uplift.
Differential Revision: https://phabricator.services.mozilla.com/D140535
SimulateDeviceReset() works differently from actual device reset handling. It seems better to make SimulateDeviceReset() more similar to actual device reset handling.
Differential Revision: https://phabricator.services.mozilla.com/D140161
And in one case, #include "mozilla/ProfilerThreadState.h" where only `AUTO_PROFILER_THREAD_WAKE` is used.
Depends on D140172
Differential Revision: https://phabricator.services.mozilla.com/D140173
WebRenderBridgeChild::GetFontKeyForScaledFont can currently cause an IpcResourceUpdateQueue race.
If we're in the middle of a transaction building a blob image, GetFontKeyForScaledFont is called
in the blob image building code using the transaction's IpcResourceUpdateQueue as expected, such
that resource updates are sent out when the transaction is finalized.
However, TextDrawTarget calls into PushGlyphs without passing along its IpcResourceUpdateQueue,
calling GetFontKeyForScaledFont without it, and causing it to immediately send out the resource
update.
So if a blob image uses a font key and submits a resource update, but a display list is built
after that also using the font key within the transaction, the display list will fail to send
the resource update because it thinks the blob image already did, even though the blob image
transaction has not yet been finalized.
The simple fix is to just pass IpcResourceUpdateQueue from TextDrawTarget into PushGlyphs, thus
ensuring the resource updates are properly ordered.
Differential Revision: https://phabricator.services.mozilla.com/D140438
This expands the information we can request from future bug reporters. This
logging is only done when the `gfx.core-animation.specialize-video.log` pref
is set.
Depends on D135634
Differential Revision: https://phabricator.services.mozilla.com/D137084
This adds additional logic to detect when video requires a specialized video
layer. The logic from NativeLayerCA::Representation::CanSpecializeSurface has
been rolled into the mainline decision of whether or not to specialize the
video layer at all.
Depends on D137082
Differential Revision: https://phabricator.services.mozilla.com/D135634
P010 is trivially the same as NV12, but the 10-bit colors are packed into
the most significant bits instead of the least significant bits. This changes
the yuv shader to use the correct packing for P010. It treats P010 as its
own yuv format, which requires a lot of scaffolding.
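The packing difference can be illustrated with plain integer arithmetic. This is a sketch in Python, not the shader code; it assumes each sample is stored in a 16-bit word, and the helper names are hypothetical.

```python
def pack_p010(sample10: int) -> int:
    """Store a 10-bit sample in the top 10 bits of a 16-bit word."""
    if not 0 <= sample10 <= 0x3FF:
        raise ValueError("sample must fit in 10 bits")
    return sample10 << 6


def unpack_msb(word16: int) -> int:
    """Read a sample that was packed into the most significant bits."""
    return word16 >> 6


def unpack_lsb(word16: int) -> int:
    """Read a sample assuming it sits in the least significant bits."""
    return word16 & 0x3FF


word = pack_p010(1023)
right = unpack_msb(word)  # recovers 1023
wrong = unpack_lsb(word)  # reads the low bits instead and gets 960
```

Reading a P010 word with the least-significant-bit assumption silently returns a different value, which is why the format needs its own handling.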
Differential Revision: https://phabricator.services.mozilla.com/D140422
This fixes an issue caused by landing Bug 1679927. Now that it's possible to
get 10-bit formats in macOS, the texture pipeline has to be updated to pass
those surfaces to WebRender.
This also contains a drive-by comment on D3D11 handling of 10-bit WebRender
display items. It was not clear why specifying 16-bit color depth for the P010
format was correct there when it should be specified as 10-bit color elsewhere.
This also contains another drive-by fix to comments in webrender describing
the RG16 enum.
Depends on D136046
Differential Revision: https://phabricator.services.mozilla.com/D136823
In response to a touch down and up event on Android,
NPZCSupport::FinishHandlingMotionEvent sends the events over the Input
Bridge to APZ, then dispatches the event to the main thread for the
widget to process them (which will in turn send them to the tab). It
uses android's priority event queue to do so, via
nsAppShell::PostEvent.
Meanwhile, the AsyncPanZoomController will detect these events as a
tap gesture, and call RemoteContentController::HandleTap. When the GPU
process is disabled, and APZ therefore lives in the parent process,
this calls BrowserParent::SendHandleTap to forward the tap event to
the tab. On Android, however, we ensure that we first dispatch through
the same priority queue used when dispatching the touch events to the
widget, otherwise the tap event could be sent to the tab before the
events from which it was synthesized.
With the GPU process enabled, however, APZ lives in the GPU process
and RemoteContentController::HandleTap must call
APZCTreeManagerParent::SendHandleTap to send the event back to the
parent process, which in turn calls BrowserParent::SendHandleTap to
send it to the tab. However, because IPDL uses a different event queue
than we used to schedule the widget's processing of the events, there
is no guarantee that APZCTreeManagerChild::RecvHandleTap will be
scheduled after the pending tasks on the android priority queue.
This means we sometimes end up calling BrowserParent::SendHandleTap
before BrowserParent::SendRealTouchEvent, and content can receive the
events in the incorrect order.
To fix this, we simply ensure that APZCTreeManagerChild::RecvHandleTap
runs through the android priority queue before calling
BrowserParent::SendHandleTap, ensuring the events are sent in order.
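The ordering problem can be modelled with a single FIFO queue. This is a simplified Python sketch, not the Gecko code: `deliver` and its flag are hypothetical, and it assumes the IPDL tap message happens to run while the touch events are still pending on the priority queue.

```python
from collections import deque


def deliver(route_tap_through_queue: bool):
    """Return the order in which events would reach the tab in this model."""
    sent = []
    android_queue = deque()

    # The widget's processing of the touch events is already queued.
    android_queue.append(lambda: sent.append("touch-down"))
    android_queue.append(lambda: sent.append("touch-up"))

    def recv_handle_tap():
        if route_tap_through_queue:
            # The fix: go through the same queue, behind the touch events.
            android_queue.append(lambda: sent.append("tap"))
        else:
            # The bug: the tap is forwarded straight away.
            sent.append("tap")

    # The tap message arrives before the queue has drained.
    recv_handle_tap()
    while android_queue:
        android_queue.popleft()()
    return sent


before_fix = deliver(route_tap_through_queue=False)
after_fix = deliver(route_tap_through_queue=True)
```

Without the extra hop the tap overtakes the touch events it was synthesized from; with it, the queue's FIFO order restores the expected sequence.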
Differential Revision: https://phabricator.services.mozilla.com/D140326
Automatically generated rewrites of all ParamTraits and IPDLParamTraits
implementations in-tree to use IPC::Message{Reader,Writer}.
Differential Revision: https://phabricator.services.mozilla.com/D140004
Also include the raster scale in the PictureKey, so that WR will
invalidate when everything is the same apart from the raster
scale (was causing the included wrench reftest to not re-render).
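As a rough illustration of why the key matters, here is a sketch in Python. The field `composite_mode` is a hypothetical stand-in for whatever the real PictureKey already contains; only the addition of a raster scale comes from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PictureKeyWithoutScale:
    composite_mode: str  # hypothetical existing field


@dataclass(frozen=True)
class PictureKeyWithScale:
    composite_mode: str  # hypothetical existing field
    raster_scale: float


# Same content, different raster scale.
old_a = PictureKeyWithoutScale("blur")
old_b = PictureKeyWithoutScale("blur")
new_a = PictureKeyWithScale("blur", 1.0)
new_b = PictureKeyWithScale("blur", 2.0)

stale_without_scale = old_a == old_b  # keys compare equal: no invalidation
invalidates_with_scale = new_a != new_b  # keys differ: picture is redrawn
```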
Differential Revision: https://phabricator.services.mozilla.com/D140282
We need a null-check here in case FT_Select_Charmap failed to find any usable charmap in the font
(e.g. maybe an old font with only a MacRoman charmap, but no Unicode or Symbol table).
Differential Revision: https://phabricator.services.mozilla.com/D140179
The testcase includes a 1-second pause after the onload event fires, because the test font
includes PNG bitmaps which are asynchronously decoded. This is in theory unreliable,
but for these small images it should be plenty of time (even without any delay, the test
fails only very intermittently) -- if we're failing to decode the images even with the
1-second delay, something is probably truly broken.
(The svg-bitmap.ttx file is not actually required to run the test, but is the source
for the svg-bitmap.ttf font file, constructed by inserting base64-encoded PNG images
into a minimal font framework.)
Differential Revision: https://phabricator.services.mozilla.com/D140057
This patch makes us use the same code path for scroll snapping of
zero-velocity OnPanEnd and followed-by-momentum OnPanEnd.
This is preferable because mFollowedByMomentum is unreliable.
Depends on D135921
Differential Revision: https://phabricator.services.mozilla.com/D135922
ScrollSnapUtils::GetSnapPointForDestination has different behavior based on
the scroll unit. More specifically, the difference is in CalcSnapPoints::AddEdge:
Unless the unit is DEVICE_PIXELS, this function will only look for snap
points in the scroll direction. For example, if we are currently at position 40,
our destination is 50, and we have snap points at 30 and 100, then we'll only
consider snap point 100, even though snap point 30 would be closer to the destination.
Furthermore, if the destination is at the same position as the current position,
then we will not snap at all.
This behavior makes some sense for discrete-tick wheel scrolling, but I don't think
it is appropriate for touchpad scrolling. ScrollSnapToDestination is called from
OnPanMomentumStart and from AttemptFling; in both these contexts I think it makes
more sense to take all snap points into account.
With this change, it becomes possible to call ScrollSnapToDestination even when
our current velocity is zero. Without this change, we wouldn't scroll snap at
all in that case.
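The example above can be worked through with a small model. This Python sketch is a simplification and not the actual CalcSnapPoints::AddEdge logic; `pick_snap_point` and its `direction_only` flag are hypothetical names for the two behaviors described.

```python
def pick_snap_point(current, destination, snap_points, direction_only):
    """Return the snap point closest to the destination, or None."""
    candidates = list(snap_points)
    if direction_only:
        if destination == current:
            # No scroll direction, so nothing is considered.
            return None
        if destination > current:
            candidates = [p for p in candidates if p > current]
        else:
            candidates = [p for p in candidates if p < current]
    if not candidates:
        return None
    return min(candidates, key=lambda p: abs(p - destination))


# Position 40, destination 50, snap points at 30 and 100.
directional = pick_snap_point(40, 50, [30, 100], direction_only=True)
all_points = pick_snap_point(40, 50, [30, 100], direction_only=False)

# Zero velocity: destination equals the current position.
directional_at_rest = pick_snap_point(40, 40, [30, 100], direction_only=True)
all_points_at_rest = pick_snap_point(40, 40, [30, 100], direction_only=False)
```

In this model the directional search picks 100 and refuses to snap at rest, while considering all snap points picks 30 in both cases.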
Depends on D135920
Differential Revision: https://phabricator.services.mozilla.com/D135921
Scroll snapping uses an MSD smooth scroll to animate to the target position.
During the scroll snap animation, any synthesized momentum events
from the OS should be ignored; the scroll snap animation has taken over.
It is possible that this patch has unfortunate consequences when mixing
keyboard and touchpad scrolling. For example, pressing space to scroll down
by a page and then doing a fling with the touchpad might now have different
results.
Differential Revision: https://phabricator.services.mozilla.com/D135920
This is preparation for Bug 1723207.
D3D11TextureIMFSampleImage is used for storing the ID3D11Texture2D of an IMFSample. Array index handling is added, since there are cases where the hardware decoder uses an array texture. D3D11TextureIMFSampleImage is expected to be used in the GPU process.
Differential Revision: https://phabricator.services.mozilla.com/D140017
On Android, we must dispatch UiCompositorController::Destroy to run on
the UI thread synchronously. We were using NS_DISPATCH_SYNC to do so,
but that works by starting a nested event loop that continues to
execute tasks on the thread we have dispatched from. This means that
we can start to execute a task which calls
nsBaseWidget::CreateCompositor whilst we are midway through
nsBaseWidget::DestroyCompositor. As well as generally seeming like a
terrible idea, this also causes an assertion failure in some tests.
To avoid this use SynchronousTask rather than NS_DISPATCH_SYNC, as it
actually blocks synchronously. Additionally, do the same thing for
APZInputBridgeChild::Destroy, as it is called from the same location
and poses the same risk.
Ideally we wouldn't have to call UiCompositorControllerChild::Destroy
synchronously at all, but it was added in bug 1392705 to fix severe
crashes. It might be a good idea to re-evaluate whether it is still
required at some point in the future.
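The re-entrancy hazard can be shown with a single-threaded model. This Python sketch does not use the real NS_DISPATCH_SYNC or SynchronousTask; the function and flag are hypothetical, and the nested event loop is modelled as draining the dispatching thread's own queue while it waits.

```python
from collections import deque


def destroy_compositor(use_nested_event_loop: bool):
    """Return the order of operations seen on the dispatching thread."""
    log = []
    # A task that is already queued on the dispatching thread.
    pending = deque([lambda: log.append("CreateCompositor")])

    log.append("DestroyCompositor: begin")
    if use_nested_event_loop:
        # While waiting, keep running tasks from our own queue.
        while pending:
            pending.popleft()()
    # In both modes the UI-thread work has finished by this point.
    log.append("UiCompositorController::Destroy: done")
    log.append("DestroyCompositor: end")

    # Anything still queued runs after we return to the outer loop.
    while pending:
        pending.popleft()()
    return log


nested = destroy_compositor(use_nested_event_loop=True)
blocking = destroy_compositor(use_nested_event_loop=False)
```

With the nested loop, CreateCompositor runs in the middle of DestroyCompositor; with a truly blocking wait it only runs once destruction has finished.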
Differential Revision: https://phabricator.services.mozilla.com/D140084
We noticed a cold_view_nav_start regression on Fenix from enabling the
GPU process, and profiles showed time spent synchronously waiting for
the GPU process to launch. This occurred because the compositor was
being created in nsWindow::Create, and as it requires the GPU process
to be running it had to block until launch completed. The process is
launched when the gfxPlatform is first initialized, but that was only
occurring immediately prior to creating the compositor, which did not
give it enough time to complete asynchronously.
This patch makes it so that we initialize the gfxPlatform slightly
earlier, and importantly delay creating the compositor until it is
actually required. This gives the process enough time to launch
asynchronously meaning we do not have to block.
We started deliberately creating the compositor early on Android
because of bug 1453501, to avoid a race condition where the compositor
didn't exist when RemoteLayerTreeOwner::Initialize was called, causing
us to use a basic layer manager. However, since bug 1741156 landed we
now create the compositor on-demand, meaning this is no longer a
possibility.
Delaying compositor creation can, however, uncover another race
condition. If the UiCompositorControllerChild is opened on the UI
thread before the main thread is able to set its pointer to the
widget, then the Java GeckoSession will never be notified that the
compositor has been opened, and composition will never be
resumed. This patch fixes this issue by setting the
UiCompositorControllerChild's widget pointer in its constructor rather
than immediately afterwards.
Differential Revision: https://phabricator.services.mozilla.com/D139842
This cuts the time spent doing get_ct_font on macOS in half because of
various contention issues. It should also reduce overall memory usage
for large fonts because we have half as many threads loading the
fonts.
Differential Revision: https://phabricator.services.mozilla.com/D139946
Bug 1744851 removed the need for creating variations using a CGFont.
This removes some of the support code that was left behind and completes
the reverting of the initial fix for bug 1675185: c8875713fd6d193e6b017da0a88c264b7bdd9a78
Differential Revision: https://phabricator.services.mozilla.com/D139886
The regular scroll position restoration code will restore it.
Specifically, in ScrollFrameHelper::SaveState
https://searchfox.org/mozilla-central/rev/44ae61c5eeb150cb77b6b8511ceee7ddd6892cb7/layout/generic/nsGfxScrollFrame.cpp#7376
we store the visual viewport offset. ScrollFrameHelper::ScrollToRestoredPosition will try to restore both the layout scroll offset and the visual viewport offset by calling ScrollToWithOrigin (which leads to ScrollToImpl) and PresShell::ScrollToVisual respectively.
I observed the following sequence of events when running gfx/layers/apz/test/mochitest/browser_test_background_tab_load_scroll.js in one of the cases that it fails for me locally.
-ScrollFrameHelper::NotifyApzTransaction is called after sending a transaction to webrender, this sets mAllowScrollOriginDowngrade = true.
-the scroll frame is destroyed and re-created (I'm not sure why this destroy and re-create happened, I didn't investigate), we call SaveState and RestoreState. So now mLastScrollOrigin == Other, and mAllowScrollOriginDowngrade == true.
-the test calls ScrollBy, so we get a call to ScrollToImpl with aOrigin == Relative
-aOrigin gets changed to Other here https://searchfox.org/mozilla-central/rev/9b0bdcc37419e6765223358a31a4a54d62e1cd97/layout/generic/nsGfxScrollFrame.cpp#2979
-the code just below that determines that it is not a downgrade, so we update mLastScrollOrigin to Other (not a change) and set mAllowScrollOriginDowngrade = false
-ScrollFrameHelper::NotifyApzTransaction is _not_ called (probably because we are in a background tab), so mAllowScrollOriginDowngrade remains false
-the scroll frame is now at position 20000 and the visual viewport offset (stored on the presshell) is also 20000.
-the scroll frame is destroyed and re-created (this one I investigated, it's because the font subsystem has finished loading and asks for a global reconstruct https://searchfox.org/mozilla-central/rev/9b0bdcc37419e6765223358a31a4a54d62e1cd97/layout/base/PresShell.cpp#9830), we call SaveState and RestoreState. So mLastScrollOrigin == Other, and mAllowScrollOriginDowngrade == true (the same values).
-we get a call to ScrollToRestoredPosition for the newly re-created scroll frame, we are trying to scroll from the current layout scroll position of 0 to the one we had before destruction and re-creation (20000). The visual viewport offset that is stored on the presshell has not been affected, it is still 20000.
-this gets to ScrollToImpl with aOrigin == Restore
-we arrive at https://searchfox.org/mozilla-central/rev/9b0bdcc37419e6765223358a31a4a54d62e1cd97/layout/generic/nsGfxScrollFrame.cpp#3000 and Restore is a downgrade from Other, and mAllowScrollOriginDowngrade is false so we do not change mLastScrollOrigin (still Other) and mAllowScrollOriginDowngrade remains false
-we arrive at the code in ScrollToImpl that updates the visual viewport offset on the presshell https://searchfox.org/mozilla-central/rev/9b0bdcc37419e6765223358a31a4a54d62e1cd97/layout/generic/nsGfxScrollFrame.cpp#3232
-mLastScrollOrigin is Other, which can clobber apz, so we enter the if and add the difference between the before/after layout scroll position from this ScrollToImpl call (20000-0 = 20000) to the visual viewport offset (20000), getting 40000, which gets clamped to the max ~38000.
-and now the layout scroll position and the visual viewport offset have been detached when they shouldn't be.
If we reset the visual viewport offset when the scroll frame is destroyed we avoid the problem here.
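The arithmetic in the last steps can be checked with a short sketch. This is Python, not the ScrollToImpl code; the function name is hypothetical, and 38000 stands in for the approximate maximum quoted above.

```python
def updated_visual_offset(visual_offset, layout_before, layout_after, max_offset):
    """Add the layout scroll delta to the visual offset, then clamp."""
    delta = layout_after - layout_before
    return min(max(visual_offset + delta, 0), max_offset)


MAX_OFFSET = 38000  # assumed stand-in for the "~38000" clamp

# Stale visual offset of 20000 survives the scroll frame re-creation.
detached = updated_visual_offset(20000, 0, 20000, MAX_OFFSET)

# Visual offset reset to 0 when the scroll frame is destroyed.
in_sync = updated_visual_offset(0, 0, 20000, MAX_OFFSET)
```

With the stale offset the result is clamped to the maximum and no longer matches the layout scroll position of 20000; with the reset, both end up at 20000.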
Differential Revision: https://phabricator.services.mozilla.com/D139792
Before this change, OtherPid() would only be initialized for one side of an
actor pair when opening an in-process actor using `IToplevelProtocol::Open` or
`IToplevelProtocol::OpenOnSameThread`. This changes the function to directly
accept the other protocol, and initializes the value correctly in all cases.
Differential Revision: https://phabricator.services.mozilla.com/D137168
The changes in this patch stack will change how IPDL generated files are built,
such that these issues are now surfaced, especially during non-unified builds.
Differential Revision: https://phabricator.services.mozilla.com/D137165
I had assumed in bug 1750858 that we would only need to modify RenderBufferTextureHost.
It turns out macOS will also still use the ffmpeg decoder and so RenderExternalTextureHost
must report cropped YCbCr sizes as well.
Differential Revision: https://phabricator.services.mozilla.com/D139679
RecvClearCachedResources() did not handle parent commands; they were handled by a different transaction, which caused the problem. To avoid it, modify RecvClearCachedResources() to handle parent commands.
Differential Revision: https://phabricator.services.mozilla.com/D139568
* Add support for local scale factors to a surface, allowing it to
be rasterized in root coordinate space. This allows snapping to
work across surfaces where the surface transform is a fractional
offset.
* Calculate scaling factors per rasterized surface and propagate
them. Ensures correct scale factor calculations when dealing with
nested preserve-3d contexts with 90-degree axis rotations.
* Support determining exact surface device rect for 2d surfaces
with fractional surface transforms.
* Fix line decoration cache key size calculations based on world
scaling factor.
* Remove `get_clipped_device_rect` usage for calculating clip-mask
surface allocations, use `surface.get_surface_rect` instead. The
prior method doesn't correctly account for expanded local regions
from the current dirty rect, resulting in invalidation issues in
some animated edge cases. Also unifies the way clip-mask surface
allocations work with the way general render target surface
allocations work.
Differential Revision: https://phabricator.services.mozilla.com/D138982
This patch introduces a number of subtle but important changes
to how we deal with off-screen surfaces. The overall goals are:
- Improve rendering correctness in a number of edge cases.
- Begin reducing complexity related to surfaces, scaling
factors, surface size adjustments and clipping.
- Improve CPU performance by removing some per-primitive work.
- Simplify implementation of future SVG and CSS filters by
having explicit support for picture rects + inflation regions.
- Lay the groundwork for caching child picture surfaces,
reduction of per-primitive work during visibility pass,
simplifying picture code.
Unfortunately, the nature of the changes makes it impossible to
split them up into small isolated patches. Details below:
* Introduce `LocalRectKind` concept. This allows us to separate
out the bounding rect of the surface (a group of primitives
backed by a texture) from the bounding rect of the picture
compositing that surface (e.g. a drop-shadow which draws the
surface once at the local origin and once at a specific offset
+ blur-radius). This fixes a number of correctness bugs we have
related to culling, clipping, invalidation regions of complex
primitives such as drop-shadows and blur filters. Importantly,
it makes it simpler to implement (or fix) SVG filter chains,
backdrop-filter implementations.
* Establish raster roots for all off-screen surfaces. Every off-screen
surface uses the spatial node of the enclosing stacking context as
a coordinate system root, ensuring that each off-screen surface is
drawn in a 2D coordinate system, with appropriate scaling factors
applied to ensure high quality rendering. The primary goal is to make
it possible to correctly inflate and clip off-screen surfaces, removing
some correctness issues we currently have with complex filters interacting
with transforms. The initial work here doesn't reduce complexity a huge
amount, but will allow us to simplify large parts of the picture/surface
handling code in future, as well as simplify a number of shaders that
currently must handle arbitrarily complex transform matrices. This will
also allow us to simplify the implementation of features such as
mix-blend-mode and backdrop-filter, which rely on readback and UV mapping
from the parent surface.
* Remove concepts of `estimated` and `precise` local rects for pictures. This
is both a performance optimization and a code simplification. Instead, we
only determine the estimated local rect during bounding rect propagation,
and rely on the clipping regions from the tile dirty regions to reduce which
parts of the picture we allocate if drawing to an off-screen surface. This
removes some per-primitive work during the visibility pass, and also means
we can rely on the final picture bounding rect from the start of the visibility
pass. This also removes much of the complexity in `take_context` where we
previously determined surface scale factors and device pixel ratio - instead
these can be determined earlier during `propagate_bounding_rects`.
* Remove some complexity in `update_prim_visibility`. This is still recursive,
but follow up patches will aim to remove this recursion and integrate this
pass with the picture graph (similar to how `propagate_bounding_rects` works).
* Remove `PictureOptions` struct. Instead, store `inflate_if_required` with
the Blur filter enum, which is the only place that uses it.
* Remove `root_scaling_factor` from text runs - this is handled implicitly
by the surface device-pixel scale.
* Skip calling `update_clip_task` for pass-through pictures (since they have
no defined local rect).
* Improve scaling factors used for determining the render task cache size for
complex line decorations.
Differential Revision: https://phabricator.services.mozilla.com/D137569
Now that we send everything (except sometimes the user value
is sanitized) we should no longer perform this check.
This is also good because it eliminates security code you
have to have (and thus accidentally omitting it is a
vulnerability) and changes it to security code that happens
automatically, and is enforced by the compiler (via a mandatory
ctor argument).
Differential Revision: https://phabricator.services.mozilla.com/D138684
This simplifies the number of negations needed,
and makes things easy to understand. I think
anyway; I know that without renaming it I made
several annoying-to-diagnose negation errors...
Differential Revision: https://phabricator.services.mozilla.com/D138682
While we do so, also add a boolean argument to indicate
if we are in a _Content_ process or some other type of
subprocess, which I expect we will need later.
Differential Revision: https://phabricator.services.mozilla.com/D138681
We want to eventually crash if a blocklisted preference
is accessed. In order to do this, we do need to
populate the preference in the pref hashmap of the
subprocess; otherwise when we look up a blocklisted
pref we will just not find anything. We could try to
put the blocklist check at that point; but this won't
work for StaticPrefs; we'd also need to put the blocklist
check there.
Performing a list iteration and string comparison on
every Static Pref call is not acceptable when we can
just populate a bit and check it.
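A minimal sketch of the trade-off, in Python. Everything here is hypothetical (the prefix list, `PrefEntry`, `populate`, `get_pref`), including the choice to drop the value of a blocklisted pref; the point is only that the string comparison happens once at population time and each lookup checks a stored bit.

```python
from dataclasses import dataclass

# Hypothetical blocklist; real entries are not taken from the source.
BLOCKLIST_PREFIXES = ("example.blocked.",)


@dataclass
class PrefEntry:
    value: object
    is_blocklisted: bool


def populate(raw_prefs):
    """Build the subprocess table, doing the string matching only once."""
    table = {}
    for name, value in raw_prefs.items():
        blocked = name.startswith(BLOCKLIST_PREFIXES)
        # Assumption of this sketch: a blocked entry carries no value.
        table[name] = PrefEntry(None if blocked else value, blocked)
    return table


def get_pref(table, name):
    """Look up a pref; the blocklist check is a single bit test."""
    entry = table[name]
    if entry.is_blocklisted:
        raise PermissionError(f"blocklisted pref accessed: {name}")
    return entry.value


table = populate({"example.allowed.flag": True, "example.blocked.secret": "x"})
allowed = get_pref(table, "example.allowed.flag")
```

Because the blocklisted entry exists in the table, a lookup finds it and can fail loudly instead of simply reporting that the pref is missing.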
Differential Revision: https://phabricator.services.mozilla.com/D138679
It was previously only called where we create a stacking context helper: before entering the grouping code and before pushing an active item into the wr display list, and not between the active items and the next blob group. This caused the leaf clip rect of the active item to leak into the next blob group.
Differential Revision: https://phabricator.services.mozilla.com/D136906
We need them for SVG primitives.
This patch adds a bit of plumbing to disable snapping some of the primitives and forcing the antialiasing shader feature where needed, and uses it for SVG solid rectangles and images.
Differential Revision: https://phabricator.services.mozilla.com/D139024
This makes WR properly handle mPicSize when RenderBufferTextureHost is used.
The main change is that we need to take care to pass in display().Size() from
the descriptor, and then further use that to carefully limit the size of the
CbCr texture, as it doesn't necessarily maintain an appropriate half-sized
scale with respect to the Y texture if it is padded.
Given that mPicSize should now actually work, we should no longer need any
of the previous mCroppedSize mechanisms that were added to work around this,
and so they are removed in this patch.
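The CbCr size limit described above could be sketched roughly as follows. This is only an illustration and assumes 4:2:0 subsampling; `IntSize` here is a stand-in struct, and `ChromaSizeFor` and its parameter names are hypothetical, not the names used in the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a gfx size type.
struct IntSize {
  int32_t width;
  int32_t height;
};

// Derive the usable CbCr size from the display (picture) size instead of
// from the possibly padded Y plane, then clamp it to what the CbCr plane
// actually holds. Assumes chroma is half resolution in both axes (4:2:0).
inline IntSize ChromaSizeFor(IntSize aDisplaySize, IntSize aCbCrPlaneSize) {
  assert(aDisplaySize.width >= 0 && aDisplaySize.height >= 0);
  // Round up so an odd-sized picture still covers its last column/row.
  IntSize half{(aDisplaySize.width + 1) / 2, (aDisplaySize.height + 1) / 2};
  return IntSize{std::min(half.width, aCbCrPlaneSize.width),
                 std::min(half.height, aCbCrPlaneSize.height)};
}
```

With a 1919x1080 picture stored in a plane padded to 1024x544 chroma samples, this yields 960x540 rather than the padded plane size.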
Differential Revision: https://phabricator.services.mozilla.com/D139267
If the Android system kills the GPU process to free memory while the
app is in the background, then we want to avoid immediately restarting
the GPU process.
To achieve this, we make GPUProcessManager keep track of whether it is
in the foreground or background. If HandleProcessLost() gets called
while in the background then we destroy the existing compositor
sessions as before, but return early instead of immediately
relaunching the process. If the process has not been launched when the
app later gets foregrounded then we do so then.
The final part of HandleProcessLost(), which reinitializes the content
bridges and emits the "compositor-reinitialized" signal, has been
moved to a new function ReinitializeRendering(). If the GPU process
has been disabled, this gets called as-before at the end of
HandleProcessLost(). When the GPU process is enabled, however, we now
call it from OnProcessLaunchComplete(), so that it gets called
regardless of whether the process is launched immediately or after a
delay.
While we're here, rename the functions RebuildRemoteSessions() and
RebuildInProcessSessions() to DestroyRemoteCompositorSessions() and
DestroyInProcessCompositorSessions(), to better reflect what they
actually do: the "rebuilding" part occurs later on. Also update the
mega-comment documenting the restart sequence, as it was somewhat
outdated.
In case a caller of EnsureGPUReady() gets called before the foreground
signal arrives (e.g. in nsBaseWidget::CreateCompositorSession() due to a
refresh tick paint), make EnsureGPUReady() launch the GPU process
itself if the GPU process is enabled but not yet launched. As a
consequence, to avoid launching the GPU process unnecessarily, change
a couple callers of EnsureGPUReady() to simply check whether the
process is enabled instead.
Additionally, guard against a null pointer deref if the compositor has
been destroyed when the widget receives a memory pressure event. This
is now more likely to occur as there may be a gap between the
compositor being destroyed and recreated.
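As a rough illustration of the control flow described above, here is a heavily simplified toy model. It is a sketch under stated assumptions, not the real code: `ToyGpuProcessManager` and its members are hypothetical, the launch is modelled as synchronous, and session destruction is not modelled at all.

```cpp
#include <cassert>

// Hypothetical, heavily simplified model of the restart policy. None of
// these names are the real GPUProcessManager API.
class ToyGpuProcessManager {
 public:
  void OnBackground() { mInForeground = false; }

  void OnForeground() {
    mInForeground = true;
    // Catch up on a relaunch that was skipped while in the background.
    if (mEnabled && !mLaunched) {
      Launch();
    }
  }

  void HandleProcessLost() {
    mLaunched = false;
    // Compositor sessions are destroyed here in every case (not modelled).
    if (!mEnabled) {
      ReinitializeRendering();
      return;
    }
    if (!mInForeground) {
      return;  // Defer the relaunch until the app is foregrounded.
    }
    Launch();
  }

  // A paint can arrive before the foreground signal, so launch on demand.
  bool EnsureGPUReady() {
    if (mEnabled && !mLaunched) {
      Launch();
    }
    return mLaunched;
  }

  bool IsLaunched() const { return mLaunched; }
  int ReinitCount() const { return mReinitCount; }

 private:
  void Launch() {
    assert(mEnabled && !mLaunched);
    mLaunched = true;
    ReinitializeRendering();  // Stands in for OnProcessLaunchComplete().
  }
  void ReinitializeRendering() { ++mReinitCount; }

  bool mEnabled = true;
  bool mInForeground = true;
  bool mLaunched = true;
  int mReinitCount = 0;
};
```

In this model, losing the process while backgrounded leaves it unlaunched until either `OnForeground()` or `EnsureGPUReady()` runs, and rendering is reinitialized exactly once per launch.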
Differential Revision: https://phabricator.services.mozilla.com/D139042
This patch separates the tracking and submission of paint/hit-test rects, ensuring that we can remove invisible parts from the blobs without breaking hit testing.
Differential Revision: https://phabricator.services.mozilla.com/D139134
Currently within DrawTargetWebGL, the Skia framebuffer and external software
surfaces are in BGRA, while the WebGL framebuffer is in RGBA. This requires
swizzling all software surfaces to RGBA on upload and swizzling surfaces
back to BGRA on readback. This can either require intermediate surfaces or
extra costly processing.
Now that CopyToSwapChain is available with support for a BGRA blit, we can
remove these complications by just treating the WebGL framebuffer as if it
was BGRA directly, so that any uploads or readbacks do not require a swizzle.
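For reference, the per-pixel work that each upload or readback previously implied looks roughly like the sketch below. `SwizzleBGRA_RGBA` is a hypothetical helper written for illustration, not a function from the tree; it assumes tightly packed 8-bit, 4-channel pixels.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Swap the first and third byte of every 4-byte pixel in place. The same
// operation converts BGRA to RGBA and RGBA back to BGRA.
inline void SwizzleBGRA_RGBA(std::vector<uint8_t>& aPixels) {
  assert(aPixels.size() % 4 == 0);
  for (size_t i = 0; i + 3 < aPixels.size(); i += 4) {
    std::swap(aPixels[i], aPixels[i + 2]);
  }
}
```

Treating the WebGL framebuffer as BGRA means this pass, or an intermediate surface to hold its result, is no longer needed on the upload and readback paths.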
Differential Revision: https://phabricator.services.mozilla.com/D139066
For use within accelerated Canvas2D, it is expedient to have a variant of Present
that explicitly acknowledges that there is a copy from a supplied WebGL framebuffer
into the swap chain back buffer, as is usually the case when remoting, so that it
can be relied upon for format conversions or other concerns. This allows Present
to remain simple and assume rendering happened directly to the back buffer, without
need of a copy.
This backs out some of the earlier changes to Present in 1754130 in favor of
separating the new copy behavior into an explicit CopyToSwapChain interface,
which supports a format swizzle that is useful for converting between Canvas2D
and WebGL swap chain formats. The behavior of CopyToSwapChain, as noted,
assumes that there is a supplied WebGL framebuffer that must be copied to the
swap chain back buffer before any compositing should occur.
Differential Revision: https://phabricator.services.mozilla.com/D138954
* Add support for local scale factors to a surface, allowing it to
be rasterized in root coordinate space. This allows snapping to
work across surfaces where the surface transform is a fractional
offset.
* Calculate scaling factors per rasterized surface and propagate
them. Ensures correct scale factor calculations when dealing with
nested preserve-3d contexts with 90-degree axis rotations.
* Support determining exact surface device rect for 2d surfaces
with fractional surface transforms.
* Fix line decoration cache key size calculations based on world
scaling factor.
* Remove `get_clipped_device_rect` usage for calculating clip-mask
surface allocations, use `surface.get_surface_rect` instead. The
prior method doesn't correctly account for expanded local regions
from the current dirty rect, resulting in invalidation issues in
some animated edge cases. Also unifies the way clip-mask surface
allocations work with the way general render target surface
allocations work.
Differential Revision: https://phabricator.services.mozilla.com/D138982
This patch introduces a number of subtle but important changes
to how we deal with off-screen surfaces. The overall goals are:
- Improve rendering correctness in a number of edge cases.
- Begin reducing complexity related to surfaces, scaling
factors, surface size adjustments and clipping.
- Improve CPU performance by removing some per-primitive work.
- Simplify implementation of future SVG and CSS filters by
having explicit support for picture rects + inflation regions.
- Lay the groundwork for caching child picture surfaces,
reduction of per-primitive work during visibility pass,
simplifying picture code.
Unfortunately, the nature of the changes makes it impossible to
split them up into small isolated patches. Details below:
* Introduce `LocalRectKind` concept. This allows us to separate
out the bounding rect of the surface (a group of primitives
backed by a texture) from the bounding rect of the picture
compositing that surface (e.g. a drop-shadow which draws the
surface once at the local origin and once at a specific offset
+ blur-radius). This fixes a number of correctness bugs we have
related to culling, clipping, invalidation regions of complex
primitives such as drop-shadows and blur filters. Importantly,
it makes it simpler to implement (or fix) SVG filter chains,
backdrop-filter implementations.
* Establish raster roots for all off-screen surfaces. Every off-screen
surface uses the spatial node of the enclosing stacking context as
a coordinate system root, ensuring that each off-screen surface is
drawn in a 2D coordinate system, with appropriate scaling factors
applied to ensure high quality rendering. The primary goal is to make
it possible to correctly inflate and clip off-screen surfaces, removing
some correctness issues we currently have with complex filters interacting
with transforms. The initial work here doesn't reduce complexity a huge
amount, but will allow us to simplify large parts of the picture/surface
handling code in future, as well as simplify a number of shaders that
currently must handle arbitrarily complex transform matrices. This will
also allow us to simplify the implementation of features such as
mix-blend-mode and backdrop-filter, which rely on readback and UV mapping
from the parent surface.
* Remove concepts of `estimated` and `precise` local rects for pictures. This
is both a performance optimization and a code simplification. Instead, we
only determine the estimated local rect during bounding rect propagation,
and rely on the clipping regions from the tile dirty regions to reduce which
parts of the picture we allocate if drawing to an off-screen surface. This
removes some per-primitive work during the visibility pass, and also means
we can rely on the final picture bounding rect from the start of the visibility
pass. This also removes much of the complexity in `take_context` where we
previously determined surface scale factors and device pixel ratio - instead
these can be determined earlier during `propagate_bounding_rects`.
* Remove some complexity in `update_prim_visibility`. This is still recursive,
but follow up patches will aim to remove this recursion and integrate this
pass with the picture graph (similar to how `propagate_bounding_rects` works).
* Remove `PictureOptions` struct. Instead, store `inflate_if_required` with
the Blur filter enum, which is the only place that uses it.
* Remove `root_scaling_factor` from text runs - this is handled implicitly
by the surface device-pixel scale.
* Skip calling `update_clip_task` for pass-through pictures (since they have
no defined local rect).
* Improve scaling factors used for determining the render task cache size for
complex line decorations.
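To illustrate the surface rect versus picture rect distinction from the `LocalRectKind` item above, here is a small sketch for the drop-shadow case. It is written in C++ purely for illustration (WebRender itself is Rust); `Rect`, `DropShadowPictureRect` and the `aInflation` parameter are hypothetical, and how the inflation is derived from the blur radius is deliberately not modelled.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical axis-aligned rect in local space.
struct Rect {
  float x0, y0, x1, y1;
};

inline Rect Translate(Rect r, float dx, float dy) {
  return Rect{r.x0 + dx, r.y0 + dy, r.x1 + dx, r.y1 + dy};
}

inline Rect Inflate(Rect r, float amount) {
  return Rect{r.x0 - amount, r.y0 - amount, r.x1 + amount, r.y1 + amount};
}

inline Rect Union(Rect a, Rect b) {
  return Rect{std::min(a.x0, b.x0), std::min(a.y0, b.y0),
              std::max(a.x1, b.x1), std::max(a.y1, b.y1)};
}

// The surface rect bounds the primitives backed by the texture. The picture
// rect bounds everything the picture draws when compositing that surface:
// once at the local origin, and once offset and blurred.
inline Rect DropShadowPictureRect(Rect aSurfaceRect, float aDx, float aDy,
                                  float aInflation) {
  assert(aInflation >= 0.0f);
  Rect shadow = Inflate(Translate(aSurfaceRect, aDx, aDy), aInflation);
  return Union(aSurfaceRect, shadow);
}
```

Culling or invalidating against the surface rect alone would miss the part of the shadow that extends past the original content, which is the kind of correctness issue the separation is meant to address.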
Differential Revision: https://phabricator.services.mozilla.com/D137569
This patch removes more main thread dependencies from the content side
of WebGPU. Instead of issuing a resource update for an external image,
we now use an async image pipeline in conjunction with
CompositableInProcessManager from part 1. This allows us to update the
HTMLCanvasElement bound to the WebGPU device without having to go
through the main thread, or even the content process after the swap
chain update / readback has been requested.
Differential Revision: https://phabricator.services.mozilla.com/D138887
For WebGPU, we produce the textures in the compositor process and the
content process doesn't need to be that involved except for hooking up
the texture to the display list. Currently this is done via an external
image ID.
Given that WebGPU needs to work with OffscreenCanvas, it would be best
if its display pipeline was consistent whether it was obtained from an
HTMLCanvasElement, OffscreenCanvas on the main thread, or on a worker
thread. As such, using an AsyncImagePipeline would be best.
However there is no real need to bounce the handles across process
boundaries. Hence this patch which adds CompositableInProcessManager.
This static class is responsible for collecting WebRenderImageHost
objects backed by TextureHost objects which do not leave the compositor
process. This will allow WebGPUParent to schedule compositions directly
in future patches.
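The general shape of such a static, in-process registry might look like the sketch below. All names are hypothetical (`ToyInProcessManager`, `ToyImageHost`, the integer handle), and the real class may key and lock things quite differently; this only shows the idea of looking hosts up by handle without crossing a process boundary.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical stand-in for a WebRenderImageHost.
struct ToyImageHost {
  int mWidth = 0;
};

// Static-only registry: hosts are added and looked up by handle, and never
// leave the process that owns them.
class ToyInProcessManager {
 public:
  static uint64_t Add(std::shared_ptr<ToyImageHost> aHost) {
    assert(aHost);
    State& s = Get();
    std::lock_guard<std::mutex> lock(s.mMutex);
    uint64_t handle = ++s.mNextHandle;
    s.mHosts.emplace(handle, std::move(aHost));
    return handle;
  }

  static std::shared_ptr<ToyImageHost> Find(uint64_t aHandle) {
    State& s = Get();
    std::lock_guard<std::mutex> lock(s.mMutex);
    auto it = s.mHosts.find(aHandle);
    if (it == s.mHosts.end()) {
      return nullptr;
    }
    return it->second;
  }

  static void Release(uint64_t aHandle) {
    State& s = Get();
    std::lock_guard<std::mutex> lock(s.mMutex);
    s.mHosts.erase(aHandle);
  }

 private:
  struct State {
    std::mutex mMutex;
    uint64_t mNextHandle = 0;
    std::map<uint64_t, std::shared_ptr<ToyImageHost>> mHosts;
  };
  static State& Get() {
    static State sState;
    return sState;
  }
};
```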
Differential Revision: https://phabricator.services.mozilla.com/D138588
If the Android system kills the GPU process to free memory while the
app is in the background, then we want to avoid immediately restarting
the GPU process.
To achieve this, we make GPUProcessManager keep track of whether it is
in the foreground or background. If HandleProcessLost() gets called
while in the background then we destroy the existing compositor
sessions as before, but instead of immediately relaunching the process
then reinitializing content bridges, just set a flag saying we need to
do so. If the flag is set when the app later gets foregrounded then we
continue where we left off.
While we're here, rename the functions RebuildRemoteSessions() and
RebuildInProcessSessions() to DestroyRemoteCompositorSessions() and
DestroyInProcessCompositorSessions(), to better reflect what they
actually do: the "rebuilding" part occurs later on. Also update the
mega-comment documenting the restart sequence, as it was somewhat
outdated.
In case a caller of EnsureGPUReady() gets called before the foreground
signal arrives (e.g. in nsBaseWidget::CreateCompositorSession() due to a
refresh tick paint), make EnsureGPUReady() launch the GPU process
itself if the GPU process is enabled but not yet launched.
Additionally, guard against a null pointer deref if the compositor has
been destroyed when the widget receives a memory pressure event. This
is now more likely to occur as there may be a gap between the
compositor being destroyed and recreated.
Differential Revision: https://phabricator.services.mozilla.com/D139042
Rather than using a separate function
APZCTreeManager::AddInputBlockCallback(), allow an optional callback
argument to be passed to APZInputBridge::ReceiveInputEvent().
This avoids a race condition where the input block is processed before
the callback has a chance to be added to the input queue, meaning the
callback will never be fired. With the GPU process enabled the added
latency of adding the callback cross-process meant this occurred
frequently, causing intermittent junit test failures.
This works similarly to before, with the in-process APZCTreeManager
implementation simply calling InputQueue::AddInputBlockCallback() and
the InputQueue will fire the callback when the input block has been
processed. For the cross-process implementation, APZInputBridgeChild
maintains a local map of the callbacks, and APZInputBridgeParent adds
an intermediate callback to its local InputQueue. This intermediate
callback will call APZInputBridgeParent::SendCallInputBlockCallback(),
which tells the APZInputBridgeChild to fire the real callback.
Care must be taken in both APZInputBridgeChild and Parent to only add
their respective callbacks if we determine that it is required, e.g. not
already handled by the root APZ.
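The race and its fix can be demonstrated with a toy queue. This is a single-threaded sketch with hypothetical names (`ToyInputQueue` and its methods only borrow the names from the description above); the real classes have different signatures and run across threads and processes.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>

using InputBlockCallback = std::function<void(uint64_t aBlockId)>;

class ToyInputQueue {
 public:
  // New shape: the callback travels with the event, so it is registered
  // before the block can possibly be processed.
  uint64_t ReceiveInputEvent(InputBlockCallback aCallback = nullptr) {
    uint64_t id = ++mLastBlockId;
    if (aCallback) {
      mCallbacks.emplace(id, std::move(aCallback));
    }
    return id;
  }

  // Old shape: a separate call, which can arrive after ProcessBlock().
  void AddInputBlockCallback(uint64_t aBlockId, InputBlockCallback aCallback) {
    assert(aCallback);
    mCallbacks.emplace(aBlockId, std::move(aCallback));
  }

  // Fires the callback registered for this block, if there is one.
  void ProcessBlock(uint64_t aBlockId) {
    auto it = mCallbacks.find(aBlockId);
    if (it == mCallbacks.end()) {
      return;
    }
    InputBlockCallback cb = std::move(it->second);
    mCallbacks.erase(it);
    cb(aBlockId);
  }

 private:
  uint64_t mLastBlockId = 0;
  std::map<uint64_t, InputBlockCallback> mCallbacks;
};
```

If `ProcessBlock()` runs between `ReceiveInputEvent()` and a later `AddInputBlockCallback()`, the callback is stored but never fired. Passing it to `ReceiveInputEvent()` removes that window.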
Differential Revision: https://phabricator.services.mozilla.com/D138801
As Pango does, we get font feature settings for the specified fonts from the
user's fontconfig. This will not affect users who don't use this feature,
because they will not have a fontfeatures setting in their fontconfig.
Here is an example config I used for testing:
<match target="pattern">
  <test name="family" compare="eq" ignore-blanks="true">
    <string>Iosevka</string>
  </test>
  <edit name="fontfeatures" mode="append">
    <string>ss10 on</string> <!-- using SS10 variant shape -->
    <string>cv49 1</string> <!-- straight y -->
    <string>cv83 2</string> <!-- underline, not too high, not too low -->
  </edit>
</match>
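For illustration, strings like the ones in the config above map to a feature tag plus a value roughly as follows. This is a minimal sketch, not the parser the patch uses: `FontFeature` and `ParseFontFeature` are hypothetical, and the accepted grammar (a 4-character tag, then an optional `on`, `off` or non-negative integer) is an assumption.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <cstdlib>
#include <sstream>
#include <string>

// Hypothetical parsed form of one fontfeatures string.
struct FontFeature {
  std::string tag;
  uint32_t value = 0;
};

// Parse "tag", "tag on", "tag off" or "tag N". Returns false on anything
// else. A missing value is treated as "on" (assumption).
inline bool ParseFontFeature(const std::string& aText, FontFeature* aOut) {
  assert(aOut);
  std::istringstream in(aText);
  std::string tag;
  std::string value;
  in >> tag;
  if (tag.size() != 4) {
    return false;
  }
  aOut->tag = tag;
  if (!(in >> value) || value == "on") {
    aOut->value = 1;
    return true;
  }
  if (value == "off") {
    aOut->value = 0;
    return true;
  }
  if (!std::isdigit(static_cast<unsigned char>(value[0]))) {
    return false;
  }
  char* end = nullptr;
  unsigned long n = std::strtoul(value.c_str(), &end, 10);
  if (*end != '\0') {
    return false;
  }
  aOut->value = static_cast<uint32_t>(n);
  return true;
}
```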
Signed-off-by: Coelacanthus <coelacanthus@outlook.com>
Differential Revision: https://phabricator.services.mozilla.com/D138757
Record the trace of each instance into "$WGPU_TRACE/$IDX/" where $IDX is a monotonically increasing integer per instance.
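A per-instance directory name of this form can be produced with a process-wide atomic counter. The sketch below is illustrative only: `NextTraceDir` is a hypothetical helper, starting the index at 0 is an assumption, and the root is passed in explicitly rather than read from the `WGPU_TRACE` environment variable.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <string>

// Returns "<root>/<idx>/", where idx increases by one on every call for the
// lifetime of the process, so each instance gets its own directory.
inline std::string NextTraceDir(const std::string& aRoot) {
  static std::atomic<uint64_t> sNextIndex{0};
  assert(!aRoot.empty());
  uint64_t idx = sNextIndex.fetch_add(1, std::memory_order_relaxed);
  return aRoot + "/" + std::to_string(idx) + "/";
}
```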
Differential Revision: https://phabricator.services.mozilla.com/D138814
Most of the support for presenting a WebGLFramebuffer to a swap chain existed as part of the
mechanism for opaque WebXR framebuffer support. However, such "opaque" framebuffers are meant
to be opaque in the sense that their attachments can't be inspected or changed, which does
not provide the requisite level of control for efficiently implementing Canvas2D snapshots.
To this end, the existing Present mechanism is slightly extended to allow presenting to the
swap chain already present in WebGLFramebuffer without the existence of a corresponding
MozFramebuffer.
This also fixes a bug in that AsWebgl() was no longer being utilized in CanvasRenderer, such
that a new mechanism that routed GetFrontBuffer() was needed to fix the code rot.
There are also some efforts to remove a couple redundant copies I noticed in profiles along
the way.
Differential Revision: https://phabricator.services.mozilla.com/D138119
Now that each DrawTargetWebgl shares the same WebGL context, we can efficiently draw snapshots
of one DrawTargetWebgl to another without requiring any readback or error-prone driver-provided
shared context/resource mechanism. We simply need to pass the WebGL texture from one target
to the other, and use it like any other texture.
This provides SourceSurfaceWebgl to store and pass along that WebGL texture. It is largely
modeled off of SourceSurfaceSkia in terms of its copy-on-write behavior. There are three
noteworthy state changes that it must track from DrawTargetWebgl - when the framebuffer
contents are changing, when the framebuffer is being destroyed, and when any cached texture
handle separate from a framebuffer is also being destroyed. It will copy, orphan, or read
back data as appropriate to handle each case.
If it needs to be mapped, it just forces a read back of the data into a CPU surface that
can be mapped as requested.
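The copy-on-write relationship between a target and its snapshot can be sketched with a toy model. Everything here is hypothetical (`ToyTarget`, `ToySnapshot`, a `std::vector<int>` standing in for a WebGL texture), and only the "contents are changing" case is modelled explicitly; the shared ownership merely hints at the orphaning case.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

using Texture = std::vector<int>;  // Stand-in for a WebGL texture.

class ToySnapshot {
 public:
  explicit ToySnapshot(std::shared_ptr<Texture> aShared)
      : mTexture(std::move(aShared)) {}

  const Texture& Data() const { return *mTexture; }

  // Called by the target before its contents change: stop sharing and keep
  // a private copy of the old contents.
  void DetachByCopy() {
    assert(mTexture);
    mTexture = std::make_shared<Texture>(*mTexture);
  }

 private:
  std::shared_ptr<Texture> mTexture;
};

class ToyTarget {
 public:
  ToyTarget() : mTexture(std::make_shared<Texture>(Texture{0})) {}

  // Taking a snapshot is cheap: it shares the target's texture.
  std::shared_ptr<ToySnapshot> Snapshot() {
    if (!mSnapshot) {
      mSnapshot = std::make_shared<ToySnapshot>(mTexture);
    }
    return mSnapshot;
  }

  // Any draw invalidates the outstanding snapshot's view of the texture.
  void Draw(int aValue) {
    if (mSnapshot) {
      mSnapshot->DetachByCopy();
      mSnapshot.reset();
    }
    mTexture->push_back(aValue);
  }

 private:
  std::shared_ptr<Texture> mTexture;
  std::shared_ptr<ToySnapshot> mSnapshot;
};
```

A snapshot taken before a draw keeps the old contents, while a snapshot taken afterwards sees the new ones, and no copy is made if the target is never drawn to again.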
Differential Revision: https://phabricator.services.mozilla.com/D138118