This provides an implementation of linear filtering via the linear_blit
routine. That required factoring out textureLinear helper functions and
reusing them to do the actual filtering of the source texture.
Some collateral cleanups involved changing IntRect to use a start/end
format instead of origin/size, cleaner ways of getting offset texture
sampler pointers, and fixing clamping of linear filter coordinates.
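As a rough illustration of the start/end layout mentioned above, the following Rust sketch stores a rectangle as two corners instead of origin/size. The field names and the exclusive-end convention are assumptions made for this example, not a copy of the actual IntRect in the tree.

```rust
/// Hypothetical integer rectangle stored as start/end corners.
/// Treating `x1` and `y1` as exclusive is an assumption of this sketch.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct IntRect {
    pub x0: i32,
    pub y0: i32,
    pub x1: i32,
    pub y1: i32,
}

impl IntRect {
    /// Convert from the older origin/size description.
    pub fn from_origin_size(x: i32, y: i32, width: i32, height: i32) -> Self {
        IntRect { x0: x, y0: y, x1: x + width, y1: y + height }
    }

    pub fn width(&self) -> i32 {
        self.x1 - self.x0
    }

    pub fn height(&self) -> i32 {
        self.y1 - self.y0
    }

    pub fn is_empty(&self) -> bool {
        self.x1 <= self.x0 || self.y1 <= self.y0
    }

    /// With start/end corners, intersection is just min/max per corner.
    pub fn intersect(&self, other: &IntRect) -> IntRect {
        IntRect {
            x0: self.x0.max(other.x0),
            y0: self.y0.max(other.y0),
            x1: self.x1.min(other.x1),
            y1: self.y1.min(other.y1),
        }
    }
}
```

One appeal of this layout is that clipping and intersection need no additions or subtractions, only comparisons on the corners.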
Differential Revision: https://phabricator.services.mozilla.com/D70968
The FrameTexturesUpdated event is intended to be issued when any
necessary texture cache updates have been completed for the current
frame. Originally we would simply check if the pending_texture_updates
vector in Renderer is empty. However the updates themselves could be
nops, and not trigger a timely frame render to occur, causing the
notifications to be delayed indefinitely. Now we track whether any
texture cache updates specifically are actually pending, and only defer
the notification if so.
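A minimal Rust sketch of the idea, assuming a hypothetical `TextureUpdateList` with an `is_nop` check and a boolean flag on the renderer. The type, field, and method names other than `pending_texture_updates` are illustrative and not taken from the patch.

```rust
/// Hypothetical texture update list; an empty list is treated as a no-op.
#[derive(Debug, Default)]
pub struct TextureUpdateList {
    pub cache_updates: Vec<u32>,
}

impl TextureUpdateList {
    pub fn is_nop(&self) -> bool {
        self.cache_updates.is_empty()
    }
}

/// Hypothetical, heavily reduced renderer state.
#[derive(Debug, Default)]
pub struct Renderer {
    pub pending_texture_updates: Vec<TextureUpdateList>,
    /// True only if a queued list contains real texture cache work.
    pub pending_texture_cache_updates: bool,
}

impl Renderer {
    pub fn queue_updates(&mut self, list: TextureUpdateList) {
        // Track real work separately from the mere presence of a list.
        self.pending_texture_cache_updates |= !list.is_nop();
        self.pending_texture_updates.push(list);
    }

    /// The notification is deferred only while real updates are pending.
    pub fn can_notify_frame_textures_updated(&self) -> bool {
        !self.pending_texture_cache_updates
    }

    /// Rendering a frame flushes all queued updates.
    pub fn render(&mut self) {
        self.pending_texture_updates.clear();
        self.pending_texture_cache_updates = false;
    }
}
```

With this split, a queue that holds only no-op lists no longer blocks the notification, while real updates still defer it until the next render.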
A consequence of deferring the notification indefinitely was revealed
when animating images just outside the current view. We would not
release the buffers for recycling even though we had no reason to hold
onto them. This could cause the image decoder to allocate new buffers
while the old buffers would not get released. It was not a true memory
leak because as soon as a frame is rendered, it would release all of the
old buffers, but it could easily cause an out-of-memory crash to occur if
the user was not interacting with the browser while in this state.
Differential Revision: https://phabricator.services.mozilla.com/D70770
Previously we were using 256 bytes (which happens to be equal to 64
pixels for RGBA8 textures). A recent change to the GPU Cache uncovered
the fact that the requirement is actually 64 pixels, e.g. 1024 bytes for
RGBAF32 textures.
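The arithmetic can be sketched as follows. The `Format` enum and function names are hypothetical and exist only for this example; the per-pixel sizes (4 bytes for RGBA8, 16 bytes for RGBAF32) follow from the numbers quoted above.

```rust
/// Hypothetical subset of texture formats, for illustration only.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Format {
    Rgba8,
    RgbaF32,
}

impl Format {
    /// Size of one pixel in bytes.
    pub fn bytes_per_pixel(self) -> usize {
        match self {
            Format::Rgba8 => 4,    // four 8-bit channels
            Format::RgbaF32 => 16, // four 32-bit float channels
        }
    }
}

/// The requirement expressed in pixels rather than bytes.
pub const REQUIRED_PIXELS: usize = 64;

/// The same requirement converted to bytes for a given format.
pub fn required_bytes(format: Format) -> usize {
    REQUIRED_PIXELS * format.bytes_per_pixel()
}
```

Deriving the byte count from the format avoids hard-coding 256, which is only correct for 4-byte pixels.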
Differential Revision: https://phabricator.services.mozilla.com/D70915
--HG--
extra : moz-landing-system : lando
These just wrap the regular std Sender/Receiver without providing any value. Serialize/Deserialize was implemented manually for MsgSender/MsgReceiver only to assert. Serde provides annotations that remove the need to implement the traits for some enum variants and assert at runtime instead, which is functionally equivalent but less error-prone than the fake trait implementations.
Removing the rest of channel.rs is coming in a followup.
Differential Revision: https://phabricator.services.mozilla.com/D70021
--HG--
extra : moz-landing-system : lando
Only drop targets from the render target pool when the size of
the pool is larger than an arbitrary threshold (this is 32 MB
for now), _and_ the render target hasn't been used in the last
60 frames of rendering.
This reduces allocation thrashing of textures in
the render target pool on most pages.
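The eviction rule can be sketched roughly as below. The struct, field, and function names are hypothetical, and the exact boundary handling (whether a target idle for exactly 60 frames is dropped) is an assumption of this example rather than something stated in the patch.

```rust
/// Hypothetical pooled render target; field names are illustrative.
#[derive(Debug, Clone, PartialEq)]
pub struct PooledTarget {
    pub size_in_bytes: usize,
    pub last_frame_used: u64,
}

/// Threshold values taken from the description above.
pub const POOL_SIZE_THRESHOLD_BYTES: usize = 32 * 1024 * 1024;
pub const MAX_IDLE_FRAMES: u64 = 60;

/// Drop idle targets, but only while the pool is over the size threshold.
pub fn gc_pool(pool: &mut Vec<PooledTarget>, current_frame: u64) {
    let mut total: usize = pool.iter().map(|t| t.size_in_bytes).sum();
    pool.retain(|t| {
        // Assumption: "idle" means unused for at least MAX_IDLE_FRAMES frames.
        let idle = current_frame.saturating_sub(t.last_frame_used) >= MAX_IDLE_FRAMES;
        if total > POOL_SIZE_THRESHOLD_BYTES && idle {
            total -= t.size_in_bytes;
            false // evict this target
        } else {
            true // keep it in the pool
        }
    });
}
```

Both conditions must hold before a target is dropped, so a small pool keeps its idle targets and a large pool keeps its recently used ones.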
Differential Revision: https://phabricator.services.mozilla.com/D70782
--HG--
extra : moz-landing-system : lando
The WR code that computed the sticky_offset didn't properly combine the offsets
from the top- and bottom- sticky calculations if an item had both. This patch
fixes the calculation, which makes the remaining test failure (in the
configuration without any dynamic toolbar) pass.
Depends on D70679
Differential Revision: https://phabricator.services.mozilla.com/D70680
--HG--
extra : moz-landing-system : lando
These assertions don't hold in some perfectly legitimate cases, such as when
items have both top and bottom sticky behaviours. The actual behaviour still
seems ok, so let's just drop the assertions.
Differential Revision: https://phabricator.services.mozilla.com/D70459
--HG--
extra : moz-landing-system : lando
Previously, the prim origin needed to be stored in the prim
instance, to avoid picture cache invalidations. With support
for external scroll offset, this is no longer necessary.
This simplifies some of the code paths, and reduces the size
of primitive instances.
Differential Revision: https://phabricator.services.mozilla.com/D70740
--HG--
extra : moz-landing-system : lando
On some (primarily older, integrated) drivers, we see significant
time in CPU stalls during updates to the vertex data textures.
As a short term fix, this patch creates an array of vertex data
textures, and rotates which set of them are in use each frame.
There are better long-term options (such as porting the GPU cache
scatter method, or perhaps using UBO/SSBOs here), but this is a
simple workaround for now.
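A minimal sketch of the rotation, assuming a hypothetical `VertexDataTextures` type standing in for one set of textures. The names and the choice to advance at the start of each frame are illustrative assumptions, not the patch's actual structure.

```rust
/// Hypothetical stand-in for one set of vertex data textures.
#[derive(Debug, Default)]
pub struct VertexDataTextures {
    pub uploads: usize,
}

/// Holds several sets and cycles through them, one per frame.
pub struct RotatingTextures {
    sets: Vec<VertexDataTextures>,
    current: usize,
}

impl RotatingTextures {
    pub fn new(count: usize) -> Self {
        assert!(count > 0, "need at least one texture set");
        RotatingTextures {
            sets: (0..count).map(|_| VertexDataTextures::default()).collect(),
            current: 0,
        }
    }

    /// Advance to the next set at the start of a frame and return its index.
    pub fn begin_frame(&mut self) -> usize {
        self.current = (self.current + 1) % self.sets.len();
        self.current
    }

    /// The set that this frame's uploads should target.
    pub fn current_mut(&mut self) -> &mut VertexDataTextures {
        &mut self.sets[self.current]
    }
}
```

Because each frame writes to a different set, an upload is less likely to wait on textures that the GPU is still reading for a previous frame.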
Differential Revision: https://phabricator.services.mozilla.com/D70775
--HG--
extra : moz-landing-system : lando
On some (mostly older, integrated) GPUs, the normal GPU texture cache
update path doesn't work well when running on ANGLE, causing CPU stalls
inside D3D and/or the GPU driver. To reduce the number of code paths we
have active that require testing, we will enable the GPU cache scatter
update path on all devices running with ANGLE. We want a better solution
long-term, but for now this is a significant performance improvement on
HD4600 era GPUs, and shouldn't hurt performance in a noticeable way on
other systems running under ANGLE.
Differential Revision: https://phabricator.services.mozilla.com/D70764
--HG--
extra : moz-landing-system : lando
Previously, it was possible for a tile that had a valid scroll root
to have an empty valid (and dirty) rect due to the picture cache
clip rect, in some situations.
This could result in the tile not being tagged as off-screen, which
means it is added to the queue of tiles to be updated. On most
platforms this is benign, but the BeginDraw method of DirectComposition
fails if the dirty rect is empty.
This patch fixes the logic so that tiles that meet these conditions
are correctly tagged as not visible, and skipped from the update queue.
Differential Revision: https://phabricator.services.mozilla.com/D70616
--HG--
extra : moz-landing-system : lando
In wr/swgl/src/gl.cc, force_clear constrains skip_end to be no less than y0, but
skip_end is a horizontal position, not a vertical position, so skip_start is the
correct lower bound.
Differential Revision: https://phabricator.services.mozilla.com/D70462
--HG--
extra : moz-landing-system : lando
Picture primitives are special, so it doesn't make sense to have
the normal common primitive data for them. For example, the
bounding rect of a picture is determined during frame building.
Removing the common data for picture primitives simplifies the
code and makes it impossible to accidentally access an invalid
bounding rect for picture primitives.
Differential Revision: https://phabricator.services.mozilla.com/D70295
--HG--
extra : moz-landing-system : lando
Most `Internable` implementations give `PrimitiveSceneData` as their
`InternData` associated type, the type of data associated with the handle in the
scene builder thread. However, nothing in the scene builder code, or anywhere in
WebRender, actually uses the contents of `PrimitiveSceneData`, so it can be
replaced with `()` with no effect on the code other than memory savings.
Differential Revision: https://phabricator.services.mozilla.com/D70121
--HG--
extra : moz-landing-system : lando
This fixes a case where the backing surface of a picture cache tile
changes, but was still being considered a no-op frame, which skips
the composite as an optimization if nothing has changed.
In this case, the tile surface changes from a rasterized texture
to a solid color that doesn't require a picture cache texture.
The patch ensures that the composite surface descriptors that are
used to detect no-op frame compositions include the backing
surface for each of the tiles that make up this virtual surface.
Differential Revision: https://phabricator.services.mozilla.com/D70138
--HG--
extra : moz-landing-system : lando
Previously, primitive dependency checking would invalidate a tile if
the spatial node index for a given primitive changed.
However, if a new display list is sent that changes the shape of
the spatial node tree this may cause unnecessary invalidations.
For example, a new display list that inserts a new spatial node at
the start of the tree could result in spatial node indices being
different, even though the values of the transforms were the same.
This patch changes the invalidation logic for spatial nodes to
compare the transforms by value, rather than index, meaning that
invalidations are avoided if the shape of the spatial tree has
changed, but the values are consistent.
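The difference between the two comparisons can be sketched as follows. The `Transform` type here is a deliberately simplified, hypothetical stand-in, and the function names are illustrative only.

```rust
/// Hypothetical simplified transform: a uniform scale plus an offset.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Transform {
    pub scale: f32,
    pub offset_x: f32,
    pub offset_y: f32,
}

/// Index of a node in a flattened spatial tree.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SpatialNodeIndex(pub usize);

/// Old behaviour: any change of index counts as a change.
pub fn changed_by_index(prev: SpatialNodeIndex, curr: SpatialNodeIndex) -> bool {
    prev != curr
}

/// New behaviour: look up both transforms and compare their values.
pub fn changed_by_value(
    prev_tree: &[Transform],
    prev: SpatialNodeIndex,
    curr_tree: &[Transform],
    curr: SpatialNodeIndex,
) -> bool {
    prev_tree[prev.0] != curr_tree[curr.0]
}
```

When a node is inserted at the start of the tree, the index of an existing node shifts from 0 to 1, so the index comparison reports a change while the value comparison does not.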
Differential Revision: https://phabricator.services.mozilla.com/D69913
--HG--
extra : moz-landing-system : lando
This implements Z plane clipping, gl_FragCoord.zw interpolation, and vertex attribute perspective-correction with support from the glsl-to-cxx compiler by using the PERSPECTIVE feature in a WR shader.
Differential Revision: https://phabricator.services.mozilla.com/D69429
--HG--
extra : moz-landing-system : lando
The `sampleInUvRect` function in `gfx/wr/webrender/res/cs_svg_filter.glsl` calls
`texture`, passing the optional third argument that specifies a bias to the
sampler's level-of-detail calculation, used for mipmap interpolation. However,
the `texture` override in `gfx/wr/swgl/src/glsl.h` that accepts this parameter
is stubbed out, with an `assert(0)`. This causes twelve tests in
`gfx/wr/wrench/reftests/filters` to crash.
Nothing in Firefox uses mipmaps, and Servo doesn't use Software WebRender, so it
should be fine for that call to `texture` to simply not pass the level-of-detail
bias.
This patch makes that change.
Differential Revision: https://phabricator.services.mozilla.com/D69867
--HG--
extra : moz-landing-system : lando
The picture cache code retains a set of tiles that are currently
off-screen but might be needed again soon, depending on how the
page is scrolled.
However, off-screen tiles were being skipped during draw processing,
which meant that the texture cache request method was not being
called on these tiles. This would often result in the texture
cache eagerly evicting these seemingly unused surface tiles.
This patch re-arranges the occlusion and visibility processing
code for tiles, so that if a tile has been retained in the
picture cache grid, the texture surface is always requested,
even if that tile is currently off-screen. This prevents the
texture cache from evicting tiles that we want to retain for now.
Differential Revision: https://phabricator.services.mozilla.com/D69760
--HG--
extra : moz-landing-system : lando
This is an attempt to handle tasks outside of the device bounds,
that belong to surfaces not establishing raster roots.
I suspect that the scaling we are now setting up in adjust_scale_for_max_surface_size
doesn't work properly, since the function was assumed to only affect the raster-rooted
surfaces. But it does fix the crash we have.
Differential Revision: https://phabricator.services.mozilla.com/D69654
--HG--
extra : moz-landing-system : lando
We expect that a child spatial node comes after its parent.
If it's not the case, we can still fall back to a full world transform,
but it's not going to be correct with regards to flattening.
As we don't know the circumstances and are unable to reproduce the issue,
making this fallback could be a reasonable thing to do for now.
Differential Revision: https://phabricator.services.mozilla.com/D69289
--HG--
extra : moz-landing-system : lando
Few of the counters actually have anything to do with IPC, although they all relate to events of layout transactions.
Depends on D69414
Differential Revision: https://phabricator.services.mozilla.com/D69415
--HG--
extra : moz-landing-system : lando
Instead of collecting so-called IPC counters when receiving the SetDisplayList on the render backend, pass the information through the scene builder thread and update the profile on the render backend after the scene is swapped. This prevents IPC counters from being displayed while the transaction is still being processed by the scene builder thread.
Differential Revision: https://phabricator.services.mozilla.com/D69414
--HG--
extra : moz-landing-system : lando
It moves when DL building + IPC + scene building takes more than 100ms.
Depends on D69247
Differential Revision: https://phabricator.services.mozilla.com/D69254
--HG--
extra : moz-landing-system : lando
This removes the WebRender side of the previous slow frame indicator and replaces it with a simple implementation that only looks at the CPU time on the render backend and renderer thread involved in building a frame.
A followup patch will add a separate indicator for when the displaylist/ipc/scene bits take too long.
Differential Revision: https://phabricator.services.mozilla.com/D69247
--HG--
extra : moz-landing-system : lando
Before this patch:
- Consume time is merely the time it takes to push something into a vector (always displays zero).
- Total IPC time and the DisplayList IPC graph measure the time between api.set_display_list and the render backend picking the message up, plus the time it took to build the display list (but don't take into account the time it took for actual IPC in between).
- Send time is only the time between api.set_display_list and the render backend picking the message up but doesn't take into account the time it took between the content thread sending the DL and the compositor thread forwarding it.
After this patch:
- Content send time measures the time between the content thread sending the display list and the compositor forwarding it (actual IPC).
- Api send time measures the time between the compositor thread forwarding the DL and the render backend picking it up.
- Consume time is removed.
- Total send time is the sum of content and api times.
- Display list build times and display list IPC (total send time) are on separate graphs.
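The relationship between the new measurements can be sketched as below, assuming three hypothetical nanosecond timestamps taken when the content thread sends the display list, when the compositor forwards it, and when the render backend picks it up. The struct and method names are illustrative and not taken from the patch.

```rust
/// Hypothetical timestamps (in nanoseconds) along the display list's path.
#[derive(Clone, Copy, Debug)]
pub struct DisplayListTimestamps {
    pub content_sent_ns: u64,
    pub compositor_forwarded_ns: u64,
    pub backend_received_ns: u64,
}

impl DisplayListTimestamps {
    /// Content send time: content thread send to compositor forward.
    pub fn content_send_ns(&self) -> u64 {
        self.compositor_forwarded_ns.saturating_sub(self.content_sent_ns)
    }

    /// Api send time: compositor forward to render backend pickup.
    pub fn api_send_ns(&self) -> u64 {
        self.backend_received_ns.saturating_sub(self.compositor_forwarded_ns)
    }

    /// Total send time: the sum of the content and api times.
    pub fn total_send_ns(&self) -> u64 {
        self.content_send_ns() + self.api_send_ns()
    }
}
```

Display list build time is intentionally absent from this struct, mirroring the split into separate graphs described above.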
Depends on D69227
Differential Revision: https://phabricator.services.mozilla.com/D69228
--HG--
extra : moz-landing-system : lando