Instead of setting a ceiling on how much we are willing to allocate, we
should always try to reallocate a buffer the same size that we used
before. If we cannot, we should fail gracefully and just not preallocate
at all. This should help with profiles where we consistently need a
larger payload buffer than the previous maximum, while minimizing OOM
crashes on the other side, where our preallocations fail and/or are more
than the next display list update actually requires.
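
A minimal sketch of the fallback described above, assuming the payload is a
plain `Vec<u8>`. The helper name `preallocate_payload` is hypothetical and is
not WebRender's actual API; only the standard `Vec::try_reserve` call is real.

```rust
/// Try to preallocate a payload buffer with the capacity used last time.
/// Hypothetical helper for illustration only.
pub fn preallocate_payload(previous_capacity: usize) -> Vec<u8> {
    let mut buffer: Vec<u8> = Vec::new();
    // `try_reserve` reports an allocation failure as an error instead of
    // aborting the process.
    if buffer.try_reserve(previous_capacity).is_err() {
        // Fail gracefully: skip preallocation and let the buffer grow on demand.
        return Vec::new();
    }
    buffer
}
```

A caller would pass the size of the previous payload, and an impossible
request simply yields an empty, unallocated buffer.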
Differential Revision: https://phabricator.services.mozilla.com/D154951
This patch refactors how clip chains are internally represented and used
during scene and frame building. The intent is to make clip processing
during frame building more efficient and consistent. Additionally, this
work enables follow ups to cache the result of clip-chain builds between
frame and scene builds.
These changes will significantly reduce the cost of the visibility pass
for the common case when not much content has changed. In this patch,
the public API for clipping remains (mostly) the same, in order to allow
landing and stabilising this work without major changes to Gecko. However,
a longer term goal is to make the public WR clip API more closely match
the internal representation, to reduce work done during scene building.
Clips on a primitive can be categorized into two buckets. The first are
local clips that are specific to the primitive and move with it. These
could essentially be considered part of the definition of the primitive
itself. The second are a hierarchy of clips that apply to one or more
items, and may move independently of the primitive(s) they clip. These
clips are things like scroll regions, stacking context clips, iframe
clip regions etc. On (real world) pages, the clip hierarchy is typically
quite shallow, with a small number of clips that are shared by a large
number of primitives.
Finding clips that are shared between primitives is both required (for
things such as determining which picture cache slice a primitive can
be assigned to, while applying the shared clips during composition), and
also a potential optimization (processing shared clips only once and
caching this clip state for similar primitives).
The public clip-chain API has two complexities that make the above
difficult and time consuming for WR to determine. First, it was possible
to express a clipping hierarchy both via the legacy clip parenting path
(via `ClipId` definitions) and also via clip-chains (the `parent`
field of a `ClipChain`). Second, clip-chains themselves can define
an arbitrary number and ordering of clips. Clips can also implicitly
apply to primitives via parent stacking contexts and iframes, but must
sometimes be removed (when an intermediate surface is created) for
performance reasons.
The new internal representation provided by this patch introduces a
`ClipTree` structure which is built during scene building by accumulating
the set of clips that apply to a primitive from all explicit and implicit
sources, and grafting this on to the existing clip-tree structure.
This provides WR a simple way to determine which clips are shared between
primitives (by checking ancestry) and reduces the size of the internal
representation (by sharing clips where possible rather than duplicating).
Interning is still used to identify parts of the clip-tree that define
the same clipping state.
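
The ancestry check can be sketched as follows. This is an illustration under
stated assumptions, not the data structure from the patch: the `ClipTree`,
`NodeId`, `add` and `shared_ancestor` definitions below are hypothetical, and
a `u32` stands in for an interned clip handle.

```rust
/// Index of a node in the tree (illustrative).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct NodeId(usize);

struct Node {
    parent: Option<NodeId>,
    clip: u32, // stand-in for an interned clip handle
}

#[derive(Default)]
pub struct ClipTree {
    nodes: Vec<Node>,
}

impl ClipTree {
    /// Graft a clip under `parent`, reusing an existing child that already
    /// holds the same clip so shared clips are stored only once.
    pub fn add(&mut self, parent: Option<NodeId>, clip: u32) -> NodeId {
        let existing = self
            .nodes
            .iter()
            .position(|n| n.parent == parent && n.clip == clip);
        if let Some(index) = existing {
            return NodeId(index);
        }
        self.nodes.push(Node { parent, clip });
        NodeId(self.nodes.len() - 1)
    }

    /// Number of ancestors above `id`.
    fn depth(&self, mut id: NodeId) -> usize {
        let mut depth = 0;
        while let Some(parent) = self.nodes[id.0].parent {
            id = parent;
            depth += 1;
        }
        depth
    }

    /// Deepest node that is an ancestor of (or equal to) both `a` and `b`.
    /// Returns `None` when the two nodes share no clips at all.
    pub fn shared_ancestor(&self, mut a: NodeId, mut b: NodeId) -> Option<NodeId> {
        let (mut depth_a, mut depth_b) = (self.depth(a), self.depth(b));
        // Bring both nodes to the same depth, then walk up in lockstep.
        while depth_a > depth_b {
            a = self.nodes[a.0].parent?;
            depth_a -= 1;
        }
        while depth_b > depth_a {
            b = self.nodes[b.0].parent?;
            depth_b -= 1;
        }
        while a != b {
            a = self.nodes[a.0].parent?;
            b = self.nodes[b.0].parent?;
        }
        Some(a)
    }
}
```

Two primitives whose leaf clips hang off the same iframe clip resolve to that
iframe node, so every clip from there up to the root is shared between them.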
Specific changes in this patch:
* Remove legacy `ClipId` style parenting support (in conjunction with
previous patches)
* Remove the public API ability to specify the clip on a primitive via
`ClipId` (it must now be a clip-chain)
* Remove `combined_local_clip_rect` from `PrimitiveInstance`, reducing
the size of the structure significantly
* Introduce `ClipTree` used during frame building, which is created by
`ClipTreeBuilder` during scene building
* Separate out per-primitive clip concept (`ClipTreeLeaf`) from clipping
hierarchy (`ClipTreeNode`). In future, more elements will be moved to
the `ClipTreeLeaf` and the state of each `ClipTreeNode` will be cached
* Simplify the logic to disable / remove clips during frame building that
are applied by parent surface(s)
* Port hit-testing to be based on `ClipTree` which is simpler, faster and
also resolves some edge case correctness bugs
* Use a simpler and faster method to find shared clips during picture
cache slice assignment of primitives
* Update wrench to use the public clip-chain API definition changes
This patch already introduces some real-world optimizations (for example,
`displaylist_mutate` becomes 6% faster overall), but mostly sets things
up for follow up patches to be able to cache clip-state between frames,
which should result in much larger wins.
Differential Revision: https://phabricator.services.mozilla.com/D151987
This is mostly just changing a small number of structs and function
params (most of the work has been done in previous patches).
Differential Revision: https://phabricator.services.mozilla.com/D150987
Also make the parent in ClipTemplate an Option, so that the
semantics are a bit clearer (follow up patches will remove this
parent field entirely).
Differential Revision: https://phabricator.services.mozilla.com/D150980
## Summary
Pass the fixed position element animation id through webrender, returning the
animation id in the hit-test result if the element is a fixed position
element. This animation id can then be used to look up the relevant Hit-Testing
Tree Node, which can be used to find the fixed (or sticky) position side bits.
## Motivation
Sticky content may or may not currently be stuck to the root content, depending
on the scroll position. As a result, when hit testing sticky content, APZ needs
both the sticky position side bits and additional information to determine if
the element is currently stuck to the root content. This is needed to fix the
hit-testing of sticky position content when an APZ transform is being applied,
such as overscroll and hiding the dynamic toolbar.
## Implementation
The information needed to determine if an element is currently stuck to the root
content and the fixed/sticky position side bits is already stored in the
hit-testing tree node. Any hit test result should have a corresponding
hit-testing tree node entry. When a hit-test result contains an animation id and
a hit-testing tree node is found, we can store a pointer to this node and use
this to check the fixed/sticky position side bits. Something similar is already
done for hit test results when a scrollbar is hit.
Differential Revision: https://phabricator.services.mozilla.com/D148648
We formerly published webrender to crates.io, but haven't done so in
several years. However, the in-tree version number still matches the
version published on crates.io, causing cargo-vet to flag that this is
something that should potentially be audited. We could silence that on
the cargo-vet side, but then if we ever started publishing it again
we'd miss the nudge to certify the audit (which would be useful to
anyone consuming it). So bumping the versions to a not-yet-published
number is a good way to correctly articulate the situation.
Differential Revision: https://phabricator.services.mozilla.com/D150055
Change to derive from nsDisplayEffectsBase, since the backdrop-filter
can still have a visual bounds (and effect) even if the child
stacking context has no items. Also use the same approach as
nsDisplayFilters to get the bounding rect and implement GetBounds.
Differential Revision: https://phabricator.services.mozilla.com/D145295
This is required to be done in WR rather than Gecko since a backdrop
filter may exist in a child iframe display list but sample from a
backdrop root in a parent iframe (if the iframe has transparency
enabled). In this case the parent display list has no knowledge
that a backdrop filter primitive is present.
Differential Revision: https://phabricator.services.mozilla.com/D146149
Although not needed right now (checkerboarding backgrounds get
a slice anyway due to being a different scroll root), this will
be important for the upcoming work to make backdrop filter
roots implicit. This allows WR to know when slicing up a content
slice if the prim is relevant to the backdrop root.
Differential Revision: https://phabricator.services.mozilla.com/D146145
Add support for backdrop-filter primitive in WR. Each backdrop
filter primitive establishes a render task sub-graph. The primitive
instance registers itself as a resolve source to receive the
pixels from the backdrop root.
Also add a number of basic wrench tests for various backdrop-filter
use cases.
Differential Revision: https://phabricator.services.mozilla.com/D143334
The shared font namespace was getting allocated by the RenderBackend, while
we otherwise allocated namespaces for documents on the client. We need to make
sure that if namespace allocation happens on the client, that it also happens
on the client for the shared font namespace. This extends RendererOptions so
that a shared font namespace allocated by the client can be passed in.
Differential Revision: https://phabricator.services.mozilla.com/D142381
We schedule additional composites as long as there is a CompositeUntil timestamp. These composites need a special tag and shouldn't carry the tags of whatever led us to check the timestamps. In addition, we should add the ASYNC_IMAGE tag when GetAndResetWillGenerateFrame returns true even if some other reason led us to check it.
Depends on D142233
Differential Revision: https://phabricator.services.mozilla.com/D142234
Also differentiate between skipped composites (too many pending frames) and discarded composites (paused or no display list).
Differential Revision: https://phabricator.services.mozilla.com/D142232
This replaces the sharing of SharedFontInstanceMap with a new structure,
SharedFontResources, that can be shared between threads of a single
Renderer instance as a mechanism for de-duplicating fonts and font instances.
SharedFontResources stores maps of FontTemplates and FontInstances as well
as a new FontKeyMap and FontInstanceKeyMap which handles the mapping of
namespace-local font keys to a shared key. The shared key then maps to
the real, de-duplicated resource (template or instance) which has a lifetime
beyond that of any individual namespace that may refer to it. Reference
counting is used to track the lifetime of the shared key so that when no key
map entries refer to the shared key any longer, it will then expire and
be cleaned up. This does cause some complications with clearing a namespace
in that rather than simply crawling through a table looking for resources
with a given IdNamespace, we have to check for shared keys that have expired
when clearing a namespace caused the last references to their mappings
to be removed.
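
The reference-counting scheme can be sketched roughly as below. This is a
simplified, single-threaded illustration, not the patch's code: `KeyMap`,
`LocalKey` and `SharedKey` are hypothetical stand-ins for the key maps, and
the real structures would also need locking to be shared between threads.

```rust
use std::collections::HashMap;

/// A namespace-local key (illustrative stand-in for a font key).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct LocalKey {
    pub namespace: u32,
    pub id: u32,
}

/// Key of the de-duplicated resource that local keys map to.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct SharedKey(pub u32);

#[derive(Default)]
pub struct KeyMap {
    local_to_shared: HashMap<LocalKey, SharedKey>,
    ref_counts: HashMap<SharedKey, usize>,
}

impl KeyMap {
    /// Map a local key to a shared key, taking one reference on the shared key.
    /// A local key that is already mapped is left unchanged.
    pub fn add(&mut self, local: LocalKey, shared: SharedKey) {
        if self.local_to_shared.contains_key(&local) {
            return;
        }
        self.local_to_shared.insert(local, shared);
        *self.ref_counts.entry(shared).or_insert(0) += 1;
    }

    /// Look up the shared key for a local key.
    pub fn get(&self, local: &LocalKey) -> Option<SharedKey> {
        self.local_to_shared.get(local).copied()
    }

    /// Drop every mapping in `namespace` and return the shared keys whose last
    /// reference was removed, so the caller can delete those resources.
    pub fn clear_namespace(&mut self, namespace: u32) -> Vec<SharedKey> {
        let mut expired = Vec::new();
        let ref_counts = &mut self.ref_counts;
        self.local_to_shared.retain(|local, shared| {
            if local.namespace != namespace {
                return true; // keep mappings from other namespaces
            }
            let shared = *shared;
            let count = ref_counts
                .get_mut(&shared)
                .expect("every mapped shared key has a reference count");
            *count -= 1;
            let is_last = *count == 0;
            if is_last {
                ref_counts.remove(&shared);
                expired.push(shared);
            }
            false // drop the mapping
        });
        expired
    }
}
```

Clearing a namespace therefore returns only the shared keys that no other
namespace still refers to, which is the extra check the paragraph above
describes.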
Given that ApiResources handles the up-front addition of font templates
and instances, while ResourceCache within the RenderBackend handles deletion,
most of these mappings have to be shared between threads, which is why they
live within SharedFontResources. When resource updates are processed by
either ApiResources or ResourceCache, we create a shared key as necessary to
add the font resource, and then delete the shared font resource when a resource
update caused the last reference to the resource's shared key to expire.
This only tries to de-duplicate fonts within a single Renderer (window). Since
each Renderer has its own texture cache and dependent glyph cache, sharing
across multiple windows would require extra complication with storing font
bitmaps outside of the texture cache and outside the Renderer instance itself.
For the sake of simplicity and to better understand how de-duplication impacts
performance, this patch only tries to address sharing within a single window.
Differential Revision: https://phabricator.services.mozilla.com/D140561
P010 is trivially the same as NV12, but the 10-bit colors are packed into
the most significant bits instead of the least significant bits. This changes
the yuv shader to use the correct packing for P010. It treats P010 as its
own yuv format, which requires a lot of scaffolding.
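
The packing difference can be illustrated with a few CPU-side helpers. This is
only a sketch of the bit layout, assuming each sample occupies a 16-bit word;
the function names are hypothetical and the actual change lives in the yuv
shader, not in code like this.

```rust
/// Pack a 10-bit sample the way P010 stores it: in the most significant
/// bits of a 16-bit word, with the low 6 bits left as zero.
pub fn pack_p010(value: u16) -> u16 {
    (value & 0x03FF) << 6
}

/// Recover the 10-bit sample from a P010 word.
pub fn unpack_p010(word: u16) -> u16 {
    word >> 6
}

/// Normalize a P010 word to the range 0.0..=1.0.
pub fn normalize_p010(word: u16) -> f32 {
    f32::from(unpack_p010(word)) / 1023.0
}

/// Normalize a word whose 10-bit sample sits in the least significant bits.
pub fn normalize_lsb10(word: u16) -> f32 {
    f32::from(word & 0x03FF) / 1023.0
}
```

Reading a P010 word with the least-significant-bit interpretation would give
the wrong color, which is why the format needs its own handling.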
Differential Revision: https://phabricator.services.mozilla.com/D140422
This fixes an issue caused by landing Bug 1679927. Now that it's possible to
get 10-bit formats in macOS, the texture pipeline has to be updated to pass
those surfaces to WebRender.
This also contains a drive-by comment on D3D11 handling of 10-bit WebRender
display items. It was not clear why specifying 16-bit color depth for P010
format was correct there, but should be specified as 10-bit color elsewhere.
This also contains another drive-by fix to comments in webrender describing
the RG16 enum.
Depends on D136046
Differential Revision: https://phabricator.services.mozilla.com/D136823
We need them for SVG primitives.
This patch adds a bit of plumbing to disable snapping some of the primitives and forcing the antialiasing shader feature where needed, and uses it for SVG solid rectangles and images.
Differential Revision: https://phabricator.services.mozilla.com/D139024
This change mitigates the gap between the external_scroll_offset reported by
the main thread and the scroll_offset reported by APZ.
Some wrench reftests for this change are in the next commit.
Differential Revision: https://phabricator.services.mozilla.com/D133444
Depends on D136163
I have a feeling this isn't exactly the right way to pass this info, since the old WR code must have known about the perspective node without using my new flag.
Differential Revision: https://phabricator.services.mozilla.com/D136180