We can easily use Maybe<DataSourceSurface::ScopedMap> instead of
allocated the map on the heap. This does require some minor changes to
ScopedMap to properly support moves, but should be much more efficient.
In FrameAnimator::GetCompositedFrame, we call SurfaceCache::Lookup even
when we use the composited frame directly and leave the lookup result
unused. The only value in performing the lookup could be to mark the
surface as used to avoid expiring it too soon, but
FrameAnimator::RequestRefresh should already be doing enough to keep it
alive, if the image isn't locked in the first place.
In FrameAnimator::RequestRefresh and AdvanceFrame, we currently create
several RawAccessFrameRef objects to the same frames, either to get
timeouts or perform the blending. With some tweaking, we can avoid
requesting the same frame more than once. This will avoid mutex locks on
the surface provider and the frame itself.
DrawableSurface only exposes DrawableFrameRef to its users. This is
sufficient for the drawing related code in general, but FrameAnimator
really needs RawAccessFrameRef to the underlying pixel data (which may
be paletted). While one can get a RawAccessFrameRef from a
DrawableFrameRef, it requires yet another lock of the imgFrame's mutex.
We can avoid this extra lock if we just allow the callers to get the
right data type in the first place.
RawAccessFrameRef ensures there is a valid data pointer to the pixel
data for the frame. It is a common pattern for users of
RawAccessFrameRef to follow up with a request for the data pointer
shortly after creation. We can avoid an extra lock by exposing this data
pointer from RawAccessFrameRef, and populating it via
imgFrame::LockImageData.
We currently choose to set the animation parameters (blend method, blend
rect, disposal method, timeout) in imgFrame::Finish instead of
imgFrame::InitForDecoder. The decoders themselves already have access to
the necessary information at the time InitForDecoder is called, so there
is no reason to do this. Moving the configuration to initialization will
allow us to relax the mutex protection on these parameters.
This part simply reorganizes imgFrame, and subsequent parts will
introduce the necessary changes to SurfacePipe and decoders.
We should avoiding creating a DrawTarget to create a new
DataSourceSurface when the original surface produced by
RasterImage::GetFrameAtSize matches our requirements in
imgTools::EncodeScaledImage. We should also be using Skia instead of
Cairo.
This patch also fixes a few error conditions where we would not have
unmapped the surface properly.
nsGIFDecoder2::YieldPixel is sufficiently complex that the optimizer
does not appear to inline it with the rest of the templated methods. As
such there is a high cost to calling it. This patch modifies it to yield
a requested number of pixels before exiting, allowing us to amortize the
cost of calling across a row instead of a pixel. Based on profiling,
this will significantly reduce the time require to decode a frame.
It has been observed in profiling that the templated methods that write
pixels to an image buffer do not always inline methods properly, leading
to a high cost of writing a single pixel if it is less than trivial. As
such, there is a new SurfacePipe method, WritePixelBlocks, which
requests pixels in blocks. The provided lambda will write up to the
requested number of pixels into the given buffer. WritePixelBlocks
itself will request enough pixels to fill the row, advance the row if
complete and iterate until it is complete or we need more data.
Regardless of the size of an encoded image, SourceBuffer::Compact would
try to consolidate all of the chunks into a single chunk. If an image is
quite large, it can be actively harmful to do this, because we want a
very large contiguous chunk of memory for no real reason, and spend
extra time on the main thread doing the memcpy/consolidation.
Instead we now cap out the chunk size at 20MB. If we start allocating
chunks of this size, we will not perform compacting when we have
received all of the data. (Save for realloc'ing the last chunk since it
probably isn't full.)
On a related note, if we hit an out-of-memory condition in the middle of
appending data to the SourceBuffer, we would swallow the error. This is
because nsIInputStream::ReadSegments will succeed if any data was
written. This leaves the SourceBuffer out of sync. We now propogate this
error up properly to the higher levels.
fixup
All animated images on a page are currently registered with the refresh
driver and advance with the tick refresh. These animations may not even
be in view, and if they are large and thus cause redecoding, cause a
marked increase in CPU usage for no benefit to the user.
This patch adds an additional flag, mCompositedFrameRequested, to the
AnimationState used by FrameAnimator. It is set to true each time the
current animated image frame is requested via
FrameAnimator::GetCompositedFrame. It is set to false each time the
frame is advanced in FrameAnimator::AdvanceFrame (via
FrameAnimator::RequestRefresh). If it is true when
FrameAnimator::RequestRefresh is called, then it will advance the
animation according to the normal rules. If it is false, then it will
set the current animation time to the current time instead of advancing.
This should not cause the animation to fall behind anymore or skip
frames more than it does today. This is because if
FrameAnimator::GetCompositedFrame is not called, then the internal state
of the animation is advancing ahead of what the user sees. Once it is
called, the new frame is far ahead of the previously displayed frame.
The only difference now is that we will display the previous frame for
slightly longer until the next refresh tick.
Note that if an animated image is layerized (should not happen today) or
otherwise uses an ImageContainer, this optimization fails. While we know
whether or not we have an image container, we do not know if anything is
actively using it.
We can discard frames from an animated image if the memory footprint
exceeds the threshold. This will cause us to redecode frames on demand
instead. However decoders can fail to produce the same results on
subsequent runs due to differences in memory pressure, etc. If this
happens our state can get inconsistent. In particular, if we keep
failing on the first frame, we end up in an infinite loop on the decoder
thread.
Since we don't have the owning image to signal, as we had to release our
reference to it after the first pass, we can do little but stop decoding.
From the user's perspective, the animation will come to a stop.
If an imgCacheValidator object is destroyed without calling
imgCacheValidator::OnStartRequest, or imgRequest::Init fails in
OnStartRequest, we left the bound proxies hanging on an update. Now we
cancel the new request, and bind the validating proxies to said request
to ensure their listeners fail gracefully.
We can discard frames from an animated image if the memory footprint
exceeds the threshold. This will cause us to redecode frames on demand
instead. However decoders can fail to produce the same results on
subsequent runs due to differences in memory pressure, etc. If this
happens our state can get inconsistent. In particular, if we keep
failing on the first frame, we end up in an infinite loop on the decoder
thread.
Since we don't have the owning image to signal, as we had to release our
reference to it after the first pass, we can do little but stop decoding.
From the user's perspective, the animation will come to a stop.
Many of these could probably be fuzzed but in the interests of getting
the reftest suite turned on sooner I'm doing a blanket fails-if. This
covers all the reftests where there is more fuzz with webrender on
windows than any of existing annotations account for. In some cases the
fuzz is only a few pixels more than the equivalent Linux fuzz already
annotated, but I'll clean that up in a future bug.
MozReview-Commit-ID: IaKarbnL46d
--HG--
extra : rebase_source : 71889340305b0b12fa8eace722e42bb3faf14419
NullPrincipal::Create() (will null OA) may cause an OriginAttributes bypass.
We change Create() so OriginAttributes is no longer optional, and rename
Create() with no arguments to make it more explicit about what the caller is doing.
MozReview-Commit-ID: 7DQGlgh1tgJ
When we shutdown the decode pool threads, it does not do a simple join
with the main thread. It will actually process the main thread event
loop, which can cause a bad series of events. The refresh tick could
still be running and advancing our animated images, causing the animated
decoders to continue running, which in turn prevents the decoder threads
from finishing shutting down, and the main thread from joining them.
Now we check on each frame whether or not the decoder should just stop
decoding more frames because the decode pool has started shutdown. If it
has, it will stop immediately.
We should only attempt to discard animation image frames after passing
the frame threshold on the very first pass on the animation. Redecodes
are already in the correct state, as it will discard frames as it
advances the animation. This patch makes it clear what it should be
doing when, but there should be no functional change.