Using ipc::Shmem causes unbounded shmem use growth until e.g. a Worker
yields to the event loop. If a Worker never yields, Shmems sent to WebGLParent
are never released.
Specifically the manager (PCanvasManager) for WebGLParent calls
DestroySharedMemory, which sends/enqueues for WebGLChild's manager a
matching call to ShmemDestroyed. However, while WebGLChild refuses to spin its
event loop (such as a no-return WASM Worker), the ShmemDestroyed events
will just pile up. Closing e.g. the tab frees the shmems, but they accumulate
unbounded until the Worker yields to the event loop.
This is true for other users of ipc::Shmem (or RaiiShmem) as well, but
entrypoints other than DispatchCommands are rarer and can be handled
later similarly.
Differential Revision: https://phabricator.services.mozilla.com/D162946
Using ipc::Shmem causes unbounded shmem use growth until e.g. a Worker
yields to the event loop. If a Worker never yields, Shmems sent to WebGLParent
are never released.
Specifically the manager (PCanvasManager) for WebGLParent calls
DestroySharedMemory, which sends/enqueues for WebGLChild's manager a
matching call to ShmemDestroyed. However, while WebGLChild refuses to spin its
event loop (such as a no-return WASM Worker), the ShmemDestroyed events
will just pile up. Closing e.g. the tab frees the shmems, but they accumulate
unbounded until the Worker yields to the event loop.
This is true for other users of ipc::Shmem (or RaiiShmem) as well, but
entrypoints other than DispatchCommands are rarer and can be handled
later similarly.
Differential Revision: https://phabricator.services.mozilla.com/D162946
When rendering large and/or fullscreen Canvas2Ds, excessive time can be spent
in calls to TexImage/ReadPixels copying into and out of Shmems to the separate
buffer for DrawTargetSkia. To alleviate this, we can make the DrawTargetSkia
directly wrap the Shmem, so that calls to TexImage/ReadPixels then directly
read or write to this without any separate copy. We modify RawTexImage to use
the IPDL SendTexImage path so that Shmems can be sent via SurfaceDescriptor.
Since SendTexImage is nominally async (which is beneficial), we rely on a
call to GetError later to verify that the Shmem processing is completely before
we further modify the DrawTargetSkia. We further add a ReadPixelsIntoShmem IPDL
call to allow sending the Shmem in the other direction directly.
Differential Revision: https://phabricator.services.mozilla.com/D151286
When rendering large and/or fullscreen Canvas2Ds, excessive time can be spent
in calls to TexImage/ReadPixels copying into and out of Shmems to the separate
buffer for DrawTargetSkia. To alleviate this, we can make the DrawTargetSkia
directly wrap the Shmem, so that calls to TexImage/ReadPixels then directly
read or write to this without any separate copy. We modify RawTexImage to use
the IPDL SendTexImage path so that Shmems can be sent via SurfaceDescriptor.
Since SendTexImage is nominally async (which is beneficial), we rely on a
call to GetError later to verify that the Shmem processing is completely before
we further modify the DrawTargetSkia. We further add a ReadPixelsIntoShmem IPDL
call to allow sending the Shmem in the other direction directly.
Differential Revision: https://phabricator.services.mozilla.com/D151286
When rendering large and/or fullscreen Canvas2Ds, excessive time can be spent
in calls to TexImage/ReadPixels copying into and out of Shmems to the separate
buffer for DrawTargetSkia. To alleviate this, we can make the DrawTargetSkia
directly wrap the Shmem, so that calls to TexImage/ReadPixels then directly
read or write to this without any separate copy. We modify RawTexImage to use
the IPDL SendTexImage path so that Shmems can be sent via SurfaceDescriptor.
Since SendTexImage is nominally async (which is beneficial), we rely on a
call to GetError later to verify that the Shmem processing is completely before
we further modify the DrawTargetSkia. We further add a ReadPixelsIntoShmem IPDL
call to allow sending the Shmem in the other direction directly.
Differential Revision: https://phabricator.services.mozilla.com/D151286
Currently CopyToSwapChain creates spurious copies of the back buffer when SharedSurfaces aren't exportable (= ToSurfaceDescriptor returns Nothing from SharedSurface_Basic). These then later get read back into a CPU memory buffer when PresentFrontBufferToCompositor is used to send the buffer to RemoteTextureMap. This has associated performance and memory costs.
Conceptually, we want Present/CopyToSwapChain to just do the right thing and automatically push buffers to RemoteTextureMap, rather than secondarily needing a hidden call to PresentFrontBufferToCompositor. Then we can get rid of the need to create front buffers whose only purpose is to shuttle data to PresentFrontBufferToCompositor which then shuttles it RemoteTextureMap.
This patch achieves this by refactoring the guts of PresentFrontBufferToCompositor into Present/CopyToSwapChain. The remote texture ids are sent along inside SwapChainOptions if async present is enabled. Those remote texture ids are cached in ClientWebGLContext so that GetFrontBuffer can return them without any subsequent need for an IPDL call.
On the parent side, CopyToSwapChain will now notice if async present is to be used and if a SurfaceFactory does not generate SharedSurfaces that can be exported. In that case it cuts out the middle man and reads from the WebGLFramebuffer's back buffer directly into a CPU buffer to send to RemoteTextureMap.
This also adds a forceAsyncPresent option to SwapChainOptions so that in the future we can have a separate pref for Accelerated Canvas2D that will allow enabling async present independent of the global WebGL pref.
Differential Revision: https://phabricator.services.mozilla.com/D150720
The async front buffer posting is going to be enabled by another bug.
Async IPC was added for async front buffer posting for out-of-process WebGL.
Client does not use TextureClient for storing SurfaceDescriptor.
It works basically same way as to in-process WebGL around nsDisplayCanvas, WebRenderCanvasData, WebRenderCommandBuilder and WebRenderBridgeParent.
SharedSurfaces of SurfaceDescriptorD3D10 are kept alive during their usage. It is for keeping a shread handle valid.
Copied data buffers of SharedShurface_Basics are kept alive during their usage. It is for keeping RenderBufferTextureHost valid.
Differential Revision: https://phabricator.services.mozilla.com/D150197
The async front buffer posting is going to be enabled by another bug.
Async IPC was added for async front buffer posting for out-of-process WebGL.
Client does not use TextureClient for storing SurfaceDescriptor.
It works basically same way as to in-process WebGL around nsDisplayCanvas, WebRenderCanvasData, WebRenderCommandBuilder and WebRenderBridgeParent.
SharedSurfaces of SurfaceDescriptorD3D10 are kept alive during their usage. It is for keeping a shread handle valid.
Copied data buffers of SharedShurface_Basics are kept alive during their usage. It is for keeping RenderBufferTextureHost valid.
Differential Revision: https://phabricator.services.mozilla.com/D150197
Also:
* Add label for WebGLContext::Create.
* *Remove* label for WebGLContext::Draw*, since the labels for these are
showing up in draw-call-heavy profiles. (~1.5%)
* WebGLContext::UniformData continues to not have a label, since that
would likely add another 2-3% labelling overhead.
* Add missing !mHost check to RecvTexImage.
Differential Revision: https://phabricator.services.mozilla.com/D134883
Right now, if we wanted to get a snapshot of an OffscreenCanvas context
on a worker thread from the main thread, we would need to block the main
thread on the worker thread, and from the worker thread, issue a sync
IPC call get the front buffer snapshot.
This patch adds an alternative which allows the main thread to go
directly to the canvas owning thread in the compositor process to get
the snapshot, thereby bypassing the worker thread in content process
entirely. All it needs as the unique ID of the CanvasManagerChild
instance, and the protocol ID of the WebGLChild instance.
This will be used for Firefox screenshots, New Tab tiles, and printing.
Differential Revision: https://phabricator.services.mozilla.com/D130785
Right now, if we wanted to get a snapshot of an OffscreenCanvas context
on a worker thread from the main thread, we would need to block the main
thread on the worker thread, and from the worker thread, issue a sync
IPC call get the front buffer snapshot.
This patch adds an alternative which allows the main thread to go
directly to the canvas owning thread in the compositor process to get
the snapshot, thereby bypassing the worker thread in content process
entirely. All it needs as the unique ID of the CanvasManagerChild
instance, and the protocol ID of the WebGLChild instance.
This will be used for Firefox screenshots, New Tab tiles, and printing.
Differential Revision: https://phabricator.services.mozilla.com/D130785
Previously we called GetFrontBufferSize, alloc'd a buffer, and called
FrontBufferSnapshotInto to read into the buffer.
Now, call FrontBufferSnapshotInto({}) to get size, and then alloc and
pass pointer to newly alloc'd data into
FrontBufferSnapshotInto(Some(ptr)).
Using the same function for both means that logic can't diverge and
cause mismatch bugs.
Differential Revision: https://phabricator.services.mozilla.com/D113611
Previously we called GetFrontBufferSize, alloc'd a buffer, and called
FrontBufferSnapshotInto to read into the buffer.
Now, call FrontBufferSnapshotInto({}) to get size, and then alloc and
pass pointer to newly alloc'd data into
FrontBufferSnapshotInto(Some(ptr)).
Using the same function for both means that logic can't diverge and
cause mismatch bugs.
Differential Revision: https://phabricator.services.mozilla.com/D113611
* Use clearer pref names.
* Default (and only support) IPDL dispatching.
* Make DispatchCommands async-only.
* Sync ipdl command per sync webgl entrypoint.
* Eat the boilerplate cost, since there's not too many.
* Run SerializedSize off same path as Serialize.
* All shmem uploads go through normal DispatchCommands.
* Defer pruning of dead code for now so we can iterate quickly.
* Use Read/Write(begin,end) instead of (begin,size).
* This would have prevented a bug where we read/wrote N*sizeof(T)*sizeof(T).
Differential Revision: https://phabricator.services.mozilla.com/D81495
* Majorly simplity CanvasRenderer
* Replace GLScreenBuffer with trivial GLSwapChain
* Use descriptor structs so that future SharedSurface changes aren't so painful
to propagate
* Mortgage/strip out more OffscreenCanvas code for now
Differential Revision: https://phabricator.services.mozilla.com/D75055
* Majorly simplity CanvasRenderer
* Replace GLScreenBuffer with trivial GLSwapChain
* Use descriptor structs so that future SharedSurface changes aren't so painful
to propagate
* Mortgage/strip out more OffscreenCanvas code for now
Differential Revision: https://phabricator.services.mozilla.com/D75055
* Majorly simplity CanvasRenderer
* Replace GLScreenBuffer with trivial GLSwapChain
* Use descriptor structs so that future SharedSurface changes aren't so painful
to propagate
* Mortgage/strip out more OffscreenCanvas code for now
Differential Revision: https://phabricator.services.mozilla.com/D75055
* Majorly simplity CanvasRenderer
* Replace GLScreenBuffer with trivial GLSwapChain
* Use descriptor structs so that future SharedSurface changes aren't so painful
to propagate
* Mortgage/strip out more OffscreenCanvas code for now
Differential Revision: https://phabricator.services.mozilla.com/D75055