Bug 1479145 - Give RGB textures a 32-byte aligned stride on macOS in order to improve texture upload efficiency on certain drivers. r=mattwoodrow

In particular, it looks like this alignment is required by the Intel driver on
macOS if you want to avoid CPU copies.

It was already known that the efficiency gains from client storage only
materialize if you follow certain restrictions:
 - The textures need to use the TEXTURE_RECTANGLE_ARB texture target.
 - The textures' format, internalFormat and type need to be chosen from a small
   list of supported configurations. Unsupported configurations will trigger
   format conversions on the CPU.
 - The GL_TEXTURE_STORAGE_HINT_APPLE texture parameter may need to be set to
   shared or cached.
 - glTextureRangeAPPLE may or may not make a difference.

It now appears that the stride alignment is another requirement:

When uploading textures which otherwise comply with the above requirements, the
Intel driver will still make copies using the CPU if the texture's stride is not
32-byte aligned. These CPU copies show up as high CPU usage (as observed in
Activity Monitor) and appear in profiles as time spent inside
_platform_memmove under glrUpdateTexture.

However, when uploading textures that have a 32-byte aligned stride and comply
with the above requirements, this CPU usage goes away. There might still be
hardware copies behind the scenes, but they no longer take up CPU time.
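
To make the alignment requirement concrete, here is a minimal sketch of how a
32-byte aligned stride can be computed. AlignedStrideSketch is a hypothetical
helper written for illustration; it is not the GetAlignedStride implementation
in the tree, and its handling of invalid input (returning 0) is an assumption.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not the in-tree GetAlignedStride): rounds the byte
// width of one row up to the next multiple of Alignment.
// Returns 0 for non-positive input or if the result does not fit in int32_t.
template <int32_t Alignment>
int32_t AlignedStrideSketch(int32_t aWidth, int32_t aBytesPerPixel) {
  static_assert(Alignment > 0 && (Alignment & (Alignment - 1)) == 0,
                "Alignment must be a power of two");
  if (aWidth <= 0 || aBytesPerPixel <= 0) {
    return 0;
  }
  // Do the arithmetic in 64 bits so the multiplication cannot overflow.
  const int64_t rowBytes = int64_t(aWidth) * int64_t(aBytesPerPixel);
  const int64_t stride =
      (rowBytes + (Alignment - 1)) & ~int64_t(Alignment - 1);
  if (stride > std::numeric_limits<int32_t>::max()) {
    return 0;
  }
  return int32_t(stride);
}
```

For example, a 100 pixel wide texture with 4 bytes per pixel has 400 bytes of
pixel data per row. With 4-byte alignment the stride stays at 400 bytes; with
32-byte alignment it is padded to 416 bytes. A 1920 pixel wide texture already
has a 7680-byte row, which is a multiple of 32, so it needs no padding.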

Differential Revision: https://phabricator.services.mozilla.com/D25316

--HG--
extra : moz-landing-system : lando
Markus Stange 2019-03-29 20:11:12 +00:00
Parent c3a3d09307
Commit baea6b435f
1 changed file: 5 additions and 0 deletions

@@ -22,7 +22,12 @@ namespace ImageDataSerializer {
 using namespace gfx;
 
 int32_t ComputeRGBStride(SurfaceFormat aFormat, int32_t aWidth) {
+#ifdef XP_MACOSX
+  // Some drivers require an alignment of 32 bytes for efficient texture upload.
+  return GetAlignedStride<32>(aWidth, BytesPerPixel(aFormat));
+#else
   return GetAlignedStride<4>(aWidth, BytesPerPixel(aFormat));
+#endif
 }
 
 int32_t GetRGBStride(const RGBDescriptor& aDescriptor) {