In bug 1361258, we unified the initialization sequence on mac, and
chose to make the zone registration happen after jemalloc
initialization.
The order between jemalloc init and zone registration shouldn't actually
matter, because jemalloc initializes the first time the allocator is
actually used.
On the other hand, in some build setups (e.g. with light optimization),
the initialization of the thread_arena thread local variable can happen
after the forced jemalloc initialization because of the order the
corresponding static initializers run. In some levels of optimization,
the thread_arena initializer resets the value the jemalloc
initialization has set, which subsequently makes choose_arena() return
a bogus value (or hit an assertion in ThreadLocal.h on debug builds).
So instead of initializing jemalloc from a static initializer, which
then registers the zone, we instead register the zone and let jemalloc
initialize itself when used, which increases the chances of the
thread_arena initializer running first.
--HG--
extra : rebase_source : 4d9a5340d097ac8528dc4aaaf0c05bbef40b59bb
isalloc_validate is the function behind malloc_usable_size. If for some
reason malloc_usable_size is called before mozjemalloc is initialized,
this can lead to an unexpected crash.
The chance of this actually happening is rather slim on Linux
and Windows (although still possible), and impossible on Mac, due to the
fact the earlier something can end up calling it is after the
mozjemalloc zone is registered, which happens after initialization.
... except with bug 1399921, which reorders that initialization, and
puts the zone registration first. There's then a slim chance for the
zone allocator to call into zone_size, which calls malloc_usable_size,
to determine whether a pointer allocated by some other zone belongs to
mozjemalloc's.
And it turns out that does happen, during the startup of the
plugin-container process on OSX 10.10 (but not more recent versions).
--HG--
extra : rebase_source : 331d093b03add7b2c2ce440593f5aeccaaf4dd1f
Now that this is a C++ file, and that the function names are not
mangled, we can just use the actual C++ names.
We do however need to replace MOZ_MEMORY_API, which implies extern "C",
with MFBT_API.
Also use the correct type for the size given to operator new. It
happened to work before because the generated code would just jump to
malloc without touching any register, but on aarch64, unsigned int was
the wrong type.
--HG--
extra : rebase_source : 8045f30e9c609dd7d922c77d85ac017638df6961
And since the build system doesn't handle transitions from foo.c to
foo.cpp properly without a clobber, we work around the issue by
switching to unified sources.
--HG--
rename : memory/build/mozmemory_wrap.c => memory/build/mozmemory_wrap.cpp
extra : rebase_source : 3f074b4ccab255bb0eb16841f79582060fafbc86
It happens to work because of mozglue.def, but we might as well have the
right annotations (which will also make things correct when building this
file to C++)
--HG--
extra : rebase_source : 61056dc21c9c29bab62ad5d648e94dd56dc53b14
This used to be necessary because those functions might be prefixed with
__wrap_, and linked against with -Wl,--wrap, but that's not been the case
since bug 1077366. Furthermore, mozmem_malloc_impl nowadays only does
something on Windows, and those operators are only exposed on Android.
--HG--
extra : rebase_source : ca34442bfbc5fc8be20ffcfacb9afa0f2f818b82
In bug 1361258, we unified the initialization sequence on mac, and
chose to make the zone registration happen after jemalloc
initialization.
The order between jemalloc init and zone registration shouldn't actually
matter, because jemalloc initializes the first time the allocator is
actually used.
On the other hand, in some build setups (e.g. with light optimization),
the initialization of the thread_arena thread local variable can happen
after the forced jemalloc initialization because of the order the
corresponding static initializers run. In some levels of optimization,
the thread_arena initializer resets the value the jemalloc
initialization has set, which subsequently makes choose_arena() return
a bogus value (or hit an assertion in ThreadLocal.h on debug builds).
So instead of initializing jemalloc from a static initializer, which
then registers the zone, we instead register the zone and let jemalloc
initialize itself when used, which increases the chances of the
thread_arena initializer running first.
--HG--
extra : rebase_source : 4d9a5340d097ac8528dc4aaaf0c05bbef40b59bb
There is a lot of churn involved in adding new API surface to
mozjemalloc, and mozmemory.h is one. Instead of declaring
everything manually in there, "generate" the declarations through
malloc_decls.h.
--HG--
extra : rebase_source : 1416fa972319c419112c4a8b16759d90692db5b2
mozalloc_abort is not marked as a noreturn function on ARM, so clang
complains when abort calls mozalloc_abort, which calls MOZ_CRASH, which
calls abort(). We know this is OK, so just disable the warning.
The bin-unused count in memory reports indicates how much memory is
used by runs of small and sub-page allocations that is not actually
allocated. This is generally thought as an indicator of fragmentation.
While this is generally true, with the use of thread local arenas by
stylo, combined with how stylo allocates memory, it ends up also being
an indicator of wasted memory.
For instance, over the lifetime of an AWSY iteration, there are only a
few allocations that ends up in the bucket for 2048 allocated bytes. In
the "worst" case, there's only one. But the run size for such
allocations is 132KiB. Which means just because we're allocating one
buffer of size between 1024 and 2048 bytes, we end up wasting 130+KiB.
Per thread.
Something similar happens with size classes of 512 and 1024, where the
run size is respectively 32KiB and 64KiB, and where there's at most a
handful of allocations of each class ever happening per thread.
Overall, an allocation log from a full AWSY iteration reveals that there
are only 448 of 860700 allocations happening on the stylo arenas that
involve sizes above (and excluding) 512 bytes, so 0.05%.
While there are improvements that can be done to mozjemalloc so that it
doesn't waste more than one page per sub-page size class, they are
changes that are too low-level to land at this time of the release
cycle. However, considering the numbers above and the fact that the
stylo arenas are only really meant to avoid lock contention during the
heavy parallel work involved, a short term, low risk, strategy is to
just delegate all sub-page (> 512, < 4096) and large (>= 4096) to the
main arena. Technically speaking, only sub-page allocations are causing
this waste, but it's more consistent to just delegate everything above
512 bytes.
This should save 132KiB + 64KiB = 196KiB per stylo thread.
--HG--
extra : rebase_source : c7233d60305365e76aa124045b1c9492068d9415
Until bug 1361258, there was only ever one mozjemalloc arena, and the
number of dirty pages we allow to be kept dirty, fixed to 1MB per arena,
was, in fact, 1MB for an entire process.
With stylo using thread local arenas, we now can have multiple arenas
per process, multiplying that number of dirty pages.
While those dirty pages may be reused later on, when other allocations
end up filling them later on, the fact that a relatively large number of
them is kept around for each stylo thread (in proportion to the amount of
memory ever allocated by stylo), combined with the fact that the memory
use from stylo depends on the workload generated by the pages being
visited, those dirty pages may very well not be used for a rather long
time. This is less of a problem with the main arena, used for most
everything else.
So, for each arena except the main one, we decrease the number of dirty
pages we allow to be kept around to 1/8 of the current value. We do this
by introducing a per-arena configuration of that maximum number.
--HG--
extra : rebase_source : 75eebb175b3746d5ca1c371606cface50ec70f2f
Those macros are one more thing that needs to be added when the
mozjemalloc API surface is increased, but after bug 1399350, nothing
actually needs them, so remove them.
--HG--
extra : rebase_source : 2bf62cc6c179540482722a72b0d0c134d2ac2a19
The files relevant to the memory allocator are currently spread between
memory/mozjemalloc and memory/build, and the distinction was
historically from sharing some Mozilla-specific things between
mozjemalloc and jemalloc3. That distinction is not useful anymore, so
we fold everything together.
As we will likely rename the allocator at some point in the future, it
is preferable to move away from the mozjemalloc directory rather than in
its direction.
--HG--
rename : memory/mozjemalloc/Makefile.in => memory/build/Makefile.in
rename : memory/mozjemalloc/mozjemalloc.cpp => memory/build/mozjemalloc.cpp
rename : memory/mozjemalloc/mozjemalloc.h => memory/build/mozjemalloc.h
rename : memory/mozjemalloc/mozjemalloc_types.h => memory/build/mozjemalloc_types.h
rename : memory/mozjemalloc/rb.h => memory/build/rb.h
Mozjemalloc uses its own doubly linked list, which, being inherited from
C code, doesn't do much type checking, and, in practice, is rather
similar to DoublyLinkedList, so use the latter instead.
--HG--
extra : rebase_source : 9eb7334b6dde05f9af0eaea4184e532c69d0264e
clang warns about this code in mozmemory_wrap.c in the reimplementation
of vasprintf, complaining that `fmt` cannot be null:
if (str == NULL || fmt == NULL) {
^~~ ~~~~
clang is apparently exploiting knowledge about the requirements of
vasprintf here, but defensive programming on the part of our
reimplementation seems like the wiser course. Let's turn off the
warning.
Previously being a C codebase, mozjemalloc was using typedefs, avoiding
long "struct foo" uses. C++ doesn't need typedefs to achieve that, so we
can remove all that. We however keep a few typedefs in headers that are
still included from other C source files.
--HG--
extra : rebase_source : d0d807bcb76078c0ce36e4554b10803bfb36ddbb
Some system libraries call malloc_zone_free directly instead of free,
and sometimes they do that with the wrong zone. When that happens, we
circle back, trying to find the right zone, and call malloc_zone_free
with the right one, but when we can't find one, we crash, which matches
what the system free() would do. Except in one case where the pointer
we're being passed is NULL, in which case we can't trace it back to any
zone, but shouldn't crash (system free() explicitly doesn't crash in
that case).
--HG--
extra : rebase_source : 17efdcd80f1a53be7ab6b7293bfb6060a9aa4a48
Practically speaking, with code inlining, this doesn't change anything,
but will allow, in a subsequent change, to easily divert the exported
allocation functions (malloc, etc.) in order to inject replace-malloc
in the pipeline.
The added macro magic is copied from replace-malloc.c.
At the same time, reformat the functions we're touching.
--HG--
extra : rebase_source : f615909101f832f3cc471e23a3cc44a60886843f
Because malloc_decls.h is meant to be included multiple times, it
shouldn't actually declare things itself.
--HG--
extra : rebase_source : 9d6f9b2c61407265377845963a19ace2614160f4
valloc is supposed to page-align data, but mozjemalloc's definition of
pagesize is statically compiled in and might not match the actual page
size at runtime (because of MALLOC_STATIC_SIZES). We change valloc in
mozjemalloc to use the runtime page size.
--HG--
extra : rebase_source : c5b1b56e783b311ac1620a87d910e019e3f18b49
While MOZ_MEMORY_BSD is set for kFreeBSD, FreeBSD and NetBSD, the only
use of MOZ_MEMORY_BSD is along a check for __GLIBC__ which can only be
true on GNU/kFreeBSD, which doesn't have a XP_ macro, but which we
usually match with __FreeBSD_kernel__.
--HG--
extra : rebase_source : 2f143f8ed641f4e338024a58e20c62b2b30e653a
For some reason, we don't have a XP_ macro for Android, and we generally use
ANDROID or MOZ_WIDGET_ANDROID. The former is more appropriate.
--HG--
extra : rebase_source : 50576584f194aaf6b090fd530598ce9954e170b3
Back when it was added (for Windows CE, in bug 488608), mozjemalloc was
C and all the supported compilers didn't support C99 bools. Now
mozjemalloc is C++, and all the supported compilers support C99 bools
for the cases where the type is used from C.
--HG--
extra : rebase_source : b9c710a0c48dc36cb473af59e3119131d13523ce
Back when mozjemalloc was considered third-party, and before bug
1365194, mozjemalloc was calling abort() and that was redirectory to
MOZ_CRASH through a moz_abort() function. Bug 1365194 changed that so
that moz_abort is called directly instead of abort, but the indirection
is actually not necessary anymore.
So we just kill moz_abort, which is unused anywhere else, and use
MOZ_CRASH directly.
Note this will (obviously) change crash signatures involving moz_abort.
--HG--
extra : rebase_source : 67698ffd8c5e52e62b9a0b7f28efb0352c8fe8ce