This was done in bug 736564 for the xulrunner SDK, which later became
the firefox SDK, which is now gone. So we don't actually need to keep it
separate anymore (except for logalloc/replay, which still needs to link
it directly, so we keep the library definition intact so it can be
referenced ; we just don't DIST_INSTALL it anymore, and always make it
linked into mozglue)
--HG--
extra : rebase_source : e4d0627ec907fe0139df5c0b2b9f7d04b43c7c78
In bug 1361258, we unified the initialization sequence on mac, and
chose to make the zone registration happen after jemalloc
initialization.
The order between jemalloc init and zone registration shouldn't actually
matter, because jemalloc initializes the first time the allocator is
actually used.
On the other hand, in some build setups (e.g. with light optimization),
the initialization of the thread_arena thread local variable can happen
after the forced jemalloc initialization because of the order the
corresponding static initializers run. In some levels of optimization,
the thread_arena initializer resets the value the jemalloc
initialization has set, which subsequently makes choose_arena() return
a bogus value (or hit an assertion in ThreadLocal.h on debug builds).
So instead of initializing jemalloc from a static initializer, which
then registers the zone, we instead register the zone and let jemalloc
initialize itself when used, which increases the chances of the
thread_arena initializer running first.
--HG--
extra : rebase_source : 4d9a5340d097ac8528dc4aaaf0c05bbef40b59bb
isalloc_validate is the function behind malloc_usable_size. If for some
reason malloc_usable_size is called before mozjemalloc is initialized,
this can lead to an unexpected crash.
The chance of this actually happening is rather slim on Linux
and Windows (although still possible), and impossible on Mac, due to the
fact the earlier something can end up calling it is after the
mozjemalloc zone is registered, which happens after initialization.
... except with bug 1399921, which reorders that initialization, and
puts the zone registration first. There's then a slim chance for the
zone allocator to call into zone_size, which calls malloc_usable_size,
to determine whether a pointer allocated by some other zone belongs to
mozjemalloc's.
And it turns out that does happen, during the startup of the
plugin-container process on OSX 10.10 (but not more recent versions).
--HG--
extra : rebase_source : 331d093b03add7b2c2ce440593f5aeccaaf4dd1f
Now that this is a C++ file, and that the function names are not
mangled, we can just use the actual C++ names.
We do however need to replace MOZ_MEMORY_API, which implies extern "C",
with MFBT_API.
Also use the correct type for the size given to operator new. It
happened to work before because the generated code would just jump to
malloc without touching any register, but on aarch64, unsigned int was
the wrong type.
--HG--
extra : rebase_source : 8045f30e9c609dd7d922c77d85ac017638df6961
And since the build system doesn't handle transitions from foo.c to
foo.cpp properly without a clobber, we work around the issue by
switching to unified sources.
--HG--
rename : memory/build/mozmemory_wrap.c => memory/build/mozmemory_wrap.cpp
extra : rebase_source : 3f074b4ccab255bb0eb16841f79582060fafbc86
It happens to work because of mozglue.def, but we might as well have the
right annotations (which will also make things correct when building this
file to C++)
--HG--
extra : rebase_source : 61056dc21c9c29bab62ad5d648e94dd56dc53b14
This used to be necessary because those functions might be prefixed with
__wrap_, and linked against with -Wl,--wrap, but that's not been the case
since bug 1077366. Furthermore, mozmem_malloc_impl nowadays only does
something on Windows, and those operators are only exposed on Android.
--HG--
extra : rebase_source : ca34442bfbc5fc8be20ffcfacb9afa0f2f818b82
In bug 1361258, we unified the initialization sequence on mac, and
chose to make the zone registration happen after jemalloc
initialization.
The order between jemalloc init and zone registration shouldn't actually
matter, because jemalloc initializes the first time the allocator is
actually used.
On the other hand, in some build setups (e.g. with light optimization),
the initialization of the thread_arena thread local variable can happen
after the forced jemalloc initialization because of the order the
corresponding static initializers run. In some levels of optimization,
the thread_arena initializer resets the value the jemalloc
initialization has set, which subsequently makes choose_arena() return
a bogus value (or hit an assertion in ThreadLocal.h on debug builds).
So instead of initializing jemalloc from a static initializer, which
then registers the zone, we instead register the zone and let jemalloc
initialize itself when used, which increases the chances of the
thread_arena initializer running first.
--HG--
extra : rebase_source : 4d9a5340d097ac8528dc4aaaf0c05bbef40b59bb
There is a lot of churn involved in adding new API surface to
mozjemalloc, and mozmemory.h is one. Instead of declaring
everything manually in there, "generate" the declarations through
malloc_decls.h.
--HG--
extra : rebase_source : 1416fa972319c419112c4a8b16759d90692db5b2
The bin-unused count in memory reports indicates how much memory is
used by runs of small and sub-page allocations that is not actually
allocated. This is generally thought as an indicator of fragmentation.
While this is generally true, with the use of thread local arenas by
stylo, combined with how stylo allocates memory, it ends up also being
an indicator of wasted memory.
For instance, over the lifetime of an AWSY iteration, there are only a
few allocations that ends up in the bucket for 2048 allocated bytes. In
the "worst" case, there's only one. But the run size for such
allocations is 132KiB. Which means just because we're allocating one
buffer of size between 1024 and 2048 bytes, we end up wasting 130+KiB.
Per thread.
Something similar happens with size classes of 512 and 1024, where the
run size is respectively 32KiB and 64KiB, and where there's at most a
handful of allocations of each class ever happening per thread.
Overall, an allocation log from a full AWSY iteration reveals that there
are only 448 of 860700 allocations happening on the stylo arenas that
involve sizes above (and excluding) 512 bytes, so 0.05%.
While there are improvements that can be done to mozjemalloc so that it
doesn't waste more than one page per sub-page size class, they are
changes that are too low-level to land at this time of the release
cycle. However, considering the numbers above and the fact that the
stylo arenas are only really meant to avoid lock contention during the
heavy parallel work involved, a short term, low risk, strategy is to
just delegate all sub-page (> 512, < 4096) and large (>= 4096) to the
main arena. Technically speaking, only sub-page allocations are causing
this waste, but it's more consistent to just delegate everything above
512 bytes.
This should save 132KiB + 64KiB = 196KiB per stylo thread.
--HG--
extra : rebase_source : c7233d60305365e76aa124045b1c9492068d9415
Until bug 1361258, there was only ever one mozjemalloc arena, and the
number of dirty pages we allow to be kept dirty, fixed to 1MB per arena,
was, in fact, 1MB for an entire process.
With stylo using thread local arenas, we now can have multiple arenas
per process, multiplying that number of dirty pages.
While those dirty pages may be reused later on, when other allocations
end up filling them later on, the fact that a relatively large number of
them is kept around for each stylo thread (in proportion to the amount of
memory ever allocated by stylo), combined with the fact that the memory
use from stylo depends on the workload generated by the pages being
visited, those dirty pages may very well not be used for a rather long
time. This is less of a problem with the main arena, used for most
everything else.
So, for each arena except the main one, we decrease the number of dirty
pages we allow to be kept around to 1/8 of the current value. We do this
by introducing a per-arena configuration of that maximum number.
--HG--
extra : rebase_source : 75eebb175b3746d5ca1c371606cface50ec70f2f
Those macros are one more thing that needs to be added when the
mozjemalloc API surface is increased, but after bug 1399350, nothing
actually needs them, so remove them.
--HG--
extra : rebase_source : 2bf62cc6c179540482722a72b0d0c134d2ac2a19
The files relevant to the memory allocator are currently spread between
memory/mozjemalloc and memory/build, and the distinction was
historically from sharing some Mozilla-specific things between
mozjemalloc and jemalloc3. That distinction is not useful anymore, so
we fold everything together.
As we will likely rename the allocator at some point in the future, it
is preferable to move away from the mozjemalloc directory rather than in
its direction.
--HG--
rename : memory/mozjemalloc/Makefile.in => memory/build/Makefile.in
rename : memory/mozjemalloc/mozjemalloc.cpp => memory/build/mozjemalloc.cpp
rename : memory/mozjemalloc/mozjemalloc.h => memory/build/mozjemalloc.h
rename : memory/mozjemalloc/mozjemalloc_types.h => memory/build/mozjemalloc_types.h
rename : memory/mozjemalloc/rb.h => memory/build/rb.h
clang warns about this code in mozmemory_wrap.c in the reimplementation
of vasprintf, complaining that `fmt` cannot be null:
if (str == NULL || fmt == NULL) {
^~~ ~~~~
clang is apparently exploiting knowledge about the requirements of
vasprintf here, but defensive programming on the part of our
reimplementation seems like the wiser course. Let's turn off the
warning.
Some system libraries call malloc_zone_free directly instead of free,
and sometimes they do that with the wrong zone. When that happens, we
circle back, trying to find the right zone, and call malloc_zone_free
with the right one, but when we can't find one, we crash, which matches
what the system free() would do. Except in one case where the pointer
we're being passed is NULL, in which case we can't trace it back to any
zone, but shouldn't crash (system free() explicitly doesn't crash in
that case).
--HG--
extra : rebase_source : 17efdcd80f1a53be7ab6b7293bfb6060a9aa4a48
Because malloc_decls.h is meant to be included multiple times, it
shouldn't actually declare things itself.
--HG--
extra : rebase_source : 9d6f9b2c61407265377845963a19ace2614160f4
jemalloc_ptr_info() gives info about any pointer, such as whether it's within a
live or free allocation, and if so, info about that allocation. It's useful for
debugging.
moz_malloc_enclosing_size_of() uses jemalloc_ptr_info() to measure the size of
an allocation from an interior pointer. It's useful for memory reporting,
especially for Rust code.
--HG--
extra : rebase_source : caa19cccf8c2d1f79cf004fe6a408775de5a7b22
Back when it was added (for Windows CE, in bug 488608), mozjemalloc was
C and all the supported compilers didn't support C99 bools. Now
mozjemalloc is C++, and all the supported compilers support C99 bools
for the cases where the type is used from C.
--HG--
extra : rebase_source : b9c710a0c48dc36cb473af59e3119131d13523ce
Back when mozjemalloc was considered third-party, and before bug
1365194, mozjemalloc was calling abort() and that was redirectory to
MOZ_CRASH through a moz_abort() function. Bug 1365194 changed that so
that moz_abort is called directly instead of abort, but the indirection
is actually not necessary anymore.
So we just kill moz_abort, which is unused anywhere else, and use
MOZ_CRASH directly.
Note this will (obviously) change crash signatures involving moz_abort.
--HG--
extra : rebase_source : 67698ffd8c5e52e62b9a0b7f28efb0352c8fe8ce
Bug 1186064 removed most of it when we started requiring VS 2015u2, but
the "frex" function exported through mozglue.def.in was only used
through the MSVCRT being patched by fixcrt.py, which is not done anymore.
So the "frex" export is not used anymore, and so the "dumb_free_thunk"
function is not used anymore as well.
--HG--
extra : rebase_source : 879c469c317c8b6749410a4a476d6c951c9a1d0f
Going through the system zone allocator for every call to realloc/free
on OSX is costly, because the zone allocator needs to first verify that
the allocations do belong to the allocator it invokes (which ends up
calling jemalloc's malloc_usable_size), which is unnecessary when we
expect the allocations to belong to jemalloc.
So, we export the malloc/realloc/free/etc. symbols from
libmozglue.dylib, such that libraries and programs linked against it
call directly into jemalloc instead of going through the system zone
allocator, effectively shortcutting the allocator verification.
The risk is that some things in Gecko try to realloc/free pointers it
got from system libraries, if those were allocated with a system zone
that is not jemalloc.
--HG--
extra : rebase_source : ee0b29e1275176f52e64f4648dfa7ce25d61292e
Going through the system zone allocator for every call to realloc/free
on OSX is costly, because the zone allocator needs to first verify that
the allocations do belong to the allocator it invokes (which ends up
calling jemalloc's malloc_usable_size), which is unnecessary when we
expect the allocations to belong to jemalloc.
So, we export the malloc/realloc/free/etc. symbols from
libmozglue.dylib, such that libraries and programs linked against it
calls directly into jemalloc instead of going through the system zone
allocator, effectively shortcutting the allocator verification.
The risk is that some things in Gecko try to realloc/free pointers it
got from system libraries, if those were allocated with a system zone
that is not jemalloc.
--HG--
extra : rebase_source : 45b9b98499760a7f946878d41d2fdaadb6dff4d6
Replace-malloc libraries, such as DMD, don't really need to care about
the details of implementing all the variants of aligned memory
allocation functions. Currently, by defining MOZ_REPLACE_ONLY_MEMALIGN
before including replace_malloc.h, they get predefined functions.
Instead of making that an opt-in at build time, we make the
replace-malloc initialization just fill the replace-malloc
malloc_table_t with implementations that rely on the replace_memalign
the library provides.
--HG--
extra : rebase_source : 0842a67d9bc27a9a86c33d14d98b9c25f39982fb
Until now, the malloc implementation functions would call the
replace-malloc functions if they exist, and fallback to the real
allocator in no such function exists. Instead of doing this, we now
fill the empty slots in the malloc_table_t with the real allocator
functions.
--HG--
extra : rebase_source : b54634f23188906939e4dc01fc5a3007de0f3f2c
We make replace_malloc_init_funcs called on all platforms and fill out a
malloc_table_t for the replace-malloc functions with what comes from
dlsym/GetProcAddress on Android/Windows, and from the dynamically linked
weak symbols replace_* on other platforms.
replace_malloc.h contains definitions of *_impl_t types for each of the
functions in the malloc_table_t, which is redundant with the
replace_*_impl_t types we were creating, so we remove those typedefs,
except for the two functions (init and get_bridge) that don't have such
a typedef. Those functions don't appear in malloc_table_t.
--HG--
extra : rebase_source : 3705a99ee07f63dbaa66973eef19ddab224e0911
We want, in a subsequent patch, to have replace_malloc_init_funcs be
called on all platforms (including those relying on the replace-malloc
library being loaded already) and perform more initialization.
To prepare for that, we move the non-platform-specific pieces out.
--HG--
extra : rebase_source : 239ed363ee168bf4f8a96e0a1ca52981cb941b71
All the _impl functions in replace-malloc.c are largely identical. This
replaces all of them with macro expansions.
--HG--
extra : rebase_source : 67a1809b0b0fc4645ea5041154fa3a6dcb6cce6b
This makes no significant difference in practice in the macro
expansions, but will help down the line.
--HG--
extra : rebase_source : 6d61c1f28c558321478d7e5f26390d27ae8ae3ac
MOZ_REPLACE_JEMALLOC was only defined when building jemalloc4 as a
replace-malloc library.
--HG--
extra : rebase_source : fa5c402da07fa96448c170b6db99629469691efe
This avoids many additions of `extern "C"` in C++ code and will avoid
having to do the same to mozjemalloc once built as C++.
--HG--
extra : rebase_source : af55696262f40a9dd16a19c29edcb9bb307d4957