This effectively means malloc_hook_table_t is now C++ only, which is not
a big problem.
This also makes some functions use a `return` statement with functions
that don't return a value (such as free). While that is not allowed in
ISO C, it is allowed in C++, so the simplification is welcome (although,
retrospectively, it turns out C compilers don't complain about it
without -pedantic).
--HG--
extra : rebase_source : defd88ca3f6d478e61a4b970393dba60fb6ca81d
This was done in bug 736564 for the xulrunner SDK, which later became
the firefox SDK, which is now gone. So we don't actually need to keep it
separate anymore (except for logalloc/replay, which still needs to link
it directly, so we keep the library definition intact so it can be
referenced; we just don't DIST_INSTALL it anymore, and always link it
into mozglue).
--HG--
extra : rebase_source : e4d0627ec907fe0139df5c0b2b9f7d04b43c7c78
It is one of the moving parts when adding new memory allocation APIs.
It was added in bug 1168719 and the only thing that actually used it
was the sampling-based memory profiler, which was removed in bug
1385953. We however keep the replace-malloc bridge entry point so that
something else, in the future, may still provide the feature.
--HG--
extra : rebase_source : dd4a226171429e2a4ab5666b0873e7b945f161e6
In bug 1361258, we unified the initialization sequence on mac, and
chose to make the zone registration happen after jemalloc
initialization.
The order between jemalloc init and zone registration shouldn't actually
matter, because jemalloc initializes the first time the allocator is
actually used.
On the other hand, in some build setups (e.g. with light optimization),
the initialization of the thread_arena thread local variable can happen
after the forced jemalloc initialization because of the order the
corresponding static initializers run. In some levels of optimization,
the thread_arena initializer resets the value the jemalloc
initialization has set, which subsequently makes choose_arena() return
a bogus value (or hit an assertion in ThreadLocal.h on debug builds).
So instead of initializing jemalloc from a static initializer, which
then registers the zone, we instead register the zone and let jemalloc
initialize itself when used, which increases the chances of the
thread_arena initializer running first.
--HG--
extra : rebase_source : 4d9a5340d097ac8528dc4aaaf0c05bbef40b59bb
isalloc_validate is the function behind malloc_usable_size. If for some
reason malloc_usable_size is called before mozjemalloc is initialized,
this can lead to an unexpected crash.
The chance of this actually happening is rather slim on Linux
and Windows (although still possible), and impossible on Mac, because the
earliest something can end up calling it is after the
mozjemalloc zone is registered, which happens after initialization.
... except with bug 1399921, which reorders that initialization, and
puts the zone registration first. There's then a slim chance for the
zone allocator to call into zone_size, which calls malloc_usable_size,
to determine whether a pointer allocated by some other zone belongs to
mozjemalloc.
And it turns out that does happen, during the startup of the
plugin-container process on OSX 10.10 (but not more recent versions).
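A minimal sketch of the guard this calls for, assuming mozjemalloc's
malloc_initialized flag (the surrounding details are simplified):

  #include <cstddef>

  static bool malloc_initialized;  // set at the end of mozjemalloc init

  // Backs malloc_usable_size(); must not crash if called too early.
  static size_t
  isalloc_validate(const void* aPtr)
  {
    if (!malloc_initialized || !aPtr) {
      // Nothing can have been allocated by us yet (or the pointer is
      // null), so report "not ours" instead of touching uninitialized
      // allocator metadata.
      return 0;
    }
    // Normal chunk/run lookup of aPtr would go here; elided in this sketch.
    return 0;
  }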
--HG--
extra : rebase_source : 331d093b03add7b2c2ce440593f5aeccaaf4dd1f
Now that this is a C++ file, and that the function names are not
mangled, we can just use the actual C++ names.
We do however need to replace MOZ_MEMORY_API, which implies extern "C",
with MFBT_API.
Also use the correct type for the size given to operator new. It
happened to work before because the generated code would just jump to
malloc without touching any register, but on aarch64, unsigned int was
the wrong type.
--HG--
extra : rebase_source : 8045f30e9c609dd7d922c77d85ac017638df6961
And since the build system doesn't handle transitions from foo.c to
foo.cpp properly without a clobber, we work around the issue by
switching to unified sources.
--HG--
rename : memory/build/mozmemory_wrap.c => memory/build/mozmemory_wrap.cpp
extra : rebase_source : 3f074b4ccab255bb0eb16841f79582060fafbc86
It happens to work because of mozglue.def, but we might as well have the
right annotations (which will also make things correct when building this
file as C++).
--HG--
extra : rebase_source : 61056dc21c9c29bab62ad5d648e94dd56dc53b14
This used to be necessary because those functions might be prefixed with
__wrap_, and linked against with -Wl,--wrap, but that's not been the case
since bug 1077366. Furthermore, mozmem_malloc_impl nowadays only does
something on Windows, and those operators are only exposed on Android.
--HG--
extra : rebase_source : ca34442bfbc5fc8be20ffcfacb9afa0f2f818b82
There is a lot of churn involved in adding new API surface to
mozjemalloc, and mozmemory.h is one. Instead of declaring
everything manually in there, "generate" the declarations through
malloc_decls.h.
--HG--
extra : rebase_source : 1416fa972319c419112c4a8b16759d90692db5b2
mozalloc_abort is not marked as a noreturn function on ARM, so clang
complains when abort calls mozalloc_abort, which calls MOZ_CRASH, which
calls abort(). We know this is OK, so just disable the warning.
The bin-unused count in memory reports indicates how much memory is
used by runs of small and sub-page allocations that is not actually
allocated. This is generally thought of as an indicator of fragmentation.
While this is generally true, with the use of thread local arenas by
stylo, combined with how stylo allocates memory, it ends up also being
an indicator of wasted memory.
For instance, over the lifetime of an AWSY iteration, there are only a
few allocations that end up in the bucket for 2048 allocated bytes. In
the "worst" case, there's only one. But the run size for such
allocations is 132KiB. Which means just because we're allocating one
buffer of size between 1024 and 2048 bytes, we end up wasting 130+KiB.
Per thread.
Something similar happens with size classes of 512 and 1024, where the
run size is respectively 32KiB and 64KiB, and where there's at most a
handful of allocations of each class ever happening per thread.
Overall, an allocation log from a full AWSY iteration reveals that there
are only 448 of 860700 allocations happening on the stylo arenas that
involve sizes above (and excluding) 512 bytes, so 0.05%.
While there are improvements that can be done to mozjemalloc so that it
doesn't waste more than one page per sub-page size class, they are
changes that are too low-level to land at this time of the release
cycle. However, considering the numbers above and the fact that the
stylo arenas are only really meant to avoid lock contention during the
heavy parallel work involved, a short term, low risk strategy is to
just delegate all sub-page (> 512, < 4096) and large (>= 4096)
allocations to the main arena. Technically speaking, only sub-page
allocations are causing this waste, but it's more consistent to just
delegate everything above 512 bytes.
This should save 132KiB + 64KiB = 196KiB per stylo thread.
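A minimal sketch of the delegation (the arena variables, threshold
constant, and function shape are assumptions, not the actual mozjemalloc
identifiers):

  #include <cstddef>

  struct arena_t;
  extern arena_t* gMainArena;                 // the default arena
  extern thread_local arena_t* gThreadArena;  // set for opted-in threads
  static const size_t kMaxSmallClass = 512;   // delegate anything above this

  static inline arena_t*
  choose_arena(size_t aSize)
  {
    // Per-thread (stylo) arenas only serve small allocations; sub-page and
    // large requests all go to the main arena so each thread doesn't keep
    // mostly-empty runs of its own.
    if (aSize <= kMaxSmallClass && gThreadArena) {
      return gThreadArena;
    }
    return gMainArena;
  }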
--HG--
extra : rebase_source : c7233d60305365e76aa124045b1c9492068d9415
Until bug 1361258, there was only ever one mozjemalloc arena, and the
number of dirty pages we allow to be kept dirty, fixed to 1MB per arena,
was, in fact, 1MB for an entire process.
With stylo using thread local arenas, we now can have multiple arenas
per process, multiplying that number of dirty pages.
While those dirty pages may be reused when other allocations end up
filling them, a relatively large number of them is kept around for each
stylo thread (in proportion to the amount of memory ever allocated by
stylo). Since stylo's memory use depends on the workload generated by
the pages being visited, those dirty pages may very well not be reused
for a rather long time. This is less of a problem with the main arena,
which is used for most everything else.
So, for each arena except the main one, we decrease the number of dirty
pages we allow to be kept around to 1/8 of the current value. We do this
by introducing a per-arena configuration of that maximum number.
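A sketch of that per-arena configuration (field and constant names are
assumptions):

  #include <cstddef>

  static const size_t kDefaultMaxDirty = 256;  // pages; ~1MB with 4KiB pages

  struct arena_t
  {
    size_t mMaxDirty;  // per-arena cap on dirty pages kept around

    void Init(bool aIsMainArena)
    {
      // The main arena keeps the historical 1MB worth of dirty pages;
      // other (thread-local) arenas keep 1/8 of that.
      mMaxDirty = aIsMainArena ? kDefaultMaxDirty : kDefaultMaxDirty / 8;
    }
  };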
--HG--
extra : rebase_source : 75eebb175b3746d5ca1c371606cface50ec70f2f
Those macros are one more thing that needs to be added when the
mozjemalloc API surface is increased, but after bug 1399350, nothing
actually needs them, so remove them.
--HG--
extra : rebase_source : 2bf62cc6c179540482722a72b0d0c134d2ac2a19
The files relevant to the memory allocator are currently spread between
memory/mozjemalloc and memory/build, and the distinction historically
came from sharing some Mozilla-specific things between
mozjemalloc and jemalloc3. That distinction is not useful anymore, so
we fold everything together.
As we will likely rename the allocator at some point in the future, it
is preferable to move away from the mozjemalloc directory rather than
toward it.
--HG--
rename : memory/mozjemalloc/Makefile.in => memory/build/Makefile.in
rename : memory/mozjemalloc/mozjemalloc.cpp => memory/build/mozjemalloc.cpp
rename : memory/mozjemalloc/mozjemalloc.h => memory/build/mozjemalloc.h
rename : memory/mozjemalloc/mozjemalloc_types.h => memory/build/mozjemalloc_types.h
rename : memory/mozjemalloc/rb.h => memory/build/rb.h
Mozjemalloc uses its own doubly linked list, which, being inherited from
C code, doesn't do much type checking, and, in practice, is rather
similar to DoublyLinkedList, so use the latter instead.
--HG--
extra : rebase_source : 9eb7334b6dde05f9af0eaea4184e532c69d0264e
clang warns about this code in mozmemory_wrap.c in the reimplementation
of vasprintf, complaining that `fmt` cannot be null:
    if (str == NULL || fmt == NULL) {
                       ^~~    ~~~~
clang is apparently exploiting knowledge about the requirements of
vasprintf here, but defensive programming on the part of our
reimplementation seems like the wiser course. Let's turn off the
warning.
Previously being a C codebase, mozjemalloc was using typedefs, avoiding
long "struct foo" uses. C++ doesn't need typedefs to achieve that, so we
can remove all that. We however keep a few typedefs in headers that are
still included from other C source files.
--HG--
extra : rebase_source : d0d807bcb76078c0ce36e4554b10803bfb36ddbb
Some system libraries call malloc_zone_free directly instead of free,
and sometimes they do that with the wrong zone. When that happens, we
circle back, trying to find the right zone, and call malloc_zone_free
with the right one, but when we can't find one, we crash, which matches
what the system free() would do. Except in one case where the pointer
we're being passed is NULL, in which case we can't trace it back to any
zone, but shouldn't crash (system free() explicitly doesn't crash in
that case).
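A sketch of the intended behavior in the zone's free hook (the
malloc_zone_* calls are the system API; IsOurPointer and the overall shape
are assumptions):

  #include <malloc/malloc.h>
  #include <cstdlib>

  static bool IsOurPointer(const void* aPtr);  // jemalloc ownership check

  static void
  zone_free(malloc_zone_t* aZone, void* aPtr)
  {
    if (IsOurPointer(aPtr)) {
      free(aPtr);
      return;
    }
    // Some system libraries call malloc_zone_free() with the wrong zone;
    // circle back to the zone that actually owns the pointer.
    malloc_zone_t* owner = malloc_zone_from_ptr(aPtr);
    if (owner && owner != aZone) {
      malloc_zone_free(owner, aPtr);
      return;
    }
    // No owner found: crash like the system free() would, except for a
    // null pointer, which can't be traced to any zone but must not crash.
    if (aPtr) {
      abort();
    }
  }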
--HG--
extra : rebase_source : 17efdcd80f1a53be7ab6b7293bfb6060a9aa4a48
Practically speaking, with code inlining, this doesn't change anything,
but it will allow us, in a subsequent change, to easily divert the exported
allocation functions (malloc, etc.) in order to inject replace-malloc
into the pipeline.
The added macro magic is copied from replace-malloc.c.
At the same time, reformat the functions we're touching.
--HG--
extra : rebase_source : f615909101f832f3cc471e23a3cc44a60886843f
Because malloc_decls.h is meant to be included multiple times, it
shouldn't actually declare things itself.
--HG--
extra : rebase_source : 9d6f9b2c61407265377845963a19ace2614160f4
valloc is supposed to page-align data, but mozjemalloc's definition of
pagesize is statically compiled in and might not match the actual page
size at runtime (because of MALLOC_STATIC_SIZES). We change valloc in
mozjemalloc to use the runtime page size.
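A sketch of the change, assuming a memalign-style internal entry point
(the helper name is an assumption):

  #include <unistd.h>
  #include <cstddef>

  void* memalign_impl(size_t aAlignment, size_t aSize);  // jemalloc memalign

  void*
  valloc_impl(size_t aSize)
  {
    // Use the page size the kernel reports at runtime, not the statically
    // compiled-in pagesize constant, which may not match.
    size_t page_size = static_cast<size_t>(sysconf(_SC_PAGESIZE));
    return memalign_impl(page_size, aSize);
  }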
--HG--
extra : rebase_source : c5b1b56e783b311ac1620a87d910e019e3f18b49
While MOZ_MEMORY_BSD is set for kFreeBSD, FreeBSD and NetBSD, the only
use of MOZ_MEMORY_BSD is along a check for __GLIBC__, which can only be
true on GNU/kFreeBSD, which doesn't have an XP_ macro, but which we
usually match with __FreeBSD_kernel__.
--HG--
extra : rebase_source : 2f143f8ed641f4e338024a58e20c62b2b30e653a
For some reason, we don't have an XP_ macro for Android, and we generally use
ANDROID or MOZ_WIDGET_ANDROID. The former is more appropriate.
--HG--
extra : rebase_source : 50576584f194aaf6b090fd530598ce9954e170b3
Back when it was added (for Windows CE, in bug 488608), mozjemalloc was
C and not all the supported compilers supported C99 bools. Now
mozjemalloc is C++, and all the supported compilers support C99 bools
for the cases where the type is used from C.
--HG--
extra : rebase_source : b9c710a0c48dc36cb473af59e3119131d13523ce
Back when mozjemalloc was considered third-party, and before bug
1365194, mozjemalloc was calling abort() and that was redirected to
MOZ_CRASH through a moz_abort() function. Bug 1365194 changed that so
that moz_abort is called directly instead of abort, but the indirection
is actually not necessary anymore.
So we just kill moz_abort, which is unused anywhere else, and use
MOZ_CRASH directly.
Note this will (obviously) change crash signatures involving moz_abort.
--HG--
extra : rebase_source : 67698ffd8c5e52e62b9a0b7f28efb0352c8fe8ce
This patch moves measurement of ComputedValues objects from Rust to C++.
Measurement now happens (a) via DOM elements and (b) for the remaining
elements via the frame tree. Likewise for the style structs hanging off
ComputedValues objects.
Here is an example of the output.
> ├──27,600,448 B (26.49%) -- active/window(https://en.wikipedia.org/wiki/Barack_Obama)
> │ ├──12,772,544 B (12.26%) -- layout
> │ │ ├───4,483,744 B (04.30%) -- frames
> │ │ │ ├──1,653,552 B (01.59%) ── nsInlineFrame
> │ │ │ ├──1,415,760 B (01.36%) ── nsTextFrame
> │ │ │ ├────431,376 B (00.41%) ── nsBlockFrame
> │ │ │ ├────340,560 B (00.33%) ── nsHTMLScrollFrame
> │ │ │ ├────302,544 B (00.29%) ── nsContinuingTextFrame
> │ │ │ ├────156,408 B (00.15%) ── nsBulletFrame
> │ │ │ ├─────73,024 B (00.07%) ── nsPlaceholderFrame
> │ │ │ ├─────27,656 B (00.03%) ── sundries
> │ │ │ ├─────23,520 B (00.02%) ── nsTableCellFrame
> │ │ │ ├─────16,704 B (00.02%) ── nsImageFrame
> │ │ │ ├─────15,488 B (00.01%) ── nsTableRowFrame
> │ │ │ ├─────13,776 B (00.01%) ── nsTableColFrame
> │ │ │ └─────13,376 B (00.01%) ── nsTableFrame
> │ │ ├───3,412,192 B (03.28%) -- servo-style-structs
> │ │ │ ├──1,288,224 B (01.24%) ── Display
> │ │ │ ├────742,400 B (00.71%) ── Position
> │ │ │ ├────308,736 B (00.30%) ── Font
> │ │ │ ├────226,512 B (00.22%) ── Background
> │ │ │ ├────218,304 B (00.21%) ── TextReset
> │ │ │ ├────214,896 B (00.21%) ── Text
> │ │ │ ├────130,560 B (00.13%) ── Border
> │ │ │ ├─────81,408 B (00.08%) ── UIReset
> │ │ │ ├─────61,440 B (00.06%) ── Padding
> │ │ │ ├─────38,176 B (00.04%) ── UserInterface
> │ │ │ ├─────29,232 B (00.03%) ── Margin
> │ │ │ ├─────21,824 B (00.02%) ── sundries
> │ │ │ ├─────20,080 B (00.02%) ── Color
> │ │ │ ├─────20,080 B (00.02%) ── Column
> │ │ │ └─────10,320 B (00.01%) ── Effects
> │ │ ├───2,227,680 B (02.14%) -- computed-values
> │ │ │ ├──1,182,928 B (01.14%) ── non-dom
> │ │ │ └──1,044,752 B (01.00%) ── dom
> │ │ ├───1,500,016 B (01.44%) ── text-runs
> │ │ ├─────492,640 B (00.47%) ── line-boxes
> │ │ ├─────326,688 B (00.31%) ── frame-properties
> │ │ ├─────301,760 B (00.29%) ── pres-shell
> │ │ ├──────27,648 B (00.03%) ── pres-contexts
> │ │ └─────────176 B (00.00%) ── style-sets
The 'servo-style-structs' and 'computed-values' sub-trees are new. (Prior to
this patch, ComputedValues under DOM elements were tallied under the
'dom/element-nodes' sub-tree, and ComputedValues not under DOM elements were
ignored.) 'servo-style-structs/sundries' aggregates all the style structs that
are smaller than 8 KiB.
Other notable things done by the patch are as follows.
- It significantly changes the signatures of the methods measuring nsINode and
its subclasses, in order to handle the tallying of style structs separately
from element-nodes. Likewise for nsIFrame.
- It renames the 'layout/style-structs' sub-tree as
'layout/gecko-style-structs', to clearly distinguish it from the new
'layout/servo-style-structs' sub-tree.
- It adds some FFI functions to access various Rust-side data structures from
C++ code.
- There is a nasty hack used twice to measure Arcs, by stepping backwards from
an interior pointer to a base pointer. It works, but I want to replace it
with something better eventually. The "XXX WARNING" comments have details.
- It makes DMD print a line to the console if it sees a pointer it doesn't
recognise. This is useful for detecting when we are measuring an interior
pointer instead of a base pointer, which is bad but easy to do when Arcs are
involved.
- It removes the Rust code for measuring CVs, because it's now all done on the
C++ side.
MozReview-Commit-ID: BKebACLKtCi
--HG--
extra : rebase_source : 4d9a8c6b198a0ff025b811759a6bfa9f33a260ba
When recycling chunks, mozjemalloc tries to avoid keeping more than
128MB worth of chunks around, but it doesn't actually look at the
size of the chunks that are recycled, so if a chunk larger than 128MB is
recycled, it is kept as a whole, going well over the limit.
The chunks are still properly reused, and further recycling doesn't
occur, but that can limit other mmap users from getting enough address
space.
With this change, mozjemalloc now doesn't keep more than 128MB, by
splitting the chunks it recycles if they are too large.
Note this was not a problem on Windows, where chunks larger than 1MB are
never recycled (per CAN_RECYCLE).
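A sketch of the capping logic (the 128MB limit comes from the description
above; the helper and variable names are assumptions):

  #include <cstddef>
  #include <cstdint>

  static const size_t kRecycleLimit = 128 * 1024 * 1024;  // 128MB
  static size_t gRecycledSize;  // how much is currently kept for recycling

  void pages_unmap(void* aAddr, size_t aSize);           // return to the OS
  void RecordRecycledChunk(void* aChunk, size_t aSize);  // keep for reuse

  static void
  chunk_record(void* aChunk, size_t aSize)
  {
    size_t room =
        gRecycledSize < kRecycleLimit ? kRecycleLimit - gRecycledSize : 0;
    if (aSize > room) {
      // Split the chunk: keep only what fits under the limit and unmap
      // the rest instead of keeping the whole thing.
      pages_unmap(static_cast<uint8_t*>(aChunk) + room, aSize - room);
      aSize = room;
    }
    if (aSize) {
      RecordRecycledChunk(aChunk, aSize);
      gRecycledSize += aSize;
    }
  }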
--HG--
extra : rebase_source : 6765fd30b78ca5ddc7d55aac861355d960e47828
When recycling chunks, mozjemalloc tries to avoid keeping more than
128MB worth of chunks around, but it doesn't actually look at the
size of the chunks that are recycled, so if a chunk larger than 128MB is
recycled, it is kept as a whole, going well over the limit.
The chunks are still properly reused, and further recycling doesn't
occur, but that can limit other mmap users from getting enough address
space.
With this change, mozjemalloc now doesn't keep more than 128MB, by
splitting the chunks it recycles if they are too large.
Note this was not a problem on Windows, where chunks larger than 1MB are
never recycled (per CAN_RECYCLE).
--HG--
extra : rebase_source : f81e94c6960ba253037f77de1a39c9ff74651e80
The current default is 24, which is equal to the maximum number of stack frames
that DMD will record. And that's a terrible value because it splits up too many
related stack traces into separate records. There is no single best value, but
8 is a much better default.
--HG--
extra : rebase_source : c423fc4fe0e490ff6d58fa8f7116bc01c86a366e
Just one caller (in DMD) actually looks at it, and that's in an unimportant way
-- if the return value was false, mLength would be zero anyway.
--HG--
extra : rebase_source : 0463ab3765744742a9e854964342d631095fa55f
This patch does the following.
- Avoids some unnecessary casting.
- Renames the |bp| parameter as |aBp|.
- Makes the no-op FramePointerStackWalk() signature match the real one.
(Clearly it's dead code in all built configurations!)
--HG--
extra : rebase_source : 3fe606d1ff9b063294f4028ff884c20661ed9e0a
MozStackWalk() is different on Windows from the other platforms. It has two
extra arguments, which can be used to walk the stack of a different thread.
This patch makes those differences clearer. Instead of having a single function
and forbidding those two arguments on non-Windows, it removes those arguments
from MozStackWalk() and splits off MozStackWalkThread(), which retains them. This
also allows those arguments to have more appropriate types (HANDLE instead of
uintptr_t; CONTEXT* instead of void*) and names (aContext instead of
aPlatformData).
The patch also removes unnecessary reinterpret_casts for the aClosure argument
at a couple of MozStackWalk() callsites.
--HG--
extra : rebase_source : 111ab7d6426d7be921facc2264f6db86c501d127
Bug 1186064 removed most of it when we started requiring VS 2015u2, but
the "frex" function exported through mozglue.def.in was only used
through the MSVCRT being patched by fixcrt.py, which is not done anymore.
So the "frex" export is not used anymore, and so the "dumb_free_thunk"
function is not used anymore as well.
--HG--
extra : rebase_source : 879c469c317c8b6749410a4a476d6c951c9a1d0f
It causes abort() on errors from munmap/VirtualFree (which in practice
don't happen except maybe in case of memory corruption), and was only
enabled on debug builds.
--HG--
extra : rebase_source : fb0c8c82dc1e2d1f14a588a82144cf82e4f4f773
Since bug 1378258 removed malloc_print_stats, there are a bunch of
allocator stats that are now unused. Removing them reduces the memory
footprint of allocator metadata.
--HG--
extra : rebase_source : 337ef3b647c20119334b6576d591006f6bb3dd16
When initializing a new chunk for use as an arena, we started by zeroing
out the chunk (if it wasn't already zeroed) and then initializing a new
arena chunk in there. It turns out this can have a noticeable overhead,
especially when e.g. the new arena chunk is used for a large allocation
filled out by something that is then realloc()ated.
OTOH, the chunk recycle code only ever keeps zeroed or arena chunks
around (there is a "recycled" type too, but in practice, at the moment,
this means they were arena chunks before). Arena chunks that were
recycled were totally emptied, so all the runs they may contain will
contain zeroed-out or poisoned data. They also contain a header, that is
overwritten by the new arena chunk initialization.
This means we can get away with reusing non-zeroed recycled chunks
without zeroing them, as long as the arena chunk header marks the runs
as madvised instead of zeroed.
Code-wise, this would benefit from getting a ChunkType out of
chunk_alloc, but this would require more refactoring than I'm willing to
do at the moment.
Before returning a chunk, chunk_recycle calls pages_commit (when
MALLOC_DECOMMIT is enabled), which is guaranteed to zero the chunk.
The code that further zeroes the chunk afterwards, which is now moved out
to chunk_alloc callers, never took advantage of that fact, duplicating the
effort of zeroing the chunk on Windows.
By indicating to the callers that the chunk has already been zeroed, we
allow callers to skip zeroing on their own.
The current code only allows chunk_calloc() callers to tell whether they
want zeroed memory or not, but some might be okay either way, assuming
they act accordingly afterwards. So move the zeroing out of chunk_alloc.
Many functions in the mozjemalloc codebase like to return the opposite
boolean from what one would expect. pages_purge is one of them, and this
reverses the logic to match expectations.
Also make it static.
It turns out that not recycling some kinds of chunk can lead to the
recycle queue being starved in some scenarios. When that happens, we end
up mmap()ing new memory, but that turns out to be significantly slower.
So instead of not recycling huge chunks, we force-clean them, before
madvising so that the pages can still be reclaimed in case of memory
pressure.
--HG--
extra : rebase_source : 2dbd028daca92c9cd7c8079eb3dc5a0cfa06495b
This avoids MozStackWalk(), which has become unusably slow on Mac due to
changes in libunwind, and gets us back to decent speed.
The code for getting the frame pointer and stack end was copied from the Gecko
Profiler, which also uses FramePointerStackWalk() on Mac.
--HG--
extra : rebase_source : 58c32c2df8716c7c8123a4a8fb692182d066caca
Going through the system zone allocator for every call to realloc/free
on OSX is costly, because the zone allocator needs to first verify that
the allocations do belong to the allocator it invokes (which ends up
calling jemalloc's malloc_usable_size), which is unnecessary when we
expect the allocations to belong to jemalloc.
So, we export the malloc/realloc/free/etc. symbols from
libmozglue.dylib, such that libraries and programs linked against it
call directly into jemalloc instead of going through the system zone
allocator, effectively shortcutting the allocator verification.
The risk is that some things in Gecko try to realloc/free pointers it
got from system libraries, if those were allocated with a system zone
that is not jemalloc.
--HG--
extra : rebase_source : ee0b29e1275176f52e64f4648dfa7ce25d61292e
MOZ_REPLACE_MALLOC_LINKAGE was added back when there were problems with
getting weak references working properly for replace-malloc.
Versions of OSX < 10.6 needed flat namespace, but aren't supported
anymore.
Versions of Xcode < 4.5 required flat namespace + a dummy library in
order to produce proper weak references. There is virtually nobody still
building with such an ancient toolchain.
Keeping those around doesn't /really/ hurt, except recent versions of
Xcode don't expose dyldinfo in /usr/bin, used for the configure test.
Consequently, MOZ_REPLACE_MALLOC_LINKAGE ended up being set to use the
dummy library setup, which, by using flat namespace, now causes harm in
bug 1356701, causing bug 1378332.
--HG--
extra : rebase_source : e3edc1f2cf905943c33fafeb631f2f88fc87167e
At the same time:
- replace "(nullptr)" with "nullptr",
- replace "x == nullptr" with "!x",
- replace "x != nullptr" with "x".
--HG--
extra : rebase_source : 5bc5577d0de53244afae9ebadb84962f38306368
malloc_print_stats is designed as an end-of-process dump of malloc stats,
but it has limited value in Firefox itself. It has some usefulness as
something to invoke from a debugger while Firefox is running (after
manually setting opt_print_stats to true), but it also lacks information
about live allocations, which would make it more useful.
All in all, if we want some stats about jemalloc allocations in a more
useful way, we'd be better off augmenting the jemalloc_stats() API and
make the data available to e.g. about:memory, than keeping
malloc_print_stats.
--HG--
extra : rebase_source : f6fbc97dfddd52c8620e37e2e189ae0087a845b0
Going through the system zone allocator for every call to realloc/free
on OSX is costly, because the zone allocator needs to first verify that
the allocations do belong to the allocator it invokes (which ends up
calling jemalloc's malloc_usable_size), which is unnecessary when we
expect the allocations to belong to jemalloc.
So, we export the malloc/realloc/free/etc. symbols from
libmozglue.dylib, such that libraries and programs linked against it
call directly into jemalloc instead of going through the system zone
allocator, effectively shortcutting the allocator verification.
The risk is that some things in Gecko try to realloc/free pointers it
got from system libraries, if those were allocated with a system zone
that is not jemalloc.
--HG--
extra : rebase_source : 45b9b98499760a7f946878d41d2fdaadb6dff4d6
Setting MALLOC_XMALLOC enables a runtime toggle that makes
allocations abort() on OOM instead of returning NULL. In other words,
it enables a toggle that turns all allocations into infallible
allocations.
The toggle however still defaults to being disabled, which means even
when MALLOC_XMALLOC is defined when building mozjemalloc, there is no
change in behavior unless MALLOC_OPTIONS is set.
Even if this were useful to anyone (MALLOC_XMALLOC is only defined on
debug builds, limiting the usefulness), this is something
replace-malloc, in Firefox, is meant to be used for.
So let's remove this feature, and possibly add an equivalent
replace-malloc later if deemed necessary.
--HG--
extra : rebase_source : 1ccc893e9a8e842c3fa90e91f076a528df2f7dfe
Replace-malloc libraries, such as DMD, don't really need to care about
the details of implementing all the variants of aligned memory
allocation functions. Currently, by defining MOZ_REPLACE_ONLY_MEMALIGN
before including replace_malloc.h, they get predefined functions.
Instead of making that an opt-in at build time, we make the
replace-malloc initialization just fill the replace-malloc
malloc_table_t with implementations that rely on the replace_memalign
the library provides.
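For instance, posix_memalign can be derived from the library's
replace_memalign roughly like this (a sketch; gReplaceMemalign is an
assumed hook pointer, not the actual bridge field):

  #include <cerrno>
  #include <cstddef>

  static void* (*gReplaceMemalign)(size_t aAlignment, size_t aSize);

  static int
  ReplacePosixMemalign(void** aPtr, size_t aAlignment, size_t aSize)
  {
    // posix_memalign requires a power-of-two multiple of sizeof(void*).
    if (aAlignment % sizeof(void*) || (aAlignment & (aAlignment - 1))) {
      return EINVAL;
    }
    void* ptr = gReplaceMemalign(aAlignment, aSize);
    if (!ptr) {
      return ENOMEM;
    }
    *aPtr = ptr;
    return 0;
  }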
--HG--
extra : rebase_source : 0842a67d9bc27a9a86c33d14d98b9c25f39982fb
Until now, the malloc implementation functions would call the
replace-malloc functions if they exist, and fall back to the real
allocator if no such function exists. Instead of doing this, we now
fill the empty slots in the malloc_table_t with the real allocator
functions.
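A sketch of that filling step (the table layout and the MozJemalloc entry
points shown here are assumptions):

  #include <cstddef>

  struct malloc_table_t
  {
    void* (*malloc)(size_t);
    void (*free)(void*);
    void* (*realloc)(void*, size_t);
  };

  namespace MozJemalloc {
  void* malloc(size_t);
  void free(void*);
  void* realloc(void*, size_t);
  }

  static void
  FillMallocTable(malloc_table_t& aTable)
  {
    // Any slot the replace-malloc library left empty falls back to the
    // real allocator, so callers never have to null-check the pointers.
    if (!aTable.malloc) aTable.malloc = MozJemalloc::malloc;
    if (!aTable.free) aTable.free = MozJemalloc::free;
    if (!aTable.realloc) aTable.realloc = MozJemalloc::realloc;
  }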
--HG--
extra : rebase_source : b54634f23188906939e4dc01fc5a3007de0f3f2c
We make replace_malloc_init_funcs be called on all platforms and fill out a
malloc_table_t for the replace-malloc functions with what comes from
dlsym/GetProcAddress on Android/Windows, and from the dynamically linked
weak replace_* symbols on other platforms.
replace_malloc.h contains definitions of *_impl_t types for each of the
functions in the malloc_table_t, which is redundant with the
replace_*_impl_t types we were creating, so we remove those typedefs,
except for the two functions (init and get_bridge) that don't have such
a typedef. Those functions don't appear in malloc_table_t.
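On Android, the dynamic lookup part might look roughly like this (a
sketch; the table layout is trimmed down and MOZ_REPLACE_MALLOC_LIB is
assumed to name the library to load):

  #include <dlfcn.h>
  #include <cstddef>
  #include <cstdlib>

  typedef void* (*malloc_impl_t)(size_t);
  typedef void (*free_impl_t)(void*);

  struct malloc_table_t
  {
    malloc_impl_t malloc;
    free_impl_t free;
  };

  static void
  replace_malloc_init_funcs(malloc_table_t& aTable)
  {
    const char* lib = getenv("MOZ_REPLACE_MALLOC_LIB");
    void* handle = lib ? dlopen(lib, RTLD_LAZY) : nullptr;
    if (!handle) {
      return;  // no replacement library; keep the real allocator
    }
    aTable.malloc =
        reinterpret_cast<malloc_impl_t>(dlsym(handle, "replace_malloc"));
    aTable.free =
        reinterpret_cast<free_impl_t>(dlsym(handle, "replace_free"));
  }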
--HG--
extra : rebase_source : 3705a99ee07f63dbaa66973eef19ddab224e0911
We want, in a subsequent patch, to have replace_malloc_init_funcs be
called on all platforms (including those relying on the replace-malloc
library being loaded already) and perform more initialization.
To prepare for that, we move the non-platform-specific pieces out.
--HG--
extra : rebase_source : 239ed363ee168bf4f8a96e0a1ca52981cb941b71
All the _impl functions in replace-malloc.c are largely identical. This
replaces all of them with macro expansions.
--HG--
extra : rebase_source : 67a1809b0b0fc4645ea5041154fa3a6dcb6cce6b
This makes no significant difference in practice in the macro
expansions, but will help down the line.
--HG--
extra : rebase_source : 6d61c1f28c558321478d7e5f26390d27ae8ae3ac
MOZ_REPLACE_JEMALLOC was only defined when building jemalloc4 as a
replace-malloc library.
--HG--
extra : rebase_source : fa5c402da07fa96448c170b6db99629469691efe
It's a Solaris-only optimization that was used as a workaround for an
infinite loop in pages_map (bug 457189). In the meanwhile, the way
pages_map works has changed in such a way that the infinite loop can't
happen anymore.
Specifically, the original problem was that pages_map would try to
allocate something larger than a chunk, then deallocate it, and
reallocate at a hinted aligned address near the address of the now
deallocated mapping, hopefully getting an aligned chunk. But Solaris
would ignore the hint and the chunk would never be aligned, causing the
infinite loop.
What the code does now is over-allocate, find an aligned chunk in there,
and deallocate what ends up not being needed. This leaves no room for
the original infinite loop.
We thus remove the workaround and put Solaris on par with other Unix
platforms.
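A standalone sketch of the over-allocate-and-trim approach described
above (not the actual pages_map code):

  #include <sys/mman.h>
  #include <cstddef>
  #include <cstdint>

  static void*
  MapAlignedChunk(size_t aSize, size_t aAlignment)
  {
    // Map more than needed so an aligned block of aSize bytes must exist
    // somewhere inside the mapping, regardless of where it lands.
    size_t over = aSize + aAlignment;
    void* raw = mmap(nullptr, over, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANON, -1, 0);
    if (raw == MAP_FAILED) {
      return nullptr;
    }
    uintptr_t base = reinterpret_cast<uintptr_t>(raw);
    uintptr_t aligned = (base + aAlignment - 1) & ~(aAlignment - 1);
    size_t lead = aligned - base;
    size_t trail = over - lead - aSize;
    // Give back the unneeded head and tail; no hint, no retry loop.
    if (lead) {
      munmap(raw, lead);
    }
    if (trail) {
      munmap(reinterpret_cast<void*>(aligned + aSize), trail);
    }
    return reinterpret_cast<void*>(aligned);
  }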
--HG--
extra : rebase_source : 9ce4509d9c5ac6123cedabf87c5738672af35d1b
In Gecko code, MOZ_RELEASE_ASSERT means assertions that happen on all
types of builds.
In mozjemalloc, RELEASE_ASSERT means assertions that happen when
MOZ_JEMALLOC_HARD_ASSERTS is set, otherwise normal assertions. Which is
confusing. On the other hand, it's very similar to
MOZ_DIAGNOSTIC_ASSERT, and we may just want to use that instead.
Moreover, with release promotion, the check setting
MOZ_JEMALLOC_HARD_ASSERTS means releases (promoted from beta) would end
up with those asserts while they're not meant to, so
MOZ_DIAGNOSTIC_ASSERT is actually closer to the intent. It however means
we'd lose the beta population running those assertions.
--HG--
extra : rebase_source : 606ab47af8d9ee793b13b806acadb9781c99a078
Support for them was only enabled on debug builds, and required an
opt-in through e.g. MALLOC_OPTIONS to actually enable at runtime.
--HG--
extra : rebase_source : 60c27585c2901a73eb790cec124b880c93da6ef7
While it makes sense for a global system allocator, it's not really
interesting for Firefox to respect that. Plus, newer versions of
jemalloc, which are more likely to be used as a global system allocator
use a different format for the options passed through that file.
--HG--
extra : rebase_source : 0f2283cf5660bc437bd097bf48c2b5989fa08011
Those options, complementing the MALLOC_OPTIONS environment variable,
were always empty since the removal of b2g.
--HG--
extra : rebase_source : 0e34cfce0b537ebb8ed3902bd46180dc205cb0e4
Android NDK defines SYS_mmap2 but it expands to a nonexistent symbol.
mmap2 may not be supported in any case in some AArch64 kernels, so we
should just use mmap.
MozReview-Commit-ID: 5Vjuja5fLIL
The source file is renamed too, because the build system doesn't handle
sources changing suffix very well (at least not without a clobber).
The _GNU_SOURCE define is removed because GCC/Clang set it by default in
C++ mode.
--HG--
rename : memory/mozjemalloc/jemalloc.c => memory/mozjemalloc/mozjemalloc.cpp
extra : rebase_source : f57dbb0a66b25e70fe8c724e9250cc0d8b14f1c1
The hack dates back to the originally imported jemalloc code, which
couldn't assume it was built for Firefox. Now, we can assume that, which
means the code is always built with hidden visibility by default,
removing the need for the explicit hidden visibility.
Correspondingly, when building on Solaris with GCC, the default
visibility should also prevent the inlining, making the noinline
attribute redundant. And the Sun Studio path is useless since the
compiler is not supported anymore.
--HG--
extra : rebase_source : dab0ac68af56b1f9432d312665d4ff3df01fb58a
This avoids many additions of `extern "C"` in C++ code and will avoid
having to do the same to mozjemalloc once built as C++.
--HG--
extra : rebase_source : af55696262f40a9dd16a19c29edcb9bb307d4957
MOZ_JEMALLOC_API makes those symbols exported, but we're going to make
MOZ_JEMALLOC_API include `extern "C"`, which GCC warns about in this
case (can't use extern on a variable that is initialized).
While we could get around this in some way, there is not much use for
those variables being exported altogether: the only reason they are is
to allow an override when linking mozjemalloc into executables, but
doing that in Firefox requires patching the build system or passing some
specific LDFLAGS. People who really need to do that might as well apply
a patch.
They also allow run-time override through LD_PRELOAD, but one might as
well use the MALLOC_OPTIONS environment variable for _malloc_options.
As for _malloc_message, it doesn't seem very useful to override, and
probably no one ever overrode it at runtime.
Note, we may want to remove them in a followup.
--HG--
extra : rebase_source : f2dbe5dbf0bbdb369cd7c6255f624f16b8e37209
Using -Dabort=moz_abort actually makes the build fail in some libstdc++
headers when building as C++.
--HG--
extra : rebase_source : 77828d5c42f231372a8e75f5e3cd6af135d1d5e8
It's always unset, and Firefox has the logalloc replace-malloc library
for something similar.
--HG--
extra : rebase_source : cfe66c004df0d6e5db749f01feb9af591e3d1569
MOZ_MEMORY is always defined when building mozjemalloc. Due to the
origin of the code, this was all FreeBSD-specific code, and if we want
to add FreeBSD support, we will probably need to add some of it, but I'd
rather avoid keeping the difference between FreeBSD and other posix
systems if we can.
--HG--
extra : rebase_source : 3afe0136e35e25361e9e1802a9738d82b97e99e5
jemalloc_stats, as well as the pre/post-fork hooks, use the `arenas`
list along with the `narenas` count to iterate over all arenas set up by
mozjemalloc. Up until the previous commit, that was used for automatic
multiple arenas support, which is now removed.
But mozjemalloc still supports running with multiple arenas, in the form
of opted-in, per-thread arenas. After bug 1361258, those arenas weren't
tracked, and now that `arenas` only contains the default arena, we can
fill it with those thread-local arenas.
Keeping the automatic multiple arenas support, which we don't use and
don't really plan to, would have meant using a separate list for them.
--HG--
extra : rebase_source : f4eb55a65df8cdebff84ca709738f906d0c3c6f5
As explained two commits earlier, we remove the support for multiple
arenas. We however keep the `arenas` list and the `narenas` count and
use them to track the opted-in per-thread arenas.
--HG--
extra : rebase_source : 6e05cddd3dd385a0cd6a22fb028ab311b0c00678
mozjemalloc had an optimization that shortcuts using mutexes when the
program is single-threaded. But with code evolution, the check for whether
the program has multiple threads running ended up being true all the
time.
In order to simplify the code, we just remove those checks and the dead
code they were hiding in some cases.
--HG--
extra : rebase_source : 3c7a256bffef50761f6fcd6ec876ebabfcf3fdae
After bug 1361258, mozjemalloc uses a main arena for all allocations,
and specific threads can opt-in to use a different arena for that thread
only. Going forward, this is how we expect to support scaling across
different threads that require lots of concurrent allocations.
To simplify the mozjemalloc code, we'll remove the support for multiple
arenas.
This is the first step towards that, removing the support for arena
balancing. Everything behind the MALLOC_BALANCE macro has never been used
in practice.
--HG--
extra : rebase_source : e7ab669312f1e26a91375d11f5ad488e870bd354
NO_TLS used to be hardcoded on mac because up to 10.6, __thread was not
supported. Until recently, we still supported 10.6, but that's not the
case anymore, so we could make mac builds use __thread.
Unfortunately, on OSX, __thread circles back calling malloc to allocate
storage on first access, so we have an infinite loop problem here.
Fortunately, pthread_keys don't have this property, so we can use those
instead. They don't appear to have significantly more overhead (and TLS
overhead is small anyway compared to the amount of work involved in
allocating memory with mozjemalloc).
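A sketch of that pthread_key-based storage (variable and function names
are assumptions):

  #include <pthread.h>

  struct arena_t;

  static pthread_key_t gThreadArenaKey;

  static void
  InitThreadArenaKey()
  {
    // Unlike __thread on OSX, pthread keys don't call back into malloc on
    // first access, so this is safe to use from inside the allocator.
    pthread_key_create(&gThreadArenaKey, nullptr);
  }

  static arena_t*
  GetThreadArena()
  {
    return static_cast<arena_t*>(pthread_getspecific(gThreadArenaKey));
  }

  static void
  SetThreadArena(arena_t* aArena)
  {
    pthread_setspecific(gThreadArenaKey, aArena);
  }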
At the same time, we uniformize the initialization sequence between
mozjemalloc and mozjemalloc+replace-malloc, such that we have fewer
occasions for surprises when riding the trains (replace-malloc being
nightly only), ensuring the zone registration happens at the end of
mozjemalloc's initialization.
The function, when passed `true`, creates a new arena with no attachment
in the global list of arenas, and assigns it to the current thread.
When passed `false`, it restores the default arena.
Some details are left out because they don't matter yet, as the sole
initial use of the API is going to invoke the function when stylo rayon
threads start up, which happens exactly once per thread, and at thread
exit time, which happens at shutdown, if ever.
This simplifies things, and leaves those details to followup(s):
- Arenas can't simply be killed when the function is called with `false`
again (or when the thread dies) because they may still contain valid
allocations that could have been passed to other threads. Those arenas
should be kept until they are empty.
- jemalloc_stats doesn't know about them and will under-report memory
usage.
- pre/post fork hooks don't know about them and will not force-unlock
their locks. In practice, until those arenas are used for something
else than the style system, this can't lead to the dead-locks that
these hooks help prevent because nothing should be touching pointers
allocated through them after fork.
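A minimal usage sketch for those thread hooks (the exported name and
signature are assumptions based on the description above):

  extern "C" void jemalloc_thread_local_arena(bool aEnabled);

  void
  StyloThreadStart()
  {
    // Allocate from a private arena to avoid lock contention during the
    // heavy parallel styling work.
    jemalloc_thread_local_arena(true);
  }

  void
  StyloThreadShutdown()
  {
    // Switch back to the default arena before the thread goes away.
    jemalloc_thread_local_arena(false);
  }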
Add MOZ_FORMAT_PRINTF to the appropriate spots in DMD and fix up the
one (trivial) error that this pointed out.
MozReview-Commit-ID: LS0UWV5YRoM
--HG--
extra : rebase_source : eb09be39df61a51acd46ed72a1461c495727af79
1. The current asynchronous behavior is pointless, because we still remove the
hashtable entry synchronously, which deletes the value, and it's the value
we're using.
2. Trying to asynchronously delete the value is difficult, and not currently
needed because we can't get a memory-pressure notification while we're using
the value, and hence can't expire it from the expiration tracker.
Note: we can't get this memory-pressure notification because stage 2 of
mozalloc_handle_oom(), which would reclaim memory on OOM, is not implemented yet.
In order to avoid the possibility of a deadlock if the DMD state lock is
currently acquired when forking, a |pthread_atfork| hook is added to wait for
and acquire the lock prior to forking, then release it after forking.
Adapted from
4e2e3dd9cf
and
d9f7b2a430
As per the latter commit, it would seem unlocking, in fork() child
processes, mutexes that were locked in the parent process is not really
well supported on OSX 10.12. The addition of the zone_reinit_lock
function in 10.12 supports this idea.
--HG--
extra : rebase_source : b3b58558cc195d63200078085c7e9b6c9b8d83ff
This picks up the same changes as the ones we just did to
memory/build/zone.c, plus a one-liner for sparc64.
--HG--
extra : rebase_source : ca917461472e8188fbfc6e2a971b228b68014e38
Some system libraries are using malloc_default_zone() and then using
some of the malloc_zone_* API. Under normal conditions, those functions
check the malloc_zone_t/malloc_introspection_t struct for the values
that are allowed to be NULL, so that a NULL deref doesn't happen.
As of OSX 10.12, malloc_default_zone() doesn't return the actual default
zone anymore, but returns a fake, wrapper zone. The wrapper zone defines
all the possible functions in the malloc_zone_t/malloc_introspection_t
struct (almost), and calls the function from the registered default zone
(jemalloc in our case) on its own. Without checking whether the pointers
are NULL.
This means that a system library that calls e.g.
malloc_zone_batch_malloc(malloc_default_zone(), ...) ends up trying to
call jemalloc_zone.batch_malloc, which is NULL, and crash follows.
So as of OSX 10.12, the default zone is required to have all the
functions available (really, the same as the wrapper zone), even if they
do nothing.
This is arguably a bug in libsystem_malloc in OSX 10.12, but jemalloc
still needs to work in that case.
[Adapted from
c6943acb3c]
--HG--
extra : rebase_source : 7d7a5b47fa18f56183e99c3655aee003c9be161e
The SDK jemalloc is built against might not be the latest for various
reasons, but the resulting binary ought to work on newer versions of
OSX.
In order to ensure this, we need the fullest definitions possible, so
copy what we need from the latest version of malloc/malloc.h available
on opensource.apple.com.
[Adapted from
c68bb41793]
--HG--
extra : rebase_source : ab19c478b568ea24095a3be62c39fb81efc1920a
We have been using a different zone allocator between mozjemalloc and
replace-malloc for a long time. Jemalloc 4 uses the same as
replace-malloc, albeit as part of the jemalloc upstream code base.
We've been bitten many times in the past with Apple changes breaking the
zone allocator, and each time we've had to make changes to the three
instances, although two of them are similar and the changes there are
straightforward.
It also turns out that the way the mozjemalloc zone allocator is set up,
when a new version of OSX appears with a new version of the system zone
allocator, Firefox ends up using the system allocator, because the zone
allocator version is not supported.
So, we use the same zone allocator for both replace-malloc and
mozjemalloc, making everything on par with jemalloc 4.
--HG--
extra : rebase_source : 9c0e245b5f82bb71294370d607e690c05cc89fbc
The intent here is to reuse the zone allocator for mozjemalloc, to avoid
all the shortcomings of mozjemalloc using a different one. This change
only moves the replace-malloc zone allocator out of replace-malloc.c, to
make changes for mozjemalloc integration clearer.
--HG--
rename : memory/build/replace_malloc.c => memory/build/zone.c
extra : rebase_source : 8b98efaa4a88862f2967c855b511e92beb9c4031
Somehow, we never called those hooks when replace-malloc is enabled. I'd
expect this to cause random deadlocks when forking, and I'm surprised
this hasn't surfaced. Maybe it actually causes some intermittent oranges
on automation, who knows.
This also brings consistency with what is done for jemalloc 4, and with
the mozjemalloc implementation, too, which we're going to replace with
this one in a subsequent changeset.
--HG--
extra : rebase_source : 059567d17f928098db8367e9081b631ced351110
When built with --enable-debug, jemalloc4 makes headers define functions that
are normally inlined, and that prevents unified sources from working.
--HG--
extra : rebase_source : 9490a0a8312e9be18e639384e3450a9c924e3daf
extra : source : a67867200ec31a040bb6bf8320bde20beb34aa3e
Bug 1300948 added thread-ids to logalloc logs, and logalloc_munge.py was
munging them all as if they were all under the same namespace. But when
filtering munged logs to only contain logs from a given process,
thread-ids then don't necessarily start from 1, which would be the
desired outcome.
So use a different pool of thread-ids for each process.
Bug 1207519 added maybe_pod_* methods to the allocation policy
classes, but did not add them to InfallibleAllocPolicy.
I think the idea of these methods is that the callers are explicitly
opting into fallible behavior, so they will deal with any errors that
occur. For instance, in js::HashTable, this is used to try to shrink
the hash table when there are a lot of unused entries. If the shrink
fails, it just continues to use the existing block of memory.
However, having fallible methods in a supposedly infallible class is
weird, so for now, just use the infallible version.
MozReview-Commit-ID: 97D66Z4oLfl
--HG--
extra : rebase_source : 9a3e0b50a5602a8e4772da88c16e1715c25da7df
Code that uses InfallibleAllocPolicy presumably wants for operations
to always succeed. However, Vector and HashTable can end up detecting
that growing the data structure will fail due to integer overflow, and
then will call reportAllocOverflow() and fail. I think these cases
should crash.
In addition, pod_malloc and pod_realloc should crash rather than
returning NULL when they detect overflow.
This calls mozalloc_abort rather than MOZ_CRASH directly to avoid
circular #includes, because Assertions.h includes nsTraceRefcnt.h
which includes nscore.h which includes mozalloc.h.
MozReview-Commit-ID: 1g99BXLceQI
--HG--
extra : rebase_source : 927d842588c1f85a50a7a1c50a5546d5f688555f
The patch is generated from following command:
rgrep -l unused.h|xargs sed -i -e s,mozilla/unused.h,mozilla/Unused.h,
MozReview-Commit-ID: AtLcWApZfES
--HG--
rename : mfbt/unused.h => mfbt/Unused.h
This removes the unnecessary setting of c-basic-offset from all
python-mode files.
This was automatically generated using
perl -pi -e 's/; *c-basic-offset: *[0-9]+//'
... on the affected files.
The bulk of these files are moz.build files but there a few others as
well.
MozReview-Commit-ID: 2pPf3DEiZqx
--HG--
extra : rebase_source : 0a7dcac80b924174a2c429b093791148ea6ac204
The Win32 API has non-malloc-based heap allocation functions. They are not
widely used in the Gecko code base, but there are a few places that do
use them. They could be replaced with uses of malloc/free/realloc or
new/delete, but there is now an entirely separate problem that
requires wrapping those functions: the Rust static libraries are using
those functions to allocate memory.
Since bug 1253512 landed, it's possible for DeadBlocks to lack an allocation
stack.
--HG--
extra : rebase_source : 0efc60192ed0992d2f68838d95586cd888765586
Since the introduction of the STL wrappers, they have included
mozalloc.h, and multiple times, we've hit header reentrancy problems,
and worked around them as best as we could.
Taking a step back, all mozalloc.h does is:
- declare moz_* allocator functions.
- define inline implementations of various operator new/delete variants.
The first only requires the functions to be declared before they are used,
so mozalloc.h only needs to be included before anything that would use
those functions.
The second doesn't actually require a specific order, as long as the
declaration for those functions comes before their use, and they are
either declared in <new> or implicitly by the C++ compiler.
So all in all, it doesn't matter that mozalloc.h is included before the
wrapped STL headers. What matters is that it's included when STL headers
are included. So arrange things such that mozalloc.h is included after
the first wrapped STL header is fully preprocessed (and all its includes
have been included).
It's an annotation that is used a lot, and should be used even more, so a
shorter name is better.
MozReview-Commit-ID: 1VS4Dney4WX
--HG--
extra : rebase_source : b26919c1b0fcb32e5339adeef5be5becae6032cf
Bug 691003 made the minimum allocation size word-sized for Linux and OS X;
we now need to do a similar change on Windows as well. For Windows the
requirement is 8 bytes on 32-bit and 16 bytes on 64-bit.
Due to the change in part 1, DMD now prints an entry for every live block,
which increases the output file size significantly in the default case. However,
a lot of those entries are identical and so can be aggregated via the existing
"num" property.
This patch does that, reducing output size by more than half.
--HG--
extra : rebase_source : 6a3709d068f2fb9bbfe3344d7406f05af896380b
DMD currently uses a very hacky form of "sampling" by default to avoid
recording stack traces for all blocks. This makes DMD run faster than when it
records all stack traces.
This patch changes the sampling method used; in fact, it avoids "sampling" at
all. The existence of all heap blocks is now recorded exactly, but by default
we only record an allocation stack for each heap block if a Bernoulli trial
succeeds. This choice works well because getting the stack trace is ~100x
slower than recording the block's existence.
Overall, this approach is simpler and it also gives better output -- the choice
of which blocks to record allocation stacks for is mathematically sound, no
stack trace gets blamed for allocations it didn't do, and block counts and
sizes are now always exact.
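A sketch of that decision in the allocation hook (FastBernoulliTrial is
mfbt's class; the probability, seeds, and surrounding names are
assumptions):

  #include "mozilla/FastBernoulliTrial.h"
  #include <cstddef>

  class StackTrace;
  const StackTrace* GetStackTrace();  // ~100x more expensive than the rest
  void RecordLiveBlock(void* aPtr, size_t aSize, const StackTrace* aStack);

  static mozilla::FastBernoulliTrial gBernoulli(0.003, 0x8e26eeee166bc8ca,
                                                0x56820f304a9c9ae0);

  void
  OnMalloc(void* aPtr, size_t aSize)
  {
    // Every block's existence is recorded exactly; only the stack trace
    // is subject to the Bernoulli trial.
    const StackTrace* stack = gBernoulli.trial() ? GetStackTrace() : nullptr;
    RecordLiveBlock(aPtr, aSize, stack);
  }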
Other specific things changed.
- All notion of sampling is removed from the various data structures.
- The --sample-below option is removed in favour of --stacks={partial,full}.
- The format of the JSON output file has changed.
- The names of various test files have changed to reflect concept changes.
--HG--
rename : memory/replace/dmd/test/full-empty-cumulative-expected.txt => memory/replace/dmd/test/complete-empty-cumulative-expected.txt
rename : memory/replace/dmd/test/full-empty-dark-matter-expected.txt => memory/replace/dmd/test/complete-empty-dark-matter-expected.txt
rename : memory/replace/dmd/test/full-empty-live-expected.txt => memory/replace/dmd/test/complete-empty-live-expected.txt
rename : memory/replace/dmd/test/full-unsampled1-dark-matter-expected.txt => memory/replace/dmd/test/complete-full1-dark-matter-expected.txt
rename : memory/replace/dmd/test/full-unsampled1-live-expected.txt => memory/replace/dmd/test/complete-full1-live-expected.txt
rename : memory/replace/dmd/test/full-unsampled2-cumulative-expected.txt => memory/replace/dmd/test/complete-full2-cumulative-expected.txt
rename : memory/replace/dmd/test/full-unsampled2-dark-matter-expected.txt => memory/replace/dmd/test/complete-full2-dark-matter-expected.txt
rename : memory/replace/dmd/test/full-sampled-live-expected.txt => memory/replace/dmd/test/complete-partial-live-expected.txt
extra : rebase_source : 47d287405dc5e9075f08addaba49e879c2c6e23f
Bug 691003 made the minimum allocation size word-sized for Linux and OS X;
we now need to do a similar change on Windows as well. For Windows the
requirement is 8 bytes on 32-bit and 16 bytes on 64-bit.
This reduces memory usage by up to 3 MiB per process.
MozReview-Commit-ID: Gfs9PIJM4br
--HG--
extra : rebase_source : 69ac5acf7f7f0c802a047f5bf72838e5c4d1f123
It's rare anyone would see it, and it just duplicates the info present in |mach
run|.
--HG--
extra : rebase_source : fbb1716616ca1ff007af4202757586627c8612b4
This requires moving the --enable-dmd code earlier, before MOZ_PROFILING starts
being used.
--HG--
extra : rebase_source : acfdc6c4c82436c0a1834e11ddc567e37318da60
MOZ_DEBUG_DEFINES are essentially defines used everywhere. So treat them as
feeding the initial value for DEFINES in each moz.build sandbox. This allows
the kind of overrides that were done in the past by resetting MOZ_DEBUG_DEFINES
in Makefiles.
DONTBUILD because it only changes comments.
This will hopefully prevent confusion like that in bug 1215903.
--HG--
extra : rebase_source : f0a601d77b5f42b4fbe090693234f934e3becc42
The bulk of this commit was generated with a script, executed at the top
level of a typical source code checkout. The only non-machine-generated
part was modifying MFBT's moz.build to reflect the new naming.
CLOSED TREE makes big refactorings like this a piece of cake.
# The main substitution.
find . -name '*.cpp' -o -name '*.cc' -o -name '*.h' -o -name '*.mm' -o -name '*.idl'| \
xargs perl -p -i -e '
s/nsRefPtr\.h/RefPtr\.h/g; # handle includes
s/nsRefPtr ?</RefPtr</g; # handle declarations and variables
'
# Handle a special friend declaration in gfx/layers/AtomicRefCountedWithFinalize.h.
perl -p -i -e 's/::nsRefPtr;/::RefPtr;/' gfx/layers/AtomicRefCountedWithFinalize.h
# Handle nsRefPtr.h itself, a couple places that define constructors
# from nsRefPtr, and code generators specially. We do this here, rather
# than indiscriminately s/nsRefPtr/RefPtr/, because that would rename
# things like nsRefPtrHashtable.
perl -p -i -e 's/nsRefPtr/RefPtr/g' \
mfbt/nsRefPtr.h \
xpcom/glue/nsCOMPtr.h \
xpcom/base/OwningNonNull.h \
ipc/ipdl/ipdl/lower.py \
ipc/ipdl/ipdl/builtin.py \
dom/bindings/Codegen.py \
python/lldbutils/lldbutils/utils.py
# In our indiscriminate substitution above, we renamed
# nsRefPtrGetterAddRefs, the class behind getter_AddRefs. Fix that up.
find . -name '*.cpp' -o -name '*.h' -o -name '*.idl' | \
xargs perl -p -i -e 's/nsRefPtrGetterAddRefs/RefPtrGetterAddRefs/g'
if [ -d .git ]; then
git mv mfbt/nsRefPtr.h mfbt/RefPtr.h
else
hg mv mfbt/nsRefPtr.h mfbt/RefPtr.h
fi
--HG--
rename : mfbt/nsRefPtr.h => mfbt/RefPtr.h
This commit was generated using the following script, executed at the
top level of a typical source code checkout.
# Don't modify select files in mfbt/ because it's not worth trying to
# tease out the dependencies currently.
#
# Don't modify anything in media/gmp-clearkey/0.1/ because those files
# use their own RefPtr, defined in their own RefCounted.h.
find . -name '*.cpp' -o -name '*.h' -o -name '*.mm' -o -name '*.idl'| \
grep -v 'mfbt/RefPtr.h' | \
grep -v 'mfbt/nsRefPtr.h' | \
grep -v 'mfbt/RefCounted.h' | \
grep -v 'media/gmp-clearkey/0.1/' | \
xargs perl -p -i -e '
s/mozilla::RefPtr/nsRefPtr/g; # handle declarations in headers
s/\bRefPtr</nsRefPtr</g; # handle local variables in functions
s#mozilla/RefPtr.h#mozilla/nsRefPtr.h#; # handle #includes
s#mfbt/RefPtr.h#mfbt/nsRefPtr.h#; # handle strange #includes
'
# |using mozilla::RefPtr;| is OK; |using nsRefPtr;| is invalid syntax.
find . -name '*.cpp' -o -name '*.mm' | xargs sed -i -e '/using nsRefPtr/d'
# RefPtr.h used |byRef| for dealing with COM-style outparams.
# nsRefPtr.h uses |getter_AddRefs|.
# Fixup that mismatch.
find . -name '*.cpp' -o -name '*.h'| \
xargs perl -p -i -e 's/byRef/getter_AddRefs/g'
The flags added in toolkit/locales/Makefile.in turn out not to be actually
used, so just remove them.
The remaining uses of XULPPFLAGS are to set debug flags depending on whether
MOZ_DEBUG is set or not. Just set a dedicated variable with the right value
from configure.
Various bits depend on RefPtr.h to provide RefCounted<T> and RefPtr<T>.
It will be easier to manage an automatic conversion from RefPtr<T> to
nsRefPtr<T> if we split out the dependency on RefCounted<T> first.
Jemalloc 4 purges dirty pages regularly during free() when the ratio of dirty
pages compared to active pages is higher than 1 << lg_dirty_mult. We set
lg_dirty_mult in jemalloc_config to limit RSS usage, but it also has an impact
on performance.
So instead of enforcing a high ratio to force more pages being purged, we keep
jemalloc's default ratio of 8, and force a regular purge of all dirty pages,
after cycle collection.
Keeping jemalloc's default ratio avoids the cycle-collection-triggered purge
having to go through really all dirty pages when there are a lot, in which case
the normal jemalloc purge during free() will already have kicked in. It also
takes care of everything that doesn't run the cycle collector still having
some level of purge, like plugins in the plugin-container.
At the same time, since jemalloc_purge_freed_pages does nothing with jemalloc 4,
repurpose the MEMORY_FREE_PURGED_PAGES_MS telemetry probe to track the time
spent in this cycle-collector-triggered purge.
The patch removes 455 occurrences of FAIL_ON_WARNINGS from moz.build files, and
adds 78 instances of ALLOW_COMPILER_WARNINGS. About half of those 78 are in
code we control and which should be removable with a little effort.
--HG--
extra : rebase_source : 82e3387abfbd5f1471e953961d301d3d97ed2973