In PLDHashTable the equivalent functions have a "Shallow" prefix, which makes
it clear that they don't measure things hanging off the table. This patch makes
mozilla::Hash{Set,Map} do likewise.
MozReview-Commit-ID: 3kwCJynhW7d
--HG--
extra : rebase_source : 9c03d11f376a9fd4cfd5cfcdc0c446c00633b210
Also use mozilla::HashNumber where appropriate.
MozReview-Commit-ID: BTq0XDS5UfQ
--HG--
extra : rebase_source : 28f45a9b27e831e99620a2b575f373003f1301f2
Summary: GTest is permafailing on Windows because of timeout.
Reviewers: glandium
Reviewed By: glandium
Bug #: 1474254
Differential Revision: https://phabricator.services.mozilla.com/D2043
--HG--
extra : rebase_source : cea68e50f96a1788bf15dc6ca7859e5f698a6209
This adds an '--allocation-filter' param that can be used to limit output to
only allocations that include the filter in their stack. For example:
dmd.py --allocation-filter fontconfig dmd.json.gz
limits its output to just allocations that have an instance of 'fontconfig' in
in one of their stack frames.
--HG--
extra : rebase_source : 9ed34d5c7a6a2b76e96455e66b2e8196686ade16
This was done automatically replacing:
s/mozilla::Move/std::move/
s/ Move(/ std::move(/
s/(Move(/(std::move(/
Removing the 'using mozilla::Move;' lines.
And then with a few manual fixups, see the bug for the split series..
MozReview-Commit-ID: Jxze3adipUh
Bug 1364359 is to fix a leaked arena. Until that is fixed; it is unsafe to
call moz_dispose_arena more than once.
MozReview-Commit-ID: KIby1RLtrPK
--HG--
extra : rebase_source : 6ea41001e9f0c4d5eb24ee678d6c1c0218991ac3
This behavior matches what gets used in mozalloc.h to define these
wrappers, and is particularly necessary for newer versions of clang to
not complain about our definitions.
That NDK bug has been fixed since r8c, and we now require something more
recent than that. This effectively reverts the changes from bug 720621
and bug 734832.
--HG--
extra : rebase_source : 9ff76a790ec4135dc0172cfd0f11fc1ecef7df64
Base allocator commit stats were added in bug 515556, along other commit
stats, but they have actually been wrong since then: the committed count
is updated with the difference between pbase_next_addr and
base_next_decommitted *after* the latter is set to the former, making
the difference always 0.
--HG--
extra : rebase_source : a2aed523314549a37a61bd4ab300c98f198f9252
Since TreeNode::{Left,Right,Color} is always a valid call to make, we
don't need to check if for nullity before calling those functions.
This effectively kind of reverts some parts of bug 1412722.
--HG--
extra : rebase_source : 3deb316f463b51fdbb3aebc2e57e437018b3a829
The code before bug 1412722 benefitted from the sentinel being an actual
node object, and some code paths ended up checking its color (always
black) or getting its right and left node, that always returned to the
sentinel.
When TreeNode currently contains a nullptr, all those lead to a null
deref if the calling code is not doing the right checks, which happens
to be the case in at least some places. Instead of relying on the
callers doing the right thing, make the TreeNode do the right thing when
it contains a nullptr, effectively acting as the sentinel in that case.
We additionally ensure that nothing in the calling code will be trying
to change the color or left/right pointers on the sentinel, which is an
extra safe net compared to the code before bug 1412722.
--HG--
extra : rebase_source : 09ab0bf8682092ef6d9a0a5921be3da787d0d548
This will allow the upcoming changes to add some safety back to the code
after bug 1412722.
--HG--
extra : rebase_source : 5033b8034cabaf5a7fdd578459588d5099402d02
Since TreeNode::{Left,Right,Color} is always a valid call to make, we
don't need to check if for nullity before calling those functions.
This effectively kind of reverts some parts of bug 1412722.
--HG--
extra : rebase_source : 172f1c042bdbb4d500e1afb4d57774ab76826876
The code before bug 1412722 benefitted from the sentinel being an actual
node object, and some code paths ended up checking its color (always
black) or getting its right and left node, that always returned to the
sentinel.
When TreeNode currently contains a nullptr, all those lead to a null
deref if the calling code is not doing the right checks, which happens
to be the case in at least some places. Instead of relying on the
callers doing the right thing, make the TreeNode do the right thing when
it contains a nullptr, effectively acting as the sentinel in that case.
We additionally ensure that nothing in the calling code will be trying
to change the color or left/right pointers on the sentinel, which is an
extra safe net compared to the code before bug 1412722.
--HG--
extra : rebase_source : ac61ea259ac49bf76e2f8f6f54dda991498d4664
This will allow the upcoming changes to add some safety back to the code
after bug 1412722.
--HG--
extra : rebase_source : c906e9b3168fe738cba8a3de3fdf4efee1f0d4df
- Turn MOZ_DIAGNOSTIC_ASSERTs related to double-frees into
MOZ_RELEASE_ASSERTs with a crash message making them more identifiable
than the asserted condition.
- In huge_dalloc, MOZ_RELEASE_ASSERT early, instead of letting
RedBlackTree::Remove end up crashing because the node is not in the
tree.
--HG--
extra : rebase_source : e051caaf371e88a9db6b5153f58c8a4aa4cde757
Before bug 1412722, which removed the sentinels, the code looked like:
if (rbp_r_c->Right()->Left()->IsBlack()) {
At that point in the code, rbp_r_c is the root node of the tree. If
rbp_r_c->Right() was the sentinel, ->Right()->Left() would be the
sentinel too, and the sentinel is black. Which means the condition would
be true.
The code after was:
if (rbp_r_c->Right() && (!rbp_r_c->Right()->Left() ||
rbp_r_c->Right()->Left()->IsBlack())) {
The second half correctly deals with the case of
rbp_r_c->Right()->Left() being the sentinel. But the first half now
makes things different: ->Right() being null would correspond to the
previous case where it was the sentinel, and the test would not return
true in that case when it would have before. When ->Right() is not null,
things are normal again.
The correct check is to make the branch taken when ->Right() is null.
Now, looking under which conditions we may get in that branch wrongly:
- The root node's right link must be empty, which means a very small tree.
- The comparison between the removed key and the root node must indicate
the key is greater than the value of the root node.
- There's another case where the comparison result (rbp_r_cmp) can be
eGreater, when it is reassigned under one of the branches under the
eEqual test, and that branch is only taken when ->Right() on the root
node was non-null, which was the non-broken case.
So it would seem we can't reach that code when rbp_r_c->Right() is null
anyways, so it /should/ practically make no difference. Better safe than
sorry, though. It's hard to tell anything from crash stats, because
since the templatization in bug 1403444, all crashes fit in one bucket,
when there used to be 5 functions before :(
While here, add a missing include in rb.h.
--HG--
extra : rebase_source : 2ebcb84345ad52059b0c081b9e2e1af1d0bbb7bc
We're not actually using it, and it messes up with the zone allocator in
mozjemalloc after fork(). See the lengthy analysis in
https://bugzilla.mozilla.org/show_bug.cgi?id=1424709#c34 and following.
--HG--
extra : rebase_source : c58e13b897dde7b32d83c43fbb2a04a0db3a5dc9
This was done using the following script:
37e3803c7a/processors/chromeutils-import.jsm
MozReview-Commit-ID: 1Nc3XDu0wGl
--HG--
extra : source : 12fc4dee861c812fd2bd032c63ef17af61800c70
extra : intermediate-source : 34c999fa006bffe8705cf50c54708aa21a962e62
extra : histedit_source : b2be2c5e5d226e6c347312456a6ae339c1e634b0
This was done using the following script:
37e3803c7a/processors/chromeutils-import.jsm
MozReview-Commit-ID: 1Nc3XDu0wGl
--HG--
extra : source : 12fc4dee861c812fd2bd032c63ef17af61800c70
This was done using the following script:
37e3803c7a/processors/chromeutils-import.jsm
MozReview-Commit-ID: 1Nc3XDu0wGl
--HG--
extra : rebase_source : c004a023389f1f6bf3d2f3efe93c13d423b23ccd
Bug 1280578 added some wrapping for the Win32 Heap* functions, mainly
for the rust static libraries that use them. Because pointer ownership
might cross the C++/Rust boundary, and because about:memory uses
malloc_usable_size/msize, we need both C++ and Rust to still use the
same heap on builds where our allocator is not enabled.
--HG--
extra : rebase_source : 37a25b376a02ea07c187fb161d2005141e783820
Bug 1423803 was attempting to remove the fallible library but didn't do
so on Android because of this bug. We can now fully retire it.
--HG--
extra : rebase_source : de38872a08d24768eadfbe81652cfcd6aa7aa041
In bug 1423803, mozilla::fallible was made an "alias" of std::nothrow.
Except C++ doesn't allow compilers to be too smart, and in many cases,
they would actually create data fields to hold a copy of std::nothrow,
even creating a static initializer on non-optimized builds to do so.
By turning it into a reference, we allow compilers to just use
std::nothrow directly, as if it were passed directly, but they can still
create unused data fields. Turning it into a static allows compilers to
skip the data fields altogether.
On a local linux64 build, this saves 242 bytes of .bss.
Note this does change a `lea` (address calculation) into a `mov` (read),
but it shouldn't matter too much.
--HG--
extra : rebase_source : 9c08e8263aef267b8ad5962b0248c7effcb67796
The std::nothrow variant of operator new is effectively a fallible
operator new. It is used in third party code.
The duplication with our own fallible operator new is unfortunate, and
we can reduce it by making one an alias of the other.
We keep the fallible library as a dummy on Android because bug 1423802
induces some linking problems.
--HG--
extra : rebase_source : d7b915aaafde40057e87b7ad4bbd82d348e4f12d
This was added in bug 1122337 back when the stackwalker was still
in XPCOM, which it isn't anymore, so XPCOM_GLUE is not necessary
anymore.
--HG--
extra : rebase_source : e550671c26e250843d34cb2b83497c861225883f
Bug 1163171 removed support for building for android with GCC, and we
don't need to use throw() anymore. We can use the same code as for other
non-Windows platforms.
--HG--
extra : rebase_source : 9c2c2179a6d214096619ff0ae1d1a42912beda79
MOZ_ALWAYS_INLINE_EVEN_DEBUG is always defined through
mozilla/Attributes.h, so the fallbacks are never used in practice. They
are just there from the old days when mozalloc.h didn't use
mozilla/Attributes.h.
--HG--
extra : rebase_source : 0d55068ab5fcec3f4bcafecd8c3ce371597f8cfe
They are both infallible wrappers of posix_memalign and valloc.
There is also moz_xmemalign, which wraps memalign, which is mostly
always available as of bug 1402647.
None of them are actually used, but it's still desirable to at least
have one infallible variant, so keep moz_xmemalign and remove the other
two.
While here, we actually make both memalign and moz_xmemalign always
available.
--HG--
extra : rebase_source : 1c3ca4b3e3310543145f3181dfa4e764be1d6ff8
When this was added, the xpcom glue was still a thing, and there was a
distinction between things that would build with mozalloc available and
others. There is no such distinction anymore. Anything that has access
to xpcom has access to infallible allocator functions.
--HG--
extra : rebase_source : 04bce114e940c53709275d0354ea7240df4a051e
They are both infallible wrappers of posix_memalign and valloc.
There is also moz_xmemalign, which wraps memalign, which is always
available as of bug 1402647.
None of them are actually used, but it's still desirable to at least
have one infallible variant, so keep moz_xmemalign and remove the other
two.
While here, we actually make moz_xmemalign always available, since
memalign is always available.
--HG--
extra : rebase_source : 17300bc03a715e5d36b4b687f22050622c1c70c8
moz_posix_memalign is a wrapper for posix_memalign that only exists if
posix_memalign exists.
On OSX, it has a fallback for an under-specified bug where it
purportedly returns a pointer that doesn't have the requested alignment.
That fallback was added in bug 414946, over 6 years ago, before jemalloc
was even enabled on OSX.
Considering posix_memalign is used directly in many other places in
Gecko, that we almost always use mozjemalloc, which doesn't have these
problems, and that in all likeliness, the bug was in some old version of
OSX that is not supported anymore, the fallback does not seem all that
useful.
So, just use posix_memalign directly.
--HG--
extra : rebase_source : b2151b5fb598dc20cbd70308555059f7545b18b2
Most adjustements come from some recent .clang-format changes. A few
were overlooked from changes to the code.
--HG--
extra : rebase_source : c397762ea739fec05256a8c0132541bf4c727c32
So far, logalloc has avoided logging calls that e.g. return null
pointers, but both to make the code more generic and to enable logging
of error conditions, we now log every call.
--HG--
extra : rebase_source : 5e41914552f44e330f8f9c12b34fd6d30fdf30a7
Instead, only register a minimal set of functions when an environment
variable is set.
--HG--
extra : rebase_source : 94f2403ed9afe2acab1f56714d60fb32401076dc
Also change the definition of the StaticMutex constructor to unconfuse clang-format.
--HG--
extra : rebase_source : aa2aa07c8c324e1253c20b34d850230579d45632
For functions with no result, such as free, it's invalid for some string
to appear after the closing parenthesis.
--HG--
extra : rebase_source : d7a72c064c408ba9c4a8722ebbaafb878633e857
While jemalloc_stats is not actively doing anything, it can be
cumbersome to not have it count as an operation, because the operation
count shown on jemalloc_stats doesn't match the line number in the input
replay log, and the offset grows as the number of jemalloc_stats
operations grows.
While here, also update a comment about the replay log format.
--HG--
extra : rebase_source : da0ad9a990487ebdfadae7f8fcfad85e82b482fc
It adds an uncompressible and noticeable time overhead to replaying
logs, even when one is not interested in measuring RSS. This has caused
me to clear the method body on multiple occasions.
If necessary, it is possible to enable zero or junk at the allocator
level for the same effect.
--HG--
extra : rebase_source : a4c44e97986668e712b500266d7fffe985e85881
And statically link logalloc.
Statically linking is the default, except when building with
--enable-project=memory, allowing to use the generated libraries from
such builds with Firefox.
--HG--
extra : rebase_source : efe9edce8db6a6264703e0105c2192edc5ca8415
This makes things slightly more inconvenient (having to set two
environment variables instead of one for the simplest case) until a few
patches down the line, when DMD is statically linked, at which point it
will get down to one environment variable every time.
--HG--
extra : rebase_source : 08dc3c05318b572ae1026227d0369fa8bf21b20f
Now that replace_init can opt-out of registering the replace-malloc
functions, don't do so when MALLOC_LOG was not set in the environment.
While one would normally set MALLOC_LOG alongside one of the environment
variable necessary to load the replace-malloc library, we're also going,
in a subsequent change, to allow statically linking replace-malloc
libraries, taking full advantage of this change.
--HG--
extra : rebase_source : 944a9d7af33f88f793ee0104bd5e58ec508e4f58
As of bug 1420353, DMD's replace_* functions can't be called before
replace_init places them in the malloc function table, which only
happens after DMD::Init has run, meaning DMD is always initialized
by the time any of its replace_* function can be called.
--HG--
extra : rebase_source : 96bf4d01b6fac5cbb4712f56c572791cc4972f77
The original purpose of those declarations was to avoid the function
definitions being wrong, as well as forcing them being exported
properly (as extern "C", as weak symbols when necessary, etc.), but:
- The implementations being C++, function overloads simply allowed
functions with the same name to have a different signature.
- As of bug 1420353, the functions don't need to be exported anymore,
nor do we care whether their symbols are mangled. Furthermore, they're
now being assigned to function table fields, meaning there is type
checking in place, now.
So all in all, these declarations can be removed.
Also, as further down the line we're going to statically link the
replace-malloc libraries, avoid symbol conflicts by making those
functions static.
--HG--
extra : rebase_source : 0dbb15f2c85bc873e7eb662b8d757f99b0732270
This was never strictly required (for instance, DMD doesn't do that),
and would make things harder with the subsequent changes.
--HG--
extra : rebase_source : 29ea08d41f54da7f99120f9fe9af4017f61d8a4b
And statically link logalloc.
Statically linking is the default, except when building with
--enable-project=memory, allowing to use the generated libraries from
such builds with Firefox.
--HG--
extra : rebase_source : efe9edce8db6a6264703e0105c2192edc5ca8415
This makes things slightly more inconvenient (having to set two
environment variables instead of one for the simplest case) until a few
patches down the line, when DMD is statically linked, at which point it
will get down to one environment variable every time.
--HG--
extra : rebase_source : 08dc3c05318b572ae1026227d0369fa8bf21b20f
Now that replace_init can opt-out of registering the replace-malloc
functions, don't do so when MALLOC_LOG was not set in the environment.
While one would normally set MALLOC_LOG alongside one of the environment
variable necessary to load the replace-malloc library, we're also going,
in a subsequent change, to allow statically linking replace-malloc
libraries, taking full advantage of this change.
--HG--
extra : rebase_source : 944a9d7af33f88f793ee0104bd5e58ec508e4f58
As of bug 1420353, DMD's replace_* functions can't be called before
replace_init places them in the malloc function table, which only
happens after DMD::Init has run, meaning DMD is always initialized
by the time any of its replace_* function can be called.
--HG--
extra : rebase_source : 96bf4d01b6fac5cbb4712f56c572791cc4972f77
The original purpose of those declarations was to avoid the function
definitions being wrong, as well as forcing them being exported
properly (as extern "C", as weak symbols when necessary, etc.), but:
- The implementations being C++, function overloads simply allowed
functions with the same name to have a different signature.
- As of bug 1420353, the functions don't need to be exported anymore,
nor do we care whether their symbols are mangled. Furthermore, they're
now being assigned to function table fields, meaning there is type
checking in place, now.
So all in all, these declarations can be removed.
Also, as further down the line we're going to statically link the
replace-malloc libraries, avoid symbol conflicts by making those
functions static.
--HG--
extra : rebase_source : 0dbb15f2c85bc873e7eb662b8d757f99b0732270
This was never strictly required (for instance, DMD doesn't do that),
and would make things harder with the subsequent changes.
--HG--
extra : rebase_source : 29ea08d41f54da7f99120f9fe9af4017f61d8a4b
The definition for replace_init is only used in two places:
replace_malloc.h and mozjemalloc.cpp. Invoking the malloc_decls.h
machinery for just that one function is a lot of overhead, and things
are actually simpler doing the declaration directly, especially thanks
to the use of C++11 decltype to avoid duplication of the function
type.
This has the additional advantage that now malloc_decls.h only contains
the allocator API.
While here, replace the MOZ_WIDGET_ANDROID ifdef with ANDROID.
--HG--
extra : rebase_source : 50ddf044f43904575598f17bd4c3477fc1a1155f
Because one entry point is simpler than two, we make replace_init fulfil
both the roles of replace_init and replace_get_bridge.
Note this should be binary compatible with older replace-malloc
libraries, albeit not detecting their bridge (and with the
previous change, they do not register anyways). So loading older
replace-malloc libraries should do nothing, but not crash in awful ways.
--HG--
extra : rebase_source : aaf83e706ee34f45cfa75551a2f0998e5c5b8726
The allocator API is a moving target, and every time we change it, the
surface for replace-malloc libraries grows. This causes some build
system problems, because of the tricks in replace_malloc.mk, which
require the full list of symbols.
Considering the above and the goal of moving some of the replace-malloc
libraries into mozglue, it becomes simpler to reduce the replace-malloc
exposure to the initialization functions.
So instead of the allocator poking into replace-malloc libraries for all
the functions, we expect their replace_init function to alter the table
of allocator functions it's passed to register its own functions.
This means replace-malloc implementations now need to copy the original
table, which is not a bad thing, as it allows function calls with one
level of indirection less. It also replace_init functions to not
actually register the replace-malloc functions in some cases, which will
be useful when linking some replace-malloc libraries into mozglue.
Note this is binary compatible with previously built replace-malloc
libraries, but because those libraries wouldn't update the function
table, they would stay disabled.
--HG--
extra : rebase_source : 2518f6ebe76b4c82359e98369de6a5a8c3ca9967
Some platforms (at least powerpc64) apparently can't handle long double
constants. So use double constants instead.
--HG--
extra : rebase_source : dd7f87cfff13f63566845e03a0bd6f4a8146f421
Bug 1417234 replaced all uses of CRITICAL_SECTION with SRWLocks in the
allocator on Windows, and this seems to have incurred performance
regressions on speedometer.
OTOH, there is a real benefit from not having to manually initialize the
allocator.
So we restore the use of CRITICAL_SECTIONs for Mutexes in the allocator,
except for the initialization lock, which is remaining as a SRWLock.
Talos indicates this solves the regression in large part, but is not
definitive as whether it has the same effect as a pure backout of bug
1417234. We'll see how things go over time.
--HG--
extra : rebase_source : a52b8d08159f6fce8286578114af84e55851a86b
- This makes all of them return an enum, quite similar to rust.
- This makes all of them derive from a single implementation of an
integer comparison.
- This implements that integer comparison in terms of simple operations,
rather than "smart" computation, which turns out to allow compilers
to do better optimizations.
See https://blogs.msdn.microsoft.com/oldnewthing/20171117-00/?p=97416
--HG--
extra : rebase_source : 89710d7954f445d43e6da55aaf171b1f57dce5fc
Zero and junk should have the same scope, but currently huge and large
reallocs are zeroed when zeroing is enabled, but are not junked when
junking is enabled. This makes things straight, leaning on the side of
filling the added bytes, rather than not.
This has an actual effect on debug builds, where junk is enabled by
default.
--HG--
extra : rebase_source : f409cae7ea720f69239d896d155b653efc648feb
This also creates a new arena_t::Ralloc function replacing parts of
iralloc, the other part being inlined in the unique caller.
--HG--
extra : rebase_source : 76a9ca77e651c99641d8906faea8e469d8eea041
Since the old size arena_ralloc is given is already a size class, when
the size class for the new size is the same as the old size, the new
size can't be larger than the old size, so there's never anything to
zero.
--HG--
extra : rebase_source : dd468de60b2ed87718ec4e26f13d3b0c8932f455
We intend to move some functions to methods of the arena_t class. Moving
the arena selection out of them is the first step towards that.
--HG--
extra : rebase_source : b8380c3a0c90ed817a1dbbe8d60e6107b78ec677
The immediate goal for this is to allow determinism in an arena used for
an upcoming test, by essentially disabling purge on that specific arena.
We do that by allowing arenas to be created with a specific setting for
mMaxDirty.
Incidentally, this allows to cleanup the mMaxDirty initialization for
thread-local arenas.
Longer term, this would allow to tweak arenas with more parameters, on
a per arena basis.
--HG--
extra : rebase_source : e4b844185d132aca9ee10224fc626f8293be0a34
Some unit tests rely on jemalloc_stats to get information such as chunk
size or page size. They can do so before any allocation happens, when
using gtest filters. So it is preferable for jemalloc_stats to
initialize the allocator.
--HG--
extra : rebase_source : 6696ec1cdaa3b121a3d12cb7b6049b79c656d271
We currently turn off the C++14 sized-deallocation facility on MSVC, and
we'd like to ensure we do the same thing for clang and gcc. To do so,
we add new functionality to moz.configure for checking and adding
compilation flags, similar to the facility for checking and adding
warning flags. The newly added facility is then used to add
-fno-sized-deallocation to the compilation flags, when the option is
supported.
Once we do this, we can't define the sized deallocation functions in
mozalloc.h; the compiler will complain that we are using
-fno-sized-deallocation, yet defining these special functions that we'll
never use. These functions were added for MinGW, where we needed to
compile with C++14 ahead of other platforms to be compatible with MSVC
headers. But they're no longer necessary, though they would be if we
removed -fno-sized-deallocation; the compiler will complain if we do
that and we'll add them back at that point.
SRWLock is more lightweight than CriticalSection, but is only available
on Windows Vista and more. So until we actually dropped support Windows
XP, we had to use CriticalSection.
Now that all supported Windows versions do have SRWLock, this is a
switch we can make, and not only because SRWLock is more lightweight,
but because it can be statically initialized like on other platforms,
allowing to use the same initialization code as on other platforms,
and removing the requirement for a DllMain, which in turn can allow
to statically link mozjemalloc in some cases, instead of requiring a
shared library (DllMain only works on shared libraries), or manually
call the initialization function soon enough.
There is a downside, though: SRWLock, as opposed to CriticalSection, is
not fair, meaning it can have thread scheduling implications, and can
theoretically increase latency on some threads. However, it is the
default used by Rust Mutex, meaning it's at least good enough there.
Let's see how things go with this.
--HG--
extra : rebase_source : 337dc4e245e461fd0ea23a2b6b53981346a545c6
Currently, huge allocations are completely independent from arenas. But
in order to ensure that e.g. moz_arena_realloc can't reallocate huge
allocations from another arena, we need to track which arena was
responsible for the huge allocation. We do that in the corresponding
extent_node_t.
Both functions do essentially the same thing, one having more validation
than the other. We can use a template with a boolean parameter to avoid
the duplication.
Furthermore, we're soon going to require, in some cases, more
information than just the size of the allocation, so we wrap their
result in a helper class that gives information about an active
allocation.
FloorLog2 expands to, essentially, a compiler builtin/intrinsic, that,
in turn, expands to a single machine instruction on tier 1 and other
platforms. On platforms where that's not the case, we can expect the
compiler to generate fast code anyways. So overall, this is all better
than manually using a log2 lookup table.
Also replace a manual power-of-two check with mozilla::IsPowerOfTwo,
which does the same test.
--HG--
extra : rebase_source : e8164c254723c74ef83e798073327ec6afa6f1fb
They are both only used once, are trivial wrappers, and even repeat the
same assertions.
--HG--
extra : rebase_source : b40b26e303cb69a451e63937efd8a666053954e5
There are multiple flaws to the current code:
- The loop calculating the right parameters for a given run size is
repeated.
- The loop trying different run sizes doesn't actually work to fulfil
the overhead constraint: while it stops when the constraint is
fulfilled, the values that are kept are those from the previous
iteration, which may well be well over the constraint.
In practice, the latter resulted in a few surprising results:
- most size classes had an overhead slightly over the constraint
(1.562%), which, while not terribly bad, doesn't match the set
expectations.
- some size classes ended up with relatively good overheads only because
of the additional constraint that run sizes had to be larger than the
run size of smaller size classes. Without this constraint, some size
classes would end up with overheads well over 2% just because that
happens to be the last overhead value before reaching below the 1.5%
constraint.
Furthermore, for higher-level fragmentation concerns, smaller run sizes
are better than larger run sizes, and in many cases, smaller run sizes
can yield the same (or even sometimes, better) overhead as larger run
sizes. For example, the current code choses 8KiB for runs of size 112,
but using 4KiB runs would actually yield the same number of regions, and
the same overhead.
We thus change the calculation to:
- not force runs to be smaller than those of smaller classes.
- avoid the code repetition.
- actually enforce its overhead constraint, but make it 1.6%.
- for especially small size classes, relax the overhead constraint to
2.4%.
This leads to an uneven set of run sizes:
size class before after
4 4 KiB 4 KiB
8 4 KiB 4 KiB
16 4 KiB 4 KiB
32 4 KiB 4 KiB
48 4 KiB 4 KiB
64 4 KiB 4 KiB
80 4 KiB 4 KiB
96 4 KiB 4 KiB
112 8 KiB 4 KiB
128 8 KiB 8 KiB
144 8 KiB 4 KiB
160 8 KiB 8 KiB
176 8 KiB 4 KiB
192 12 KiB 4 KiB
208 12 KiB 8 KiB
224 12 KiB 4 KiB
240 12 KiB 4 KiB
256 16 KiB 16 KiB
272 16 KiB 4 KiB
288 16 KiB 4 KiB
304 16 KiB 12 KiB
320 20 KiB 12 KiB
336 20 KiB 4 KiB
352 20 KiB 8 KiB
368 20 KiB 4 KiB
384 24 KiB 8 KiB
400 24 KiB 20 KiB
416 24 KiB 16 KiB
432 24 KiB 12 KiB
448 28 KiB 4 KiB
464 28 KiB 16 KiB
480 28 KiB 8 KiB
496 28 KiB 20 KiB
512 32 KiB 32 KiB
1024 64 KiB 64 KiB
2048 132 KiB 128 KiB
* Note: before is before this change only, not before the set of changes
from this bug; before that, the run size for 96 could be 8 KiB in some
configurations.
In most cases, the overhead hasn't changed, with a few exceptions:
- Improvements:
size class before after
208 1.823% 0.977%
304 1.660% 1.042%
320 1.562% 1.042%
400 0.716% 0.391%
464 1.283% 0.879%
480 1.228% 0.391%
496 1.395% 0.703%
- Regressions:
352 0.312% 1.172%
416 0.130% 0.977%
2048 1.515% 1.562%
For the regressions, the values are either still well within the
constraint or very close to the previous value, that I don't feel like
it's worth trying to avoid them, with the risk of making things worse
for other size classes.
--HG--
extra : rebase_source : fdff18df8a0a35c24162313d4adb1a1c24fb6e82
On 64-bit platforms, sizeof(arena_run_t) includes a padding at the end
of the struct to align to 64-bit, since the last field, regs_mask, is
32-bit, and its offset can be a multiple of 64-bit depending on the
configuration. But we're doing size calculations for a dynamically-sized
regs_mask based on sizeof(arena_run_t), completely ignoring that
padding.
Instead, we use the offset of regs_mask as a base for the calculation.
Practically speaking, this doesn't change much with the current set of
values, but could affect the overheads when we squeeze run sizes more.
--HG--
extra : rebase_source : a3bdf10a507b81aa0b2b437031b884e18499dc8f
This makes the run header larger than necessary, which happens to make
the current arena_bin_run_calc_size pick 8KiB runs for size class 96
when MOZ_DIAGNOSTIC_ASSERT_ENABLED is set. This change makes it pick
4KiB runs, making MOZ_DIAGNOSTIC_ASSERT_ENABLED builds use the same set
of run sizes as non-MOZ_DIAGNOSTIC_ASSERT_ENABLED builds.
--HG--
extra : rebase_source : fd7ef2d58ec601186647799e9dcf8146e723241c
First and foremost, the code and corresponding comment weren't in
agreement on what's going on.
The code checks:
RUN_MAX_OVRHD * (bin->mSizeClass << 3) <= RUN_MAX_OVRHD_RELAX
which is equivalent to:
(bin->mSizeClass << 3) <= RUN_MAX_OVRHD_RELAX / RUN_MAX_OVRHD
replacing constants:
(bin->mSizeClass << 3) <= 0x1800 / 0x3d
The left hand side is just bin->mSizeClass * 8, and the right hand side
is about 100, so this can be roughly summarized as:
bin->mSizeClass <= 12
The comment says the overhead constraint is relaxed for runs with a
per-region overhead greater than RUN_MAX_OVRHD / (mSizeClass << (3+RUN_BFP)).
Which, on itself, doesn't make sense, because it translates to
61 / (mSizeClass * 32768), which, even for a size class of 1 would mean
less than 0.2%, and this value would be even smaller for bigger classes.
The comment would make more sense with RUN_MAX_OVRHD_RELAX, but would
still not match what the code was doing.
So we change how the relaxed rule works, as per the comment in the new
code, and make it happen after the standard run overhead constraint has
been checked.
--HG--
extra : rebase_source : cec35b5bfec416761fbfbcffdc2b39f0098af849
The description above the RUN_* constant definitions talks about binary
fixed point math, which is one way to look at the problem, but a clearer
one is to look at it as comparing ratios in a way that doesn't use
divisions.
So, starting from the current expression:
(try_reg0_offset << RUN_BFP) <= RUN_MAX_OVRHD * try_run_size
This can be rewritten as
try_reg0_offset * (1 << RUN_BFP) <= RUN_MAX_OVRHD * try_run_size
Dividing both sides with ((1 << RUN_BFP) * try_run_size), and
simplifying, gives us:
try_reg0_offset / try_run_size <= RUN_MAX_OVRHD / (1 << RUN_BFP)
Replacing the constants:
try_reg0_offset / try_run_size <= 0x3d / (1 << 12)
or
try_reg0_offset / try_run_size <= 61 / 4096
61 / 4096 is roughly 1.5%.
So what the check really intends to do is check that the overhead is
below 1.5%.
So we introduce a helper class and a user-defined literal that makes the
test more self-descriptive, while producing identical machine code.
This is a lot of code to add, but I think it's one of those cases where
abstraction can help make the code clearer.
--HG--
extra : rebase_source : 3d4a94f524a60e40ba75859c4f761f59d689e81a
This is, practically speaking, a no-op, and will hopefully help make the
following changes clearer.
--HG--
extra : rebase_source : b704bdf2ae46c2408e0061363822b9744ef449cb
QUANTUM_2POW_MIN is exactly 4, and we are unlikely to ever make it
smaller. Also turn a MOZ_ASSERT into a static_assert, because it only
uses constants, and will fail if QUANTUM_2POW_MIN is lowered without
touching size_invs.
--HG--
extra : rebase_source : 7c8ee3c0ea30a88bddba816c41c6f63914f7a03c
There is a set of "constants" that are actually globals that depend on
the page size that we get at runtime, when compiling without
MALLOC_STATIC_PAGESIZE, but that are actual constants when compiling
with it. Their value was unfortunately duplicated.
We setup a set of macros allowing to make the declarations unique.
--HG--
extra : rebase_source : 56557b7ba01ee60fe85f2cd3c2a0aa910c4c93c6
At the same time, add user-defined literals to make those constants more
legible.
--HG--
extra : rebase_source : ce143ad9d8a6603179042d8cf432f00c815156c5
At the moment, while they are used before their declaration, it's from a
macro. It is desirable to replace the macros with C++ constants, which
will require the structures being defined first.
--HG--
extra : rebase_source : 7a351dafea04a7d75b6eec50fa52fb49c135e569
We create a new helper class that rounds up allocations sizes and
categorizes them. Compilers are smart enough to elide what they don't
need, like in malloc_good_size, they elide the code related to the
class type enum.
--HG--
extra : rebase_source : 61381e600587b045e720a85a7b46673edeb691b9
Because of alignment issues due to the system glibc when running the
SSE2 gcov code generated during the PGO profile gen phase, Firefox
crashes when running the PGO profile. We work around the issue by
disabling SSE2 when building mozjemalloc during that phase. That
shouldn't affect the coverage data anyways, which is bound to the
original C++ code, and the profile-use code generation will still emit
SSE2 based on the coverage data if it needs to.
--HG--
extra : rebase_source : 3596fdc795cdef0789f3a2dd8f10b42cde00430f
We introduce the notion of private arenas, separate from other arenas
(main and thread-local). They are kept in a separate arena tree, and
arena lookups from moz_arena_* functions only access the tree of
private arenas. Iteration still goes through all arenas, private and
non-private.
--HG--
extra : rebase_source : 86c43c7c920b01eb6fa1fa214d612fd9220eac3e
We create the ArenaCollection class to handle operations on the
arena tree. Ideally, iter() would trigger locking, but the
prefork/postfork code complicates things, so we leave this for later.
--HG--
extra : rebase_source : bd7021098baf0ec01c14063294098edea4473d36
Note we use a local variable for fallible allocator because using plain
`new (fallible)` would require some figuring out for non-Firefox builds
(e.g. standalone js).
--HG--
extra : rebase_source : 2132f98ebc7e37a139b673f80631e672bcf8ed15
RedBlackTree::{Insert,Remove} allocate an object on the stack for its
RedBlackTreeNode, and that shouldn't have side effects if the type
happens to have a constructor. This will allow to add constructors to
some of the mozjemalloc types.
--HG--
extra : rebase_source : 14dbb7d73c86921701d83156186df5d645530dda
We introduce the notion of private arenas, separate from other arenas
(main and thread-local). They are kept in a separate arena tree, and
arena lookups from moz_arena_* functions only access the tree of
private arenas. Iteration still goes through all arenas, private and
non-private.
--HG--
extra : rebase_source : ec48631a4a65520892331c1fcd62db37ed35ba1d
We create the ArenaCollection class to handle operations on the
arena tree. Ideally, iter() would trigger locking, but the
prefork/postfork code complicates things, so we leave this for later.
--HG--
extra : rebase_source : 90c96575d65c920f75aa621ba119d354d1ce252a
Note we use the deprecated `new (fallible_t())` form because using `new
(fallible)` would require some figuring out for non-Firefox builds (e.g.
standalone js).
--HG--
extra : rebase_source : 0159d8476a1b5509330517c00af6c387d522722d
RedBlackTree::{Insert,Remove} allocate an object on the stack for its
RedBlackTreeNode, and that shouldn't have side effects if the type
happens to have a constructor. This will allow to add constructors to
some of the mozjemalloc types.
--HG--
extra : rebase_source : 14dbb7d73c86921701d83156186df5d645530dda
- In the cases where it's used on powers of 2, replace it with
FloorLog2() + 1.
- In the cases where it's used on any kind of number, replace it with
CountTrailingZeroes, which is `ffs(x) - 1`.
- In the case of tiny allocations in arena_t::MallocSmall, we rearrange
the code so that the intent is clearer, which also simplifies the
expression for the mBins offset: mBins[0] is the first tiny bucket,
for allocations of sizes 1 << TINY_MIN_2POW, mBins[1] for allocations
of size 1 << (TINY_MIN_2POW + 1), etc. up to small_min. So the offset
is really the log2 of the normalized size.
--HG--
extra : rebase_source : 954a655dcaa93857dc976078e133704bb141de0d
Comparing ffs(x) == ffs(y), when x and y are guaranteed to be powers of
2 (or 0, or 1), is the same as x == y.
--HG--
extra : rebase_source : d6cc3399d85fa9fda2559435e99adbfb82ac8da0
The sentinel was taking as much space as one element of the tree, while
only really used for its RedBlackTreeNode, wasting space.
This results in some decrease in struct sizes, for example on 64-bits
linux:
- arena_bin_t: 80 -> 56
- arena_t (excluding mBins): 224 -> 144
- arena_t + dynamic size of mBins: 3024 -> 2104
It also decreases the size of several globals:
- gChunksBySize, gChunksByAddress, huge: 64 -> 8
- gArenaTree: 312 -> 8
--HG--
extra : rebase_source : d5bb52f93e064ab4cca3fb07b2c5a77ce57fb7db
Interestingly, this turns single-instruction checks into
two-instructions checks (at least with GCC, from one cmpb to a movl
followed by a testl), but this is due to Atomic<bool> being actually
backed by a uint32_t, not by the use of atomics.
--HG--
extra : rebase_source : cfc0bec2113b44635120216b4abbbbbe9028b286
- First, MOZ_DIAGNOSTIC_ASSERT_ENABLED is always true when MOZ_DEBUG is
set, so don't check for MOZ_DEBUG.
- Second, all the magic number assertions should be
MOZ_DIAGNOSTIC_ASSERTs instead of MOZ_ASSERTs.
--HG--
extra : rebase_source : 5601cd13604e21c46a9f0ad8b0b4d6fc399b853e
Some need initialization to happen, some can be skipped when the
allocator was not initialized, and others should crash.
--HG--
extra : rebase_source : d6c2697ca27f6110fe52a067440a0583e0ed0ccd
Also rearrange some code accordingly, but don't fix indentation issues
just yet.
Also apply changes from the google-readability-braces-around-statements
check.
But don't apply the modernize-use-nullptr recommendation about
strerror_r because it's wrong (bug #1412214).
--HG--
extra : histedit_source : 2d61af7074fbdc5429902d9c095c69ea30261769
Backed out changeset f75f427fde1d
Backed out changeset 6278aa5fec1d
Backed out changeset eefc284bbf13
Backed out changeset e2b391ae4688
Backed out changeset 58070c2511c6
--HG--
extra : histedit_source : d14fa171a5cf4d9400cae7f94d5cc64a1e58b98d%2C856ad5b650074a1dcff2edb0b95adc20aaf38db3