TlsGetValue has a semantic difference with pthread_getspecific, in that it
can return a non-error NULL value, so it always sets the LastError.
But allocator callers may not be expecting calling e.g. free() to change
the value of the last error, so preserve it.
The new "num" property lets identical blocks be aggregated in the output. This
patch only uses the "num" property for dead blocks, because that's where the
greatest potential benefit lies, but it could be used for live blocks as well.
On one test case (a complex PDF file) running with --mode=cumulative
--sample-below=1 this patch had the following effects.
- Change in running speed was negligible.
- Compressed output file size dropped from 8.8 to 5.0 MB.
- Compressed output file size dropped from 297 to 50 MB.
- dmd.py runtime (without stack fixing) dropped from 30 to 8 seconds.
--HG--
extra : rebase_source : 46a32058cd5c31cd823fe3f1accb5e68bcd320f3
The DEFINES and XPCOM_API changes are needed to get rid of "inconsistent dll
linkage" warnings on Windows builds.
--HG--
extra : rebase_source : 00756f51ebee85c70f65d51dbac17b4835262697
The infinite loop happens if chunk_alloc_arena needs to be called when a0malloc
is called. It in turn calls chunk_alloc_default, which uses tsd, which calls
a0malloc if it's the first time the tsd is being gotten from the current thread.
tsd only uses a0malloc on platforms where there is no native thread local storage
support, which, for Mozilla, essentially means anything that is not Linux.
But the tsd is only neededto get the dss precedence setting of the given arena.
That setting has no effect when dss is disabled, which it is on Windows and Mac.
Moreover, the default setting for dss precedence is "secondary", which means
jemalloc only tries dss after it failed to get memory with mmap/VirtualAlloc.
Considering the cases where mmap/VirtualAlloc would fail essentially means
there is shortage of address space, sbrk() is not going to have much more
success, so we might as well disable dss support on all platforms, avoiding
the infinite loop problem on Android and B2G as well.
This reduces the runtime on my Linux machine for one large DMD output file from
235 seconds to 105 seconds.
--HG--
extra : rebase_source : bd567319c1207a09fa47b94571066109de1cc9e7
Now that defining $DMD is no longer necessary to run DMD, this patch does the
following.
- Removes all the places where we set DMD=1 (test harnesses, etc.)
- Still handles DMD=1, for backwards compatibility.
- Prints "$DMD is undefined" at DMD start-up if appropriate.
- Writes a |null| value for |dmdEnvVar| in the JSON if $DMD is undefined. Bumps
the DMD output version number accordingly.
- Changes a bunch of the test files accordingly, including changing the mode of
script-ignore-alloc-fns.json in order to test a case where $DMD is undefined.
--HG--
extra : rebase_source : eb1ef5722410734ce6d7658465ff6f442ee4ed49
The reason for --disable-optimize is to make debugging easier, but not many
people actually need a high level of debuggability of the allocator itself.
This works around the issue that the Android NDK's definition of ffs is
broken when compiling without optimization, while avoiding to add yet another
configure test.
This patch moves profiling mode selection from post-processing (in dmd.py) to
DMD start-up. This will make it easier to add new kinds of profiling, such as
cumulative heap profiling.
Specifically:
- There's a new --mode option. |LiveWithReports| is the default, as it is
currently.
- dmd.py's --ignore-reports option is gone.
- There's a new |mode| field in the JSON output.
- Reports-related operations are now no-ops if DMD isn't in LiveWithReports
mode.
- Diffs are only allowed for output files that have the same mode.
- A new function ResetEverything() replaces the SetSampleBelowSize() and
ClearBlocks(), which were used by the test to change DMD options.
- The tests in SmokeDMD.cpp are split up so they can be run multiple times, in
different modes. The exact combinations of tests and modes has been changed a
bit.
--HG--
rename : memory/replace/dmd/test/full-reports-empty-expected.txt => memory/replace/dmd/test/full-empty-dark-matter-expected.txt
rename : memory/replace/dmd/test/full-heap-empty-expected.txt => memory/replace/dmd/test/full-empty-live-expected.txt
rename : memory/replace/dmd/test/full-heap-sampled-expected.txt => memory/replace/dmd/test/full-sampled-live-expected.txt
rename : memory/replace/dmd/test/full-reports-unsampled1-expected.txt => memory/replace/dmd/test/full-unsampled1-dark-matter-expected.txt
rename : memory/replace/dmd/test/full-heap-unsampled1-expected.txt => memory/replace/dmd/test/full-unsampled1-live-expected.txt
rename : memory/replace/dmd/test/full-reports-unsampled2-expected.txt => memory/replace/dmd/test/full-unsampled2-dark-matter-expected.txt
rename : memory/replace/dmd/test/script-diff-basic-expected.txt => memory/replace/dmd/test/script-diff-live-expected.txt
rename : memory/replace/dmd/test/script-diff1.json => memory/replace/dmd/test/script-diff-live1.json
rename : memory/replace/dmd/test/script-diff2.json => memory/replace/dmd/test/script-diff-live2.json
extra : rebase_source : bf32cc4e0d82aa1a20ceb55e8ea259850b49cc06
Because DMD is no longer just about measuring memory reports coverage, but is
also used for more general heap profiling.
--HG--
extra : rebase_source : 82b4579de240037f96cf6618b15870925adc431b
Bug 818922 made MOZ_REPLACE_MALLOC a global define, and that changed the
defines set when building mozjemalloc_compat.c for the jemalloc3
replace-malloc library. In turn, this made the symbol munging wrong for
that library, making the jemalloc_* functions exported as je_jemalloc_*
instead of replace_jemalloc_*.
This is to give better contrast with |DeadBlock|, which will be added in the
next patch.
--HG--
extra : rebase_source : cbc767fcc5667cfed108ca7c4ebf1d7e82aa185e
This patch:
- Uses |auto| in Range loops, so more of them fit on a single line.
- Converts one use of HashSet::Enum (which is only needed if you're modifying
the HashSet as you iterate) to HashSet::Range.
--HG--
extra : rebase_source : 09b011cd69218c06984f06420d375839cd4e9214
This also effectively changes how DMD is enabled from requiring both
replace-malloc initialization and the DMD environment variable to
requiring only the former. The DMD environment variable can still be
used to specify options, but not to disable entirely.
This however doesn't touch all the parts that do enable DMD by setting
the DMD environment variable to 1, so the code to handle this value
is kept.
The interesting feature JSONWriteFunc has, contrary to JSONWriter, is that it
only has virtual methods, which makes it a better candidate to be passed
around between libraries not linked against each other.
This will allow to make dmd and libxul independent from each other.
There are, sadly, many combinations of linkage in use throughout the tree.
The main differentiator, though, is between program/libraries related to
Gecko or not. Kind of. Some need mozglue, some don't. Some need dependent
linkage, some standalone.
Anyways, these new templates remove the need to manually define the
right dependencies against xpcomglue, nspr, mozalloc and mozglue
in most cases.
Places that build programs and were resetting MOZ_GLUE_PROGRAM_LDFLAGS
or that build libraries and were resetting MOZ_GLUE_LDFLAGS can now
just not use those Gecko-specific templates.
It appears to be an unnecessary optimization, since the compiler is still inlining
the functions when they're not marked static. OTOH, following patches will require
the _impl functions not to be static.
Since essentially everything is linked to libmozglue and libmozglue takes
precedence in symbol resolution in our dynamic linker, there is no need
to wrap most symbols. PR_GetEnv/PR_SetEnv still needs wrapping because
there's no other way to actually wrap the calls from NSPR itself and NSS,
as well as the symbols wrapped because our dynamic linker can't find them
in system libraries on some devices because they're weak.