MozStackWalk() is different on Windows to the other platforms. It has two extra
arguments, which can be used to walk the stack of a different thread.
This patch makes those differences clearer. Instead of having a single function
and forbidding those two arguments on non-Windows, it removes those arguments
from MozStackWalk, and splits off MozStackWalkThread() which retains them. This
also allows those arguments to have more appropriate types (HANDLE instead of
uintptr_t; CONTEXT* instead of void*) and names (aContext instead of
aPlatformData).
The patch also removes unnecessary reinterpret_casts for the aClosure argument
at a couple of MozStackWalk() callsites.
--HG--
extra : rebase_source : 111ab7d6426d7be921facc2264f6db86c501d127
When set to true, the resulting profile will have a non-null meta.shutdownTime
field which is set to current time.
Non-shutdown profiles also get that field, but it's null for them.
MozReview-Commit-ID: 1vpmhBR8rC6
--HG--
extra : rebase_source : b026088053c30acd287f0dc3afa7ddf14093ec27
On Win32, stack frames are now always present, so we can always use
FramePointerStackWalk(). On Win64, stack frames are never present, so we always
use MozStackWalk().
In both cases, we can get stack traces no matter the value of MOZ_PROFILING. So
this patch removes MOZ_PROFILING from the relevant conditions. It also
restructures the conditions and adds some helpful comments.
--HG--
extra : rebase_source : c76aee00432b875ae0c81f8e61f56cd4112bffde
This also fixes the bug where we would always profile child processes if the
parent process had been launched with MOZ_PROFILER_STARTUP=1, regardless of
whether the profiler was still running in the parent process.
MozReview-Commit-ID: LkIpYmKJOJ1
--HG--
extra : rebase_source : 49b38bc58ded91ecc2e2fce08bcb4f2d20a13b92
This is what prenv.h suggests:
When manipulating the environment there is no way to un-set
an environment variable across all platforms. We suggest
you interpret the return of a pointer to null-string to
mean the same as a return of NULL from PR_GetEnv().
I interpret "null-string" to mean "empty string".
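As a rough illustration of that reading (a sketch only, not the code in the
patch; IsSetAndNonEmpty() and EnvVarIsSet() are hypothetical names, and
std::getenv() stands in for PR_GetEnv()):

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper: a null pointer and an empty string both count as
// "not set", following the suggestion quoted from prenv.h.
static bool IsSetAndNonEmpty(const char* aValue) {
  return aValue && aValue[0] != '\0';
}

// Hypothetical wrapper. std::getenv() is used here only as a stand-in for
// PR_GetEnv(); both return a pointer to the value, or null if it is absent.
static bool EnvVarIsSet(const char* aName) {
  return IsSetAndNonEmpty(std::getenv(aName));
}
```

With this, a variable that was "un-set" by assigning it the empty string is
treated exactly like one that was never set.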
MozReview-Commit-ID: 2mfVD1zULXL
--HG--
extra : rebase_source : 07ec16c002f5c6d1ed0003fa05985f4155f85dfc
If set, MOZ_PROFILER_STARTUP_FEATURES_BITFIELD overrides the value set by
MOZ_PROFILER_STARTUP_FEATURES.
This means that we won't need to go through an intermediate string
representation when propagating profiler settings to a child process through
environment variables.
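A minimal sketch of the override rule, under stated assumptions: the function
name StartupFeatures() is hypothetical, and encoding the bitfield as a decimal
number is an assumption made for the example, not something this message
specifies.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// aBitfieldValue: the value of the *_FEATURES_BITFIELD variable, or null.
// aFeaturesFromString: the features already parsed from the string variable.
// If the bitfield variable is set (and non-empty), it wins.
static uint32_t StartupFeatures(const char* aBitfieldValue,
                                uint32_t aFeaturesFromString) {
  if (aBitfieldValue && aBitfieldValue[0] != '\0') {
    // Assumption: the bitfield is passed as a decimal number.
    return static_cast<uint32_t>(std::strtoul(aBitfieldValue, nullptr, 10));
  }
  return aFeaturesFromString;
}
```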
MozReview-Commit-ID: 49eTVMI21GJ
--HG--
extra : rebase_source : 084040e7816929a8b63b7b087d7202180be4d4d5
This allows code outside the profiler to get fully interleaved stack traces
containing frames from the pseudo-stack, native stack, and JS stack.
--HG--
extra : rebase_source : e21b64e86ffec83a0052947afad1793f3fd62d00
This patch changes ProfileBuffer arguments from pointers to references. For
functions that modify the ProfileBuffer, it also moves the argument to the end.
--HG--
extra : rebase_source : 394dd3effc852447c703c0f5802c092ae96e2eaa
When a sample with a label and a dynamic string is written to the
ProfileBuffer, the profiler currently joins them together (up to a max length
of 512, omitting any that exceed this) and then writes a CodeLocation entry
with an empty string followed by a sequence of EmbeddedString entries. When
parsing those entries, we allow a length up to 8192, but that limit is never
reached due to the prior limit of 512.
This patch makes the following changes.
- Removes the joining at write time. Labels and dynamic strings are now written
separately into the ProfileBuffer. The 512 limit still applies, but just for
dynamic strings; dynamic strings longer than that are replaced with "(too
long)". (Labels also always take up one entry, because they only require a
single pointer, because they are always static strings.) The joining is
now done when the ProfileBuffer is parsed, and the max length for the joined
string is still 512; any strings exceeding 512 at that point are truncated,
rather than omitted. (This also happens to be outside the profiler's critical
section.)
- Renames CodeLocation as Label and EmbeddedString as DynamicStringFragment.
This makes the ProfileBuffer entry names better match the names used in
GeckoProfiler.h.
- Moves AddDynamicCodeLocation(), now called addDynamicStringEntry(), into
ProfileBuffer.
- Adds some testing of long and overly-long dynamic strings to the GTest.
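The write-time and parse-time behaviour described above can be sketched roughly
as follows. This is an illustration, not the patch's code: the helper names are
hypothetical, the exact boundary of the 512 limit is an assumption, and so is
the single space used as a separator.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

constexpr std::size_t kMaxLength = 512;  // the limit named in the text

// Write time (hypothetical helper): an over-long dynamic string is replaced
// by a placeholder rather than being written to the buffer.
static std::string DynamicStringForBuffer(const std::string& aDynamic) {
  return aDynamic.size() > kMaxLength ? std::string("(too long)") : aDynamic;
}

// Parse time (hypothetical helper): the label and the dynamic string are
// joined, and the joined result is truncated, not omitted.
static std::string JoinForOutput(const std::string& aLabel,
                                 const std::string& aDynamic) {
  std::string joined = aLabel;
  if (!aLabel.empty() && !aDynamic.empty()) {
    joined += ' ';  // assumed separator
  }
  joined += aDynamic;
  if (joined.size() > kMaxLength) {
    joined.resize(kMaxLength);
  }
  return joined;
}
```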
--HG--
extra : rebase_source : 38bdf6e84fa19576c9e0291249e84b19dbb421f7
Currently LUL is a member of CorePS, meaning that it is guarded by the PSMutex.
This mutex is grabbed by the main thread at random points during the execution
of the program. This is unfortunate, as initializing LUL can take a long
time (>1s on my local machine), and we definitely don't want to be blocking the
main thread waiting for it.
In addition, in the BHR case, we used to be grabbing LUL when we got our first
hang, while both the PSMutex and the BHR monitor were being held. This meant
that the main thread could make no progress during LUL initialization, as the BHR
monitor is grabbed by the main thread on every spin of the event loop.
This patch moves that initialization to be behind a completely separate lock,
and makes BHR initialize it on the background thread before acquiring the BHR
lock, meaning that no locks other than the one guarding LUL should be held
during its initialization.
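The locking pattern, sketched with standard C++ primitives. All names here are
hypothetical stand-ins (SlowUnwinder for LUL, gMonitor for the BHR monitor);
the real code uses Gecko's own mutex types.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

struct SlowUnwinder {  // hypothetical stand-in for LUL
  int mLoaded = 1;
};

static std::mutex gUnwinderMutex;  // guards gUnwinder and nothing else
static std::unique_ptr<SlowUnwinder> gUnwinder;
static std::mutex gMonitor;  // stand-in for the lock the main thread needs

// Lazily creates the unwinder. Only gUnwinderMutex is held during the
// (potentially slow) construction.
static SlowUnwinder* EnsureUnwinder() {
  std::lock_guard<std::mutex> lock(gUnwinderMutex);
  if (!gUnwinder) {
    gUnwinder.reset(new SlowUnwinder());
  }
  return gUnwinder.get();
}

// Background-thread step: initialize first, then take the contended lock.
static int BackgroundThreadStep() {
  SlowUnwinder* unwinder = EnsureUnwinder();  // gMonitor is not held here
  std::lock_guard<std::mutex> lock(gMonitor);
  return unwinder->mLoaded;
}
```

The ordering in BackgroundThreadStep() is the point: the slow work finishes
before the lock that the main thread contends for is acquired.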
MozReview-Commit-ID: GwNYQaEAqJ1
The profiler writes ProfileBuffer entries in a particular order, and then later
has to parse them, mostly in StreamSamplesToJSON(). That function's parsing
code is poorly structured and rather gross, at least partly because no explicit
grammar is identified.
This patch identifies the grammar in a comment, and in the same comment also
includes some examples of the more complicated subsequences. Once written down,
the grammar is obviously suboptimal -- the |Sample| entries serve no useful
purpose, for example -- but I will leave grammar improvements as follow-ups.
The patch also rewrites the parser in a more typical fashion that obviously
matches the grammar. The new parser is slightly more verbose but far easier to
understand.
--HG--
extra : rebase_source : 762c21a68cdc18ff25b5feda3c5dfcf33afa53be
The patch also changes ProfileBuffer::processEmbeddedString() to take the
readAheadPos, instead of recomputing it.
--HG--
extra : rebase_source : 62bacb4c7cc61f43d78ada342af0a813c307b96a
The double variant is always 8 bytes, so the chars variant can be too. As well
as reducing memory usage on 32-bit platforms, this patch makes the code
clearer.
--HG--
extra : rebase_source : 8f3dd0a1e35c18ac812fa5db7c3f6e4626447c4c
- It's common for unions to be named |u|, because this makes it obvious that
it's a union when you access it, which is good. This patch introduces that
for the union in ProfilerBufferEntry. (This required moving the union setting in
each constructor from the initializer list to the constructor body.)
- Each union variant had the prefix "mTag". But that's a bad name, because
|mKind| is actually the tag. So this patch removes the "Tag".
- |mTagData| was a poor name for the |const char*| variant, so this patch
renames it |mString|.
- The patch moves |mKind| before |u|, because that's the normal way that tagged
unions are done.
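A simplified sketch of the resulting shape. The Entry type and its Kind values
are hypothetical and much smaller than the real ProfilerBufferEntry; the sketch
only shows the tag placed before a union named |u|, with the union set in the
constructor bodies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct Entry {
  enum class Kind { String, Chars, Double };

  explicit Entry(const char* aString) : mKind(Kind::String) {
    u.mString = aString;  // set in the body, not the initializer list
  }
  explicit Entry(double aDouble) : mKind(Kind::Double) {
    u.mDouble = aDouble;
  }
  Entry(const char* aChars, std::size_t aLength) : mKind(Kind::Chars) {
    // Copy at most 8 bytes; zero-fill the rest.
    std::memset(u.mChars, 0, sizeof(u.mChars));
    std::memcpy(u.mChars, aChars,
                aLength < sizeof(u.mChars) ? aLength : sizeof(u.mChars));
  }

  Kind mKind;  // the tag comes first
  union {
    const char* mString;
    char mChars[8];  // same size as the double variant
    double mDouble;
  } u;
};
```

Accesses then read as e.u.mDouble, which makes it obvious at the use site that
a union member is being touched.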
--HG--
extra : rebase_source : 563cbcf6414fa3c45abcdd5eafd99965bb842de5
If the marker pointer is null, the uses of it will immediately crash, so asserting
non-nullness doesn't add much. And removing the getter makes it more similar to
the other union variants.
--HG--
extra : rebase_source : a1066ef98ac5d2dae5303b465106b844937cfb73
This requires:
- Moving the constructors of ProfilerMarkerPayload and its subclasses into the
.h file so they are visible even when ProfilerMarkerPayload.cpp isn't
compiled.
- Similarly, using a macro to make StreamPayload() a crashing no-op when the
profiler isn't enabled. (It is never called in that case.)
--HG--
extra : rebase_source : 7aad2fdb1bd4e49782024dba6664e8f992771520
No point having all these explicit empty destructors.
Also, we can avoid IOMarkerPayload's constructor by using a UniqueFreePtr.
--HG--
extra : rebase_source : 0a2a5aecb66a2990c9188354c861f67633ed2fee
PROFILER_MARKER is now just a trivial wrapper for profiler_add_marker(). This
patch removes it.
--HG--
extra : rebase_source : 9858f34763bb343757896a91ab7ad8bd8e56b076
This patch reduces the differences between builds where the profiler is enabled
and those where the profiler is disabled. It does this by removing numerous
MOZ_GECKO_PROFILER checks.
These changes have the following consequences.
- Various functions and classes are now defined in all builds, and so can be
used unconditionally: profiler_add_marker(), profiler_set_js_context(),
profiler_clear_js_context(), profiler_get_pseudo_stack(), AutoProfilerLabel.
(They are effectively no-ops in non-profiler builds, of course.)
- The no-op versions of PROFILER_* are now gone. The remaining versions are
almost no-ops when the profiler isn't built.
--HG--
extra : rebase_source : 8fb5e8757600210c2f77865694d25162f0b7698a
It's a wafer thin wrapper around profiler_tracing() and it's only used three
times. Let's just remove it.
Note also that those three uses are the only places where TRACING_EVENT is
used. I wonder if they're really needed...
--HG--
extra : rebase_source : ac70b4c77c4592d96957a8e6249597eafc822fd4
This removes LUL's ability to recover frames by the heuristic mechanism of
stack scanning. Stack scanning is a last-ditch way to try to recover the
unwind when all other methods (metadata-based, frame-pointer chasing) have
failed, by scanning back up the stack and looking for the first word that
could plausibly be a return address. It often mis-identifies return addresses
because it has no way to distinguish live ones from dead ones that have not
been overwritten, and very often causes the unwind to fail as a result.
In any case LUL's stack scanning ability has actually been switched off (by
the parameters passed to LUL::Unwind) for some considerable time now, so this
change should make no observable difference to behaviour. Specific changes:
In LUL::Unwind():
* Removes formal parameters |scannedFramesAcquired| and |scannedFramesAllowed|
* Removes code that does stack scanning
* Simplifies control flow in the main unwind loop, so that loop now
has the easier-to-follow structure
    while (true) {
      // preliminary stuff
      if (CFI data available for current PC) {
        do CFI step;
        continue;
      }
      if (FP chasing possible for current PC) {
        do FP step;
        continue;
      }
      // give up
      break;
    }
* Moves two #ifdefs upwards to enclose the comments pertaining to them, as
well as the code. This makes the top level structure easier to follow. The
corresponding #endifs are likewise commented with the condition.
From class LULStats, removes |mScanned|.
Removes PriMap::MaybeIsReturnPoint() entirely. This is a heuristic helper
only used by stack scanning.
In all, 395 lines of code are removed, according to hg diff --stat.
--HG--
extra : rebase_source : 5ffa73c64923149a58df3228cf940cb539f8f707
We rely partly on the constructor to set Registers, and partly on subsequent
assignments. This patch changes things so we rely entirely on subsequent
assignments, for consistency.
--HG--
extra : rebase_source : ca69186f5755003e710985bfb40c90072067305c
We can use a static ucontext_t on Linux for synchronous samples instead of
declaring one on the stack in profiler_get_backtrace(). This neatens
SyncPopulate()'s signature.
SyncPopulate() is also only used when HAVE_NATIVE_UNWIND is defined, so the
patch guards the definitions.
--HG--
extra : rebase_source : b71e6d76f24b37bc236ac8f4359d401b1551e2de
This patch does the following.
- Renames some ProfilerMarkerPayload subclasses so they are all of the form
"FooMarkerPayload", to make the subclass relationship clearer.
(ProfilerMarkerTracing -- now TracingMarkerPayload -- was the worst
offender.)
- Removes ProfilerMarkerImagePayload and TouchDataPayload, neither of which is
used.
- Changes streamCommonProps() to StreamCommonProps().
- Does some minor style and comment fixes in ProfilerMarkerPayload.h.
--HG--
extra : rebase_source : dd732905e96da83bcbf124c70b20011c661fc332
Once the |aPayload| argument to profiler_add_marker() became a UniquePtr, the
default value of nullptr caused compilation difficulties that could only be
fixed by #including ProfilerMarkerPayload.h into lots of additional places
(because the UniquePtr<T> instantiation required the T to be fully defined). To
get around this I just split profiler_add_marker() into two functions, one with
1 argument and one with 2 arguments.
The patch also removes the definition of PROFILER_MARKER_PAYLOAD in the case
where MOZ_GECKO_PROFILER isn't defined. A comment explains why.
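The split can be sketched with std::unique_ptr in place of UniquePtr. AddMarker
and Payload are hypothetical names for illustration; the real functions have
different signatures.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// "Header" side: a forward declaration is enough to declare both overloads,
// so callers of the 1-argument form never need the full Payload definition.
struct Payload;
void AddMarker(const char* aName);
void AddMarker(const char* aName, std::unique_ptr<Payload> aPayload);

// "Implementation" side: the complete type is only needed here and at the
// call sites that actually construct a payload.
struct Payload {
  std::string mText;
};

static std::vector<std::string> gLog;  // records markers for the example

void AddMarker(const char* aName) { gLog.push_back(aName); }

void AddMarker(const char* aName, std::unique_ptr<Payload> aPayload) {
  gLog.push_back(std::string(aName) + ":" +
                 (aPayload ? aPayload->mText : std::string("(none)")));
}
```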
Bug 1357829 added a third kind of sample, in addition to the existing
"periodic" and "synchronous" samples. This patch cleans things up around that
change. In particular, it cleans up TickSample, which is a mess of semi-related
things.
The patch does the following.
- It removes everything from TickSample except the register values and renames
TickSample as Registers. Almost all the removed stuff is available in
ThreadInfo anyway, and the patch adds a ThreadInfo argument to various
functions. (Doing it this way wasn't possible until recently because a
ThreadInfo wasn't available in profiler_get_backtrace() until recently.)
One non-obvious consequence: in synchronous samples we used to use a value of
0 for the stackTop. Because synchronous samples now use ThreadInfo directly,
they are able to use the proper stack top value from ThreadInfo::mStackTop.
This will presumably only improve the quality of the stack traces.
- It splits Tick() in two and renames the halves DoPeriodicSample() and
DoSyncSample().
- It reorders arguments in some functions so that ProfileBuffer (the output) is
always last, and inputs are passed in roughly the order they are obtained.
- It adds a comment at the top of platform.cpp explaining the three kinds of
sample.
- It renames a couple of other things.
--HG--
extra : rebase_source : 4f1e69c605102354dd56ef7af5ebade201e1d106
This patch performs a refactoring to the internals of the profiler in order to
expose a function, profiler_suspend_and_sample_thread, which can be called from a
background thread to suspend, sample the native stack, and then resume the
target passed-in thread.
The interface was designed to expose as few internals of the profiler as
possible, exposing only a single callback which accepts the list of program
counters and stack pointers collected during the backtrace.
A method `profiler_current_thread_id` was also added to get the thread_id of the
current thread, which can then be passed by another thread into
profiler_suspend_and_sample_thread to sample the stack of that thread.
This is implemented in two parts:
1) Splitting SamplerThread into two classes: Sampler, and SamplerThread.
Sampler was created to extract the core logic from SamplerThread which manages
unix signals on android and linux, as well as suspends the target thread on all
platforms. SamplerThread was then modified to subclass this type, adding the
extra methods and fields required for the creation and management of the actual
Sampler Thread.
Some work was done to ensure that the methods on Sampler would not require
ActivePS to be present, as we intend to sample threads when the profiler is not
active for the Background Hang Reporter.
2) Moving the Tick() logic into the TickController interface.
A TickController interface was added to platform which has 2 methods: Tick and
Backtrace. The Tick method replaces the previous Tick() static method, allowing
it to be overridden by a different consumer of SuspendAndSampleAndResumeThread,
while the Backtrace() method replaces the previous MergeStacksIntoProfile
method, allowing it to be overridden by different consumers of
DoNativeBacktrace.
This interface object is then used to wrap implementation specific data, such as
the ProfilerBuffer, and is threaded through the SuspendAndSampleAndResumeThread
and DoNativeBacktrace methods.
This change added 2 virtual calls to the SamplerThread's critical section, which
I believe should be a small enough overhead that it will not affect profiling
performance. These virtual calls could be avoided using templating, but I
decided that doing so would be unnecessary.
MozReview-Commit-ID: AT48xb2asgV
Profiler labels can't currently be used in mozglue, because the profiler's code
is in libxul, and mozglue cannot depend on libxul.
This patch addresses this by basically duplicating AutoProfilerLabel in
mozglue. libxul passes two callback functions to mozglue to do the actual
pushing/popping of labels.
It's an annoying amount of machinery, but it is unavoidable if we want to use
profiler labels within mozglue.
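A stripped-down sketch of the callback arrangement. Every name below is
hypothetical; the two halves stand in for mozglue and libxul and live in one
file only so the example is self-contained.

```cpp
#include <cassert>
#include <vector>

// ---- "glue" side: has no dependency on the profiler ----
using LabelEnterFn = void (*)(const char* aLabel);
using LabelExitFn = void (*)();

static LabelEnterFn sEnter = nullptr;
static LabelExitFn sExit = nullptr;

// Called once by the higher-level library to install its push/pop functions.
void RegisterLabelCallbacks(LabelEnterFn aEnter, LabelExitFn aExit) {
  sEnter = aEnter;
  sExit = aExit;
}

// RAII label usable from the "glue" side. It is a no-op until the callbacks
// have been registered.
class GlueAutoLabel {
 public:
  explicit GlueAutoLabel(const char* aLabel) : mPushed(sEnter != nullptr) {
    if (mPushed) sEnter(aLabel);
  }
  ~GlueAutoLabel() {
    if (mPushed && sExit) sExit();
  }
  GlueAutoLabel(const GlueAutoLabel&) = delete;
  GlueAutoLabel& operator=(const GlueAutoLabel&) = delete;

 private:
  bool mPushed;  // pop only if this object actually pushed
};

// ---- "profiler" side: does the real pushing and popping ----
static std::vector<const char*> gLabelStack;
static void PushLabel(const char* aLabel) { gLabelStack.push_back(aLabel); }
static void PopLabel() { gLabelStack.pop_back(); }
```

Recording whether the constructor pushed anything keeps pushes and pops
balanced even if registration happens while a label is alive.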
--HG--
extra : rebase_source : 4bcb6fb0f050bba42c23d92d01f9c56611f8518f
This patch does the following renamings, which increase consistency.
- GeckoProfilerInitRAII -> AutoProfilerInit
- GeckoProfilerThread{Sleep,Wake}RAII -> AutoProfilerThread{Sleep,Wake}
- GeckoProfilerTracingRAII -> AutoProfilerTracing
- AutoProfilerRegister -> AutoProfilerRegisterThread
- ProfilerStackFrameRAII -> AutoProfilerLabel
- nsJSUtils::mProfilerRAII -> nsJSUtils::mAutoProfilerLabel
Plus a few other minor ones (e.g. local variables).
The patch also adds MOZ_GUARD_OBJECT macros to all the profiler RAII classes
that lack them, and does some minor whitespace reformatting.
--HG--
extra : rebase_source : 47e298fdd6f6b4af70e3357ec0b7b0580c0d0f50
This patch puts the arrays inside NativeStack, gives NativeStack a constructor,
and renames its fields using "mFoo" form.
The patch also moves MAX_NATIVE_FRAMES from the LUL-only code and applies it
globally. This increases the max native frame count from 256 to 1024 on the
LUL platforms, which makes things consistent with other platforms.
--HG--
extra : rebase_source : 0854ab37d101e090ea05cba7cb6d296450f8dfdc
This increases naming consistency. The remaining aStartTime parameters within
the profiler refer to a different start time than the process start time.
--HG--
extra : rebase_source : 0a07c54288f31af5a15518180b00fe59b587f784
ProfilerStackFrameRAII and ProfilerStackFrameDynamicRAII are very similar; the
latter lets a dynamic string be specified as well (and lacks the
MOZ_GUARD_OBJECT stuff, for no good reason).
This patch does the following.
- Removes ProfilerStackFrameDynamicRAII, and adds a dynamic string to
ProfilerStackFrameRAII. It also reorders the constructor's arguments to match
the field ordering of ProfileEntry. There aren't many usage sites so these
changes don't affect many places.
- With that done, there is only a single callsite for each of
profiler_call_enter() and profiler_call_exit(), so the patch also inlines and
removes them.
These annotations aren't doing anything useful. The important thing with
the PseudoStack is that, during pushes, the stack pointer incrementing happens
after the new entry is written, and this is ensured by the stack pointer being
Atomic.
The patch also improves the comments on PseudoStack.
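The ordering requirement can be illustrated with std::atomic. This is a sketch
under assumptions, not the real PseudoStack: the class and member names are
hypothetical, the capacity is tiny, and counting pushes beyond the capacity
without storing them is a choice made for the example. It also assumes a
reader only inspects entries while the owning thread is suspended.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>

constexpr uint32_t kMaxEntries = 4;  // tiny capacity, chosen for the sketch

struct Entry {
  const char* mLabel = nullptr;
};

class MiniPseudoStack {
 public:
  // Called only by the thread that owns the stack.
  void Push(const char* aLabel) {
    uint32_t sp = mStackPointer.load(std::memory_order_relaxed);
    if (sp < kMaxEntries) {
      mEntries[sp].mLabel = aLabel;  // 1. write the new entry
    }
    // 2. Only then bump the stack pointer. The release store keeps the entry
    // write above from being reordered after the increment.
    mStackPointer.store(sp + 1, std::memory_order_release);
  }

  void Pop() { mStackPointer.fetch_sub(1, std::memory_order_release); }

  // Number of entries that are valid to read, clamped to the capacity.
  uint32_t Depth() const {
    uint32_t sp = mStackPointer.load(std::memory_order_acquire);
    return sp < kMaxEntries ? sp : kMaxEntries;
  }

  const char* LabelAt(uint32_t aIndex) const { return mEntries[aIndex].mLabel; }

 private:
  Entry mEntries[kMaxEntries];
  std::atomic<uint32_t> mStackPointer{0};
};
```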
--HG--
extra : rebase_source : 100f8a5e4b750c15fac66175550c4c284a141f16
There are three flags in ProfileEntry::Flags, which suggests there are 2**3 = 8
combinations. But there are only actually 4 valid combinations.
This patch converts the three flags to a single "kind" enum, which makes things
clearer. Note also that the patch moves the condition at the start of
AddPseudoEntry() to its callsite, for consistency with the earlier JS_OSR entry
kind check.
--HG--
extra : rebase_source : 0950769ee1530291860ef3be47d240df5939e871
The category handling code at the end of AddPseudoEntry has two problems.
- The assertion checks |category| for the IS_CPP_ENTRY flag. This represents a
confusion between the entry's |category| and its |flags|. They're both stored
in a single uint32_t, but are conceptually different types. So the assertion
is vacuously satisfied.
Furthermore, there's no clear way to fix the assertion -- it doesn't make
sense to check the entry's flags for IS_CPP_ENTRY, because this code can
clearly take C++ or JS entries. So the patch just removes the assertion.
- The category is compared to zero. This also doesn't make sense, because zero
isn't a valid category. The patch removes this comparison.
--HG--
extra : rebase_source : 16a7248ffb90ccd90e6102d597fb5bdff706312e
I think all of these assertions are now unnecessary.
MozReview-Commit-ID: 9EI195QsizN
--HG--
extra : rebase_source : 4f03ef02ba6680ee6cad22d5d3f347db7d70aa9c
extra : intermediate-source : 2b8d50fcb20fc0bd808707aae00d6cbcb4536bac
extra : source : 327c145ded03d39970351a9cc01492f0541d9149
- The profiler gives the JS engine a reference to the pseudo-stack via
SetContextProfilingStack(). That function now takes a |PseudoStack*| instead
of a |ProfileEntry*| and pointer to the stack pointer.
- PseudoStack now has a |kMaxEntries| field, which is easier to work with than
|mozilla::ArrayLength(entries)|.
- AddressOfStackPointer() is no longer needed.
- The patch also neatens up the push operations significantly. PseudoStack now
has pushCppFrame() and pushJsFrame(), which nicely encapsulate the two main
cases. These delegate to the updated initCppFrame() and initJsFrame()
functions in ProfileEntry.
- Renames max_stck in testProfileStrings.cpp as peakStackPointer, which is a
clearer name.
- Removes a couple of checks from testProfileStrings.cpp. These checks made
sense when the pseudo-stack was accessed via raw manipulation, but are not
applicable now because we can't artificially limit the maximum stack size so
easily.
This includes renaming its fields to match SpiderMonkey naming conventions
instead of Gecko naming conventions.
This patch is just about moving the code. The next patch will change
SpiderMonkey to actually use PseudoStack directly.
--HG--
extra : rebase_source : 27e77ddf950201eb6bdba60003218056442cf7ab
If NotifyObservers is called on a background thread, dispatch a runnable to
the main thread which will send the observer notification. This can result in
awkward orders of the notifications if the profiler functions are called in
quick succession on both the main thread and another thread, but I'm not sure
what to do about that.
MozReview-Commit-ID: GlkVwGTa2b4
--HG--
extra : rebase_source : 628ae91c3adef85c7fb07441b6573eda55e7ba48
extra : intermediate-source : 8c96ac2d762fbed161b6e577f845c1b58ec8bc17
extra : source : b8c27d6137a43ecd031f0b58c62752b7a9cefacb
ProfileEntry has |string|, which can be static or dynamic, and |dynamicString|.
If |string| is dynamic, the FRAME_LABEL_COPY flag must be set, and it will be
copied into profiler output.
But there is only one place that uses dynamic |string| values, in SpiderMonkey.
And that place doesn't use |dynamicString|. So this patch changes that place to
use an empty |string| and put the old dynamic |string| value in
|dynamicString|. This in turn removes the need for FRAME_LABEL_COPY.
One minor wrinkle is that when |dynamicString| is used the old code put a space
between |string| and |dynamicString|. The new code omits the space if |string|
is empty.
The patch also renames ProfileEntry::string as ProfileEntry::label_, which
better matches how it's used, and ProfileEntry::dynamicString as
ProfileEntry::dynamicString_ so the getter can be renamed dynamicString().
profiler_resume() mistakenly has the PSAutoLock outside the scope created
specially for it. This patch moves the PSAutoLock inside the scope, making it
the same as profiler_pause().
It's a software memory barrier, and not a very strong one. If the values it is
protecting are Atomic, that provides a stronger hardware memory barrier.
This patch removes it, and changes one of the values it was protecting from
|volatile| to Atomic. (The other value it was protecting was already Atomic.)
Given that this basically hogs a core per Firefox process, this code only kicks in if you explicitly select a sub-millisecond resolution.
--HG--
extra : rebase_source : 58ca6f8f6537bc4b809e1634ed177c5d47fd499c
Bug 1359000 moved these functions from GeckoProfiler.h to platform.cpp, which
allowed a lot of follow-up simplifications. But it hurt performance.
This patch moves them back to GeckoProfiler.h and makes them |inline| again.
This required adding a second TLS pointer, sPseudoStack. Comments in the patch
explain why.
--HG--
extra : rebase_source : 4198e32b9e251f4014bce890936f4f85dabeb8ab
This removes the need for PROFILER_LIKELY_MEMORY_CONSTRAINED.
The patch also removes PROFILE_JAVA, USE_FAULTY_LIB, CONFIG_CASE_1,
CONFIG_CASE_2 and replaces all their uses with GP_OS_linux or GP_OS_android.
Finally, the patch removes a bogus |defined(GP_OS_darwin)| condition in
platform-linux-lul.cpp.
--HG--
extra : rebase_source : 77d1c625d65ddf551ab8cd4b962ae48c1a54466c
Currently the profiler mostly uses an array of strings to represent which
features are available and in use. This patch changes the profiler core to use
a uint32_t bitfield, which is a much simpler and faster representation.
(nsProfiler and the profiler add-on still use the array of strings, alas.) The
new ProfilerFeature type defines the values in the bitfield.
One side-effect of this change is that profiler_feature_active() now can be
used to query all features. Previously it was just a subset.
Another side-effect is that profiler_get_available_features() no longer incorrectly
indicates support for Java and stack-walking when they aren't supported. (The
handling of task tracer support is unchanged, because the old code handled it
correctly.)
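For comparison with the array-of-strings approach, a bitfield representation
looks roughly like this. The feature names, bit positions and helper names are
hypothetical, not the actual ProfilerFeature definitions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical feature bits; each feature occupies one bit of a uint32_t.
namespace Feature {
constexpr uint32_t Java = 1u << 0;
constexpr uint32_t JS = 1u << 1;
constexpr uint32_t StackWalk = 1u << 2;
}  // namespace Feature

// Querying a feature is a single mask test, with no string comparisons.
static bool FeatureActive(uint32_t aFeatures, uint32_t aFeature) {
  return (aFeatures & aFeature) != 0;
}

// Features requested but not supported on this platform are masked out.
static uint32_t AvailableSubset(uint32_t aRequested, uint32_t aSupported) {
  return aRequested & aSupported;
}
```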
- Use PROFILER_ consistently as the prefix for macros in this file. (As opposed
to PROFILE_ or SAMPLE_ or SAMPLER_ or MOZ_ or PLATFORM_ or no prefix.)
- Split overly long macros across multiple lines.
- Fix some macro indenting.
We pause/unpause the profiler before/after some streaming operations. But these
pause/unpause pairs occur with gPSMutex locked, and ActivePS::IsPaused() also
requires that gPSMutex be locked. Therefore these pause/unpause pairs cannot
be observed, and so this patch removes them.
ProfilerMarker is simple enough that it's best to fully define it in
ProfilerMarker.h, without introducing a ProfilerMarker.cpp.
This requires moving STORE_SEQUENCER() into its own header, StoreSequencer.h.
As a result, the following types are no longer visible outside the profiler:
ProfilerMarker, ProfilerLinkedList, ProfilerMarkerLinkedList,
ProfilerSignalSafeLinkedList. (PseudoStack.h now contains the PseudoStack class
and nothing else.)
The patch also makes the following non-obvious changes.
- It changes ProfilerMarker::{mMarkerName,mPayload} to unique pointers, which
removes the need for an explicit ~ProfilerMarker().
- It removes ProfilerMarker::GetMarkerName(), because that method is only used
within ProfilerMarker itself.
--HG--
extra : rebase_source : 22bdfb1c9c30751253ed66352d7edd51d308152d
Because ProfilerMarkerPayload is the main type defined in these files, and
because the next patch is going to introduce ProfilerMarker.{h,cpp}, which
would be confusingly similar to the old names.
--HG--
rename : tools/profiler/core/ProfilerMarkers.cpp => tools/profiler/core/ProfilerMarkerPayload.cpp
rename : tools/profiler/public/ProfilerMarkers.h => tools/profiler/public/ProfilerMarkerPayload.h
extra : rebase_source : df22a2ab3867650348ae78fe959ff0366aff230b
None of the accesses to these fields occur in hot operations, so it's
reasonable to do them with gPSMutex held. As a result, mJSSampling doesn't need
to be Atomic<>, and mContext's lack of Atomic-ness is no longer a cause for
concern.
This required adding an extra field, mJSContext, to TickSample.
--HG--
extra : rebase_source : 1485de5e493cef655233507248006574d0ab6ebd
PseudoStack is misnamed: it contains the pseudo-stack plus other stuff that is
accessed via TLS.
This patch moves the non-pseudo-stack parts of PseudoStack into a new type
called RacyThreadInfo, which is a subclass of PseudoStack. The patch also
capitalizes the first letter of the names of methods that it moves.
This means that PseudoStack is now accurately named. Also non-pseudo-stack
parts are now no longer visible outside the profiler, which is nice.
--HG--
extra : rebase_source : c110acfb6d2a1527ed33cc073fab3fb188851b22
Currently a reference to each thread's PseudoStack is stored in tlsPseudoStack.
This patch changes the TLS reference to refer to the enclosing ThreadInfo
instead. This allows profiler_clear_js_context() to access the current thread's
ThreadInfo via TLS, rather than searching with FindLiveThreadInfo().
The patch also encapsulates the TLS within a new class called TLSInfo. This
class allows access to the PseudoStack without protection from gPSMutex, but
access to the enclosing ThreadInfo requires a PSLockRef. This maintains the
current access regime.
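A minimal sketch of that access regime, using std::mutex and simplified
stand-in types. The member names and the exact shape of PSAutoLock and
PSLockRef here are assumptions made for illustration, not the patch's code:

```cpp
#include <cassert>
#include <mutex>

struct PseudoStack {
  int mDepth = 0;
};

struct ThreadInfo {
  PseudoStack mPseudoStack;
  int mThreadId = 0;
};

std::mutex gPSMutex;  // stand-in for the profiler state mutex

// Holding a PSAutoLock proves that gPSMutex is locked.
class PSAutoLock {
 public:
  PSAutoLock() : mGuard(gPSMutex) {}

 private:
  std::lock_guard<std::mutex> mGuard;
};

// Functions that need the lock take this as a parameter.
using PSLockRef = const PSAutoLock&;

class TLSInfo {
 public:
  static void Set(ThreadInfo* aInfo) { sThreadInfo = aInfo; }

  // The PseudoStack is reachable without the lock.
  static PseudoStack* Stack() {
    return sThreadInfo ? &sThreadInfo->mPseudoStack : nullptr;
  }

  // The enclosing ThreadInfo is only reachable with proof of the lock.
  static ThreadInfo* Info(PSLockRef) { return sThreadInfo; }

 private:
  static thread_local ThreadInfo* sThreadInfo;
};

thread_local ThreadInfo* TLSInfo::sThreadInfo = nullptr;
```

The unused PSLockRef parameter costs nothing at runtime; it only forces
callers to have a lock in scope before they can name the ThreadInfo.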
--HG--
extra : rebase_source : de9967f6c055061bb65930ffd02e369703b1362e
This possibly incurs an extra function call (depending on exactly how much
inlining the compiler does). But it helps enormously with subsequent refactorings,
because PseudoStack (and related types) don't need to be visible in
GeckoProfiler.h, which is exported outside the profiler.
--HG--
extra : rebase_source : f2dc5952d7444dfe12e627e86e6d37632b283107
This patch moves the manual polling up into the preceding loops, which is a
better place for it.
--HG--
extra : rebase_source : c95932d8f66635b9ca435f30bae78585dd7e04ca
PS contains some state that is always valid, and some state that is only valid
when the profiler is active. This patch does the following.
- Splits PS into two parts for the two kinds of state: CorePS and ActivePS.
- Moves gPS (now split in two) into CorePS and ActivePS, as static instances,
to improve encapsulation. This required changing all the state getters and
setters into static methods.
- Existence tests for CorePS and ActivePS are now done via the Exists()
methods.
Advantages of this change:
- It's now clear which parts of the global state (most of it!) are valid only
when the profiler is active, and we don't have to invalidate those parts with
zero/null/false when the profiler stops.
- Better OOP: more use of constructors and destructors, and more |const| to
indicate what state is immutable.
- With the old code there were some places where the order of things required
care, but with the new code it's not possible to get the order wrong.
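The split can be sketched roughly as follows. The fields (mEntries,
mInterval) and the use of std::unique_ptr are illustrative assumptions; only
the CorePS/ActivePS names and the Exists() idea come from the patch:

```cpp
#include <cassert>
#include <memory>

// State that is valid for the whole life of the profiler.
class CorePS {
 public:
  static void Create() { sInstance.reset(new CorePS()); }
  static void Destroy() { sInstance.reset(); }
  static bool Exists() { return sInstance != nullptr; }

 private:
  CorePS() = default;
  static std::unique_ptr<CorePS> sInstance;
};
std::unique_ptr<CorePS> CorePS::sInstance;

// State that is valid only while the profiler is active. Destroying the
// instance invalidates everything at once, so nothing has to be reset to
// zero/null/false by hand.
class ActivePS {
 public:
  static void Create(int aEntries, double aInterval) {
    sInstance.reset(new ActivePS(aEntries, aInterval));
  }
  static void Destroy() { sInstance.reset(); }
  static bool Exists() { return sInstance != nullptr; }

  static int Entries() {
    assert(sInstance);
    return sInstance->mEntries;
  }
  static double Interval() {
    assert(sInstance);
    return sInstance->mInterval;
  }

 private:
  ActivePS(int aEntries, double aInterval)
      : mEntries(aEntries), mInterval(aInterval) {}

  // Set once in the constructor, immutable afterwards.
  const int mEntries;
  const double mInterval;

  static std::unique_ptr<ActivePS> sInstance;
};
std::unique_ptr<ActivePS> ActivePS::sInstance;
```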
--HG--
extra : rebase_source : dba177acb41e4dc2103ace2212ab5ae1f7b418ce
TimeStamp::ProcessCreation()'s aIsInconsistent outparam is ignored by the
majority of its callers. This patch makes it optional. Notably, this makes
ProcessCreation() easier to use in a constructor's initializer list.
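For example, with a defaulted outparam the call fits directly into an
initializer list. This is a sketch with a stand-in FakeTimeStamp type, a
made-up return value and a hypothetical Uptime class, not the real TimeStamp
API:

```cpp
#include <cassert>

// Stand-in for TimeStamp; the stored value is arbitrary.
struct FakeTimeStamp {
  long mValue;
};

// The outparam defaults to nullptr, so callers that don't care can omit it.
inline FakeTimeStamp ProcessCreation(bool* aIsInconsistent = nullptr) {
  if (aIsInconsistent) {
    *aIsInconsistent = false;
  }
  return FakeTimeStamp{12345};
}

class Uptime {
 public:
  // No local bool is needed just to satisfy the signature.
  Uptime() : mStart(ProcessCreation()) {}
  long Start() const { return mStart.mValue; }

 private:
  const FakeTimeStamp mStart;
};
```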
On x86_64-Linux, LUL currently can only unwind frames for which CFI unwind data
is available. This causes a noticeable number of junk samples in the profiler,
characterised by failures at transition points between JIT and native frame
sequences. This patch allows LUL to try recovering the previous frame using
frame pointer chasing in the case where CFI isn't present. This allows LUL to
unwind through or jump over interleaved JIT frames, because, respectively:
* The baseline JIT produces frame-pointerised code.
* If the profiler is enabled, IonMonkey doesn't produce frame-pointerised code,
but also doesn't change the frame pointer register value. It can use the
frame pointer if profiling is disabled, but that's irrelevant here.
The patch also adds counts of FP-recovered frames to LUL's statistics printing,
to make it possible to assess how often this feature is used.
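The frame-pointer fallback can be pictured with the following sketch. It
assumes the conventional x86_64 frame layout (saved frame pointer at [fp],
return address at [fp+8]) and models stack memory as a map. The names and
the sanity check are illustrative and may not match what LUL actually does:

```cpp
#include <cassert>
#include <cstdint>
#include <map>

struct Regs {
  uint64_t pc;
  uint64_t sp;
  uint64_t fp;
};

// Simulated copy of the stack: address -> 64-bit word stored there.
using StackImage = std::map<uint64_t, uint64_t>;

// Tries to recover the caller's registers by following the frame pointer.
// Returns false if the words needed are not in the stack image or the frame
// pointer is implausible.
inline bool RecoverFrameViaFP(const StackImage& aStack, const Regs& aIn,
                              Regs* aOut) {
  // The stack grows down, so a valid frame pointer is not below sp.
  if (aIn.fp < aIn.sp) {
    return false;
  }
  auto savedFp = aStack.find(aIn.fp);
  auto retAddr = aStack.find(aIn.fp + 8);
  if (savedFp == aStack.end() || retAddr == aStack.end()) {
    return false;
  }
  aOut->fp = savedFp->second;  // caller's frame pointer
  aOut->pc = retAddr->second;  // return address into the caller
  aOut->sp = aIn.fp + 16;      // caller's sp just after the call returns
  return true;
}
```

An unwinder would try CFI first and call something like this only when no
CFI covers the current pc.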
--HG--
extra : rebase_source : eadc54393788693b0e3f8d5129d48aaaad143a0b
AddPseudoEntry() has a single callsite which always passes nullptr for the
last argument. This means that js::ProfilingGetPC() is never called, and so can
be removed. (Even if it was called, it always returns nullptr because ipToPC()
always returns nullptr!)
--HG--
extra : rebase_source : 1260d726c79bf5116143da9904d39b38e3c93837
This option turns on a frame counter that is shown in the top left corner via a
QR code. It was designed to be used in video recordings of B2G phones.
It no longer seems useful, so this patch removes it.
* * *
Bug 1357298 - Remove all traces of frame numbers and power from the profiler output. r=mstange.
--HG--
extra : rebase_source : 0ce87963ce375df64bb8d80ef2b5d40ea507bc7c
The patch adds a missing |delete|. (This leak only occurred when the
"mainthreadio" feature was enabled, which is not the default.)
The patch also makes the unregistering of the interpose observer conditional on
there being one in the first place, avoiding a harmless but useless
unregistering of |nullptr|.
--HG--
extra : rebase_source : 7cc3679192e3effa8d86edad5374643d2e2b8948
For reasons related to the architecture of the Gecko Profiler in previous years,
which are no longer relevant, LUL will only unwind through the first 32KB of
stack. This is mostly harmless, since most stacks are smaller than 4KB, per
measurements today, but occasionally they go above 32KB, causing unwinding to
stop prematurely.
This patch changes the max size to 160KB, and documents the rationale for
copying the stack and unwinding, rather than unwinding in place. 160KB is big
enough for all stacks observed in several minutes of profiling all threads at
1KHz.
--HG--
extra : rebase_source : a1d5526aff50345be8b965c2b6b01c66b40fd0d8
This function can run off the main thread when 'layers.frame-counter' is
enabled.
--HG--
extra : rebase_source : f3db0fd01c636d5af97109761bb0b57b57c79293
This also renames HasProfile() to IsBeingProfiled().
MozReview-Commit-ID: 70RGHNbyZG3
--HG--
extra : rebase_source : 64f1df6985f41ae52d2385edcfd7822d16fd1e00
This also renames HasProfile() to IsBeingProfiled().
MozReview-Commit-ID: 70RGHNbyZG3
--HG--
extra : rebase_source : fbe6faf0ed9ee7273e77f1f81b79915800772212
PseudoStack requires that startJSSampling() and stopJSSampling() calls be
interleaved. But currently the conditions guarding those calls don't match:
startJSSampling() is guarded by ShouldProfileThread(), and stopJSSampling() is
guarded by HasProfile().
It's possible for HasProfile() to be true when ShouldProfileThread() is not
true -- e.g. profile many threads, then restart and profile fewer threads, and
we end up with live threads that have a profile but aren't being profiled right
now -- which leads to assertion failures in stopJSSampling().
This patch makes the stopJSSampling() condition use ShouldProfileThread(), just
like the startJSSampling() condition, which fixes the assertion failure.
--HG--
extra : rebase_source : e9931928c8ac1301f5018f9da319bc478722b98e
LUL doesn't read CFI from the main executable on x86_64-linux, and possibly
other Linux variants, because SharedLibraryInfo::GetInfoForSelf() doesn't
produce a name for the main executable object, even though it does notice the
mapping.
This causes noticeable unwind breakage because the main executable on Linux
contains various wrapper functions pertaining to memory allocation and locking,
such as
moz_xmalloc, moz_xcalloc, moz_xrealloc
mozilla::detail::MutexImpl::lock, mozilla::detail::MutexImpl::unlock
and is generally observable on x86_64-Linux as unwinding failures out of
functions with addresses around 0x40xxxx, since that's the traditional load
address for the main executable.
This patch modifies the Linux implementation of GetInfoForSelf() so as to
harvest the main executable's name from /proc/self/maps. This is then added
into the information acquired from dl_iterate_phdr. As a result,
GetInfoForSelf() does correctly report the executable name, so LUL reads DWARF
unwind info from it, and the above-mentioned unwinding failures disappear.
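For reference, a line of /proc/self/maps can be parsed along these lines.
This is a sketch: MapEntry and ParseMapsLine() are hypothetical names, and
how the patch picks the executable's entry out of the parsed lines is not
shown here:

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <sstream>
#include <string>

struct MapEntry {
  uint64_t mStart;
  uint64_t mEnd;
  std::string mPerms;
  std::string mPath;  // empty for anonymous mappings
};

// Parses one line of the form
//   start-end perms offset dev inode [path]
// Returns false if the leading fields are missing or malformed.
inline bool ParseMapsLine(const std::string& aLine, MapEntry* aOut) {
  std::istringstream in(aLine);
  std::string range, offset, dev, inode;
  if (!(in >> range >> aOut->mPerms >> offset >> dev >> inode)) {
    return false;
  }
  std::string::size_type dash = range.find('-');
  if (dash == std::string::npos) {
    return false;
  }
  aOut->mStart = std::strtoull(range.substr(0, dash).c_str(), nullptr, 16);
  aOut->mEnd = std::strtoull(range.substr(dash + 1).c_str(), nullptr, 16);
  // Whatever follows the inode (after padding) is the path, if any.
  aOut->mPath.clear();
  std::getline(in >> std::ws, aOut->mPath);
  return true;
}
```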
--HG--
extra : rebase_source : 267c6d7c3967a4d29f8ff0b4a91d339a6625085d
shared-linux-libraries.cc is a maze of ifdefs which is hard to navigate, hard to
reason about and gets in the way of making a proper fix for bug 1354546. This
bug is for cleanup only. It should not change any functionality.
The following changes are made:
* adds emacs/vi tab-width lines
* removes the ARRAY_SIZE macro as it appears to be unused
* documents the 3 different configurations, splits #includes accordingly
* comments SharedLibraryInfo::GetInfoForSelf accordingly
* wraps some long lines
* documents in which cases dl_iterate_phdr is used and in which cases
/proc/<pid>/maps is used
* puts /proc/<pid>/maps reading in its own scope
* makes the LOG messages on failure clearer
Currently, ThreadInfos for live and dead threads are stored in a single vector.
This patch separates them into two separate vectors.
This ensures that the two kinds of ThreadInfos can't be mixed up. It also means
ThreadInfo::mPendingDelete can be removed.
Currently, when the profiler is active we hold onto the ThreadInfo of all
threads that die. Then when capturing a profile we ignore all threads that
aren't being profiled.
This patch changes things so we only hold onto the ThreadInfos of threads that
die if they are being profiled. In effect it removes state 3 from the following
list of possible ThreadInfo states:
1. !PendingDelete + !HasProfile
2. !PendingDelete + HasProfile
3. PendingDelete + !HasProfile (no longer used)
4. PendingDelete + HasProfile
Now that ThreadResponsiveness is only used on the main thread, we can refactor
ThreadInfo a bit. This patch does the following.
- Removes ThreadInfo::mThread, which is unused.
- Changes ThreadInfo::mRespInfo to a Maybe<>, and moves the is-main-thread
checking outside of ThreadInfo and ThreadResponsiveness.
- Renames {ThreadInfo,TickSample}::mRespInfo as mResponsiveness, to better
match the class name.
The state management is better done within nsProfiler::GetProfileDataAsync()
and nsProfiler::DumpProfileToFileAsync(). (The latter function is new in this
patch.)
This fixes a deadlock.
Other notes:
- The patch moves ProfileGatherer from ProfilerState to nsProfiler. This is
nice because the former is shared between threads but the latter is main
thread only. (This is how the deadlock is avoided.)
- ProfilerStateMutex and PSLockRef are no longer required in platform.h. Those
types and variables are now only used in platform.cpp and platform-*.cpp.
- ProfileGatherer now calls profiler_get_profile() instead of ToJSON(). This
means that ToJSON() now has a single caller, so the patch inlines it at the
callsite and removes it.
- profiler_save_profile_to_file_async() dispatched a Runnable to the main
thread. But this wasn't necessary, because it always ran on the main thread
itself. So the new function nsProfiler::DumpProfileToFileAsync() doesn't do
that.
- profiler_will_gather_OOP_profile(), profiler_gathered_OOP_profile(), and
profiler_OOP_exit_profile() are all moved into nsProfiler as well. This
removes the need for the horrible fake lock in
profiler_will_gather_OOP_profile(), hooray!
The conversion to a JSObject is better done within
nsProfiler::GetProfileData().
--HG--
extra : rebase_source : 4a0ba97d99681fca96f2d26b609bafe188095787
SetSampleContext() sets the TickSample's register fields, and the two callers
of SetSampleContext() set the TickSample's mContext.
This patch changes SetSampleContext() so it sets all the fields in one place.
It also renames SetSampleContext() as FillInSample(), because it sets more than
just the context.
--HG--
extra : rebase_source : 9b9f749fe3de687a7fd32f5c38e2321c2abebfdc
Currently each live thread has a PseudoStack that is owned by tlsPseudoStack,
and a ThreadInfo that has a non-owning pointer to the same PseudoStack.
Then, if the profiler is active when the thread dies, ownership of the
PseudoStack is transferred to the ThreadInfo.
This patch simplifies the ownership rules. Every ThreadInfo now always owns its
PseudoStack and is responsible for destroying it. tlsPseudoStack is a
non-owning pointer, and so must be cleared when a PseudoStack is destroyed.
This simplifies the code in a few places.
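The new ownership rule looks roughly like this. The sketch uses
std::unique_ptr and thread_local in place of the Mozilla equivalents, and
the two helper functions are hypothetical stand-ins for thread registration
and unregistration:

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct PseudoStack {
  int mDepth = 0;
};

// Non-owning; must never outlive the PseudoStack it points at.
thread_local PseudoStack* tlsPseudoStack = nullptr;

class ThreadInfo {
 public:
  ThreadInfo() : mPseudoStack(new PseudoStack()) {}
  PseudoStack* Stack() const { return mPseudoStack.get(); }

 private:
  // Sole owner; the PseudoStack dies with the ThreadInfo.
  std::unique_ptr<PseudoStack> mPseudoStack;
};

inline std::unique_ptr<ThreadInfo> RegisterCurrentThread() {
  std::unique_ptr<ThreadInfo> info(new ThreadInfo());
  tlsPseudoStack = info->Stack();
  return info;
}

inline void UnregisterCurrentThread(std::unique_ptr<ThreadInfo> aInfo) {
  // Clear the non-owning pointer before the owner goes away.
  tlsPseudoStack = nullptr;
  aInfo.reset();
}
```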
--HG--
extra : rebase_source : 1012b6590380091d60eff98b4e0c5b1ba946cc7e
The patch also adds a MOZ_RELEASE_ASSERT in profiler_unregister_thread() for
the case where the ThreadInfo isn't found, which is informative.
--HG--
extra : rebase_source : 11a86914db235e4a60955ff1c9b77d46109af548
This avoids the need for the fake ThreadInfo in profiler_get_backtrace(). It
requires adding a few extra fields to TickSample.
--HG--
extra : rebase_source : c28e5493edc7db96a7160e78b297ae09dc05ca7c
This patch does the following.
- Splits TickSample's constructor in two, one for the periodic sample case, and
one for the synchronous sample case, and initializes more stuff in them. (The
two constructors aren't that different right now, but they will become more
different when I remove TickSample::mThreadInfo.)
- Makes all the constructor-filled fields in TickSample |const|.
- Reorders the fields so that the constructor-filled ones are before the ones
that get filled in later.
- Omits mContext on Mac via conditional compilation, to make the omission
clearer.
--HG--
extra : rebase_source : f3e392c4cf777df5b9f39577af82615890137018
LastSample only makes sense for periodic samples, which are written to the
global ProfileBuffer. It doesn't make sense for synchronous samples which are
written to their own unshared buffer. At the moment it doesn't hurt to use
it in this nonsensical way, but the ThreadInfo in profiler_get_backtrace()
will be removed soon, and we won't even have a LastSample to use nonsensically.
So this patch makes the LastSample argument to addTagThreadId() optional. Which
means we have to pass in a ThreadId, so there's no longer much point
duplicating the ThreadId in LastSample, so the patch removes that field too.
This avoids the possibility of the duplicate ThreadId failing to match, which
is nice.
--HG--
extra : rebase_source : dad76ff8b33663398e6f45f85da500b0fd7a598f
At this point the only things in the ThreadInfo it uses are the thread name and
id, which are easy to store instead. This gets a step closer to avoiding the
use of ThreadInfo in profiler_get_backtrace().
--HG--
extra : rebase_source : f4feb08ec9fe7880ee43f784c6878c1c04fd3294
StreamSamplesAndMarkers() is the only ThreadInfo method called on
ProfilerBacktrace::mThreadInfo. Furthermore, it doesn't use all that much stuff
from ThreadInfo, and what stuff it does use we can instead pass in as
arguments.
This patch moves StreamSamplesAndMarkers() out of the class. It's a little
ugly, but a necessary precursor for removing ProfilerBacktrace::mThreadInfo and
all the subsequent improvements.
--HG--
extra : rebase_source : 417bda4f29a27c525f7240d3427494dd86b9a868
This required a tweak to DoNativeBacktrace() to work around an ASAN false
positive.
--HG--
extra : rebase_source : 2e21ae4c132db812150f42c26aa708aefce311be
When the profiler is running in privacy mode, we don't want dynamic strings
from PROFILER_LABEL_DYNAMIC to end up in the profile.
Rather than checking this every time we enter a scope marked with
PROFILER_LABEL_DYNAMIC, with this patch we will push the dynamic string into
the pseudo stack entry regardless, and then check the privacy mode during
sampling and ignore the dynamic string as necessary.
This way we can avoid taking the profiler state lock in PROFILER_LABEL_DYNAMIC
and also save a branch.
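In outline, the push side stays unconditional and the sampler does the
filtering. This is a simplified sketch with hypothetical types; the real
pseudo stack entry and sampling code look different:

```cpp
#include <cassert>
#include <string>
#include <vector>

struct PseudoEntry {
  const char* mLabel;
  const char* mDynamicString;  // may be nullptr
};

struct PseudoStack {
  std::vector<PseudoEntry> mEntries;

  // Hot path: no lock and no privacy check, just store the pointers.
  void Push(const char* aLabel, const char* aDynamicString) {
    mEntries.push_back(PseudoEntry{aLabel, aDynamicString});
  }
  void Pop() { mEntries.pop_back(); }
};

// Sampling path: privacy mode is consulted once per sample, here.
inline std::vector<std::string> SampleLabels(const PseudoStack& aStack,
                                             bool aPrivacyMode) {
  std::vector<std::string> out;
  for (const PseudoEntry& entry : aStack.mEntries) {
    std::string label = entry.mLabel;
    if (entry.mDynamicString && !aPrivacyMode) {
      label += ' ';
      label += entry.mDynamicString;
    }
    out.push_back(label);
  }
  return out;
}
```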
MozReview-Commit-ID: 5dXrtMuFJ5r
--HG--
extra : rebase_source : 1c2057e7ced332d9001137b5b280feab77a712e5
"Metadata" regularly confuses me, because it suggests something complicated
rather than a simple enum.
This change also has the benefit of removing inconsistent capitalization
("Metadata" vs. "MetaData").
--HG--
extra : rebase_source : b651e124142c8d93139d22dae1c993c899be4d7a
This patch:
- Removes TRACING_EVENT_BACKTRACE, which is unused.
- Removes TRACING_DEFAULT and replaces all its uses with TRACING_EVENT, because
there is no difference in how those two are used.
- Removes TRACING_TIMESTAMP, which is unused and also doesn't do anything
different to TRACING_EVENT.
--HG--
extra : rebase_source : 69af1c53aa918798d8050e6b9d1a2658a0902af5
This patch simplifies and increases the consistency of how HAVE_NATIVE_UNWIND
is used.
- Its definition is moved from platform.h to platform.cpp, because the latter
is the only file that uses it.
- It's now defined in the same place as USE_{NS,EHABI,LUL}_STACKWALK, and used
in preference to those, where possible. Also, it's now defined on Linux and
Android even if MOZ_PROFILING is not.
- HAVE_NATIVE_UNWIND is now used consistently and by itself for all relevant
conditions, including when defining the presence and use of the "stackwalk"
feature.
- The patch inlines and removes is_native_unwinding_avail().
Note that MOZ_PROFILING must be defined for HAVE_NATIVE_UNWIND to be true on
Windows and Mac, but not on Linux and Android.
--HG--
extra : rebase_source : 5be3e5fe65706a15179a2cf46ba9451f68fff815
Bug 1339695 part 8 unintentionally changed behaviour in profiler_init() when
MOZ_PROFILING is undefined. This patch undoes that change.
--HG--
extra : rebase_source : 16e992382e06fbc673555c87499c236e2b39bc7f
This patch does the following.
- Renames TickSample's members to mFoo style.
- Changes TickSample's constructor to set mTimeStamp.
- Moves TickSample creation from
SamplerThread::SuspendAndSampleAndResumeThread() to SamplerThread::Run(), so
it's not repeated for each platform.
- Changes TickSample::PopulateContext() so it takes a |tick_context_t*|
parameter, which avoids having to cast from |void*|.
Bug 1339695 part 8 accidentally disabled native stack walking on Android by
using GP_arm_android instead of GP_PLAT_arm_android in a #if. This patch fixes
that. It also fixes a couple of compile errors that crept into the relevant
code while it was disabled.
--HG--
extra : rebase_source : a7a94b018b8de7a7ca3c621a2b662859a65e69c1
This patch adds logging to some important functions that currently lack it.
Thread registration/unregistration is done with DEBUG_LOG because it's
more verbose than the other profiler logging, but less verbose than LUL's
logging.
The patch also scraps the BEGIN/END logging pairs because they bloat the output
for little gain. Now it just logs on function entry.
--HG--
extra : rebase_source : 3ef3d263c19cda03198e8b3a9ab89866f74ed1cd
The profiler will use level 3 (Info) and 4 (Debug) logging, though this patch
only uses level 3. LUL will use level 5 (Verbose) logging.
The patch also tweaks parts of the usage message, including adding
MOZ_PROFILER_{STARTUP,SHUTDOWN} to it.
--HG--
extra : rebase_source : f43a023912fbce993ed367cdd26b8f25f25381de
It's a very general mechanism for replacing the implementation of
printf_stderr().
It's primarily used by the profiler, sparingly, and not in an important way.
Worse, it prevents us from using MOZ_LOG in the profiler, which is something I
want. Because if any code that locks gPSMutex also calls MOZ_LOG, that then
calls printf_stderr(), which calls profiler_log(), which locks gPSMutex, which
deadlocks.
The only other use of set_stderr_callback() is for the ultra-hacky,
for-local-use-only copy_stderr_to_file() function, which was added for B2G
debugging and is no longer necessary.
This patch removes set_stderr_callback() altogether.
--HG--
extra : rebase_source : d31ecb482fe5899f62dc56a38e87d91f9271bab0
All three platform-*.cpp files have similar structure, most especially for
SamplerThread::Run(), with considerable duplication. This patch factors out
the common parts into a single implementation in platform.cpp.
* The top level structure of class SamplerThread has been moved to
platform.cpp.
* The class has some target-dependent fields, relating to signal handling and
thread identity.
* There's a single implementation of Run() in platform.cpp.
* AllocPlatformData() and PlatformDataDestructor::operator() have also been
commoned up and moved into platform.cpp.
* Time units in SamplerThread have been tidied up. We now use microseconds
throughout, except in the constructor. All time interval field and variable
names incorporate the unit (microseconds/milliseconds) for clarity. The
Windows uses of such values are scaled up/down by 1000 accordingly.
* The pre-existing MacOS Run() implementation contained logic that attempted
to keep "to schedule" in the presence of inaccuracy in the actual sleep
intervals. This now applies to all targets. A couple of comments on this
code have been added.
* platform-{win32,macos,linux-android}.cpp have had their Run() methods
removed, and all other methods placed in the same sequences, to the extent
that is possible.
* In the Win32 and MacOS implementations, Thread::SampleContext has been
renamed to Thread::SuspendSampleAndResumeThread as that better describes
what it does. In the Linux/Android implementation there was no such
separate method, so one has been created.
* The three Thread::SuspendSampleAndResumeThread methods have been commented
in such a way as to emphasise their identical top level structure.
* The point in platform.cpp where platform-{win32,macos,linux-android}.cpp are
#included has been moved slightly earlier in the file, into the
SamplerThread encampment, as that seems like a better place for it.
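The "keep to schedule" idea mentioned above can be sketched as follows. This
is an illustration of the general technique with hypothetical names
(SampleSchedule, NextSleepUs), not the code in Run():

```cpp
#include <cassert>
#include <cstdint>

// Tracks when the next sample is due, so that a late wake-up shortens the
// following sleep instead of shifting every later sample.
struct SampleSchedule {
  int64_t mIntervalUs;    // desired time between samples
  int64_t mNextSampleUs;  // time at which the current sample was due

  // Called after a sample has been taken, with the current time. Returns how
  // long to sleep, in microseconds, before taking the next sample.
  int64_t NextSleepUs(int64_t aNowUs) {
    mNextSampleUs += mIntervalUs;
    int64_t sleepUs = mNextSampleUs - aNowUs;
    if (sleepUs < 0) {
      // Too far behind to catch up: sample immediately and restart the
      // schedule from now.
      mNextSampleUs = aNowUs;
      sleepUs = 0;
    }
    return sleepUs;
  }
};
```

For a 1000us interval, a thread that wakes 250us late sleeps only 750us the
next time round.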
--HG--
extra : rebase_source : 0f93e15967b810c09e645fa593dbf85f94b53a9b
ProfileBuffer::FindLastSampleOfThread currently involves a linear search
backwards through the sample buffer. Profiling showed that to be the largest
profiler cost by far, at least on Linux. Bugs 1344118 and 1344258
significantly improve the situation, collectively reducing the cost by a
factor of at least 5 and often much more. But the linear search is still
present and still dominant. The worst of it is that it's unnecessary: we
could achieve the same by recording the start point of the most recent sample
for each thread in that thread's ThreadInfo record.
This patch does exactly that, adding the type ProfileBuffer::LastSample to
store the start points. LastSample also includes the ID of the thread it
pertains to as a read-only field, as that is needed in various places.
addTag doesn't check whether we're overwriting buffer entries containing start
points. Instead, FindLastSample checks whether the entry pointed to by the
LastSample it is given still contains a marker.
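A reduced model of the scheme is sketched below. The entry kinds, field
names and the -1 sentinel are assumptions made for illustration and are not
taken from the patch:

```cpp
#include <cassert>
#include <vector>

enum class Kind { Invalid, ThreadId, Frame };

struct Entry {
  Kind mKind;
  int mValue;
};

// Remembers where the most recent sample for one thread starts.
struct LastSample {
  explicit LastSample(int aThreadId) : mThreadId(aThreadId), mPos(-1) {}
  const int mThreadId;
  int mPos;  // -1 means no sample recorded yet
};

class ProfileBuffer {
 public:
  explicit ProfileBuffer(int aCapacity)
      : mEntries(aCapacity), mWritePos(0) {}

  // Appends an entry, silently overwriting the oldest one when full.
  void AddTag(Kind aKind, int aValue) {
    mEntries[mWritePos] = Entry{aKind, aValue};
    mWritePos = (mWritePos + 1) % static_cast<int>(mEntries.size());
  }

  // Starts a new sample and records its start point in aLS.
  void AddTagThreadId(LastSample& aLS) {
    aLS.mPos = mWritePos;
    AddTag(Kind::ThreadId, aLS.mThreadId);
  }

  // O(1) lookup. Returns -1 if there is no sample or if the start entry has
  // since been overwritten by something else.
  int FindLastSample(const LastSample& aLS) const {
    if (aLS.mPos < 0) {
      return -1;
    }
    const Entry& entry = mEntries[aLS.mPos];
    if (entry.mKind == Kind::ThreadId && entry.mValue == aLS.mThreadId) {
      return aLS.mPos;
    }
    return -1;
  }

 private:
  std::vector<Entry> mEntries;
  int mWritePos;
};
```

The check in FindLastSample() is what makes it safe for AddTag() to
overwrite start points without updating any LastSample.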
--HG--
extra : rebase_source : 2987ec744a5c16e8b6814abe7efb507fc7280605
Instead of copying and concatenating strings into an mDest buffer in
SamplerStackFramePrintfRAII, require callers to keep the string buffer alive
for the duration of the current scope, and store the pointer to the annotation
string in the ProfileEntry. During stackwalking, concatenate the label and the
annotation (separated by a space) and store the resulting string in the
profile buffer.
MozReview-Commit-ID: GEjcLrhhdvb
--HG--
extra : rebase_source : 683749421ee2122805a249cf413e882ee5f33331
It's not necessary and causes hangs.
The patch also inlines setup_atfork() and moves the Linux-only code closer to
the Linux-only PlatformInit(), and tweaks the comments a bit.
--HG--
extra : rebase_source : 0db23d649d9468b9308b881c0bbf5ea25a95ea13