It's a software memory barrier, and not a very strong one. If the values it is
protecting are Atomic, that provides a stronger hardware memory barrier.
This patch removes it, and changes one of the values it was protecting from
|volatile| to Atomic. (The other value it was protecting was already Atomic.)
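As a rough illustration of the difference, here is a minimal sketch that uses standard C++ std::atomic as a stand-in for mozilla::Atomic. The SharedState type and its member names are hypothetical and are not taken from the patch.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical example type; std::atomic stands in for mozilla::Atomic.
struct SharedState {
  // Before: a |volatile uint32_t| plus a compiler-only barrier, which stops
  // the compiler from reordering accesses but says nothing to the CPU.
  // After: atomics, whose loads and stores carry hardware ordering as well.
  std::atomic<uint32_t> mCount{0};
  std::atomic<uint32_t> mData{0};

  void Publish(uint32_t aValue) {
    mData.store(aValue, std::memory_order_relaxed);
    // Release: the write to mData becomes visible before the new count does.
    mCount.fetch_add(1, std::memory_order_release);
  }

  // Returns true and fills *aOut if something has been published.
  bool TryRead(uint32_t* aOut) const {
    if (mCount.load(std::memory_order_acquire) == 0) {
      return false;
    }
    *aOut = mData.load(std::memory_order_relaxed);
    return true;
  }
};
```

With the atomic in place no separate barrier macro is needed, because the ordering is attached to the accesses themselves.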
eslint-plugin-mozilla currently searches for the mozilla-central root by
walking up from its installed directory. However, if the plugin is installed
outside of central, such as globally, it will not find the root. This changeset
retains that behaviour, but also performs the same check from the current
working directory before failing.
MozReview-Commit-ID: 2L3JqLTuVDS
--HG--
extra : rebase_source : e24ddc726cb2470f7165f028f71416e942c44f87
This patch performs a refactoring to the internals of the profiler in order to
expose a function, profiler_suspend_and_sample_thread, which can be called from a
background thread to suspend, sample the native stack, and then resume the
target passed-in thread.
The interface was designed to expose as few internals of the profiler as
possible, exposing only a single callback which accepts the list of program
counters and stack pointers collected during the backtrace.
A method `profiler_current_thread_id` was also added to get the thread_id of the
current thread, which can then be passed by another thread into
profiler_suspend_and_sample_thread to sample the stack of that thread.
This is implemented in two parts:
1) Splitting SamplerThread into two classes: Sampler, and SamplerThread.
Sampler was created to extract the core logic from SamplerThread which manages
Unix signals on Android and Linux, and suspends the target thread on all
platforms. SamplerThread was then modified to subclass this type, adding the
extra methods and fields required for the creation and management of the actual
Sampler Thread.
Some work was done to ensure that the methods on Sampler would not require
ActivePS to be present, as we intend to sample threads when the profiler is not
active for the Background Hang Reporter.
2) Moving the Tick() logic into the TickController interface.
A TickController interface was added to platform which has 2 methods: Tick and
Backtrace. The Tick method replaces the previous Tick() static method, allowing
it to be overridden by a different consumer of SuspendAndSampleAndResumeThread,
while the Backtrace() method replaces the previous MergeStacksIntoProfile
method, allowing it to be overridden by different consumers of
DoNativeBacktrace.
This interface object is then used to wrap implementation specific data, such as
the ProfilerBuffer, and is threaded through the SuspendAndSampleAndResumeThread
and DoNativeBacktrace methods.
This change added 2 virtual calls to the SamplerThread's critical section, which
I believe should be a small enough overhead that it will not affect profiling
performance. These virtual calls could be avoided using templating, but I
decided that doing so would be unnecessary.
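The shape of the interface described above might look roughly like the sketch below. The parameter lists, the SampleThread helper and the CountingController consumer are assumptions made for illustration; the commit names only the interface and its two methods.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical signatures: only the interface name and the two method names
// come from the commit message.
class TickController {
 public:
  virtual ~TickController() = default;
  // Called once per sample, while the target thread is suspended.
  virtual void Tick(int aThreadId) = 0;
  // Receives the program counters and stack pointers from the native unwind.
  virtual void Backtrace(const std::vector<uintptr_t>& aPCs,
                         const std::vector<uintptr_t>& aSPs) = 0;
};

// Stand-in for the suspend/sample/resume path. The controller is threaded
// through so that each consumer decides what to do with a sample.
inline void SampleThread(int aThreadId, TickController& aController) {
  // (A real sampler would suspend the target thread here.)
  aController.Tick(aThreadId);
  // Placeholder frames; a real unwinder would walk the suspended stack.
  std::vector<uintptr_t> pcs = {0x1000, 0x2000};
  std::vector<uintptr_t> sps = {0x7000, 0x7040};
  aController.Backtrace(pcs, sps);
  // (...and resume it here.)
}

// One possible consumer: it only counts what it was given.
class CountingController final : public TickController {
 public:
  void Tick(int) override { ++mTicks; }
  void Backtrace(const std::vector<uintptr_t>& aPCs,
                 const std::vector<uintptr_t>&) override {
    mFrames += aPCs.size();
  }
  size_t mTicks = 0;
  size_t mFrames = 0;
};
```

The two virtual calls per sample are visible here as the Tick() and Backtrace() calls inside SampleThread().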
MozReview-Commit-ID: AT48xb2asgV
Given that this basically hogs a core per Firefox process, this code only kicks in if you explicitly select a sub-millisecond resolution.
--HG--
extra : rebase_source : 58ca6f8f6537bc4b809e1634ed177c5d47fd499c
Bug 1359000 moved these functions from GeckoProfiler.h to platform.cpp, which
allowed a lot of follow-up simplifications. But it hurt performance.
This patch moves them back to GeckoProfiler.h and makes them |inline| again.
This required adding a second TLS pointer, sPseudoStack. Comments in the patch
explain why.
--HG--
extra : rebase_source : 4198e32b9e251f4014bce890936f4f85dabeb8ab
This removes the need for PROFILER_LIKELY_MEMORY_CONSTRAINED.
The patch also removes PROFILE_JAVA, USE_FAULTY_LIB, CONFIG_CASE_1,
CONFIG_CASE_2 and replaces all their uses with GP_OS_linux or GP_OS_android.
Finally, the patch removes a bogus |defined(GP_OS_darwin)| condition in
platform-linux-lul.cpp.
--HG--
extra : rebase_source : 77d1c625d65ddf551ab8cd4b962ae48c1a54466c
Currently the profiler mostly uses an array of strings to represent which
features are available and in use. This patch changes the profiler core to use
a uint32_t bitfield, which is a much simpler and faster representation.
(nsProfiler and the profiler add-on still use the array of strings, alas.) The
new ProfilerFeature type defines the values in the bitfield.
One side-effect of this change is that profiler_feature_active() now can be
used to query all features. Previously it was just a subset.
Another side-effect is that profiler_get_available_features() no longer incorrectly
indicates support for Java and stack-walking when they aren't supported. (The
handling of task tracer support is unchanged, because the old code handled it
correctly.)
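A minimal sketch of the bitfield representation follows. The feature names, bit positions and helper functions are hypothetical; the real ProfilerFeature type may be laid out differently.

```cpp
#include <cstdint>

// Hypothetical feature names and bit positions, for illustration only.
namespace ProfilerFeature {
const uint32_t Java = 1u << 0;
const uint32_t JS = 1u << 1;
const uint32_t StackWalk = 1u << 2;
const uint32_t Threads = 1u << 3;
}  // namespace ProfilerFeature

// Testing membership is a single AND rather than a search through strings.
inline bool FeatureActive(uint32_t aFeatures, uint32_t aFeature) {
  return (aFeatures & aFeature) != 0;
}

// Report only the features this build can actually support.
inline uint32_t AvailableFeatures(bool aHasJava, bool aHasStackWalk) {
  uint32_t features = ProfilerFeature::JS | ProfilerFeature::Threads;
  if (aHasJava) {
    features |= ProfilerFeature::Java;
  }
  if (aHasStackWalk) {
    features |= ProfilerFeature::StackWalk;
  }
  return features;
}
```

Because every feature is a bit in the same word, a query function can answer for any feature without special cases.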
- Use PROFILER_ consistently as the prefix for macros in this file. (As opposed
to PROFILE_ or SAMPLE_ or SAMPLER_ or MOZ_ or PLATFORM_ or no prefix.)
- Split overly long macros across multiple lines.
- Fix some macro indenting.
We pause/unpause the profiler before/after some streaming operations. But these
pause/unpause pairs occur with gPSMutex locked, and ActivePS::IsPaused() also
requires that gPSMutex be locked. Therefore these pause/unpause pairs cannot
be observed, and so this patch removes them.
ProfilerMarker is simple enough that it's best to fully define it in
ProfilerMarker.h, without introducing a ProfilerMarker.cpp.
This requires moving STORE_SEQUENCER() into its own header, StoreSequencer.h.
As a result, the following types are no longer visible outside the profiler:
ProfilerMarker, ProfilerLinkedList, ProfilerMarkerLinkedList,
ProfilerSignalSafeLinkedList. (PseudoStack.h now contains the PseudoStack class
and nothing else.)
The patch also makes the following non-obvious changes.
- It changes ProfilerMarker::{mMarkerName,mPayload} to unique pointers, which
removes the need for an explicit ~ProfilerMarker().
- It removes ProfilerMarker::GetMarkerName(), because that method is only used
within ProfilerMarker itself.
--HG--
extra : rebase_source : 22bdfb1c9c30751253ed66352d7edd51d308152d
Because ProfilerMarkerPayload is the main type defined in these files, and
because the next patch is going to introduce ProfilerMarker.{h,cpp}, which
would be confusingly similar to the old names.
--HG--
rename : tools/profiler/core/ProfilerMarkers.cpp => tools/profiler/core/ProfilerMarkerPayload.cpp
rename : tools/profiler/public/ProfilerMarkers.h => tools/profiler/public/ProfilerMarkerPayload.h
extra : rebase_source : df22a2ab3867650348ae78fe959ff0366aff230b
None of the accesses to these fields occur in hot operations, so it's
reasonable to do them with gPSMutex held. As a result, mJSSampling doesn't need
to be Atomic<>, and mContext's lack of Atomic-ness is no longer a cause for
concern.
This required adding an extra field, mJSContext, to TickSample.
--HG--
extra : rebase_source : 1485de5e493cef655233507248006574d0ab6ebd
PseudoStack is misnamed: it contains the pseudo-stack plus other stuff that is
accessed via TLS.
This patch moves the non-pseudo-stack parts of PseudoStack into a new type
called RacyThreadInfo, which is a subclass of PseudoStack. The patch also
capitalizes the first letter of the names of methods that it moves.
This means that PseudoStack is now accurately named. Also, the non-pseudo-stack
parts are no longer visible outside the profiler, which is nice.
--HG--
extra : rebase_source : c110acfb6d2a1527ed33cc073fab3fb188851b22
Currently a reference to each thread's PseudoStack is stored in tlsPseudoStack.
This patch changes the TLS reference to refer to the enclosing ThreadInfo
instead. This allows profiler_clear_js_context() to access the current thread's
ThreadInfo via TLS, rather than searching with FindLiveThreadInfo().
The patch also encapsulates the TLS within a new class called TLSInfo. This
class allows access to the PseudoStack without protection from gPSMutex, but
access to the enclosing ThreadInfo requires a PSLockRef. This maintains the
current access regime.
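The access regime can be sketched as follows. This is a simplified illustration, not the patch's code: the mutex name, the way PSLockRef is constructed, and the method names on TLSInfo are all assumptions.

```cpp
#include <mutex>

struct PseudoStack {
  int mDepth = 0;
};

struct ThreadInfo {
  PseudoStack mStack;
  int mThreadId = 0;
};

std::mutex gPSMutexSketch;  // hypothetical stand-in for gPSMutex

// Proof-of-lock token: in this sketch it can only be built from a live guard.
class PSLockRef {
 public:
  explicit PSLockRef(const std::lock_guard<std::mutex>&) {}
};

class TLSInfo {
 public:
  static void Set(ThreadInfo* aInfo) { sInfo = aInfo; }

  // The enclosing ThreadInfo: the caller must show that the lock is held.
  static ThreadInfo* Info(const PSLockRef&) { return sInfo; }

  // The PseudoStack: reachable without the lock.
  static PseudoStack* Stack() { return sInfo ? &sInfo->mStack : nullptr; }

 private:
  static thread_local ThreadInfo* sInfo;  // non-owning, one per thread
};

thread_local ThreadInfo* TLSInfo::sInfo = nullptr;
```

Passing the token by reference costs nothing at run time; it only makes an unlocked call to Info() fail to compile.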
--HG--
extra : rebase_source : de9967f6c055061bb65930ffd02e369703b1362e
This possibly incurs an extra function call (depends on exactly how much inlining
the compiler does). But it helps enormously with subsequent refactorings,
because PseudoStack (and related types) don't need to be visible in
GeckoProfiler.h, which is exported outside the profiler.
--HG--
extra : rebase_source : f2dc5952d7444dfe12e627e86e6d37632b283107
This patch moves the manual polling up into the preceding loops, which is a
better place for it.
--HG--
extra : rebase_source : c95932d8f66635b9ca435f30bae78585dd7e04ca
PS contains some state that is always valid, and some state that is only valid
when the profiler is active. This patch does the following.
- Splits PS into two parts for the two kinds of state: CorePS and ActivePS.
- Moves gPS (now split in two) into CorePS and ActivePS, as static instances,
to improve encapsulation. This required changing all the state getters and
setters into static methods.
- Existence tests for CorePS and ActivePS are now done via the Exists()
methods.
Advantages of this change:
- It's now clear which parts of the global state (most of it!) are valid only
when the profiler is active, and we don't have to invalidate those parts with
zero/null/false when the profiler stops.
- Better OOP: more use of constructors and destructors, and more |const| to
indicate what state is immutable.
- With the old code there were some places where the order of things required
care, but with the new code it's not possible to get the order wrong.
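A minimal sketch of the split, assuming a single hypothetical piece of active state (an entry count). Only the class names and Exists() come from the commit message; Create(), Destroy() and Entries() are illustrative.

```cpp
#include <cassert>
#include <memory>

// State that is valid for the whole lifetime of the profiler.
class CorePS {
 public:
  static void Create() { sInstance.reset(new CorePS()); }
  static void Destroy() { sInstance.reset(); }
  static bool Exists() { return sInstance != nullptr; }

 private:
  CorePS() = default;
  static std::unique_ptr<CorePS> sInstance;
};
std::unique_ptr<CorePS> CorePS::sInstance;

// State that is valid only while the profiler is active.
class ActivePS {
 public:
  static void Create(int aEntries) {
    assert(CorePS::Exists());  // active state only makes sense on top of core
    sInstance.reset(new ActivePS(aEntries));
  }
  // Destroying the instance discards the state; nothing has to be zeroed.
  static void Destroy() { sInstance.reset(); }
  static bool Exists() { return sInstance != nullptr; }

  static int Entries() {
    assert(sInstance != nullptr);
    return sInstance->mEntries;
  }

 private:
  explicit ActivePS(int aEntries) : mEntries(aEntries) {}
  const int mEntries;  // fixed for the duration of one profiling session
  static std::unique_ptr<ActivePS> sInstance;
};
std::unique_ptr<ActivePS> ActivePS::sInstance;
```

Stopping the profiler then amounts to ActivePS::Destroy(), after which ActivePS::Exists() is false while CorePS is untouched.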
--HG--
extra : rebase_source : dba177acb41e4dc2103ace2212ab5ae1f7b418ce
TimeStamp::ProcessCreation()'s aIsInconsistent outparam is ignored by the
majority of its callers. This patch makes it optional. Notably, this makes
ProcessCreation() easier to use in a constructor's initializer list.
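The pattern is sketched below with a free function and a plain integer timestamp. Both are simplifications: the real method is a static member of TimeStamp and returns a TimeStamp, and StartupInfo is a hypothetical caller.

```cpp
#include <cstdint>

// Hypothetical stand-in for TimeStamp::ProcessCreation(). The outparam now
// defaults to nullptr, so callers that do not care can omit it.
inline uint64_t ProcessCreation(bool* aIsInconsistent = nullptr) {
  const uint64_t creation = 1000;   // placeholder value
  const bool inconsistent = false;  // placeholder for the real check
  if (aIsInconsistent) {
    *aIsInconsistent = inconsistent;
  }
  return creation;
}

// With no local variable needed for the outparam, the call fits directly
// into an initializer list.
class StartupInfo {
 public:
  StartupInfo() : mProcessStart(ProcessCreation()) {}
  const uint64_t mProcessStart;
};
```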
On x86_64-Linux, LUL currently can only unwind frames for which CFI unwind data
is available. This causes a noticeable number of junk samples in the profiler,
characterised by failures at transition points between JIT and native frame
sequences. This patch allows LUL to try recovering the previous frame using
frame pointer chasing in the case where CFI isn't present. This allows LUL to
unwind through or jump over interleaved JIT frames, because, respectively:
* The baseline JIT produces frame-pointerised code.
* If the profiler is enabled, IonMonkey doesn't produce frame-pointerised code,
but also doesn't change the frame pointer register value. It can use the
frame pointer if profiling is disabled, but that's irrelevant here.
The patch also adds counts of FP-recovered frames to LUL's statistics printing,
to make it possible to assess how often this feature is used.
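Frame pointer chasing on x86_64 can be sketched as below. This is not LUL's code: the Regs and StackImage types are hypothetical, and the sketch assumes the conventional layout in which the word at FP holds the caller's FP and the word at FP+8 holds the return address.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Regs {
  uint64_t pc;
  uint64_t sp;
  uint64_t fp;
};

// A copy of part of a thread's stack: mWords[i] is the 8-byte word at
// address mBase + 8 * i.
struct StackImage {
  uint64_t mBase;
  std::vector<uint64_t> mWords;

  bool Read(uint64_t aAddr, uint64_t* aOut) const {
    if (aAddr < mBase || (aAddr - mBase) % 8 != 0) {
      return false;
    }
    const size_t index = static_cast<size_t>((aAddr - mBase) / 8);
    if (index >= mWords.size()) {
      return false;
    }
    *aOut = mWords[index];
    return true;
  }
};

// Recover the caller's registers from the frame pointer alone. Fails if FP
// does not point into the copied stack, at or above the current SP.
inline bool RecoverCallerViaFP(const StackImage& aStack, const Regs& aCur,
                               Regs* aCaller) {
  uint64_t savedFP = 0;
  uint64_t returnAddr = 0;
  if (aCur.fp < aCur.sp) {
    return false;  // the stack grows down, so FP cannot be below SP
  }
  if (!aStack.Read(aCur.fp, &savedFP) ||
      !aStack.Read(aCur.fp + 8, &returnAddr)) {
    return false;
  }
  aCaller->fp = savedFP;
  aCaller->pc = returnAddr;
  aCaller->sp = aCur.fp + 16;  // just above the saved FP and return address
  return true;
}
```

This also shows why code that leaves the frame pointer register untouched can be jumped over: the chain stored on the stack still links the surrounding frames.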
--HG--
extra : rebase_source : eadc54393788693b0e3f8d5129d48aaaad143a0b
We can lower the eslint cyclomatic complexity threshold in some directories without adding eslint suppression comments in any .js source files. We need to specify the complexity rule in accessible/.eslintrc because it doesn't inherit the mozilla/recommended rules. eslint's default complexity threshold is 20.
Also bump the eslint-plugin-mozilla version because we modified the mozilla/recommended rules.
MozReview-Commit-ID: 57T4gAjPH7z
--HG--
extra : rebase_source : 4565abfa722b9459cfb4e006e843da13ed7cffd4
extra : intermediate-source : 658588564c08c9fd5e60633d1457f24087de8570
extra : source : 7e0526e3b943419a80c0cd2fa462cabbf8925eb1
eslint's default max-nested-callbacks threshold is 10, but now we make it an error. We could further lower the max-nested-callbacks threshold globally to 8, like browser/.eslintrc.js, but that would require adding suppression comments in (two) more .js test files. 10 seems good enough for now since it's the eslint default.
We need to specify max-nested-callbacks in accessible/.eslintrc because it doesn't inherit the mozilla/recommended rules.
Also bump the eslint-plugin-mozilla version because we modified the mozilla/recommended rules.
MozReview-Commit-ID: JA41vsi4U7j
--HG--
extra : rebase_source : 2dd211ebd3b8cf83f67f26cac5244bec8978f0e2
extra : intermediate-source : 6f5e75502be394038543029e3cfd474c5b1c2e98
extra : source : a13437d73c97fabd073ab8a6f93e85a5084b7405
We've had 4 submissions in 2017. jgriffin and I agreed that it was a
failed experiment and we should either abandon it or figure out how
to increase participation. Since nobody is available to improve it,
I'm inclined to just delete it. If we want to bring it back, it is
always in version control.
MozReview-Commit-ID: AizkZxjQ8IJ
--HG--
extra : rebase_source : 0e29058cfc909fb51fc56fa9b46c0e0a6eef0bf9
Comparing the actual files should make it unnecessary to bump the version number of our self-built modules.
MozReview-Commit-ID: GFbxBs4bc4Z
--HG--
extra : rebase_source : d96d465615254655ce0fa7af29503a534e795428
AddPseudoEntry() has a single callsite which always passes nullptr for the
last argument. This means that js::ProfilingGetPC() is never called, and so can
be removed. (Even if it was called, it always returns nullptr because ipToPC()
always returns nullptr!)
--HG--
extra : rebase_source : 1260d726c79bf5116143da9904d39b38e3c93837
Remove the definition of sig_safe_t, which is only used by PseudoStack,
and replace the uses with mozilla::Atomic<uint32_t>.
MozReview-Commit-ID: GcPd9R94Vci
--HG--
extra : rebase_source : dcc05a219d59ffdc0486ef2e7118d888c6a93fda
This option turns on a frame counter that is shown in the top left corner via a
QR code. It was designed to be used in video recordings of B2G phones.
It no longer seems useful, so this patch removes it.
* * *
Bug 1357298 - Remove all traces of frame numbers and power from the profiler output. r=mstange.
--HG--
extra : rebase_source : 0ce87963ce375df64bb8d80ef2b5d40ea507bc7c
This fixes a JS exception that gets thrown when one tries to capture a profile
in this case.
--HG--
extra : rebase_source : 46f6eeed3c17086b0b6c35b26f3c9e4841dd6cff
clang's -Wcomma warns about suspicious uses of the comma operator, such as using it to join two statements.
tools/profiler/lul/LulDwarf.cpp:604:15: warning: possible misuse of comma operator here [-Wcomma]
MozReview-Commit-ID: 6ZP79hgtrAD
--HG--
extra : rebase_source : 77028600c713aa3235c3729a5db7be0290df57e4
extra : source : e4536bbeb28050b38979a05b379f13eb4a12beee
The patch adds a missing |delete|. (This leak only occurred when the
"mainthreadio" feature was enabled, which is not the default.)
The patch also makes the unregistering of the interpose observer conditional on
there being one in the first place, avoiding a harmless but useless
unregistering of |nullptr|.
--HG--
extra : rebase_source : 7cc3679192e3effa8d86edad5374643d2e2b8948
For reasons related to the architecture of the Gecko Profiler in previous years,
which are no longer relevant, LUL will only unwind through the first 32KB of
stack. This is mostly harmless, since most stacks are smaller than 4KB, per
measurements today, but occasionally they go above 32KB, causing unwinding to
stop prematurely.
This patch changes the max size to 160KB, and documents the rationale for
copying the stack and unwinding, rather than unwinding in place. 160KB is big
enough for all stacks observed in several minutes of profiling all threads at
1 kHz.
--HG--
extra : rebase_source : a1d5526aff50345be8b965c2b6b01c66b40fd0d8
Everything depending on the widget being gonk can go away, as well as
everything depending on MOZ_AUDIO_CHANNEL_MANAGER, which was only
defined on gonk builds under b2g/ (which goes away in bug 1357326).
--HG--
extra : rebase_source : 9f0aeeb7eea8417fa4e06d662d566d67ecaf2a24
Bump the version number to match what is currently published.
MozReview-Commit-ID: 8r8otQQBqBo
--HG--
extra : rebase_source : 1b7daac6b2852117ae08927fd5a09b2a7650d683
This function can run off the main thread when 'layers.frame-counter' is
enabled.
--HG--
extra : rebase_source : f3db0fd01c636d5af97109761bb0b57b57c79293
This also renames HasProfile() to IsBeingProfiled().
MozReview-Commit-ID: 70RGHNbyZG3
--HG--
extra : rebase_source : 64f1df6985f41ae52d2385edcfd7822d16fd1e00
For reasons which are unclear, but possibly due to lack of any known use cases
when the code was written, LUL on i686/x86_64-linux only accepts CFA (canonical
frame address) expressions of the form SP+offset or FP+offset. However, on
Fedora 25 x86_64 and Ubuntu 16.10 x86_64, at least one address range per object
uses a Dwarf expression for the CFA, for example:
  00000018 000000000024 0000001c FDE cie=00000000 pc=0000000031e0..0000000031f0
    DW_CFA_def_cfa_offset: 16
    DW_CFA_advance_loc: 6 to 00000000000031e6
    DW_CFA_def_cfa_offset: 24
    DW_CFA_advance_loc: 10 to 00000000000031f0
    DW_CFA_def_cfa_expression(
      DW_OP_breg7 (rsp): 8; DW_OP_breg16 (rip): 0; DW_OP_lit15; DW_OP_and;
      DW_OP_lit11; DW_OP_ge; DW_OP_lit3; DW_OP_shl; DW_OP_plus)
producing the following complaint from LUL:
  can't summarise: SVMA=0x31f0: rule for DW_REG_CFA: invalid |how|, expr=LExpr(PFXEXPR,0,0)
Given that LUL is capable of handling such a CFA expression, it seems artificial
to stop it doing so. This patch changes Summariser::Rule() so as to allow such
expressions.
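To show what the CFA expression in the example computes, here is a small stack-machine sketch covering only the operators that appear in it. It is not LUL's evaluator; the Op, Insn, EvalExpr and PltCfaExpr names are hypothetical, and the register values used with it are made up.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

enum class Op { BReg, Lit, And, Ge, Shl, Plus };

struct Insn {
  Op op;
  int reg;        // BReg only: DWARF register number
  int64_t value;  // BReg: offset added to the register; Lit: the literal
};

// Evaluates a postfix expression. Returns false on stack underflow or an
// out-of-range register number.
inline bool EvalExpr(const std::vector<Insn>& aExpr,
                     const std::vector<uint64_t>& aRegs, uint64_t* aResult) {
  std::vector<uint64_t> stack;
  for (const Insn& insn : aExpr) {
    if (insn.op == Op::BReg) {
      if (insn.reg < 0 || static_cast<size_t>(insn.reg) >= aRegs.size()) {
        return false;
      }
      stack.push_back(aRegs[static_cast<size_t>(insn.reg)] +
                      static_cast<uint64_t>(insn.value));
      continue;
    }
    if (insn.op == Op::Lit) {
      stack.push_back(static_cast<uint64_t>(insn.value));
      continue;
    }
    // The remaining operators pop two entries and push one.
    if (stack.size() < 2) {
      return false;
    }
    const uint64_t top = stack.back();
    stack.pop_back();
    const uint64_t second = stack.back();
    stack.pop_back();
    switch (insn.op) {
      case Op::And:
        stack.push_back(second & top);
        break;
      case Op::Ge:  // signed comparison, pushes 1 or 0
        stack.push_back(
            static_cast<int64_t>(second) >= static_cast<int64_t>(top) ? 1 : 0);
        break;
      case Op::Shl:
        stack.push_back(top < 64 ? (second << top) : 0);
        break;
      case Op::Plus:
        stack.push_back(second + top);
        break;
      default:
        return false;
    }
  }
  if (stack.empty()) {
    return false;
  }
  *aResult = stack.back();
  return true;
}

// The expression from the FDE above: rsp + 8 + (((rip & 15) >= 11) << 3).
inline std::vector<Insn> PltCfaExpr() {
  return {{Op::BReg, 7, 8},  {Op::BReg, 16, 0}, {Op::Lit, 0, 15},
          {Op::And, 0, 0},   {Op::Lit, 0, 11},  {Op::Ge, 0, 0},
          {Op::Lit, 0, 3},   {Op::Shl, 0, 0},   {Op::Plus, 0, 0}};
}
```

In other words the CFA is rsp+8 or rsp+16, depending on how far into its 16-byte slot the instruction pointer has advanced.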
PseudoStack requires that startJSSampling() and stopJSSampling() calls be
interleaved. But currently the conditions guarding those calls don't match:
startJSSampling() is guarded by ShouldProfileThread(), and stopJSSampling() is
guarded by HasProfile().
It's possible for HasProfile() to be true when ShouldProfileThread() is not
true -- e.g. profile many threads, then restart and profile fewer threads, and
we end up with live threads that have a profile but aren't being profiled right
now -- which leads to assertion failures in stopJSSampling().
This patch makes the stopJSSampling() condition use ShouldProfileThread(), just
like the startJSSampling() condition, which fixes the assertion failure.
--HG--
extra : rebase_source : e9931928c8ac1301f5018f9da319bc478722b98e
LUL doesn't read CFI from the main executable on x86_64-linux, and possibly
other Linux variants, because SharedLibraryInfo::GetInfoForSelf() doesn't
produce a name for the main executable object, even though it does notice the
mapping.
This causes noticeable unwind breakage because the main executable on Linux
contains various wrapper functions pertaining to memory allocation and locking,
such as
moz_xmalloc, moz_xcalloc, moz_xrealloc
mozilla::detail::MutexImpl::lock, mozilla::detail::MutexImpl::unlock
and is generally observable on x86_64-Linux as unwinding failures out of
functions with addresses around 0x40xxxx, since that's the traditional load
address for the main executable.
This patch modifies the Linux implementation of GetInfoForSelf() so as to
harvest the main executable's name from /proc/self/maps. This is then added
into the information acquired from dl_iterate_phdr. As a result
GetInfoForSelf() does correctly report the executable name, so LUL reads Dwarf
unwind info from it, and the abovementioned unwinding failures disappear.
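Reading a mapping's name out of /proc/self/maps comes down to parsing lines of the form "start-end perms offset dev inode path". The sketch below parses one such line; MapsEntry and ParseMapsLine are hypothetical names, and how the patch picks out the executable's own entry is not described here, so that step is left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>

struct MapsEntry {
  uint64_t start = 0;
  uint64_t end = 0;
  std::string perms;
  std::string path;  // empty for anonymous mappings
};

// Parses one line of /proc/<pid>/maps. Returns false if the line is malformed.
inline bool ParseMapsLine(const std::string& aLine, MapsEntry* aOut) {
  std::istringstream in(aLine);
  std::string range, offset, dev, inode;
  if (!(in >> range >> aOut->perms >> offset >> dev >> inode)) {
    return false;
  }
  const size_t dash = range.find('-');
  if (dash == std::string::npos) {
    return false;
  }
  try {
    aOut->start = std::stoull(range.substr(0, dash), nullptr, 16);
    aOut->end = std::stoull(range.substr(dash + 1), nullptr, 16);
  } catch (const std::exception&) {
    return false;  // the address range was not valid hexadecimal
  }
  // The path is optional and may contain spaces, so take the rest of the line.
  aOut->path.clear();
  std::getline(in >> std::ws, aOut->path);
  return true;
}
```

A caller would read /proc/self/maps line by line with std::getline and feed each line to ParseMapsLine().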
--HG--
extra : rebase_source : 267c6d7c3967a4d29f8ff0b4a91d339a6625085d
Pick up autofix improvements in 3.19.0. Upgrade eslint-plugin-react. Also fix issues with the package.json file and the ESLint node_modules upload script.
MozReview-Commit-ID: IDZ1n4qTTuv
--HG--
extra : rebase_source : aa97cd6f314ce10d16d12446e50a27d6f994a9f1
shared-linux-libraries.cc is a maze of ifdefs which is hard to navigate, hard to
reason about and gets in the way of making a proper fix for bug 1354546. This
bug is for cleanup only. It should not change any functionality.
The following changes are made:
* adds emacs/vi tab-width lines
* removes the ARRAY_SIZE macro as it appears to be unused
* documents the 3 different configurations, splits #includes accordingly
* comments SharedLibraryInfo::GetInfoForSelf accordingly
* wraps some long lines
* documents in which cases dl_iterate_phdr is used and in which cases
/proc/<pid>/maps is used
* puts /proc/<pid>/maps reading in its own scope
* makes the LOG messages on failure clearer
Currently, ThreadInfos for live and dead threads are stored in a single vector.
This patch separates them into two separate vectors.
This ensures that the two kinds of ThreadInfos can't be mixed up. It also means
ThreadInfo::mPendingDelete can be removed.
Currently, when the profiler is active we hold onto the ThreadInfo of all
threads that die. Then when capturing a profile we ignore all threads that
aren't being profiled.
This patch changes things so we only hold onto the ThreadInfos of threads that
die if they are being profiled. In effect it removes state 3 from the following
list of possible ThreadInfo states:
1. !PendingDelete + !HasProfile
2. !PendingDelete + HasProfile
3. PendingDelete + !HasProfile (no longer used)
4. PendingDelete + HasProfile
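The two changes above (separate vectors, and keeping dead threads only when they were being profiled) can be sketched together. ThreadRegistry and its members are hypothetical names and the ThreadInfo here is reduced to two fields.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct ThreadInfo {
  int mThreadId;
  bool mIsBeingProfiled;
};

struct ThreadRegistry {
  // Two vectors, so that a live thread can never be mistaken for a dead one
  // and no per-thread "pending delete" flag is needed.
  std::vector<std::unique_ptr<ThreadInfo>> mLive;
  std::vector<std::unique_ptr<ThreadInfo>> mDead;

  void Register(int aThreadId, bool aIsBeingProfiled) {
    mLive.push_back(std::unique_ptr<ThreadInfo>(
        new ThreadInfo{aThreadId, aIsBeingProfiled}));
  }

  // On thread exit, keep the ThreadInfo only if the profiler is active and
  // the thread was being profiled; otherwise it is destroyed immediately.
  void Unregister(int aThreadId, bool aProfilerActive) {
    for (size_t i = 0; i < mLive.size(); ++i) {
      if (mLive[i]->mThreadId != aThreadId) {
        continue;
      }
      if (aProfilerActive && mLive[i]->mIsBeingProfiled) {
        mDead.push_back(std::move(mLive[i]));
      }
      mLive.erase(mLive.begin() + static_cast<std::ptrdiff_t>(i));
      return;
    }
  }
};
```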
Now that ThreadResponsiveness is only used on the main thread, we can refactor
ThreadInfo a bit. This patch does the following.
- Removes ThreadInfo::mThread, which is unused.
- Changes ThreadInfo::mRespInfo to a Maybe<>, and moves the is-main-thread
checking outside of ThreadInfo and ThreadResponsiveness.
- Renames {ThreadInfo,TickSample}::mRespInfo as mResponsiveness, to better
match the class name.
The state management is better done within nsProfiler::GetProfileDataAsync()
and nsProfiler::DumpProfileToFileAsync(). (The latter function is new in this
patch.)
This fixes a deadlock.
Other notes:
- The patch moves ProfileGatherer from ProfilerState to nsProfiler. This is
nice because the former is shared between threads but the latter is main
thread only. (This is how the deadlock is avoided.)
- ProfilerStateMutex and PSLockRef are no longer required in platform.h. Those
types and variables are now only used in platform.cpp and platform-*.cpp.
- ProfileGatherer now calls profiler_get_profile() instead of ToJSON(). This
means that ToJSON() now has a single caller, so the patch inlines it at the
callsite and removes it.
- profiler_save_profile_to_file_async() dispatched a Runnable to the main
thread. But this wasn't necessary, because it always ran on the main thread
itself. So the new function nsProfiler::DumpProfileToFileAsync() doesn't do
that.
- profiler_will_gather_OOP_profile(), profiler_gathered_OOP_profile(), and
profiler_OOP_exit_profile() are all moved into nsProfiler as well. This
removes the need for the horrible fake lock in
profiler_will_gather_OOP_profile(), hooray!
The conversion to a JSObject is better done within
nsProfiler::GetProfileData().
--HG--
extra : rebase_source : 4a0ba97d99681fca96f2d26b609bafe188095787
This reduces the number of places where we need to specify the mozilla/frame-script environment. It does have
the side effect of allowing those globals in the whole file, but that is what specifying the environment would
do, and this is also for mochitest test files only.
MozReview-Commit-ID: 1LLFbn6fFJR
--HG--
extra : rebase_source : 82a6934d90bbbbd25f91b7b06bf4f9354e38865a
This retains the advantage of running only once per process, while
avoiding the per-process overhead of a jsm.
MozReview-Commit-ID: 1N53MvRwUpg
--HG--
rename : browser/modules/ContentObservers.jsm => browser/modules/ContentObservers.js
extra : rebase_source : 6a502cff26fcb55526f97385274bbae871f5cc6c
SetSampleContext() sets the TickSample's register fields, and the two callers
of SetSampleContext() set the TickSample's mContext.
This patch changes SetSampleContext() so it sets all the fields in one place.
It also renames SetSampleContext() as FillInSample(), because it sets more than
just the context.
--HG--
extra : rebase_source : 9b9f749fe3de687a7fd32f5c38e2321c2abebfdc
Currently each live thread has a PseudoStack that is owned by tlsPseudoStack,
and a ThreadInfo that has a non-owning pointer to the same PseudoStack.
Then, if the profiler is active when the thread dies, ownership of the
PseudoStack is transferred to the ThreadInfo.
This patch simplifies the ownership rules. Every ThreadInfo now always owns its
PseudoStack and is responsible for destroying it. tlsPseudoStack is a
non-owning pointer, and so must be cleared when a PseudoStack is destroyed.
This simplifies the code in a few places.
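The simplified ownership rule can be sketched as follows. This is an illustration under a strong assumption, namely that a ThreadInfo is created and destroyed on the thread it describes; the real profiler can keep a ThreadInfo alive after its thread has died, which this sketch does not model.

```cpp
#include <memory>

struct PseudoStack {
  int mDepth = 0;
};

// Non-owning: it only ever points at a PseudoStack owned by a ThreadInfo.
thread_local PseudoStack* tlsPseudoStack = nullptr;

class ThreadInfo {
 public:
  ThreadInfo() : mPseudoStack(new PseudoStack()) {
    tlsPseudoStack = mPseudoStack.get();
  }

  ~ThreadInfo() {
    // The TLS pointer must not outlive the object it points at.
    if (tlsPseudoStack == mPseudoStack.get()) {
      tlsPseudoStack = nullptr;
    }
  }

  ThreadInfo(const ThreadInfo&) = delete;
  ThreadInfo& operator=(const ThreadInfo&) = delete;

  PseudoStack* Stack() const { return mPseudoStack.get(); }

 private:
  // Sole owner: the PseudoStack is destroyed together with the ThreadInfo.
  std::unique_ptr<PseudoStack> mPseudoStack;
};
```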
--HG--
extra : rebase_source : 1012b6590380091d60eff98b4e0c5b1ba946cc7e
The patch also adds a MOZ_RELEASE_ASSERT in profiler_unregister_thread() for
the case where the ThreadInfo isn't found, which is informative.
--HG--
extra : rebase_source : 11a86914db235e4a60955ff1c9b77d46109af548
This avoids the need for the fake ThreadInfo in profiler_get_backtrace(). It
requires adding a few extra fields to TickSample.
--HG--
extra : rebase_source : c28e5493edc7db96a7160e78b297ae09dc05ca7c
This patch does the following.
- Splits TickSample's constructor in two, one for the periodic sample case, and
one for the synchronous sample case, and initializes more stuff in them. (The
two constructors aren't that different right now, but they will become more
different when I remove TickSample::mThreadInfo.)
- Makes all the constructor-filled fields in TickSample |const|.
- Reorders the fields so that the constructor-filled ones are before the ones
that get filled in later.
- Omits mContext on Mac via conditional compilation, to make the omission
clearer.
--HG--
extra : rebase_source : f3e392c4cf777df5b9f39577af82615890137018