The responsiveness was stored with periodic stack samples.
Since bug 1581049, periodic stack samples are now contained within a single
"modern" `CompactStack` entry.
When a thread is sleeping, `CompactStack`s are simply copied verbatim, including
the responsiveness, which doesn't make sense because since we're sleeping so the
responsiveness should still increase.
To avoid duplicating the responsiveness, it is now stored as a separate buffer
entry `UnresponsiveDurationMs` before the `CompactStack`; this replaces the
legacy entry `Responsiveness` that was stored at the end of the stack.
Differential Revision: https://phabricator.services.mozilla.com/D51554
--HG--
extra : moz-landing-system : lando
The buffer range starts at 1 (the first valid entry, 0 is reserved as
null marker). So if it now starts *after* 1, it means we have started to
overwrite our oldest data, and we should get rid of Base profiles if any.
Differential Revision: https://phabricator.services.mozilla.com/D51225
--HG--
extra : moz-landing-system : lando
profile.counters[n].sample_groups was mistakenly streamed as an object, which
prevents having more than one, and goes against the published format
documentation.
The front-end was implemented to process the incorrect format, so it will need
to be updated as well; hence the version change to 18.
Differential Revision: https://phabricator.services.mozilla.com/D49867
--HG--
extra : moz-landing-system : lando
Instead of requesting profiles until it "seems" to have collected something,
use `nsIProfiler.waitOnePeriodicSampling()` to really wait for a sample to be
taken.
Note that this means some test could theoretically fail if they were in fact
waiting for stack samples to appear in the first registered thread. If that
happens, these tests should be udpated to do that extra wait-for-data.
Differential Revision: https://phabricator.services.mozilla.com/D50783
--HG--
extra : moz-landing-system : lando
`nsIProfiler.waitOnePeriodicSampling()` returns a promise that gets resolved
after the next full periodic sampling. The promise is rejected if the sampler is
not running.
Implementation detail: `Promise` uses single-threaded ref-counting, so special
care is needed not to touch it in the sampler thread. Safe ref-counting is
handled by `nsMainThreadPtrHolder` and `nsMainThreadPtrHandle`. And the promise
is only resolved/rejected on the main thread.
Differential Revision: https://phabricator.services.mozilla.com/D50782
--HG--
extra : moz-landing-system : lando
Register a callback to be invoked the next time the Sampler thread ends its
periodic loop.
Registration may fail (e.g., if the profiler is not active), in which case the
callback is never called.
After successful registration the callback is guaranteed to be invoked after the
sampler runs, or stops running.
Differential Revision: https://phabricator.services.mozilla.com/D50781
--HG--
extra : moz-landing-system : lando
The source of deadlock is due to profiler->lock->IO->interposer->profiler->lock
loops, where the second profiler call tries to lock the profiler mutex again.
Thanks to `profiler_is_locked_on_current_thread` the interposer can now check
whether the mutex is already locked, in which case it just won't call profiler
functions anymore, breaking the loop.
And now the profiler doesn't need to manually pause/resume/unregister the
(potential) interposer when doing I/Os.
Differential Revision: https://phabricator.services.mozilla.com/D50561
--HG--
extra : moz-landing-system : lando
PSMutex can store the id of the thread owning the lock, and it's safe to check
whether the current thread owns that lock (either it does and no-one else can
change it, or it's another id).
Comments related to memory hooks have been moved to memory_hooks.cpp, because
`profiler_is_locked_on_current_thread()` could be used for other things.
Differential Revision: https://phabricator.services.mozilla.com/D49895
--HG--
extra : moz-landing-system : lando
The LUL unwinder on x86_64-android and x86_64-linux tries to read Dwarf unwind
information from the ELF section named ".eh_frame", and expects the section to
have type SHT_PROGBITS. It appears that that section might instead have the
type SHT_X86_64_UNWIND, possibly as a result of recent toolchain (linker?)
changes. This patch tracks that change.
Differential Revision: https://phabricator.services.mozilla.com/D50474
--HG--
extra : moz-landing-system : lando
Logging could be intercepted by the I/O interposer, which takes the lock, this
could deadlock if the mutex had already been taken in the same thread (e.g., by
a "locked" profiler function that wants to log something).
Now, functions that hold the lock must use `LOG_LOCKED(lock, ...)`, which will
skip logging when the I/O interposer is running.
In DEBUG builds, `LOG()` checks that the lock is not owned, to find possible
misuses.
`LOG()`s outside of platform.cpp were changed to `NS_WARNING`s, because they
could not access `gPSMutex` that's defined in platform.cpp; and they're really
warnings anyway.
Some interposer pauses/resumes have been relocated to more appropriate spots:
- Not needed around `locked_profiler_stream_json_for_this_process` anymore,
thanks to `LOG_LOCKED` inside.
- Definitely needed around `std::ofstream` operations in
`locked_profiler_save_profile_to_file`.
Differential Revision: https://phabricator.services.mozilla.com/D49896
--HG--
extra : moz-landing-system : lando
PSMutex can store the id of the thread owning the lock, and it's safe to check
whether the current thread owns that lock (either it does and no-one else can
change it, or it's another id).
Comments related to memory hooks have been moved to memory_hooks.cpp, because
`profiler_is_locked_on_current_thread()` could be used for other things.
Differential Revision: https://phabricator.services.mozilla.com/D49895
--HG--
extra : moz-landing-system : lando
In Bug 1587907 we got some contentPid not found errors. It seemed like we were
failing intermittently at getting content pid. After moving test content inside
the BrowserTestUtils.withNewTab callback, we won't get the same error anymore.
Also from my testing, it looks like it makes the test execution faster. We were
getting some intermittents before because of the timeouts, these will be fixed
if my testing is correct.
Differential Revision: https://phabricator.services.mozilla.com/D49951
--HG--
extra : moz-landing-system : lando
Since markers are stored directly into the buffer:
- We don't need to wait for a sampling run to collect them,
- No markers are left uncollected between the last sampling and the time we stop
the profiler.
Differential Revision: https://phabricator.services.mozilla.com/D49233
--HG--
extra : moz-landing-system : lando
`stopProfilerAndGetThreads()` waits until at least one sample appears in the
main thread, before capturing a profile and stopping the profiler.
`stopProfilerNowAndGetThreads()` does not wait; this should be useful in tests
that may not generate any samples (e.g., using the nostacksampling feature).
Differential Revision: https://phabricator.services.mozilla.com/D49232
--HG--
extra : moz-landing-system : lando
We want to remove flat strings (JSFlatString). With this patch we only expose
linear strings (JSLinearString) to API consumers.
This is very mechanical for the most part, because code typically only cares
about linear strings and not the null-termination aspect.
CTypes's Library.cpp has some Windows-specific code where we relied on null-terminated
strings. This patch adds JS_CopyStringCharsZ for that use case.
Differential Revision: https://phabricator.services.mozilla.com/D48314
--HG--
extra : moz-landing-system : lando
This patch creates a new API to the nsIProfiler interface, through the activeConfiguration
property. It exposes the internal configuration of the profiler. In addition, this information
is serialized into the profile meta object.
Differential Revision: https://phabricator.services.mozilla.com/D48733
--HG--
extra : moz-landing-system : lando
Having `mProfileBuffer` be a pointer is not really helpful:
- The pointer is never null (It's allocated on ActivePS construction, and
implicitly deleted on ActivePS destruction); it's never moved-from.
- It requires an extra `new` and an extra `delete`.
Differential Revision: https://phabricator.services.mozilla.com/D48650
--HG--
extra : moz-landing-system : lando
Add assertions that all `sInstance` pointers (from both `CorePS` and `ActivePS`)
are not null before being dereferenced.
This is probably more than needed, but it's only `MOZ_ASSERT`s limited to
Nightly, and it should give better feedback in case something goes wrong.
Eventually, I think it would be better to make most methods non-static, and have
a checked reference-to-instance getter.
Differential Revision: https://phabricator.services.mozilla.com/D48649
--HG--
extra : moz-landing-system : lando
We are not simply excluding all about:blanks because there might be some
about:blank that user really visits. But for others we don't want to include
the first about:blank because when a BrowsingContext is loaded, and if the
principal matches, the first document loaded in it will share the inner window.
Differential Revision: https://phabricator.services.mozilla.com/D47067
--HG--
extra : moz-landing-system : lando
We were keeping nsDocShell::mHistoryId and nsDocShell::mOSHE as keys. They
weren't quite good because:
1. While loading an iframe, they were being registered twice with the same
ids(for about:blank and the real URL) sometimes.
2. It wasn't possible to access to the parent mHistoryId and mOSHE from a child
processes if the parent is in a different process. That may not be the case for
now, but it will be after fission.
So we had to find other IDs to:
1. Determine the Tab of the frames.
2. Determine the URLs of the frames.
For the first use case, we were using nsDocShell::mHistoryId for that purpose
but that was wrong. The closest thing that we can get to a tab ID is
BrowsingContext ID because they don't change after a navigation. But iframes
have different BrowsingContext's, so we still need to create a tree to
construct a tab content. That can be either in the front-end or capture time.
For the second use case, we were using a key pair of mHistoryId and mOSHE. We
now chose to keep inner window IDs for that purpose. Inner window IDs are
unique for each navigation loads because inner window correspond to each JS
window global objects. That's why we can use that without any problem. But one
problem is that we cannot handle `history.pushState` and `history.replaceState`
changes with that change since window global objects won't change during those.
But that was the best thing we can do after fission. So this will be a small
sacrifice for us to keep that functionality working after fission.
In that patch we also remove the registration/unregistration calls. We are
going to add those calls in the next patch.
Differential Revision: https://phabricator.services.mozilla.com/D47065
--HG--
extra : moz-landing-system : lando
Test more of the ProfilerMarkerPayload's, all those from ProfilerMarkerPayload.h
that did not require external structures. (More to be added in the future.)
These tests focus on the profiler, by simulating markers as we expect them, and
verifying that the corresponding output is correct JSON as expected by the
frontend.
Differential Revision: https://phabricator.services.mozilla.com/D48327
--HG--
extra : moz-landing-system : lando
This does not test more markers than before. But instead of searching for
strings in the raw json, expected values are checked in the parsed output
profile.
Also some `ASSERT_...` tests could be changed to `EXPECT_...`, because their
failures would not prevent running more of the test.
Differential Revision: https://phabricator.services.mozilla.com/D48326
--HG--
extra : moz-landing-system : lando
Previously, the absence of "stackwalk", "leaf", and "javascript" implied that
the test/user didn't want any sampling, but this caused issues in some tests
that enabled "stackwalk" on platforms that didn't support stack-walking, which
ended up suppressing label-only stacks that the test expected.
we now have an explicit feature "nostacksampling" that disables backtraces from
the samplers in both profilers. This effectively cancels "stackwalk", "leaf",
and "javascript" if present.
Differential Revision: https://phabricator.services.mozilla.com/D47731
--HG--
extra : moz-landing-system : lando
Previously, the absence of "stackwalk", "leaf", and "javascript" implied that
the test/user didn't want any sampling, but this caused issues in some tests
that enabled "stackwalk" on platforms that didn't support stack-walking, which
ended up suppressing label-only stacks that the test expected.
we now have an explicit feature "nostacksampling" that disables backtraces from
the samplers in both profilers. This effectively cancels "stackwalk", "leaf",
and "javascript" if present.
Differential Revision: https://phabricator.services.mozilla.com/D47731
--HG--
extra : moz-landing-system : lando
If all sampling-related features ("Native Stacks", "JavaScript", and "Native
Leaf Stack") are OFF, the sampler loop will not record any stack samples, not
even from labels, which should reduce the profiler overhead significantly.
This means that the sampling rate could be slowed down (up to 5s interval), to
help reduce the power consumption incurred by wake-ups for sampling.
Markers are not affected by this, and will all be recorded as expected.
However counters (e.g., memory allocations) are still tied to sampling, so their
sampling resolution will be reduced to whatever sampling rate is chosen.
Some existing tests relied on stack sampling happening, so they now enable at
least "leaf". Bug 1579333 may revisit these tests for a better solution (if
possible).
Differential Revision: https://phabricator.services.mozilla.com/D46753
--HG--
extra : moz-landing-system : lando
`BlocksRingBuffer` had an "entry destructor" to make it a more generic
container, and it was useful during early prototyping of the new profiler
storage (so that we could store owning pointers).
But this entry destructor is stored in an `std::function`, which gets marked as
a potential GC caller by the js rooting hazard analyzer; and as this bug found,
it's not obvious where to place `JS::AutoSuppressGCAnalysis`, because profiler
entries (including stacks) could be added on one thread while GC happens
elsewhere, which triggers the embedded `AutoAssertNoGC` check.
Since we don't actually use the entry destructor facility in the profiler, it's
easier to just get rid of it. As a bonus, it's a small optimization.
Tests that were using an entry destructor now use the `State` instead, to verify
that entries are pushed and cleared as expected.
If needed in the future outside of the profiler, `BlocksRingBuffer` could again
include an entry destructor, but it would have to be through templating, so that
the class used in the profiler wouldn't contain an `std::function`.
Differential Revision: https://phabricator.services.mozilla.com/D46738
--HG--
extra : moz-landing-system : lando