The main `BlocksRingBuffer`s will be stored in `CorePS` (outside of
`ProfileBuffer`s), as we need to be able to safely access the underlying buffers
when profilers are not enabled.
Also `ProfilerBacktrace` will own the `BlocksRingBuffer` that its captured
`ProfileBuffer` uses.
Taking this opportunity to rename the different `mBuffer`s to more useful names.
Differential Revision: https://phabricator.services.mozilla.com/D43422
--HG--
extra : moz-landing-system : lando
Now that we are using a byte-oriented `BlocksRingBuffer` instead of an array of
9-byte `ProfileBufferEntry`'s, internally the profiler sets the buffer size in
bytes. However all external users (popup, tests, etc.) still assume that the
requested capacity is in entries!
To limit the amount of changes, we will keep assuming externally-visible
capacities are in entries, and convert them to bytes.
Even though entries used to be 9 bytes each, and `BlocksRingBuffer` adds 1 byte
for the entry size, some entries actually need less space (e.g., 32-bit numbers
now take 6 bytes instead of 9), so converting to less than 9 bytes per entry is
acceptable.
We are settling on 8 bytes per entry: It's close to 9, and it's a power of two;
since the effective number of entries was a power of two, and `BlocksRingBuffer`
also uses a power of two size in bytes, this convertion keeps sizes in powers of
two.
Differential Revision: https://phabricator.services.mozilla.com/D44953
--HG--
extra : moz-landing-system : lando
This just replaces `ProfileBuffer`'s self-managed circular buffer with a
`BlocksRingBuffer`.
That `BlocksRingBuffer` does not need its own mutex (yet), because all uses go
through gPSMutex-guarded code.
`ProfileBuffer` now also pre-allocates a small buffer for use in
`DuplicateLastSample()`, this avoids multiple mallocs at each sleeping thread
stack duplication.
Note: Internal "magic" sizes have been multiplied by 8 (and tweaked upwards, to
handle bigger stacks), because they originally were the number of 9-byte
entries, but now it's the buffer size in bytes. (And entries can now be smaller
than 9 bytes, so overall the capacity in entries should be similar or better.)
However, external calls still think they are giving a number of "entries", this
will be handled in the next patch.
Differential Revision: https://phabricator.services.mozilla.com/D43421
--HG--
extra : moz-landing-system : lando
Add some stats (off by default) around streaming JSON, as the following patches
may affect it.
Differential Revision: https://phabricator.services.mozilla.com/D44952
--HG--
extra : moz-landing-system : lando
In some situations we will *need* to use a `BlocksRingBuffer` that absolutely
does not use a mutex -- In particular in the critical section of
`SamplerThread::Run()`, when a thread is suspended, because any action on any
mutex (even a private one that no-one else interacts with) can result in a hang.
As a bonus, `BlocksRingBuffer`s that are known not to be used in multi-thread
situations (e.g., backtraces, extracted stacks for duplication, etc.) will waste
less resources trying to lock/unlock their mutex.
Differential Revision: https://phabricator.services.mozilla.com/D45305
--HG--
extra : moz-landing-system : lando
`BaseProfilerMaybeMutex` wraps a `BaseProfilerMutex` inside a `Maybe`.
The decision to use a mutex or not is set at construction time.
If there is no mutex, all operations do nothing (at the small cost of checking
if the mutex is present.)
`BaseProfilerMaybeAutoLock` is the recommented RAII object to lock and
automatically unlock a `BaseProfilerMaybeMutex` until the end of a block scope.
Differential Revision: https://phabricator.services.mozilla.com/D45304
--HG--
extra : moz-landing-system : lando
Instead of copying `BlocksRingBuffer` data byte-by-byte (using iterators byte
dereferencers), we can now use `ModuloBuffer::Iterator::ReadInto(Iterator&)` to
copy them using a small number of `memcpy`s.
Differential Revision: https://phabricator.services.mozilla.com/D45839
--HG--
extra : moz-landing-system : lando
Some objects are copied byte-by-byte to/from `ModuloBuffer`s.
E.g., serialized `BlocksRingBuffer`s, or duplicate stacks. (And more to come.)
`Iterator::ReadInto(Iterator&)` optimizes these copies by using the minimum
number of `memcpy`s possible.
Differential Revision: https://phabricator.services.mozilla.com/D45838
--HG--
extra : moz-landing-system : lando
profiler_can_accept_markers() is a fast and racy check that markers would
currently be stored, it should be used around potentially-expensive calls to
add markers.
And now markers are no longer stored when the profiler is paused. (Note that the
profiler is paused when a profile is being stored, this will help make this
operation faster.)
Differential Revision: https://phabricator.services.mozilla.com/D44434
--HG--
extra : moz-landing-system : lando
Since all profiler data is now stored inside ProfileBuffer, there is no real
need to continuously discard old data during sampling (this was particularly
useful to reclaim memory taken by old markers&payloads).
Instead, we can now just discard the old data once, just before starting to
stream it to JSON.
Differential Revision: https://phabricator.services.mozilla.com/D44433
--HG--
extra : moz-landing-system : lando
Now that what was in ProfilerMarker is stored directly in `BlocksRingBuffer`,
there is no need for this class anymore!
This also removes all the pointer management around it (when added to a TLS
list, moved during sampling, deleted when expired).
Differential Revision: https://phabricator.services.mozilla.com/D43431
--HG--
extra : moz-landing-system : lando
Since payloads are not kept alive long after they have been serialized, we can
just create them on the stack and pass a reference to their base (or pointer,
`nullptr` representing "no payloads") to `profiler_add_marker()`.
Differential Revision: https://phabricator.services.mozilla.com/D43430
--HG--
extra : moz-landing-system : lando
Markers and their payloads can now be serialized straight into the profiler's
`BlocksRingBuffer`, which is now thread-safe to allow such concurrent accesses
(even when gPSMutex is not locked).
This already saves us from having to allocate a `PayloadMarker` on the heap, and
from having to manage it in different lists.
The now-thread-safe `BlocksRingBuffer` in `CorePS` cannot be used inside the
critical section of `SamplerThread::Run`, because any mutex function could hang
because of the suspended thread (even though they functionally don't appear to
interact). So the sampler now uses a local `BlocksRingBuffer` without mutex.
As a bonus, the separate buffer helps reduce the number of concurrent locking
operations needed when capturing the stack.
Differential Revision: https://phabricator.services.mozilla.com/D43429
--HG--
extra : moz-landing-system : lando
Copy the full contents of BlocksRingBuffer into another one.
This is mainly useful to use a temporary buffer to store some data without
contentions, and then integrate the temporary buffer into the main one.
Differential Revision: https://phabricator.services.mozilla.com/D45306
--HG--
extra : moz-landing-system : lando
Payloads will serialize themselves into a `BlocksRingBuffer` entry when first
captured.
Later they will be deserialized, to stream JSON for the output profile.
Differential Revision: https://phabricator.services.mozilla.com/D43428
--HG--
extra : moz-landing-system : lando
The common data members stored in the ProfilerMarkerPayload base class can be
gathered into a struct, which will make it easier to pass around, especially
when a derived object is constructed with these common properties.
Differential Revision: https://phabricator.services.mozilla.com/D43427
--HG--
extra : moz-landing-system : lando
Markers may contain backtraces, so we need to be able to de/serialize them.
This involves de/serializing the underyling `BlocksRingBuffer`.
Differential Revision: https://phabricator.services.mozilla.com/D43426
--HG--
extra : moz-landing-system : lando
Add basic testing of `profiler_get_backtrace` before working on serializing it.
Differential Revision: https://phabricator.services.mozilla.com/D43424
--HG--
extra : moz-landing-system : lando
The main `BlocksRingBuffer`s will be stored in `CorePS` (outside of
`ProfileBuffer`s), as we need to be able to safely access the underlying buffers
when profilers are not enabled.
Also `ProfilerBacktrace` will own the `BlocksRingBuffer` that its captured
`ProfileBuffer` uses.
Taking this opportunity to rename the different `mBuffer`s to more useful names.
Differential Revision: https://phabricator.services.mozilla.com/D43422
--HG--
extra : moz-landing-system : lando
Now that we are using a byte-oriented `BlocksRingBuffer` instead of an array of
9-byte `ProfileBufferEntry`'s, internally the profiler sets the buffer size in
bytes. However all external users (popup, tests, etc.) still assume that the
requested capacity is in entries!
To limit the amount of changes, we will keep assuming externally-visible
capacities are in entries, and convert them to bytes.
Even though entries used to be 9 bytes each, and `BlocksRingBuffer` adds 1 byte
for the entry size, some entries actually need less space (e.g., 32-bit numbers
now take 6 bytes instead of 9), so converting to less than 9 bytes per entry is
acceptable.
We are settling on 8 bytes per entry: It's close to 9, and it's a power of two;
since the effective number of entries was a power of two, and `BlocksRingBuffer`
also uses a power of two size in bytes, this convertion keeps sizes in powers of
two.
Differential Revision: https://phabricator.services.mozilla.com/D44953
--HG--
extra : moz-landing-system : lando
This just replaces `ProfileBuffer`'s self-managed circular buffer with a
`BlocksRingBuffer`.
That `BlocksRingBuffer` does not need its own mutex (yet), because all uses go
through gPSMutex-guarded code.
`ProfileBuffer` now also pre-allocates a small buffer for use in
`DuplicateLastSample()`, this avoids multiple mallocs at each sleeping thread
stack duplication.
Note: Internal "magic" sizes have been multiplied by 8 (and tweaked upwards, to
handle bigger stacks), because they originally were the number of 9-byte
entries, but now it's the buffer size in bytes. (And entries can now be smaller
than 9 bytes, so overall the capacity in entries should be similar or better.)
However, external calls still think they are giving a number of "entries", this
will be handled in the next patch.
Differential Revision: https://phabricator.services.mozilla.com/D43421
--HG--
extra : moz-landing-system : lando
Add some stats (off by default) around streaming JSON, as the following patches
may affect it.
Differential Revision: https://phabricator.services.mozilla.com/D44952
--HG--
extra : moz-landing-system : lando
In some situations we will *need* to use a `BlocksRingBuffer` that absolutely
does not use a mutex -- In particular in the critical section of
`SamplerThread::Run()`, when a thread is suspended, because any action on any
mutex (even a private one that no-one else interacts with) can result in a hang.
As a bonus, `BlocksRingBuffer`s that are known not to be used in multi-thread
situations (e.g., backtraces, extracted stacks for duplication, etc.) will waste
less resources trying to lock/unlock their mutex.
Differential Revision: https://phabricator.services.mozilla.com/D45305
--HG--
extra : moz-landing-system : lando
`BaseProfilerMaybeMutex` wraps a `BaseProfilerMutex` inside a `Maybe`.
The decision to use a mutex or not is set at construction time.
If there is no mutex, all operations do nothing (at the small cost of checking
if the mutex is present.)
`BaseProfilerMaybeAutoLock` is the recommented RAII object to lock and
automatically unlock a `BaseProfilerMaybeMutex` until the end of a block scope.
Differential Revision: https://phabricator.services.mozilla.com/D45304
--HG--
extra : moz-landing-system : lando
Having a full VPATH for the srcdir sometimes causes make to grab the
wrong prerequisite for a rule, in particular if we have a file in the
srcdir and also generate a file of the same name in the objdir. We don't
really need VPATH anymore though, since most of the information comes
from mozbuild, where we can explicitly list the path to the srcdir or
objdir as necessary.
Differential Revision: https://phabricator.services.mozilla.com/D42968
--HG--
extra : moz-landing-system : lando
`WindowsErrorResult` is a class to hold either a value or a Windows error
code based on the `Result` template. We also have `LauncherResult` for the
same purpose, which was introduced as a part of the launcher process feature
afterward. The difference is `LauncherResult` holds a filename and line
number along with an error code.
This patch integrates LauncherResult.h into WinHeaderOnlyUtils.h so that we
can use `LauncherResult` more broadly.
Differential Revision: https://phabricator.services.mozilla.com/D44512
--HG--
extra : moz-landing-system : lando
Gather stats for most calls to `profiler_add_marker()`, including the time to
allocate payloads if any.
Differential Revision: https://phabricator.services.mozilla.com/D43420
--HG--
extra : moz-landing-system : lando
All calls to `profiler_add_marker()` (outside of the profilers code) are
now replaced by either:
- `PROFILER_ADD_MARKER(name, categoryPair)`
- `PROFILER_ADD_MARKER_WITH_PAYLOAD(name, categoryPair, TypeOfMarkerPayload,
(payload, ..., arguments))`
This makes all calls consistent, and they won't need to prefix the category pair
with `JS::ProfilingCategoryPair::`.
Also it will make it easier to add (and later remove) internal-profiling
instrumentation (bug 1576550), and to replace heap-allocated payloads with
stack-allocated ones (bug 1576555).
Differential Revision: https://phabricator.services.mozilla.com/D43588
--HG--
extra : moz-landing-system : lando
Bug 1556993 fixed the crash in PoisonIOInterposer (due to missing stdout&stderr
fd's), bug 1559379 fixed the mozglue memory issue that the unit test
experienced, and this patch also includes a fix for the test failure in
browser_checkdllblockliststate.js.
So now Base Profiler can be enabled by default on Windows, still using
`MOZ_BASE_PROFILER_...` env-vars for now (upcoming bug will merge these with
non-BASE env-vars soon).
Differential Revision: https://phabricator.services.mozilla.com/D44091
--HG--
extra : moz-landing-system : lando
Whereas previously MozDescribeCodeAddress would have handled demangling,
we need to explicitly do that from our new GetFunction method. The string we
generate is now more useful for the profiler to merge -- having dropped the
address in the previous patch, and the file & line number and library in this
patch.
While we're at it, try to demangle Rust symbols too.
Ideally we'd add Rust symbol handling to DemangleSymbol in
StackWalk.cpp, but that lives in mozglue, which currently cannot have
any Rust crate dependencies.
Differential Revision: https://phabricator.services.mozilla.com/D43142
--HG--
extra : moz-landing-system : lando
Use AUTO_PROFILER_STATS in both profilers, in:
- SamplerThread::Run() calling DoPeriodicSample()
- racy_profiler_add_marker
- ProfileBuffer::DeleteExpiredStoredMarkers()
This should cover all areas affected by the upcoming changes to the
ProfileBuffer storage, and how markers are stored.
Differential Revision: https://phabricator.services.mozilla.com/D42826
--HG--
extra : moz-landing-system : lando
AUTO_PROFILER_STATS(name) can be used to time a {block}.
Statistics are gathered in a function-static variable, and printf'd when the
program ends.
Differential Revision: https://phabricator.services.mozilla.com/D42825
--HG--
extra : moz-landing-system : lando
Some users will want to lock the buffer but not do any specific operation with
it.
Differential Revision: https://phabricator.services.mozilla.com/D42824
--HG--
extra : moz-landing-system : lando
Backtraces (that are kept in some marker payloads) are stored in a small
ProfileBuffer, we will need to store that data, which will happen to be inside
a BlockRingBuffer, so BlockRingBuffer needs to be able to (de)serialize itself!
This is done by storing the contents in the active buffer range, and some extra
data, to later reconstruct a BlocksRingBuffer that looks like the original.
Depends on D42496
Differential Revision: https://phabricator.services.mozilla.com/D42634
--HG--
extra : moz-landing-system : lando
Markers and their payloads contain all kinds of objects that we'll need to
serialize into a BlocksRingBuffer (new ProfileBuffer storage).
This patch will add functions to:
- Compute the size needed to store objects,
- Write multiple objects into a BlockRingBuffer entry,
- Read objects back from an entry.
And it will provide a number of useful de/serialization helpers for:
- Trivially-copyable objects,
- Strings of different types,
- Raw pointers (with some safety guards to avoid surprises),
- Tuples (to store multiple sub-objects),
- Spans,
- Maybe (for optional objects),
- Variant.
This should be enough to store most kinds of data. Further specializations
can&will be written as necessary for more complex or obscure types.
Differential Revision: https://phabricator.services.mozilla.com/D42496
--HG--
extra : moz-landing-system : lando
This allows its use in std algorithms and types that require a real iterator
(like `template <typename InputIt> std::string::string(InputIt, InputIt)`).
Differential Revision: https://phabricator.services.mozilla.com/D42452
--HG--
extra : moz-landing-system : lando
`Reader::At()` can be used to get a `BlockIterator` at a given `BlockIndex`,
clamped between `begin()` and `end()`.
This will be useful when we want to iterate starting at a given index, e.g.,
when duplicating stacks.
Differential Revision: https://phabricator.services.mozilla.com/D42449
--HG--
extra : moz-landing-system : lando
The main goal of these bugs is to move markers to a new storage, so I'm adding
lots of markers to TestBaseProfiler.
Also adding labels, easier to read unsymbolicated profiles, and gives a bit more
coverage too.
And adding a separate "fibonacci canceller" thread, which is needed on some
slower platforms (e.g., Linux 64 ASAN times out otherwise); as a bonus, this
tests AUTO_BASE_PROFILER_REGISTER_THREAD.
Differential Revision: https://phabricator.services.mozilla.com/D42448
--HG--
extra : moz-landing-system : lando
(Also MOZ_BASE_PROFILER_STARTUP.)
This makes it easier to keep that variable setup in the environment, and change
its value to switch between enabling and disabling the profiler.
Differential Revision: https://phabricator.services.mozilla.com/D42447
--HG--
extra : moz-landing-system : lando
In practice the Reader doesn't need to be copied/moved/reassign.
BlocksRingBuffer::Read() can just instantiate one on the stack, and pass it by
reference to callbacks.
Differential Revision: https://phabricator.services.mozilla.com/D42118
--HG--
extra : moz-landing-system : lando
The point of the EntryReserver was mainly to have an object that represented a
writing lock on BlocksRingBuffer, so potentially perform multiple consecutive
writes.
After some experience implementing bug 1562604, there's actually no need for it.
So instead of having `Put()` create an `EntryReserver`, we now have
`ReserveAndPut()` that does the whole work in one function.
Differential Revision: https://phabricator.services.mozilla.com/D42116
--HG--
extra : moz-landing-system : lando
EntryWriter doesn't even need to be moveable, as BlocksRingBuffer can just
create one on the stack, and pass it by reference to callbacks.
This removes risks, and potential data copies.
Differential Revision: https://phabricator.services.mozilla.com/D42115
--HG--
extra : moz-landing-system : lando
Actually, we use clang for all Linux and Android platform, so it is no reason to use frame pointer based stack walker for LUL on Android/x86 and Android/x86-64 if no DWARF rule.
Differential Revision: https://phabricator.services.mozilla.com/D41320
--HG--
extra : moz-landing-system : lando
While mozglue continues to be the correct location for calling the affected
code in this patch, the calls requiring profiler labels will soon be
originating from firefox.exe via the launcher process.
mozglue will be supplying the launcher process with an interface that consists
of what are effectively "OnBeginDllLoad" and "OnEndDllLoad" callback
notifications; obviously an RAII class is not going to be useful for that case.
We still want to keep the RAII stuff around, however, since we still need it
for cases where we need to fall back to using the legacy DLL blocklist.
Differential Revision: https://phabricator.services.mozilla.com/D41807
--HG--
extra : moz-landing-system : lando
While mozglue continues to be the correct location for calling the affected
code in this patch, the calls requiring stackwalk suppression will soon be
originating from firefox.exe via the launcher process.
mozglue will be supplying the launcher process with an interface that consists
of what are effectively "OnBeginDllLoad" and "OnEndDllLoad" callback
notifications; obviously an RAII class is not going to be useful for that case.
We still want to keep the RAII stuff around, however, since we still need it
for cases where we need to fall back to using the legacy DLL blocklist.
Differential Revision: https://phabricator.services.mozilla.com/D41808
--HG--
extra : moz-landing-system : lando
`BlocksRingBuffer` will be used both inside and outside `ProfileBuffer`:
- Inside to serve as `ProfileBuffer`'s main storage for stack traces,
- Outside to allow marker storage even when `ProfileBuffer` is locked during
stack sampling.
`ProfileBuffer` only exists while `ActivePS` is alive, but because of the
potential outside accesses above (due to small races between ProfileBuffer
shutdown, and thread-local IsBeingProfiled() flags), we cannot just do the same
for BlocksRingBuffer, and it must remain alive to gracefully deny these accesses
around the profiler startup and shutdown times.
To accomplish this, `BlocksRingBuffer` may be in different states:
- "In-session", we have a real buffer to write to and read from,
- "Out-of-session", without buffer so reads&writes do nothing.
This is implemented by enclosing the underlying `ModuloBuffer` and the entry
deleter in a `Maybe`, which may be `Nothing` when the profiler is not running
and the `ProfileBuffer`'s `BlocksRingBuffer` is out-of-session.
Differential Revision: https://phabricator.services.mozilla.com/D41519
--HG--
extra : moz-landing-system : lando
`BlocksRingBuffer` will be used both inside and outside `ProfileBuffer`:
- Inside to serve as `ProfileBuffer`'s main storage for stack traces,
- Outside to allow marker storage even when `ProfileBuffer` is locked during
stack sampling.
`ProfileBuffer` only exists while `ActivePS` is alive, but because of the
potential outside accesses above (due to small races between ProfileBuffer
shutdown, and thread-local IsBeingProfiled() flags), we cannot just do the same
for BlocksRingBuffer, and it must remain alive to gracefully deny these accesses
around the profiler startup and shutdown times.
To accomplish this, `BlocksRingBuffer` may be in different states:
- "In-session", we have a real buffer to write to and read from,
- "Out-of-session", without buffer so reads&writes do nothing.
This is implemented by enclosing the underlying `ModuloBuffer` and the entry
deleter in a `Maybe`, which may be `Nothing` when the profiler is not running
and the `ProfileBuffer`'s `BlocksRingBuffer` is out-of-session.
Differential Revision: https://phabricator.services.mozilla.com/D41519
--HG--
extra : moz-landing-system : lando
After some bad experiences, I think EntryReader should be move-only:
- It needs to be moveable so it can be created from a function, and move-
constructed into a Maybe<> if needed.
- It can be passed around as a reference.
Previously, it could be passed by value, but it was too easy to create bugs,
e.g.: A function delegates to a sub-function to read something at the beginning,
then the first function wants to read more past that, but if the reader was
passed by value the first function would not see past what the sub-function did
read.
As a bonus, `mRing` can now be a reference instead of a pointer, and other
members can be const.
Differential Revision: https://phabricator.services.mozilla.com/D40958
--HG--
extra : moz-landing-system : lando
This patch does two things:
1. We refactor the resolution of function pointer and return type so that we
may support additional calling conventions besides just __stdcall;
2. We refactor DynamicallyLinkedFunctionPtr into a base class, and create
StaticDynamicallyLinkedFunctionPtr to specifically handle the static local
use case.
Differential Revision: https://phabricator.services.mozilla.com/D40885
--HG--
extra : moz-landing-system : lando
It makes little sense to copy a writer (also an output iterator).
We're keeping it move-constructible (so it can be passed around for construction
purposes), but not move-assignable to help make more members `const`.
Differential Revision: https://phabricator.services.mozilla.com/D40622
--HG--
extra : moz-landing-system : lando
This makes it easier to grab all BlocksRingBuffer state variables:
- Range start and end.
- Number of pushed blocks/entries, number of cleared blocks/entries.
The function is thread-safe, and the returned values are consistent with each
other, but they may become stale straight after the function returns (and the
lock is released).
They are still valuable to statistics, and to know how far the range has at
least reached (but may go further soon).
Differential Revision: https://phabricator.services.mozilla.com/D40621
--HG--
extra : moz-landing-system : lando
Renamed `BPAutoLock` to `BaseProfilerAutoLock`.
DEBUG-build `~BaseProfilerMutex()` checks that it is unlocked.
Prevent `BaseProfilerMutex` and `BaseProfilerAutoLock` copies&moves.
DEBUG-build check that `Lock()` sees `mOwningThreadId`==0 (because that is the
initial value, and the value after a previous `Unlock()`).
Don't preserve atomic `mOwningThreadId` in JS recording.
Differential Revision: https://phabricator.services.mozilla.com/D40620
--HG--
extra : moz-landing-system : lando
`ProfilerMarkerPayload::Set...()` functions are only used by derived classes in
the same files, and these values could just be set during construction.
Differential Revision: https://phabricator.services.mozilla.com/D40619
--HG--
extra : moz-landing-system : lando
cppunittest TestBaseProfiler and gtest GeckoProfiler.Markers now show overhead
stats.
(Separate patch, because we may want to remove them after a while.)
Differential Revision: https://phabricator.services.mozilla.com/D39642
--HG--
extra : moz-landing-system : lando
`ProfileBuffer` is now responsible for collecting overhead stats, and adding
them to the struct returned by `profiler_get_buffer_info()`.
Differential Revision: https://phabricator.services.mozilla.com/D39641
--HG--
extra : moz-landing-system : lando
`SamplerThread` inheriting from `Sampler` was a bit confusing, and scary with no
virtual destructor&functions.
`SamplerThread` only uses `Sampler`'s `Disable()` and
`SuspendAndSampleAndResumeThread()` functions, and `SamplerThread` is never
accessed through a `Sampler` reference/pointer.
So `SamplerThread` can just own a `Sampler` to make that relationship clearer.
Differential Revision: https://phabricator.services.mozilla.com/D39640
--HG--
extra : moz-landing-system : lando
This of course checks that the mutex is locked as expected in non-public APIs.
It also checks that user callbacks will not keep readers/writers longer than
they should.
Differential Revision: https://phabricator.services.mozilla.com/D39625
--HG--
extra : moz-landing-system : lando
`BaeProfilerMutex` is a concrete mutex based on MutexImpl, which was previously
implemented twice in both platform.h and BlocksRingBuffer.h.
This combined mutex has some DEBUG code (when MOZ_BASE_PROFILER is #defined) to
catch recursive locking, and to assert that the mutex is held (for code that
cannot easily use the "proof of lock" pattern; e.g., going through user-provided
callbacks).
This class needs to be public (because it is used in public headers), but is an
implementation detail, so it is located in a new header
"mozilla/BaseProfilerDetail.h" that will collect `mozilla::baseprofiler::detail`
code that may be useful to a few files in Base Profiler.
Differential Revision: https://phabricator.services.mozilla.com/D39624
--HG--
extra : moz-landing-system : lando
`ClearBefore()` with a past-the-end `BlockIndex` was calling `Clear()`, which
tried to take the lock again! Also we didn't return after that.
Fixed, and added corresponding test.
Also: Removed ambiguous "delete" word, now using more precise "destroy" or
"entry destructor".
Differential Revision: https://phabricator.services.mozilla.com/D38846
--HG--
extra : moz-landing-system : lando
This is a similar concept as `nullptr` is to a pointer.
`BlocksRingBuffer` now skips the first byte in the buffer, so that no entries
start at 0 (the internal default `BlockIndex` value).
All `BlocksRingBuffer` public APIs handle this default value, and do nothing
and/or return Nothing (as if it pointed at an already-deleted entry).
Added tests for this, and for all BlockIndex operations.
Differential Revision: https://phabricator.services.mozilla.com/D38667
--HG--
extra : moz-landing-system : lando
Without declaring them, ModuloBuffer had its copy&move constructor&assignments
defaulted. This means it could have been copied, and then both objects would now
own the same resource and attempt to free it on destruction!
So now:
- Copy construction&assignment are now explicitly disallowed.
- Move assignment is disallowed, to keep some members `const`.
- Move construction is allowed (so a function can return a ModuloBuffer), and
ensures that the moved-from object won't free the resource anymore.
Bonus: `mBuffer` is now `const`, to ensure that it cannot point at something
else, but note the pointed-at bytes are *not* const.
So ModuloBuffer is like an unchanging resource, but it allows to be moved-from
as an xvalue that should not be used after the move.
Differential Revision: https://phabricator.services.mozilla.com/D38665
--HG--
extra : moz-landing-system : lando
By default `ModuloBuffer` allocates its own buffer on the heap.
Now `ModuloBuffer` adds two alternatives:
- Take ownership of a pre-allocated `UniquePtr<uint8_t>` buffer.
- Work over an unowned `uint8_t*` array. The caller is responsible for
ownership, and ensuring that the array lives at least as long as the
`ModuloBuffer`/`BlocksRingBuffer`.
`BlocksRingBuffer` can pass along these new options to its underlying
`ModuloBuffer`.
The main use will be for small on-stack `BlocksRingBuffer` that can store a
stack trace, or to more easily collect data (without allocating anything on the
heap) that can then go into the upcoming `ProfileBuffer`'s `BlocksRingBuffer`.
Differential Revision: https://phabricator.services.mozilla.com/D38285
--HG--
extra : moz-landing-system : lando
This adds to the byte-oriented ModuloBuffer from bug 1563425:
- Thread-safety: All APIs may be called at any time from any thread.
- Structure: The buffer will be divided in "blocks" of different size, with some
block meta-data and space for the user "entry".
- Capable of handling user resources: The user may provide a "deleter" that will
be informed about soon-to-be-destroyed entries; so if some entries reference
outside resources, these references may be properly released.
Note: This first implementation still only allows the user to manipulate bytes
and trivially-copyable objects (same as with the ModuloBuffer iterators). A
follow-up bug will introduce better serialization capabilities, with the aim to
eventually store everything that current Profiler Markers and their payloads
contain.
Differential Revision: https://phabricator.services.mozilla.com/D37702
--HG--
extra : moz-landing-system : lando
As we are increasingly moving toward enabling new types of DLL blocking across
our various process types, we need to be able to generate various headers in
various distinct formats.
This script enables us to use a unified DLL blocklist input that generates
these distinct headers. From WindowsDllBlocklistDefs.in, we generate:
WindowsDllBlocklistA11yDefs.h - definitions for a11y
WindowsDllBlocklistLauncherDefs.h - definitions for the launcher process
WindowsDllBlocklistLegacyDefs.h - definitions for the legacy mozglue blocklist
WindowsDllBlocklistTestDefs.h - test-only definitions
These headers are then exported to mozilla.
Note that not all headers use the same format, as not all consumers of these
headers have identical workings. There will be additional header types added
in the future which diverge even more from the standard blocklist format. While
this work may seem a bit pointless at the moment, it will become more necessary
in the future. In particular, this work is a prerequisite for bug 1238735.
Differential Revision: https://phabricator.services.mozilla.com/D36993
--HG--
extra : moz-landing-system : lando
Since higher-level APIs that we test may depend on lower-level APIs that we
also test, and since those higher-level APIs may spin up background threads
that call those lower-level APIs, we should ensure that tests are ordered
such that the lower-level APIs are hooked first, thus preventing races where
higher-level background threads call lower-level APIs while the test's main
thread is in the midst of hooking a lower-level API.
I also added some fflush calls to the test so that, the next time we see lots
of crashes in this test, the log output is more complete.
Differential Revision: https://phabricator.services.mozilla.com/D37497
--HG--
extra : moz-landing-system : lando
Basic usage:
- Create buffer: `ModuloBuffer mb(PowerOfTwo);`
- Get iterator: `auto writer = mb.WriterAt(Index);` (or `ReaderAt()`)
- Basic iterator functions on bytes: `*++writer = 'x';`
- Write: `writer.WriteULEB128(sizeof(int)); writer.WriteObject<int>(42);`
- Comparisons, move: `while (writer > reader) { --writer; }`
- Read: `size_t s = reader.ReadULEB128<size_t>(); int i = ReadObject<int>();`
There are no safety checks, it will be up to the caller to ensure thread-safety,
and data safety when wrapping around the buffer.
Differential Revision: https://phabricator.services.mozilla.com/D36869
--HG--
extra : moz-landing-system : lando
Bug 1556993 fixed the crash in PoisonIOInterposer (due to missing stdout&stderr
fd's), and bug 1559379 fixed the mozglue memory issue that the unit test
experienced.
So now Base Profiler can be enabled by default on Windows, still using
`MOZ_BASE_PROFILER_...` env-vars for now (upcoming bug will merge these with
non-BASE env-vars soon).
Differential Revision: https://phabricator.services.mozilla.com/D37355
--HG--
extra : moz-landing-system : lando
Now that Gecko Profiler only registers its entry&exit functions when running,
and it ensures that Base Profiler is stopped beforehand, Base Profiler can now
register its own entry&exit functions to capture labels before xpcom starts.
Differential Revision: https://phabricator.services.mozilla.com/D34810
--HG--
extra : moz-landing-system : lando
Profilers will soon be able to set/reset entry&exit functions at different
times, but simultaneously other code may want to use AutoProfilerLabel, so we
need to make this all thread-safe.
All shared static information is now encapsulated in an RAII class that enforces
proper locking before giving access to this information.
Also added a "generation" count, so that if an AutoProfilerLabel is in-flight
when entry&exit functions are changed, the context given by the old entry
function will not be passed to a mismatched new exit function.
Differential Revision: https://phabricator.services.mozilla.com/D34807
--HG--
extra : moz-landing-system : lando
`ProfilingStack*` happens to be the information that the current Gecko Profiler
entry function wants to forward to the exit function, but AutoProfilerLabel does
not really need to know about that.
Changing it to `void*`, so that we can later use different entry/exit functions
that use different context types.
Differential Revision: https://phabricator.services.mozilla.com/D34806
--HG--
extra : moz-landing-system : lando
The new ProfileBuffer data structure will need to store block sizes (usually
small, LEB128 uses fewer bytes for small numbers), and the circular buffer will
provide iterators that hide the wrapping-around.
Differential Revision: https://phabricator.services.mozilla.com/D36473
--HG--
extra : moz-landing-system : lando
PowerOfTwo makes for a cleaner and more expressive interface, showing that the
profiler will use a power-of-2 storage size.
Using PowerOfTwoMask in ProfilerBuffer also makes it more obvious that we want
cheap modulo operations.
And we don't need to keep the original capacity, as it's only used once and can
easily be recomputed from the mask.
Differential Revision: https://phabricator.services.mozilla.com/D36027
--HG--
extra : moz-landing-system : lando
PowerOfTwo stores a power of 2 value, i.e., 2^N.
PowerOfTwoMask stores a mask corresponding to a power of 2, i.e., 2^N-1.
These should be used in places where a power of 2 (or its mask) is stored or
expected.
`% PowerOfTwo{,Mask}` and `& PowerOfTwoMask` operations are optimal.
MakePowerOfTwo{,Mask}<T, Value>() may be used to create statically-checked
constants.
{,Make}PowerOfTwo{,Mask}{32,64} shortcuts for common 32- and 64-bit types.
Differential Revision: https://phabricator.services.mozilla.com/D36026
--HG--
extra : moz-landing-system : lando
PowerOfTwo makes for a cleaner and more expressive interface, showing that the
profiler will use a power-of-2 storage size.
Using PowerOfTwoMask in ProfilerBuffer also makes it more obvious that we want
cheap modulo operations.
And we don't need to keep the original capacity, as it's only used once and can
easily be recomputed from the mask.
Differential Revision: https://phabricator.services.mozilla.com/D36027
--HG--
extra : moz-landing-system : lando
PowerOfTwo stores a power of 2 value, i.e., 2^N.
PowerOfTwoMask stores a mask corresponding to a power of 2, i.e., 2^N-1.
These should be used in places where a power of 2 (or its mask) is stored or
expected.
`% PowerOfTwo{,Mask}` and `& PowerOfTwoMask` operations are optimal.
MakePowerOfTwo{,Mask}<T, Value>() may be used to create statically-checked
constants.
{,Make}PowerOfTwo{,Mask}{32,64} shortcuts for common 32- and 64-bit types.
Differential Revision: https://phabricator.services.mozilla.com/D36026
--HG--
extra : moz-landing-system : lando
BIONIC is only platform that actually supports gettid. Easiest
solution is to check for linux and disable for BIONIC platform. This
includes the change requested by Gerald to keep the two profilers sync'd.
Differential Revision: https://phabricator.services.mozilla.com/D34919
--HG--
extra : moz-landing-system : lando
Update the tests for ARM64 to include additional functions that are now
supported via 4 byte patching.
We also convert the TEST macros to accept the DLL names as strings, as this
works better with clang-format.
Differential Revision: https://phabricator.services.mozilla.com/D32209
--HG--
extra : moz-landing-system : lando
This patch modifies arm64 so that detours are peformed via two passes:
1. The first pass uses a null trampoline to count how many bytes are available
for patching the original function.
2. If we have >= 16 bytes to patch, we reuse existing trampoline space. If we
have less than 16 bytes to patch, we reserve trampoline space within 128MB
of the function, allowing for a 4 byte patch.
3. Then we recurse, this time using a real trampoline.
Note that we still do a single-pass on x86(-64).
Differential Revision: https://phabricator.services.mozilla.com/D32193
--HG--
extra : moz-landing-system : lando
A null trampoline is just a trampoline that is not backed by a VM reservation.
These are used for tracking the number of bytes that are needed to make a patch.
This patch also contains the changes needed to work with TrampolinePool.
Differential Revision: https://phabricator.services.mozilla.com/D32192
--HG--
extra : moz-landing-system : lando
VMSharingPolicyShared needs to become much smarter. This patch modifies that
policy to track different VM reservations and reuse them whenever possible.
We add TrampolinePools to abstract away the differences between VM policies
with respect to the caller who is making the reservation.
Differential Revision: https://phabricator.services.mozilla.com/D32191
--HG--
extra : moz-landing-system : lando
In order to support 4-byte patches on ARM64, we need to be able to reserve
trampoline space within +/- 128 MB of the beginning of a function.
These changes allow us to make such reservations using OS APIs when
available.
Differential Revision: https://phabricator.services.mozilla.com/D32190
--HG--
extra : moz-landing-system : lando
VMSharingPolicyShared needs to become much smarter. This patch modifies that
policy to track different VM reservations and reuse them whenever possible.
We add TrampolinePools to abstract away the differences between VM policies
with respect to the caller who is making the reservation.
Differential Revision: https://phabricator.services.mozilla.com/D32191
--HG--
extra : moz-landing-system : lando
In order to support 4-byte patches on ARM64, we need to be able to reserve
trampoline space within +/- 128 MB of the beginning of a function.
These changes allow us to make such reservations using OS APIs when
available.
Differential Revision: https://phabricator.services.mozilla.com/D32190
--HG--
extra : moz-landing-system : lando
We remove the debugging hooks that were added to check to see whether a DLL
was loaded, as we can just as easily check that by querying the loader itself.
Plus, we shouldn't be exporting a bunch of test-only loader hooks from mozglue
in our release builds, which is what we are currently doing.
We also remove Injector, InjectorDLL, and TestDLLEject, as these tests can
just as easily be done from within TestDllBlocklist by creating a thread with
LoadLibrary* as the entry point. The CreateRemoteThread stuff, while a more
accurate simulation, has no material effect on whether or not the thread
blocking code works.
Differential Revision: https://phabricator.services.mozilla.com/D34444
--HG--
extra : moz-landing-system : lando
We also s/mincore/version/ in OS_LIBS because the former breaks the test on
Windows 7.
Differential Revision: https://phabricator.services.mozilla.com/D34437
--HG--
extra : moz-landing-system : lando
The profiler will require non-fuzzed timers for accuracy. Making the switch early will avoid surprises when FuzzyFox is enabled.
Differential Revision: https://phabricator.services.mozilla.com/D31010
--HG--
extra : moz-landing-system : lando
Start using BaseProfiler in Firefox main(), before&after XPCOM runs.
Also added a BaseProfiler label around Gecko Profiler init/shutdown (so that
samples may be ignored if user is only interested in non-XPCOM profiling).
Main process name changed to "Main Thread (Base Profiler)", so as not to confuse
the front-end, and show where this thread comes from.
Differential Revision: https://phabricator.services.mozilla.com/D31933
--HG--
extra : moz-landing-system : lando
If MOZ_BASE_PROFILER_STARTUP and MOZ_PROFILER_STARTUP are set, this will integrate
a pre-XPCOM startup profile into the main profile.
It is stored as separate threads (in a single JSON string that is moved around),
which will appear as a new track under the main process.
Only adding threads from BaseProfiler means a better integration with Gecko
Profiler profiles, and is more efficient: Less code, and a smaller memory
footprint.
Differential Revision: https://phabricator.services.mozilla.com/D31932
--HG--
extra : moz-landing-system : lando
Running identical (but separate) InitializeWin64ProfilerHooks in both profilers
confuses the DLL interceptor and the 2nd one crashes because of unexpected
opcodes introduced by the 1st one.
If MOZ_BASE_PROFILER is defined, Gecko Profiler will use that implementation of
InitializeWin64ProfilerHooks instead of its own; and that code also has a guard
so that it effectively only run once even if called from both profilers.
Differential Revision: https://phabricator.services.mozilla.com/D31931
--HG--
extra : moz-landing-system : lando
E.g., AUTO_PROFILER_INIT -> AUTO_BASE_PROFILER_INIT.
This will allow #including BaseProfiler.h anywhere as needed, without clashing
with Gecko Profiler macros.
Differential Revision: https://phabricator.services.mozilla.com/D31929
--HG--
extra : moz-landing-system : lando
Notice the extra 'BASE' in the env-var names.
This is to control BaseProfiler separately from the Gecko Profiler.
Differential Revision: https://phabricator.services.mozilla.com/D31928
--HG--
extra : moz-landing-system : lando
Android not implemented yet.
Windows not working yet when packaged, so disabled by default, but may be
enabled locally by uncommenting `#define MOZ_BASE_PROFILER` where indicated in
BaseProfiler.h.
Differential Revision: https://phabricator.services.mozilla.com/D31927
--HG--
extra : moz-landing-system : lando
Simple test program that exercises the most important APIs of BaseProfiler.
(Including checking that macros work even when BaseProfiler is not enabled.)
Differential Revision: https://phabricator.services.mozilla.com/D31926
--HG--
extra : moz-landing-system : lando
Almost-mechanical changes include:
- Removed unneeded/incompatible #includes and functions (any JS- or XPCOM-
dependent).
- Use std::string for strings and nsIDs.
- Use thin wrappers around mozilla::detail::MutexImpl for mutexes.
- Use hand-rolled AddRef&Release's for ref-counted classes -- could not use
mfbt/RefCounted.h because of bug 1536656.
- Added some platform-specific polyfills, e.g.: MicrosecondsSince1970().
- Only record the main thread by default.
- Logging controlled by env-vars MOZ_BASE_PROFILER_{,DEBUG_,VERBOSE_}LOGGING.
This now builds (with --enable-base-profiler), but is not usable yet.
Differential Revision: https://phabricator.services.mozilla.com/D31924
--HG--
extra : moz-landing-system : lando
Added baseprofiler to mozglue/moz.build, so it will be built.
However all cpp files are dependent on `MOZ_BASE_PROFILER`, which is currently
not #defined by default (in public/BaseProfiler.h).
Added mozglue/mozprofiler to js/src/make-source-package.sh, because
mozglue/moz.build now refers to it.
Differential Revision: https://phabricator.services.mozilla.com/D33258
--HG--
extra : moz-landing-system : lando
Start using BaseProfiler in Firefox main(), before&after XPCOM runs.
Also added a BaseProfiler label around Gecko Profiler init/shutdown (so that
samples may be ignored if user is only interested in non-XPCOM profiling).
Main process name changed to "Main Thread (Base Profiler)", so as not to confuse
the front-end, and show where this thread comes from.
Differential Revision: https://phabricator.services.mozilla.com/D31933
--HG--
extra : moz-landing-system : lando
If MOZ_BASE_PROFILER_STARTUP and MOZ_PROFILER_STARTUP are set, this will integrate
a pre-XPCOM startup profile into the main profile.
It is stored as separate threads (in a single JSON string that is moved around),
which will appear as a new track under the main process.
Only adding threads from BaseProfiler means a better integration with Gecko
Profiler profiles, and is more efficient: Less code, and a smaller memory
footprint.
Differential Revision: https://phabricator.services.mozilla.com/D31932
--HG--
extra : moz-landing-system : lando
Running identical (but separate) InitializeWin64ProfilerHooks in both profilers
confuses the DLL interceptor and the 2nd one crashes because of unexpected
opcodes introduced by the 1st one.
If MOZ_BASE_PROFILER is defined, Gecko Profiler will use that implementation of
InitializeWin64ProfilerHooks instead of its own; and that code also has a guard
so that it effectively only run once even if called from both profilers.
Differential Revision: https://phabricator.services.mozilla.com/D31931
--HG--
extra : moz-landing-system : lando
E.g., AUTO_PROFILER_INIT -> AUTO_BASE_PROFILER_INIT.
This will allow #including BaseProfiler.h anywhere as needed, without clashing
with Gecko Profiler macros.
Differential Revision: https://phabricator.services.mozilla.com/D31929
--HG--
extra : moz-landing-system : lando
Notice the extra 'BASE' in the env-var names.
This is to control BaseProfiler separately from the Gecko Profiler.
Differential Revision: https://phabricator.services.mozilla.com/D31928
--HG--
extra : moz-landing-system : lando
Android not implemented yet.
Windows not working yet when packaged, so disabled by default, but may be
enabled locally by uncommenting `#define MOZ_BASE_PROFILER` where indicated in
BaseProfiler.h.
Differential Revision: https://phabricator.services.mozilla.com/D31927
--HG--
extra : moz-landing-system : lando
Simple test program that exercises the most important APIs of BaseProfiler.
(Including checking that macros work even when BaseProfiler is not enabled.)
Differential Revision: https://phabricator.services.mozilla.com/D31926
--HG--
extra : moz-landing-system : lando
Almost-mechanical changes include:
- Removed unneeded/incompatible #includes and functions (any JS- or XPCOM-
dependent).
- Use std::string for strings and nsIDs.
- Use thin wrappers around mozilla::detail::MutexImpl for mutexes.
- Use hand-rolled AddRef&Release's for ref-counted classes -- could not use
mfbt/RefCounted.h because of bug 1536656.
- Added some platform-specific polyfills, e.g.: MicrosecondsSince1970().
- Only record the main thread by default.
- Logging controlled by env-vars MOZ_BASE_PROFILER_{,DEBUG_,VERBOSE_}LOGGING.
This now builds (with --enable-base-profiler), but is not usable yet.
Differential Revision: https://phabricator.services.mozilla.com/D31924
--HG--
extra : moz-landing-system : lando
Added baseprofiler to mozglue/moz.build, so it will be built.
However all cpp files are dependent on `MOZ_BASE_PROFILER`, which is currently
not #defined by default (in public/BaseProfiler.h).
Added mozglue/mozprofiler to js/src/make-source-package.sh, because
mozglue/moz.build now refers to it.
Differential Revision: https://phabricator.services.mozilla.com/D33258
--HG--
extra : moz-landing-system : lando
Start using BaseProfiler in Firefox main(), before&after XPCOM runs.
Also added a BaseProfiler label around Gecko Profiler init/shutdown (so that
samples may be ignored if user is only interested in non-XPCOM profiling).
Main process name changed to "Main Thread (Base Profiler)", so as not to confuse
the front-end, and show where this thread comes from.
Differential Revision: https://phabricator.services.mozilla.com/D31933
--HG--
extra : moz-landing-system : lando
If MOZ_BASE_PROFILER_STARTUP and MOZ_PROFILER_STARTUP are set, this will integrate
a pre-XPCOM startup profile into the main profile.
It is stored as separate threads (in a single JSON string that is moved around),
which will appear as a new track under the main process.
Only adding threads from BaseProfiler means a better integration with Gecko
Profiler profiles, and is more efficient: Less code, and a smaller memory
footprint.
Differential Revision: https://phabricator.services.mozilla.com/D31932
--HG--
extra : moz-landing-system : lando
Running identical (but separate) InitializeWin64ProfilerHooks in both profilers
confuses the DLL interceptor and the 2nd one crashes because of unexpected
opcodes introduced by the 1st one.
If MOZ_BASE_PROFILER is defined, Gecko Profiler will use that implementation of
InitializeWin64ProfilerHooks instead of its own; and that code also has a guard
so that it effectively only run once even if called from both profilers.
Differential Revision: https://phabricator.services.mozilla.com/D31931
--HG--
extra : moz-landing-system : lando