Because JITFrameInfo objects are created *during* profiling, they need to have
their own fallible FailureLatchSource, so that any error can be safely handled
even before we know which FailureLatch will be used in the final JSON
generation work.
Differential Revision: https://phabricator.services.mozilla.com/D155655
This will help with forwarding the chosen FailureLatch to the code and
structures gathering per-thread samples and markers.
Differential Revision: https://phabricator.services.mozilla.com/D155654
Note that at this point, it's using the FailureLatchInfallibleSource singleton,
so operations are still effectively infallible, i.e. they will terminate the
program.
But users already handle future fallible outcomes.
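As a rough illustration of the idea only (this is not the real FailureLatch API; every name below is hypothetical), a latch can be sketched as an interface with a fallible implementation that records the first failure, and an infallible one that terminates the program. Caller code written against the interface handles fallible outcomes either way:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical, simplified latch interface; not the real FailureLatch.
class SketchLatch {
 public:
  virtual ~SketchLatch() = default;
  virtual bool Fallible() const = 0;
  virtual bool Failed() const = 0;
  virtual void SetFailure(const std::string& aReason) = 0;
};

// Records the first failure so that callers can check for it later.
class SketchFallibleLatch final : public SketchLatch {
 public:
  bool Fallible() const override { return true; }
  bool Failed() const override { return mFailed; }
  void SetFailure(const std::string& aReason) override {
    if (!mFailed) {
      mFailed = true;
      mReason = aReason;
    }
  }
  const std::string& Reason() const { return mReason; }

 private:
  bool mFailed = false;
  std::string mReason;
};

// Never reports a failure: any failure terminates the program instead.
class SketchInfallibleLatch final : public SketchLatch {
 public:
  bool Fallible() const override { return false; }
  bool Failed() const override { return false; }
  void SetFailure(const std::string&) override { std::abort(); }
};

// Caller code that already handles a fallible outcome, whichever latch is
// plugged in. With the infallible latch, the failure path never returns.
inline bool AppendChecked(SketchLatch& aLatch, std::string& aOut,
                          const std::string& aData, std::size_t aMaxLength) {
  if (aOut.size() + aData.size() > aMaxLength) {
    aLatch.SetFailure("output too long");
    return false;
  }
  aOut += aData;
  return !aLatch.Failed();
}
```

Swapping `SketchInfallibleLatch` for `SketchFallibleLatch` changes the behavior on failure without touching `AppendChecked`.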
Differential Revision: https://phabricator.services.mozilla.com/D155649
The localization file is moved to `content/` to make it clearer that it's not intended to be localised.
For the same reason, no Fluent migration script is included as there are no localised strings to migrate.
Differential Revision: https://phabricator.services.mozilla.com/D155702
What I am doing:
- Adding code to allow the --rebuild flag to be added as an option
- Fixing a bug where, if no compare commit is provided, parent is used as default
Why:
- User request
Differential Revision: https://phabricator.services.mozilla.com/D155103
Most users of JSONWriter want to fill a string, so instead of having all these
similar implementations, we now have central reusable implementations:
- JSONStringWriteFunc contains a string and writes to it.
- JSONStringRefWriteFunc references a string and writes to it. This is most
useful when the string already exists somewhere, or needs to be returned from
a function (so we avoid another conversion when returning).
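A minimal sketch of the two flavors, assuming a simplified write-function interface (the `Sketch*` names are hypothetical and the real classes may differ):

```cpp
#include <cassert>
#include <string>

// Hypothetical minimal write-function interface, loosely modelled on the
// idea of JSONWriteFunc; the real class may differ.
struct SketchWriteFunc {
  virtual ~SketchWriteFunc() = default;
  virtual void Write(const std::string& aStr) = 0;
};

// Contains its own string: convenient when only the result is needed.
class SketchStringWriteFunc final : public SketchWriteFunc {
 public:
  void Write(const std::string& aStr) override { mString += aStr; }
  const std::string& StringCRef() const { return mString; }

 private:
  std::string mString;
};

// References an existing string: output lands directly in the caller's
// string, so no extra copy or conversion is needed afterwards.
class SketchStringRefWriteFunc final : public SketchWriteFunc {
 public:
  explicit SketchStringRefWriteFunc(std::string& aString)
      : mString(aString) {}
  void Write(const std::string& aStr) override { mString += aStr; }

 private:
  std::string& mString;
};

// Example of filling a string that is then returned from a function.
inline std::string MakeGreetingJson() {
  std::string result;
  SketchStringRefWriteFunc func(result);
  func.Write("{\"hello\":");
  func.Write("\"world\"}");
  return result;
}
```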
Depends on D154617
Differential Revision: https://phabricator.services.mozilla.com/D154618
mWriter is now a reference, and the ownership is optional through a separate
member variable that could stay null.
Users can now choose to keep the JSONWriteFunc on their stack, which saves a
heap allocation, and makes it easier to access the concrete JSONWriteFunc
implementation directly (instead of through WriteFunc()).
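The reference-plus-optional-ownership layout could look roughly like this. It is a sketch with hypothetical names (`SketchSink`, `SketchWriter`), not the actual JSONWriter code:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a write function; not the real JSONWriteFunc.
struct SketchSink {
  std::string mText;
  void Write(const std::string& aStr) { mText += aStr; }
};

// The writer always goes through mSink (a reference). Ownership is optional:
// mMaybeOwnedSink stays null when the caller keeps the sink on its stack.
class SketchWriter {
 public:
  // Non-owning: the caller guarantees that aSink outlives the writer.
  explicit SketchWriter(SketchSink& aSink) : mSink(aSink) {}

  // Owning: the writer keeps the heap-allocated sink alive.
  // aSink must be non-null. mSink is declared first, so it is bound before
  // aSink is moved into mMaybeOwnedSink.
  explicit SketchWriter(std::unique_ptr<SketchSink> aSink)
      : mSink(*aSink), mMaybeOwnedSink(std::move(aSink)) {}

  void StringElement(const std::string& aStr) {
    mSink.Write("\"" + aStr + "\"");
  }
  bool OwnsSink() const { return mMaybeOwnedSink != nullptr; }
  SketchSink& Sink() { return mSink; }

 private:
  SketchSink& mSink;
  std::unique_ptr<SketchSink> mMaybeOwnedSink;
};
```

With the non-owning constructor there is no heap allocation, and the caller can read its concrete sink directly instead of going through an accessor on the writer.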
Depends on D154616
Differential Revision: https://phabricator.services.mozilla.com/D154617
This patch adds the following components:
- nsICookieBannerService: Main service singleton managing the rules and initiating other components.
It's exposed via Services.cookieBanners and can be configured via the cookiebanners.* prefs.
To enable it set "cookiebanners.service.mode" to 1 or 2 and restart the browser.
- nsCookieInjector: Looks up rules and injects cookies for matching top level loads.
- nsICookieBannerListService: Imports and updates the cookie banner rules.
- nsICookieBannerRule: Rules for a given domain.
- nsICookieRule: Part of nsICookieBannerRule. Holds cookie specific rules.
Depends on D153641
Differential Revision: https://phabricator.services.mozilla.com/D153642
We currently support stack walking everywhere, and when it fails, we fall back
to leaf stacks. The leaf option is a bit confusing in how it works, and doesn't
provide much value.
Differential Revision: https://phabricator.services.mozilla.com/D153695
Adds a simple ESLint plugin for custom environments.
The plugin has a single exported value named `globals`, which is an object with
keys for all globally available self-hosted identifiers. All self-hosted values
are read-only, so we set all properties of `globals` to `"readonly"`.
BytecodeEmitter special identifiers are added to the `.eslintrc.js` file,
because that keeps them closer to the SpiderMonkey source tree when compared
to "tools/lint/eslint/eslint-plugin-spidermonkey-js".
Also see:
- tools/lint/eslint/eslint-plugin-mozilla/lib/environments/
- https://eslint.org/docs/latest/user-guide/configuring/language-options
Differential Revision: https://phabricator.services.mozilla.com/D153337
The BASE_PROFILER_DEFAULT_...ENTRIES constants in BaseProfiler.h were smaller
than those in ProfilerControl.h, leading to a shorter profiling range in the
parent process!
Now these constants and some other shared ones are only defined in
BaseProfiler.h, and reused in ProfilerControl.h.
PROFILER_DEFAULT_DURATION was moved to where it's first used, and should one
day disappear (see bug 1632365).
Differential Revision: https://phabricator.services.mozilla.com/D153669
Added `mach uniffi generate` which executes `uniffi-bindgen-gecko-js` to
generate UniFFI bindings. It's unfortunate that we need to check these
files in, but I couldn't figure out a way to auto-generate them as part
of the build process.
Adding `#include "nsIContent.h"` to dom/base/nsINodeList.h. I think
that should have been present before, but things built okay because of
the way things got combined in the unified .cpp files. Adding these new
webIDL files bumped `NodeListBinding.cpp` to a new unified .cpp file
which caused the build to fail.
Differential Revision: https://phabricator.services.mozilla.com/D144468
Generate the C++ and JS code to handle UniFFI bindings. The WebIDL code
is completely static and doesn't need to be generated.
There's support for both synchronous and async functions, but we haven't
decided how we want this to be configured. In practice, almost all
functions will need to be async, so for now we're just forcing all
functions to be.
The `uniffi-bindgen-gecko-js` crate builds the binary that generates the
bindings. This binary needs to be fed a list of UDL files, the path of
the .cpp file to generate, and the directory to generate .jsm files in
(and also all of those arguments again, but for the test fixtures).
This is quite a horrible UI, but it's going to be wrapped in a mach
command.
The `uniffi-js` directory contains shared C++ code for
`uniffi-bindgen-gecko-js`. As much as possible we tried to put the
functionality here and have the generated code simply forward function
calls here.
Still Todo:
- CallbackInterfaces
- Custom and external types
- Datetime and TimeInterval
Differential Revision: https://phabricator.services.mozilla.com/D144472
The shell module loader was rewritten to C++ in bug 1637529.
"jsrtfuzzing-example.js" was changed to appease prettier, now that it's enabled.
Differential Revision: https://phabricator.services.mozilla.com/D153151
This covers the gc stats' JSONPrinter changes, and would trigger the
verifyJSONStringIsCompact assertion (see next patch) without these changes.
Differential Revision: https://phabricator.services.mozilla.com/D152844
These were the last remaining JSON whitespace characters, so our
regression tests can now check that there are none of these left.
Differential Revision: https://phabricator.services.mozilla.com/D152607
JSON whitespace comprises all spaces and newlines that are not inside strings,
and which could be removed without changing the decoded contents.
This tests the current state of affairs, where JSON whitespace in profiles may
account for up to 25% of the full length.
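To make the definition concrete, here is a small sketch of how such whitespace could be counted. The function name is hypothetical and this is not the test code from the patch. It also counts tabs and carriage returns, which the JSON grammar treats as whitespace too:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts spaces, tabs, and newlines that are outside JSON strings, i.e.
// characters that could be removed without changing the decoded contents.
// Assumes aJson is well-formed JSON.
inline std::size_t CountJsonWhitespace(const std::string& aJson) {
  std::size_t count = 0;
  bool inString = false;
  bool escaped = false;
  for (char c : aJson) {
    if (inString) {
      if (escaped) {
        escaped = false;  // This character is consumed by the escape.
      } else if (c == '\\') {
        escaped = true;
      } else if (c == '"') {
        inString = false;
      }
      continue;
    }
    if (c == '"') {
      inString = true;
    } else if (c == ' ' || c == '\t' || c == '\n' || c == '\r') {
      ++count;
    }
  }
  return count;
}
```

Dividing the result by `aJson.size()` gives the fraction of the output that is removable whitespace.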
Differential Revision: https://phabricator.services.mozilla.com/D152601
The baldrdash integration of Cranelift is agreed between SM and CL
to be the wrong shape. Our import of the code base is also old and
causes difficulties for us when upgrading some crates (see bug
1774829). We should remove it for now to unblock bug 1774829.
Differential Revision: https://phabricator.services.mozilla.com/D152806
Instead of using the public member type `PrivateIPDLCaller` to
restrict access to certain `Shmem` member functions, make them
`private`, and designate `IProtocol` and `ITopLevelProtocol` as
friends of `Shmem`.
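The friend-based pattern, reduced to a sketch with hypothetical stand-in classes (the real `Shmem` and `IProtocol` have different members):

```cpp
#include <cassert>
#include <cstddef>

class SketchProtocol;  // Hypothetical stand-in for IProtocol.

// Hypothetical stand-in for Shmem: the restricted functions are private,
// and only the befriended protocol class can call them.
class SketchShmem {
 public:
  std::size_t Size() const { return mSize; }

 private:
  friend class SketchProtocol;
  void SetSize(std::size_t aSize) { mSize = aSize; }
  std::size_t mSize = 0;
};

class SketchProtocol {
 public:
  // Allowed: SketchProtocol is a friend of SketchShmem.
  static void Allocate(SketchShmem& aShmem, std::size_t aSize) {
    aShmem.SetSize(aSize);
  }
};
```

Any other code calling `SetSize` fails to compile, so no extra tag-type argument is needed to restrict access.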
Differential Revision: https://phabricator.services.mozilla.com/D151952
This prevents copies and avoids the hack we have to avoid this, which
right now is using nsDependent{C,}String.
Non-virtual actors can still use `nsString` if they need to on the
receiving end.
Differential Revision: https://phabricator.services.mozilla.com/D152519
The Wasm profiling frame iterator uses the return address in the LR register when
interrupting during the prologue. The Linux/Android code was correctly initializing it,
but on Mac and Windows we always used 0.
Differential Revision: https://phabricator.services.mozilla.com/D152268
What we are doing:
Adding a contact section in auto-generated performance testing docs so people know who to contact when they are having an issue with a given module/component
Differential Revision: https://phabricator.services.mozilla.com/D151400
Instead of trying to create a too-big shmem, or if the shmem creation fails,
send a short message starting with '*', which the parent can put into the
profileGatheringLog.
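A sketch of the '*' convention, with hypothetical function names and message text. Since a JSON document cannot start with '*', the first character is enough for the receiver to tell a message from a profile:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Child side (hypothetical): returns the profile if it fits in aMaxBytes,
// otherwise a short message starting with '*' that explains the failure.
inline std::string ProfileOrStarMessage(const std::string& aProfileJson,
                                        std::size_t aMaxBytes) {
  if (aProfileJson.size() > aMaxBytes) {
    return "*Profile is too big: " + std::to_string(aProfileJson.size()) +
           " bytes, limit is " + std::to_string(aMaxBytes);
  }
  return aProfileJson;
}

// Parent side (hypothetical): JSON never starts with '*', so the first
// character distinguishes log messages from actual profiles.
inline bool IsStarMessage(const std::string& aReceived) {
  return !aReceived.empty() && aReceived[0] == '*';
}
```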
Differential Revision: https://phabricator.services.mozilla.com/D152028
Instead of trying to send a too-big message, send a short message starting
with '*', which the parent can put into the profileGatheringLog.
Note: The `CollectProfileOrEmptyString` is now only used by GrabShutdownProfile
(it was previously also used by RecvGatherProfile before it went async). Since
this patch is completely changing its implementation, we may as well fold it
back into its only caller.
Differential Revision: https://phabricator.services.mozilla.com/D152027
Instead of throwing away the whole profile at the end if it's too big, we try
to keep as much data as possible, by only throwing away incoming child profiles
that would make the resulting combined full profile too big.
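One way to picture this policy is a gatherer with a running size budget. This is a sketch under assumed names, not the actual gathering code:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical accumulator: keeps incoming child profiles only while the
// combined size stays within the budget, and counts the ones dropped.
class SketchProfileGatherer {
 public:
  explicit SketchProfileGatherer(std::size_t aMaxTotalBytes)
      : mMaxTotalBytes(aMaxTotalBytes) {}

  // Returns true if the profile was kept, false if it was discarded.
  bool AddChildProfile(std::string aProfile) {
    // mTotalBytes never exceeds mMaxTotalBytes, so this cannot underflow.
    if (aProfile.size() > mMaxTotalBytes - mTotalBytes) {
      ++mDiscarded;
      return false;
    }
    mTotalBytes += aProfile.size();
    mProfiles.push_back(std::move(aProfile));
    return true;
  }

  std::size_t KeptCount() const { return mProfiles.size(); }
  std::size_t DiscardedCount() const { return mDiscarded; }
  std::size_t TotalBytes() const { return mTotalBytes; }

 private:
  std::size_t mMaxTotalBytes;
  std::size_t mTotalBytes = 0;
  std::size_t mDiscarded = 0;
  std::vector<std::string> mProfiles;
};
```

A profile that does not fit is dropped on its own. Smaller profiles arriving later can still be kept.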
Differential Revision: https://phabricator.services.mozilla.com/D152026
Instead of allocating a buffer with the profile and then copying it into an
nsCString (at which point there are two full copies in memory), we resize the
nsCString as needed and directly output the profile data into it.
Differential Revision: https://phabricator.services.mozilla.com/D152025
The promise to be resolved may end up in JavaScript, so we check for the
maximum JS string length -- which is under the nsCString max length, so we
won't fail the too-big-string assertion there.
Differential Revision: https://phabricator.services.mozilla.com/D152024
This will be used in the following patches to more easily reject the promise
with a specific code AND reset gathering, in one call.
Differential Revision: https://phabricator.services.mozilla.com/D152023
This may be useful to advanced profiler users, to see how the multi-process
profile gathering went.
(Some information, like missing processes, could be useful to expose to all
users, but this should be done in future tasks, with corresponding front-end
work.)
Note: There are no direct tests, as this is intended for advanced human users,
and the format is not guaranteed to stay stable.
Differential Revision: https://phabricator.services.mozilla.com/D151902
This log goes into `profile.profilingLog[<parent pid>].bufferGlobalController.updates`, an array of data arrays.
See the related `updatesSchema` for what's in those data arrays.
Note: There are no direct tests, as this is intended for advanced human users,
and the format is not guaranteed to stay stable.
Differential Revision: https://phabricator.services.mozilla.com/D151776
What we are doing:
Adding --full as an option to the fuzzy selector used in the compare selector.
Why:
By using --full we expand the test list available to the compare selector, and allow all tasks to be run. Without --full we only see a common subset of tasks typically run on moz-central.
How we did this:
We added the `common_groups = ["task"]` line, which allows fuzzy to be run with a general set of args, one of which is --full.
Differential Revision: https://phabricator.services.mozilla.com/D151837
When converting lazy getter calls, if the previous/next statement is
`ChromeUtils.defineESModuleGetters`, add properties into it, instead of
converting the lazy getter call into `ChromeUtils.defineESModuleGetters`
or creating a new `ChromeUtils.defineESModuleGetters` call.
Differential Revision: https://phabricator.services.mozilla.com/D151980
This revision migrates all the content of the following modules:
- Desktop_Firefox
- Toolkit
- Core
- Testing
Up to their state on Monday June 20th, 2022 in the old system.
The rst file was automatically generated by running `mots export`.
Differential Revision: https://phabricator.services.mozilla.com/D130508
When reading Dwarf unwind info, `CallFrameInfo::State::DoInstruction` is
called once per CFI instruction. At both call sites, the call is driven by a
simple loop. Because each call doesn't do much work, the call overhead is
quite high, and there are huge numbers of CFI instructions to be processed.
This patch moves the loop into its own method `DoInstructions`, and adds
annotations in the hope of getting `DoInstruction` inlined into the loop.
On an Intel Core i5 1135G7 at circa 4 GHz, this reduces the Dwarf read time
from 0.27 seconds (after bugs 1754932, 1777540 and 1777949 have landed) to
0.26 seconds. Not much of a win, but on the other hand, the insn count falls
from 3906 million to 3640 million, which seems like a worthwhile win for what
is a trivial change.
Differential Revision: https://phabricator.services.mozilla.com/D151262
This profilingEndTime is the time when this property is actually written, which
corresponds to the end of the profiling session.
If it's a shutdown profile, the exact same time is used for the existing
`meta.shutdownTime` property.
Depends on D151355
Differential Revision: https://phabricator.services.mozilla.com/D151356
This contentEarliestTime is the time when the earliest (and oldest) surviving
chunk was prepared to start receiving data.
It should be a good hint to the front-end about where the profiling data
actually starts.
Differential Revision: https://phabricator.services.mozilla.com/D151355
When reading Dwarf unwind info, `CallFrameInfo::RuleMap::registers_` is a
`std::map<int, Rule>` used to map (Dwarf) register numbers to the current
unwind rule for each number. These mappings are very small (typically <= 7
elems) and very short lived. The result is that `std::map` creates a lot of
overhead because it is implemented as a Red-Black tree, and hence does a lot
of malloc/freeing of nodes.
This patch replaces it with a simple vector, wrapped up in a new `class
CallFrameInfo::RuleMapLowLevel`. Comments have also been improved. The
resulting performance changes are as follows:
                user time    insns     #malloc  #megabytes
                  seconds  million       calls   allocated
x86_64 before        0.42     5609   5,891,857       560.8
x86_64 after         0.27     3906     894,188       323.1
arm64 before         0.46     7697   8,469,659       680.8
arm64 after          0.24     4922   1,043,154       427.1
x86_64: Intel Core i5 1135G7, 4.2GHz, Fedora 35
arm64: Apple M1, ??? MHz, Fedora 33 running on Parallels Workstation
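For illustration, a vector-backed map for a handful of entries might look like the following. This is a hypothetical sketch, not the actual `RuleMapLowLevel` implementation, which may organize its storage differently:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sketch of a small register-number -> rule map stored in a
// vector. For a handful of entries, a linear scan over contiguous storage
// avoids the per-node allocations of a tree-based std::map.
template <typename Rule>
class SketchSmallRuleMap {
 public:
  // Inserts or overwrites the rule for aReg.
  void Set(int aReg, const Rule& aRule) {
    for (auto& entry : mEntries) {
      if (entry.first == aReg) {
        entry.second = aRule;
        return;
      }
    }
    mEntries.emplace_back(aReg, aRule);
  }

  // Returns a pointer to the rule for aReg, or nullptr if there is none.
  const Rule* Get(int aReg) const {
    for (const auto& entry : mEntries) {
      if (entry.first == aReg) {
        return &entry.second;
      }
    }
    return nullptr;
  }

  std::size_t Size() const { return mEntries.size(); }

 private:
  std::vector<std::pair<int, Rule>> mEntries;
};
```

Lookups are linear, which is acceptable only because the maps stay tiny. For larger maps a tree or hash table would win again.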
Differential Revision: https://phabricator.services.mozilla.com/D151261
We'll probably want to do something more accurate in the future with a
custom clang static analysis pass which validates that XPIDL interfaces
have the expected vtable and struct layout, however doing so would be
more involved than the string matching done in this patch.
In addition to checking for extra virtual methods, we'll likely also
want to check for data members on interfaces, and reject them unless the
class is marked as `[builtinclass]` in addition to some other attribute
which we'll need to add to prevent them from being implemented in Rust
(as C++ data members will not be reflected by the Rust macro).
There were 2 instances of a comment which contained the word 'virtual'
within a CDATA block. These comments were moved out of the CDATA block
to avoid triggering the error.
Differential Revision: https://phabricator.services.mozilla.com/D151068
After parsing Dwarf unwind info, the resulting mExtents array must be sorted
by start address. This forms a significant part of the overall unwind-info
reading cost.
This is done by calling std::sort. Unfortunately the way it is coded, the
comparison function is not inlined, which results in 91 million calls to it
when reading info up to and including for libxul.so.
This small patch turns it into a closure, which is inlined. For (performance)
safety, Extent::offset() is marked inline too.
On an Intel Core i5 1135G7 at circa 4 GHz, this reduces the Dwarf read time
from 0.42 seconds (after bug 1754932 has landed) to 0.35 seconds. Insn count
falls from 5630 million to 4775 million.
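The closure-based sort can be sketched as follows. `SketchExtent` is a hypothetical stand-in for the real extent type, whose fields may differ:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical extent record; the real Extent type may differ.
struct SketchExtent {
  std::uintptr_t mStart;
  std::uint32_t mLength;
  inline std::uintptr_t offset() const { return mStart; }
};

// Sorts by start address. A lambda gives std::sort a distinct closure type,
// so the compiler can inline the comparison into the sort loop. A comparator
// passed as a plain function pointer is one common case where calls tend to
// stay out of line.
inline void SortExtents(std::vector<SketchExtent>& aExtents) {
  std::sort(aExtents.begin(), aExtents.end(),
            [](const SketchExtent& aA, const SketchExtent& aB) {
              return aA.offset() < aB.offset();
            });
}
```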
Differential Revision: https://phabricator.services.mozilla.com/D150943
Fix for a regression in bug 1668867, which removed this important line.
Without it, the logic that handles chunk manager updates doesn't run, and the
overall buffer size limit across processes is not enforced anymore.
To catch future regressions, the ProfileBufferGlobalController now adds its
creation timestamp to the profiling log, which can be tested.
Differential Revision: https://phabricator.services.mozilla.com/D150995
This is a special log that's collected while the parent is gathering all the
JSON profiles (its own, and its children's).
This separate log is necessary, because the gathered logs have already
completed and closed their own `profilingLog` object, and it would be much more
difficult to add to them now.
Differential Revision: https://phabricator.services.mozilla.com/D150562
At profiler startup, LUL reads and preprocesses Dwarf unwind info for all
shared objects mapped into the process. That includes libxul.so and around 90
other shared objects on an x86_64-linux build. This takes too long.
Profiling just this reading phase shows a huge number of mallocs and frees
resulting from LulDwarf.cpp. The largest cause of these is the implementation
of a sum-of-products type with 7 variants (CallFrameInfo::*Rule) using C++
inheritance. The variants have different sizes and so have to be boxed (stored
on the heap and referred to by pointers). They are very short lived and this
causes a lot of heap turnover.
For an m-c build on x86_64-linux on Fedora 35 using clang-13.0.0 -g -O2, when
measuring a MOZ_PROFILER_STARTUP=1 run that is hacked so as to exit
immediately after the unwind info for libxul.so has been read, this patch has
the following perf effects:
* run time reduced from 0.50user to 0.41user (Core i5 1135G7, circa 4 GHz)
* heap allocation reduced from 8,087,404 allocs/397,177,435 bytes to
3,176,936 allocs/328,204,042 bytes.
* instruction count reduced from 6,848,730,307 to 5,572,028,393.
Main changes are:
* `class CallFrameInfo::Rule` has been completely rewritten, so as to merge
its 7 children into it, and those children have been removed. The new
representation uses just 3 machine words to represent all 7 alternatives.
The various virtual methods of `class CallFrameInfo::Rule` (`Handle`,
`operator==`, etc) have been rewritten to use a simple switch-based scheme.
* The code that uses `class CallFrameInfo::Rule` has been changed (but not
majorly rewritten) so as to pass around instances of it by value, rather
than pointers to it. This removes the need to allocate them on the heap.
To simulate the previous use case where a NULL value had a meaning, the
revised class in fact has an 8th alternative, `INVALID`, and a routine
`isVALID`. INVALID rules are now used where NULL pointers to rules were used
previously.
* Accessors and constructors for the revised `class CallFrameInfo::Rule` hide
the underlying 3-word representation, ensure that the correct
representational invariants are maintained, and make it impossible to access
fields that are meaningless for the given variant. So it's not any less
safe than the original.
* Additionally, Dwarf expression strings are no longer represented using
`std::string`. Instead a new two-word type `ImageSlice` has been provided.
This avoids all unnecessary (and unknown) overheads with `std::string` and
provides a significant part of the measured speedup.
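The shape of the merged class can be illustrated with a much-reduced sketch. `SketchRule` is hypothetical and shows only two of the variants plus `INVALID`. The real class has 7 variants and its own field layout:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical by-value rule: one tag word plus two payload words.
class SketchRule {
 public:
  enum class Tag : std::uintptr_t { INVALID, Undefined, Offset };

  // Default-constructed rules are INVALID, standing in for NULL pointers.
  SketchRule() : mTag(Tag::INVALID), mWord1(0), mWord2(0) {}

  static SketchRule MkUndefined() { return SketchRule(Tag::Undefined, 0, 0); }
  static SketchRule MkOffset(int aBaseReg, std::intptr_t aOffset) {
    return SketchRule(Tag::Offset, aBaseReg, aOffset);
  }

  bool isVALID() const { return mTag != Tag::INVALID; }
  Tag tag() const { return mTag; }

  // Accessors assert that the field is meaningful for the current variant.
  int baseRegister() const {
    assert(mTag == Tag::Offset);
    return static_cast<int>(mWord1);
  }
  std::intptr_t offset() const {
    assert(mTag == Tag::Offset);
    return mWord2;
  }

  // Switch-based comparison in place of a virtual operator==.
  bool operator==(const SketchRule& aOther) const {
    if (mTag != aOther.mTag) {
      return false;
    }
    switch (mTag) {
      case Tag::INVALID:
      case Tag::Undefined:
        return true;
      case Tag::Offset:
        return mWord1 == aOther.mWord1 && mWord2 == aOther.mWord2;
    }
    return false;
  }

 private:
  SketchRule(Tag aTag, std::intptr_t aWord1, std::intptr_t aWord2)
      : mTag(aTag), mWord1(aWord1), mWord2(aWord2) {}

  Tag mTag;
  std::intptr_t mWord1;
  std::intptr_t mWord2;
};

static_assert(sizeof(SketchRule) == 3 * sizeof(std::intptr_t),
              "a rule should occupy exactly three machine words");
```

Because every variant fits in the same three words, rules can be copied and stored by value, and no heap allocation is needed per rule.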
Differential Revision: https://phabricator.services.mozilla.com/D139288
I went through the docs and fixed up a few things. I also removed the other
version that was imported under XPCOM, updated a few obsolete MDN links that
now have updated docs, and deleted some ancient Bonsai links.
Differential Revision: https://phabricator.services.mozilla.com/D150504
This patch aims to be minimally invasive. To do this, it deals a lot with raw JSON strings, which leaves the end product with rather unfortunate indentation. Since this is only processed internally by our testing infrastructure at this point, it seems reasonable to defer 'prettifying' it, as that would be quite a complicated task.
Differential Revision: https://phabricator.services.mozilla.com/D149949