This patch adds telemetry to measure the time spent in Wasm compilers
and in the code that they generate (actually, all JS and Wasm code).
For simplicity, it measures wall-clock time as a proxy for CPU time.
Furthermore, it measures runtime for all JS and Wasm code, and all
native functions invoked by the JS or Wasm code, by timing from
top-level entry to exit. This is for efficiency reasons: we do not want
to add a VM call in the transition stubs between native and JS or JS and
Wasm; that would be a Very Bad Thing for performance, even for a Nightly
build instrumented with telemetry. Because of that, it's difficult to
separate JITted JS and JITted Wasm runtime, but observing their sum
should still be useful when making comparisons across compiler changes
because absolute reductions will still be visible.
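A minimal sketch of the top-level entry-to-exit timing described above, assuming a per-thread nesting counter. `ExecutionTimer`, `gDepth`, and `gTotalRunTime` are hypothetical names for illustration, not the ones used in the patch.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical per-thread state: nesting depth of JS/Wasm entries and the
// accumulated wall-clock run time.
thread_local int gDepth = 0;
thread_local std::chrono::steady_clock::duration gTotalRunTime{};

// RAII timer placed at each entry into JS/Wasm code. Only the outermost
// entry reads the clock, so nested transitions between native, JS, and Wasm
// cost no more than a counter update.
class ExecutionTimer {
 public:
  ExecutionTimer() {
    if (gDepth++ == 0) {
      start_ = std::chrono::steady_clock::now();
    }
  }
  ~ExecutionTimer() {
    assert(gDepth > 0);
    if (--gDepth == 0) {
      gTotalRunTime += std::chrono::steady_clock::now() - start_;
    }
  }
  ExecutionTimer(const ExecutionTimer&) = delete;
  ExecutionTimer& operator=(const ExecutionTimer&) = delete;

 private:
  std::chrono::steady_clock::time_point start_{};
};
```

Because only the outermost timer touches the clock, the measured interval covers JS, Wasm, and any native functions they call, which is why the two JITted runtimes cannot be separated with this scheme.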
The plumbing is somewhat awkward, given that Wasm compilers can run on
background threads. It appears that the telemetry-callback API that
SpiderMonkey includes to avoid a direct dependency on the Gecko
embedding had artificially limited the callback to main-thread use only.
This patch removes that limitation, which is safe at least for Gecko;
the telemetry hooks in Gecko are thread-safe (they take a global mutex).
That way, the background threads performing compilation can directly add
telemetry incrementally, without having to pass this up through the main
thread somehow.
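As a rough illustration of why a mutex-guarded sink lets background threads report directly, here is a sketch with a hypothetical `ScalarSink` class and `reportFromThreads` helper. It is a stand-in for the idea only and does not reproduce the Gecko telemetry hooks or the SpiderMonkey callback API.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for an embedding's scalar telemetry sink. A single
// mutex guards the running total, so any thread may add to it directly.
class ScalarSink {
 public:
  void add(uint64_t microseconds) {
    std::lock_guard<std::mutex> lock(mutex_);
    total_ += microseconds;
  }
  uint64_t total() {
    std::lock_guard<std::mutex> lock(mutex_);
    return total_;
  }

 private:
  std::mutex mutex_;
  uint64_t total_ = 0;
};

// Simulates background compilation threads that each report their time
// incrementally instead of routing it through the main thread.
uint64_t reportFromThreads(ScalarSink& sink, int threads, int reportsEach) {
  std::vector<std::thread> workers;
  for (int i = 0; i < threads; i++) {
    workers.emplace_back([&sink, reportsEach] {
      for (int j = 0; j < reportsEach; j++) {
        sink.add(1);
      }
    });
  }
  for (auto& worker : workers) {
    worker.join();
  }
  return sink.total();
}
```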
Finally, I have chosen to add the relevant metrics as Scalar telemetry
values rather than Histograms. This is because what we are really
interested in is the sum of all these values (all CPU time spent in
compilation + running Wasm code); if this value goes down as a result of
a Wasm compiler change, then that Wasm compiler change is good because
it reduces CPU time on the user's machine. It is difficult to add two
Histograms together because the bins may have different boundaries. If
we instead need to use a binned histogram for other reasons, then we
could simply report the sum (of all compiler time plus run time) as
another histogram.
Differential Revision: https://phabricator.services.mozilla.com/D85654
Previously this used |cx->isExceptionPending()| to determine whether the import had succeeded, which doesn't work if there was an uncatchable exception. The patch changes this to pass an explicit status.
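A small sketch of the difference, using hypothetical `Context`, `ImportStatus`, and helper names rather than the real SpiderMonkey types. It assumes only what the message states: an uncatchable failure leaves no pending exception.

```cpp
#include <cassert>

// Hypothetical stand-in for the part of a context that matters here.
struct Context {
  bool exceptionPending = false;
};

enum class ImportStatus { Ok, Failed };

// Old approach: infer the outcome from the pending-exception flag. An
// uncatchable failure leaves nothing pending, so it is misread as success.
bool importSucceededByInference(const Context& cx) {
  return !cx.exceptionPending;
}

// New approach: the caller states the outcome explicitly.
bool importSucceeded(ImportStatus status) {
  return status == ImportStatus::Ok;
}
```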
Differential Revision: https://phabricator.services.mozilla.com/D85856
CLOSED TREE
We don't need these macros anymore, for two reasons:
1. We have static analysis to provide the same sort of checks via `MOZ_RAII`
and friends.
2. clang now warns for the "temporary that should have been a declaration" case.
The extra requirements on class construction also show up during debug tests
as performance problems.
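For readers unfamiliar with the second point, here is a minimal sketch of the "temporary that should have been a declaration" mistake. `AutoCounter` and both helper functions are hypothetical and not taken from the tree; the sketch is plain C++ and does not show `MOZ_RAII` itself.

```cpp
#include <cassert>

// Hypothetical RAII guard that bumps a counter for as long as it is alive.
class AutoCounter {
 public:
  explicit AutoCounter(int& counter) : counter_(counter) { ++counter_; }
  ~AutoCounter() { --counter_; }
  AutoCounter(const AutoCounter&) = delete;
  AutoCounter& operator=(const AutoCounter&) = delete;

 private:
  int& counter_;
};

// Returns the counter value seen while the guard is expected to be live.
int withNamedGuard(int& counter) {
  AutoCounter guard(counter);  // Named: lives until the end of the scope.
  return counter;
}

int withTemporaryGuard(int& counter) {
  AutoCounter{counter};  // Bug: unnamed temporary, destroyed at the ';'.
  return counter;
}
```

With the named guard the function observes the incremented counter; with the temporary, the guard is already gone by the time the counter is read. A compiler may warn about the second function, which is the point of the example.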
This change was automated by using the following sed script:
```
# Remove declarations in classes.
/MOZ_DECL_USE_GUARD_OBJECT_NOTIFIER/d
/MOZ_GUARD_OBJECT_NOTIFIER_INIT/d
# Remove individual macros, carefully.
{
# We don't have to worry about substrings here because the closing
# parenthesis "anchors" the match.
s/MOZ_GUARD_OBJECT_NOTIFIER_PARAM)/)/g;
s/MOZ_GUARD_OBJECT_NOTIFIER_PARAM_TO_PARENT)/)/g;
s/MOZ_GUARD_OBJECT_NOTIFIER_PARAM_IN_IMPL)/)/g;
s/MOZ_GUARD_OBJECT_NOTIFIER_ONLY_PARAM_IN_IMPL)/)/g;
# Remove the longer identifier first.
s/MOZ_GUARD_OBJECT_NOTIFIER_ONLY_PARAM_TO_PARENT//g;
s/MOZ_GUARD_OBJECT_NOTIFIER_ONLY_PARAM//g;
}
# Remove the actual include.
\@# *include "mozilla/GuardObjects.h"@d
```
and running:
```
find . -name \*.cpp -o -name \*.h | grep -v 'GuardObjects.h' |xargs sed -i -f script 2>/dev/null
mach clang-format
```
Differential Revision: https://phabricator.services.mozilla.com/D85168
This avoids the issue where xpcshell appears to start up, create a realm, and
only then read preferences. Prior to this patch, by the time the prefs were
read, it was too late for the realm creation option to take effect. This way,
we update the realm option.
Differential Revision: https://phabricator.services.mozilla.com/D84426
Useful for the public BigInt API which takes Range<const CharT>. Allows
JS::NumberToBigInt(JSContext*, JS::ConstLatin1Chars)
in order to match
JS::NumberToBigInt(JSContext*, JS::ConstTwoByteChars)
which already exists.
Differential Revision: https://phabricator.services.mozilla.com/D82797
Originally we rekeyed our hash tables when GC things were moved, but nowadays we generally use MovableCellHasher and key them on unique IDs that don't change when this happens.
Interestingly this exposed one place where the original rekeying strategy was still in use unnecessarily.
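The idea of keying on a stable unique ID rather than an address can be sketched as follows. This uses `std::unordered_map` and hypothetical `Cell`, `CellTable`, `moveCell`, and `lookup` names; it does not reproduce MovableCellHasher or the GC's actual tables.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical GC thing carrying an ID that survives relocation.
struct Cell {
  uint64_t uniqueId;
  int payload;
};

// The table is keyed on the unique ID, never on the cell's address.
using CellTable = std::unordered_map<uint64_t, std::string>;

// Simulates a moving GC relocating a cell: the contents, including the ID,
// are copied to a new address.
std::unique_ptr<Cell> moveCell(const Cell& old) {
  return std::make_unique<Cell>(old);
}

// Looks up a cell's entry; works the same before and after a move.
const std::string* lookup(const CellTable& table, const Cell& cell) {
  auto it = table.find(cell.uniqueId);
  return it == table.end() ? nullptr : &it->second;
}
```

Since the key does not depend on the address, relocating a cell requires no rekeying of the table.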
Differential Revision: https://phabricator.services.mozilla.com/D84337
This changes the way the host triggers cleanup, from calling an API to calling a JS function. This is done so we can use the existing DOM infrastructure that handles setting up the incumbent global for us.
Differential Revision: https://phabricator.services.mozilla.com/D83613
CLOSED TREE
Backed out changeset 8b21977bb2df (bug 1648453)
Backed out changeset 4cac71f274b8 (bug 1648453)
Backed out changeset a9ad01b4ab2e (bug 1648453)
I added an overload of IsInsideNursery for TenuredCell that always returns false, so that IsAboutToBeFinalizedInternal can optimise out the nursery check for things that are never allocated in the nursery.
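A simplified sketch of how such an overload lets the compiler drop the check. The types and `isAboutToBeFinalizedSketch` are hypothetical reductions for illustration; the nursery handling in the real IsAboutToBeFinalizedInternal is different.

```cpp
#include <cassert>

// Hypothetical reduced cell hierarchy.
struct Cell {
  bool inNursery = false;
};
// Cells of this type are never allocated in the nursery.
struct TenuredCell : Cell {};

// General case: has to inspect the cell at run time.
inline bool IsInsideNursery(const Cell* cell) {
  return cell && cell->inNursery;
}

// More specific overload, selected at compile time for TenuredCell pointers.
constexpr bool IsInsideNursery(const TenuredCell*) { return false; }

// Generic code calls IsInsideNursery unconditionally. When T is TenuredCell
// the condition is a compile-time false and the branch can be removed.
template <typename T>
bool isAboutToBeFinalizedSketch(const T* thing, bool marked) {
  if (IsInsideNursery(thing)) {
    return false;  // Placeholder for nursery handling (assumption).
  }
  return !marked;
}
```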
Differential Revision: https://phabricator.services.mozilla.com/D83308
The Iterator Helpers proposal defines 6 lazy synchronous iterator methods,
`map`, `filter`, `take`, `drop`, `asIndexedPairs`, and `flatMap`.
All of these methods are to return the newly defined builtin generators,
which we implement as self-hosted generators that are stored in an
internal slot and thus observably distinct from regular generators.
Differential Revision: https://phabricator.services.mozilla.com/D81743
In the JIT frame sampler, we apply the appropriate category in addition to
the "implementation" field. For JS frames (IS_JS_FRAME), we identify as
either BaselineInterpreter or Interpreter. Note that JS_Other still applies
to various places we enter SpiderMonkey outside of RunScript.
Differential Revision: https://phabricator.services.mozilla.com/D79524
Replace the duplicate lists in mozglue/baseprofiler/public and js/public with
a shared list. Add this list to both moz.build files so that it is published
twice, which simplifies supporting different standalone configurations.
Differential Revision: https://phabricator.services.mozilla.com/D79520
The 'asyncStack' flag on JS execution contexts is used as a general switch
to enable async stack capture across all locations in SpiderMonkey, but
this causes problems because it can at times be too much of a performance
burden to generate and track all of these stacks.
Since the introduction of this option, we have only enabled it on Nightly
and DevEdition for non-mobile builds, which has left a lot of users unable
to take advantage of this data while debugging.
This patch enables async stack traces across all of Firefox, but introduces
a new pref to toggle the scope of the actual expensive part of async stacks,
which is _capturing_ them and keeping them alive in memory. The new pref
limits the capturing of async stack traces to only debuggees, unless an
explicit pref is flipped to capture async traces for all cases.
This means that while async stacks are technically enabled, and code could
manually capture a stack and pass it back to SpiderMonkey and see that stack
reflected in later captured stacks, SpiderMonkey itself and related async
DOM APIs, among others, will not capture stacks or pass them to SpiderMonkey,
so there should be no general change in performance by enabling the broader
feature itself, unless the user is actively debugging the page.
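The capture decision described above can be sketched as a small predicate. `AsyncStackCaptureScope` and `shouldCaptureAsyncStack` are hypothetical names chosen for illustration, and the actual pref names are not shown.

```cpp
#include <cassert>

// Hypothetical scope setting controlled by the new pref.
enum class AsyncStackCaptureScope { DebuggeeOnly, All };

// Decides whether the engine itself captures an async stack. The broad
// feature switch must be on, and by default only debuggees pay the cost.
bool shouldCaptureAsyncStack(bool asyncStacksEnabled,
                             AsyncStackCaptureScope scope, bool isDebuggee) {
  if (!asyncStacksEnabled) {
    return false;
  }
  return scope == AsyncStackCaptureScope::All || isDebuggee;
}
```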
One effect of this patch is that if you have the debugger open and then close
it, objects that have async stacks associated with them will retain those
stacks and they will continue to show up in stack traces, but no _new_ stacks
will be captured. jorendorff and I have decided that this is okay because
the expectation that the debugger fully revert every possible effect that it
could have on a page is a nice goal but not a strict requirement.
Differential Revision: https://phabricator.services.mozilla.com/D68503
This patch is based on the work of krystalyang2, with minor bug fixes.
The patch includes four major parts:
1. mark nursery strings pointed to by tenured strings as
non-deduplicatable,
2. deduplicate strings when they are moved from the nursery to the tenured
heap,
3. adjust dependent strings to correct their pointers to the base
string and external buffer after tenuring, and
4. reorder store buffer processing to trace the string whole cell buffer
first, since strings traced through the whole cell buffer need to be marked
non-deduplicatable.
(Part 4 was originally phabricator D77715 but is now merged in here.)
Differential Revision: https://phabricator.services.mozilla.com/D74366
Previously there was a quirk of nursery size rounding where requesting a nursery size of just less than a chunk resulted in a smaller size than a chunk being used even if one chunk was the nearest valid size.
This was to avoid having a sub-chunk size greater than the amount of usable space in a chunk, but that is not possible anyway as long as the sub-chunk step is greater than the chunk trailer size.
The patch simplifies the rounding code and updates the test code to add a size that will give different results depending on chunk size. I had to add a gcparam to get the chunk size to make this work.
Differential Revision: https://phabricator.services.mozilla.com/D79148
Introduce an IS_BLINTERP_FRAME flag to ProfilingStackFrame to distinguish C++
and Baseline interpreter frames. In the profile data this sets the
"implementation" to "blinterp".
Differential Revision: https://phabricator.services.mozilla.com/D78725
Round the number of reserved flag bits up to 16. This leaves 16 bits for the
category (so 64k subcategories). Also make the baseprofiler consistent.
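The resulting split can be sketched as below, assuming a single 32-bit word with the flags in the low half. The bit positions and all names here are assumptions for illustration; the message only states the 16/16 split.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout: low 16 bits hold flags, high 16 bits hold the category.
constexpr uint32_t kFlagBits = 16;
constexpr uint32_t kFlagMask = (uint32_t(1) << kFlagBits) - 1;

constexpr uint32_t pack(uint32_t flags, uint32_t categoryPair) {
  return (categoryPair << kFlagBits) | (flags & kFlagMask);
}

constexpr uint32_t flagsOf(uint32_t word) { return word & kFlagMask; }

constexpr uint32_t categoryPairOf(uint32_t word) { return word >> kFlagBits; }
```

With 16 bits for the category, values from 0 to 65535 fit, which matches the "64k" figure above.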
Differential Revision: https://phabricator.services.mozilla.com/D78724
Also add `JS::WasmModule::createObjectForAsmJS` which is similar to
`createObject` but does not set the WasmModule prototype. Also mark these
methods as const so they work with the `SharedModule` definition.
Differential Revision: https://phabricator.services.mozilla.com/D78088
Implement Iterator.from static method from the Iterator Helpers proposal.
Involves adding a WrapForValidIterator object and prototype that is used
to wrap iterators returned by `Iterator.from`.
Differential Revision: https://phabricator.services.mozilla.com/D77178
This adds an extra pref for whether the cleanupSome method is exposed and renames the existing pref. We can turn on the pref to expose cleanupSome to get test262 coverage in the browser.
Differential Revision: https://phabricator.services.mozilla.com/D77267
The main problem here is that we sweep weak caches off-thread, and when we finish sweeping a hash table the Enum class' destructor can rehash or resize the table, causing store buffer entries to be added or removed (since the table may now contain nursery pointers).
To address this the patch adds a store buffer lock and establishes that all off-thread store buffer access from inside the GC must take place with this lock held. The changes to GCHashSet/Map are a little gross; perhaps it would be better to add an explicit API to hash tables to allow us to postpone the rehash/resize operations but I haven't done that here.
Other complications are:
The TypeSetRef generic buffer entries can contain pointers into TI data that is moved during sweeping. We therefore do need to collect the nursery if there are any of those present. This was relatively rare in testing.
Finally, swapping objects can result in pointers into dying objects being put in the whole cell store buffer (because we do tricks with skipping barriers when we remap wrappers to not keep otherwise dead wrappers alive). We need to collect the nursery if these are present to prevent them being accessed after the dying objects are finalized.
Differential Revision: https://phabricator.services.mozilla.com/D77831
Some platforms have variable polarity for the IEEE-754 NaN signalling bit.
This patch allows those platforms to compute the appropriate value at
startup. The NaN must still meet the requirements of ValueIsDouble.
Differential Revision: https://phabricator.services.mozilla.com/D76573
This patch removes the compile-time and run-time flags that disable BigInt/I64 so that the feature can be shipped. It also adjusts/removes tests as appropriate to account for the removed code paths.
Differential Revision: https://phabricator.services.mozilla.com/D74142
This patch:
- creates a sub-category `Cubeb`
- adds `Cubeb` profiling labels in the related code
The advantage of doing so:
- it lets us see what percentage of time is spent in cubeb code versus non-cubeb code
More details:
The profiling code includes `<atomic>`, which is C++ only, so I can't use the label in `cubeb.c` directly. Instead, I added labels to `AudioStream` and `AudioCallbackDriver`, where we call cubeb-related methods.
Differential Revision: https://phabricator.services.mozilla.com/D74172
This patch:
- creates a profiling category `Media`
- adds `Media` profiling labels in the related code
The advantage of doing so:
- it lets us easily see which operations in a profile are related to media playback
More details:
According to the description in `ProfilingCategory.h`, the `topmost profiler label frame in the label stack determines the category pair of that stack`. Therefore, most of the labels I added are on the first task that runs on a thread, to ensure that all of its following tasks are marked with the media playback label as well.
Differential Revision: https://phabricator.services.mozilla.com/D74171
From the existing usage of the function, it seems like it should either
take a JSErrorReport with no toStringResult, or a JS::ErrorReportBuilder
where it can get both the JSErrorReport and the toStringResult.
Differential Revision: https://phabricator.services.mozilla.com/D73523
Implements the spec changes from: https://github.com/tc39/proposal-weakrefs/pull/187
The spec change removes the `FinalizationRegistryCleanupIterator` in favour of
calling the clean-up callback for each finalised value. It also allows calling
`cleanupSome()` within the callback function.
`FinalizationRegistryObject::cleanupQueuedRecords()` has been changed to iterate
from back to front, because this allows us to call `GCVector::popCopy()`, which
makes it more efficient to remove entries from the `records` vector.
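A sketch of the back-to-front pattern, using `std::vector` as a stand-in for `GCVector` and `back()` plus `pop_back()` in place of `popCopy()`. The function name and the `int` record type are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Calls `callback` for each queued record, last one first, removing each
// record before the callback runs. Removing from the back is O(1), whereas
// erasing from the front would shift every remaining element.
void cleanupQueuedRecordsSketch(std::vector<int>& records,
                                const std::function<void(int)>& callback) {
  while (!records.empty()) {
    int record = records.back();
    records.pop_back();
    callback(record);
  }
}
```

Re-checking `empty()` on every iteration, rather than caching a length, keeps the loop valid even if the callback itself changes the vector.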
Differential Revision: https://phabricator.services.mozilla.com/D70821
To make sure that `<input>` elements with `pattern` attributes update their validation state (`:invalid`) properly, nsContentUtils::IsPatternMatching needs to be able to distinguish between parsing errors caused by an invalid pattern, vs parsing errors caused by OOM/overrecursion.
This patch also fixes up the places inside the new regexp engine where we can throw over-recursed to make sure that we set the right flag on the context, then fixes regexp/huge-01.js (and the binast variants) to accept a different error message.
Differential Revision: https://phabricator.services.mozilla.com/D74499
We can reduce the size of the SSO by removing the element slot entirely and instead retrieving the element through a callback function. The callback takes the value in the private slot of the SSO, which is either a LoadedScript* (from the browser) or a JSObject* (from the shell). In addition, this removes the requirement of having a script DOM element ready when parsing a JS script, which can open up new opportunities for performance.
Differential Revision: https://phabricator.services.mozilla.com/D70417