Since we will store mCompletionPromise in a subclass of ThenValueBase,
ThenCommand needs to reference the subtype in order to access mCompletionPromise.
MozReview-Commit-ID: BUi7jElOhP7
This allows us to remove two overloads of MozPromise::Then() by using variadic templates.
MozReview-Commit-ID: 5LHwDhIhh8e
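As a rough, self-contained sketch of the technique (names simplified; the real Then() returns a ThenCommand and forwards to ThenInternal()):

    #include <utility>

    struct ThenCommand {};

    // Stand-in for the shared implementation the two old overloads duplicated.
    template <typename... Args>
    ThenCommand ThenInternal(Args&&...) { return ThenCommand{}; }

    // One variadic entry point replaces the two near-identical overloads
    // (method-pointer and lambda forms) by forwarding everything.
    template <typename... Args>
    ThenCommand Then(Args&&... aArgs) {
      return ThenInternal(std::forward<Args>(aArgs)...);
    }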
PLDHashTable takes the result of the hash function and multiplies it by
kGoldenRatio to ensure that it has a good distribution of bits across
the 32-bit hash value, and then zeroes out the low bit so that it can be
used for the collision flag. This result is called hash0. From hash0
it computes two different numbers used to find entries in the table
storage: hash1 is used to find an initial position in the table to
begin searching for an entry; hash2 is then used to repeatedly offset
that position (mod the size of the table) to build a chain of positions
to search.
In a table with capacity 2^c entries, hash1 is simply the upper c bits
of hash0. This patch does not change this.
Prior to this patch, hash2 was the c bits below hash1, padded at the low
end with zeroes when c > 16. (Note that bug 927705, changeset
1a02bec165e16f370cace3da21bb2b377a0a7242, increased the maximum capacity
from 2^23 to 2^26 since 2^23 was sometimes insufficient!) This manner
of computing hash2 is problematic because it increases the risk of long
chains for very large tables, since there is less variation in the hash2
result due to the zero padding.
So this patch changes the hash2 computation by using the low bits of
hash0 instead of shifting it around, thus avoiding 0 bits in parts of
the hash2 value that are significant.
Note that this changes what hash2 is in all cases except when the table
capacity is exactly 2^16, so it does change our hashing characteristics.
For tables with capacity less than 2^16, it should be using a different
second hash, but with the same amount of random-ish data. For tables
with capacity greater than 2^16, it should be using more random-ish
data.
Note that this patch depends on the patch for bug 1353458 in order to
avoid causing test failures.
MozReview-Commit-ID: JvnxAMBY711
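In sketch form, the before/after arithmetic described above (illustrative; constants and names loosely follow PLDHashTable, not the exact tree code):

    #include <cstdint>

    using PLDHashNumber = uint32_t;
    static const uint32_t kHashBits = 32;
    static const PLDHashNumber kGoldenRatio = 0x9E3779B9U;
    static const PLDHashNumber kCollisionFlag = 1;

    // hash0: scramble the raw hash, reserving the low bit for the collision flag.
    PLDHashNumber Hash0(PLDHashNumber aRaw) {
      return (aRaw * kGoldenRatio) & ~kCollisionFlag;
    }

    // For a table with capacity 2^c, aHashShift == kHashBits - c.
    // hash1: the upper c bits of hash0 (unchanged by this patch).
    PLDHashNumber Hash1(PLDHashNumber aHash0, uint32_t aHashShift) {
      return aHash0 >> aHashShift;
    }

    // Old hash2: the bits below hash1, zero-padded at the low end when c > 16.
    PLDHashNumber Hash2Old(PLDHashNumber aHash0, uint32_t aHashShift) {
      uint32_t c = kHashBits - aHashShift;
      return (aHash0 << c) >> aHashShift;
    }

    // New hash2: the low bits of hash0, so no significant bit is forced to 0.
    PLDHashNumber Hash2New(PLDHashNumber aHash0, uint32_t aHashShift) {
      uint32_t c = kHashBits - aHashShift;
      PLDHashNumber sizeMask = (PLDHashNumber(1) << c) - 1;
      return aHash0 & sizeMask;
    }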
PLDHashTable's entry store has two types of unoccupied entries: free
entries and removed entries. When searching the entry store along a chain
of entries (determined by the hash value), the search can stop at a free
entry, but it must continue across removed entries, because a removed entry
may have been skipped over when the value we're searching for was added to
the table and has since been removed. For live entries, we also maintain
this distinction by
using one bit of storage for a collision flag, which notes that if the
hashtable entry is removed, its place in the entry store must become a
removed entry rather than a free entry.
When we add a new entry to the table, Add's semantics require that we
return an existing entry if there is one, and only create a new entry if
none exists. (Bug 1352198 suggests the possibility of a
faster alternative Add API where the caller guarantees that the key is
not already in the hashtable.) When we search for the existing entry,
we must thus continue the search across removed entries, while recording
the first removed entry found so that we can return it if the search for an
existing entry fails.
The existing code adds the collision flag through the entire table
search during an Add. This patch changes that behavior so that we only
add the collision flag prior to finding the first removed entry. Adding
it after we find the first removed entry is unnecessary, since we are
not making that entry part of a path to a new entry. If it is part of a
path to an existing entry, it will already have the collision flag set.
This patch effectively puts an if (!firstRemoved) around the else branch
of the if (MOZ_UNLIKELY(EntryIsRemoved(entry))), and then refactors that
condition outwards since it is now around the contents of both the if
and else branches.
MozReview-Commit-ID: CsXnMYttHVy
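A minimal sketch of the resulting Add-time search loop; the entry states and names here are illustrative, not the exact PLDHashTable representation:

    #include <cstdint>
    #include <vector>

    struct Entry {
      enum class State { Free, Removed, Live } mState = State::Free;
      uint32_t mKeyHash = 0;  // hash0 of the key; low bit = collision flag
      bool Matches(uint32_t aKeyHash) const {
        return mState == State::Live && (mKeyHash & ~1u) == (aKeyHash & ~1u);
      }
    };

    Entry* SearchForAdd(std::vector<Entry>& aEntries, uint32_t aHash1,
                        uint32_t aHash2, uint32_t aSizeMask, uint32_t aKeyHash) {
      Entry* entry = &aEntries[aHash1 & aSizeMask];
      Entry* firstRemoved = nullptr;
      while (entry->mState != Entry::State::Free) {
        if (entry->Matches(aKeyHash)) {
          return entry;                  // Add must return existing entries
        }
        // The hoisted condition: once we have a removed slot to reuse, the
        // rest of the chain is not on the path to the new entry, so there is
        // no need to flag it.
        if (!firstRemoved) {
          if (entry->mState == Entry::State::Removed) {
            firstRemoved = entry;        // remember the first reusable slot
          } else {
            entry->mKeyHash |= 1u;       // set the collision flag
          }
        }
        aHash1 = (aHash1 + aHash2) & aSizeMask;  // follow the double-hash chain
        entry = &aEntries[aHash1 & aSizeMask];
      }
      return firstRemoved ? firstRemoved : entry;
    }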
This includes renaming its fields to match SpiderMonkey naming conventions
instead of Gecko naming conventions.
This patch is just about moving the code. The next patch will change
SpiderMonkey to actually use PseudoStack directly.
A previous bug missed a few places where we could theoretically
reenter the CC.
MozReview-Commit-ID: I0otlAEwyZa
The code already handles this if the r32 is eax. This allows it to use the other 32-bit registers.
At the same time, remove the MOZ_STATIC_ASSERT_VALID_ARG_COUNT, which
doesn't actually work for more than 50 arguments (*), and which is no
longer useful for detecting 0 arguments.
(*) The build fails, but not directly because of the static_assert it
expands to.
http://searchfox.org/mozilla-central/rev/d441cb24482c2e5448accaf07379445059937080/xpcom/threads/MozPromise.h#953-958
MozPromiseRequestHolder is not thread-safe and it is possible for
mReceiver->ThenInternal() to trigger resolve/reject callbacks before
aRequestHolder.Track() is run. We should call aRequestHolder.Track()
before mReceiver->ThenInternal() to avoid the race condition.
MozReview-Commit-ID: K2R09m9UFBF
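A self-contained sketch of the reordering, with stub types standing in for the real ones (names assumed, heavily simplified):

    struct ThenValue {
      void Fire() { /* may invoke resolve/reject callbacks immediately */ }
    };
    struct Receiver {
      void ThenInternal(ThenValue* aThenValue) { aThenValue->Fire(); }
    };
    struct RequestHolder {  // MozPromiseRequestHolder analogue; not thread-safe
      ThenValue* mTracked = nullptr;
      void Track(ThenValue* aValue) { mTracked = aValue; }
    };

    // The fix: populate the holder first, so a callback fired from inside
    // ThenInternal() cannot observe an empty holder.
    void TrackThenRegister(Receiver* aReceiver, ThenValue* aThenValue,
                           RequestHolder& aRequestHolder) {
      aRequestHolder.Track(aThenValue);
      aReceiver->ThenInternal(aThenValue);
    }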
Delegating to mozilla::Runnable caused us to return the wrong value once
SchedulerGroup started passing a non-empty name to the Runnable constructor.
MozReview-Commit-ID: 2zMlpiMnHwv
We don't need the tiny helper function for single-character assignment;
the multi-argument assign() method is the only thing that uses it, and
that method is itself unused. So remove them both.
ProfileEntry has |string|, which can be static or dynamic, and |dynamicString|.
If |string| is dynamic, the FRAME_LABEL_COPY flag must be set, and it will be
copied into profiler output.
But there is only one place that uses dynamic |string| values, in SpiderMonkey.
And that place doesn't use |dynamicString|. So this patch changes that place to
use an empty |string| and put the old dynamic |string| value in
|dynamicString|. This in turn removes the need for FRAME_LABEL_COPY.
One minor wrinkle is that when |dynamicString| is used the old code put a space
between |string| and |dynamicString|. The new code omits the space if |string|
is empty.
The patch also renames ProfileEntry::string to ProfileEntry::label_, which
better matches how it's used, and ProfileEntry::dynamicString to
ProfileEntry::dynamicString_, so the getter can be named dynamicString().
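In sketch form, the join rule described above (a hypothetical helper; the real change is spread across the profiler output code):

    #include <string>

    // A space separates label_ and dynamicString_ only when both are
    // non-empty; when label_ is empty there is no leading space.
    std::string FrameLabel(const char* aLabel, const char* aDynamicString) {
      std::string out = aLabel ? aLabel : "";
      if (aDynamicString && *aDynamicString) {
        if (!out.empty()) {
          out += ' ';            // old behavior kept when label_ is non-empty
        }
        out += aDynamicString;
      }
      return out;
    }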
This patch uses the profiler_suspend_sample_thread method, which was added in
part 1.
With this patch, we no longer manually run code to pause the target thread,
instead using the profiler's provided code to do so. In addition, we no longer
manually walk the stack to collect native stack frames, instead relying on the
profiler's cross-platform stack walking logic.
This helps remove some of the code from ThreadStackHelper which was redundant
with the profiler. Much of the pseudostack code in ThreadStackHelper is also
redundant, and should hopefully be eliminated in a follow-up.
MozReview-Commit-ID: 4RjLHt6inH9
Accumulates, in a histogram, the time between runs of unlabeled runnables. This
measurement will be useful to see how much of a win the cooperative scheduler
will be, assuming we label no more runnables.
MozReview-Commit-ID: 9lgoGJcXLP9
Add breakpoint support for AArch64, and fix a scanf format specifier
warning. Also fix an #if line in xptcinvoke_arm.cpp to work as intended.
MozReview-Commit-ID: BSjYVD8Zq0t
This function is arguably nicer than calling NS_ProcessNextEvent
manually, is slightly more efficient, and will enable better auditing
for NS_ProcessNextEvent when we do Quantum DOM scheduling changes.
When IsExclusive is true, there is at most one consumer, so it is safe to
move the ResolveOrRejectValue stored in the promise.
MozReview-Commit-ID: ED9fFr7TkvN
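A minimal sketch of the idea, using if constexpr for brevity (the real code is keyed off MozPromise's IsExclusive template parameter):

    #include <utility>

    // Hand the stored value to the single consumer by move when exclusive;
    // otherwise expose it by const reference so each consumer copies it.
    template <bool IsExclusive, typename V>
    decltype(auto) PassValue(V& aStoredValue) {
      if constexpr (IsExclusive) {
        return std::move(aStoredValue);   // sole consumer: moving is safe
      } else {
        return static_cast<const V&>(aStoredValue);
      }
    }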
We now have code that unconditionally requires the Rust
compiler, and we are committed to adding more. Remove this
last vestige of conditional support.
MozReview-Commit-ID: EK6FBnAbR
This change adds two functions that should be used instead of PR_GetCurrentThread; both return a PRThread. GetCurrentPhysicalThread does exactly the same thing as PR_GetCurrentThread, but it makes it clearer what you're getting when there are cooperatively scheduled threads running.
GetCurrentVirtualThread returns the same value for all threads in a cooperative thread pool. The actual value it returns is somewhat immaterial.
MozReview-Commit-ID: 4lFwsF2NuzC
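The declarations, as described (sketch; exact header location assumed):

    #include "prthread.h"

    // Identical to PR_GetCurrentThread(), but the name makes it clear you
    // are getting the physical thread even when cooperatively scheduled
    // threads are running.
    PRThread* GetCurrentPhysicalThread();

    // Returns the same PRThread* for every thread in a cooperative thread
    // pool; the specific value is somewhat immaterial, it only needs to be
    // stable per pool.
    PRThread* GetCurrentVirtualThread();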
A template won't be instantiated until it is used. We make it a compile error
to call Then() on ThenCommand<false> to disallow promise chaining.
MozReview-Commit-ID: BUCFgfX4FTJ
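A self-contained sketch of the trick; the static_assert condition is made dependent on the member template's own parameters so it is only evaluated if Then() is actually instantiated:

    template <bool SupportChaining>
    class ThenCommand {
     public:
      template <typename... Args>
      void Then(Args&&...) {
        // Dependent on Args, so the check fires at instantiation time, not
        // when the class itself is instantiated.
        static_assert(SupportChaining && sizeof...(Args) >= 0,
                      "Promise chaining is disallowed here.");
      }
    };

    int main() {
      ThenCommand<false> cmd;  // fine: Then() is never instantiated
      // cmd.Then(1);          // would fail to compile via the static_assert
      ThenCommand<true> ok;
      ok.Then(1);              // fine
    }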
This annotates vsprintf-like functions with MOZ_FORMAT_PRINTF. This may
provide some minimal checking of such calls (the GCC docs say that it
checks the string for "consistency"); in any case it shouldn't hurt.
MozReview-Commit-ID: HgnAK1LiorE
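For va_list ("vsprintf-like") functions the GCC convention is format(printf, N, 0): argument N is the format string and 0 means there are no call-site varargs to check against it. A sketch of how the annotation is applied (macro definition shown for illustration; the real one lives in MFBT):

    #include <stdarg.h>

    #ifdef __GNUC__
    #  define MOZ_FORMAT_PRINTF(stringIndex, firstToCheck) \
         __attribute__((format(printf, stringIndex, firstToCheck)))
    #else
    #  define MOZ_FORMAT_PRINTF(stringIndex, firstToCheck)
    #endif

    // Index 1 is the format string; 0 marks this as a va_list variant.
    void LogV(const char* aFmt, va_list aArgs) MOZ_FORMAT_PRINTF(1, 0);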
The OrInsert() method adds the new entry to the hashtable if needed, and
returns the newly added entry or the pre-existing one. It allows for a
more concise syntax at the call site.
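Call-site shape, as described (a sketch with a hypothetical nsClassHashtable of Foo; before and after):

    #include "nsClassHashtable.h"
    #include "nsHashKeys.h"

    struct Foo {};
    using FooTable = nsClassHashtable<nsUint32HashKey, Foo>;

    // Before: lookup and insert spelled out separately at the call site.
    Foo* GetOrCreateBefore(FooTable& aTable, uint32_t aKey) {
      Foo* foo = aTable.Get(aKey);
      if (!foo) {
        foo = new Foo();
        aTable.Put(aKey, foo);
      }
      return foo;
    }

    // After: one expression; the lambda runs only when the entry is new.
    Foo* GetOrCreateAfter(FooTable& aTable, uint32_t aKey) {
      return aTable.LookupForAdd(aKey).OrInsert([]() { return new Foo(); });
    }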
One of the things that I've noticed in profiling startup overhead is that,
even with the startup cache, we spend about 130ms just loading and decoding
scripts from the startup cache on my machine.
I think we should be able to do better than that by doing some of that work in
the background for scripts that we know we'll need during startup. With this
change, we seem to consistently save about 3-5% on non-e10s startup overhead
on talos. But there's a lot of room for tuning, and I think we can get
considerable improvement with a few ongoing tweaks.
Some notes about the approach:
- Setting up the off-thread compile is fairly expensive, since we need to
create a global object, and a lot of its built-in prototype objects for each
compile. So in order for there to be a performance improvement for OMT
compiles, the script has to be pretty large. Right now, the tipping point
seems to be about 20K.
There's currently no easy way to improve the per-compile setup overhead, but
we should be able to combine the off-thread compiles for multiple smaller
scripts into a single operation without any additional per-script overhead.
- The time we spend setting up scripts for OMT compile is almost entirely
CPU-bound. That means that we have a chunk of about 20-50ms where we can
safely schedule thread-safe IO work during early startup, so if we schedule
some of our current synchronous IO operations on background threads during the
script cache setup, we basically get them for free, and can probably increase
the number of scripts we compile in the background.
- I went with an uncompressed mmap of the raw XDR data for a storage format.
That currently occupies about 5MB of disk space. Gzipped, it's ~1.2MB, so
compressing it might save some startup disk IO, but keeping it uncompressed
simplifies a lot of the OMT and even main-thread decoding process. More
importantly:
- We currently don't use the startup cache in content processes, for a variety
of reasons. However, with this approach, I think we can safely store the
cached script data from a content process before we load any untrusted code
into it, and then share mmapped startup cache data between all content
processes. That should speed up content process startup *a lot*, and very
likely save memory, too. And:
- If we're especially concerned about saving per-process memory, and we keep
the cache data mapped for the lifetime of the JS runtime, I think that with
some effort we can probably share the static string data from scripts between
content processes, without any copying. Right now, it looks like for the main
process, there's about 1.5MB of string-ish data in the XDR dumps. It's
probably less for content processes, but if we could save .5MB per process
this way, it might make it easier to increase the number of content processes
we allow.
MozReview-Commit-ID: CVJahyNktKB
The lossy conversion to ASCII here can also fail; we should handle that
as well. Rewriting the code to use MakeScopeExit also avoids tangled
logic and/or duplicating calls to ensure the destination string is
truncated on failure.
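A sketch of the MakeScopeExit shape (the conversion call here is hypothetical):

    #include "mozilla/ScopeExit.h"
    #include "nsString.h"

    // Hypothetical fallible conversion; stands in for the real call.
    bool ConvertToLossyAscii(const nsAString& aSource, nsACString& aDest);

    nsresult CopyLossy(const nsAString& aSource, nsACString& aDest) {
      // Any failure path leaves the destination truncated...
      auto guard = mozilla::MakeScopeExit([&] { aDest.Truncate(); });
      if (!ConvertToLossyAscii(aSource, aDest)) {
        return NS_ERROR_OUT_OF_MEMORY;
      }
      guard.release();  // ...and on success we keep the converted result.
      return NS_OK;
    }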
See also bug 1361495 - PR_SI_HOSTNAME is implemented in NSPR on
Windows by initializing winsock, which can be janky. There don't seem
to be any users of this property, and it has tracking concerns anyhow -
so remove it.
MozReview-Commit-ID: S2AwzMUgYk
This possibly incurs an extra function call (depending on exactly how much inlining
the compiler does). But it helps enormously with subsequent refactorings,
because PseudoStack (and related types) don't need to be visible in
GeckoProfiler.h, which is exported outside the profiler.
When we just had CycleCollectedJSContext (and no CycleCollectedJSRuntime) a
weird thing happened at shutdown:
1. We would call JS_DestroyContext from ~CycleCollectedJSContext. By that time,
the ~XPCJSContext destructor had already finished.
2. Destroying the context runs a final GC. That GC would call back into various
GC callbacks, such as TraceBlackJS and TraceGrayJS.
3. These callbacks would do a virtual method call:
http://searchfox.org/mozilla-central/rev/876c7dd30586f9c6f9c99ef7444f2d73c7acfe7c/xpcom/base/CycleCollectedJSRuntime.cpp#791
4. Normally this method call would call into
XPCJSContext::TraceNativeBlackRoots. However, C++ changes the vtable for an
object during destruction. So we would only call CycleCollectedJSContext's
version of TraceNativeBlackRoots, which is empty. So we never traced anything.
When I moved this code into the runtime, we actually do call into
XPCJSRuntime::TraceNativeBlackRoots at that time. So the behavior changed, and
that was causing crashes once I nulled out the TLS as you asked. So I removed
these callbacks for the last GC.
MozReview-Commit-ID: 3do13bjpwQj
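Step 4 is standard C++ behavior and easy to demonstrate standalone:

    #include <cstdio>

    struct Base {
      virtual void TraceNativeBlackRoots() { std::puts("Base: traces nothing"); }
      // During ~Base the vtable has already been reset to Base's, so this
      // virtual call runs the empty base version, not Derived's override.
      virtual ~Base() { TraceNativeBlackRoots(); }
    };

    struct Derived : Base {
      void TraceNativeBlackRoots() override { std::puts("Derived: real work"); }
    };

    int main() {
      Derived d;
      d.TraceNativeBlackRoots();  // prints "Derived: real work"
    }                             // destruction prints "Base: traces nothing"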
This patch keeps a list of all the cooperatively scheduled contexts that are
linked to a runtime. In places where we need to iterate over all contexts (for
GC, specifically), it iterates over the list.
MozReview-Commit-ID: 3pKJX78f2l0
This removes the last uses of PR_smprintf from the tree (excluding the
security and nsprpub directories). It also fixes a related latent bug
in nsAppRunner.cpp (which was incorrectly freeing the pointer passed to
PR_SetEnv).
MozReview-Commit-ID: GynP2PhuWWO
StoreCopyPassByRRef<> ensures a copy is stored in the runnable.
We don't have to worry about the concern of bug 1300476.
MozReview-Commit-ID: DHqlzlVLBFV
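A standalone analogue of what StoreCopyPassByRRef<> arranges (the real machinery lives in the runnable/InvokeAsync plumbing; this is illustrative only):

    #include <functional>
    #include <utility>
    #include <vector>

    void Consume(std::vector<int>&& aBytes) { /* runs on the target thread */ }

    // The "runnable" owns its own copy of the argument and hands it to the
    // target by rvalue reference when run, so the caller's buffer can't
    // dangle (the bug 1300476 concern).
    template <typename T, typename F>
    std::function<void()> MakeStoreCopyRunnable(F aFunc, T aArg) {
      return [aFunc, stored = std::move(aArg)]() mutable {
        aFunc(std::move(stored));
      };
    }

    void Example(std::vector<int> aBytes) {
      auto runnable = MakeStoreCopyRunnable<std::vector<int>>(
          Consume, std::move(aBytes));
      // ... dispatch `runnable` to another thread; it is self-contained ...
      runnable();
    }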
This change moves most of the logic for the threadsafety check into
nsAutoOwningThread, rather than having part of the logic live in
nsAutoOwningThread and part of the logic live in nsDebug.h. Changing
this also forces us to clean up a couple of places that replicated the
logic that lived in nsDebug.h as well.
TimeStamp::ProcessCreation()'s aIsInconsistent outparam is ignored by the
majority of its callers. This patch makes it optional. Notably, this makes
ProcessCreation() easier to use in a constructor's initializer list.
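A sketch of how this reads at a call site (assuming the outparam becomes a pointer defaulting to null):

    #include "mozilla/TimeStamp.h"

    // Assumed post-patch shape:
    //   static TimeStamp ProcessCreation(bool* aIsInconsistent = nullptr);
    struct StartupRecord {
      mozilla::TimeStamp mProcessStart;
      // Initializer-list friendly now that the outparam can be omitted:
      StartupRecord() : mProcessStart(mozilla::TimeStamp::ProcessCreation()) {}
    };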
Other browsers do not support any of these (IIRC), telemetry reports
essentially zero usage, and supporting them is contrary to the DOM spec.
Notes on specific events:
CommandEvent and SimpleGestureEvent: These are not supposed to be
web-exposed APIs, so I hid the interfaces from web content too
(necessary to avoid test_all_synthetic_events.html failures).
DataContainerEvent: This was a non-standard substitute for CustomEvent
that seemed to have only one user, so I removed it entirely and switched
the user (MozillaFileLogger.js) to CustomEvent.
ScrollAreaEvent: This is entirely non-standard, but we apparently expose
it deliberately to web content, so I didn't see any reason to remove it
from createEvent.
SimpleGestureEvent and XULCommandEvent: Can still be created from
createEvent(), but not by content.
TimeEvent: This is still in because it has no constructor, so there's no
other way to create it. Ideally we'd update the SMIL spec to add a
constructor. I did remove TimeEvents.
MozReview-Commit-ID: 7Yi2oCl9SM2
Change mozilla::Smprintf and friends to return a UniquePtr, rather than
relying on manual memory management. (Though after this patch there are
still a handful of spots needing SmprintfFree.)
MozReview-Commit-ID: COa4nzIX5qa
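A call-site sketch after the change (SmprintfPointer is assumed as the UniquePtr-style return type):

    #include <cstdio>
    #include "mozilla/Printf.h"

    void LogProgress(int aDone, int aTotal) {
      mozilla::SmprintfPointer msg =
          mozilla::Smprintf("%d of %d tasks done\n", aDone, aTotal);
      if (msg) {
        fputs(msg.get(), stderr);
      }  // the buffer is released automatically; no SmprintfFree() here
    }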
This allows more use of the implicit version of InvokeAsync() without specifying the storage types explicitly.
MozReview-Commit-ID: 40WisaVX8Jy
NS_SetCurrentThreadName() is added as an alternative to PR_SetCurrentThreadName()
inside libxul. The thread names are collected in the form of a crash
annotation to be processed on Socorro.
MozReview-Commit-ID: 4RpAWzTuvPs
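Usage sketch (signature assumed to mirror PR_SetCurrentThreadName):

    #include "nsThreadUtils.h"

    void WorkerThreadMain() {
      // Names the calling thread; the name is also recorded so it can be
      // attached to crash reports as an annotation.
      NS_SetCurrentThreadName("DNS Resolver");
      // ... thread body ...
    }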
clang's -Wcomma warning flags suspicious uses of the comma operator, such as calling a function for its side effects within an expression. Check NS_SUCCEEDED() so that HasMoreElement() can be used in an expression while confirming that it actually returned a legitimate out parameter.
xpcom/io/nsLocalFileUnix.cpp:725:48 [-Wcomma] possible misuse of comma operator here
xpcom/io/nsLocalFileUnix.cpp:1053:39 [-Wcomma] possible misuse of comma operator here
MozReview-Commit-ID: aebrgc5Wqk
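Illustrative before/after for the warning (enumerator-style API assumed):

    #include "nsCOMPtr.h"
    #include "nsISimpleEnumerator.h"

    void DrainEnumerator(nsISimpleEnumerator* aEnum) {
      bool hasMore;
      // Before: the comma operator calls the function only for its side
      // effect and discards the nsresult, which -Wcomma flags:
      //   while (aEnum->HasMoreElements(&hasMore), hasMore) { /* ... */ }
      // After: consume the nsresult so the intent is explicit.
      while (NS_SUCCEEDED(aEnum->HasMoreElements(&hasMore)) && hasMore) {
        // ... GetNext() and use the element ...
      }
    }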
Separate AbstractThread::InitTLS and AbstractThread::InitMainThread.
Initialize the AbstractThread main thread when initializing
nsThreadManager. Initialize AbstractThread TLS for all child process
types, because plugin and GMP processes do IPC without ever initializing
XPCOM, while for content processes initializing XPCOM requires IPC.
MozReview-Commit-ID: DhLub23oZz8
Everything depending on the widget being gonk can go away, as well as
everything depending on MOZ_AUDIO_CHANNEL_MANAGER, which was only
defined on gonk builds under b2g/ (which goes away in bug 1357326).
This patch centralizes all of the pref-checking code for e10s-multi in a
single function. It is intended to be used throughout the codebase to see if
e10s-multi is "on". It also introduces dom.ipc.multiOptOut, which can be set
by the user to indicate that they do not want to participate in the e10s-multi
experiment.
MozReview-Commit-ID: Kyq1fqNzwue
This is necessary to allow helper threads to attempt large allocations and recover from fragmentation situations with the LargeAllocationFailureCallback.
MozReview-Commit-ID: AyA3pbXcaYy
To run JS in separate cooperative threads, we need to split up per-thread state
from per-runtime state. This patch does that for XPConnect.
MozReview-Commit-ID: 407SlJ7nR6v