Flushing the cache at startup is already handled automatically by the
AppStartup code, which removes the entire startupCache directory when
necessary. The add-on manager requires being able to flush the cache at
runtime, though, for the sake of updating bootstrapped add-ons.
MozReview-Commit-ID: LIdiNHrXYXu
--HG--
extra : source : 8f4637881ddc42a948c894e62c8486fe8677a938
extra : histedit_source : e69395a2b87b2b0edb394686ed6ee24731ba9fb8
One of the things that I've noticed in profiling startup overhead is that,
even with the startup cache, we spend about 130ms just loading and decoding
scripts from the startup cache on my machine.
I think we should be able to do better than that by doing some of that work in
the background for scripts that we know we'll need during startup. With this
change, we seem to consistently save about 3-5% on non-e10s startup overhead
on talos. But there's a lot of room for tuning, and I think we can get
considerable further improvement with a few ongoing tweaks.
Some notes about the approach:
- Setting up the off-thread compile is fairly expensive, since we need to
create a global object, and a lot of its built-in prototype objects for each
compile. So in order for there to be a performance improvement for OMT
compiles, the script has to be pretty large. Right now, the tipping point
seems to be about 20K.
There's currently no easy way to improve the per-compile setup overhead, but
we should be able to combine the off-thread compiles for multiple smaller
scripts into a single operation without any additional per-script overhead
(see the size-threshold sketch after this list).
- The time we spend setting up scripts for OMT compile is almost entirely
CPU-bound. That means that we have a chunk of about 20-50ms where we can
safely schedule thread-safe IO work during early startup, so if we schedule
some of our current synchronous IO operations on background threads during the
script cache setup, we basically get them for free, and can probably increase
the number of scripts we compile in the background.
- I went with an uncompressed mmap of the raw XDR data for a storage format.
That currently occupies about 5MB of disk space. Gzipped, it's ~1.2MB, so
compressing it might save some startup disk IO, but keeping it uncompressed
simplifies a lot of the OMT and even main-thread decoding process (see the
mmap sketch after this list). More importantly:
- We currently don't use the startup cache in content processes, for a variety
of reasons. However, with this approach, I think we can safely store the
cached script data from a content process before we load any untrusted code
into it, and then share mmapped startup cache data between all content
processes. That should speed up content process startup *a lot*, and very
likely save memory, too. And:
- If we're especially concerned about saving per-process memory, and we keep
the cache data mapped for the lifetime of the JS runtime, I think that with
some effort we can probably share the static string data from scripts between
content processes, without any copying. Right now, it looks like for the main
process, there's about 1.5MB of string-ish data in the XDR dumps. It's
probably less for content processes, but if we could save 0.5MB per process
this way, it might make it easier to increase the number of content processes
we allow.
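
As a rough sketch of the size-threshold heuristic from the first note above;
the constant and helper name here are assumptions for illustration, not the
actual patch:

    #include <cstddef>

    // Hypothetical helper: only scripts large enough to amortize the cost of
    // creating a fresh global (and its built-in prototypes) are worth sending
    // to an off-main-thread compile; everything else compiles on the main
    // thread as before. The ~20K figure is the tipping point mentioned above.
    static constexpr size_t kOffThreadMinimumSourceLength = 20 * 1024;

    static bool ShouldCompileOffThread(size_t aSourceLength) {
      return aSourceLength >= kOffThreadMinimumSourceLength;
    }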
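
And a minimal sketch of the uncompressed-mmap storage idea from the note
above, using plain POSIX mmap; the function and its error handling are
hypothetical rather than the cache's real reader:

    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>
    #include <cstdint>

    // Map the raw XDR blob read-only. The pages can then be decoded lazily,
    // either on the main thread or off-thread, and in principle shared
    // between processes without copying.
    static const uint8_t* MapScriptCache(const char* aPath, size_t* aLength) {
      int fd = open(aPath, O_RDONLY);
      if (fd < 0) {
        return nullptr;
      }
      struct stat info;
      if (fstat(fd, &info) != 0) {
        close(fd);
        return nullptr;
      }
      void* addr = mmap(nullptr, info.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
      close(fd);  // The mapping remains valid after the descriptor is closed.
      if (addr == MAP_FAILED) {
        return nullptr;
      }
      *aLength = static_cast<size_t>(info.st_size);
      return static_cast<const uint8_t*>(addr);
    }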
MozReview-Commit-ID: CVJahyNktKB
--HG--
extra : source : 1c7df945505930d2d86a076ee20807104324c8cc
extra : histedit_source : 75e193839edf727874f01b2a9f6852f6c1f087fb%2C3ce966d7dcf2bd0454a7d673d0467097456bd782
The patch makes the following proxy changes:
* The number of slots in ProxyValueArray is now dynamic and depends on the number of reserved slots we get from the Class.
* "Extra slots" was renamed to "Reserved slots" to make this clearer.
* All proxy Classes now have 2 reserved slots, but it should be easy to change that for proxy Classes that need more than 2 slots.
* Proxies now store a pointer to these slots and this means GetReservedSlot and SetReservedSlot can be used on proxies as well. We no longer need GetReservedOrProxyPrivateSlot and SetReservedOrProxyPrivateSlot.
And some changes to make DOM Proxies work with this:
* We now store the C++ object in the first reserved slot (DOM_OBJECT_SLOT) instead of in the proxy's private slot. This is pretty nice because it matches what we do for non-proxy DOM objects.
* We now store the expando in the proxy's private slot so I removed GetDOMProxyExpandoSlot and changed the IC code to get the expando from the private slot instead.
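
A hedged sketch of what the unified slot access looks like after this change;
the header, slot index value, and helper functions are illustrative, but
JS_GetReservedSlot/JS_SetReservedSlot are the ordinary JSAPI accessors that
now also work on proxies:

    #include "jsapi.h"

    // Illustrative: with two reserved slots on every proxy Class, the DOM
    // bindings can keep the C++ object in slot 0 (DOM_OBJECT_SLOT), matching
    // what non-proxy DOM objects already do.
    static const uint32_t DOM_OBJECT_SLOT = 0;

    static void StoreNative(JSObject* aProxy, void* aNative) {
      JS_SetReservedSlot(aProxy, DOM_OBJECT_SLOT, JS::PrivateValue(aNative));
    }

    static void* LoadNative(JSObject* aProxy) {
      return JS_GetReservedSlot(aProxy, DOM_OBJECT_SLOT).toPrivate();
    }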
When we just had CycleCollectedJSContext (and no CycleCollectedJSRuntime), a
weird thing happened at shutdown:
1. We would call JS_DestroyContext from ~CycleCollectedJSContext. By that time,
the ~XPCJSContext destructor had already finished.
2. Destroying the context runs a final GC. That GC would call back into various
GC callbacks, such as TraceBlackJS and TraceGrayJS.
3. These callbacks would do a virtual method call:
http://searchfox.org/mozilla-central/rev/876c7dd30586f9c6f9c99ef7444f2d73c7acfe7c/xpcom/base/CycleCollectedJSRuntime.cpp#791
4. Normally this method call would call into
XPCJSContext::TraceNativeBlackRoots. However, C++ changes the vtable for an
object during destruction, so we would only call CycleCollectedJSContext's
version of TraceNativeBlackRoots, which is empty, and we never traced
anything.
When I moved this code into the runtime, we started actually calling into
XPCJSRuntime::TraceNativeBlackRoots at that point. That change in behavior
was causing crashes once I nulled out the TLS as you asked, so I removed
these callbacks for the last GC.
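
The vtable behavior in step 4 is standard C++: once a derived destructor has
finished, virtual calls on the object dispatch to the base class's overrides.
A standalone illustration with generic class names (not the actual XPConnect
classes):

    #include <cstdio>

    struct ContextBase {
      virtual ~ContextBase() {
        // By the time the base destructor runs, the derived destructor has
        // already finished, so this virtual call resolves to the base
        // (empty-ish) version, not the derived override.
        TraceNativeBlackRoots();
      }
      virtual void TraceNativeBlackRoots() { puts("base: nothing traced"); }
    };

    struct XPCContextLike : ContextBase {
      void TraceNativeBlackRoots() override { puts("derived: roots traced"); }
    };

    int main() {
      XPCContextLike cx;
      return 0;  // Destroying cx prints "base: nothing traced".
    }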
MozReview-Commit-ID: 3do13bjpwQj
This patch keeps a list of all the cooperatively scheduled contexts that are
linked to a runtime. In places where we need to iterate over all contexts (for
GC, specifically), it iterates over the list.
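
A minimal sketch of the bookkeeping being described; the container choice and
names are assumptions for illustration, not the actual patch:

    #include <vector>

    struct CooperativeContext;

    struct RuntimeState {
      // Every cooperatively scheduled context linked to the runtime registers
      // itself here, so code such as the GC can visit all of them.
      std::vector<CooperativeContext*> mContexts;

      void RegisterContext(CooperativeContext* aCx) {
        mContexts.push_back(aCx);
      }

      template <typename Func>
      void ForEachContext(Func&& aFunc) {
        for (CooperativeContext* cx : mContexts) {
          aFunc(cx);
        }
      }
    };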
MozReview-Commit-ID: 3pKJX78f2l0
This field assumes there is one XPCJSContext globally (i.e., per nsXPConnect
instance). This patch eliminates the field by using different paths to
reach the context.
MozReview-Commit-ID: FjR6cTZ5QfZ
XPCJSContext::Get() now does a TLS lookup, which is a little more expensive
than looking up a global variable. This patch removes as many of the TLS
lookups as possible.
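
The removals mostly amount to hoisting the lookup out of hot paths; a
self-contained sketch of the pattern with a stand-in class (not the real
XPCJSContext):

    #include <cstdio>

    // Stand-in for a class whose Get() does a thread-local lookup.
    struct FakeContext {
      static FakeContext* Get() {
        static thread_local FakeContext sContext;
        return &sContext;  // A TLS load on every call.
      }
      void Use() { std::puts("using context"); }
    };

    int main() {
      // Instead of calling FakeContext::Get() inside the loop, do the TLS
      // lookup once and reuse the pointer.
      FakeContext* cx = FakeContext::Get();
      for (int i = 0; i < 3; ++i) {
        cx->Use();
      }
      return 0;
    }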
MozReview-Commit-ID: GsqzJn55Lya
Once we have multiple XPCJSContexts, we may have to figure out which one we're
using with TLS. A later patch tries to remove as many of these TLS lookups
as possible so that performance doesn't suffer.
MozReview-Commit-ID: 50uHpDLZmUH
Change FormatStackDump to return UniqueChars and fix up the users. This
removes a bit more manual memory management.
MozReview-Commit-ID: 60GBgeS4rzg
--HG--
extra : rebase_source : 15060321f567816ca434cdf1ef816d8322ceefff
TimeStamp::ProcessCreation()'s aIsInconsistent outparam is ignored by the
majority of its callers. This patch makes it optional. Notably, this makes
ProcessCreation() easier to use in a constructor's initializer list.
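
The usual way to make an outparam optional is a defaulted pointer parameter; a
sketch of that pattern (this shows the shape of the change, not the literal
TimeStamp code):

    #include <cstdint>

    // Callers that don't care about the inconsistency flag simply omit the
    // argument; callers that do can pass a bool* as before.
    uint64_t ProcessCreationSketch(bool* aIsInconsistent = nullptr) {
      bool inconsistent = false;
      uint64_t creationTime = 0;  // placeholder for the real computation
      if (aIsInconsistent) {
        *aIsInconsistent = inconsistent;
      }
      return creationTime;
    }

    // This is what makes it usable directly in an initializer list:
    //   MyClass::MyClass() : mStart(ProcessCreationSketch()) {}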
Other browsers do not support any of these (IIRC), telemetry reports
essentially zero usage, and supporting them is contrary to the DOM spec.
Notes on specific events:
CommandEvent and SimpleGestureEvent: These are not supposed to be
web-exposed APIs, so I hid the interfaces from web content too
(necessary to avoid test_all_synthetic_events.html failures).
DataContainerEvent: This was a non-standard substitute for CustomEvent
that seemed to have only one user, so I removed it entirely and switched
the user (MozillaFileLogger.js) to CustomEvent.
ScrollAreaEvent: This is entirely non-standard, but we apparently expose
it deliberately to web content, so I didn't see any reason to remove it
from createEvent.
SimpleGestureEvent and XULCommandEvent: Can still be created from
createEvent(), but not by content.
TimeEvent: This is still in because it has no constructor, so there's no
other way to create it. Ideally we'd update the SMIL spec to add a
constructor. I did remove TimeEvents.
MozReview-Commit-ID: 7Yi2oCl9SM2
--HG--
extra : rebase_source : 199ab921acfc531b8b85e77f90fcd799b03c887b
This changes JS_smprintf to return UniqueChars, rather than relying on
manual memory management.
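
Caller-side usage after the change looks roughly like this (a minimal sketch;
the header name reflects where JS_smprintf lives today and may differ in older
trees):

    #include <cstdio>
    #include "js/Printf.h"

    void PrintCount(unsigned aCount) {
      // JS_smprintf now returns owning JS::UniqueChars rather than a raw
      // pointer that the caller had to release manually.
      JS::UniqueChars msg = JS_smprintf("count = %u", aCount);
      if (msg) {
        fprintf(stderr, "%s\n", msg.get());
      }
      // The buffer is freed automatically when msg goes out of scope.
    }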
MozReview-Commit-ID: ENjQJODYdD1
--HG--
extra : rebase_source : 4c8ad4719dce205a7ef25e41eca25c5af793bb47
With the changes introduced in bug 1356066, xpc_LocalizeContext is now called
on each app locale change to update the default locale in each context.
Unfortunately, this function also assigns the locale callbacks, and with my
change it started doing that on each language change.
In this patch, I first check whether the given context already has
XPCLocaleCallbacks assigned, and only assign them if it doesn't.
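
The fix is just an idempotence guard; a generic sketch of the shape (none of
these names are the actual XPConnect code):

    struct LocaleCallbacksSketch {};

    // Install the callbacks only the first time this runs for a context;
    // later locale changes should only refresh the default locale and must
    // not reassign the callbacks.
    void LocalizeContextSketch(LocaleCallbacksSketch*& aInstalled,
                               LocaleCallbacksSketch* aCallbacks) {
      if (!aInstalled) {
        aInstalled = aCallbacks;
      }
      // ... update the default locale unconditionally ...
    }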
MozReview-Commit-ID: 7AiCsJfKBID
--HG--
extra : rebase_source : 1efe65895759ffc07e0047d063a405d757cb1092
NS_SetCurrentThreadName() is added as an alternative to PR_SetCurrentThreadName()
inside libxul. The thread names are collected in the form of a crash
annotation to be processed on Socorro.
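
Usage inside libxul is a drop-in replacement (a small sketch; the thread body
and name are illustrative):

    #include "nsThreadUtils.h"

    static void WorkerThreadMain(void* /* aArg */) {
      // Using the XPCOM wrapper instead of PR_SetCurrentThreadName() directly
      // lets the name also be collected into the crash annotation that gets
      // processed on Socorro.
      NS_SetCurrentThreadName("Hypothetical Worker");
      // ... do the thread's work ...
    }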
MozReview-Commit-ID: 4RpAWzTuvPs