Flushing the cache at startup is already handled automatically by the
AppStartup code, which removes the entire startupCache directory when
necessary. The add-on manager requires being able to flush the cache at
runtime, though, for the sake of updating bootstrapped add-ons.
MozReview-Commit-ID: LIdiNHrXYXu
--HG--
extra : source : 8f4637881ddc42a948c894e62c8486fe8677a938
extra : histedit_source : e69395a2b87b2b0edb394686ed6ee24731ba9fb8
One of the things that I've noticed in profiling startup overhead is that,
even with the startup cache, we spend about 130ms just loading and decoding
scripts from the startup cache on my machine.
I think we should be able to do better than that by doing some of that work in
the background for scripts that we know we'll need during startup. With this
change, we seem to consistently save about 3-5% on non-e10s startup overhead
on talos. But there's a lot of room for tuning, and I think we get some
considerable improvement with a few ongoing tweeks.
Some notes about the approach:
- Setting up the off-thread compile is fairly expensive, since we need to
create a global object, and a lot of its built-in prototype objects for each
compile. So in order for there to be a performance improvement for OMT
compiles, the script has to be pretty large. Right now, the tipping point
seems to be about 20K.
There's currently no easy way to improve the per-compile setup overhead, but
we should be able to combine the off-thread compiles for multiple smaller
scripts into a single operation without any additional per-script overhead.
- The time we spend setting up scripts for OMT compile is almost entirely
CPU-bound. That means that we have a chunk of about 20-50ms where we can
safely schedule thread-safe IO work during early startup, so if we schedule
some of our current synchronous IO operations on background threads during the
script cache setup, we basically get them for free, and can probably increase
the number of scripts we compile in the background.
- I went with an uncompressed mmap of the raw XDR data for a storage format.
That currently occupies about 5MB of disk space. Gzipped, it's ~1.2MB, so
compressing it might save some startup disk IO, but keeping it uncompressed
simplifies a lot of the OMT and even main thread decoding process, but, more
importantly:
- We currently don't use the startup cache in content processes, for a variety
of reasons. However, with this approach, I think we can safely store the
cached script data from a content process before we load any untrusted code
into it, and then share mmapped startup cache data between all content
processes. That should speed up content process startup *a lot*, and very
likely save memory, too. And:
- If we're especially concerned about saving per-process memory, and we keep
the cache data mapped for the lifetime of the JS runtime, I think that with
some effort we can probably share the static string data from scripts between
content processes, without any copying. Right now, it looks like for the main
process, there's about 1.5MB of string-ish data in the XDR dumps. It's
probably less for content processes, but if we could save .5MB per process
this way, it might make it easier to increase the number of content processes
we allow.
MozReview-Commit-ID: CVJahyNktKB
--HG--
extra : source : 1c7df945505930d2d86a076ee20807104324c8cc
extra : histedit_source : 75e193839edf727874f01b2a9f6852f6c1f087fb%2C3ce966d7dcf2bd0454a7d673d0467097456bd782
Flushing the cache at startup is already handled automatically by the
AppStartup code, which removes the entire startupCache directory when
necessary. The add-on manager requires being able to flush the cache at
runtime, though, for the sake of updating bootstrapped add-ons.
MozReview-Commit-ID: LIdiNHrXYXu
--HG--
extra : rebase_source : e5b16490f47e20c78d081ad03dec02c6b2874fc3
extra : absorb_source : 6cd94504c8247f375161b2afdca5c61d59cf8f01
One of the things that I've noticed in profiling startup overhead is that,
even with the startup cache, we spend about 130ms just loading and decoding
scripts from the startup cache on my machine.
I think we should be able to do better than that by doing some of that work in
the background for scripts that we know we'll need during startup. With this
change, we seem to consistently save about 3-5% on non-e10s startup overhead
on talos. But there's a lot of room for tuning, and I think we get some
considerable improvement with a few ongoing tweeks.
Some notes about the approach:
- Setting up the off-thread compile is fairly expensive, since we need to
create a global object, and a lot of its built-in prototype objects for each
compile. So in order for there to be a performance improvement for OMT
compiles, the script has to be pretty large. Right now, the tipping point
seems to be about 20K.
There's currently no easy way to improve the per-compile setup overhead, but
we should be able to combine the off-thread compiles for multiple smaller
scripts into a single operation without any additional per-script overhead.
- The time we spend setting up scripts for OMT compile is almost entirely
CPU-bound. That means that we have a chunk of about 20-50ms where we can
safely schedule thread-safe IO work during early startup, so if we schedule
some of our current synchronous IO operations on background threads during the
script cache setup, we basically get them for free, and can probably increase
the number of scripts we compile in the background.
- I went with an uncompressed mmap of the raw XDR data for a storage format.
That currently occupies about 5MB of disk space. Gzipped, it's ~1.2MB, so
compressing it might save some startup disk IO, but keeping it uncompressed
simplifies a lot of the OMT and even main thread decoding process, but, more
importantly:
- We currently don't use the startup cache in content processes, for a variety
of reasons. However, with this approach, I think we can safely store the
cached script data from a content process before we load any untrusted code
into it, and then share mmapped startup cache data between all content
processes. That should speed up content process startup *a lot*, and very
likely save memory, too. And:
- If we're especially concerned about saving per-process memory, and we keep
the cache data mapped for the lifetime of the JS runtime, I think that with
some effort we can probably share the static string data from scripts between
content processes, without any copying. Right now, it looks like for the main
process, there's about 1.5MB of string-ish data in the XDR dumps. It's
probably less for content processes, but if we could save .5MB per process
this way, it might make it easier to increase the number of content processes
we allow.
MozReview-Commit-ID: CVJahyNktKB
--HG--
extra : rebase_source : 2ec24c8b0000f9187a9bf4a096ee8d93403d7ab2
extra : absorb_source : bb9d799d664a03941447a294ac43c54f334ef6f5
The patch makes the following proxy changes:
* The number of slots in ProxyValueArray is now dynamic and depends on the number of reserved slots we get from the Class.
* "Extra slots" was renamed to "Reserved slots" to make this clearer.
* All proxy Classes now have 2 reserved slots, but it should be easy to change that for proxy Classes that need more than 2 slots.
* Proxies now store a pointer to these slots and this means GetReservedSlot and SetReservedSlot can be used on proxies as well. We no longer need GetReservedOrProxyPrivateSlot and SetReservedOrProxyPrivateSlot.
And some changes to make DOM Proxies work with this:
* We now store the C++ object in the first reserved slot (DOM_OBJECT_SLOT) instead of in the proxy's private slot. This is pretty nice because it matches what we do for non-proxy DOM objects.
* We now store the expando in the proxy's private slot so I removed GetDOMProxyExpandoSlot and changed the IC code to get the expando from the private slot instead.
When we just had CycleCollectedJSContext (and no CycleCollectedJSRuntime) a
weird thing happened at shutdown:
1. We would call JS_DestroyContext from ~CycleCollectedJSContext. By that time,
the ~XPCJSContext destructor had already finished.
2. Destroying the context runs a final GC. That GC would call back into various
GC callbacks, such as TraceBlackJS and TraceGrayJS.
3. These callbacks would do a virtual method call:
http://searchfox.org/mozilla-central/rev/876c7dd30586f9c6f9c99ef7444f2d73c7acfe7c/xpcom/base/CycleCollectedJSRuntime.cpp#791
4. Normally this method call would call into
XPCContext::TraceNativeBlackRoots. However, C++ changes the vtable for an
object during destruction. So we would only call CycleCollectedJSContext's
version of TraceNativeBlackRoots, which is empty. So we never traced anything.
When I moved this code into the runtime, we actually do call into
XPCJSRuntime::TraceNativeBlackRoots at that time. So the behavior changed, and
that was causing crashes once I nulled out the TLS as you asked. So I removed
these callbacks for the last GC.
MozReview-Commit-ID: 3do13bjpwQj
This patch keeps a list of all the cooperatively scheduled contexts that are
linked to a runtime. In places where we need to iterate over all contexts (for
GC, specifically), it iterates over the list.
MozReview-Commit-ID: 3pKJX78f2l0
This field assumes there is one XPCJSContext globally (i.e., per nsXPConnect
instance). This patch eliminates the field by using different paths to
reach the context.
MozReview-Commit-ID: FjR6cTZ5QfZ
XPCJSContext::Get() now does a TLS lookup, which is a little more expensive
than looking up a global variable. This patch removes as many of the TLS
lookups as possible.
MozReview-Commit-ID: GsqzJn55Lya
Once we have multiple XPCJSContext's, we may have to figure out which one we're
using with TLS. A later patch tries to remove as many of these TLS lookups
as possible so that performance doesn't suffer.
MozReview-Commit-ID: 50uHpDLZmUH
Change FormatStackDump to return UniqueChars and fix up the users. This
removes a bit more manual memory management.
MozReview-Commit-ID: 60GBgeS4rzg
--HG--
extra : rebase_source : 15060321f567816ca434cdf1ef816d8322ceefff
TimeStamp::ProcessCreations()'s aIsInconsistent outparam is ignored by the
majority of its caller. This patch makes it optional. Notably, this makes
ProcessCreation() easier to use in a constructor's initializer list.
Other browsers do not support any of these (IIRC), telemetry reports
essentially zero usage, and supporting them is contrary to the DOM spec.
Notes on specific events:
CommandEvent and SimpleGestureEvent: These are not supposed to be
web-exposed APIs, so I hid the interfaces from web content too
(necessary to avoid test_all_synthetic_events.html failures).
DataContainerEvent: This was a non-standard substitute for CustomEvent
that seemed to have only one user, so I removed it entirely and switched
the user (MozillaFileLogger.js) to CustomEvent.
ScrollAreaEvent: This is entirely non-standard, but we apparently expose
it deliberately to web content, so I didn't see any reason to remove it
from createEvent.
SimpleGestureEvent and XULCommandEvent: Can still be created from
createEvent(), but not by content.
TimeEvent: This is still in because it has no constructor, so there's no
other way to create it. Ideally we'd update the SMIL spec to add a
constructor. I did remove TimeEvents.
MozReview-Commit-ID: 7Yi2oCl9SM2
--HG--
extra : rebase_source : 199ab921acfc531b8b85e77f90fcd799b03c887b
This changes JS_smprintf to return UniqueChars, rather than relying on
manual memory management.
MozReview-Commit-ID: ENjQJODYdD1
--HG--
extra : rebase_source : 4c8ad4719dce205a7ef25e41eca25c5af793bb47
With changes introduced in bug 1356066 I made the xpc_LocalizeContext be
called on each app locale change to update the default locale in each context.
Unfortunately, this function is also assigning the locale callbacks and
with my change it started doing it on each language change.
In this patch I'm first checking if we do have XPCLocaleCallbacks for the
given context and only if we don't, I assign them.
MozReview-Commit-ID: 7AiCsJfKBID
--HG--
extra : rebase_source : 1efe65895759ffc07e0047d063a405d757cb1092
NS_SetCurrentThreadName() is added as an alternative to PR_SetCurrentThreadName()
inside libxul. The thread names are collected in the form of crash annotation to
be processed on socorro.
MozReview-Commit-ID: 4RpAWzTuvPs
clang's -Wcomma warning warns about suspicious use of the comma operator such as between two statements or to call a function for side effects within an expression.
js/src/builtin/MapObject.cpp:786:48 [-Wcomma] possible misuse of comma operator here
js/src/builtin/MapObject.cpp:1371:48 [-Wcomma] possible misuse of comma operator here
js/src/builtin/RegExp.cpp:1266:62 [-Wcomma] possible misuse of comma operator here
js/src/jit/x64/BaseAssembler-x64.h:624:99 [-Wcomma] possible misuse of comma operator here
js/src/jsarray.cpp:2416:27 [-Wcomma] possible misuse of comma operator here
js/src/jscompartment.cpp:120:48 [-Wcomma] possible misuse of comma operator here
js/src/jsstr.cpp:3346:14 [-Wcomma] possible misuse of comma operator here
js/xpconnect/src/XPCWrappedNativeJSOps.cpp:316:71 [-Wcomma] possible misuse of comma operator here
MozReview-Commit-ID: BbT4otUXczV
--HG--
extra : rebase_source : b232d10b5280c567f8fe390fcb56012b78da580a
This is necessary to allow helper threads to attempt large allocations and recover from fragmentation situations with the LargeAllocationFailureCallback.
MozReview-Commit-ID: AyA3pbXcaYy
--HG--
extra : rebase_source : 7a5feb779b690ec7f123481e76f2390c850dac91
mSystemPrincipal is an nsCOMPtr so it does not need to be explicitly
initialized.
Fields don't have to be lined up.
Empty function bodies don't need to be commented as such.
The blank line at the start of mozJSComponentLoader.cpp prevents Emacs
from using the mode line.
MozReview-Commit-ID: 7Az1x8jmxTI
--HG--
extra : rebase_source : 2d818074d82f9a6b1041983efd7a81bde9870619
Also, be consistent with the rest of the code and use a handle
typedef in one place.
MozReview-Commit-ID: KY3cnLemoUl
--HG--
extra : rebase_source : 4c3ae2750c868b5401a686dd164adbd04b45a0b5
To run JS in separate cooperative threads, we need to split up per-thread state
from per-runtime state. This patch does that for XPConnect.
MozReview-Commit-ID: 407SlJ7nR6v
Most of the time, the return value of this method should be checked,
because behavior should depend on whether or not an exception is
thrown. However, if it is called immediately after a throw it doesn't
need to be checked because it will always return true. bz said there
is no public API that lets you assume there is an exception because it
would be "too easy to misuse".
MozReview-Commit-ID: CqyicBbcNjW
--HG--
extra : rebase_source : a5b74ba88a927a90d491ceb8f0b750a67f62b0f4
This was only ever used for BeOS and OS2, which have likely long ago
stopped working for other reasons.
MozReview-Commit-ID: AT1jNEB1ydY
--HG--
extra : rebase_source : 8ab13e574c57f0ea96158af86e01ea8f22f6bff7
This is a pretty common use case of preferences.
MozReview-Commit-ID: 7tkdylPZxpY
--HG--
extra : transplant_source : %86%21%C1oXw%07%BB%C6.%CE%C5S%B6%E0%26%8Be%9A%E1
extra : histedit_source : 072f778d7959fa8c0f5838077116b7b9d87c5157
These files were being excluding because we thought they used plarena.h, but it
turns out they did not. A few tweaks needed to be made to clarify whether we
wanted to use mozilla::UniquePtr or js::UniquePtr.
MozReview-Commit-ID: 1su5dO3rR0T
These files were being excluding because we thought they used plarena.h, but it
turns out they did not. A few tweaks needed to be made to clarify whether we
wanted to use mozilla::UniquePtr or js::UniquePtr.
MozReview-Commit-ID: 1su5dO3rR0T
These files were being excluding because we thought they used plarena.h, but it
turns out they did not. A few tweaks needed to be made to clarify whether we
wanted to use mozilla::UniquePtr or js::UniquePtr.
MozReview-Commit-ID: 1su5dO3rR0T
By bug 1346674, we don't use NSILOCALE_* for unicode conversion in XPC/JS. So we should use NS_CopyNativeToUnicode to get a rid of nsIPlatformCharset.
MozReview-Commit-ID: Kp2jijN8On9
Sicne some tests doesn't work with --with-intl-api=no, we should skip it on Android
MozReview-Commit-ID: IjlH3aQdiqb
--HG--
extra : rebase_source : ac6d91bf291f6268ea29bcb29180b2f5197e012c
Now that we join on the thread exiting, we no longer need to have the thread
explicitly tell us it's shutting down.
MozReview-Commit-ID: LycPjUvyeX
--HG--
extra : rebase_source : 7b3d21c3d8de73a1ed511c7d6efefe2a73b2e02a
Currently, JS sampling has major problems.
- JS sampling is enabled for all JS threads from the thread that runs
locked_profiler_start() -- currently only the main thread -- but the JS
engine can't handle enabling from off-thread, and asserts. This makes
profiling workers impossible in a debug build.
- No JS thread will be JS sampled unless enableJSSampling() is called, but that
only happens in locked_profiler_start(). That means any worker threads
created while the profiler is active won't be JS sampled.
- Only the thread that runs locked_profiler_stop() -- currently only the main
thread -- ever calls disableJSSampling(). This means that worker threads that
start being JS sampled never stop being JS sampled.
This patch fixes these three problems in the following ways.
- locked_profiler_start() now sets a flag in PseudoStack that indicates
JS sampling is desired, but doesn't directly enable it. Instead, the JS
thread polls that flag and enables JS sampling itself when it sees the flag
is set. The polling is done by the interrupt callback. There was already a
flag of this sort (mJSSampling) but the new one is better.
This required adding a call to profiler_js_operation_callback() to the
InterruptCallback() in XPCJSContext.cpp. (In comparison, the
InterruptCallback() in dom/workers/RuntimeService.cpp already had such a
call.)
- RegisterCurrentThread() now requests JS sampling of a JS thread when the
profiler is active, the thread is being profiled, and JS sampling is enabled.
- locked_profiler_stop() now calls stopJSSampling() on all live threads.
The patch makes the following smaller changes as well.
- Renames profiler_js_operation_callback() as profiler_js_interrupt_callback(),
because "interrupt callback" is the standard name (viz.
JS_AddInterruptCallback()).
- Calls js::RegisterContextProfilingEventMarker() with nullptr when stopping
JS sampling, so that ProfilerJSEventMarker won't fire unnecessarily.
- Some minor formatting changes.
--HG--
extra : rebase_source : 372f94c963a9e5b2493389892499b1ca205ebc2f
PseudoContext::sampleContext() is always called immediately after
profiler_get_pseudo_stack(). This patch introduces profiler_set_js_context()
and profiler_clear_js_context(), which replace the profiler_get_pseudo_stack()
+ sampleContext() pairs. This takes us a step closer to not having to export
PseudoStack outside the profiler.
--HG--
extra : rebase_source : 8558d1600eafd395cc696d31f3d21fb52a1a74b0
In many cases, to speed up start, compiling a ScriptSource will not
compile the functions themselves, but will rather syntax-parse them
(to check for syntax errors), leaving full compilation for
later. However, if we find ourselves in a case in which the function
is needed almost immediately, we need to full-parse the function
immediately after the syntax-parse, which is wasteful.
This changeset intends to measure how often this happens, by exporting
through Telemetry the duration between the end of the syntax-parse and
the start of the full-parse for each function.
As a memory optimization, instead of storing a timestamp for the
syntax-parse of each function, we store a single timestamp for an
entire ScriptSource. This assumes that all functions of the
ScriptSource are syntax-parsed at approximately the same instant,
which should be mostly true for everything except perhaps `eval` and
`new Function`. Then, when time comes to delazify a function, we
simply determine the time elapsed since the ScriptSource was compiled.
Histogram JS_PARSER_COMPILE_LAZY_AFTER_MS starts at 10ms (anything
smaller is often not measurable) and stops at 10s (anything larger can
safely be said to be not wasteful).
MozReview-Commit-ID: 6Ycy2OIIiAt
--HG--
extra : rebase_source : 0ccd6f51189b3ad8056e9f39e267235d68f6e2db
This patch is generated by the following sed script:
find . ! -wholename '*/.hg*' -type f \( -iname '*.html' -o -iname '*.xhtml' -o -iname '*.xul' -o -iname '*.js' \) -exec sed -i -e 's/\(\(text\|application\)\/javascript\);version=1.[0-9]/\1/g' {} \;
MozReview-Commit-ID: AzhtdwJwVNg
--HG--
extra : rebase_source : e8f90249454c0779d926f87777f457352961748d
The profiler can currently handle nested calls to profiler_{init,shutdown}() --
only the first call to profiler_init() and the last call to profiler_shutdown()
do anything. And sure enough, we have the following.
- Outer init/shutdown pairs in XRE_main()/XRE_InitChildProcess() (via
GeckoProfilerInitRAII).
- Inner init/shutdown pairs in NS_InitXPCOM2()/NS_InitMinimalXPCOM() (both shut
down in ShutdownXPCOM()).
This is a bit silly, so the patch removes the inner pairs, and adds a
now-needed pair in XRE_XPCShellMain. This will allow gInitCount -- which tracks
the nesting depth -- to be removed in a future patch.
--HG--
extra : rebase_source : 7e8dc6ce81ce10269d2db6a7bf32852c396dba0e
Specifically, three changes:
1) valueOf should be non-enumerable.
2) valueOf should be === to Object.prototype.valueOf.
3) There should be no toJSON.
The tests come directly from https://github.com/w3c/web-platform-tests/pull/4623
so not much need to review them.
Without this change, we could do a "short" get of a string (e.g. from
XMLHttpRequest), then do another get that returns the same stringbuffer but a
longer length. But we wouldn't notice, hit our cache, and return the same,
too-short, external string. The site would not see the new data it should have
seen.
The problem is that JSJitGetterInfo doesn't contain a callee. So we can't use
xpc::XrayAwareCalleeGlobal in a specialized getter, because we don't have our
callee function available.
Intl API (ECMA-402) already supports most web browsers such as Edge (including Windows 10 Mobile), Chrome (including Android), and upcoming Safari 10(https://developer.apple.com/library/prerelease/content/releasenotes/General/WhatsNewInSafari/Articles/Safari_10_0.html).
Also IDN2008 support, and number control by <input type="number"> require ICU. And, although gfx has own unicode table, we want to remove this if ICU is supported on all platform.
Also, new L20N framework (aka L20N) wants to use Intl API to localize UI.
So I would like to use ICU as default on Fennec Android.
Negative issue is file size. ICU has big table, file size of libxul.so will be following. This support requires additional 4.4MB.
before this ... 26287379 bytes
after this ... 30692696 bytes
Although android OS already has ICU, ICU version is different per OS version. And our ICU requirement is 50.1+ (Android 4.0 uses ICU 48) and exported fucntion name of ICU has ICU version suffix. So we cannot use it.
MozReview-Commit-ID: 5R5iNMzWNjS
--HG--
extra : rebase_source : fceb3ee320f885e7c03749f76cf0ad7699a21c2c
nsIXPCScriptable flags handling in xpc_map_end.h is a bit of a mess.
- Half the flags relate to whether various functions are defined (PreCreate,
GetProperty, etc). These are set using the XPC_MAP_WANT_* macros;
for each one xpc_map_end.h inserts the corresponding flag using the
preprocessor (see XPC_MAP_CLASSNAME::GetScriptableFlags()).
- The other half of the flags relate to other things (IS_GLOBAL_OBJECT,
DONT_REFLECT_INTERFACE_NAMES, etc). These are set using the XPC_MAP_FLAGS
macro.
Having two similar but different mechanisms to set the flags for a class is
confusing. (Indeed, until recently we had some classes where a single flag was
redundantly specified via both mechanisms.) Note also that the classes done in
dom/base/nsIDOMClassInfo.h also specify all the flags in a single value,
similar to how XPC_MAP_FLAGS works.
This patch removes the XPC_MAP_WANT_* macros. All flags are now set
via XPC_MAP_FLAGS. This is a significant simplification to xpc_map_end.h and
all the places that use it.
The downside of this change is that I had to change the flag constants from
class constants (i.e. nsIXPCScriptable::FOO) to macros (i.e.
NSIXPCSCRIPTABLE_FOO) because they need to be used in #if statements like this
in xpc_map_end.h:
#if !((XPC_MAP_FLAGS) & NSIXPCSCRIPTABLE_WANT_PRECREATE)
and you can't use a '::'-qualified name inside a #if. I think this downside is
outweighed by the simplification described above.
Overall the patch removes 80 lines of code.
--HG--
extra : rebase_source : 6d5c341d0deba8f1529d81c17bb8819e09620b05