OS.File already supports only UTF-8 paths on non-Windows systems, so this
change makes our different ways of accessing file paths consistent with each
other.
MozReview-Commit-ID: 8HiC5xC8tJN
--HG--
extra : rebase_source : 24c77a2e9b4003694e8e96cffab301e7adc0b4e6
This was created for B2G and isn't really useful otherwise. It only works on
Linux, and it's behind the memory.system_memory_reporter pref, which is false
by default.
The patch also removes LinuxUtils.{h,cpp}, which is no longer used.
--HG--
extra : rebase_source : b97a018be11a79f83855a73b88020bfa86e60f78
And remove unreachable code after MOZ_CRASH_UNSAFE_OOL().
MOZ_CRASH_UNSAFE_OOL causes data collection because crash strings are
annotated to crash-stats and are publicly visible. Firefox data stewards
must do data review on usages of this macro. However, all the crash
strings this patch collects with MOZ_CRASH_UNSAFE_OOL are already
collected with NS_RUNTIMEABORT.
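As a hedged illustration of the unreachable-code cleanup (the call site
below is hypothetical, not from the patch):

    #include "mozilla/Assertions.h"  // MOZ_CRASH_UNSAFE_OOL

    void HandleBadEntry(const char* aReason) {
      // MOZ_CRASH_UNSAFE_OOL never returns; `aReason` is annotated to
      // crash-stats, so it is publicly visible and subject to data review.
      MOZ_CRASH_UNSAFE_OOL(aReason);
      // Any statements after the crash are unreachable; this patch removes
      // such dead code.
    }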
MozReview-Commit-ID: IHmJfuxXSqw
--HG--
extra : rebase_source : 031f30934b58a7b87f960e57179641d44aefe5c5
extra : source : fe9f638a56a53c8721eecc4273dcc074c988546e
Currently the Gecko Profiler defines a moderate amount of stuff when
MOZ_GECKO_PROFILER is undefined. It also #includes various headers, including
JS ones. This is making it difficult to separate Gecko's media stack for
inclusion in Servo.
This patch greatly simplifies how things are exposed. The starting point is:
- GeckoProfiler.h can be #included unconditionally;
- everything else from the profiler must be guarded by MOZ_GECKO_PROFILER.
In practice this introduces way too many #ifdefs, so the patch loosens it by
adding no-op macros for a number of the most common operations.
The net result is that #ifdefs and macros are used a bit more, but almost
nothing is exposed in non-MOZ_GECKO_PROFILER builds (including
ProfilerMarkerPayload.h and GeckoProfiler.h), and understanding what is exposed
is much simpler than before.
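A minimal sketch of the pattern, with illustrative (not actual) names:

    // Common operations get no-op fallback macros, so most call sites
    // need no #ifdef of their own.
    #ifdef MOZ_GECKO_PROFILER
    # define EXAMPLE_PROFILER_MARKER(name) example_profiler_add_marker(name)
    #else
    # define EXAMPLE_PROFILER_MARKER(name) do {} while (0)
    #endif

    // Everything without a no-op fallback must be guarded explicitly:
    #ifdef MOZ_GECKO_PROFILER
    // ... profiler-only declarations and code ...
    #endif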
Note also that in BHR, ThreadStackHelper is now entirely absent in
non-MOZ_GECKO_PROFILER builds.
We would like to be able to see if a given hang in BHR occurred
under high CPU load, as this is an indication that the hang is
of less use to us, since it's likely that the external CPU use
is more responsible for it.
The way this works is fairly simple. We get the system CPU usage
on a scale from 0 to 1, and we get the current process's CPU
usage, also on a scale from 0 to 1, and we subtract the latter
from the former. We then compare this value to a threshold, which
is 1 - (1 / p), where p is the number of (virtual) cores on the
machine. This threshold might need to be tuned, so that we
require an entire physical core in order to not annotate the hang,
but for now it seemed the most reasonable line in the sand.
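A sketch of that computation; the function and parameter names here are
hypothetical, and the platform-specific sampling of CPU times is omitted:

    // Returns true if CPU use external to this process exceeds the
    // 1 - (1/p) threshold described above.
    bool IsExternalLoadHigh(double aSystemUsage,   // 0..1, whole machine
                            double aProcessUsage,  // 0..1, this process
                            unsigned aVirtualCores /* p */) {
      double external = aSystemUsage - aProcessUsage;
      double threshold = 1.0 - (1.0 / double(aVirtualCores));
      return external > threshold;
    }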
I should note that this considers CPU usage in child or parent
processes as external. While we are responsible for that CPU usage,
it still indicates that the stack we receive from BHR is of little
value to us, since the source of the actual hang is external to
that stack.
MozReview-Commit-ID: JkG53zq1MdY
--HG--
extra : rebase_source : 16553a9b5eac0a73cd1619c6ee01fa177ca60e58
As well as the straightforward things, this lets us remove ReadSysFile and
WriteSysFile, which in turn lets us remove TestFileUtils.cpp.
--HG--
extra : rebase_source : fc90c05352e654ffc41009d8504a9c54f394fc3f
PROFILER_MARKER is now just a trivial wrapper for profiler_add_marker(). This
patch removes it.
--HG--
extra : rebase_source : 9858f34763bb343757896a91ab7ad8bd8e56b076
This patch reduces the differences between builds where the profiler is enabled
and those where the profiler is disabled. It does this by removing numerous
MOZ_GECKO_PROFILER checks.
These changes have the following consequences.
- Various functions and classes are now defined in all builds, and so can be
used unconditionally: profiler_add_marker(), profiler_set_js_context(),
profiler_clear_js_context(), profiler_get_pseudo_stack(), AutoProfilerLabel.
(They are effectively no-ops in non-profiler builds, of course.)
- The no-op versions of PROFILER_* are now gone. The remaining versions are
almost no-ops when the profiler isn't built.
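A sketch of what this looks like in the header, assuming the non-profiler
versions are empty inline stubs (signatures simplified):

    #ifdef MOZ_GECKO_PROFILER
    void profiler_add_marker(const char* aMarkerName);
    #else
    inline void profiler_add_marker(const char* aMarkerName) {}
    #endif

    // Call sites now compile in both kinds of build; in non-profiler
    // builds the empty inline function optimizes away.
    void ExampleCaller() {
      profiler_add_marker("Example marker");
    }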
--HG--
extra : rebase_source : 8fb5e8757600210c2f77865694d25162f0b7698a
One of the things that I've noticed in profiling startup overhead is that,
even with the startup cache, we spend about 130ms just loading and decoding
scripts from the startup cache on my machine.
I think we should be able to do better than that by doing some of that work in
the background for scripts that we know we'll need during startup. With this
change, we seem to consistently save about 3-5% on non-e10s startup overhead
on talos. But there's a lot of room for tuning, and I think we can get
considerable further improvement with a few ongoing tweaks.
Some notes about the approach:
- Setting up the off-thread compile is fairly expensive, since we need to
create a global object, and a lot of its built-in prototype objects for each
compile. So in order for there to be a performance improvement for OMT
compiles, the script has to be pretty large. Right now, the tipping point
seems to be about 20K.
There's currently no easy way to improve the per-compile setup overhead, but
we should be able to combine the off-thread compiles for multiple smaller
scripts into a single operation without any additional per-script overhead.
- The time we spend setting up scripts for OMT compile is almost entirely
CPU-bound. That means that we have a chunk of about 20-50ms where we can
safely schedule thread-safe IO work during early startup, so if we schedule
some of our current synchronous IO operations on background threads during the
script cache setup, we basically get them for free, and can probably increase
the number of scripts we compile in the background.
- I went with an uncompressed mmap of the raw XDR data for a storage format.
That currently occupies about 5MB of disk space. Gzipped, it's ~1.2MB, so
compressing it might save some startup disk IO, but keeping it uncompressed
simplifies a lot of the OMT and even main-thread decoding process. More
importantly (see the mmap sketch after this list):
- We currently don't use the startup cache in content processes, for a variety
of reasons. However, with this approach, I think we can safely store the
cached script data from a content process before we load any untrusted code
into it, and then share mmapped startup cache data between all content
processes. That should speed up content process startup *a lot*, and very
likely save memory, too. And:
- If we're especially concerned about saving per-process memory, and we keep
the cache data mapped for the lifetime of the JS runtime, I think that with
some effort we can probably share the static string data from scripts between
content processes, without any copying. Right now, it looks like for the main
process, there's about 1.5MB of string-ish data in the XDR dumps. It's
probably less for content processes, but if we could save 0.5MB per process
this way, it might make it easier to increase the number of content processes
we allow.
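A minimal POSIX sketch of the uncompressed-mmap storage approach described
in the XDR bullet above (the helper is hypothetical; error handling is
trimmed, and the real cache code is more involved):

    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>
    #include <stddef.h>
    #include <stdint.h>

    // Map the raw XDR cache read-only; decoding can then read it in
    // place, and the OS can share the file-backed pages across processes.
    const uint8_t* MapScriptCache(const char* aPath, size_t* aOutLength) {
      int fd = open(aPath, O_RDONLY);
      if (fd < 0) {
        return nullptr;
      }
      struct stat st;
      if (fstat(fd, &st) != 0) {
        close(fd);
        return nullptr;
      }
      void* base = mmap(nullptr, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
      close(fd);  // the mapping remains valid after close
      if (base == MAP_FAILED) {
        return nullptr;
      }
      *aOutLength = st.st_size;
      return static_cast<const uint8_t*>(base);
    }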
MozReview-Commit-ID: CVJahyNktKB
--HG--
extra : source : 1c7df945505930d2d86a076ee20807104324c8cc
extra : histedit_source : 75e193839edf727874f01b2a9f6852f6c1f087fb%2C3ce966d7dcf2bd0454a7d673d0467097456bd782
Separate AbstractThread::InitTLS from AbstractThread::InitMainThread.
Initialize the AbstractThread main thread when initializing
nsThreadManager. Initialize AbstractThread TLS for all child process
types, because the plugin and GMP processes do IPC without initializing
XPCOM, and initializing XPCOM in a content process requires IPC.
MozReview-Commit-ID: DhLub23oZz8
--HG--
extra : rebase_source : 6e4bfa03ec69e1eb694924903f1fa5e7259cbba3
PseudoStack::sampleContext() is always called immediately after
profiler_get_pseudo_stack(). This patch introduces profiler_set_js_context()
and profiler_clear_js_context(), which replace the profiler_get_pseudo_stack()
+ sampleContext() pairs. This takes us a step closer to not having to export
PseudoStack outside the profiler.
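A before/after sketch of the call-site change (the exact signatures are
assumptions based on the names above):

    // Before: callers fetched the PseudoStack just to hand it the context.
    //
    //   PseudoStack* stack = profiler_get_pseudo_stack();
    //   if (stack) {
    //     stack->sampleContext(cx);
    //   }
    //
    // After: the PseudoStack stays inside the profiler.
    //
    //   profiler_set_js_context(cx);   // on JS context creation
    //   profiler_clear_js_context();   // on JS context teardown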
--HG--
extra : rebase_source : 8558d1600eafd395cc696d31f3d21fb52a1a74b0
The profiler can currently handle nested calls to profiler_{init,shutdown}() --
only the first call to profiler_init() and the last call to profiler_shutdown()
do anything. And sure enough, we have the following.
- Outer init/shutdown pairs in XRE_main()/XRE_InitChildProcess() (via
GeckoProfilerInitRAII).
- Inner init/shutdown pairs in NS_InitXPCOM2()/NS_InitMinimalXPCOM() (both shut
down in ShutdownXPCOM()).
This is a bit silly, so the patch removes the inner pairs, and adds a
now-needed pair in XRE_XPCShellMain. This will allow gInitCount -- which tracks
the nesting depth -- to be removed in a future patch.
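The nesting behavior works roughly like this (a sketch; gInitCount is the
real counter, but the function bodies here are illustrative):

    #include "mozilla/Assertions.h"

    static int gInitCount = 0;  // tracks init/shutdown nesting depth

    void profiler_init_sketch() {
      if (gInitCount++ > 0) {
        return;  // inner init calls do nothing
      }
      // ... real initialization, for the outermost call only ...
    }

    void profiler_shutdown_sketch() {
      MOZ_ASSERT(gInitCount > 0);
      if (--gInitCount > 0) {
        return;  // inner shutdown calls do nothing
      }
      // ... real shutdown, for the last call only ...
    }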
--HG--
extra : rebase_source : 7e8dc6ce81ce10269d2db6a7bf32852c396dba0e
Also remove the #includes of some unused header files.
MozReview-Commit-ID: 6mRoIazEA3j
--HG--
extra : rebase_source : 6f96d22543509bf09b684b0bfbfa624eafc58b94
SHGetKnownFolderPath() is available on Windows Vista and later, so we no
longer need to look it up with GetProcAddress("SHGetKnownFolderPath"). We
can define _WIN32_WINNT as 0x0600 to expose the SHGetKnownFolderPath
declaration in shlobj.h and call the function directly.
Also remove a redundant #include <shlobj.h>.
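For example (a sketch; the wrapper function is hypothetical):

    #undef _WIN32_WINNT
    #define _WIN32_WINNT 0x0600  // Vista+; exposes SHGetKnownFolderPath
    #include <objbase.h>         // CoTaskMemFree
    #include <shlobj.h>          // SHGetKnownFolderPath, FOLDERID_*
    #include <string>

    bool GetLocalAppDataDir(std::wstring& aResult) {
      PWSTR path = nullptr;
      HRESULT hr =
        SHGetKnownFolderPath(FOLDERID_LocalAppData, 0, nullptr, &path);
      if (FAILED(hr)) {
        return false;
      }
      aResult = path;
      CoTaskMemFree(path);  // the shell allocates the string; we free it
      return true;
    }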
MozReview-Commit-ID: AoAlrfvQ5AB
--HG--
extra : rebase_source : 2fa3a0d3d122ca31fb3369a43a03b6e2c5d5dec2
This marks the IDL classes as deprecated, removes an unnecessary include
that was triggering deprecation warnings, and wraps a necessary include in
XPCOMInit.cpp (needed for registering the component) in
deprecation-disabling pragmas.
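The wrapping looks roughly like this (GCC/clang pragma syntax; the header
name below is a placeholder):

    // XPCOMInit.cpp: this include is still needed to register the
    // component, so silence the deprecation warning just for it.
    #pragma GCC diagnostic push
    #pragma GCC diagnostic ignored "-Wdeprecated-declarations"
    #include "DeprecatedComponent.h"  // placeholder header name
    #pragma GCC diagnostic pop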
MozReview-Commit-ID: BbNU5q8O4Q4
We skip running shutdown collection on the main thread, because we're
exiting right after, but the process still continues when we shut down
a worker, so we should always do collections there.
MozReview-Commit-ID: IQZItm1qWXW
--HG--
extra : rebase_source : 4fd89e36db124a524c240e806a885dee3b68979e
nsThreadManager::get() can return a reference. This lets us remove some
redundant assertions.
nsThreadArray elements can be NotNull<>s.
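A sketch of the signature change (the singleton storage and the DoWork()
call site are simplified illustrations, not the real code):

    class nsThreadManager {
    public:
      // Was: static nsThreadManager* get();
      static nsThreadManager& get() {
        static nsThreadManager sInstance;
        return sInstance;
      }
    };

    // Call sites no longer need null checks or assertions:
    //   before: nsThreadManager* mgr = nsThreadManager::get();
    //           MOZ_ASSERT(mgr);
    //           mgr->DoWork();
    //   after:  nsThreadManager::get().DoWork();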
--HG--
extra : rebase_source : fd49010167101bc15f7f6d01bf95fd63b81d60fb
This patch removes checking of all the callback calls in memory reporter
CollectReport() functions, because it's not useful.
The patch also does some associated clean-up.
- Replaces some uses of nsIMemoryReporterCallback with the preferred
nsIHandleReportCallback typedef.
- Replaces aCallback/aCb/aClosure with aHandleReport/aData for CollectReports()
parameter names, for consistency.
- Adds MOZ_MUST_USE/[must_use] in a few places in nsIMemoryReporter.idl.
- Uses the MOZ_COLLECT_REPORT macro in all suitable places.
Overall the patch reduces code size by ~300 lines and reduces the size of
libxul by about 37 KiB on my Linux64 builds.
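A sketch of a reporter in the new style (MyReporter and its size function
are hypothetical; MOZ_COLLECT_REPORT and the aHandleReport/aData names
come from the patch):

    #include "nsIMemoryReporter.h"

    class MyReporter final : public nsIMemoryReporter {
      NS_DECL_ISUPPORTS

      NS_IMETHOD CollectReports(nsIHandleReportCallback* aHandleReport,
                                nsISupports* aData, bool aAnonymize) override
      {
        // The callback's return value is deliberately not checked.
        MOZ_COLLECT_REPORT(
          "explicit/my-subsystem", KIND_HEAP, UNITS_BYTES,
          SizeOfMySubsystem(),
          "Memory used by my subsystem.");
        return NS_OK;
      }

    private:
      ~MyReporter() = default;
      size_t SizeOfMySubsystem() const { return 0; }  // placeholder
    };

    NS_IMPL_ISUPPORTS(MyReporter, nsIMemoryReporter)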
--HG--
extra : rebase_source : e94323614bd10463a0c5134a7276238a7ca1cf23