EXC_CRASH exceptions wrap some of the other exceptions under certain
conditions. They are delivered to the exception handler when the wrapped
exception has not been caught by a previous handler. They carry no data
of their own but pack the data of the original exception instead, see:
2ff845c2e0/bsd/kern/kern_exit.c (L1056-L1066)
We match Crashpad's behavior: rather than storing these exceptions directly in
the minidump, we unpack the original exception and store it instead.
Because of that I haven't added functionality to print them out.
Additionally, it seems that they can also wrap uncaught signals and deliver
those instead of a Mach exception. While extracting the original signal is
a trivial operation, I was unable to actually generate one in the first
place. As such I've added no specific functionality to deal with those either.
This also matches Crashpad's behavior.
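For illustration, the unpacking can be sketched as below. This is a minimal sketch, not the code from this patch: it assumes the packing done by the kernel code referenced above places the signal in bits 24-31, the original exception type in bits 20-23 and the low 20 bits of the original code in bits 0-19 of code[0]. The names `UnpackedCrash` and `UnpackExcCrash` are hypothetical.

```cpp
#include <cstdint>

// Fields recovered from code[0] of an EXC_CRASH exception.
// Hypothetical type, used only for this sketch.
struct UnpackedCrash {
  uint32_t exception;  // original Mach exception type (4 bits)
  uint32_t code;       // low 20 bits of the original exception's code[0]
  uint32_t signal;     // signal number (8 bits)
};

// Assumed layout: signal << 24 | exception << 20 | code.
constexpr UnpackedCrash UnpackExcCrash(uint64_t code0) {
  return UnpackedCrash{static_cast<uint32_t>((code0 >> 20) & 0xf),
                       static_cast<uint32_t>(code0 & 0xfffff),
                       static_cast<uint32_t>((code0 >> 24) & 0xff)};
}
```

Under that assumption, a wrapped EXC_BAD_ACCESS (type 1) with code 1 and signal 11 arrives as 0x0B100001 and unpacks back to those three values.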
Differential Revision: https://phabricator.services.mozilla.com/D123532
Automatically generated patch that adds the flag `REQUIRES_UNIFIED_BUILD = True` to `moz.build`
when the module governed by the build config file is not buildable outside of the unified environment.
This needs to be done in order to have a hybrid build system that makes it possible to combine
unified build components with ones that are built outside of the unified ecosystem.
Differential Revision: https://phabricator.services.mozilla.com/D122345
This introduces a few changes to the crash reporting machinery:
* The macOS exception handler now registers itself for catching EXC_RESOURCE
exceptions. These are thrown when the process exceeds a pre-set resource
limit (memory, CPU usage, I/O, etc...)
* The minidump writer has been updated to correctly store the subcode from the
EXC_RESOURCE exceptions. This involves widening the code and subcode passed
to the writer to 64 bits. The upper 32 bits of the code are now set in
the minidump's exception_flags field (vs the lower 32 bits for all other
exceptions). Additionally the exception type, code and subcode are now
stored in the exception_information array like Crashpad does. This preserves
the entirety of the data that came with the exception.
* The stackwalker has been modified to print out this type of exception as
well as the resource type and flavor.
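The handling of the widened code can be sketched as follows. This is an illustration under stated assumptions rather than the writer's actual code: the helper names are hypothetical, EXC_RESOURCE is taken to be 11 as in the Mach headers, and the resource type and flavor are assumed to live in bits 61-63 and 58-60 of the code, as in XNU's kern/exc_resource.h.

```cpp
#include <cstdint>

constexpr uint32_t kExcResource = 11;  // assumed value of EXC_RESOURCE

// Value for the minidump's exception_flags field: the upper 32 bits of the
// code for EXC_RESOURCE, the lower 32 bits for every other exception.
constexpr uint32_t ExceptionFlags(uint32_t type, uint64_t code) {
  return type == kExcResource ? static_cast<uint32_t>(code >> 32)
                              : static_cast<uint32_t>(code & 0xffffffffULL);
}

// Resource type and flavor decoded from the 64-bit code (assumed layout).
constexpr uint32_t ResourceType(uint64_t code) {
  return static_cast<uint32_t>((code >> 61) & 0x7);
}
constexpr uint32_t ResourceFlavor(uint64_t code) {
  return static_cast<uint32_t>((code >> 58) & 0x7);
}
```

Keeping only the lower 32 bits would drop the type and flavor entirely, which is why the upper half is the one stored for this exception.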
Differential Revision: https://phabricator.services.mozilla.com/D122229
CLOSED TREE
Backed out changeset 8be159457667 (bug 1724388)
Backed out changeset fd5d00822477 (bug 1724368)
Backed out changeset 092785e2a7f8 (bug 1725154)
AFAICT, all roads lead through [`nsAppStartup::Quit`](https://searchfox.org/mozilla-central/rev/0fec57c05d3996cc00c55a66f20dd5793a9bfb5d/toolkit/components/startup/nsAppStartup.cpp#448),
which is responsible for firing the `"quit-application"` observer notification
and then posting the `nsAppExitEvent` that causes the `nsAppShell` to break out
of its event loop and proceed with shutdown.
If we trigger a native crash in the observer, we should be able to capture a
symbolicated stack of whatever called `Quit`. We might as well force-crash
anyway, since AC is going to throw an exception regardless...
Before we crash, we annotate the current `GeckoThread` state to enable us to
find out whether we were fully initialized at the time of the shutdown.
Differential Revision: https://phabricator.services.mozilla.com/D122256
Correct the description of the MacMemoryPressureSysctl crash annotation to indicate 1 is the value for normal memory pressure.
The integer values are from the XNU kernel event.h header file and observable with `$ sysctl kern.memorystatus_vm_pressure_level`.
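As a rough illustration of how the annotation value reads, the sketch below maps the integer to a label. Only the value 1 for normal pressure comes from this description; the values 2 and 4 for warning and critical are an assumption based on the NOTE_MEMORYSTATUS_PRESSURE_* constants in event.h, and `PressureLevelName` is a hypothetical helper.

```cpp
// Maps a kern.memorystatus_vm_pressure_level reading to a label.
// 1 = normal (per the description above); 2 and 4 are assumed values.
constexpr const char* PressureLevelName(int level) {
  return level == 1   ? "normal"
         : level == 2 ? "warning"
         : level == 4 ? "critical"
                      : "unknown";
}
```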
Differential Revision: https://phabricator.services.mozilla.com/D122032
Subscribe to memory pressure events on macOS and add crash report annotations to parent and content process crash reports that can be used to determine if the system was under memory pressure at the time of the crash.
Include the memory pressure level reported via the DISPATCH_SOURCE_TYPE_MEMORYPRESSURE dispatch with timestamps of transitions, the memory pressure level as read from the kern.memorystatus_vm_pressure_level sysctl, and a measurement of the percentage of available memory in the system read from the kern.memorystatus_level sysctl.
Differential Revision: https://phabricator.services.mozilla.com/D116725
There are a number of modules that we import from C++ and can't continue
running without. We have a number of crashes for some of those failed loads. A
lot of them are from OOMs or corruption, but we're not sure about the rest.
This patch adds a crash annotation with the details of the error wherever we
abort for failing to load a module.
Differential Revision: https://phabricator.services.mozilla.com/D120290
Our Windows Error Reporting runtime module seems to be notified of all
sorts of non-fatal exceptions. Since there is no documentation clarifying
how to tell them apart from actual crashes we'll try using the bIsFatal
field in the WER_RUNTIME_EXCEPTION_INFORMATION structure for this
purpose. There is no documentation describing the contents of the field
so we can only assume that what its name implies is what we're looking
for.
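The check amounts to something like the sketch below. It uses a stand-in structure so it can be compiled anywhere; the real WER_RUNTIME_EXCEPTION_INFORMATION comes from the Windows headers and has more members, and `ShouldHandle` is a hypothetical name.

```cpp
// Stand-in for the single field of WER_RUNTIME_EXCEPTION_INFORMATION that
// matters here. Not the real Windows definition.
struct ExceptionInfoStandIn {
  int bIsFatal;  // BOOL in the Windows headers
};

// Returns true when the runtime exception module should treat the
// notification as an actual crash, assuming bIsFatal means what its
// name implies.
constexpr bool ShouldHandle(const ExceptionInfoStandIn& info) {
  return info.bIsFatal != 0;
}
```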
Differential Revision: https://phabricator.services.mozilla.com/D118813
This patch adds a new field to the structures that WER reads from a
crashed process. This field contains a pointer to the global variable
that records the size of the last failed annotation.
When WER intercepts a crash it will use this address to read the
variable. If it's not zero it will add the corresponding annotation
to the crash report.
Depends on D116449
Differential Revision: https://phabricator.services.mozilla.com/D116450
Since this added the new flag to the crash ping I also took the time to update
the crash ping documentation with all the flags that have been added and
removed over the last few versions of Firefox.
Depends on D115380
Differential Revision: https://phabricator.services.mozilla.com/D116017
This also notifies the main process after the minidump has been generated.
I refactored the code a bit so the patch is probably larger than it should be
but the code should be a bit more readable overall.
With this change the minidump generation flow works like this:
- When the callback gets invoked in the WER process we read the structure that
is stored in every process to figure out if it's the main process or a child
one. This is done by reading said process' memory; the pointer was
passed to the runtime exception module when it was registered.
- If the main process crashed everything works like it used to.
- If it was a child process then we first capture a minidump of it.
- Then we read the structure representing it in the main process:
WindowsErrorReportingData. The address of this structure was passed into the
child process' command-line so we need to parse that first, then we read it
from the main process memory.
- We fill the structure and write it back into the main process memory.
- At this point if everything went fine we create a new thread in the main
process just to execute the WerNotifyProc function that will inform the main
process of the presence of the new minidump.
There's one important tidbit that's worth keeping in mind: the synchronization
between the main process and the WER process is implicit. The
WindowsErrorReportingData structure in the main process is kept alive until the
child process dies; the main process will destroy it only after that point. As
long as we're in the runtime exception module the crashed process is kept alive,
so this will prevent the main process from touching that structure.
We explicitly terminate the crashed process **after** we're done with the
structure so nothing bad could happen... unless someone makes a change to
Gecko that breaks the previous assumption.
Another important thing to keep in mind: we wait for the newly created thread
to inform the main process but only for 5 seconds. We don't want to wait
indefinitely because the function that we're calling is taking a lock and if
it blocks for some reason WER will get stuck waiting for it, so it will never
kill the crashed process which in turn will prevent the main process from
moving ahead. In principle this should never happen but better be safe than
sorry.
Depends on D115379
Differential Revision: https://phabricator.services.mozilla.com/D115380
This patch sets up a few different things that will be used by the WER runtime
exception module when it needs to notify the main process of a child process
crash.
For every child process we allocate a structure in the main process called
WindowsErrorReportingData that contains three things:
- The address of the function used to notify the main process that there's a
pending minidump for a given child process
- The PID of said child process
- The name of the minidump that has been generated
The first field is filled in by the main process and will be read by the WER
process when running the runtime exception module. The second and third fields,
on the other hand, start empty and will be written by the runtime exception
module after it has generated a minidump.
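A portable sketch of such a structure is shown below. The field names, types and buffer size are assumptions made for illustration and are not the definitions used in the patch.

```cpp
#include <cstdint>

// Hypothetical stand-in for WindowsErrorReportingData.
struct WerDataSketch {
  uint64_t notifyProcAddress;  // filled in by the main process
  uint32_t childPid;           // written by the runtime exception module
  char minidumpName[64];       // written by the runtime exception module
};

// What the main process would prepare for a new child: only the first
// field is set, the other two start out empty.
constexpr WerDataSketch MakeWerData(uint64_t notifyProc) {
  return WerDataSketch{notifyProc, 0, {}};
}
```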
I know this sounds scary. It is. But bear with me please.
When we register the runtime exception module we can pass it a single
pointer-sized parameter but we need to pass it at least another pointer that
includes data coming from the child process itself (this one is called
InProcessWindowsErrorReportingData). This data currently includes only the
process type but will also include certain annotations in the future
(e.g. bug 1711418). So here's what we do: we store a pointer to the parent
data structure in the child process command-line (cringe) and we read it
from the runtime exception module by reading the crashed process command-line
arguments and parsing them (double-cringe).
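The parsing side can be sketched as follows, assuming purely for illustration that the address is written as a hexadecimal string in the last command-line argument. The actual encoding and argument position used by the patch are not stated here, and both function names are hypothetical.

```cpp
#include <cstdint>

// Parses a hexadecimal string such as "7ffe1000" into an address.
// Returns 0 for an empty or malformed string.
constexpr uint64_t ParseHexAddress(const char* text) {
  if (*text == '\0') return 0;
  uint64_t value = 0;
  for (const char* p = text; *p != '\0'; ++p) {
    const char c = *p;
    uint64_t digit = 0;
    if (c >= '0' && c <= '9') {
      digit = static_cast<uint64_t>(c - '0');
    } else if (c >= 'a' && c <= 'f') {
      digit = static_cast<uint64_t>(c - 'a' + 10);
    } else if (c >= 'A' && c <= 'F') {
      digit = static_cast<uint64_t>(c - 'A' + 10);
    } else {
      return 0;  // not a hex digit
    }
    value = (value << 4) | digit;
  }
  return value;
}

// Takes the address from the last argument (assumed position).
constexpr uint64_t WerDataAddressFromArgs(int argc, const char* const* argv) {
  return argc > 1 ? ParseHexAddress(argv[argc - 1]) : 0;
}
```

The resulting value is only meaningful as an address inside the main process, so the runtime exception module would still have to read the structure from that process' memory rather than dereference it directly.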
Armed with this information the WER runtime exception module can populate
the info for the generated minidump and then push it into the main process
by calling CreateRemoteThread() (which creates a new thread in the main
process, triple-cringe at this point).
Differential Revision: https://phabricator.services.mozilla.com/D115379