mirror of https://github.com/mozilla/gecko-dev.git
Bug 1651086 - Handle tgkill failure - r=canaltinova
On Linux (including Android), it was assumed that a registered thread could always be suspended through `tgkill`. However in some cases a thread may not be correctly unregistered, in which case this would trigger `MOZ_ASSERT` or wait forever in the following loop.

This will especially be needed when `profiler_{,un}register_thread()` are made less strict in the following patch.

Windows and Mac already handle suspension failures.

Differential Revision: https://phabricator.services.mozilla.com/D83292
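The failure mode described above can be sketched in isolation. This is a minimal illustration, not the profiler's code: the helper names `SendSignalToThread` and `TrySignalThread` are hypothetical, and going through the raw `syscall` interface is an assumption made so the sketch does not depend on the C library exporting `tgkill`. When no thread with the given id exists in the thread group, the call fails with `ESRCH`, and the caller can skip the sample instead of asserting.

```cpp
#include <cassert>
#include <csignal>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical wrapper over the raw Linux syscall, used here so the sketch
// does not rely on the C library exporting tgkill().
int SendSignalToThread(pid_t aPid, pid_t aTid, int aSignal) {
  return static_cast<int>(syscall(SYS_tgkill, aPid, aTid, aSignal));
}

// Hypothetical helper: returns true if the signal could be sent, false if
// it could not (for example ESRCH, when the target thread no longer exists).
bool TrySignalThread(pid_t aTid, int aSignal) {
  assert(aTid > 0);  // tgkill rejects non-positive ids with EINVAL.
  int r = SendSignalToThread(getpid(), aTid, aSignal);
  // On failure there is nothing to sample; the caller should skip any wait
  // for a reply, because no signal handler will ever run to send one.
  return r == 0;
}
```

A caller would pass the sampling signal, for example `TrySignalThread(tid, SIGPROF)`, and only enter the suspend/sample/resume sequence when it returns true. Passing signal `0` performs the existence check without delivering anything.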
Parent
760f059b00
Commit
7610ff4326
@@ -329,66 +329,67 @@ void Sampler::SuspendAndSampleAndResumeThread(
 
   // Send message 1 to the samplee (the thread to be sampled), by
   // signalling at it.
+  // This could fail if the thread doesn't exist anymore.
   int r = tgkill(mMyPid, sampleeTid, SIGPROF);
-  MOZ_ASSERT(r == 0);
-
-  // Wait for message 2 from the samplee, indicating that the context
-  // is available and that the thread is suspended.
-  while (true) {
-    r = sem_wait(&sSigHandlerCoordinator->mMessage2);
-    if (r == -1 && errno == EINTR) {
-      // Interrupted by a signal. Try again.
-      continue;
+  if (r == 0) {
+    // Wait for message 2 from the samplee, indicating that the context
+    // is available and that the thread is suspended.
+    while (true) {
+      r = sem_wait(&sSigHandlerCoordinator->mMessage2);
+      if (r == -1 && errno == EINTR) {
+        // Interrupted by a signal. Try again.
+        continue;
+      }
+      // We don't expect any other kind of failure.
+      MOZ_ASSERT(r == 0);
+      break;
     }
-    // We don't expect any other kind of failure.
+
+    //----------------------------------------------------------------//
+    // Sample the target thread.
+
+    // WARNING WARNING WARNING WARNING WARNING WARNING WARNING WARNING
+    //
+    // The profiler's "critical section" begins here. In the critical section,
+    // we must not do any dynamic memory allocation, nor try to acquire any lock
+    // or any other unshareable resource. This is because the thread to be
+    // sampled has been suspended at some entirely arbitrary point, and we have
+    // no idea which unsharable resources (locks, essentially) it holds. So any
+    // attempt to acquire any lock, including the implied locks used by the
+    // malloc implementation, risks deadlock. This includes TimeStamp::Now(),
+    // which gets a lock on Windows.
+
+    // The samplee thread is now frozen and sSigHandlerCoordinator->mUContext is
+    // valid. We can poke around in it and unwind its stack as we like.
+
+    // Extract the current register values.
+    Registers regs;
+    PopulateRegsFromContext(regs, &sSigHandlerCoordinator->mUContext);
+    aProcessRegs(regs, aNow);
+
+    //----------------------------------------------------------------//
+    // Resume the target thread.
+
+    // Send message 3 to the samplee, which tells it to resume.
+    r = sem_post(&sSigHandlerCoordinator->mMessage3);
     MOZ_ASSERT(r == 0);
-    break;
-  }
 
-  //----------------------------------------------------------------//
-  // Sample the target thread.
-
-  // WARNING WARNING WARNING WARNING WARNING WARNING WARNING WARNING
-  //
-  // The profiler's "critical section" begins here. In the critical section,
-  // we must not do any dynamic memory allocation, nor try to acquire any lock
-  // or any other unshareable resource. This is because the thread to be
-  // sampled has been suspended at some entirely arbitrary point, and we have
-  // no idea which unsharable resources (locks, essentially) it holds. So any
-  // attempt to acquire any lock, including the implied locks used by the
-  // malloc implementation, risks deadlock. This includes TimeStamp::Now(),
-  // which gets a lock on Windows.
-
-  // The samplee thread is now frozen and sSigHandlerCoordinator->mUContext is
-  // valid. We can poke around in it and unwind its stack as we like.
-
-  // Extract the current register values.
-  Registers regs;
-  PopulateRegsFromContext(regs, &sSigHandlerCoordinator->mUContext);
-  aProcessRegs(regs, aNow);
-
-  //----------------------------------------------------------------//
-  // Resume the target thread.
-
-  // Send message 3 to the samplee, which tells it to resume.
-  r = sem_post(&sSigHandlerCoordinator->mMessage3);
-  MOZ_ASSERT(r == 0);
-
-  // Wait for message 4 from the samplee, which tells us that it has
-  // finished with |sSigHandlerCoordinator|.
-  while (true) {
-    r = sem_wait(&sSigHandlerCoordinator->mMessage4);
-    if (r == -1 && errno == EINTR) {
-      continue;
+    // Wait for message 4 from the samplee, which tells us that it has
+    // finished with |sSigHandlerCoordinator|.
+    while (true) {
+      r = sem_wait(&sSigHandlerCoordinator->mMessage4);
+      if (r == -1 && errno == EINTR) {
+        continue;
+      }
+      MOZ_ASSERT(r == 0);
+      break;
     }
-    MOZ_ASSERT(r == 0);
-    break;
-  }
 
-  // The profiler's critical section ends here. After this point, none of the
-  // critical section limitations documented above apply.
-  //
-  // WARNING WARNING WARNING WARNING WARNING WARNING WARNING WARNING
+    // The profiler's critical section ends here. After this point, none of the
+    // critical section limitations documented above apply.
+    //
+    // WARNING WARNING WARNING WARNING WARNING WARNING WARNING WARNING
+  }
 
   // This isn't strictly necessary, but doing so does help pick up anomalies
   // in which the signal handler is running when it shouldn't be.
@@ -323,66 +323,67 @@ void Sampler::SuspendAndSampleAndResumeThread(
 
   // Send message 1 to the samplee (the thread to be sampled), by
   // signalling at it.
+  // This could fail if the thread doesn't exist anymore.
   int r = tgkill(mMyPid, sampleeTid, SIGPROF);
-  MOZ_ASSERT(r == 0);
-
-  // Wait for message 2 from the samplee, indicating that the context
-  // is available and that the thread is suspended.
-  while (true) {
-    r = sem_wait(&sSigHandlerCoordinator->mMessage2);
-    if (r == -1 && errno == EINTR) {
-      // Interrupted by a signal. Try again.
-      continue;
+  if (r == 0) {
+    // Wait for message 2 from the samplee, indicating that the context
+    // is available and that the thread is suspended.
+    while (true) {
+      r = sem_wait(&sSigHandlerCoordinator->mMessage2);
+      if (r == -1 && errno == EINTR) {
+        // Interrupted by a signal. Try again.
+        continue;
+      }
+      // We don't expect any other kind of failure.
+      MOZ_ASSERT(r == 0);
+      break;
     }
-    // We don't expect any other kind of failure.
+
+    //----------------------------------------------------------------//
+    // Sample the target thread.
+
+    // WARNING WARNING WARNING WARNING WARNING WARNING WARNING WARNING
+    //
+    // The profiler's "critical section" begins here. In the critical section,
+    // we must not do any dynamic memory allocation, nor try to acquire any lock
+    // or any other unshareable resource. This is because the thread to be
+    // sampled has been suspended at some entirely arbitrary point, and we have
+    // no idea which unsharable resources (locks, essentially) it holds. So any
+    // attempt to acquire any lock, including the implied locks used by the
+    // malloc implementation, risks deadlock. This includes TimeStamp::Now(),
+    // which gets a lock on Windows.
+
+    // The samplee thread is now frozen and sSigHandlerCoordinator->mUContext is
+    // valid. We can poke around in it and unwind its stack as we like.
+
+    // Extract the current register values.
+    Registers regs;
+    PopulateRegsFromContext(regs, &sSigHandlerCoordinator->mUContext);
+    aProcessRegs(regs, aNow);
+
+    //----------------------------------------------------------------//
+    // Resume the target thread.
+
+    // Send message 3 to the samplee, which tells it to resume.
+    r = sem_post(&sSigHandlerCoordinator->mMessage3);
     MOZ_ASSERT(r == 0);
-    break;
-  }
 
-  //----------------------------------------------------------------//
-  // Sample the target thread.
-
-  // WARNING WARNING WARNING WARNING WARNING WARNING WARNING WARNING
-  //
-  // The profiler's "critical section" begins here. In the critical section,
-  // we must not do any dynamic memory allocation, nor try to acquire any lock
-  // or any other unshareable resource. This is because the thread to be
-  // sampled has been suspended at some entirely arbitrary point, and we have
-  // no idea which unsharable resources (locks, essentially) it holds. So any
-  // attempt to acquire any lock, including the implied locks used by the
-  // malloc implementation, risks deadlock. This includes TimeStamp::Now(),
-  // which gets a lock on Windows.
-
-  // The samplee thread is now frozen and sSigHandlerCoordinator->mUContext is
-  // valid. We can poke around in it and unwind its stack as we like.
-
-  // Extract the current register values.
-  Registers regs;
-  PopulateRegsFromContext(regs, &sSigHandlerCoordinator->mUContext);
-  aProcessRegs(regs, aNow);
-
-  //----------------------------------------------------------------//
-  // Resume the target thread.
-
-  // Send message 3 to the samplee, which tells it to resume.
-  r = sem_post(&sSigHandlerCoordinator->mMessage3);
-  MOZ_ASSERT(r == 0);
-
-  // Wait for message 4 from the samplee, which tells us that it has
-  // finished with |sSigHandlerCoordinator|.
-  while (true) {
-    r = sem_wait(&sSigHandlerCoordinator->mMessage4);
-    if (r == -1 && errno == EINTR) {
-      continue;
+    // Wait for message 4 from the samplee, which tells us that it has
+    // finished with |sSigHandlerCoordinator|.
+    while (true) {
+      r = sem_wait(&sSigHandlerCoordinator->mMessage4);
+      if (r == -1 && errno == EINTR) {
+        continue;
+      }
+      MOZ_ASSERT(r == 0);
+      break;
     }
-    MOZ_ASSERT(r == 0);
-    break;
-  }
 
-  // The profiler's critical section ends here. After this point, none of the
-  // critical section limitations documented above apply.
-  //
-  // WARNING WARNING WARNING WARNING WARNING WARNING WARNING WARNING
+    // The profiler's critical section ends here. After this point, none of the
+    // critical section limitations documented above apply.
+    //
+    // WARNING WARNING WARNING WARNING WARNING WARNING WARNING WARNING
+  }
 
   // This isn't strictly necessary, but doing so does help pick up anomalies
   // in which the signal handler is running when it shouldn't be.
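Both wait loops in the patched function follow the same pattern: retry `sem_wait` when a signal interrupts it, and treat any other failure as unexpected. A minimal standalone sketch of that pattern follows. The helper name `WaitForMessage` is hypothetical, and plain `assert` stands in for `MOZ_ASSERT`; the patch itself keeps the loops inline.

```cpp
#include <cassert>
#include <cerrno>
#include <semaphore.h>

// Hypothetical helper: blocks until aSem is posted, retrying whenever the
// wait is interrupted by a signal (EINTR).
void WaitForMessage(sem_t* aSem) {
  while (true) {
    int r = sem_wait(aSem);
    if (r == -1 && errno == EINTR) {
      // Interrupted by a signal. Try again.
      continue;
    }
    // No other kind of failure is expected.
    assert(r == 0);
    break;
  }
}
```

This loop has no timeout, so it only terminates if the other side eventually posts the semaphore. That is why the patch enters it only after `tgkill` has succeeded.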