Bug 1382440 - Watch CPU usage in BHR r=froydnj
We would like to be able to see if a given hang in BHR occurred
under high CPU load, as this is an indication that the hang is
of less use to us, since it's likely that the external CPU use
is more responsible for it.
The way this works is fairly simple. We get the system CPU usage
on a scale from 0 to 1, and we get the current process's CPU
usage, also on a scale from 0 to 1, and we subtract the latter
from the former. We then compare this value to a threshold, which
is 1 - (1 / p), where p is the number of (virtual) cores on the
machine. This threshold might need to be tuned, so that we
require an entire physical core in order to not annotate the hang,
but for now it seemed the most reasonable line in the sand.
I should note that this considers CPU usage in child or parent
processes as external. While we are responsible for that CPU usage,
it still indicates that the stack we receive from BHR is of little
value to us, since the source of the actual hang is external to
that stack.
MozReview-Commit-ID: JkG53zq1MdY
--HG--
extra : rebase_source : 16553a9b5eac0a73cd1619c6ee01fa177ca60e58
2017-07-24 23:46:09 +03:00
|
|
|
/* -*- Mode: C++; tab-width: 8; indent-tabs-mode: nil; c-basic-offset: 2 -*- */
|
|
|
|
/* vim: set ts=8 sts=2 et sw=2 tw=80: */
|
|
|
|
/* This Source Code Form is subject to the terms of the Mozilla Public
|
|
|
|
* License, v. 2.0. If a copy of the MPL was not distributed with this
|
|
|
|
* file, You can obtain one at http://mozilla.org/MPL/2.0/. */
|
|
|
|
|
|
|
|
#ifndef mozilla_CPUUsageWatcher_h
|
|
|
|
#define mozilla_CPUUsageWatcher_h
|
|
|
|
|
|
|
|
#include <stdint.h>
|
|
|
|
|
|
|
|
#include "mozilla/HangAnnotations.h"
|
2020-11-23 18:49:02 +03:00
|
|
|
#include "mozilla/Result.h"
|
Bug 1382440 - Watch CPU usage in BHR r=froydnj
We would like to be able to see if a given hang in BHR occurred
under high CPU load, as this is an indication that the hang is
of less use to us, since it's likely that the external CPU use
is more responsible for it.
The way this works is fairly simple. We get the system CPU usage
on a scale from 0 to 1, and we get the current process's CPU
usage, also on a scale from 0 to 1, and we subtract the latter
from the former. We then compare this value to a threshold, which
is 1 - (1 / p), where p is the number of (virtual) cores on the
machine. This threshold might need to be tuned, so that we
require an entire physical core in order to not annotate the hang,
but for now it seemed the most reasonable line in the sand.
I should note that this considers CPU usage in child or parent
processes as external. While we are responsible for that CPU usage,
it still indicates that the stack we receive from BHR is of little
value to us, since the source of the actual hang is external to
that stack.
MozReview-Commit-ID: JkG53zq1MdY
--HG--
extra : rebase_source : 16553a9b5eac0a73cd1619c6ee01fa177ca60e58
2017-07-24 23:46:09 +03:00
|
|
|
|
2017-08-29 00:00:22 +03:00
|
|
|
// We only support OSX and Windows, because on Linux we're forced to read
|
|
|
|
// from /proc/stat in order to get global CPU values. We would prefer to not
|
|
|
|
// eat that cost for this.
|
|
|
|
#if defined(NIGHTLY_BUILD) && (defined(XP_WIN) || defined(XP_MACOSX))
|
|
|
|
# define CPU_USAGE_WATCHER_ACTIVE
|
|
|
|
#endif
|
|
|
|
|
Bug 1382440 - Watch CPU usage in BHR r=froydnj
We would like to be able to see if a given hang in BHR occurred
under high CPU load, as this is an indication that the hang is
of less use to us, since it's likely that the external CPU use
is more responsible for it.
The way this works is fairly simple. We get the system CPU usage
on a scale from 0 to 1, and we get the current process's CPU
usage, also on a scale from 0 to 1, and we subtract the latter
from the former. We then compare this value to a threshold, which
is 1 - (1 / p), where p is the number of (virtual) cores on the
machine. This threshold might need to be tuned, so that we
require an entire physical core in order to not annotate the hang,
but for now it seemed the most reasonable line in the sand.
I should note that this considers CPU usage in child or parent
processes as external. While we are responsible for that CPU usage,
it still indicates that the stack we receive from BHR is of little
value to us, since the source of the actual hang is external to
that stack.
MozReview-Commit-ID: JkG53zq1MdY
--HG--
extra : rebase_source : 16553a9b5eac0a73cd1619c6ee01fa177ca60e58
2017-07-24 23:46:09 +03:00
|
|
|
namespace mozilla {
|
|
|
|
|
2020-11-23 18:49:02 +03:00
|
|
|
// Start error values at 1 to allow using the UnusedZero Result
|
|
|
|
// optimization.
|
Bug 1382440 - Watch CPU usage in BHR r=froydnj
We would like to be able to see if a given hang in BHR occurred
under high CPU load, as this is an indication that the hang is
of less use to us, since it's likely that the external CPU use
is more responsible for it.
The way this works is fairly simple. We get the system CPU usage
on a scale from 0 to 1, and we get the current process's CPU
usage, also on a scale from 0 to 1, and we subtract the latter
from the former. We then compare this value to a threshold, which
is 1 - (1 / p), where p is the number of (virtual) cores on the
machine. This threshold might need to be tuned, so that we
require an entire physical core in order to not annotate the hang,
but for now it seemed the most reasonable line in the sand.
I should note that this considers CPU usage in child or parent
processes as external. While we are responsible for that CPU usage,
it still indicates that the stack we receive from BHR is of little
value to us, since the source of the actual hang is external to
that stack.
MozReview-Commit-ID: JkG53zq1MdY
--HG--
extra : rebase_source : 16553a9b5eac0a73cd1619c6ee01fa177ca60e58
2017-07-24 23:46:09 +03:00
|
|
|
enum CPUUsageWatcherError : uint8_t {
|
2020-11-23 18:49:02 +03:00
|
|
|
ClockGetTimeError = 1,
|
Bug 1382440 - Watch CPU usage in BHR r=froydnj
We would like to be able to see if a given hang in BHR occurred
under high CPU load, as this is an indication that the hang is
of less use to us, since it's likely that the external CPU use
is more responsible for it.
The way this works is fairly simple. We get the system CPU usage
on a scale from 0 to 1, and we get the current process's CPU
usage, also on a scale from 0 to 1, and we subtract the latter
from the former. We then compare this value to a threshold, which
is 1 - (1 / p), where p is the number of (virtual) cores on the
machine. This threshold might need to be tuned, so that we
require an entire physical core in order to not annotate the hang,
but for now it seemed the most reasonable line in the sand.
I should note that this considers CPU usage in child or parent
processes as external. While we are responsible for that CPU usage,
it still indicates that the stack we receive from BHR is of little
value to us, since the source of the actual hang is external to
that stack.
MozReview-Commit-ID: JkG53zq1MdY
--HG--
extra : rebase_source : 16553a9b5eac0a73cd1619c6ee01fa177ca60e58
2017-07-24 23:46:09 +03:00
|
|
|
GetNumberOfProcessorsError,
|
|
|
|
GetProcessTimesError,
|
|
|
|
GetSystemTimesError,
|
|
|
|
HostStatisticsError,
|
|
|
|
ProcStatError,
|
|
|
|
};
|
|
|
|
|
2020-11-23 18:49:02 +03:00
|
|
|
namespace detail {
|
|
|
|
|
|
|
|
template <>
|
|
|
|
struct UnusedZero<CPUUsageWatcherError> : UnusedZeroEnum<CPUUsageWatcherError> {
|
|
|
|
};
|
|
|
|
|
|
|
|
} // namespace detail
|
|
|
|
|
2018-04-30 04:21:20 +03:00
|
|
|
class CPUUsageHangAnnotator : public BackgroundHangAnnotator {
|
Bug 1382440 - Watch CPU usage in BHR r=froydnj
We would like to be able to see if a given hang in BHR occurred
under high CPU load, as this is an indication that the hang is
of less use to us, since it's likely that the external CPU use
is more responsible for it.
The way this works is fairly simple. We get the system CPU usage
on a scale from 0 to 1, and we get the current process's CPU
usage, also on a scale from 0 to 1, and we subtract the latter
from the former. We then compare this value to a threshold, which
is 1 - (1 / p), where p is the number of (virtual) cores on the
machine. This threshold might need to be tuned, so that we
require an entire physical core in order to not annotate the hang,
but for now it seemed the most reasonable line in the sand.
I should note that this considers CPU usage in child or parent
processes as external. While we are responsible for that CPU usage,
it still indicates that the stack we receive from BHR is of little
value to us, since the source of the actual hang is external to
that stack.
MozReview-Commit-ID: JkG53zq1MdY
--HG--
extra : rebase_source : 16553a9b5eac0a73cd1619c6ee01fa177ca60e58
2017-07-24 23:46:09 +03:00
|
|
|
public:
|
|
|
|
};
|
|
|
|
|
2018-04-30 04:21:20 +03:00
|
|
|
class CPUUsageWatcher : public BackgroundHangAnnotator {
|
Bug 1382440 - Watch CPU usage in BHR r=froydnj
We would like to be able to see if a given hang in BHR occurred
under high CPU load, as this is an indication that the hang is
of less use to us, since it's likely that the external CPU use
is more responsible for it.
The way this works is fairly simple. We get the system CPU usage
on a scale from 0 to 1, and we get the current process's CPU
usage, also on a scale from 0 to 1, and we subtract the latter
from the former. We then compare this value to a threshold, which
is 1 - (1 / p), where p is the number of (virtual) cores on the
machine. This threshold might need to be tuned, so that we
require an entire physical core in order to not annotate the hang,
but for now it seemed the most reasonable line in the sand.
I should note that this considers CPU usage in child or parent
processes as external. While we are responsible for that CPU usage,
it still indicates that the stack we receive from BHR is of little
value to us, since the source of the actual hang is external to
that stack.
MozReview-Commit-ID: JkG53zq1MdY
--HG--
extra : rebase_source : 16553a9b5eac0a73cd1619c6ee01fa177ca60e58
2017-07-24 23:46:09 +03:00
|
|
|
public:
|
2017-08-29 00:00:22 +03:00
|
|
|
#ifdef CPU_USAGE_WATCHER_ACTIVE
|
Bug 1382440 - Watch CPU usage in BHR r=froydnj
We would like to be able to see if a given hang in BHR occurred
under high CPU load, as this is an indication that the hang is
of less use to us, since it's likely that the external CPU use
is more responsible for it.
The way this works is fairly simple. We get the system CPU usage
on a scale from 0 to 1, and we get the current process's CPU
usage, also on a scale from 0 to 1, and we subtract the latter
from the former. We then compare this value to a threshold, which
is 1 - (1 / p), where p is the number of (virtual) cores on the
machine. This threshold might need to be tuned, so that we
require an entire physical core in order to not annotate the hang,
but for now it seemed the most reasonable line in the sand.
I should note that this considers CPU usage in child or parent
processes as external. While we are responsible for that CPU usage,
it still indicates that the stack we receive from BHR is of little
value to us, since the source of the actual hang is external to
that stack.
MozReview-Commit-ID: JkG53zq1MdY
--HG--
extra : rebase_source : 16553a9b5eac0a73cd1619c6ee01fa177ca60e58
2017-07-24 23:46:09 +03:00
|
|
|
CPUUsageWatcher()
|
|
|
|
: mInitialized(false),
|
|
|
|
mExternalUsageThreshold(0),
|
|
|
|
mExternalUsageRatio(0),
|
|
|
|
mProcessUsageTime(0),
|
|
|
|
mProcessUpdateTime(0),
|
|
|
|
mGlobalUsageTime(0),
|
|
|
|
mGlobalUpdateTime(0),
|
2018-06-15 14:41:20 +03:00
|
|
|
mNumCPUs(0) {}
|
2017-08-29 00:00:22 +03:00
|
|
|
#endif
|
Bug 1382440 - Watch CPU usage in BHR r=froydnj
We would like to be able to see if a given hang in BHR occurred
under high CPU load, as this is an indication that the hang is
of less use to us, since it's likely that the external CPU use
is more responsible for it.
The way this works is fairly simple. We get the system CPU usage
on a scale from 0 to 1, and we get the current process's CPU
usage, also on a scale from 0 to 1, and we subtract the latter
from the former. We then compare this value to a threshold, which
is 1 - (1 / p), where p is the number of (virtual) cores on the
machine. This threshold might need to be tuned, so that we
require an entire physical core in order to not annotate the hang,
but for now it seemed the most reasonable line in the sand.
I should note that this considers CPU usage in child or parent
processes as external. While we are responsible for that CPU usage,
it still indicates that the stack we receive from BHR is of little
value to us, since the source of the actual hang is external to
that stack.
MozReview-Commit-ID: JkG53zq1MdY
--HG--
extra : rebase_source : 16553a9b5eac0a73cd1619c6ee01fa177ca60e58
2017-07-24 23:46:09 +03:00
|
|
|
|
|
|
|
Result<Ok, CPUUsageWatcherError> Init();
|
|
|
|
|
|
|
|
void Uninit();
|
|
|
|
|
|
|
|
// Updates necessary values to allow AnnotateHang to function. This must be
|
|
|
|
// called on some semi-regular basis, as it will calculate the mean CPU
|
|
|
|
// usage values between now and the last time it was called.
|
|
|
|
Result<Ok, CPUUsageWatcherError> CollectCPUUsage();
|
|
|
|
|
2018-04-30 04:21:20 +03:00
|
|
|
void AnnotateHang(BackgroundHangAnnotations& aAnnotations) final;
|
2018-11-30 13:46:48 +03:00
|
|
|
|
Bug 1382440 - Watch CPU usage in BHR r=froydnj
We would like to be able to see if a given hang in BHR occurred
under high CPU load, as this is an indication that the hang is
of less use to us, since it's likely that the external CPU use
is more responsible for it.
The way this works is fairly simple. We get the system CPU usage
on a scale from 0 to 1, and we get the current process's CPU
usage, also on a scale from 0 to 1, and we subtract the latter
from the former. We then compare this value to a threshold, which
is 1 - (1 / p), where p is the number of (virtual) cores on the
machine. This threshold might need to be tuned, so that we
require an entire physical core in order to not annotate the hang,
but for now it seemed the most reasonable line in the sand.
I should note that this considers CPU usage in child or parent
processes as external. While we are responsible for that CPU usage,
it still indicates that the stack we receive from BHR is of little
value to us, since the source of the actual hang is external to
that stack.
MozReview-Commit-ID: JkG53zq1MdY
--HG--
extra : rebase_source : 16553a9b5eac0a73cd1619c6ee01fa177ca60e58
2017-07-24 23:46:09 +03:00
|
|
|
private:
|
2017-08-29 00:00:22 +03:00
|
|
|
#ifdef CPU_USAGE_WATCHER_ACTIVE
|
Bug 1382440 - Watch CPU usage in BHR r=froydnj
We would like to be able to see if a given hang in BHR occurred
under high CPU load, as this is an indication that the hang is
of less use to us, since it's likely that the external CPU use
is more responsible for it.
The way this works is fairly simple. We get the system CPU usage
on a scale from 0 to 1, and we get the current process's CPU
usage, also on a scale from 0 to 1, and we subtract the latter
from the former. We then compare this value to a threshold, which
is 1 - (1 / p), where p is the number of (virtual) cores on the
machine. This threshold might need to be tuned, so that we
require an entire physical core in order to not annotate the hang,
but for now it seemed the most reasonable line in the sand.
I should note that this considers CPU usage in child or parent
processes as external. While we are responsible for that CPU usage,
it still indicates that the stack we receive from BHR is of little
value to us, since the source of the actual hang is external to
that stack.
MozReview-Commit-ID: JkG53zq1MdY
--HG--
extra : rebase_source : 16553a9b5eac0a73cd1619c6ee01fa177ca60e58
2017-07-24 23:46:09 +03:00
|
|
|
bool mInitialized;
|
|
|
|
// The threshold above which we will mark a hang as occurring under high
|
|
|
|
// external CPU usage conditions
|
|
|
|
float mExternalUsageThreshold;
|
|
|
|
// The CPU usage (0-1) external to our process, averaged between the two
|
|
|
|
// most recent monitor thread runs
|
|
|
|
float mExternalUsageRatio;
|
|
|
|
// The total cumulative CPU usage time by our process as of the last
|
|
|
|
// CollectCPUUsage or Startup
|
|
|
|
uint64_t mProcessUsageTime;
|
|
|
|
// A time value in the same units as mProcessUsageTime used to
|
|
|
|
// determine the ratio of CPU usage time to idle time
|
|
|
|
uint64_t mProcessUpdateTime;
|
|
|
|
// The total cumulative CPU usage time by all processes as of the last
|
|
|
|
// CollectCPUUsage or Startup
|
|
|
|
uint64_t mGlobalUsageTime;
|
|
|
|
// A time value in the same units as mGlobalUsageTime used to
|
|
|
|
// determine the ratio of CPU usage time to idle time
|
|
|
|
uint64_t mGlobalUpdateTime;
|
|
|
|
// The number of virtual cores on our machine
|
|
|
|
uint64_t mNumCPUs;
|
2017-08-29 00:00:22 +03:00
|
|
|
#endif
|
Bug 1382440 - Watch CPU usage in BHR r=froydnj
We would like to be able to see if a given hang in BHR occurred
under high CPU load, as this is an indication that the hang is
of less use to us, since it's likely that the external CPU use
is more responsible for it.
The way this works is fairly simple. We get the system CPU usage
on a scale from 0 to 1, and we get the current process's CPU
usage, also on a scale from 0 to 1, and we subtract the latter
from the former. We then compare this value to a threshold, which
is 1 - (1 / p), where p is the number of (virtual) cores on the
machine. This threshold might need to be tuned, so that we
require an entire physical core in order to not annotate the hang,
but for now it seemed the most reasonable line in the sand.
I should note that this considers CPU usage in child or parent
processes as external. While we are responsible for that CPU usage,
it still indicates that the stack we receive from BHR is of little
value to us, since the source of the actual hang is external to
that stack.
MozReview-Commit-ID: JkG53zq1MdY
--HG--
extra : rebase_source : 16553a9b5eac0a73cd1619c6ee01fa177ca60e58
2017-07-24 23:46:09 +03:00
|
|
|
};
|
|
|
|
|
|
|
|
} // namespace mozilla
|
|
|
|
|
|
|
|
#endif // mozilla_CPUUsageWatcher_h
|