This patch attempts to record the reason why we fall back to DNS.
I considered using categorical probes for this, but they are limited to
20 categories, so we have to use a linear probe. I chose 50 buckets so that
we can add more failure reasons in the future.
The recorded values are defined in nsHostRecord::TRRSkippedReason.
nsHostRecord::RecordReason is called whenever we encounter a condition that
will cause us to skip TRR in nsHostResolver.
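As a rough illustration of the recording path described above, here is a minimal standalone sketch. The enum values, the HostRecordSketch and LinearProbeSketch types, and the "first reason wins" policy are all assumptions made for the example; the real values are defined in nsHostRecord::TRRSkippedReason and the real accumulation goes through the telemetry system.

```cpp
#include <array>
#include <cstdint>

// Hypothetical subset of skip reasons; names and values are illustrative only.
enum class SkippedReasonSketch : uint32_t {
  Unset = 0,
  Ok = 1,
  NotConfirmed = 2,
  ExcludedHost = 3,
};

// 50 buckets, matching the bucket count chosen for the linear probe.
constexpr uint32_t kBucketCount = 50;

struct HostRecordSketch {
  SkippedReasonSketch mReason = SkippedReasonSketch::Unset;

  // Assumption: only the first reason is kept, so later conditions do not
  // overwrite the one that originally caused TRR to be skipped.
  void RecordReason(SkippedReasonSketch aReason) {
    if (mReason == SkippedReasonSketch::Unset) {
      mReason = aReason;
    }
  }
};

// Stand-in for a linear histogram: one counter per bucket.
struct LinearProbeSketch {
  std::array<uint32_t, kBucketCount> mBuckets{};

  void Accumulate(SkippedReasonSketch aReason) {
    uint32_t index = static_cast<uint32_t>(aReason);
    if (index >= kBucketCount) {
      index = kBucketCount - 1;  // clamp out-of-range values into the last bucket
    }
    ++mBuckets[index];
  }
};
```

Because each reason maps directly to a bucket index, new reasons can be appended to the enum without changing the probe as long as their values stay below the bucket count.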
For failures that occur inside TRR.cpp, each TRR object holds its own reason,
which is recorded in a similar way. When all TRR requests are complete, we
report the reason for the one that failed (or, if both failed, the reason for
the A request).
Because we might also follow CNAME requests, the final TRR request might not
be the one that was issued first, so TRR requests must pass the reason back
as an argument to CompleteLookup.
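The rule for choosing which reason to report once both TRR requests have completed could be sketched as follows. TrrReasonSketch and PickReportedReason are hypothetical names for this example and do not appear in the patch.

```cpp
#include <cstdint>

// Hypothetical reasons; only Ok is treated as success in this sketch.
enum class TrrReasonSketch : uint32_t {
  Ok = 1,
  Timeout = 2,
  NxDomain = 3,
};

inline bool Failed(TrrReasonSketch aReason) {
  return aReason != TrrReasonSketch::Ok;
}

// Report the reason of the request that failed. If both failed, prefer the
// reason of the A request. If neither failed, the A reason (Ok) is returned.
inline TrrReasonSketch PickReportedReason(TrrReasonSketch aReasonA,
                                          TrrReasonSketch aReasonAAAA) {
  if (Failed(aReasonA)) {
    return aReasonA;
  }
  if (Failed(aReasonAAAA)) {
    return aReasonAAAA;
  }
  return aReasonA;
}
```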
Finally, this patch records the reason in two probes:
TRR_SKIP_REASON_TRR_FIRST - only reported in TRR-first mode
TRR_SKIP_REASON_DNS_WORKED - only reported in TRR-first mode when the
fallback DNS request succeeded. This allows us to filter for complete
network failures.
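The conditions under which each of the two probes is reported can be summarized in a small sketch. ResolverModeSketch and ProbesToReport are hypothetical helpers for illustration; only the probe names come from the patch.

```cpp
#include <string>
#include <vector>

// Hypothetical resolver modes; only TRR-first matters for these probes.
enum class ResolverModeSketch { NativeOnly, TrrFirst, TrrOnly };

// Returns the names of the probes that would receive the skip reason.
inline std::vector<std::string> ProbesToReport(ResolverModeSketch aMode,
                                               bool aFallbackDnsSucceeded) {
  std::vector<std::string> probes;
  if (aMode != ResolverModeSketch::TrrFirst) {
    return probes;  // neither probe is reported outside TRR-first mode
  }
  probes.push_back("TRR_SKIP_REASON_TRR_FIRST");
  if (aFallbackDnsSucceeded) {
    // Only recorded when the fallback DNS request worked, which lets
    // complete network failures be filtered out of the data.
    probes.push_back("TRR_SKIP_REASON_DNS_WORKED");
  }
  return probes;
}
```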
Differential Revision: https://phabricator.services.mozilla.com/D82168
Due to a change in timing in this patch, when we reset the confirmation pref
at the end of the test, a TRR request would happen after we changed the prefs,
which led to accessing a non-local IP in testing and caused a crash.
This should be gated on being in the correct mode anyway.
Depends on D81517
Differential Revision: https://phabricator.services.mozilla.com/D82222
mTRRUsed is a variable that we check to gate several telemetry probes and to
decide whether TRR really failed and we should add a domain to the TRR
blocklist.
The problem with setting it too early is that if it is true but we never
actually send the TRR request, we report that we fell back to Do53 and
may skip TRR requests in the future.
The solution is to set mTRRUsed only if TRRServiceChannel::AsyncOpen
succeeds.
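The ordering change can be illustrated with a minimal standalone sketch. LookupStateSketch and SendTrrRequest are hypothetical, and the callback merely stands in for TRRServiceChannel::AsyncOpen; this is not the actual resolver code.

```cpp
#include <functional>

struct LookupStateSketch {
  bool mTRRUsed = false;
};

// aAsyncOpen stands in for TRRServiceChannel::AsyncOpen and returns true
// when the request was actually sent.
inline bool SendTrrRequest(LookupStateSketch& aState,
                           const std::function<bool()>& aAsyncOpen) {
  if (!aAsyncOpen()) {
    // The request never went out, so mTRRUsed stays false and the lookup
    // is not treated as a TRR failure.
    return false;
  }
  // Only mark TRR as used once the channel has been opened successfully.
  aState.mTRRUsed = true;
  return true;
}
```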
Differential Revision: https://phabricator.services.mozilla.com/D81517