# v8.0 Performance FAQ

### Give me a summary of the perf score changes in v8.0. What's new/different?

First, it may be useful to refresh on [the math behind Lighthouse's metric
scores and performance score](https://developer.chrome.com/docs/lighthouse/performance/performance-scoring/).

In [Lighthouse v8.0](https://github.com/GoogleChrome/lighthouse/releases/tag/v8.0.0), we updated the score curves for FCP and TBT,
making both a bit stricter. CLS has been updated to its new, [windowed
definition](https://web.dev/articles/evolving-cls). Additionally, the performance
score's weighted average was
[rebalanced](https://googlechrome.github.io/lighthouse/scorecalc/#FCP=3000&SI=5800&FMP=4000&TTI=7300&FCI=6500&LCP=4000&TBT=600&CLS=0.25&device=mobile&version=8&version=6&version=5),
giving more weight to CLS and TBT than before and slightly decreasing the
weights of FCP, SI, and TTI.

From an analysis of HTTP Archive's latest [crawl of the
web](https://httparchive.org/faq#how-does-the-http-archive-decide-which-urls-to-test),
we project that the performance score for the majority of sites will stay the
same or improve in Lighthouse 8.0:

- ~20% of sites may see a drop of up to 5 points, though likely less
- ~20% of sites will see little detectable change
- ~30% of sites should see a moderate improvement of a few points
- ~30% could see a significant improvement of 5 points or more

The biggest drops in scores are due to TBT scoring becoming stricter and the
metric's slightly higher weight. The biggest improvements in scores are likewise
due to the TBT changes in the long tail, the windowing of CLS, and both metrics'
higher weights.

### What are the exact score weighting changes?

#### Changes by metric

| metric                         | v6 weight | v8 weight | Δ   |
|--------------------------------|-----------|-----------|-----|
| First Contentful Paint (FCP)   | 15        | **10**    | -5  |
| Speed Index (SI)               | 15        | **10**    | -5  |
| Largest Contentful Paint (LCP) | 25        | **25**    | 0   |
| Time To Interactive (TTI)      | 15        | **10**    | -5  |
| Total Blocking Time (TBT)      | 25        | **30**    | +5  |
| Cumulative Layout Shift (CLS)  | 5         | **15**    | +10 |

#### Changes by phase

| phase          | metric                         | v6 phase weight | v8 phase weight | Δ   |
|----------------|--------------------------------|-----------------|-----------------|-----|
| early          | First Contentful Paint (FCP)   | 15              | 10              | -5  |
| mid            | Speed Index (SI)               | 40              | 35              | -5  |
|                | Largest Contentful Paint (LCP) |                 |                 |     |
| interactivity  | Time To Interactive (TTI)      | 40              | 40              | 0   |
|                | Total Blocking Time (TBT)      |                 |                 |     |
| predictability | Cumulative Layout Shift (CLS)  | 5               | 15              | +10 |
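
To make the rebalancing concrete, here's a minimal sketch, in TypeScript, of
how Lighthouse blends metric scores: each metric's 0-1 score (produced by its
score curve) is combined in a weighted average. The metric scores below are
illustrative inputs, not a Lighthouse API.

```ts
type MetricId = 'FCP' | 'SI' | 'LCP' | 'TTI' | 'TBT' | 'CLS';

const V6_WEIGHTS: Record<MetricId, number> =
  {FCP: 0.15, SI: 0.15, LCP: 0.25, TTI: 0.15, TBT: 0.25, CLS: 0.05};
const V8_WEIGHTS: Record<MetricId, number> =
  {FCP: 0.10, SI: 0.10, LCP: 0.25, TTI: 0.10, TBT: 0.30, CLS: 0.15};

// Weighted average of per-metric scores, scaled to 0-100.
function overallScore(metricScores: Record<MetricId, number>,
                      weights: Record<MetricId, number>): number {
  let total = 0;
  for (const id of Object.keys(weights) as MetricId[]) {
    total += metricScores[id] * weights[id];
  }
  return Math.round(total * 100);
}

// Example: a page with poor CLS but otherwise decent metric scores.
const scores: Record<MetricId, number> =
  {FCP: 0.9, SI: 0.85, LCP: 0.8, TTI: 0.85, TBT: 0.75, CLS: 0.3};
console.log(overallScore(scores, V6_WEIGHTS)); // 79
console.log(overallScore(scores, V8_WEIGHTS)); // 73 (CLS now carries 3x the weight)
```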

### Why did the weight of CLS go up?

When CLS was introduced in Lighthouse v6, it was still early days for the metric.
There have been [many improvements and
bugfixes](https://chromium.googlesource.com/chromium/src/+/refs/heads/main/docs/speed/metrics_changelog/cls.md)
to the CLS metric since then. Now, given its maturity and its established place
in Core Web Vitals, its weight increases from 5% to 15%.

### Why are the Core Web Vitals metrics weighted differently in the performance score?

The Core Web Vitals metrics are [independent signals in the Page Experience
ranking
update](https://support.google.com/webmasters/thread/104436075/core-web-vitals-page-experience-faqs-updated-march-2021).
Lighthouse weighs each lab-equivalent metric based on what we believe creates
the best incentives to improve overall page experience for users.

LCP, CLS, and TBT are [very good
metrics](https://chromium.googlesource.com/chromium/src/+/lkgr/docs/speed/good_toplevel_metrics.md),
and that's why they are the three highest-weighted metrics in the performance
score.

### How should I think about the Lighthouse performance score in relation to Core Web Vitals?

[Core Web Vitals](https://web.dev/articles/vitals) refers to a specific set of key user
experience metrics, their passing thresholds, and the percentile at which they're measured.
In general, CWV's primary focus is field data.

The Lighthouse score is a means to understand the degree of opportunity
available to improve critical elements of user experience. The lower the score,
the more likely the user will struggle with load performance, responsiveness, or
content stability.

Lighthouse's lab-based data overlaps with Core Web Vitals in a few key ways.
Lighthouse features two of the three Core Web Vitals (LCP and CLS) with the
exact same passing thresholds. There's no user input in a Lighthouse run, so it
cannot compute FID. Instead, it reports TBT, which you can consider a proxy
metric for FID: though they measure two different things, both are signals
about a page's interactivity.
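
As an illustration of "the exact same passing thresholds," here's a small
sketch that classifies lab LCP and CLS against the published Core Web Vitals
"good" thresholds (LCP at or under 2.5 s, CLS at or under 0.1). The function
and input shape are hypothetical, not Lighthouse APIs.

```ts
interface LabMetrics {
  lcpMs: number; // Largest Contentful Paint, in milliseconds
  cls: number;   // Cumulative Layout Shift, unitless
}

// Hypothetical helper: checks the two Core Web Vitals Lighthouse can measure.
function passesCwvThresholds({lcpMs, cls}: LabMetrics): boolean {
  const lcpGood = lcpMs <= 2500; // same "good" threshold as field LCP
  const clsGood = cls <= 0.1;    // same "good" threshold as field CLS
  // FID has no lab equivalent; TBT stands in as the interactivity signal.
  return lcpGood && clsGood;
}

console.log(passesCwvThresholds({lcpMs: 2100, cls: 0.05})); // true
console.log(passesCwvThresholds({lcpMs: 3200, cls: 0.05})); // false
```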

_So CWV and Lighthouse have commonalities, but are different. How can you
rationalize paying attention to both?_

Ultimately, a combination approach is most effective. Use field data for the
long-term overview of your users' experience, and use lab data to iterate your
way to the best experience possible for your users. CrUX data summarizes [the
most recent 28
days](https://developers.google.com/web/tools/chrome-user-experience-report/api/reference#data-pipeline),
so it'll take some time to confidently determine that any change has had a
definite impact.
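
If you want to pull that field view programmatically, here's a sketch against
the CrUX API's `records:queryRecord` endpoint; `YOUR_API_KEY` is a placeholder,
and only the response fields this sketch reads are shown.

```ts
// Fetch the 28-day field LCP at the 75th percentile for an origin.
async function fetchFieldLcpP75(origin: string): Promise<number> {
  const resp = await fetch(
    'https://chromeuxreport.googleapis.com/v1/records:queryRecord?key=YOUR_API_KEY',
    {
      method: 'POST',
      headers: {'Content-Type': 'application/json'},
      body: JSON.stringify({origin, metrics: ['largest_contentful_paint']}),
    },
  );
  if (!resp.ok) throw new Error(`CrUX API error: ${resp.status}`);
  const {record} = await resp.json();
  // p75 is the percentile-of-record used for Core Web Vitals assessment.
  return Number(record.metrics['largest_contentful_paint'].percentiles.p75);
}

fetchFieldLcpP75('https://example.com').then((p75) => {
  console.log(`Field LCP p75 over the last 28 days: ${p75} ms`);
});
```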

Lighthouse's analysis lets you debug and optimize in an environment that is
repeatable and has an immediate feedback loop. In addition, lab-based tooling
can provide significantly more detail than field instrumentation, as it isn't
limited to web-exposed APIs or constrained by cross-origin restrictions.

The exact numbers of your lab and field metrics aren't expected to match, but
any substantial improvement to your lab metrics should be observable in the
field once it's been deployed. The higher the Lighthouse score, the less you're
leaving up to chance in the field.

### What blindspots from the field can lab tooling illuminate?

Field data covers successful page loads. But if failed and aborted loads
are excluded, or reporting is blocked by extensions, the collected field data
can suffer from survivorship bias. Users who have better experiences use your
site more; that's why we care about performance in the first place! Lab tooling
shows you the quality of the experience for the users that field data might be
[missing entirely](https://blog.chriszacharias.com/page-weight-matters).

Lighthouse mobile reports emulate a slow 4G connection on a mid-tier Android
device. While field data might not indicate these conditions are especially
common for your site, analyzing how your site performs under these tougher
conditions helps expand your site's audience. Lighthouse surfaces the worst
experiences, the ones you can't see in the field because they were so bad the
user never came back (or never waited around in the first place).
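
For reference, here's a minimal sketch of a programmatic Lighthouse run, which
applies this mobile emulation and simulated slow-4G throttling by default. It
assumes the `lighthouse` and `chrome-launcher` npm packages are installed.

```ts
import * as chromeLauncher from 'chrome-launcher';
import lighthouse from 'lighthouse';

async function runMobileAudit(url: string) {
  // Launch a headless Chrome for Lighthouse to drive.
  const chrome = await chromeLauncher.launch({chromeFlags: ['--headless']});
  try {
    // Defaults: mobile emulation plus simulated slow-4G and CPU throttling.
    const result = await lighthouse(url, {
      port: chrome.port,
      onlyCategories: ['performance'],
    });
    const perf = result?.lhr.categories.performance;
    console.log(`${url} performance score: ${(perf?.score ?? 0) * 100}`);
  } finally {
    await chrome.kill();
  }
}

runMobileAudit('https://example.com');
```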

As always, using both lab and field data to understand and optimize your user
experience is best practice. [Read more about field & lab data](https://developers.google.com/web/fundamentals/performance/speed-tools).

### How should I work to optimize CLS differently given that it has been updated?

The [windowing adjustment](https://web.dev/articles/evolving-cls)
will likely not have much effect on the lab measurement, but it will have
a large effect on field CLS for long-lived pages.

Lighthouse 8.0 introduces another adjustment to our CLS definition: it now
includes layout shift contributions from subframes. This brings our
implementation in line with how CrUX computes field CLS. The implication is
that iframes (including ones you may not control) may be adding layout shifts
that ultimately affect your CLS score. Keep in mind that the subframe
contributions are [weighted by the in-viewport
portion](https://github.com/WICG/layout-instability#cumulative-scores) of the
iframe.
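
To illustrate the windowed definition, here's a sketch of session-window CLS:
layout shifts are grouped into windows that last at most 5 seconds and close
after a 1-second gap, and the largest window's total is reported. The input
shape is illustrative, not the LayoutShift API itself.

```ts
interface Shift {
  startTime: number; // ms since navigation start
  value: number;     // layout shift score (subframe shifts already down-weighted)
}

function windowedCls(shifts: Shift[]): number {
  const MAX_WINDOW_MS = 5000; // a session window lasts at most 5 s
  const MAX_GAP_MS = 1000;    // ...and closes after a 1 s gap between shifts
  let maxWindow = 0;
  let windowStart = -Infinity;
  let prevShift = -Infinity;
  let current = 0;

  for (const shift of [...shifts].sort((a, b) => a.startTime - b.startTime)) {
    const gapExceeded = shift.startTime - prevShift > MAX_GAP_MS;
    const windowFull = shift.startTime - windowStart > MAX_WINDOW_MS;
    if (gapExceeded || windowFull) {
      current = 0; // start a new session window
      windowStart = shift.startTime;
    }
    current += shift.value;
    prevShift = shift.startTime;
    maxWindow = Math.max(maxWindow, current);
  }
  return maxWindow;
}

// A long-lived page no longer accumulates shifts forever:
console.log(windowedCls([
  {startTime: 1000, value: 0.08},
  {startTime: 1400, value: 0.05},   // same window: 0.13
  {startTime: 20_000, value: 0.06}, // new window after a long gap: 0.06
])); // 0.13
```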

### Why don't the numbers for TBT and FID match, if TBT is a proxy metric for FID?

The commonality between TBT (collected in a lab environment) and FID (collected
in the field) is that both measure the impact of long main-thread tasks on
input responsiveness. Beyond that, they're quite different. FID captures the
delay in handling the first input event of the page, whenever that input
happens. TBT roughly captures how dangerous the combined length of the main
thread's long tasks is.
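
For intuition, here's a sketch of the TBT calculation: between FCP and TTI,
every main-thread task longer than 50 ms contributes its excess over 50 ms
("blocking time"). The task data is illustrative, and the sketch is simplified
in that it ignores tasks straddling the window boundaries.

```ts
interface Task {
  startTime: number; // ms
  duration: number;  // ms
}

function totalBlockingTime(tasks: Task[], fcpMs: number, ttiMs: number): number {
  const BLOCKING_THRESHOLD_MS = 50;
  return tasks
    // Simplification: only count tasks fully inside the FCP..TTI window.
    .filter((t) => t.startTime >= fcpMs && t.startTime + t.duration <= ttiMs)
    .reduce((sum, t) => sum + Math.max(0, t.duration - BLOCKING_THRESHOLD_MS), 0);
}

// Two long tasks (250 ms and 90 ms) block for 200 ms + 40 ms = 240 ms total.
console.log(totalBlockingTime(
  [{startTime: 1200, duration: 250}, {startTime: 3000, duration: 90}],
  1000, // FCP
  5000, // TTI
)); // 240
```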

It's very possible for a page to do well on FID but poorly on TBT. And
it's slightly harder, but possible, to do well on TBT but poorly on FID\*. So
you shouldn't expect your TBT and FID measurements to correlate strongly. A
large-scale analysis found their [Spearman's
ρ](https://en.wikipedia.org/wiki/Spearman%27s_rank_correlation_coefficient) to
be about 0.40, which indicates a connection, but not one as strong as many
would prefer.
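
For the curious: Spearman's ρ is the correlation of ranks, and with no tied
values it reduces to a closed-form formula. A toy sketch with made-up per-page
TBT and FID samples:

```ts
// Rank each value from 1..n (assumes no ties, matching the formula below).
function ranks(values: number[]): number[] {
  const order = values.map((v, i) => [v, i] as const).sort((a, b) => a[0] - b[0]);
  const r = new Array<number>(values.length);
  order.forEach(([, originalIndex], rank) => {
    r[originalIndex] = rank + 1;
  });
  return r;
}

// Spearman's ρ via the no-ties formula: 1 - 6·Σd² / (n·(n² - 1)).
function spearman(x: number[], y: number[]): number {
  const rx = ranks(x);
  const ry = ranks(y);
  const n = x.length;
  const sumD2 = rx.reduce((sum, r, i) => sum + (r - ry[i]) ** 2, 0);
  return 1 - (6 * sumD2) / (n * (n * n - 1));
}

// Loosely related samples (TBT ms vs. FID ms) give a moderate ρ:
console.log(spearman([100, 300, 250, 800, 120], [16, 90, 30, 70, 120]).toFixed(2)); // 0.30
```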

From the Lighthouse project's perspective, the current passing threshold for FID
is quite lenient, but more importantly, the percentile-of-record for FID (the
75th percentile) is not sufficient for detecting issues. The 95th percentile is
a much stronger indicator of problematic interactions for this metric. We
encourage user-centric teams to focus on the 95th percentile of all input delays
(not just the first) in their field data in order to identify and address
problems that surface just 5% of the time.
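
A quick sketch of why the percentile-of-record matters, using nearest-rank
percentiles over illustrative input-delay samples: p75 can look healthy while
p95 exposes the slow tail.

```ts
// Nearest-rank percentile: the value at rank ceil(p/100 · n) in sorted order.
function percentile(samples: number[], p: number): number {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

// 20 input delays in ms: mostly fast, with a slow tail.
const delays = [
  4, 5, 5, 6, 7, 8, 8, 9, 10, 12,
  13, 14, 15, 16, 18, 40, 60, 90, 180, 320,
];
console.log(percentile(delays, 75)); // 18 ms: looks fine
console.log(percentile(delays, 95)); // 180 ms: the tail tells another story
```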

\*Aside: the [Chrome 91 FID change for
double-tap-to-zoom](https://chromium.googlesource.com/chromium/src.git/+/refs/heads/main/docs/speed/metrics_changelog/2021_05_fid.md)
fixes a lot of high FID / low TBT cases and may be observable in your field
metrics, with higher percentiles improving slightly. Most remaining high FID /
low TBT cases are likely due to incorrect meta viewport tags, which [Lighthouse
will flag](https://developer.chrome.com/docs/lighthouse/pwa/viewport/).
Delivering a mobile-friendly viewport, reducing main-thread-blocking JS, and
keeping your TBT low are the best defenses against bad FID in the field.

### Overall, what motivated the changes to the performance score?

As with all Lighthouse score updates, changes are made to reflect the latest in
how to measure user-experience quality holistically and accurately, and to
focus attention on key priorities.

Heavy JS and long tasks are a problem for the web that's
[worsening](https://httparchive.org/reports/state-of-javascript#bytesJs). Field
FID is currently too lenient and does not sufficiently incentivize action to
address the problem. Lighthouse has historically weighed its interactivity
metrics at 40-55% of the performance score, and, as interactivity is key to
user experience, we maintain a 40% weighting (TBT and TTI together) in
Lighthouse 8.0.

[FCP's score curve was
adjusted](https://github.com/GoogleChrome/lighthouse/pull/12556) to align with
the current de facto ["good" threshold](https://web.dev/articles/fcp#what_is_a_good_fcp_score),
and as a result FCP will be scored a bit more strictly.

The curve for TBT was made stricter to [more closely
approach](https://github.com/GoogleChrome/lighthouse/pull/12576) the ideal
score curve. TBT has had (and still has) a more lenient curve than our
methodology dictates, but the new curve is more linear, which means there's a
larger range in which improvements in the metric are rewarded with improvements
in the score. If your page currently scores poorly on TBT, the new curve will
be more responsive as page performance incrementally improves.
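
For intuition about score curves in general: each metric value is mapped to a
0-1 score through a log-normal curve pinned by two control points, a p10 value
that scores 0.9 and a median that scores 0.5. The sketch below uses
illustrative TBT control points (not the exact v8 constants) and a textbook
approximation of the complementary error function.

```ts
// Complementary error function (Abramowitz & Stegun approximation 7.1.26).
function erfc(x: number): number {
  const z = Math.abs(x);
  const t = 1 / (1 + 0.3275911 * z);
  const poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741 +
               t * (-1.453152027 + t * 1.061405429))));
  const result = poly * Math.exp(-z * z);
  return x >= 0 ? result : 2 - result;
}

// Score = 1 - CDF of a log-normal distribution fit to the two control points.
function logNormalScore(value: number, p10: number, median: number): number {
  const mu = Math.log(median);
  // ln(p10) sits 1.28155 standard deviations below ln(median).
  const sigma = (mu - Math.log(p10)) / 1.28155;
  const z = (Math.log(value) - mu) / sigma;
  return 0.5 * erfc(z / Math.SQRT2);
}

// Tightening the p10 control point lowers the score for the same TBT value:
console.log(logNormalScore(400, 300, 600).toFixed(2)); // ~0.77 (lenient curve)
console.log(logNormalScore(400, 200, 600).toFixed(2)); // ~0.68 (stricter curve)
```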

FCP's weight drops slightly, from 15% to 10%, because it's fairly gameable and
is also partly captured by Speed Index.

### What's the story with TTI?

TTI serves a useful role as the largest metric value reported (often >10
seconds), which helps anchor perceptions.

We see TBT as a stronger metric for evaluating the health of your main thread
and its impact on interactivity, plus it [has lower
variability](https://docs.google.com/document/d/1xCERB_X7PiP5RAZDwyIkODnIXoBk-Oo7Mi9266aEdGg/edit).
TTI serves as a nice complement that captures the cost of long tasks, often
from heavy JavaScript. That said, we expect to continue to reduce the weight
of TTI and will likely remove it in a future major Lighthouse release.

### How does the Lighthouse Perf score get calculated? What is it based on?

The Lighthouse perf score is calculated from a weighted blend of performance
metrics. You can see the current and previous Lighthouse score compositions
(which metrics we blend together, and at what weights) in the [score
calculator](https://googlechrome.github.io/lighthouse/scorecalc/#FCP=3000&SI=5800&FMP=4000&TTI=7300&FCI=6500&LCP=4000&TBT=600&CLS=0.25&device=mobile&version=8&version=6&version=5),
and learn more about the [calculation
specifics](https://developer.chrome.com/docs/lighthouse/performance/performance-scoring/).

### What is the most exciting update in LH v8?

We're really excited about the [interactive
treemap](https://github.com/GoogleChrome/lighthouse/releases/tag/v7.5.0#:~:text=We%20are%20releasing%20the%20Lighthouse%20Treemap),
[filtering audits by
metric](https://github.com/GoogleChrome/lighthouse/releases/tag/v8.0.0#:~:text=The%20report%20includes%20a-,new%20metric%20filter,-.%20Pick%20a%20metric),
and the new [Content Security Policy
audit](https://web.dev/articles/strict-csp#adopting_a_strict_csp), which was a
collaboration with the Google Web Security team.