# bigquery-etl Code Graveyard

This document records interesting code that we've deleted, so that it stays
discoverable in the future.

## 2021-12 ASN aggregates

- [Removal PR](https://github.com/mozilla/bigquery-etl/pull/2580)
- [Bug](https://mozilla-hub.atlassian.net/browse/DSRE-197)

This dataset was no longer being actively used, and removing it allows us to
limit Airflow's access to `payload_bytes_raw`.

## 2021-09 Remove document sampling queries

- [Removal PR](https://github.com/mozilla/bigquery-etl/pull/2389)
- [Bug](https://bugzilla.mozilla.org/show_bug.cgi?id=1731777)

We've removed the CI task from mozilla-pipeline-schemas that used this
document sample, so there is no further need for the ETL to support it.

## 2021-08 Remove amplitude views

- [Removal PR](https://github.com/mozilla/bigquery-etl/pull/2279)
- [DAG removal PR](https://github.com/mozilla/telemetry-airflow/pull/1328)

We no longer send data to Amplitude, so these views and scripts were
no longer being used.

## 2021-05 attitudes_daily

- [Removal PR](https://github.com/mozilla/bigquery-etl/pull/2003)
- [DAG removal PR](https://github.com/mozilla/telemetry-airflow/pull/1299)
- [Bug 1707965](https://bugzilla.mozilla.org/show_bug.cgi?id=1707965)

This pipeline was no longer being actively used. There may be a need for
a similar pipeline for an upcoming survey, so this removed code can
serve as a useful reference in that effort.

## 2021-03 Account Ecosystem Telemetry (AET) derived tables

- [Removal PR](https://github.com/mozilla/bigquery-etl/pull/1894)

AET was never released except for a short test in the beta population,
and now the project has been decommissioned, so there is no longer
any need for these derived tables.

## 2020-12 Deviations

- [Removal PR](https://github.com/mozilla/bigquery-etl/pull/2005)
- [DAG removal PR](https://github.com/mozilla/bigquery-etl/pull/1637)
- [Blog Post](https://blog.mozilla.org/data/2020/03/30/opening-data-to-understand-social-distancing/)

The `deviations_v1` table was used to understand the change in Firefox
desktop usage during the COVID-19 pandemic in 2020. The data is no longer
being actively used.

## 2020-04 Fenix baseline_daily and clients_last_seen

- [Removal PR](https://github.com/mozilla/bigquery-etl/pull/925)

We are now using dynamically generated queries for generic Glean
ETL on top of baseline pings, so we have deprecated the previous versions
of the daily and last_seen tables.

## Smoot Usage v1

- [Removal PR](https://github.com/mozilla/bigquery-etl/pull/460)

The `smoot_usage_*_v1*` tables used a Python file to generate the desktop,
nondesktop, and FxA variants, but have been replaced by v2 tables that make
some different design decisions. One of the main drawbacks of v1 was that
we had to completely recreate the final `smoot_usage_all_mtr` table for all
history every day, which had started to take on the order of 1 hour to run.
The v2 tables instead define a `day_0` view and a `day_13` view and rely on
the Growth and Usage Dashboard (GUD) to query them separately and join the
results together at query time.

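
As a rough illustration of joining at query time, the sketch below merges two
already-fetched result sets in Python. It is not GUD's actual implementation:
the `join_on_date` helper, the `date` key, and the `dau` and `retained` metric
names are all hypothetical, chosen only to show the shape of the join.

```python
from typing import Any


def join_on_date(
    day_0_rows: list[dict[str, Any]],
    day_13_rows: list[dict[str, Any]],
) -> list[dict[str, Any]]:
    """Left-join day_13 rows onto day_0 rows using a shared `date` key.

    The `date` key and the row layout are assumptions for illustration.
    """
    # Index the day_13 result set so each lookup is a dictionary access.
    day_13_by_date = {row["date"]: row for row in day_13_rows}

    joined = []
    for row in day_0_rows:
        merged = dict(row)
        # A date may have no day_13 row; in that case only the
        # day_0 values are kept.
        merged.update(day_13_by_date.get(row["date"], {}))
        joined.append(merged)
    return joined


if __name__ == "__main__":
    day_0 = [
        {"date": "2019-10-01", "dau": 10},
        {"date": "2019-10-14", "dau": 12},
    ]
    day_13 = [{"date": "2019-10-01", "retained": 4}]
    for row in join_on_date(day_0, day_13):
        print(row)
```

The appeal of this design, as described above, is that the combined table no
longer has to be recreated for all history every day.
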
## Shredder support for per-cluster deletes

- [Removal PR](https://github.com/mozilla/bigquery-etl/pull/733)

For `telemetry_stable.main_v4`, shredder used `SELECT` statements over single
clusters, then combined the results to remove rows from the table. This was an
attempt to improve performance so that reserved slots would be cheaper than
on-demand pricing, but it turned out to be slower than using `DELETE`
statements for whole partitions.
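
For comparison, a whole-partition delete can be expressed as a single
statement. The sketch below only builds the SQL text; it is not shredder's
actual code. The `partition_delete_sql` helper, the `submission_timestamp` and
`client_id` column names, and the deletion-requests table name are assumptions
made for illustration.

```python
from datetime import date


def partition_delete_sql(table: str, partition: date, requests_table: str) -> str:
    """Build one DELETE statement covering a whole date partition.

    Column names and the requests table are hypothetical placeholders.
    """
    return (
        f"DELETE FROM `{table}`\n"
        # Restrict the statement to a single date partition.
        f"WHERE DATE(submission_timestamp) = '{partition.isoformat()}'\n"
        # Remove only rows whose client asked for deletion.
        f"  AND client_id IN (SELECT client_id FROM `{requests_table}`)"
    )


if __name__ == "__main__":
    print(
        partition_delete_sql(
            "telemetry_stable.main_v4",
            date(2020, 2, 1),
            "example_dataset.deletion_requests",
        )
    )
```
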