Commit graph

22 commits

Author SHA1 Message Date
Alessio Placitelli 76bac7a98e
Create an ETL job for the Internet Outages (#1058)
* Add aggregation by country

* Copy the initial Italy focus query

This initial commit provides a baseline for the
next commits to ease review, since this initial
code was already reviewed.

* Cleanup the country list and replace FULL OUTER with LEFT joins

* Aggregate by city for cities with more than 15k inhabitants

The actual 15k limit is enforced at ingestion time.
This further limits the resulting cities to ones with at
least 1000 active daily users.

* Produce hourly aggregates

* Move the query to the `internet_outage` dataset

* Provide automatic daily scheduling through AirFlow

* Tweak the SQL addressing review comments

This additionally changes the `CAST` to
`SAFE_CAST` to account for weirdnesses in
the data.

* Add ssl_error_prop

* Add missing_dns_success

* Add missing_dns_failure

* Lower the minimum reported bucket size to 50

This allows us to match the EDA by Saptarshi and
to have a better comparable baseline.

* Document the oddities around `submission_timestamp_min`
2020-07-01 06:44:40 +02:00
Anna Scholtz 49ee9b34b1 Rename bqetl_clients to bqetl_clients_daily 2020-06-24 12:18:04 -07:00
Anna Scholtz 38e6acbee6 DAGs for client queries 2020-06-24 12:18:04 -07:00
Anna Scholtz 48430fcfe1 Move addons queries to bqetl_addons DAG 2020-06-24 08:54:22 -07:00
Anna Scholtz 45fb7d41e5 Add bqetl_search DAG 2020-06-12 09:47:51 -07:00
Anna Scholtz d72918d6e7 Create bqetl_activity_stream DAG 2020-06-12 08:32:42 -07:00
Anna Scholtz 038b4b18e7 DAG for messaging system queries 2020-06-10 13:45:22 -07:00
Anna Scholtz 43e806c9e7 Move smoot queries to bqetl_gud DAG 2020-06-09 09:38:11 -07:00
Anna Scholtz 69d68c55c3 bqetl_mobile_search DAG 2020-06-08 13:33:19 -07:00
Anna Scholtz 823542b235 bqetl_nondesktop DAG 2020-06-08 12:36:06 -07:00
Anna Scholtz 494f8c1a5f bqetl_core DAG 2020-06-08 12:05:26 -07:00
Anna Scholtz b370c4710d DAG for vrbrowser queries 2020-06-05 10:52:12 -07:00
Anna Scholtz 5b906f859c Add version to error_aggregates query 2020-06-03 14:39:51 -07:00
Jeff Klukas 5cbdfa0dcc Add bqetl_amo_stats DAG 2020-06-03 14:28:59 -04:00
Anna Scholtz 914e74c9ce Add support for defining external task dependencies 2020-06-03 09:32:50 -04:00
Anna Scholtz 3a57c323ce Use consistent naming for DAG publishing JSON 2020-06-02 11:04:32 -07:00
Anna Scholtz 45961c331b Handle public data export in DAGs 2020-06-02 11:04:32 -07:00
Anna Scholtz ca287bdf3a Keep error_aggregates query without version 2020-05-28 14:12:24 -07:00
Anna Scholtz e7b2b56c01 KPI dashboard generated Airflow DAG 2020-05-28 14:12:24 -07:00
Anna Scholtz d5822b952d Generate error aggregates DAG 2020-05-28 14:12:24 -07:00
Anna Scholtz 22f1c39b5b Error aggregates as scheduled query 2020-05-28 14:12:24 -07:00
Anna Scholtz 1116be16b6 Pull in telemetry-airflow 2020-05-28 14:12:24 -07:00