* Add aggregation by country
* Copy the initial Italy focus query
This commit provides a baseline that makes the
following commits easier to review, since this
code was already reviewed.
* Clean up the country list and replace FULL OUTER joins with LEFT joins
* Aggregate by city for cities with more than 15k inhabitants
The 15k limit itself is enforced at ingestion time.
This commit further limits the results to cities with at
least 1000 active daily users.
* Produce hourly aggregates
* Move the query to the `internet_outage` dataset
* Provide automatic daily scheduling through Airflow
* Tweak the SQL to address review comments
This additionally changes `CAST` to `SAFE_CAST`
to account for oddities in the data: values that
fail to convert yield NULL instead of an error.
* Add `ssl_error_prop`
* Add `missing_dns_success`
* Add `missing_dns_failure`
* Lower the minimum reported bucket size to 50
This allows us to match the EDA by Saptarshi and
gives us a more comparable baseline.
* Document the oddities around `submission_timestamp_min`
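
For reviewers unfamiliar with the two row-level rules mentioned above, here is a
minimal Python sketch of their intent. It is not the query itself. The helper
names (`safe_cast_int`, `keep_bucket`) and the assumption that the bucket size
is a per-bucket client count with an inclusive threshold are illustrative only.

```python
from typing import Optional

# Threshold named in the commit list; what exactly is counted per bucket
# is an assumption made for this sketch.
MIN_BUCKET_SIZE = 50


def safe_cast_int(value: Optional[str]) -> Optional[int]:
    """Approximate SAFE_CAST(value AS INT64): return None instead of raising."""
    try:
        return int(value)
    except (TypeError, ValueError):
        # A plain CAST would abort the query here; SAFE_CAST yields NULL.
        return None


def keep_bucket(client_count: int, min_size: int = MIN_BUCKET_SIZE) -> bool:
    """Report a bucket only if it reaches the minimum size (assumed inclusive)."""
    return client_count >= min_size
```

With these helpers, `safe_cast_int("n/a")` returns `None` rather than failing,
and a bucket with 49 clients is dropped while one with 50 is kept.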