* Exempt a few files from dry run due to new table-level ACLs
The dry run service can no longer perform queries with wildcard table
specifications or access raw AET data. See https://github.com/mozilla-services/cloudops-infra/pull/2599
* Verbose referenced_tables for AET logging clients daily
The `nondesktop_clients_last_seen_v1` view was developed mostly as an
internal implementation detail for downstream tables, but it has become
useful in its own right. This PR formalizes the view by providing an alias
without a version modifier, and it adds a `product` field with application
names that are short but more meaningful than those in the `app_name` field.
See discussion in https://jira.mozilla.com/browse/DO-330 about confusion that
has resulted from the name "Fennec iOS" used in dashboards, etc. This is a
step toward reducing that kind of confusion.
This PR also adds `contributes_to_2019_kpi` and `contributes_to_2020_kpi` fields
as the source of truth for how we count KPI metrics. That logic is currently
copied and pasted in several places, which could lead to errors.
This will need a fair amount of review from data users before moving forward.
It will also require backfilling several downstream tables and communicating
the change.
* Add initial incremental query for geckoview build dates
* Add initial tests for incremental query (WIP)
* Add files for initial tests
* Rework query so it doesn't fail during tests
* Fix schema so queries run
* Add passing test for init
* Add test for query aggregation
* Add metadata file for scheduling the query
* Move scripts from fenix_nightly to fenix
* Remove scheduling
* Add docstrings
* Change dataset reference and indent comments correctly
* Remove init and address feedback
* remove init file
* make query idempotent by appending window to each submission_date
* rename n_builds to n_pings
* reduce window size from 30 days to 14 days
* avoid use of subqueries
* Update tests for query
* Fix tests
* Add failing test for 100
* Fix query so it works across fx100 boundary
* Add linting fixes
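The fx100 fix in the list above is not described in detail. One common way a query breaks at that boundary is comparing version strings as text, where "100" sorts before "99". A minimal Python sketch of that failure mode, assuming this is what the fix addressed; `major_version` is a hypothetical helper, not part of the query:

```python
def major_version(version: str) -> int:
    """Return the leading major version number as an integer."""
    return int(version.split(".", 1)[0])


# Text comparison sorts "100.0a1" before "99.0a1" ...
string_order = sorted(["99.0a1", "100.0a1"])

# ... while sorting on the parsed major version keeps the real order.
numeric_order = sorted(["99.0a1", "100.0a1"], key=major_version)
```

Parsing the major version into an integer before comparing avoids the problem regardless of how many digits the version has.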
While testing https://github.com/mozilla/bigquery-etl/pull/1380
I was getting "query too complex" errors. It looks like that stems from the
many calls to `mozfun.norm.os` within the `product_info` function.
* Add mozfun.norm.product_info UDF
Factoring this out from https://github.com/mozilla/bigquery-etl/pull/1380
since this UDF needs to be published before I can properly test the rest of the
changes there.
* Use assert.equals
While looking at Shredder logs, I noticed that all entries for FxA-related
derived tables show 0 deletions, like:
> 712215424909 bytes and deleted 0 rows from moz-fx-data-shared-prod.firefox_accounts_derived.fxa_users_daily_v1
It appears that `account.deleted` events populate a field named `uid` rather
than `user_id`. I was able to verify this by choosing a recent `uid` value
from a deletion event and counting events where that same value appears as
`user_id` in FxA logs. There were matching messages.
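The check described above can be sketched roughly as follows. The records and values are made up for illustration; only the `uid` and `user_id` field names and the `account.deleted` event name come from the description, and `count_matches` is a hypothetical helper.

```python
# Hypothetical log records; real FxA logs carry many more fields.
events = [
    {"event": "account.login", "user_id": "abc123"},
    {"event": "account.login", "user_id": "def456"},
    {"event": "account.deleted", "uid": "abc123"},
]


def count_matches(events, deleted_uid):
    """Count events whose `user_id` equals the given deleted `uid`."""
    return sum(1 for e in events if e.get("user_id") == deleted_uid)


# Pick the `uid` from a deletion event, then look for it as `user_id`.
deleted_uid = next(e["uid"] for e in events if e["event"] == "account.deleted")
matches = count_matches(events, deleted_uid)
```

A nonzero count indicates that `uid` in deletion events and `user_id` in other events refer to the same identifier.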
- Scrape all projects for routine defns when generating tests
- Create UDFs as non-temp for stored procedure tests
- Make assert functions default non-temp (to support above)
* Bug 1669516 Use `app_display_version` for Fenix AMO stats
* Use geckoview_version for fenix nightly
* Remove deprecated installs_v1
* Remove dev installs_v1
The way `total_searches` is defined, as a sum over several other search-related metrics, is not sound (though it is adequate for determining whether a user searched, which is how it is used here). Misuse of the `total_searches` value can be misleading, so we prefer to remove it from the view.
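To illustrate the distinction, here is a minimal sketch with hypothetical metric names and values, assuming the component metrics can count the same search more than once. Under that assumption the sum overstates the number of searches, while the question "did this user search at all?" is still answered correctly.

```python
# Hypothetical per-client metrics; names and values are illustrative only.
client = {"metric_a": 3, "metric_b": 3, "metric_c": 1}

# Summing may count one search several times if the metrics overlap.
total_searches = sum(client.values())

# The boolean question depends only on whether any metric is nonzero.
searched = total_searches > 0
```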
This fixes a bug where uids were missing from the daily table, as reported in
https://bugzilla.mozilla.org/show_bug.cgi?id=1637926#c5
It also adds some boolean fields to allow the daily table to specifically
address questions about whether a given client was 5-uri active per day.