* android funnel test
* fix filter expression
* fix string comparison
* revise toml
* add completed event
* simplify by using events_unnested
* Funnel fixes
* Bump mkdocs from 1.5.2 to 1.5.3 (#4321)
Bumps [mkdocs](https://github.com/mkdocs/mkdocs) from 1.5.2 to 1.5.3.
- [Release notes](https://github.com/mkdocs/mkdocs/releases)
- [Commits](https://github.com/mkdocs/mkdocs/compare/1.5.2...1.5.3)
---
updated-dependencies:
- dependency-name: mkdocs
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* [RS-826] New job to calculate newtab visits -> activity stream sessions (#4387)
* New job to calculate newtab visits -> activity stream sessions
* Removing newline chars at end of file
* Addressing comment suggestions
* Format
* Add bqetl_ads DAG
* Add ACL to nt_visits_to_sessions_conversion_factors_daily_v1
* Add metadata files
* Add view to dry_run skip list
* Oops, fix the view
---------
Co-authored-by: Curtis Morales <cmorales@mozilla.com>
* Allow running multiple checks (#4471)
* Allow running multiple checks
* Don't yield anything on no matches
* Change pocket_available for new Pocket markets (#4472)
* FXA-6721 Setup import of accounts table from FxA production CloudSQL (#4423)
* Urlbar events: nested (long) instead of wide (#4373)
* feat: urlbar events final release
* feat: new result types
* feat: add interaction and group
* fix: date
* fix: use BQ builtin for UUIDs
* Add the view_v2
* Add new table to the DAG
* fix CI error
* remove teon brooks
* Incorporate feedback from Curtis
---------
Co-authored-by: Alekhya Kommasani <akommasani@mozilla.com>
Co-authored-by: Alekhya <88394696+alekhyamoz@users.noreply.github.com>
* DENG-1705 - Add startup_profile_selection_reason_first to clients_daily_v6 (#4473)
* Update experiment export query to include feature ids and branch feature config values (#4477)
* Update experiment export query to include feature ids and branch feature
config value.
* Add view skip for broken view
* add skip to dry run as well
* DENG-476 - Update monitoring ETLs to reference main_v5 (#4431)
* DENG-476 - Update sampled main ping tables to reference main_v5 (#4433)
* DENG-476 - Update experiment aggregates ETL to reference main_v5 (#4435)
* DENG-476 - Update internet outages to reference main_v5 (#4432)
* Fix test for mozfun.norm.result_type_to_product_name (#4487)
* Bug 1860814 - fix amo_prod__desktop_addons_by_client (#4481)
* quick fix
* fix spread out groupby
* move out sourcetable query
---------
Co-authored-by: Frank Bertsch <frank.bertsch@gmail.com>
* fix for #4481 (#4489)
* DENG-1781- Remove urlbar_events_temp_v2 view and repoint urlbar_events view to v2 (#4486)
* Remove urlbar_events_temp_v2 view and repoint urlbar_events view to v2
* Include all sql_gen files in package (#4490)
When the bigquery-etl package is installed from pypi (or locally
via `pip install .`), the only non-py files included in the package
are those in the `package_data` section of setup.py.
Previously, with just those files, sql generation would fail due
to missing files. Because this directory is small, we should
include all files so no one accidentally runs into this problem
again.
Co-authored-by: Daniel Thorn <dthorn@mozilla.com>
* Bump types-requests from 2.31.0.2 to 2.31.0.10 (#4475)
Bumps [types-requests](https://github.com/python/typeshed) from 2.31.0.2 to 2.31.0.10.
- [Commits](https://github.com/python/typeshed/commits)
---
updated-dependencies:
- dependency-name: types-requests
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Bump mozilla-metric-config-parser from 2023.9.2 to 2023.10.2 (#4476)
Bumps [mozilla-metric-config-parser](https://github.com/mozilla/metric-config-parser) from 2023.9.2 to 2023.10.2.
- [Release notes](https://github.com/mozilla/metric-config-parser/releases)
- [Commits](https://github.com/mozilla/metric-config-parser/compare/2023.9.2...2023.10.2)
---
updated-dependencies:
- dependency-name: mozilla-metric-config-parser
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
* Glean server knobs monitoring table (#4491)
* Glean server knobs monitoring table
* fix code gen and skip dry-run
* Remove view creation in query
* DENG-1879 Setup import of emails table from FxA stage CloudSQL (#4493)
* DENG-1879 Setup import of emails table from FxA prod CloudSQL (#4494)
* Bump jsonschema from 4.19.0 to 4.19.2 (#4495)
Bumps [jsonschema](https://github.com/python-jsonschema/jsonschema) from 4.19.0 to 4.19.2.
- [Release notes](https://github.com/python-jsonschema/jsonschema/releases)
- [Changelog](https://github.com/python-jsonschema/jsonschema/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/python-jsonschema/jsonschema/compare/v4.19.0...v4.19.2)
---
updated-dependencies:
- dependency-name: jsonschema
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: akkomar <akkomar@users.noreply.github.com>
* Bump pytest from 7.4.2 to 7.4.3 (#4496)
Bumps [pytest](https://github.com/pytest-dev/pytest) from 7.4.2 to 7.4.3.
- [Release notes](https://github.com/pytest-dev/pytest/releases)
- [Changelog](https://github.com/pytest-dev/pytest/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pytest-dev/pytest/compare/7.4.2...7.4.3)
---
updated-dependencies:
- dependency-name: pytest
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Enforce no date partition parameter in DAG (#4497)
* Use mozfun.glean.parse_datetime to parse ping_info fields (#4464)
In future versions of Glean that timestamp can be more precise, so we
need to ensure we correctly parse it.
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
* Remove mmccorquodale from DAG owners (#4492)
* Fix test for norm.glean_ping_info
* Bump black from 23.9.1 to 23.10.1
Bumps [black](https://github.com/psf/black) from 23.9.1 to 23.10.1.
- [Release notes](https://github.com/psf/black/releases)
- [Changelog](https://github.com/psf/black/blob/main/CHANGES.md)
- [Commits](https://github.com/psf/black/compare/23.9.1...23.10.1)
---
updated-dependencies:
- dependency-name: black
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
* Bump sqlglot from 18.11.4 to 19.0.1 (#4500)
Bumps [sqlglot](https://github.com/tobymao/sqlglot) from 18.11.4 to 19.0.1.
- [Changelog](https://github.com/tobymao/sqlglot/blob/main/CHANGELOG.md)
- [Commits](https://github.com/tobymao/sqlglot/compare/v18.11.4...v19.0.1)
---
updated-dependencies:
- dependency-name: sqlglot
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Materialized views and aggregated tables for event monitoring (#4478)
* WIP event monitoring
* Add FxA custom events to view definition (#4483)
* Add FxA custom events to view definition
* Update sql_generators/event_monitoring/templates/event_monitoring_live.init.sql
---------
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
* Move event monitoring to glean_usage generator
* Add cross-app event monitoring view
* Generate cross app monitoring
* Simplify event monitoring aggregation
---------
Co-authored-by: akkomar <akkomar@users.noreply.github.com>
* Remove generated DAGs from main (#4507)
* Add output_dir to command dag generate. (#4512)
* Add output_dir to command dag generate.
* output_dir to command dag generate.
---------
Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
* Bump pyarrow from 13.0.0 to 14.0.0 (#4511)
Bumps [pyarrow](https://github.com/apache/arrow) from 13.0.0 to 14.0.0.
- [Commits](https://github.com/apache/arrow/compare/go/v13.0.0...go/v14.0.0)
---
updated-dependencies:
- dependency-name: pyarrow
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Bump pre-commit from 3.4.0 to 3.5.0 (#4510)
Bumps [pre-commit](https://github.com/pre-commit/pre-commit) from 3.4.0 to 3.5.0.
- [Release notes](https://github.com/pre-commit/pre-commit/releases)
- [Changelog](https://github.com/pre-commit/pre-commit/blob/main/CHANGELOG.md)
- [Commits](https://github.com/pre-commit/pre-commit/compare/v3.4.0...v3.5.0)
---
updated-dependencies:
- dependency-name: pre-commit
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Remove distinct_docids query (#4449)
* Bump pip from 23.0 to 23.3 (#4516)
Bumps [pip](https://github.com/pypa/pip) from 23.0 to 23.3.
- [Changelog](https://github.com/pypa/pip/blob/main/NEWS.rst)
- [Commits](https://github.com/pypa/pip/compare/23.0...23.3)
---
updated-dependencies:
- dependency-name: pip
dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Bump mkdocs-material from 9.3.1 to 9.4.7 (#4518)
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 9.3.1 to 9.4.7.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/9.3.1...9.4.7)
---
updated-dependencies:
- dependency-name: mkdocs-material
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Don't generate dags in bqetl query schedule command (#4517)
* Add query to load application information from probe info service (#4508)
* prefixing schema error message inside dryrun with "ERROR" to make it easier to find when searching logs for cause of exit code 1 (#4522)
* updated schema for telemetry_derived/clients_last_seen_joined_v1 to align it with the query results (#4523)
* Update scheduler of aggregates to run after upstreams. (#4503)
* Update scheduler of aggregates to run after upstreams.
* Update dags for new scheduler of analytics_aggregates
* Update dag bqetl_search
* Remove DAG.
---------
Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
* Set depend_on_past=False for warn checks (#4526)
* Add map.set_key to mozfun (#4527)
* Add map.set_key to mozfun
* Disallow NULL keys in maps
* DS-3281 - Add client adclicks history table (#4528)
* Add client adclicks history table
* Add alias to ad_click_history col
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
* Remove partition parameter on table write
---------
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
* Add experiment information to event monitoring (#4519)
* feat(DENG-1774): adding fenix derived firefox android clients v2 (#4424)
* added fenix_derived.firefox_android_clients_v2
* added ETL checks for fenix_derived.firefox_android_clients_v2
* made changes as suggested by bani in PR#4424
* converting unique check for android clients v2 until duplication is resolved
* added install_source field to firefox_android_clients_v2 and formatting applied on checks
* added locale field and modified the query to support is_init()
* removed generated dag due to new generation process
* Add submission_date param to adclicks history (#4531)
* DS-3054. Support running an initialization query in parallel (#4322)
* DS-3054. Create functions to support running an initialization query for all sample_ids in parallel.
* DS-3054. Update _run_query function.
* DS-3054. Use _run_query and mapped values for initialization in parallel.
* DS-3054. Unify initialization to run in parallel and get sample_id range from metadata.
* DS-3054. Minimize formatting of query template and remove need to modify existing initialization queries. Validate if a query should use parallelized or regular update.
* DS-3054. Adding link to caveats.
* DS-3054. Update sample_id range for initialization.
* DS-3054. Use current implementation of run_query.
* DS-3054. Update using a parameter instead of initialization in metadata.
* DS-3054. DAG update with new parameter.
* Pass parameters before calling _run_query().
* Use --append_table in favour of INSERT INTO.
* DS-3054 Separate parallel and non parallel init, plus some improvements.
---------
Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
* Add ios baseline_clients_yearly (#4506)
* DENG-1935 Change data ordering from pings in clients-first-seen-v2 (#4533)
* DENG-1935 Change data ordering from pings in clients-first-seen-v2
* Added main ping for client-3, maintain chosen ping
* Fix comments in event monitoring queries (#4535)
* DENG-1705 - Add missing client attribution columns to clients daily/first-seen (#4505)
* DENG-1705 Add missing client attribution columns to clients daily/firstseen
* Update clients_last_seen_joined
* Rename main_v4 -> main_v5 in ssl_ratios tests (#4536)
* Make base tables configurable in glean_usage generator (#4534)
* Make base tables configurable in glean_usage generator
* Fix event extras unnesting in event monitoring
* Bump sqlglot from 19.0.1 to 19.0.3 (#4521)
Bumps [sqlglot](https://github.com/tobymao/sqlglot) from 19.0.1 to 19.0.3.
- [Changelog](https://github.com/tobymao/sqlglot/blob/main/CHANGELOG.md)
- [Commits](https://github.com/tobymao/sqlglot/compare/v19.0.1...v19.0.3)
---
updated-dependencies:
- dependency-name: sqlglot
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
* DS-3272 - Review checker data model for mobile (#4498)
* Add mobile shopping data
* Remove the ff desktop from sql_generator
* Fix build issue
* Incorporate feedback from Bruce
* Add clients table for mobile
* FIX CI issue
* Incorporate Bruce's feedback
* Incorporate Curtis' feedback
* Fix event_monitoring_aggregates_v1 template (#4537)
This will ensure that FxA tables are included in the aggregate.
* Fixing query error in fenix_derived/firefox_android_clients_v2/checks.sql (#4539)
* Add missing clients view to fenix review checker (#4540)
* add other projects to query from for bq usage, add for loop (#4529)
* add other projects to query from for bq usage, add for loop
* create new function to gather jobs_by_project data into temp table, update create_query function to join jobs_by_org table to jobs_by_project tmp table
* take out date from tmp table as it is unnecessary
* refactor to take out irrelevant function, rewrite SQL to look at other projects
* add date filter to jobs_by_project
* add comment for future refactoring
* add tmp_table for jobs_by_project table
* create function to loop through projects for jobs_by_project, revise query to join jobs_by_org with jobs_by_project tmp table
* take out ambiguous DATE filter
* take out r_prefix in regex from query string. Take out tmp table function. Add proper date filter
* add back in the r_prefix and add in the extra space in the Query ID regex that was needed
* updated two affected fields across task_instance and trigger airflow metadata tables to type JSON (#4545)
* Fix event monitoring template (#4546)
Nulls need to be cast to string to make the union work.
This will fix https://workflow.telemetry.mozilla.org/log?execution_date=2023-11-09T02%3A00%3A00%2B00%3A00&task_id=monitoring_derived__event_monitoring_aggregates__v1&dag_id=bqetl_monitoring&map_index=-1
* removed check for firefox_ios_clients_v1 which used different filtering settings causing result mismatch (#4547)
* iOS attributable_clients use metrics adclicks (#4543)
* iOS attributable_clients use metrics adclicks
* Remove project id from table name
Co-authored-by: kik-kik <42538694+kik-kik@users.noreply.github.com>
---------
Co-authored-by: kik-kik <42538694+kik-kik@users.noreply.github.com>
* Use correct submission_* field (#4549)
* Use correct app_version field (#4551)
* Revert "updated two affected fields across task_instance and trigger airflow metadata tables to type JSON (#4545)" (#4552)
This reverts commit 9750d331ca.
* DENG-1705 - Add startup_profile_selection_reason from first ping to clients_daily, clients_first_seen_v2 and downstream (#4482)
* DENG-1705 - Add startup_profile_selection_reason to clients_first_seen
* Add startup_profile_selection_reason_first_ping_only
* Query typo
* Update test schema
* Update sql/moz-fx-data-shared-prod/telemetry_derived/clients_first_seen_28_days_later_v1/schema.yaml
Co-authored-by: Lucia <30448600+lucia-vargas-a@users.noreply.github.com>
---------
Co-authored-by: Lucia <30448600+lucia-vargas-a@users.noreply.github.com>
* change filter on final query to go back to May 2023 - the min date in the Jobs by Project table as of 11/13/23 (#4559)
* change filter on final query to go back in history
* take out extraneous WHERE
* add DISTINCT to final query
* Add rust result types to product mapping (#4544)
* missing-mobile-fields-review-checker (#4553)
* noting that we are missing some fields
* adding is_fx_dau to android and ios clients
* add missing columns to schema.yaml
add schema.yaml
* Delete sql/moz-fx-data-shared-prod/firefox_desktop/serp_events/view.sql
---------
Co-authored-by: Alekhya Kommasani <akommasani@mozilla.com>
Co-authored-by: Alekhya <88394696+alekhyamoz@users.noreply.github.com>
* Add aggregate table to monitor event errors (#4548)
* updated fenix_derived.funnel_retention_clients_* to use clients view instead of table directly (#4563)
* Bug 1864722 - Fix column name typo (#4567)
* add referenced tables to metadata.yaml to make sure jobs_by_org task … (#4568)
* add referenced tables to metadata.yaml to make sure jobs_by_org task runs before bigquery_usage_v2 task
* Update sql/moz-fx-data-shared-prod/monitoring_derived/bigquery_usage_v2/metadata.yaml
Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>
---------
Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>
* Generate normal task dependencies from `depends_on` if the task is in the same DAG (#4569)
* Generate normal task dependencies from `depends_on` if the task is in the same DAG.
* Update `metadata.yaml` files to use `depends_on` rather than `upstream_dependencies`.
* Add a period-over-period check for revenue data (#4566)
* Check for period over period changes in column sum
* Fix percent change calculation
* Fix errors in navigation function logic
* Rename period over period check to specify revenue
* Remove references to period over period check
---------
Co-authored-by: Alekhya <88394696+alekhyamoz@users.noreply.github.com>
* feat(): updated fenix_derived.firefox_android_clients_v2 to include reported_baseline_ping field (#4565)
* updated fenix_derived.firefox_android_clients_v2 to include reported_baseline_ping field
* Update sql/moz-fx-data-shared-prod/fenix_derived/firefox_android_clients_v2/query.sql
Co-authored-by: Lucia <30448600+lucia-vargas-a@users.noreply.github.com>
---------
Co-authored-by: Lucia <30448600+lucia-vargas-a@users.noreply.github.com>
* summing sap and ad clicks (#4571)
* remove file that isn't ready yet (#4572)
* Add ga.nullify_string UDF (#4556)
* Add ga.nullify_string UDF
* Add README line
* added fenix_derived.firefox_android_clients_v2 to shredder config (#4564)
* Use client_info.app_channel for event monitoring channels (#4575)
* Add ga_sessions_v1 table & view (#4554)
* Add ga_sessions_v1 table & view
This table aggregates session-level data from GA.
* Rename nullify string func
* Apply suggestions from code review
Co-authored-by: Alexander <anicholson@mozilla.com>
* Add upstream backfill deps
* Move depends_on to correct section
---------
Co-authored-by: Alexander <anicholson@mozilla.com>
* Make sure that metadata `friendly_name` and `description` are not None (#4513)
* Fill empty description
* Assign a friendly name if the table doesn't have one
* Update metadata tests
* Update bigquery_etl/metadata/parse_metadata.py
Co-authored-by: Alexander <anicholson@mozilla.com>
* update test again
---------
Co-authored-by: Alexander <anicholson@mozilla.com>
* Add back normalized_app_id (#4580)
* Add session date param; fix checks CLI bug (#4579)
* Fix checks to filter on partitions
* Don't print "missing checks file" on success
Previously, the statement that checks.sql files
were missing was printed on every execution of the "for"
statement, because an "else" clause on a "for" loop runs whenever
the loop completes without a "break".
Instead, we want to print only when there are no files.
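
The for/else behaviour described above can be shown with a small sketch. The function names and file names here are hypothetical and are not the actual bqetl CLI code; the sketch only illustrates why an `else` attached to a `for` also runs on success, and one possible way to report missing files only when none were found.

```python
def report_with_for_else(check_files):
    """Buggy pattern: the else branch runs whenever the loop ends without break."""
    messages = []
    for path in check_files:
        messages.append(f"running {path}")
    else:
        # Runs after normal completion of the loop, even if files were processed.
        messages.append("missing checks file")
    return messages


def report_only_when_empty(check_files):
    """Fixed pattern: report missing files only when nothing was found."""
    messages = [f"running {path}" for path in check_files]
    if not messages:
        messages.append("missing checks file")
    return messages
```

With a single `checks.sql` in the input, the first function still appends the "missing checks file" message, while the second does not.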
* Add derived stub attribution logs (#4557)
* Add derived stub attribution logs
This table keeps triplets from the stub attribution logs.
The triplet of (dl_token, ga_client_id, stub_session_id)
will only ever appear once here.
See the associated decision brief:
https://docs.google.com/document/d/1L4vOR0nCGawwSRPA9xiR8Hmu_8ozCGUecXAtBWmGGA0/edit
* Move stub attribution table to new dataset
In order to ensure limited access to the stub attribution service
data without significantly decreasing developer velocity, we
move these tables to a new dataset. That dataset has the defaults
we want for all stub attribution log data:
- Defaults to just read access to data-science/DUET workgroup
- No read/write access for DE
We will backfill via the bqetl_backfill DAG.
* Rename view
* Use correct dataset name in view
* Skip dryrun; no access
* Add gclid_conversions table & view (#4558)
* Add gclid_conversions table & view
This table will support the desktop conversion events.
Each valid GCLID will have any associated conversion events.
See the decision brief:
https://docs.google.com/document/d/1T8ArA9r8HDMTj1ES9NHfJFv2gUWo7w0MjG07iXtuUOI
* Use correct table name
* Use new stub attribution dataset; clarify activity_date
* Use correct date_partition_parameter
Co-authored-by: Alexander <anicholson@mozilla.com>
* Include activity_date as parameter
* Use INNER instead of LEFT joins
* Update doc strings to clarify GCLID vs GA Session
---------
Co-authored-by: Alexander <anicholson@mozilla.com>
* Include GA intraday sessions tables (#4582)
* Include GA intraday sessions tables
* Update doc string on backfilling ga_sessions
* Don't dryrun stub_attribution view
* Update min_row_count error text (#4586)
* Add conversion event; fix gclid conversions query (#4584)
* Add first_run conversion; use correct table names
* Ignore dryrun of query and view
* Remove HAVING clause; fix logical_or
* migrates old pingcentre onboarding artifacts to new firefox_desktop view (#4457)
* migrates old pingcentre onboarding artifacts to new firefox_desktop view
* generate event rollup dag
* generate review checker dag
* update messaging system dag
* incl project in table names
---------
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
* Add ga_clients_v1 table & view (#4560)
* Add ga_clients_v1 table & view
- Query from ga_sessions
- Fix tests
* Use correct scheduling parameters
Co-authored-by: Alexander <anicholson@mozilla.com>
* Move HAVING clause to WHERE
Co-authored-by: Alexander <anicholson@mozilla.com>
* Change CTE name
Co-authored-by: Alexander <anicholson@mozilla.com>
---------
Co-authored-by: Alexander <anicholson@mozilla.com>
* Remove duplicate BQ query param (#4587)
* Firefox ios adclicks (#4585)
* Add Firefox iOS client adclicks history
* Add metadata description to view
* DS-3272 - Fix review checker clients to remove dups (#4583)
* Fix review checker clients to remove dups
* Fix CI issues
* Add row_num filter
* add submission_date to partition
* remove submission_date from partition
* Account for NULL handling in joins (#4590)
Previously, NULL values in the join keys didn't join, resulting
in duplicate rows. This change will coalesce those to empty
strings and NULLIFY them in the view.
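
A minimal sketch of the NULL-key problem and the coalesce approach, using Python's built-in `sqlite3` as a stand-in for BigQuery (an assumption made so the example is runnable locally). The table and column names are hypothetical and are not taken from the actual query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE existing (client_key TEXT, sessions INTEGER);
    CREATE TABLE incoming (client_key TEXT, sessions INTEGER);
    INSERT INTO existing VALUES ('a', 1), (NULL, 2);
    INSERT INTO incoming VALUES ('a', 10), (NULL, 20);
    """
)

# Plain equality: NULL = NULL is not true, so the NULL-keyed rows never pair up.
plain = conn.execute(
    "SELECT COUNT(*) FROM existing e JOIN incoming i ON e.client_key = i.client_key"
).fetchone()[0]

# Coalesce keys to '' for the join, then turn '' back into NULL on output.
coalesced = conn.execute(
    """
    SELECT NULLIF(COALESCE(e.client_key, ''), '') AS client_key,
           e.sessions + i.sessions AS total_sessions
    FROM existing e
    JOIN incoming i
      ON COALESCE(e.client_key, '') = COALESCE(i.client_key, '')
    ORDER BY total_sessions
    """
).fetchall()
```

In this sketch the plain join matches only the `'a'` row, whereas the coalesced join also pairs the two NULL-keyed rows and reports their key as NULL again.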
* Bug 1865716 - Include errorGroups in legacy docker_fxa_admin_server_sanitized query (#4589)
`errorGroups` field was added in `docker_fxa_admin_server_sanitized_v2` and breaks the UNION.
* DS-3361. Update documentation of initialize command. (#4592)
Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
* Link to full diff in git comments (#4593)
* Link to full diff in git comments
* Show full diff of new and deleted files
* Correct DAG description as DAG is currently active. (#4596)
Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
* Login funnel conversions (#4591)
* Mozilla accounts login funnel conversion for overall, with email confirmation, and with two factor authentication
* Update sql_generators/funnels/configs/login_funnels.toml
---------
Co-authored-by: Kimberly Siegler <kimberlysiegler@Kimberlys-MBP-2.attlocal.net>
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
* Use live tables to determine deletion request ping volume (#4442)
* Increase no_output_timeout for long-running CI jobs (#4602)
* SVCSE-1595 Setup import of tables from staging FxA databases (#4578)
* In generated diffs explicitly list the files being added or deleted. (#4600)
* Glam accounts for sampling when calculating sample_count for windows & release probes (#4581)
* Glam - fix legacy windows & release probes' sample count going fwd
* Glam FOG accounts for sampling when calculating total_sample for windows & release probes
* fog - fix client count and sample count
* Add channel filtering for fog
* SVCSE-1595 Setup import of tables from production FxA databases (#4597)
* Bug 1866469 - Exclude use_counters from GLAM ETL (#4603)
* Bug 1866469 - Exclude use_counters from GLAM ETL
* Attempt to fix tests
---------
Co-authored-by: Eduardo Filho <edugomfilho@gmail.com>
* feat(): updating fxa android funnel to support install_source filtering downstream (#4561)
* Added a filter to only include playstore data
In keeping the bottom of the funnel consistent with the upper funnel, we have to only include installs from play store in the bottom of the funnel metrics
* for fenix_derived.funnel_retention_clients_week_* tables making sure we only include playstore users
* updating the changes as requested by soGaussian to expose to users the install_source field to enable filtering
---------
Co-authored-by: richard baffour <baffour345@gmail.com>
* Add schema.yaml to urlbar_events (sql_generator) (#4595)
* Add schema.yaml to urlbar_events
* SVCSE-1595 Update accounts_db schemas to match deployed tables. (#4604)
* SVCSE-1595 Update more accounts_db schemas to match deployed tables (#4605)
* Fix num_chars_typed in urlbar_events schema (#4607)
* Add init clause to ga_clients table (#4611)
* Give census access to gclid conversions data (#4613)
* Don't nest SQL generated from `main` branch in extra `sql` directory. (#4614)
* Add desktop_acquisition_funnel view (#4616)
* Add desktop_acquisition_funnel view
* Update reference
* Update view.sql
Took out some of the TODO comments around naming to stay consistent with the table it is reading and to reduce the effort needed to change the spoke-default view that is currently set up with test data.
---------
Co-authored-by: gkabbz <gkabbz@gmail.com>
* added ETL checks to fenix_derived.firefox_android_clients_v1 (#4609)
* DENG-2013 - Add explicit dependencies & checks for history (#4620)
* Fix the source table to point to unified view to include all apps (#4622)
* Deng 1662 move google ads to ads google mmc connector (#4525)
* DENG-1662 move from google_ads connector to ads_google_mmc connector
* format queries
* add code for cohort_daily_statistics using clients_first_seen_v2 with… (#4404)
* add code for cohort_daily_statistics using clients_first_seen_v2 with new columns from clients_first_seen_v2
* take out extra sample_id
* Update sql/moz-fx-data-shared-prod/telemetry_derived/cohort_daily_stats_clients_frst_seen_v2/query.sql
switching column names - original was swapped
Co-authored-by: Alexander <anicholson@mozilla.com>
* update column names- change cohort_date to first_seen_date, make more descriptive; take out client_id and sample_id in the final table; take out extraneous columns that are not used in final table
* fix group by - days_seen_bits not days_interacted_bits
* take out second_seen_date, irrelevant
* change date _activity to submission_date
* replace submission_date_activity with client_activity
* add new line at end of schema.yaml file
* refactor code to use clients_first_seen_v2, originally committed cohorts_daily_statistics_v1 code in the v2 file
* add cohort_daily_statistics_v2 job to DAG
* add cohort_daily_statistics_v2 job to DAG, take out submission_date and add activity_date to query.sql
* delete now needless dags folder
* correct alias of table
* change submission_date to activity_date
* fix column name apple_model to apple_model_id
* add days_seen_dau_bits and other calculations based on this
* add attribution_dlsource to table
* take out underscore from column name, attribution_dlsource
* revise comment - 196 days not 180 days
* add all the other columns from clients_first_seen_v2, update schema.yaml file with new columns
* take out sample_id, fix schema
* take out document_id, dl_token, app_build_id columns, rename activity_date to submission_date, rename cohort_date to first_seen_date to match clients_first_seen_28_days_later
* move files from cohort_daily_statistics_v2 to desktop_cohort_daily_retention_v1 to reflect name change, take out extraneous columns such as xpcom_abi, attribution_dlsource, engine_data columns
---------
Co-authored-by: Alexander <anicholson@mozilla.com>
* add --project_id command, take out extraneous dashes in start and end commands in creating dataset cookbook (#4626)
* change docs (#4629)
* fix typo in project name (#4628)
* fix typo in project name
* remove shared-prod project from sql for google_ads_derived
* Fixes #4624 - Add a view for firefox_desktop.broken_site_report (#4625)
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
* Separate Airflow tasks for glean_usage (#4588)
* Add support for assigning Airflow tasks to task groups
* Generate separate Airflow tasks for glean_usage
* Remove Airflow dependencies from old glean_usage tasks
* Update dataset_metadata.yaml for broken site reports (#4630)
* Add user-facing view to fxa_oauth.clients (#4623)
* Fix jinja templating in glean usage metadata (#4636)
* feat(DENG-1774 / cancelled): deleting fenix_derived/firefox_android_clients_v2, v1 will remain the active model (#4610)
* deleting fenix_derived/firefox_android_clients_v2, v1 will remain the active model
* removed fenix_derived.firefox_android_clients_v2 from shredder config
* firefox_ios source added to shredder config (#4638)
* Skip check for baseline_clients_last_seen for Fire TV (#4640)
* Resolve correct task_id for tasks nested in a group (#4637)
* Android LTV UDFs (#4633)
* Add Android State UDF
* Add Android Markov States UDFs for LTV
* Make docstrings consistent
* Update doc string
Co-authored-by: Leif Oines <leifdoines@gmail.com>
---------
Co-authored-by: Leif Oines <leifdoines@gmail.com>
* Migrated DIM checks over to ETL checks for internet_outages.global_outages_v1 (#4639)
* Speed up glean_usage generation by caching the table getter (#4644)
`get_tables` is deterministic under the assumption that the tables don't
change in between invocations. Which I hope holds here.
We therefore can just cache that value so that subsequent runs quickly
return without needing a roundtrip to BigQuery again.
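A minimal sketch of this kind of caching, assuming a hypothetical `get_tables(project, dataset)` signature and a stand-in body (the real function's arguments and its BigQuery call are not shown in this log):

```python
from functools import lru_cache

# Records each simulated roundtrip; exists only to make the caching visible.
CALLS = []


@lru_cache(maxsize=None)
def get_tables(project: str, dataset: str) -> tuple:
    """Hypothetical stand-in for the table getter.

    A real version would list tables via BigQuery; here we fake the result.
    Arguments must be hashable for lru_cache to key on them.
    """
    CALLS.append((project, dataset))
    return (f"{project}.{dataset}.baseline_v1",)


# The second call with the same arguments is served from the cache.
first = get_tables("example-project", "example_dataset")
second = get_tables("example-project", "example_dataset")
```

The cache is only safe under the assumption stated above: the set of tables must not change between invocations within one run.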
* fixing broken test for firefox_ios_derived.baseline_clients_yearly_v1 (#4645)
* Feat/deng 2046/migrating telemetry derived active users aggregates v1 dim checks to etl checks (#4641)
* Migrated DIM checks over to ETL checks for telemetry_derived.active_users_aggregates_v1
* rewrite
* code review suggestions
* add doc
* rename
---------
Co-authored-by: kik-kik <kignasiak@mozilla.com>
* Minimize previous PR diff comments when CI posts a new diff comment (#4635)
* Minimize previous PR diff comments when CI posts a new diff comment.
* Update Node image to latest version available from CircleCI and pin Node packages.
* GLAM avoid scientific notation for big sample counts (#4647)
* GLAM avoid scientific notation for big sample counts
* Cast to bignumeric instead of numeric
* feat(DENG-2083): added firefox_ios_derived.clients_activation_v1 and corresponding view (#4631)
* added firefox_ios_derived.clients_activation_v1 and corresponding view
* fixing a missing separator in firefox_ios_derived.clients_activation_v1 checks
* adding firefox_ios_derived.clients_activation_v1 to shredder configuration
* removed is_suspicious_device_client as it should not be there, thanks bani for pointing this out
* fixed black formatting error inside shredder/config.py
* applied bqetl formatting
* minor styling tweak as suggested by bani in PR#4631
* Remove baseline_clients_daily DAG dependency for FF ios baseline clients yearly (#4651)
* Support offset backfills, require metadata (#4627)
* Skip backfills for queries without metadata.yaml
* Support date_partition_offset
* Fixed exclude, modified exception
* Add test for offset backfill
* Apply suggestions from code review
Co-authored-by: Frank Bertsch <frank.bertsch@gmail.com>
* Formatting
---------
Co-authored-by: Frank Bertsch <frank.bertsch@gmail.com>
* add dau_clients_days_since_seen to CTE and num_clients_dau_on_day column to table in query and schema (#4652)
* Docs: Avoid newline in link
mkdocs doesn't like that newline and will treat the URL as a relative URL, thus breaking the link
* Docs: Use 3rd level heading for UDFs
mkdocs' ToC generator will stop when the header level goes up again.
Because the UDF name itself is generated as a first level heading, any
UDF with a first-level header documentation will thus stop rendering any
subsequent headers.
Most notably on /mozfun/hist where only the very first UDF got a ToC
entry.
* Docs: Link to section on the same page
The separate chapter was removed in #4293
* Migrated DIM checks over to ETL checks for telemetry_derived.unified_metrics_v1 (#4649)
* feat(DENG-2120): migrated over checks defined in DIM for baseline_clients_last_seen fenix. (#4656)
* migrated over checks defined in DIM for this type of dataset
* Update sql_generators/glean_usage/templates/baseline_clients_last_seen_v1.checks.sql
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
---------
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
* Create tables that have state values per day (#4634)
* Create tables that have state values per day
* Change Airflow DAG
* Move markov states to cols rather than array
* Move bot/bad client filter to materialized table
* Add install_source and consecutive_days_seen features
* Add field to CTE
* Use jinja vars instead of sql variables
* Use correct UDF incantation
* Use live tables for structured error counts (#4598)
* Use live tables for structured error counts
* Prevent old records from being deleted
* Fix structured_error_counts query (#4659)
* Authorize view and add workgroup access for taskcluster (#4661)
* Add metadata.yaml for socorro_crash_v2 (#4664)
* Temporarily add curtis to CODEOWNERS until he can be added to group (#4665)
* Add clients_daily_joined view (#4660)
* add view.sql to telemetry and desktop_cohort_daily_retention view (#4666)
* Skip accounts_db.fxa_oauth_clients in view validation (#4667)
* Public GLAM datasets (#4606)
* Public GLAM datasets
* Remove Fenix GLAM datasets
* DENG-1352 - Migrate contextual services ETL to desktop glean pings (#4474)
* Have `bqetl query` commands fail if they don't find a matching query (#4662)
* Have `bqetl query` commands fail if they don't find a matching query.
* Update `test_run_query_no_query_file` test.
* Skip accounts_db.fxa_oauth_clients dryrun (#4671)
* Remove referenced_table from firefox_android_clients (#4674)
* Define `event_monitoring_live_v1` views in `view.sql` files (#4576)
* Define `event_monitoring_live_v1` views in `view.sql` files.
So they get automatically deployed by the `bqetl_artifact_deployment.publish_views` Airflow task.
* Support materialized views in view naming validation.
* Handle `IF NOT EXISTS` in view naming validation.
* Use regular expression to extract view ID in view naming validation.
This simplifies the logic and avoids a sqlparse bug where it doesn't recognize the `MATERIALIZED` keyword.
* Update other view regular expressions to allow for materialized views.
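For illustration, a regular expression along these lines could extract the view ID while tolerating `MATERIALIZED` and `IF NOT EXISTS`. The pattern and the `extract_view_id` helper are assumptions written for this sketch, not the expression used in the repository:

```python
import re

# Assumed pattern for illustration only. It handles a single optionally
# backtick-quoted identifier, not per-part quoting like `a`.`b`.`c`.
VIEW_ID_RE = re.compile(
    r"CREATE\s+(?:OR\s+REPLACE\s+)?(?:MATERIALIZED\s+)?VIEW\s+"
    r"(?:IF\s+NOT\s+EXISTS\s+)?`?(?P<view_id>[\w.-]+)`?",
    re.IGNORECASE,
)


def extract_view_id(sql: str):
    """Return the view ID from a CREATE VIEW statement, or None if absent."""
    match = VIEW_ID_RE.search(sql)
    return match.group("view_id") if match else None
```

Matching on the statement text avoids depending on a SQL parser recognizing every keyword.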
* Add state location for US & Canadian VPN subscriptions (DENG-2099) (#4675)
* add triage/confidential tag to docs (#4678)
* feat(DENG-2156): added value_length check and updated some of the ETL checks to use the macro (#4672)
* added value_length check and updated some of the ETL checks to use the macro
* added the new check macro to the data checks docs
* implemented lelilia feedback from PR#4672
* simplified the sql logic for the value_length check
* Skipping copying checks for baseline tables for apps marked as not receiving the baseline ping (#4670)
Co-authored-by: Frank Bertsch <frank.bertsch@gmail.com>
* Revert "Define `event_monitoring_live_v1` views in `view.sql` files (#4576)" (#4680)
This reverts commit 2c4cc5eefe.
* Change directory to generate private DAGs so `sql_file_path` values are relative to the repo root. (#4668)
* `cd` into `private-bigquery-etl` repo when generating DAGs.
To avoid generated DAGs having incorrect absolute paths for ETLs using SQL scripts.
* Revert "Temporarily add curtis to CODEOWNERS until he can be added to group (#4665)" (#4669)
This reverts commit 8d94a86017.
* ci-fix Ignore dataset.update required permissions when dryrunning authorized views (#4681)
* Refactor, add typehint
* Add datasets.update clause denied for authorized views
* add country dimension
* remove generated and old files
* delete generated files
* regenerate sql and delete more files
* last edits to android funnel before review
* change description fields
* modify config to add retention outcomes
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Sergio E. Betancourt <37666064+sergiosonline@users.noreply.github.com>
Co-authored-by: Curtis Morales <cmorales@mozilla.com>
Co-authored-by: Frank Bertsch <frank.bertsch@gmail.com>
Co-authored-by: m-d-bowerman <107562575+m-d-bowerman@users.noreply.github.com>
Co-authored-by: akkomar <akkomar@users.noreply.github.com>
Co-authored-by: Rebecca BurWei <rebecca.burwei@gmail.com>
Co-authored-by: Alekhya Kommasani <akommasani@mozilla.com>
Co-authored-by: Alekhya <88394696+alekhyamoz@users.noreply.github.com>
Co-authored-by: Alexander <anicholson@mozilla.com>
Co-authored-by: wil stuckey <wstuckey@mozilla.com>
Co-authored-by: Daniel Thorn <dthorn@mozilla.com>
Co-authored-by: Leli <33942105+lelilia@users.noreply.github.com>
Co-authored-by: Jan-Erik Rediger <jrediger@mozilla.com>
Co-authored-by: Lucia <30448600+lucia-vargas-a@users.noreply.github.com>
Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
Co-authored-by: kik-kik <42538694+kik-kik@users.noreply.github.com>
Co-authored-by: Marlene Hirose <92952117+Marlene-M-Hirose@users.noreply.github.com>
Co-authored-by: David Zeber <dzeber@mozilla.com>
Co-authored-by: betling <betling@mozilla.com>
Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>
Co-authored-by: Linh Nguyen <linhnguyen@mozilla.com>
Co-authored-by: Mike Williams <102263964+mikewilli@users.noreply.github.com>
Co-authored-by: ksiegler1 <ksiegler@mozilla.com>
Co-authored-by: Kimberly Siegler <kimberlysiegler@Kimberlys-MBP-2.attlocal.net>
Co-authored-by: Eduardo Filho <edugomfilho@gmail.com>
Co-authored-by: richard baffour <baffour345@gmail.com>
Co-authored-by: gkabbz <gkabbz@gmail.com>
Co-authored-by: Ksenia <kberezina@mozilla.com>
Co-authored-by: kik-kik <kignasiak@mozilla.com>
* Mozilla accounts login funnel conversion for overall, with email confirmation, and with two factor authentication
* Update sql_generators/funnels/configs/login_funnels.toml
* Update sql_generators/funnels/configs/login_funnels.toml
---------
Co-authored-by: Kimberly Siegler <kimberlysiegler@Kimberlys-MBP-2.attlocal.net>
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
* Add mobile shopping data
* Remove the ff desktop from sql_generator
* Fix build issue
* Incorporate feedback from Bruce
* Add clients table for mobile
* FIX CI issue
* Incorporate Bruce's feedback
* Incorporate Curtis' feedback
In future versions of Glean that timestamp can be more precise, so we
need to ensure we correctly parse it.
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
* Add review checker derived datasets
* Add bqetl_review_checker dag
Fix
* Fix CI validate dag step
* Incorporate feedback from Alex
* Fix CI
* change client last seen to clients first seen
* fix dag
* Respect sql_dir in dryrun skip
* Update bigquery_etl/dryrun.py
Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>
* Update bigquery_etl/dryrun.py
Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>
* Set sql_dir when using Schema.from_query_file()
---------
Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>
* Quote unqualified field names in `glean_usage` app ping select expressions.
The field names may be reserved keywords in BigQuery (e.g. `range`).
* Generate correct data type for float fields in `glean_usage` app ping select expressions.
BigQuery currently requires the type be `FLOAT64`, not just `FLOAT`.
* Simplify `(SELECT ARRAY_AGG(...))` to `ARRAY(SELECT ...)` in `glean_usage` app ping select expressions.
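A rough sketch of the first two fixes, using hypothetical helper names (`quote_field`, `bigquery_type`, `null_column`) that are not taken from the generator itself:

```python
def quote_field(name: str) -> str:
    """Wrap an unqualified field name in backticks so reserved words are safe."""
    return f"`{name}`"


def bigquery_type(schema_type: str) -> str:
    """Return the type spelling for a query; only FLOAT is rewritten here."""
    upper = schema_type.upper()
    return "FLOAT64" if upper == "FLOAT" else upper


def null_column(name: str, schema_type: str) -> str:
    """Build a typed NULL placeholder column (an assumed use case)."""
    return f"CAST(NULL AS {bigquery_type(schema_type)}) AS {quote_field(name)}"


# A field named after a reserved keyword, with a FLOAT schema type.
expression = null_column("range", "float")
```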
* DS-2998- Add attribution data to aua
Add att to aua
Fix sql format CI error
Fix CI Issue
* Fix the mobile scripts
Generate dags to fix CI
* Fix desktop query to remove um reference
Fix dag generate
Generate dags
* Regenerate firefox_ios and kpi_shredder dag
* Remove desktop query and bump mobile versions to v2
* Change to v2 for CODEOWNERS file
* revert desktop sql template to remove attr cols
* Incorporate changes based on Lucia's feedback
* Remove search columns from the final view
* Setup query and metadata templates to generate the queries for active_users_aggregates_deletion_request tables.
* Separate the active_users KPIs from the calculation based on deletion requests.
* Move mobile view outside of the browsers loop.
* Update DAG, move partition_date to the end of the query.
* Update DAG, move partition_date to the end of the query, change from parameters in the metadata to filter dates in the query.
* Revert change to fenix_derived.
* Remove 7-day window update and update tests. Fix indentation.
* Set query parameters using jinja template.
* Using parameters in the query.
* Formatting.
* Update sql_generators/active_users_deletion_requests/templates/mobile_deletion_request_query.sql
Co-authored-by: Daniel Thorn <dthorn@mozilla.com>
* Formatting.
* Search table's date filter.
* Query only searches for clients with deletion request. Add required filter on submission_timestamp for table deletion_request.
* Format
---------
Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
Co-authored-by: Daniel Thorn <dthorn@mozilla.com>
* Bug 1741487 - Rename url2 and related fields in stable views
This removes the following unpopulated fields from Glean views: `metrics.url`, `metrics.text`, `metrics.jwe`, and `metrics.labeled_rate`. If any of these metrics exists in the source table under a `2`-suffixed name, it is also aliased to its original name (`url2` to `url` and so on).
Suffixed fields are still preserved until view consumers migrate.
* Remove redundant comma from generated sql
* Ignore missing fields in views if any of them were removed
* added a todo comment
* Added additional context around why we are excluding some of the non-suffixed fields and why we are aliasing to remove the `2` suffix from some fields
---------
Co-authored-by: Arkadiusz Komarzewski <akomarzewski@mozilla.com>
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
* DENG-970 Only Glean in Focus Android view.
* DENG-970 CI fix
* DENG-970 CI failure fix. Related to issue 3889.
* Fix UDF dependencies deploy on stage
* DENG-970 Revert specific calling to dataset for UDF.
---------
Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
Co-authored-by: Brad Ochocki <brad.ochocki@gmail.com>
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
* Bug 1823627 - Normalize the channel based on probeinfo data in UNIONized views
* Handle fenix channel normalization for app pings
* Parallelize stage schema deploys
* Fix schema field order
---------
Co-authored-by: Jan-Erik Rediger <jrediger@mozilla.com>
* DENG-742 Add os_grouped to active_users views.
* DENG-516 Split Browser version (app_version).
* DENG-710 Add os_grouped and update telemetry.active_users_aggregates view to use the new views per browser.
* DENG-710 Rename QDAU to DAU.
---------
Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
Co-authored-by: Winnie Chan <10429026+wwyc@users.noreply.github.com>
* DENG-774 Add change control to active_users_aggregates and test.
* DENG-774 Add test coverage.
---------
Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
* DENG-682 Refactor active_users_aggregates to be generated per-browser, change source for mobile data to clients_last_seen_joined and for desktop data to telemetry.main.
* DENG-682 header fix.
* DENG-682 format fix.
* DENG-682 fixed header.
* DENG-682. Adjust parameters for write_sql function.
* DENG-682. Adjust for dag generation
* DENG-682. Adjust for dag generation
* regenerated bqetl_analytics_aggregations.py
* regenerated dags which use active_users_aggregates sql_generators
* DENG-698 Focus Android query includes Legacy and Glean.
* DENG-698 DAG bqetl_core external dependency added.
* DENG-698 Column name update.
* DENG-698 Syntax in DAG.
---------
Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
Co-authored-by: kik-kik <kignasiak@mozilla.com>
* Prevent glean_usage generation if referenced dataset doesn't exist
* Update sql_generators/glean_usage/common.py
Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>
---------
Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>
* Indent `WHEN` and `ELSE` clauses one level more than `CASE`.
* Indent `THEN` clauses one level more than the corresponding `WHEN` clause.
* Have the content of `WHEN`, `THEN`, and `ELSE` clauses start on the same line as the clause keyword.
* Allow an alias, comma, or dot right after a `CASE` statement's `END`.
* DENG-452. Update the query template for mobile to set the correct query for uri_count and is_default_browser.
* Update metrics_templating.yaml
Add space.
* Removing placeholder comment in template.
Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
If there isn't a separate app_id per channel, we'll get "release" overriding
the already-correct-and-normalized `normalized_channel` from the source table.
* added dedup step at the end of the query to ensure we only end up with a single entry per client_id
* ensuring no duplicates in core_first_seen macro
* Update sql_generators/glean_usage/templates/baseline_clients_first_seen_v1.query.sql
Co-authored-by: Daniel Thorn <dthorn@mozilla.com>
* Update sql_generators/glean_usage/templates/baseline_clients_first_seen_v1.init.sql
Co-authored-by: Daniel Thorn <dthorn@mozilla.com>
* Update sql_generators/glean_usage/templates/macros.sql
Co-authored-by: Daniel Thorn <dthorn@mozilla.com>
* removed min at the end
Co-authored-by: Daniel Thorn <dthorn@mozilla.com>
This allows manual overrides of generated files by checking them into the sql/ tree of the default branch of the repository.
Correspondingly, this also deletes previously generated glean_usage files that weren't intended to be overrides.
* Bug 1757216 Add ISP field to baseline_clients_daily
Replaces #2919
This will enable us to filter out clients sent by BrowserStack in downstream tables and queries.
* Add refs to PR and bug
* Remove test table in script
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
* Generate metadata file for baseline_clients_daily
* Remove generated org_mozilla_fenix_derived content
This used to be needed in-tree for testing, but we now generate before
we run tests in CI.
* Fixup metadata templates
* Updated the data source for `country_names_v1`
The new source is a YAML file and sql generator
for easier verification of duplicates and contribution.
* Update static README
Co-authored-by: Jeff Klukas <jklukas@mozilla.com>
* Changed feature_usage string to number comparisons
* Update sql/moz-fx-data-shared-prod/telemetry_derived/feature_usage_v2/metadata.yaml
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
* Updated feature_usage DAG owners
Co-authored-by: Anna Scholtz <anna@scholtzan.net>