Commit graph

349 commits

Author SHA1 Message Date
Leif Oines 6bdd9dcd8b
Firefox android onboarding funnel dataset (#4479)
* android funnel test

* fix filter expression

* fix string comparison

* revise toml

* add completed event

* simplify by using events_unnested

* Funnel fixes

* Bump mkdocs from 1.5.2 to 1.5.3 (#4321)

Bumps [mkdocs](https://github.com/mkdocs/mkdocs) from 1.5.2 to 1.5.3.
- [Release notes](https://github.com/mkdocs/mkdocs/releases)
- [Commits](https://github.com/mkdocs/mkdocs/compare/1.5.2...1.5.3)

---
updated-dependencies:
- dependency-name: mkdocs
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [RS-826] New job to calculate newtab visits -> activity stream sessions (#4387)

* New job to calculate newtab visits -> activity stream sessions

* Removing newline chars at end of file

* Removing newline chars at end of file

* Removing newline chars at end of file

* Addressing comment suggestions

* Format

* Add bqetl_ads DAG

* Add ACL to nt_visits_to_sessions_conversion_factors_daily_v1

* Add metadata files

* Add view to dry_run skip list

* Oops, fix the view

---------

Co-authored-by: Curtis Morales <cmorales@mozilla.com>

* Allow running multiple checks (#4471)

* Allow running multiple checks

* Don't yield anything on no matches

* Change pocket_available for new Pocket markets (#4472)

* FXA-6721 Setup import of accounts table from FxA production CloudSQL (#4423)

* Urlbar events: nested (long) instead of wide (#4373)

* feat: urlbar events final release

* feat: new result types

* feat: add interaction and group

* fix: date

* fix: use BQ builtin for UUIDs

* Add the view_v2

* Add new table to the DAG

* fix CI error

fix ci error

* remove teon brooks

* Incorporate feedback by Curtis

Incorporate feedback from Curtis

---------

Co-authored-by: Alekhya Kommasani <akommasani@mozilla.com>
Co-authored-by: Alekhya <88394696+alekhyamoz@users.noreply.github.com>

* DENG-1705 - Add startup_profile_selection_reason_first to clients_daily_v6 (#4473)

* Update experiment export query to include feature ids and branch feature config values (#4477)

* Update experiment export query to include feature ids and branch feature
config value.

* Add view skip for broken view

* add skip to dry run as well

* DENG-476 - Update monitoring ETLs to reference main_v5 (#4431)

* DENG-476 - Update sampled main ping tables to reference main_v5 (#4433)

* DENG-476 - Update experiment aggregates ETL to reference main_v5 (#4435)

* DENG-476 - Update internet outages to reference main_v5 (#4432)

* Fix test for mozfun.norm.result_type_to_product_name (#4487)

* Bug 1860814 -  fix amo_prod__desktop_addons_by_client (#4481)

* quick fix

* fix spread out groupby

* move out sourcetable query

---------

Co-authored-by: Frank Bertsch <frank.bertsch@gmail.com>

* fix for #4481 (#4489)

* DENG-1781- Remove urlbar_events_temp_v2 view and repoint urlbar_events view to v2 (#4486)

* Remove urlbar_events_temp_v2 view and repoint urlbar_events view to v2

* Include all sql_gen files in package (#4490)

When the bigquery-etl package is installed from pypi (or locally
via `pip install .`), the only non-py files included in the package
are those in the `package_data` section of setup.py.

Previously, with just those files, sql generation would fail due
to missing files. Because this directory is small, we should
include all files so no one accidentally runs into this problem
again.

Co-authored-by: Daniel Thorn <dthorn@mozilla.com>

* Bump types-requests from 2.31.0.2 to 2.31.0.10 (#4475)

Bumps [types-requests](https://github.com/python/typeshed) from 2.31.0.2 to 2.31.0.10.
- [Commits](https://github.com/python/typeshed/commits)

---
updated-dependencies:
- dependency-name: types-requests
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump mozilla-metric-config-parser from 2023.9.2 to 2023.10.2 (#4476)

Bumps [mozilla-metric-config-parser](https://github.com/mozilla/metric-config-parser) from 2023.9.2 to 2023.10.2.
- [Release notes](https://github.com/mozilla/metric-config-parser/releases)
- [Commits](https://github.com/mozilla/metric-config-parser/compare/2023.9.2...2023.10.2)

---
updated-dependencies:
- dependency-name: mozilla-metric-config-parser
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Anna Scholtz <anna@scholtzan.net>

* Glean server knobs monitoring table (#4491)

* Glean server knobs monitoring table

* fix code gen and skip dry-run

* Remove view creation in query

* DENG-1879 Setup import of emails table from FxA stage CloudSQL (#4493)

* DENG-1879 Setup import of emails table from FxA prod CloudSQL (#4494)

* Bump jsonschema from 4.19.0 to 4.19.2 (#4495)

Bumps [jsonschema](https://github.com/python-jsonschema/jsonschema) from 4.19.0 to 4.19.2.
- [Release notes](https://github.com/python-jsonschema/jsonschema/releases)
- [Changelog](https://github.com/python-jsonschema/jsonschema/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/python-jsonschema/jsonschema/compare/v4.19.0...v4.19.2)

---
updated-dependencies:
- dependency-name: jsonschema
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: akkomar <akkomar@users.noreply.github.com>

* Bump pytest from 7.4.2 to 7.4.3 (#4496)

Bumps [pytest](https://github.com/pytest-dev/pytest) from 7.4.2 to 7.4.3.
- [Release notes](https://github.com/pytest-dev/pytest/releases)
- [Changelog](https://github.com/pytest-dev/pytest/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pytest-dev/pytest/compare/7.4.2...7.4.3)

---
updated-dependencies:
- dependency-name: pytest
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Enforce no date partition parameter in DAG (#4497)

* Use mozfun.glean.parse_datetime to parse ping_info fields (#4464)

In future versions of Glean that timestamp can be more precise, so we
need to ensure we correctly parse it.

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

* Remove mmccorquodale from DAG owners (#4492)

* Fix test for norm.glean_ping_info

* Bump black from 23.9.1 to 23.10.1

Bumps [black](https://github.com/psf/black) from 23.9.1 to 23.10.1.
- [Release notes](https://github.com/psf/black/releases)
- [Changelog](https://github.com/psf/black/blob/main/CHANGES.md)
- [Commits](https://github.com/psf/black/compare/23.9.1...23.10.1)

---
updated-dependencies:
- dependency-name: black
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Bump sqlglot from 18.11.4 to 19.0.1 (#4500)

Bumps [sqlglot](https://github.com/tobymao/sqlglot) from 18.11.4 to 19.0.1.
- [Changelog](https://github.com/tobymao/sqlglot/blob/main/CHANGELOG.md)
- [Commits](https://github.com/tobymao/sqlglot/compare/v18.11.4...v19.0.1)

---
updated-dependencies:
- dependency-name: sqlglot
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Materialized views and aggregated tables for event monitoring (#4478)

* WIP event monitoring

* Add FxA custom events to view definition (#4483)

* Add FxA custom events to view definition

* Update sql_generators/event_monitoring/templates/event_monitoring_live.init.sql

* Update sql_generators/event_monitoring/templates/event_monitoring_live.init.sql

* Update sql_generators/event_monitoring/templates/event_monitoring_live.init.sql

* Update sql_generators/event_monitoring/templates/event_monitoring_live.init.sql

---------

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

* Move event monitoring to glean_usage generator

* Add cross-app event monitoring view

* Generate cross app monitoring

* Simplify event monitoring aggregation

---------

Co-authored-by: akkomar <akkomar@users.noreply.github.com>

* Remove generated DAGs from main (#4507)

* Add output_dir to command dag generate. (#4512)

* Add output_dir to command dag generate.

* output_dir to command dag generate.

* output_dir to command dag generate.

---------

Co-authored-by: Lucia Vargas <lvargas@mozilla.com>

* Bump pyarrow from 13.0.0 to 14.0.0 (#4511)

Bumps [pyarrow](https://github.com/apache/arrow) from 13.0.0 to 14.0.0.
- [Commits](https://github.com/apache/arrow/compare/go/v13.0.0...go/v14.0.0)

---
updated-dependencies:
- dependency-name: pyarrow
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump pre-commit from 3.4.0 to 3.5.0 (#4510)

Bumps [pre-commit](https://github.com/pre-commit/pre-commit) from 3.4.0 to 3.5.0.
- [Release notes](https://github.com/pre-commit/pre-commit/releases)
- [Changelog](https://github.com/pre-commit/pre-commit/blob/main/CHANGELOG.md)
- [Commits](https://github.com/pre-commit/pre-commit/compare/v3.4.0...v3.5.0)

---
updated-dependencies:
- dependency-name: pre-commit
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Remove distinct_docids query (#4449)

* Bump pip from 23.0 to 23.3 (#4516)

Bumps [pip](https://github.com/pypa/pip) from 23.0 to 23.3.
- [Changelog](https://github.com/pypa/pip/blob/main/NEWS.rst)
- [Commits](https://github.com/pypa/pip/compare/23.0...23.3)

---
updated-dependencies:
- dependency-name: pip
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump mkdocs-material from 9.3.1 to 9.4.7 (#4518)

Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 9.3.1 to 9.4.7.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/9.3.1...9.4.7)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Don't generate dags in bqetl query schedule command (#4517)

* Add query to load application information from probe info service (#4508)

* prefixing schema error messages inside dryrun with "ERROR" to make them easier to find when searching logs for the cause of exit code 1 (#4522)

* updated schema for telemetry_derived/clients_last_seen_joined_v1 to align it with the query results (#4523)

* Update scheduler of aggregates to run after upstreams. (#4503)

* Update scheduler of aggregates to run after upstreams.

* Update dags for new scheduler of analytics_aggregates

* Update dag bqetl_search

* Remove DAG.

---------

Co-authored-by: Lucia Vargas <lvargas@mozilla.com>

* Set depend_on_past=False for warn checks (#4526)

* Add map.set_key to mozfun (#4527)

* Add map.set_key to mozfun

* Disallow NULL keys in maps

* DS-3281 - Add client adclicks history table (#4528)

* Add client adclicks history table

* Add alias to ad_click_history col

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

* Remove partition parameter on table write

---------

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

* Add experiment information to event monitoring (#4519)

* feat(DENG-1774): adding fenix derived firefox android clients v2 (#4424)

* added fenix_derived.firefox_android_clients_v2

* added ETL checks for fenix_derived.firefox_android_clients_v2

* made changes as suggested by bani in PR#4424

* converting unique check for android clients v2 until duplication is resolved

* added install_source field to firefox_android_clients_v2 and formatting applied on checks

* added locale field and modified the query to support is_init()

* removed generated dag due to new generation process

* Add submission_date param to adclicks history (#4531)

* DS-3054. Support running an initialization query in parallel (#4322)

* DS-3054. Create functions to support running an initialization query for all sample_ids in parallel.

* DS-3054. Update _run_query function.

* DS-3054. Use _run_query and mapped values for initialization in parallel.

* DS-3054. Unify initialization to run in parallel and get sample_id range from metadata.

* DS-3054. Minimize formatting of query template and remove need to modify existing initialization queries. Validate if a query should use parallelized or regular update.

* DS-3054. Adding link to caveats.

* DS-3054. Update sample_id range for initialization.

* DS-3054. Use current implementation of run_query.

* DS-3054. Update using a parameter instead of initialization in metadata.

* DS-3054. DAG update with new parameter.

* Pass parameters before calling _run_query().

* Use --append_table in favour of INSERT INTO.

* DS-3054 Separate parallel and non parallel init, plus some improvements.

---------

Co-authored-by: Lucia Vargas <lvargas@mozilla.com>

* Add ios baseline_clients_yearly (#4506)

* DENG-1935 Change data ordering from pings in clients-first-seen-v2 (#4533)

* DENG-1935 Change data ordering from pings in clients-first-seen-v2
* Added main ping for client-3, maintain chosen ping

* Fix comments in event monitoring queries (#4535)

* DENG-1705 - Add missing client attribution columns to clients daily/first-seen (#4505)

* DENG-1705 Add missing client attribution columns to clients daily/firstseen
* Update clients_last_seen_joined

* Rename main_v4 -> main_v5 in ssl_ratios tests (#4536)

* Make base tables configurable in glean_usage generator (#4534)

* Make base tables configurable in glean_usage generator

* Fix event extras unnesting in event monitoring

* Bump sqlglot from 19.0.1 to 19.0.3 (#4521)

Bumps [sqlglot](https://github.com/tobymao/sqlglot) from 19.0.1 to 19.0.3.
- [Changelog](https://github.com/tobymao/sqlglot/blob/main/CHANGELOG.md)
- [Commits](https://github.com/tobymao/sqlglot/compare/v19.0.1...v19.0.3)

---
updated-dependencies:
- dependency-name: sqlglot
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Anna Scholtz <anna@scholtzan.net>

* DS-3272 - Review checker data model for mobile (#4498)

* Add mobile shopping data

* Remove the ff desktop from sql_generator

* Fix build issue

* Incorporate feedback from Bruce

* Add clients table for mobile

* FIX CI issue

* Incorporate Bruce's feedback

* Incorporate Curtis' feedback

* Fix event_monitoring_aggregates_v1 template (#4537)

This will ensure that FxA tables are included in the aggregate.

* Fixing query error in fenix_derived/firefox_android_clients_v2/checks.sql (#4539)

* Add missing clients view to fenix review checker (#4540)

* add other projects to query from for bq usage, add for loop (#4529)

* add other projects to query from for bq usage, add for loop

* create new function to gather jobs_by_project data into temp table, update create_query function to join jobs_by_org table to jobs_by_project tmp table

* take out date from tmp table as it is unnecessary

* refactor to take out irrelevant function, rewrite SQL to look at other projects

* add date filter to jobs_by_project

* add comment for future refactoring

* add tmp_table for jobs_by_project table

* create function to loop through projects for jobs_by_project, revise query to join jobs_by_org with jobs_by_project tmp table

* take out ambiguous DATE filter

* take out r_prefix in regex from query string. Take out tmp table function. Add proper date filter

* take out r_prefix in regex from query string. Take out tmp table function. Add proper date filter

* add back in the r_prefix and add in the extra space in the Query ID regex that was needed

* updated two affected fields across task_instance and trigger airflow metadata tables to type JSON (#4545)

* Fix event monitoring template (#4546)

NULLs need to be cast to string to make the union work.

This will fix https://workflow.telemetry.mozilla.org/log?execution_date=2023-11-09T02%3A00%3A00%2B00%3A00&task_id=monitoring_derived__event_monitoring_aggregates__v1&dag_id=bqetl_monitoring&map_index=-1

* removed check for firefox_ios_clients_v1 which used different filtering settings causing result mismatch (#4547)

* iOS attributable_clients use metrics adclicks (#4543)

* iOS attributable_clients use metrics adclicks

* Remove project id from table name

Co-authored-by: kik-kik <42538694+kik-kik@users.noreply.github.com>

---------

Co-authored-by: kik-kik <42538694+kik-kik@users.noreply.github.com>

* Use correct submission_* field (#4549)

* Use correct app_version field (#4551)

* Revert "updated two affected fields across task_instance and trigger airflow metadata tables to type JSON (#4545)" (#4552)

This reverts commit 9750d331ca.

* DENG-1705 - Add startup_profile_selection_reason from first ping to clients_daily, clients_first_seen_v2 and downstream (#4482)

* DENG-1705 - Add startup_profile_selection_reason to clients_first_seen

* Add startup_profile_selection_reason_first_ping_only

* Query typo

* Update test schema

* Update sql/moz-fx-data-shared-prod/telemetry_derived/clients_first_seen_28_days_later_v1/schema.yaml

Co-authored-by: Lucia <30448600+lucia-vargas-a@users.noreply.github.com>

---------

Co-authored-by: Lucia <30448600+lucia-vargas-a@users.noreply.github.com>

* change filter on final query to go back to May 2023 - the min date in the Jobs by Project table as of 11/13/23 (#4559)

* change filter on final query to go back in history

* take out extraneous WHERE

* add DISTINCT to final query

* Add rust result types to product mapping (#4544)

* missing-mobile-fields-review-checker (#4553)

* noting that we are missing some fields

* adding is_fx_dau to android and ios clients

* add missing columns to schema.yaml

add schema.yaml

add schema.yaml

* Delete sql/moz-fx-data-shared-prod/firefox_desktop/serp_events/view.sql

---------

Co-authored-by: Alekhya Kommasani <akommasani@mozilla.com>
Co-authored-by: Alekhya <88394696+alekhyamoz@users.noreply.github.com>

* Add aggregate table to monitor event errors (#4548)

* updated fenix_derived.funnel_retention_clients_* to use clients view instead of table directly (#4563)

* Bug 1864722 - Fix column name typo (#4567)

* add referenced tables to metadata.yaml to make sure jobs_by_org task … (#4568)

* add referenced tables to metadata.yaml to make sure jobs_by_org task runs before bigquery_usage_v2 task

* Update sql/moz-fx-data-shared-prod/monitoring_derived/bigquery_usage_v2/metadata.yaml

Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>

---------

Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>

* Generate normal task dependencies from `depends_on` if the task is in the same DAG (#4569)

* Generate normal task dependencies from `depends_on` if the task is in the same DAG.

* Update `metadata.yaml` files to use `depends_on` rather than `upstream_dependencies`.

* Add a period-over-period check for revenue data (#4566)

* Check for period over period changes in column sum

* Fix percent change calculation

* Fix errors in navigation function logic

* Rename period over period check to specify revenue

* Remove references to period over period check

---------

Co-authored-by: Alekhya <88394696+alekhyamoz@users.noreply.github.com>
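
As an illustration of the period-over-period idea in the commit above, here is a minimal Python sketch. It assumes the check compares the sum of a column in the current period with the sum in the previous period and flags a large relative change. The function names and the threshold are hypothetical; the actual check in the repository is written in SQL and may differ.

```python
from typing import Optional


def percent_change(previous_sum: float, current_sum: float) -> Optional[float]:
    """Relative change from previous to current period, in percent.

    Returns None when the previous sum is zero, because the ratio is
    undefined in that case (hypothetical convention for this sketch).
    """
    if previous_sum == 0:
        return None
    return 100 * (current_sum - previous_sum) / abs(previous_sum)


def exceeds_threshold(
    previous_sum: float, current_sum: float, threshold_pct: float
) -> bool:
    """True when the absolute percent change is larger than the threshold."""
    change = percent_change(previous_sum, current_sum)
    if change is None:
        # Treat "no baseline" as a failure so it gets looked at (assumption).
        return True
    return abs(change) > threshold_pct
```

For example, `exceeds_threshold(100, 110, threshold_pct=5)` is true because the sum moved by 10 percent, while `exceeds_threshold(100, 103, threshold_pct=5)` is false.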

* feat(): updated fenix_derived.firefox_android_clients_v2 to include reported_baseline_ping field (#4565)

* updated fenix_derived.firefox_android_clients_v2 to include reported_baseline_ping field

* Update sql/moz-fx-data-shared-prod/fenix_derived/firefox_android_clients_v2/query.sql

Co-authored-by: Lucia <30448600+lucia-vargas-a@users.noreply.github.com>

---------

Co-authored-by: Lucia <30448600+lucia-vargas-a@users.noreply.github.com>

* summing sap and ad clicks (#4571)

* remove file that isn't ready yet (#4572)

* Add ga.nullify_string UDF (#4556)

* Add ga.nullify_string UDF

* Add README line

* added fenix_derived.firefox_android_clients_v2 to shredder config (#4564)

* Use client_info.app_channel for event monitoring channels (#4575)

* Add ga_sessions_v1 table & view (#4554)

* Add ga_sessions_v1 table & view

This table aggregates session-level data from GA.

* Rename nullify string func

* Apply suggestions from code review

Co-authored-by: Alexander <anicholson@mozilla.com>

* Add upstream backfill deps

* Move depends_on to correct section

---------

Co-authored-by: Alexander <anicholson@mozilla.com>

* Make sure that metadata `friendly_name` and `description` are not None (#4513)

* Fill empty description

* Assign a friendly name if the table doesn't have one

* Update metadata tests

* Update bigquery_etl/metadata/parse_metadata.py

Co-authored-by: Alexander <anicholson@mozilla.com>

* update test again

---------

Co-authored-by: Alexander <anicholson@mozilla.com>

* Add back normalized_app_id (#4580)

* Add session date param; fix checks CLI bug (#4579)

* Fix checks to filter on partitions

* Don't print "missing checks file" on success

Previously, the message that checks.sql files were missing
was printed on every execution of the "for" statement
(an "else" clause after a "for" runs whenever the loop
completes without a "break").

Instead, we want to print only when there are no files.
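
The `for`/`else` behaviour described above can be reproduced in a few lines of Python. This is only a sketch: `run_checks_buggy` and `run_checks` are hypothetical stand-ins, not the functions in the checks CLI, and they collect messages in a list instead of printing them.

```python
from pathlib import Path
from typing import Iterable, List


def run_checks_buggy(check_files: Iterable[Path]) -> List[str]:
    """Uses for/else: the else branch runs even when files were found."""
    messages = []
    for check_file in check_files:
        messages.append(f"running {check_file}")
    else:
        # Executes whenever the loop finishes without `break`,
        # including after a loop that processed files successfully.
        messages.append("missing checks file")
    return messages


def run_checks(check_files: Iterable[Path]) -> List[str]:
    """Reports a missing file only when the loop body never ran."""
    messages = []
    found_any = False
    for check_file in check_files:
        found_any = True
        messages.append(f"running {check_file}")
    if not found_any:
        messages.append("missing checks file")
    return messages
```

With one file, `run_checks_buggy` still appends the "missing checks file" message, while `run_checks` appends it only for an empty input.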

* Add derived stub attribution logs (#4557)

* Add derived stub attribution logs

This table keeps triplets from the stub attribution logs.
The triplet of (dl_token, ga_client_id, stub_session_id)
will only ever appear once here.

See the associated decision brief:
https://docs.google.com/document/d/1L4vOR0nCGawwSRPA9xiR8Hmu_8ozCGUecXAtBWmGGA0/edit

* Move stub attribution table to new dataset

In order to ensure limited access to the stub attribution service
data without significantly decreasing developer velocity, we
move these tables to a new dataset. That dataset has the defaults
we want for all stub attribution log data:
- Defaults to just read access to data-science/DUET workgroup
- No read/write access for DE

We will backfill via the bqetl_backfill DAG.

* Rename view

* Use correct dataset name in view

* Skip dryrun; no access

* Add gclid_conversions table & view (#4558)

* Add gclid_conversions table & view

This table will support the desktop conversion events.
Each valid GCLID will have any associated conversion events.

See the decision brief:
https://docs.google.com/document/d/1T8ArA9r8HDMTj1ES9NHfJFv2gUWo7w0MjG07iXtuUOI

* Use correct table name

* Use new stub attribution dataset; clarify activity_date

* Use correct date_partition_parameter

Co-authored-by: Alexander <anicholson@mozilla.com>

* Include activity_date as parameter

* Use INNER instead of LEFT joins

* Update doc strings to clarify GCLID vs GA Session

---------

Co-authored-by: Alexander <anicholson@mozilla.com>

* Include GA intraday sessions tables (#4582)

* Include GA intraday sessions tables

* Update doc string on backfilling ga_sessions

* Dont dryrun stub_attribution view

* Update min_row_count error text (#4586)

* Add conversion event; fix gclid conversions query (#4584)

* Add first_run conversion; use correct table names

* Ignore dryrun of query and view

* Remove HAVING clause; fix logical_or

* migrates old pingcentre onboarding artifacts to new firefox_desktop view (#4457)

* migrates old pingcentre onboarding artifacts to new firefox_desktop view

* generate event rollup dag

* generate review checker dag

* update messaging system dag

* incl project in table names

---------

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

* Add ga_clients_v1 table & view (#4560)

* Add ga_clients_v1 table & view

- Query from ga_sessions
- Fix tests

* Use correct scheduling parameters

Co-authored-by: Alexander <anicholson@mozilla.com>

* Move HAVING clause to WHERE

Co-authored-by: Alexander <anicholson@mozilla.com>

* Change CTE name

Co-authored-by: Alexander <anicholson@mozilla.com>

---------

Co-authored-by: Alexander <anicholson@mozilla.com>

* Remove duplicate BQ query param (#4587)

* Firefox ios adclicks (#4585)

* Add Firefox iOS client adclicks history

* Add metadata description to view

* DS-3272 - Fix review checker clients to remove dups (#4583)

* Fix review checker clients to remove dups

* Fix CI issues

* Add row_num filter

* add submission_date to partition

* remove submission_date from partition

* Account for NULL handling in joins (#4590)

Previously, NULL values in the join keys didn't join, resulting
in duplicate rows. This change will coalesce those to empty
strings and NULLIFY them in the view.
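
The underlying SQL behaviour can be seen in a small, self-contained Python sketch using the standard-library `sqlite3` module in place of BigQuery. The table and column names are hypothetical, and the real queries differ; the point is only that `NULL = NULL` does not evaluate to true in a join condition, so rows with NULL keys fail to match unless the keys are coalesced first.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Two hypothetical tables whose second join key is NULL on both sides."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE left_side (client_id TEXT, session_id TEXT);
        CREATE TABLE right_side (client_id TEXT, session_id TEXT);
        INSERT INTO left_side VALUES ('c1', NULL);
        INSERT INTO right_side VALUES ('c1', NULL);
        """
    )
    return conn


def count_matches(conn: sqlite3.Connection, null_safe: bool) -> int:
    """Count joined rows, optionally coalescing NULL keys to ''."""
    if null_safe:
        condition = "COALESCE(l.session_id, '') = COALESCE(r.session_id, '')"
    else:
        condition = "l.session_id = r.session_id"
    sql = f"""
        SELECT COUNT(*)
        FROM left_side AS l
        JOIN right_side AS r
          ON l.client_id = r.client_id AND {condition}
    """
    return conn.execute(sql).fetchone()[0]


def restore_null(conn: sqlite3.Connection, value: str):
    """Turn the '' placeholder back into NULL, as a view could do."""
    return conn.execute("SELECT NULLIF(?, '')", (value,)).fetchone()[0]
```

One caveat of this approach is that a genuine empty string becomes indistinguishable from NULL once the keys are coalesced.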

* Bug 1865716 - Include errorGroups in legacy docker_fxa_admin_server_sanitized query (#4589)

`errorGroups` field was added in `docker_fxa_admin_server_sanitized_v2` and breaks the UNION.

* DS-3361. Update documentation of initialize command. (#4592)

Co-authored-by: Lucia Vargas <lvargas@mozilla.com>

* Link to full diff in git comments (#4593)

* Link to full diff in git comments

* Show full diff of new and deleted files

* Correct DAG description as DAG is currently active. (#4596)

Co-authored-by: Lucia Vargas <lvargas@mozilla.com>

* Login funnel conversions (#4591)

* Mozilla accounts login funnel conversion for overall, with email confirmation, and with two factor authentication

* Update sql_generators/funnels/configs/login_funnels.toml

* Update sql_generators/funnels/configs/login_funnels.toml

---------

Co-authored-by: Kimberly Siegler <kimberlysiegler@Kimberlys-MBP-2.attlocal.net>
Co-authored-by: Anna Scholtz <anna@scholtzan.net>

* Use live tables to determine deletion request ping volume (#4442)

* Increase no_output_timeout for long-running CI jobs (#4602)

* SVCSE-1595 Setup import of tables from staging FxA databases (#4578)

* In generated diffs explicitly list the files being added or deleted. (#4600)

* Glam accounts for sampling when calculating sample_count for windows & release probes (#4581)

* Glam - fix legacy windows & release probes' sample count going fwd

* Glam FOG accounts for sampling when calculating total_sample for windows & release probes

* fog - fix client count and sample count

* Add channel filtering for fog

* SVCSE-1595 Setup import of tables from production FxA databases (#4597)

* Bug 1866469 - Exclude use_counters from GLAM ETL (#4603)

* Bug 1866469 - Exclude use_counters from GLAM ETL

* Attempt to fix tests

---------

Co-authored-by: Eduardo Filho <edugomfilho@gmail.com>

* feat(): updating fxa android funnel to support install_source filtering downstream (#4561)

* Added a filter to only include playstore data

To keep the bottom of the funnel consistent with the upper funnel, we only include installs from the Play Store in the bottom-of-funnel metrics.

* for fenix_derived.funnel_retention_clients_week_* tables making sure we only include playstore users

* updating the changes as requested by soGaussian to expose to users the install_source field to enable filtering

---------

Co-authored-by: richard baffour <baffour345@gmail.com>

* Add schema.yaml to urlbar_events (sql_generator) (#4595)

* Add schema.yaml to urlbar_events

* SVCSE-1595 Update accounts_db schemas to match deployed tables. (#4604)

* SVCSE-1595 Update more accounts_db schemas to match deployed tables (#4605)

* Fix num_chars_typed in urlbar_events schema (#4607)

* Add init clause to ga_clients table (#4611)

* Give census access to gclid conversions data (#4613)

* Don't nest SQL generated from `main` branch in extra `sql` directory. (#4614)

* Add desktop_acquisition_funnel view (#4616)

* Add desktop_acquisition_funnel view

* Update reference

* Update view.sql

Took out some of the TODO comments around naming to stay consistent with the table the view reads from, and to reduce the effort needed to change the spoke-default view that is currently set up with test data.

---------

Co-authored-by: gkabbz <gkabbz@gmail.com>

* added ETL checks to fenix_derived.firefox_android_clients_v1 (#4609)

* DENG-2013 - Add explicit dependencies & checks for history (#4620)

* Fix the source table to point to unified view to include all apps (#4622)

* Deng 1662 move google ads to ads google mmc connector (#4525)

* DENG-1662 move from google_ads connector to ads_google_mmc connector

* format queries

* add code for cohort_daily_statistics using clients_first_seen_v2 with… (#4404)

* add code for cohort_daily_statistics using clients_first_seen_v2 with new columns from clients_first_seen_v2

* take out extra sample_id

* Update sql/moz-fx-data-shared-prod/telemetry_derived/cohort_daily_stats_clients_frst_seen_v2/query.sql

switching column names - original was swapped

Co-authored-by: Alexander <anicholson@mozilla.com>

* update column names- change cohort_date to first_seen_date, make more descriptive; take out client_id and sample_id in the final table; take out extraneous columns that are not used in final table

* fix group by - days_seen_bits not days_interacted_bits

* take out second_seen_date, irrelevant

* change date _activity to submission_date

* replace submission_date_activity with client_activity

* add new line at end of schema.yaml file

* refactor code to use clients_first_seen_v2, originally committed cohorts_daily_statistics_v1 code in the v2 file

* add cohort_daily_statistics_v2 job to DAG

* add cohort_daily_statistics_v2 job to DAG, take out submission_date and add activity_date to query.sql

* delete now needless dags folder

* correct alias of table

* change submission_date to activity_date

* fix column name apple_model to apple_model_id

* add days_seen_dau_bits and other calculations based on this

* add attribution_dlsource to table

* take out underscore from column name, attribution_dlsource

* revise comment - 196 days not 180 days

* add all the other columns from clients_first_seen_v2, update schema.yaml file with new columns

* take out sample_id, fix schema

* take out document_id, dl_token, app_build_id columns, rename activity_date to submission_date, rename cohort_date to first_seen_date to match clients_first_seen_28_days_later

* move files from cohort_daily_statistics_v2 to desktop_cohort_daily_retention_v1 to reflect name change, take out extraneous columns such as xpcom_abi, attribution_dlsource, engine_data columns

---------

Co-authored-by: Alexander <anicholson@mozilla.com>

* add --project_id command, take out extraneous dashes in start and end commands in creating dataset cookbook (#4626)

* change docs (#4629)

* fix typo in project name (#4628)

* fix typo in project name

* remove shared-prod project from sql for google_ads_derived

* Fixes #4624 - Add a view for firefox_desktop.broken_site_report (#4625)

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

* Separate Airflow tasks for glean_usage (#4588)

* Add support for assigning Airflow tasks to task groups

* Generate separate Airflow tasks for glean_usage

* Remove Airflow dependencies from old glean_usage tasks

* Update dataset_metadata.yaml for broken site reports (#4630)

* Add user-facing view to fxa_oauth.clients (#4623)

* Fix jinja templating in glean usage metadata (#4636)

* feat(DENG-1774 / cancelled): deleting fenix_derived/firefox_android_clients_v2, v1 will remain the active model (#4610)

* deleting fenix_derived/firefox_android_clients_v2, v1 will remain the active model

* removed fenix_derived.firefox_android_clients_v2 from shredder config

* firefox_ios source added to shredder config (#4638)

* Skip check for baseline_clients_last_seen for Fire TV (#4640)

* Resolve correct task_id for tasks nested in a group (#4637)

* Android LTV UDFs (#4633)

* Add Android State UDF

* Add Android Markov States UDFs for LTV

* Make docstrings consistent

* Update doc string

Co-authored-by: Leif Oines <leifdoines@gmail.com>

---------

Co-authored-by: Leif Oines <leifdoines@gmail.com>

* Migrated DIM checks over to ETL checks for internet_outages.global_outages_v1 (#4639)

* Speed up glean_usage generation by caching the table getter (#4644)

`get_tables` is deterministic under the assumption that the tables don't
change in between invocations. Which I hope holds here.
We therefore can just cache that value so that subsequent runs quickly
return without needing a roundtrip to BigQuery again.

* fixing broken test for firefox_ios_derived.baseline_clients_yearly_v1 (#4645)

* Feat/deng 2046/migrating telemetry derived active users aggregates v1 dim checks to etl checks (#4641)

* Migrated DIM checks over to ETL checks for telemetry_derived.active_users_aggregates_v1

* rewrite

* code review suggestions

* add doc

* rename

---------

Co-authored-by: kik-kik <kignasiak@mozilla.com>

* Minimize previous PR diff comments when CI posts a new diff comment (#4635)

* Minimize previous PR diff comments when CI posts a new diff comment.

* Update Node image to latest version available from CircleCI and pin Node packages.

* GLAM avoid scientific notation for big sample counts (#4647)

* GLAM avoid scientific notation for big sample counts

* Cast to bignumeric instead of numeric

* feat(DENG-2083): added firefox_ios_derived.clients_activation_v1 and corresponding view (#4631)

* added firefox_ios_derived.clients_activation_v1 and corresponding view

* fixing a missing separator in firefox_ios_derived.clients_activation_v1 checks

* adding firefox_ios_derived.clients_activation_v1 to shredder configuration

* removed is_suspicious_device_client as it should not be there, thanks bani for pointing this out

* fixed black formatting error inside shredder/config.py

* applied bqetl formatting

* minor styling tweak as suggested by bani in PR#4631

* Remove baseline_clients_daily DAG dependency for FF ios baseline clients yearly (#4651)

* Support offset backfills, require metadata  (#4627)

* Skip backfills for queries without metadata.yaml

* Support date_partition_offset

* Fixed exclude, modified exception

* Add test for offset backfill

* Apply suggestions from code review

Co-authored-by: Frank Bertsch <frank.bertsch@gmail.com>

* Formatting

---------

Co-authored-by: Frank Bertsch <frank.bertsch@gmail.com>

* add dau_clients_days_since_seen to CTE and num_clients_dau_on_day column to table in query and schema (#4652)

* Docs: Avoid newline in link

mkdocs doesn't like that newline and will treat the URL as a relative URL, thus breaking the link

* Docs: Use 3rd level heading for UDFs

mkdocs' ToC generator will stop when the header level goes up again.
Because the UDF name itself is generated as a first level heading, any
UDF with a first-level header documentation will thus stop rendering any
subsequent headers.
Most notably on /mozfun/hist where only the very first UDF got a ToC
entry.

* Docs: Link to section on the same page

The separate chapter was removed in #4293

* Migrated DIM checks over to ETL checks for telemetry_derived.unified_metrics_v1 (#4649)

* feat(DENG-2120): migrated over checks defined in DIM for baseline_clients_last_seen fenix. (#4656)

* migrated over checks defined in DIM for this type of dataset

* Update sql_generators/glean_usage/templates/baseline_clients_last_seen_v1.checks.sql

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

---------

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

* Create tables that have state values per day (#4634)

* Create tables that have state values per day

* Change Airflow DAG

* Move markov states to cols rather than array

* Move bot/bad client filter to materialized table

* Add install_source and consecutive_days_seen features

* Add field to CTE

* Use jinja vars instead of sql variables

* Use correct UDF incantation

* Use live tables for structured error counts (#4598)

* Use live tables for structured error counts

* Prevent old records from being deleted

* Fix structured_error_counts query (#4659)

* Authorize view and add workgroup access for taskcluster (#4661)

* Add metadata.yaml for socorro_crash_v2 (#4664)

* Temporarily add curtis to CODEOWNERS until he can be added to group (#4665)

* Add clients_daily_joined view (#4660)

* add view.sql to telemetry and desktop_cohort_daily_retention view (#4666)

* Skip accounts_db.fxa_oauth_clients in view validation (#4667)

* Public GLAM datasets (#4606)

* Public GLAM datasets

* Remove Fenix GLAM datasets

* DENG-1352 - Migrate contextual services ETL to desktop glean pings (#4474)

* Have `bqetl query` commands fail if they don't find a matching query (#4662)

* Have `bqetl query` commands fail if they don't find a matching query.

* Update `test_run_query_no_query_file` test.

* Skip accounts_db.fxa_oauth_clients dryrun (#4671)

* Remove referenced_table from firefox_android_clients (#4674)

* Define `event_monitoring_live_v1` views in `view.sql` files (#4576)

* Define `event_monitoring_live_v1` views in `view.sql` files.

So they get automatically deployed by the `bqetl_artifact_deployment.publish_views` Airflow task.

* Support materialized views in view naming validation.

* Handle `IF NOT EXISTS` in view naming validation.

* Use regular expression to extract view ID in view naming validation.

This simplifies the logic and avoids a sqlparse bug where it doesn't recognize the `MATERIALIZED` keyword.

* Update other view regular expressions to allow for materialized views.

* Add state location for US & Canadian VPN subscriptions (DENG-2099) (#4675)

* add triage/confidential tag to docs (#4678)

* feat(DENG-2156): added value_length check and updated some of the ETL checks to use the macro (#4672)

* added value_length check and updated some of the ETL checks to use the macro

* added the new check macro to the data checks docs

* implemented lelilia feedback from PR#4672

* simplified the sql logic for the value_length check

* Skipping copying checks for baseline tables for apps marked as not receiving the baseline ping (#4670)

Co-authored-by: Frank Bertsch <frank.bertsch@gmail.com>

* Revert "Define `event_monitoring_live_v1` views in `view.sql` files (#4576)" (#4680)

This reverts commit 2c4cc5eefe.

* Change directory to generate private DAGs so `sql_file_path` values are relative to the repo root. (#4668)

* `cd` into `private-bigquery-etl` repo when generating DAGs.

To avoid generated DAGs having incorrect absolute paths for ETLs using SQL scripts.

* Revert "Temporarily add curtis to CODEOWNERS until he can be added to group (#4665)" (#4669)

This reverts commit 8d94a86017.

* ci-fix Ignore dataset.update required permissions when dryrunning authorized views (#4681)

* Refactor, add typehint
* Add datasets.update clause denied for authorized views

* add country dimension

* remove generated and old files

* delete generated files

* regenerate sql and delete more files

* last edits to android funnel before review

* change description fields

* modify config to add retention outcomes

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Sergio E. Betancourt <37666064+sergiosonline@users.noreply.github.com>
Co-authored-by: Curtis Morales <cmorales@mozilla.com>
Co-authored-by: Frank Bertsch <frank.bertsch@gmail.com>
Co-authored-by: m-d-bowerman <107562575+m-d-bowerman@users.noreply.github.com>
Co-authored-by: akkomar <akkomar@users.noreply.github.com>
Co-authored-by: Rebecca BurWei <rebecca.burwei@gmail.com>
Co-authored-by: Alekhya Kommasani <akommasani@mozilla.com>
Co-authored-by: Alekhya <88394696+alekhyamoz@users.noreply.github.com>
Co-authored-by: Alexander <anicholson@mozilla.com>
Co-authored-by: wil stuckey <wstuckey@mozilla.com>
Co-authored-by: Daniel Thorn <dthorn@mozilla.com>
Co-authored-by: Leli <33942105+lelilia@users.noreply.github.com>
Co-authored-by: Jan-Erik Rediger <jrediger@mozilla.com>
Co-authored-by: Lucia <30448600+lucia-vargas-a@users.noreply.github.com>
Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
Co-authored-by: kik-kik <42538694+kik-kik@users.noreply.github.com>
Co-authored-by: Marlene Hirose <92952117+Marlene-M-Hirose@users.noreply.github.com>
Co-authored-by: David Zeber <dzeber@mozilla.com>
Co-authored-by: betling <betling@mozilla.com>
Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>
Co-authored-by: Linh Nguyen <linhnguyen@mozilla.com>
Co-authored-by: Mike Williams <102263964+mikewilli@users.noreply.github.com>
Co-authored-by: ksiegler1 <ksiegler@mozilla.com>
Co-authored-by: Kimberly Siegler <kimberlysiegler@Kimberlys-MBP-2.attlocal.net>
Co-authored-by: Eduardo Filho <edugomfilho@gmail.com>
Co-authored-by: richard baffour <baffour345@gmail.com>
Co-authored-by: gkabbz <gkabbz@gmail.com>
Co-authored-by: Ksenia <kberezina@mozilla.com>
Co-authored-by: kik-kik <kignasiak@mozilla.com>
2023-12-14 12:14:02 -08:00
kik-kik 34d6a463dc
feat(DENG-2175): added matches_pattern etl checks macro and updated tests to use it (#4683)
* added matches_pattern etl checks macro and updated tests to use it

* added matches_pattern etl check macro to the data checks docs
2023-12-13 16:10:34 +01:00
Leli 4ca6afea07
feat/DENG-2051 migrate active users aggregates (#4673)
* status quo

* fix file_ending

* added checks for all

* fixed comment

* fix typo

* remove fenix per channel checks

* reformat checks templates

* readd endraw

* reformat

* combined some templates

* move logic to dict
2023-12-13 14:15:44 +01:00
Anna Scholtz 74e39946b1
Remove glean_usage checks to use fewer Airflow resources (#4687) 2023-12-12 09:55:14 -08:00
Lucia 18c717325e
DENG 1051 Shredder Prototype (#4677)
* DENG-1051. Collect KPIs for Desktop's deletion requests.
* DENG-1051. Add partition field and format query.
2023-12-12 14:30:22 +01:00
Anna Scholtz c31ae16efb
Revert "Define `event_monitoring_live_v1` views in `view.sql` files (#4576)" (#4680)
This reverts commit 2c4cc5eefe.
2023-12-11 10:15:30 -08:00
kik-kik 6e2e8f6f68
Skipping copying checks for baseline tables for apps marked as not receiving the baseline ping (#4670)
Co-authored-by: Frank Bertsch <frank.bertsch@gmail.com>
2023-12-11 10:51:42 -05:00
kik-kik 2db2f3e2de
feat(DENG-2156): added value_length check and updated some of the ETL checks to use the macro (#4672)
* added value_length check and updated some of the ETL checks to use the macro

* added the new check macro to the data checks docs

* implemented lelilia feedback from PR#4672

* simplified the sql logic for the value_length check
2023-12-11 16:31:25 +01:00
Sean Rose 2c4cc5eefe
Define `event_monitoring_live_v1` views in `view.sql` files (#4576)
* Define `event_monitoring_live_v1` views in `view.sql` files.

So they get automatically deployed by the `bqetl_artifact_deployment.publish_views` Airflow task.

* Support materialized views in view naming validation.

* Handle `IF NOT EXISTS` in view naming validation.

* Use regular expression to extract view ID in view naming validation.

This simplifies the logic and avoids a sqlparse bug where it doesn't recognize the `MATERIALIZED` keyword.

* Update other view regular expressions to allow for materialized views.
2023-12-08 11:54:02 -08:00
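The regular-expression approach described in #4576 above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the pattern, the `VIEW_ID_RE` name, and the `extract_view_id` helper are assumptions made for the example. It only shows how one pattern can tolerate `OR REPLACE`, `MATERIALIZED`, and `IF NOT EXISTS` without relying on a SQL parser.

```python
import re
from typing import Optional

# Hypothetical pattern: the optional keywords are each wrapped in a
# non-capturing group so that plain and materialized views both match.
VIEW_ID_RE = re.compile(
    r"CREATE\s+(?:OR\s+REPLACE\s+)?(?:MATERIALIZED\s+)?VIEW\s+"
    r"(?:IF\s+NOT\s+EXISTS\s+)?`?(?P<view_id>[\w-]+(?:\.[\w-]+){0,2})`?",
    re.IGNORECASE,
)


def extract_view_id(sql: str) -> Optional[str]:
    """Return the view ID from a CREATE VIEW statement, or None if absent."""
    match = VIEW_ID_RE.search(sql)
    return match.group("view_id") if match else None
```

A call such as ``extract_view_id("CREATE MATERIALIZED VIEW IF NOT EXISTS `p.d.v` AS SELECT 1")`` yields the ID without the backticks. IDs where each part is quoted separately are not handled by this sketch.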
kik-kik b7b6c58e23
feat(DENG-2120): migrated over checks defined in DIM for baseline_clients_last_seen fenix. (#4656)
* migrated over checks defined in DIM for this type of dataset

* Update sql_generators/glean_usage/templates/baseline_clients_last_seen_v1.checks.sql

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

---------

Co-authored-by: Anna Scholtz <anna@scholtzan.net>
2023-12-06 17:47:50 +01:00
Jan-Erik Rediger ca21c91b28
Speed up glean_usage generation by caching the table getter (#4644)
`get_tables` is deterministic under the assumption that the tables don't
change in between invocations. Which I hope holds here.
We therefore can just cache that value so that subsequent runs quickly
return without needing a roundtrip to BigQuery again.
2023-12-04 15:46:34 +01:00
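The caching idea in #4644 above can be sketched as follows. This is only an illustration under stated assumptions: `_fetch_tables_from_bigquery` is a hypothetical stand-in for the real BigQuery round trip, the `get_tables` signature shown here is invented for the example, and `functools.lru_cache` is one possible way to memoize the result, not necessarily what the PR uses.

```python
from functools import lru_cache
from typing import Tuple

# Counts simulated round trips so the effect of the cache is visible.
CALLS = {"count": 0}


def _fetch_tables_from_bigquery(project: str, pattern: str) -> Tuple[str, ...]:
    """Hypothetical stand-in for the expensive BigQuery lookup."""
    CALLS["count"] += 1
    return (f"{project}.example_dataset.{pattern}",)


@lru_cache(maxsize=None)
def get_tables(project: str, pattern: str) -> Tuple[str, ...]:
    """Return matching tables; repeated calls with the same arguments hit the cache."""
    # Returning an immutable tuple keeps callers from mutating the cached value.
    return _fetch_tables_from_bigquery(project, pattern)
```

As the commit message notes, this is only safe if the set of tables does not change between invocations within one run; `get_tables.cache_clear()` would reset the cache if that assumption stops holding.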
Anna Scholtz 9399d3ecf0
Fix jinja templating in glean usage metadata (#4636) 2023-11-30 16:30:15 -08:00
Anna Scholtz 7087dbff30
Separate Airflow tasks for glean_usage (#4588)
* Add support for assigning Airflow tasks to task groups

* Generate separate Airflow tasks for glean_usage

* Remove Airflow dependencies from old glean_usage tasks
2023-11-30 09:48:17 -08:00
Anna Scholtz 1bdd2016af
Fix num_chars_typed in urlbar_events schema (#4607) 2023-11-27 16:10:08 -08:00
Alekhya 85fa52c9c1
Add schema.yaml to urlbar_events (sql_generator) (#4595)
* Add schema.yaml to urlbar_events
2023-11-27 11:55:31 -05:00
ksiegler1 57013e241e
Login funnel conversions (#4591)
* Mozilla accounts login funnel conversion for overall, with email confirmation, and with two factor authentication

* Update sql_generators/funnels/configs/login_funnels.toml

* Update sql_generators/funnels/configs/login_funnels.toml

---------

Co-authored-by: Kimberly Siegler <kimberlysiegler@Kimberlys-MBP-2.attlocal.net>
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
2023-11-22 09:52:18 -08:00
Anna Scholtz 1fe5f2d095
Add back normalized_app_id (#4580) 2023-11-17 12:06:53 -08:00
Anna Scholtz 8d7e0fa264
Use client_info.app_channel for event monitoring channels (#4575) 2023-11-16 10:22:45 -08:00
akkomar dc2382771c
Add aggregate table to monitor event errors (#4548) 2023-11-14 15:29:00 +01:00
akkomar 2a0be11f34
Fix event monitoring template (#4546)
Nulls need to be cast to string to make the union work.

This will fix https://workflow.telemetry.mozilla.org/log?execution_date=2023-11-09T02%3A00%3A00%2B00%3A00&task_id=monitoring_derived__event_monitoring_aggregates__v1&dag_id=bqetl_monitoring&map_index=-1
2023-11-10 12:06:12 +01:00
akkomar 0c6197812a
Fix event_monitoring_aggregates_v1 template (#4537)
This will ensure that FxA tables are included in the aggregate.
2023-11-09 08:38:47 -08:00
Alekhya f55f6b64ce
DS-3272 - Review checker data model for mobile (#4498)
* Add mobile shopping data

* Remove the ff desktop from sql_generator

* Fix build issue

* Incorporate feedback from Bruce

* Add clients table for mobile

* FIX CI issue

* Incorporate Bruce's feedback

* Incorporate Curtis' feedback
2023-11-09 08:54:18 -05:00
Anna Scholtz 3cfaf4871c
Make base tables configurable in glean_usage generator (#4534)
* Make base tables configurable in glean_usage generator

* Fix event extras unnesting in event monitoring
2023-11-08 16:30:11 -08:00
akkomar 2ca7b4be76
Fix comments in event monitoring queries (#4535) 2023-11-08 19:21:52 +01:00
Anna Scholtz 0ae76d0971
Add experiment information to event monitoring (#4519) 2023-11-06 16:13:02 -08:00
Anna Scholtz 2c5ca38e81
Fix funnel generation logic (#4514) 2023-11-02 13:19:40 -07:00
Anna Scholtz 185f833f2a
Materialized views and aggregated tables for event monitoring (#4478)
* WIP event monitoring

* Add FxA custom events to view definition (#4483)

* Add FxA custom events to view definition

* Update sql_generators/event_monitoring/templates/event_monitoring_live.init.sql

* Update sql_generators/event_monitoring/templates/event_monitoring_live.init.sql

* Update sql_generators/event_monitoring/templates/event_monitoring_live.init.sql

* Update sql_generators/event_monitoring/templates/event_monitoring_live.init.sql

---------

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

* Move event monitoring to glean_usage generator

* Add cross-app event monitoring view

* Generate cross app monitoring

* Simplify event monitoring aggregation

---------

Co-authored-by: akkomar <akkomar@users.noreply.github.com>
2023-11-01 14:20:20 -07:00
Jan-Erik Rediger b592df7b5b
Use mozfun.glean.parse_datetime to parse ping_info fields (#4464)
In future versions of Glean that timestamp can be more precise, so we
need to ensure we correctly parse it.

Co-authored-by: Anna Scholtz <anna@scholtzan.net>
2023-10-31 09:25:15 -07:00
Alekhya 28e7dbfa12
DENG-1781- Remove urlbar_events_temp_v2 view and repoint urlbar_events view to v2 (#4486)
* Remove urlbar_events_temp_v2 view and repoint urlbar_events view to v2
2023-10-27 14:19:19 -04:00
Daniel Thorn a9eb4a8242
DENG-476 - Update experiment aggregates ETL to reference main_v5 (#4435) 2023-10-26 10:12:19 -07:00
Alekhya a685e17279
Remove the filter on messaging_system_event (#4465) 2023-10-23 09:53:10 -04:00
Alekhya 6f3d34ba67
DS3244 - Add derived datasets for review checker data (#4447)
* Add review checker derived datasets

* Add bqetl_review_checker dag

Fix

* Fix CI validate dag step

* Incorporate feedback from Alex

* Fix CI

* change client last seen to clients first seen

change client last seen to clients first seen

* fix dag
2023-10-18 16:57:15 -04:00
Frank Bertsch 164ba19abf
Glean usage checks (#4445)
* WIP: Add checks for glean_usage

* Ignore pycache in autogenerated click cmds

* Move check to backfill command

* Remove view checks
2023-10-17 17:03:41 -04:00
Alekhya 7f0b7d522a
DENG1546 - Fix bqetl_serp DAG failure (#4429) 2023-10-17 15:51:48 -04:00
Anna Scholtz 358ea9d574
Fix Glean usage generation; don't skip query generation when tables don't exist (#4446) 2023-10-17 10:54:57 -07:00
Alekhya f0c53d7afe
DENG1546 - Fix bqetl serp DAG failure due to missing parameter (#4426)
* DENG1546 - Fix DAG failure due to missing parameter

* Fix CI issue

* Fix logic in parameter
2023-10-16 13:23:57 -04:00
Alekhya f0af9885c4
DENG1546 - Fix BQ partition issue (#4419)
* DENG1546 - Fix BQ partition issue

FIX CI issue

fix CI issue

* Update sql_generators/serp_events/templates/metadata.yaml

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

* Update sql_generators/serp_events/templates/metadata.yaml

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

* Update sql_generators/serp_events/templates/metadata.yaml

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

* fix dag generation

---------

Co-authored-by: Anna Scholtz <anna@scholtzan.net>
2023-10-14 13:12:44 -04:00
Alekhya 22c44fc62a
DENG1546 -Add a derived dataset for serp events (#4407)
* Add a derived dataset for serp events

* Fix CI issues

Add bqetl_dag

Fix CI issues

* reapply missing updates

* Fix the query logic

Fix the query logic

* Fix sql query

* fix partition field
2023-10-12 19:38:29 -04:00
Yashika Khurana 148d4eb539
feat: Cirrus monitor query (#4393)
* feat: Cirrus monitor query

* Update sql_generators/experiment_monitoring/templates/experiments_daily_active_clients_v1/query.sql

Co-authored-by: Mike Williams <102263964+mikewilli@users.noreply.github.com>

* feat: Monitor cirrus events

* fix: Use events

* Update sql/moz-fx-data-shared-prod/telemetry_derived/experiments_daily_active_clients_v1/query.sql

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

* Update sql_generators/experiment_monitoring/templates/experiment_enrollment_aggregates_v1/query.sql

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

* Update sql_generators/experiment_monitoring/templates/experiment_events_live_v1/init.sql

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

* Update sql_generators/experiment_monitoring/templates/experiment_search_events_live_v1/init.sql

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

* Update sql_generators/experiment_monitoring/templates/experiments_daily_active_clients_v1/query.sql

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

* Update sql_generators/experiment_monitoring/templates/experiments_daily_active_clients_v1/query.sql

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

* Update sql_generators/experiment_monitoring/templates/templating.yaml

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

* Update sql_generators/experiment_monitoring/templates/templating.yaml

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

* Update sql/moz-fx-data-shared-prod/monitor_cirrus_derived/experiment_search_events_live_v1/init.sql

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

* Update sql/moz-fx-data-shared-prod/monitor_cirrus_derived/experiment_events_live_v1/init.sql

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

* feat: Update view to table

* feat: Update view to table

* use live tables instead of stable for live events

---------

Co-authored-by: Mike Williams <102263964+mikewilli@users.noreply.github.com>
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
Co-authored-by: Mike Williams <mwilliams@mozilla.com>
2023-10-12 14:46:26 -07:00
Anna Scholtz 35ae323487
Funnel generators POC (#4390)
* Add funnel generation logic

* Example funnel config

* Fix funnel columns

* funnel generation dimensions

* Optimize segmenting generated funnels

* Add funnel generation docs

* Schedule generated funnels

* Skip DAGs with no tasks

* Add background info funnel generator

* Add funnel generation tests

* Fix join_previous_step_on

* Add funnel example config
2023-10-12 14:05:08 -07:00
Anna Scholtz 61da5cca03
Respect sql_dir in dryrun skip (#4334)
* Respect sql_dir in dryrun skip

* Update bigquery_etl/dryrun.py

Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>

* Update bigquery_etl/dryrun.py

Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>

* Set sql_dir when using Schema.from_query_file()

---------

Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>
2023-10-12 13:27:54 -07:00
Frank Bertsch 79fa5487c3
Bug 1852517 - Add execution delta to metadata & dag (#4239)
* Add execution delta to metadata & dag

* Spelling fix

* Update execution hour

---------

Co-authored-by: Lucia <30448600+lucia-vargas-a@users.noreply.github.com>
2023-10-12 18:18:24 +02:00
Alekhya 880f386fbf
Revert "DENG1546 -Add a derived dataset for serp events (#4325)" (#4406)
This reverts commit f0b6089b86.
2023-10-10 21:06:12 -04:00
Alekhya f0b6089b86
DENG1546 -Add a derived dataset for serp events (#4325)
* Add a derived dataset for serp events

* Fix CI issues
2023-10-10 14:40:44 -04:00
Daniel Thorn b2a211eebc
DS-2642 - Add plan, product, and customer shipping info to stripe itemized report (#3998) 2023-10-10 10:33:31 -07:00
Sean Rose 95b32ad3ed
Fix `mozilla_vpn.main` view generated SQL (#4377)
* Quote unqualified field names in `glean_usage` app ping select expressions.

The field names may be reserved keywords in BigQuery (e.g. `range`).

* Generate correct data type for float fields in `glean_usage` app ping select expressions.

BigQuery currently requires the type be `FLOAT64`, not just `FLOAT`.

* Simplify `(SELECT ARRAY_AGG(...))` to `ARRAY(SELECT ...)` in `glean_usage` app ping select expressions.
2023-10-04 09:01:59 -07:00
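The first two fixes in #4377 above (quoting unqualified field names and emitting `FLOAT64`) can be illustrated with a small sketch. The helper names and the `CAST(NULL AS ...)` placeholder expression are hypothetical and chosen only for the example; the actual `glean_usage` generator may build its select expressions differently.

```python
# Only the FLOAT -> FLOAT64 mapping is taken from the commit message;
# other type names pass through unchanged in this sketch.
BIGQUERY_TYPE_ALIASES = {"FLOAT": "FLOAT64"}


def quote_field(name: str) -> str:
    """Backtick-quote a field name so reserved keywords such as `range` are safe."""
    return f"`{name}`"


def null_placeholder(name: str, field_type: str) -> str:
    """Build a typed NULL select expression for a field (hypothetical helper)."""
    upper = field_type.upper()
    bq_type = BIGQUERY_TYPE_ALIASES.get(upper, upper)
    return f"CAST(NULL AS {bq_type}) AS {quote_field(name)}"
```

For a field named `range` of type `FLOAT`, `null_placeholder("range", "FLOAT")` produces an expression that uses `FLOAT64` and a quoted alias.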
David Zeber 989cdb094e
Fix typo in open tab count (#4328) 2023-09-19 15:03:45 -04:00
Sean Rose 8a97f74794
Include VPN iOS native Glean events in VPN ETLs. (#4269) 2023-09-12 17:14:00 -07:00
Alekhya 199f027b87
Add clustering info to urlbar events (#4282) 2023-09-12 14:00:07 -04:00
Alekhya be07165ae0
Address Bar metrics (#4118)
* Initial metadata files

* Update dag to schedule the dataset

* correct the submission date

* Incorporate feedback

* Add num of results

* Additional improvements to query.sql

* Fix schema.yaml

* Fix query and schema.yaml

* Update sql/moz-fx-data-shared-prod/telemetry_derived/urlbar_search_sessions_v1/query.sql

Co-authored-by: Rebecca BurWei <rebecca.burwei@gmail.com>

* feat: UDF for mapping

* Update sql/moz-fx-data-shared-prod/telemetry_derived/urlbar_search_sessions_v1/query.sql

Co-authored-by: Rebecca BurWei <rebecca.burwei@gmail.com>

* Update sql/moz-fx-data-shared-prod/telemetry_derived/urlbar_search_sessions_v1/query.sql

Co-authored-by: Rebecca BurWei <rebecca.burwei@gmail.com>

* Update sql/moz-fx-data-shared-prod/telemetry_derived/urlbar_search_sessions_v1/query.sql

Co-authored-by: Rebecca BurWei <rebecca.burwei@gmail.com>

* Update sql/moz-fx-data-shared-prod/telemetry_derived/urlbar_search_sessions_v1/query.sql

Co-authored-by: Rebecca BurWei <rebecca.burwei@gmail.com>

* Update sql/moz-fx-data-shared-prod/telemetry_derived/urlbar_search_sessions_v1/query.sql

Co-authored-by: Rebecca BurWei <rebecca.burwei@gmail.com>

* Update sql/moz-fx-data-shared-prod/telemetry_derived/urlbar_search_sessions_v1/query.sql

Co-authored-by: Rebecca BurWei <rebecca.burwei@gmail.com>

* UDF related fix

* fix: cast to int

* fix: typos

* fix: remove cross join

* feat: remove intermediate events

* Fix sql error missing comma

* feat: use is_terminal instead of removing rows

* fix: typo in udf

* Move the urlbar events to sql_generators and remove from telemetry_derived

* Move urlbar events to sql_generators

* fix: rename to urlbar_events

* Bug fix

* Fix CI issues

---------

Co-authored-by: Rebecca BurWei <rebecca.burwei@gmail.com>
Co-authored-by: Rebecca BurWei <rburwei@mozilla.com>
Co-authored-by: Dave Zeber <dzeber@mozilla.com>
2023-09-05 12:42:40 -04:00
Alekhya fa47c7055b
DS-2998- Add attribution data to aua to mobile tables (#4153)
* DS-2998- Add attribution data to aua

Add att to aua

Fix sql format CI error

Fix CI Issue

* Fix the mobile scripts

Generate dags to fix CI

* Fix desktop query to remove um reference

Fix dag generate

Generate dags

* Regenerate firefox_ios and kpi_shredder dag

* Remove desktop query and bump mobile versions to v2

* Change to v2 for CODEOWNERS file

* revert desktop sql template to remove attr cols

* Incorporate changes based on Lucia's feedback

* Remove search columns from the final view
2023-08-10 14:47:20 -04:00
Lucia 8fa35469ab
DENG-1379 Remove app_name from searches filter and use join by client_id. (#4176)
Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
2023-08-10 19:25:18 +02:00
Lucia d0d67d5592
Shredder prototype (#3977)
* Setup query and metadata templates to generate the queries for active_users_aggregates_deletion_request tables.

* Separate the active_users KPIs from the calculation based on deletion requests.

* Move mobile view outside of the browsers loop.

* Update DAG, move partition_date to the end of the query.

* Update DAG, move partition_date to the end of the query, change from parameters in the metadata to filter dates in the query.

* Revert change to fenix_derived.

* Remove 7-day window update and update tests. Fix indentation.

* Set query parameters using Jinja template.

* Set query parameters using Jinja template.

* Using parameters in the query.

* Formatting.

* Update sql_generators/active_users_deletion_requests/templates/mobile_deletion_request_query.sql

Co-authored-by: Daniel Thorn <dthorn@mozilla.com>

* Formatting.

* Search table's date filter.

* Query only searches for clients with deletion request. Add required filter on submission_timestamp for table deletion_request.

* Format

---------

Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
Co-authored-by: Daniel Thorn <dthorn@mozilla.com>
2023-07-31 19:47:15 +02:00
Anna Scholtz a3100d7394
Fix skipping client_dedup (#4103) 2023-07-20 10:05:12 -07:00
Anna Scholtz a489ea11d8
Fix skip paths when generating app ping views (#4099) 2023-07-19 14:54:27 -07:00
Anna Scholtz a048ddbd1b
[Bug 1821784] Be explicit about skipping generation for certain Glean artifacts (#4069)
* Fix skipping certain Glean artifacts

* Explicitly skip specific glean_usage generated artifacts

* Review feedback
2023-07-14 11:54:00 -07:00
kik-kik 9b5c04a7bb
bug(1741487): Rename url2 and related fields in stable views (#4029)
* Bug 1741487 - Rename url2 and related fields in stable views

This removes the following unpopulated fields from Glean views: `metrics.url`, `metrics.text`, `metrics.jwe`, and `metrics.labeled_rate`. If any of these metrics exist in the source table under `2`-suffixed name, it is also aliased to its original name (`url2` to `url` and so on).
Suffixed fields are still preserved until view consumers migrate.

* Remove redundant comma from generated sql

* Ignore missing fields in views if any of them were removed

* added a todo comment

* Added additional context around why we are excluding some of the non-suffixed fields and why we are aliasing some fields to remove the `2` suffix

---------

Co-authored-by: Arkadiusz Komarzewski <akomarzewski@mozilla.com>
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
2023-07-10 09:31:15 -07:00
Anna Scholtz 0f3e54346c
Split main_1pct (#4022) 2023-07-05 13:00:29 -07:00
Anna Scholtz 1bade17436
[Bug 1839733] Fix stable view generation for use counter tables (#3981) 2023-06-23 14:13:39 -07:00
Sean Rose 1d3030e698
Remove ZetaSQL kludges. (#3898)
ZetaSQL was removed in #3755.
2023-06-05 18:03:07 +00:00
Lucia dd4789c8aa
DENG-970 Only Glean in Focus Android view. (#3877)
* DENG-970 Only Glean in Focus Android view.

* DENG-970 Only Glean in Focus Android view.

* DENG-970 Only Glean in Focus Android view.

* DENG-970 Only Glean in Focus Android view.

* DENG-970 Only Glean in Focus Android view.

* DENG-970 Only Glean in Focus Android view.

* DENG-970 CI fix

* DENG-970 CI failure fix. Related to issue 3889.

* Fix UDF dependencies deploy on stage

* DENG-970 Revert specific calling to dataset for UDF.

---------

Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
Co-authored-by: Brad Ochocki <brad.ochocki@gmail.com>
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
2023-06-02 16:37:40 +00:00
Anna Scholtz 94d28a329f
Review for #3787 (#3791)
* Bug 1823627 - Normalize the channel based on probeinfo data in UNIONized views

* Handle fenix channel normalization for app pings

* Parallelize stage schema deploys

* Fix schema field order

---------

Co-authored-by: Jan-Erik Rediger <jrediger@mozilla.com>
2023-06-01 07:22:27 -07:00
Anna Scholtz c793ef14b8
Add normalized_app_id manually to ping view schemas (#3854) 2023-05-24 13:25:43 -07:00
Daniel Thorn a0d810275b
Remove java dependency in favor of sqlglot (#3755) 2023-05-17 14:56:42 -07:00
Lucia 7c4b84ed32
DENG-774. Test change control for active users after updating the CODEOWNERS file. (#3773)
Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
2023-05-15 23:21:05 +02:00
Lucia be0c2861cb
DENG 710 Update telemetry active users aggregates view (#3692)
* DENG-742 Add os_grouped to active_users views.

* DENG-516 Split Browser version (app_version).

* DENG-710 Add os_grouped and update telemetry.active_users_aggregates view to use the new views per browser.

* DENG-710 Rename QDAU to DAU.

---------

Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
Co-authored-by: Winnie Chan <10429026+wwyc@users.noreply.github.com>
2023-04-14 17:00:47 +02:00
Lucia c34653778f
DENG-774 Change Control for active_users_aggregates (#3687)
* DENG-774 Add change control to active_users_aggregates and test.

* DENG-774 Add test coverage.
---------

Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
2023-04-14 15:49:19 +02:00
Lucia 3988294941
DENG-774 Implement Glean migration for Focus Android in query (#3715)
* DENG-774 Implement Glean migration for Focus Android in query instead of view.
* DENG-774 DAGs updated.

---------

Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
2023-04-12 13:04:15 -07:00
Anna Scholtz 48d8c7603d
Metric hub integration - rewrite SSL ratios to use metrics (#3698)
* Add metrics.data_source()

* Rewrite SSL ratios to use metrics

* Fix docs formatting
2023-04-04 15:41:44 -07:00
Lucia 586eacd666
DENG-762 Adjust mobile views to query Glean data (#3690)
* DENG-762 Adjust the Focus Android view to query Glean data starting 2023-01-01.

* DENG-762 Align views and DAG.

* DENG-762 Update DAG bqetl_core

---------

Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
Co-authored-by: Winnie Chan <10429026+wwyc@users.noreply.github.com>
2023-04-03 11:29:35 -07:00
Anna Scholtz 08b45a40fe
Jinja queries support (#3691)
* Support Jinja templating in query files

* Formatting for Jinja

* ./bqetl query render command

* Fix running templates
2023-03-30 11:00:12 -07:00
kik-kik 51def19185
Bug 1825545 - Revert "Support Jinja templating in query files (#3685)" (#3689)
Bug 1825545 - This reverts commit a1c51124ec.
2023-03-30 10:47:17 -04:00
Anna Scholtz a1c51124ec
Support Jinja templating in query files (#3685)
* Support Jinja templating in query files

* Formatting for Jinja

* ./bqetl query render command
2023-03-29 10:38:08 -07:00
Glenda Leonard 3bda8c1d8c
Adding 3 more Google Analytics country spellings. (#3686) 2023-03-27 14:12:37 -04:00
Lucia fe8cf4ddcd
DENG-742 Add os_grouped to active_users views. (#3666)
Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
2023-03-22 19:51:12 +01:00
Glenda Leonard 25c5fede25
Added new country mappings for Google Analytics. (#3675) 2023-03-22 10:26:31 -04:00
Winnie Chan 763c316a96
Updated locale field (#3663) 2023-03-15 15:53:01 -07:00
Anna Scholtz ef75f1704b
Add validation_failed counts to experiment monitoring datasets (#3623) 2023-03-15 14:34:05 -07:00
Lucia 278159fc38
DENG-742 Unioned view for mobile active users (#3658)
* DENG-742 Add unioned view for mobile browsers.

---------

Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
2023-03-13 17:57:52 +01:00
Winnie Chan 66a365499f
Updated os version (#3646)
Co-authored-by: Lucia <30448600+lucia-vargas-a@users.noreply.github.com>
2023-03-13 08:38:15 -07:00
Lucia 95629682af
DENG-682 sql-generators to create active_users_aggregates per-browser (#3609)
* DENG-682 Refactor active_users_aggregates to be generated per-browser, change source for mobile data to clients_last_seen_joined and for desktop data to telemetry.main.

* DENG-682 header fix.

* DENG-682 format fix.

* DENG-682 fixed header.

* DENG-682. Adjust parameters for write_sql function.

* DENG-682. Adjust for dag generation

* DENG-682. Adjust for dag generation

* regenerated bqetl_analytics_aggregations.py

* regenerated dags which use active_users_aggregates sql_generators

* DENG-698 Focus Android query includes Legacy and Glean.

* DENG-698 DAG bqetl_core external dependency added.

* DENG-698 Column name update.

* DENG-698 Syntax in DAG.

---------

Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
Co-authored-by: kik-kik <kignasiak@mozilla.com>
2023-03-08 13:35:22 +01:00
Sean Rose 0d9e2ce7c1
Add `telemetry_agent` column to VPN `events_daily` table (VPN-3934) (#3603) 2023-02-22 10:10:35 -08:00
Anna Scholtz df92a47b55
[Bug 1816621] Generate nested type information for unioned Glean ping views (#3592) 2023-02-14 12:30:13 -08:00
Anna Scholtz cd7a5c3730
Prevent glean_usage generation if referenced dataset doesn't exist (#3574)
* Prevent glean_usage generation if referenced dataset doesn't exist

* Update sql_generators/glean_usage/common.py

Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>

---------

Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>
2023-02-09 11:26:52 -08:00
Anna Scholtz 172a88fdfd
Add normalized_app_id to cross channel views (#3571) 2023-02-07 09:56:52 -08:00
Sean Rose 2d84a1d3b7
Change `bqetl format` to improve readability of `CASE` statements (#3546)
* Indent `WHEN` and `ELSE` clauses one level more than `CASE`.
* Indent `THEN` clauses one level more than the corresponding `WHEN` clause.
* Have the content of `WHEN`, `THEN`, and `ELSE` clauses start on the same line as the clause keyword.
* Allow an alias, comma, or dot right after a `CASE` statement's `END`.
2023-02-03 14:35:59 -08:00
Anna Scholtz 20f764835f
Add --use_cloud_function option when generating SQL queries (#3565) 2023-02-03 14:10:04 -08:00
Anna Scholtz f656fc467b
Add type information to glean usage generated app views (#3563) 2023-02-02 10:07:40 -08:00
Lucia 6560936305
DENG-601 Update sql-generators README with deployment info. (#3562)
Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
2023-02-02 18:48:43 +01:00
Lucia 500c2d2dc6
DENG-601 Add is_default_browser and uri_count for Klar iOS (#3560)
* DENG-601 Add is_default_browser and uri_count to org_mozilla_ios_klar.metrics.
* DENG-601 Add DAGs required by CI.
---------

Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
2023-02-02 17:13:16 +01:00
Anna Scholtz 6f0f576d66
Fix Glean Usage view generation for repeated nested fields (#3559) 2023-02-01 16:41:25 -08:00
Anna Scholtz 30d7507f02
[Bug 1712332] Generate UNIONed app ping views (#3545)
* Generate UNIONed app ping views

* Address review feedback
2023-02-01 14:00:06 -08:00
akkomar a0a4e08a83
Bug 1805722 - Deprecate bqetl_event_rollup (#3542) 2023-01-30 09:35:12 -08:00
Sean Rose a46678ed96
Change `bqetl format` to uppercase built-in function names (#3536) 2023-01-26 16:07:10 -08:00
Sean Rose afea1c35b9
Remove dependencies on the `mozdata` project from ETL (#3496)
Having dependencies on things in `mozdata` can cause issues with deployments, as deploying things to `mozdata` is usually a separate secondary step.
2023-01-23 08:59:04 -08:00
Anna Scholtz 031320dfcf
Move check for dags being up to date after DAGs have been generated (#3508) 2023-01-17 08:49:29 -08:00
Sean Rose 495ddcf39f
Correct BigQuery partitioning/clustering metadata in ETL `metadata.yaml` files (#3500)
* Add missing BigQuery partitioning/clustering metadata.
* Correct existing BigQuery partitioning/clustering metadata.
* Allow partition `field` metadata field to be omitted.
* List partition `type` metadata field first.
2023-01-12 13:58:53 -08:00
Anna Scholtz 0ce9c1dc2b
Write public-facing dataset metadata for glean_usage only if tables exist (#3494) 2023-01-10 15:04:42 -08:00
Daniel Thorn d196c3d8ac
Skip main and its clones when autogenerating stable view schemas (#3478) 2022-12-20 10:15:44 -08:00
Sean Rose 756f4aea23
Reapply "Have Glean stable views parse datetime metrics as timestamps (bug 1771733) (#3343)" (#3453)
This reverts commit c43bb6c608.
2022-12-12 16:54:30 -08:00
Anna Scholtz 88a7c94422 Apply skip list for glean app generation 2022-12-08 12:25:53 -08:00
Anna Scholtz 5612836323 Automatically generate tables for Glean apps 2022-12-08 11:09:09 -08:00
Sean Rose c43bb6c608
Revert "Have Glean stable views parse datetime metrics as timestamps (bug 1771733) (#3343)" (#3394)
This reverts commit b9fcdd9470.
2022-12-01 12:01:17 -08:00
Sean Rose 32ab51ba83
Only merge descriptions from Glean stable table schemas to views (#3391) 2022-12-01 09:50:01 -08:00
Sean Rose b9fcdd9470
Have Glean stable views parse datetime metrics as timestamps (bug 1771733) (#3343) 2022-11-30 15:13:18 -08:00
Anna Scholtz 27fbf59ba0
Ensure target path exists when generating country_code_lookup (#3362) 2022-11-17 10:59:41 +01:00
Lucia 9d650106d6
DENG-452. Update the query template for mobile to set the correct que… (#3328)
* DENG-452. Update the query template for mobile to set the correct query for uri_count and is_default_browser.

* Update metrics_templating.yaml

Add space.

* Removing placeholder comment in template.

Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
2022-11-03 10:38:49 +01:00
Lucia 87055148a9
DENG-437. Make columns available for sql generator Focus Android (is_default_browser, uri_count) (#3315)
Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
2022-11-01 09:15:56 -04:00
Anna Scholtz adaa9ab52e Add docs for glean_usage 2022-10-25 09:15:09 -07:00
Alekhya 56983f3e8d
Add Glean iOS Focus and Klar to search metrics (#3285)
correct the column for default search engine

add tests and iOS to views
2022-10-18 13:52:53 -04:00
Chris H-C 4d0a587798
Bug 1792965 - Use the established normalized_channel in events_unnested for single-dataset apps (#3253)
If there isn't a separate app_id per channel, we'll get "release" overriding
the already-correct-and-normalized `normalized_channel` from the source table.
2022-10-11 19:11:59 +02:00
Daniel Thorn 386bd0ccc0
Remove duplicates from core_clients_first_seen macro (#3268) 2022-10-11 09:56:05 -07:00
kik-kik ccd81f8116
bug(1788650): Added dedup step at the end of the query to ensure we only end up with a single entry per client_id (#3201)
* added dedup step at the end of the query to ensure we only end up with a single entry per client_id

* ensuring no duplicates in core_first_seen macro

* Update sql_generators/glean_usage/templates/baseline_clients_first_seen_v1.query.sql

Co-authored-by: Daniel Thorn <dthorn@mozilla.com>

* Update sql_generators/glean_usage/templates/baseline_clients_first_seen_v1.init.sql

Co-authored-by: Daniel Thorn <dthorn@mozilla.com>

* Update sql_generators/glean_usage/templates/macros.sql

Co-authored-by: Daniel Thorn <dthorn@mozilla.com>

* removed min at the end

Co-authored-by: Daniel Thorn <dthorn@mozilla.com>
2022-10-10 20:24:18 +01:00
Sean Rose 04d4c3ef5f
Fix `events_unnested` views to consistently select the correct fields (#3167) 2022-08-19 21:29:13 +00:00
Sean Rose eaa5f1834f
Don't overwrite existing files in glean_usage generation (DENG-210) (#3149)
This allows manual overrides of generated files by checking them into the sql/ tree of the default branch of the repository.

Correspondingly, this also deletes previously generated glean_usage files that weren't intended to be overrides.
2022-08-16 16:37:32 +00:00
Sean Rose 566b1f601a
Create unified views of VPN apps' Glean data (DENG-210) (#3140) 2022-08-12 22:59:23 +00:00
Alexander f99f112336
Android Focus search ETL - DO-824, Bug 1749833 (#2682)
Added glean data for Focus on Android to `mobile_search_clients_daily_v1`
2022-07-26 16:25:32 -04:00
Anna Scholtz daae227108 Add date_partition_offset tests 2022-07-05 09:26:21 -07:00
Anna Scholtz fc608f098c Add experiment enrollment and search monitoring for focus nightly and beta 2022-06-29 14:28:58 -07:00
Anna Scholtz d59be2a336 [Bug 1770015] Allowlist for glean apps in glean_usage 2022-06-27 14:56:37 -07:00
Anna Scholtz 818d29c097 Handle duplicates for fennec for baseline_clients_first_seen 2022-05-06 16:53:25 -07:00
Jeff Klukas 0523713c2f Bug 1768029 Duplicates in baseline_clients_first_seen
See https://bugzilla.mozilla.org/show_bug.cgi?id=1768029
2022-05-06 16:53:25 -07:00
Jeff Klukas e5dd8f8aba
Bug 1757216 Add ISP field to baseline_clients_daily (#2928)
* Bug 1757216 Add ISP field to baseline_clients_daily

Replaces #2919

This will enable us to filter out clients sent by BrowserStack in downstream tables and queries.

* Add refs to PR and bug

* Remove test table in script

Co-authored-by: Anna Scholtz <anna@scholtzan.net>
2022-05-02 12:04:28 -04:00
Jeff Klukas a1ac4b975b
More generated metadata for glean_usage queries (#2925)
* Generate metadata file for baseline_clients_daily

* Remove generated org_mozilla_fenix_derived content

This used to be needed in-tree for testing, but we now generate before
we run tests in CI.

* Fixup metadata templates
2022-04-29 12:29:18 -07:00
Alexander Nicholson 9bbde247b5
Added ROW entry for country (#2893) 2022-04-14 17:11:32 -04:00
Alexander Nicholson 8bdb5cf92b
Added missing aliases (#2858) 2022-04-04 12:11:00 -04:00
Alexander Nicholson 72f8b898f5
Added unknown country aliases and entry (#2844) 2022-03-30 13:45:21 -04:00
Alexander Nicholson 5fc167f328
Added generator for country_names (#2829)
* Updated the data source for `country_names_v1`
The new source is a YAML file and sql generator
for easier verification of duplicates and contribution.

* Update static README

Co-authored-by: Jeff Klukas <jklukas@mozilla.com>
2022-03-28 17:14:32 -04:00
Anna Scholtz c37a152282
Ignore VPN when generating events_unnested (#2813) 2022-03-21 14:19:33 -07:00
Anna Scholtz db4bd02f61 Fix clients_last_seen joins 2022-03-01 11:58:38 -08:00
Alexander Nicholson 93d1c590c9
Changed feature_usage string to number comparisons (#2666)
* Changed feature_usage string to number comparisons

* Update sql/moz-fx-data-shared-prod/telemetry_derived/feature_usage_v2/metadata.yaml

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

* Updated feature_usage DAG owners

Co-authored-by: Anna Scholtz <anna@scholtzan.net>
2022-01-17 16:44:45 -05:00
Alexander Nicholson a837866bc5
Skip views to stable tables and UDFs; Dynamically extract view refs (#2663) 2022-01-17 13:38:02 -05:00
Alexander Nicholson 45655229d3
Added sql_generators script to create schema.yaml files for derived views (#2657) 2022-01-13 15:46:06 -05:00
Anna Scholtz d58c86b928 metrics_clients_last_seen join on channel 2022-01-11 13:42:59 -08:00
Anna Scholtz f6da873783 Move DAG generation to CI 2022-01-07 14:25:04 -08:00
Anna Scholtz 95332daeff Use generate all command in CI 2022-01-07 14:25:04 -08:00
Anna Scholtz 82b26c48c8 Move glean_usage to sql_generators/ 2022-01-07 14:25:04 -08:00
Anna Scholtz 5c35838dda Fix schema generation when generating SQL 2022-01-06 09:05:45 -08:00
Anna Scholtz cf966d2280 Move stable view generation into separate module 2022-01-05 12:26:52 -08:00
Anna Scholtz 89e0da4f2b Move stable view generation to sql_generators 2022-01-05 12:26:52 -08:00
Anna Scholtz 22b3dbd1b2 Move glean_usage to sql_generators/ 2022-01-05 12:26:52 -08:00
Anna Scholtz 0528cbe6c5 Fix glean_usage template permissions 2022-01-03 14:15:02 -08:00
Anna Scholtz 22fea1dca3 Add docs for sql_generators/ 2022-01-03 14:15:02 -08:00
Anna Scholtz 2b3eebe812 Update scripts and test for sql_generators/ 2022-01-03 14:15:02 -08:00
Anna Scholtz dab97f5305 Move search to sql_generators/ 2022-01-03 14:15:02 -08:00
Anna Scholtz 3f0b5373ce Move glean_usage to sql_generators/ 2022-01-03 14:15:02 -08:00
Anna Scholtz e0066cbd51 Move experiment_monitoring to sql_generators/ 2022-01-03 14:15:02 -08:00
Anna Scholtz d3727763bf Move feature_usage to sql_generators/ 2022-01-03 14:15:02 -08:00
Anna Scholtz cc4ddb783e Move events_daily to sql_generators/ 2022-01-03 14:15:02 -08:00