Commit graph

728 commits

Author SHA1 Message Date
Ben Wu 1e13a6c29a
Support results with no rows in matches_pattern check (#6521)
* Support results with no rows in matches_pattern check

* remove extra )
2024-11-19 18:13:07 +00:00
Lucia b83224393c
Improve checks to account for null replacements (#6508)
* Backfill info.

* Update tests,

* Update tests,
2024-11-15 17:03:02 +00:00
Ben Wu 3c13b9af65
Make shredder detect glean derived tables with client_info.client_id (#6459)
* Make shredder detect glean derived tables with client_info.client_id

* Add skip list
2024-11-12 23:01:47 +00:00
m-d-bowerman 221f44ca00
update telemetry for list cards (#6445)
* update telemetry for list cards

* update test yaml

* fix incorrect bool

---------

Co-authored-by: Chelsey Beck <64881557+chelseybeck@users.noreply.github.com>
2024-11-05 23:37:50 +00:00
Anna Scholtz 4ad4c6b9d4
Add CLI command to run Bigeye checks (#6383)
* Add support for defining custom SQL rules

* Added monitoring rollback command

* Reformat

* Add tests

* Address review feedback

* Add flags for deleting Bigeye checks

* Add support for defining custom SQL rules

* Added monitoring rollback command

* Add run command to trigger Bigeye checks

* Use bigquery_bigeye_check to trigger Bigeye runs

* Add unit tests for monitoring run CLI command

* Update DAG tests

* Fix imports

* Address review feedback

* Format custom sql rules file

* Remove bigconfig_custom_rules.sql

* Fix bigeye update for public datasets
2024-11-05 18:02:42 +00:00
Anna Scholtz 209b75ae57
Bigeye - Add support for deploying and removing custom SQL rules (#6379)
* Add support for defining custom SQL rules

* Added monitoring rollback command

* Reformat

* Add tests

* Address review feedback

* Add flags for deleting Bigeye checks

* Fix formatting

* Fix tests
2024-11-04 17:41:46 +00:00
Alekhya 7f72235eb4
Switch to baseline beginning Aug 01st 2024 (#6425)
* Switch to baseline beginning Aug 01st 2024

* Modify the search_engine_daily view to handle the cut off

* Fix sql format

* Skip mobile_clients_daily_v2 from dry run

* Remove sql_generator files populating v1

* Add tests for mobile_search_clients_daily_v2

* remove unwanted tests
2024-10-31 19:11:36 +00:00
Anna Scholtz 381107db71
Revert #6418 (#6429) 2024-10-31 15:41:23 +00:00
Anna Scholtz 2add865249
Speed up schema updates (#6418)
* Parallelize dependency graph

* Use GCP API to get table schema when not using cloud function

* Reuse GCP credentials

* Update dependency tests

* Remove print
2024-10-30 14:52:05 +00:00
Anna Scholtz 3b222a21e9
Parallelize metadata publish (#6403) 2024-10-28 19:03:08 +00:00
Sergio E. Betancourt 000a50831a
Adding spons tiles configured telemetry to newtab clients and visits (#6380)
* Adding spons tiles configured telemetry to newtab clients and visits

* ran bqetl format

* Adding new field to tests

* Adding new field to tests

* fix tests

---------

Co-authored-by: m-d-bowerman <mbowerman@mozilla.com>
2024-10-23 18:36:34 +00:00
kik-kik 405402e368
feat: tweak glean_usage generator to include install_source in derived baseline tables (#6173)
* feat: tweak glean_usage generator to include install_source in derived baseline tables

* Apply suggestions from code review

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

* Update sql_generators/glean_usage/templates/clients_last_seen_joined.query.sql

* feat: update the test definition to include install_source

---------

Co-authored-by: Anna Scholtz <anna@scholtzan.net>
2024-10-15 13:10:39 +00:00
kik-kik 221cd9b9c2
fix: only generate Airflow task for BigEye if monitoring enabled in the metadata (#6326) 2024-10-10 21:50:42 +00:00
m-d-bowerman 931b653e50
Add window rectangle to newtab visits (#6311)
* add window size and pocket list card telemetry

* add fields to pocket summary

* fix test errors

* additional fields and test
2024-10-08 20:36:52 +00:00
Anna Scholtz a1a12791aa
Update Bigeye warehouse ID (#6297)
* Update Bigeye warehouse ID

* Update test bigeye warehouse ID
2024-10-03 21:12:24 +00:00
kik-kik 4257698cc8
feat(DENG-4602): add BigEye RunMetricOperator to DAG generation (#6285)
* feat: add BigEye RunMetricOperator to DAG generation

* Fix Bigeye DAG generation and move configuration to bqetl_project.yaml

* Reformatting; test fixing

* Install Airflow override requirements

---------

Co-authored-by: Anna Scholtz <anna@scholtzan.net>
2024-10-03 15:13:27 +00:00
Ben Wu 66443eee29
Bug 1920544 Create view to union firefox desktop crashes (#6257) 2024-09-27 17:48:44 +00:00
Curtis Morales 247729d217
In validation, only read schema files when necessary (#6252)
* Remove deleted table from skip list

* Parse schema file in validate_shredder_mitigation function so it works on priv-bqetl

* Parse schema file in validate_shredder_mitigation function so it works on priv-bqetl

* Clean up test
2024-09-24 17:03:28 +00:00
Lucia 29bb468663
Deng 877 autogenerate checks during shredder mitigation (#6243)
* Reference to the shredder mitigation process during backfills.

* missing dash

* Auto-generate and run data checks. Validate shredder_mitigation label.

* Checks template.

* Fix tests.

* Update checks to include additional EXCEPT, use table_id in staging dataset,  and ensure that tests run and generate a failure that stops the backfill.
2024-09-24 15:35:25 +00:00
Anna Scholtz 66b37ada4a
Add support for secrets when generating Airflow DAGs (#6241)
* Add support for secrets when generating Airflow DAGs

* update test_publish_metadata owner

* Conditional import of Airflow Secrets
2024-09-23 19:58:43 +00:00
Lucia 074731db7b
CI validation of tables with the shredder_mitigation label (#6217)
* Larger wildcards to reduce the chance of collision with actual values.

* Formatting

* Update bigquery_etl/metadata/validate_metadata.py

Co-authored-by: Ben Wu <12437227+BenWu@users.noreply.github.com>

* Add test for validate_metadata.validate, add profile_id and profile_group_id to id-level_columns file.

* Update tests/cli/test_cli_metadata.py

Co-authored-by: Ben Wu <12437227+BenWu@users.noreply.github.com>

---------

Co-authored-by: Ben Wu <12437227+BenWu@users.noreply.github.com>
2024-09-18 18:44:06 +00:00
Lucia 68159a7f1d
Test mitigation (#6205)
* Add default values in template to fix sqlglot parsing error.

* Adding backfill_date to exception message. Formatting.

* Improve getting column dtypes by including the schema files. Change custom_query for custom_query_path for readability.

* DENG_4733. Improve getting column dtypes by including the schema files. Change custom_query for custom_query_path for readability.

* Formatting.

* Set values back to NULL where appropriate. Improve output information.

* Rename custom_query to custom_query_path to match the expected parameter.

* Missing import

* Larger wildcards to reduce the chance of collision with actual values.
2024-09-17 12:09:17 +00:00
Katie Windau 3bd53ce175
feat(DENG-4847): Add profile_group_id to telemetry_derived.clients_daily_scalar_aggregates_v1 (#6207)
* DENG-4847 update test schema json for main_v5 to include profile_group_id

* DENG-4847 add profile_group_id to clients_daily_scalar_aggregates_v1
2024-09-16 18:59:09 +00:00
Ben Wu 53f385510c
[DENG-4641] Add support for shredding per sample id to shredder (#6197)
* [DENG-4641] Add support for shredding per sample id to shredder

* Add comment explaining connection_pool_max_size
2024-09-16 17:34:38 +00:00
Lucia a770744e63
Enhancements to shredder mitigation (#6174)
* Add default values in template to fix sqlglot parsing error.

* Adding backfill_date to exception message. Formatting.

* Improve getting column dtypes by including the schema files. Change custom_query for custom_query_path for readability.

* DENG_4733. Improve getting column dtypes by including the schema files. Change custom_query for custom_query_path for readability.

* Formatting.
2024-09-12 16:09:51 +00:00
Lucia 9fc513ba9f
Generate query with shredder mitigation (#6060)
* Auxiliary functions required to generate the query for a backfill with shredder mitigation.

* Exception handling.

* isort & docstrings.

* Apply flake8 to test file.

* Remove variable assignment to different types.

* Make search case insensitive in function.

* Add test cases for function and update naming in a function's parameters for clarity.

* Update bigquery_etl/backfill/shredder_mitigation.py

Co-authored-by: Leli <33942105+lelilia@users.noreply.github.com>

* Add test cases for missing parameters or not matching parameters where expected. minimize the calls for get_bigquery_type().

* Encapsulate actions to generate and run custom queries to generate the subsets for shredder mitigation.

* Query template for shredder mitigation.

* Query template for shredder mitigation and formatting.

* Add check for "GROUP BY 1, 2, 3", improve code readability, remove unnecessary properties in classes.

* Test coverage. Check for "GROUP BY 1, 2, 3", improve readability, remove unrequired properties in class Subset.

* Increase test coverage. Expand DataType INTEGER required for UNION queries.

* Increase test coverage. Expand DataType INTEGER required for UNION queries.

* Separate INTEGER and NUMERIC types.

* Move util functions and convert method to property, both to resolve a circular import. Adjust tests. Update function return and tests.

* Adding backfill_date to exception message. Formatting.

* Adding backfill_date to exception message. Formatting.

---------

Co-authored-by: Leli <33942105+lelilia@users.noreply.github.com>
2024-09-06 16:21:56 +02:00
Anna Scholtz 7260510cc3
Add monitoring metadata support (#6152)
Co-authored-by: Alekhya <88394696+alekhyamoz@users.noreply.github.com>
2024-09-04 13:03:59 -04:00
Lucia 4ae4560eca
DENG-4660 Run shredder mitigation as managed backfill (#6142)
* Add parameter for custom query to managed backfills.

* Implement 'sidefill' using custom query and flag to run a backfill with shredder mitigation.

* Fix test.

* Fix tests.
2024-08-31 20:13:03 +02:00
Katie Windau dfab201473
DENG-4677 Add profile_group_id to search_clients_last_seen_v1 (#6139) 2024-08-30 12:06:16 -05:00
Alekhya 61c97ecc0a
Fix acer cohort case statement for search agg desktop (#6132)
* Fix acer cohort case statement for search agg desktop

* Fix tests
2024-08-29 17:56:40 -04:00
Katie Windau d2e0c5683c
DENG-4657 add profile_group_id to addon_aggregates_v2 (#6130) 2024-08-29 11:48:53 -05:00
Chelsey Beck c23d341e45
Revert "DENG-3503 updating to use v2 table (#6097)" (#6123)
This reverts commit 8ce9289a28.
2024-08-28 10:33:32 -07:00
Alekhya a3498659b8
Fix mobile os version for search related tables (#6111)
* Fix mobile os version

* Fix tests
2024-08-26 23:48:48 -04:00
Chelsey Beck 8ce9289a28
DENG-3503 updating to use v2 table (#6097) 2024-08-26 10:12:32 -07:00
Katie Windau c33846fc18
DENG-4588 add profile_group_id to newtab_visits_v1 (#6095)
* DENG-4588 add profile_group_id to newtab_visits_v1

* DENG-4588 update tests for newtab_visits_v1
2024-08-23 11:33:43 -05:00
Chelsey Beck c8593c9d8f
updating query to match schema (#6091)
* updating query to match schema

* adding new columns to test

---------

Co-authored-by: akkomar <akkomar@users.noreply.github.com>
2024-08-23 12:07:19 +02:00
Katie Windau c1c690e3a6
DENG-4564 add profile_group_id to urlbar_clients_daily_v1 (#6079)
* DENG-4564 add profile_group_id to urlbar_clients_daily_v1

* DENG-4564 Add description for profile_group_id

* DENG-4564 add profile_group_id to the urlbar_clients_daily_v1 sql tests
2024-08-21 12:23:27 -05:00
Katie Windau 36739bd0b5
DENG-4515 Add profile_group_id to search_clients_daily_v8 (#6064)
Co-authored-by: Marlene Hirose <92952117+Marlene-M-Hirose@users.noreply.github.com>
2024-08-19 07:04:40 -05:00
Alexander 4cfa504e49
chore(deploys): Catch and raise for missing dataset/project in table deploys (#6067)
* chore(deploys): Catch NotFound for missing dataset/project in table deploys

* Update bigquery_etl/deploy.py

Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>

* update test exception text match

---------

Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>
2024-08-15 13:56:36 -04:00
m-d-bowerman 435422c94e
Add topic selection fields to newtab_visits (#6004)
* Add topic selection fields to newtab_visits

* Remove duplicate from schema

* Add new metric to test schema

* Fix event name in test yaml

* Test yaml fix

* Add visit id to test case

---------

Co-authored-by: Chelsey Beck <64881557+chelseybeck@users.noreply.github.com>
2024-08-15 09:32:23 -07:00
Alekhya 47a72c9f42
Add is_sap_monetizable and normalize the search engine (#5907)
* Add is_sap_monetizable to aggregates table

* Fix tests


* add is_acer_cohort


* Add normalized_default_search_engine column for mobile and desktop

* Fix test cases
2024-08-14 18:00:58 -04:00
Katie Windau fb064e8f4f
DENG-4481 Add profile_group_id to telemetry_derived.clients_first_see… (#6044) 2024-08-09 22:10:17 -05:00
Lucia 851ac84f17
Auxiliary functions for shredder mitigation (#6002)
* Auxiliary functions required to generate the query for a backfill with shredder mitigation.

* Exception handling.

* isort & docstrings.

* Apply flake8 to test file.

* Remove variable assignment to different types.

* Make search case insensitive in function.

* Add test cases for function and update naming in a function's parameters for clarity.

* Update bigquery_etl/backfill/shredder_mitigation.py

Co-authored-by: Leli <33942105+lelilia@users.noreply.github.com>

* Add test cases for missing parameters or not matching parameters where expected. minimize the calls for get_bigquery_type().

---------

Co-authored-by: Leli <33942105+lelilia@users.noreply.github.com>
2024-08-05 20:02:16 +02:00
Winnie Chan 6bf63b0dd4
DENG-4283: Updated managed backfills backup table name (#5908)
* Updated back up table name

---------

Co-authored-by: Alexander <anicholson@mozilla.com>
2024-07-25 14:47:11 -07:00
Eduardo Filho b994884098
GLAM purge percentile calculations and prep downstream (#5966)
* Remove percentiles

* Remove tests that test percentiles

* Refresh scripts insert null to new percentiles

* Remove percentile columns from queries and schemas

* Delete more percentile tables

* Formatting

* histogram_cast_struct's keys are strings

* Re-add test after fixing failure cause
2024-07-25 10:44:43 -04:00
Katie Windau c037e4fbb4
DENG-4201 Remove GA3 tables from marketing-prod (they are now in shared-prod instead) (#5917)
* DENG-4201 remove blogs_goals_v1 since now in shared prod

* DENG-4201 remove blogs_sessions_v1 from mktg prod since now in shared prod

* DENG-4201 remove www_site_hits_v1 from mktg prod since now in shared prod

* DENG-4201 remove firefox_whatsnew_summary_v1 from mktg prod since now in shared prod

* DENG-4201 remove www_site_downloads_v1 from mktg prod since now in shared prod

* DENG-4201 remove www_site_events_metrics_v1 from mktg prod since now moved to shared prod

* DENG-4201 remove www_site_landing_page_metrics_v1 from mktg prod since now moved to shared prod

* DENG-4201 remove www_site_page_metrics_v1 from mktg prod since now moved to shared prod

* DENG-4201 remove blogs_daily_summary_v1 from mktg prod since now moved to shared prod

* DENG-4201 remove blogs_landing_page_summary_v1 from mktg prod since now moved to shared prod

* DENG-4201 remove blogs_empty_check_v1

* DENG-4201 remove www_site_empty_check_v1

* DENG-4201 remove www_site_metrics_summary_v1 from mktg prod since now moved to shared prod

* DENG-4201 remove dl with attr v1 and v2 from mktg prod since now in shared prod instead

* DENG-4201 remove tests from mktg prod since tables no longer live in mktg prod
2024-07-23 19:01:15 -05:00
Chelsey Beck 7687f52c64
adding location selected to struct (#5960) 2024-07-23 11:47:33 -07:00
Ben Wu ea890a07b4
Update shredder config for new ltv tables to use legacy deletions (#5958) 2024-07-23 10:40:52 -04:00
Eduardo Filho d0b95438c3
GLAM: Do legacy non-normalized aggregations in the same pass as normalized (#5935)
* GLAM: Do legacy non-normalized aggregations in the same pass as normalized

* Add missing eof space

* Fix tests
2024-07-18 13:16:06 -04:00
Anna Scholtz 57bd939905
Fully qualified identifiers in SQL queries (#5764)
* Add fully-qualified identifiers when formatting queries

* Fully-qualified identifiers for queries in sql/

* Check in only formatted SQL to generated-sql branch

* Add comment

* Fully qualify more tables

* Fully qualify test files

* Formatting improvements around CTEs and unit tests

* Option to skip auto qualifying queries
2024-06-27 09:53:33 -07:00