Commit graph

707 commits

Author SHA1 Message Date
Lucia 68159a7f1d
Test mitigation (#6205)
* Add default values in template to fix sqlglot parsing error.

* Adding backfill_date to exception message. Formatting.

* Improve getting column dtypes by including the schema files. Change custom_query to custom_query_path for readability.

* DENG_4733. Improve getting column dtypes by including the schema files. Change custom_query to custom_query_path for readability.

* Formatting.

* Set values back to NULL where applicable. Improve output information.

* Rename custom_query to custom_query_path to match the expected parameter.

* Missing import

* Larger wildcards to reduce the chance of collision with actual values.
2024-09-17 12:09:17 +00:00
Katie Windau 3bd53ce175
feat(DENG-4847): Add profile_group_id to telemetry_derived.clients_daily_scalar_aggregates_v1 (#6207)
* DENG-4847 update test schema json for main_v5 to include profile_group_id

* DENG-4847 add profile_group_id to clients_daily_scalar_aggregates_v1
2024-09-16 18:59:09 +00:00
Ben Wu 53f385510c
[DENG-4641] Add support for shredding per sample id to shredder (#6197)
* [DENG-4641] Add support for shredding per sample id to shredder

* Add comment explaining connection_pool_max_size
2024-09-16 17:34:38 +00:00
Lucia a770744e63
Enhancements to shredder mitigation (#6174)
* Add default values in template to fix sqlglot parsing error.

* Adding backfill_date to exception message. Formatting.

* Improve getting column dtypes by including the schema files. Change custom_query to custom_query_path for readability.

* DENG_4733. Improve getting column dtypes by including the schema files. Change custom_query to custom_query_path for readability.

* Formatting.
2024-09-12 16:09:51 +00:00
Lucia 9fc513ba9f
Generate query with shredder mitigation (#6060)
* Auxiliary functions required to generate the query for a backfill with shredder mitigation.

* Exception handling.

* isort & docstrings.

* Apply flake8 to test file.

* Remove variable assignment to different types.

* Make search case insensitive in function.

* Add test cases for function and update naming in a function's parameters for clarity.

* Update bigquery_etl/backfill/shredder_mitigation.py

Co-authored-by: Leli <33942105+lelilia@users.noreply.github.com>

* Add test cases for missing parameters or not matching parameters where expected. Minimize the calls for get_bigquery_type().

* Encapsulate actions to generate and run custom queries to generate the subsets for shredder mitigation.

* Query template for shredder mitigation.

* Query template for shredder mitigation and formatting.

* Add check for "GROUP BY 1, 2, 3", improve code readability, remove unnecessary properties in classes.

* Test coverage. Check for "GROUP BY 1, 2, 3", improve readability, remove unrequired properties in class Subset.

* Increase test coverage. Expand DataType INTEGER required for UNION queries.

* Increase test coverage. Expand DataType INTEGER required for UNION queries.

* Separate INTEGER and NUMERIC types.

* Move util functions and convert method to property, both to resolve a circular import. Adjust tests. Update function return and tests.

* Adding backfill_date to exception message. Formatting.

* Adding backfill_date to exception message. Formatting.

---------

Co-authored-by: Leli <33942105+lelilia@users.noreply.github.com>
2024-09-06 16:21:56 +02:00
Anna Scholtz 7260510cc3
Add monitoring metadata support (#6152)
Co-authored-by: Alekhya <88394696+alekhyamoz@users.noreply.github.com>
2024-09-04 13:03:59 -04:00
Lucia 4ae4560eca
DENG-4660 Run shredder mitigation as managed backfill (#6142)
* Add parameter for custom query to managed backfills.

* Implement 'sidefill' using custom query and flag to run a backfill with shredder mitigation.

* Fix test.

* Fix tests.
2024-08-31 20:13:03 +02:00
Katie Windau dfab201473
DENG-4677 Add profile_group_id to search_clients_last_seen_v1 (#6139)
2024-08-30 12:06:16 -05:00
Alekhya 61c97ecc0a
Fix acer cohort case statement for search agg desktop (#6132)
* Fix acer cohort case statement for search agg desktop

* Fix tests
2024-08-29 17:56:40 -04:00
Katie Windau d2e0c5683c
DENG-4657 add profile_group_id to addon_aggregates_v2 (#6130)
2024-08-29 11:48:53 -05:00
Chelsey Beck c23d341e45
Revert "DENG-3503 updating to use v2 table (#6097)" (#6123)
This reverts commit 8ce9289a28.
2024-08-28 10:33:32 -07:00
Alekhya a3498659b8
Fix mobile os version for search related tables (#6111)
* Fix mobile os version

* Fix tests
2024-08-26 23:48:48 -04:00
Chelsey Beck 8ce9289a28
DENG-3503 updating to use v2 table (#6097)
2024-08-26 10:12:32 -07:00
Katie Windau c33846fc18
DENG-4588 add profile_group_id to newtab_visits_v1 (#6095)
* DENG-4588 add profile_group_id to newtab_visits_v1

* DENG-4588 update tests for newtab_visits_v1
2024-08-23 11:33:43 -05:00
Chelsey Beck c8593c9d8f
updating query to match schema (#6091)
* updating query to match schema

* adding new columns to test

---------

Co-authored-by: akkomar <akkomar@users.noreply.github.com>
2024-08-23 12:07:19 +02:00
Katie Windau c1c690e3a6
DENG-4564 add profile_group_id to urlbar_clients_daily_v1 (#6079)
* DENG-4564 add profile_group_id to urlbar_clients_daily_v1

* DENG-4564 Add description for profile_group_id

* DENG-4564 add profile_group_id to the urlbar_clients_daily_v1 sql tests
2024-08-21 12:23:27 -05:00
Katie Windau 36739bd0b5
DENG-4515 Add profile_group_id to search_clients_daily_v8 (#6064)
Co-authored-by: Marlene Hirose <92952117+Marlene-M-Hirose@users.noreply.github.com>
2024-08-19 07:04:40 -05:00
Alexander 4cfa504e49
chore(deploys): Catch and raise for missing dataset/project in table deploys (#6067)
* chore(deploys): Catch NotFound for missing dataset/project in table deploys

* Update bigquery_etl/deploy.py

Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>

* update test exception text match

---------

Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>
2024-08-15 13:56:36 -04:00
m-d-bowerman 435422c94e
Add topic selection fields to newtab_visits (#6004)
* Add topic selection fields to newtab_visits

* Remove duplicate from schema

* Add new metric to test schema

* Fix event name in test yaml

* Test yaml fix

* Add visit id to test case

---------

Co-authored-by: Chelsey Beck <64881557+chelseybeck@users.noreply.github.com>
2024-08-15 09:32:23 -07:00
Alekhya 47a72c9f42
Add is_sap_monetizable and normalize the search engine (#5907)
* Add is_sap_monetizable to aggregates table

* Fix tests


* add is_acer_cohort


* Add normalized_default_search_engine column for mobile and desktop

* Fix test cases
2024-08-14 18:00:58 -04:00
Katie Windau fb064e8f4f
DENG-4481 Add profile_group_id to telemetry_derived.clients_first_see… (#6044)
2024-08-09 22:10:17 -05:00
Lucia 851ac84f17
Auxiliary functions for shredder mitigation (#6002)
* Auxiliary functions required to generate the query for a backfill with shredder mitigation.

* Exception handling.

* isort & docstrings.

* Apply flake8 to test file.

* Remove variable assignment to different types.

* Make search case insensitive in function.

* Add test cases for function and update naming in a function's parameters for clarity.

* Update bigquery_etl/backfill/shredder_mitigation.py

Co-authored-by: Leli <33942105+lelilia@users.noreply.github.com>

* Add test cases for missing parameters or not matching parameters where expected. Minimize the calls for get_bigquery_type().

---------

Co-authored-by: Leli <33942105+lelilia@users.noreply.github.com>
2024-08-05 20:02:16 +02:00
Winnie Chan 6bf63b0dd4
DENG-4283: Updated managed backfills backup table name (#5908)
* Updated back up table name

---------

Co-authored-by: Alexander <anicholson@mozilla.com>
2024-07-25 14:47:11 -07:00
Eduardo Filho b994884098
GLAM purge percentile calculations and prep downstream (#5966)
* Remove percentiles

* Remove tests that test percentiles

* Refresh scripts insert null to new percentiles

* Remove percentile columns from queries and schemas

* Delete more percentile tables

* Formatting

* histogram_cast_struct's keys are strings

* Re-add test after fixing failure cause
2024-07-25 10:44:43 -04:00
Katie Windau c037e4fbb4
DENG-4201 Remove GA3 tables from marketing-prod (they are now in shared-prod instead) (#5917)
* DENG-4201 remove blogs_goals_v1 since now in shared prod

* DENG-4201 remove blogs_sessions_v1 from mktg prod since now in shared prod

* DENG-4201 remove www_site_hits_v1 from mktg prod since now in shared prod

* DENG-4201 remove firefox_whatsnew_summary_v1 from mktg prod since now in shared prod

* DENG-4201 remove www_site_downloads_v1 from mktg prod since now in shared prod

* DENG-4201 remove www_site_events_metrics_v1 from mktg prod since now moved to shared prod

* DENG-4201 remove www_site_landing_page_metrics_v1 from mktg prod since now moved to shared prod

* DENG-4201 remove www_site_page_metrics_v1 from mktg prod since now moved to shared prod

* DENG-4201 remove blogs_daily_summary_v1 from mktg prod since now moved to shared prod

* DENG-4201 remove blogs_landing_page_summary_v1 from mktg prod since now moved to shared prod

* DENG-4201 remove blogs_empty_check_v1

* DENG-4201 remove www_site_empty_check_v1

* DENG-4201 remove www_site_metrics_summary_v1 from mktg prod since now moved to shared prod

* DENG-4201 remove dl with attr v1 and v2 from mktg prod since now in shared prod instead

* DENG-4201 remove tests from mktg prod since tables no longer live in mktg prod
2024-07-23 19:01:15 -05:00
Chelsey Beck 7687f52c64
adding location selected to struct (#5960)
2024-07-23 11:47:33 -07:00
Ben Wu ea890a07b4
Update shredder config for new ltv tables to use legacy deletions (#5958)
2024-07-23 10:40:52 -04:00
Eduardo Filho d0b95438c3
GLAM: Do legacy non-normalized aggregations in the same pass as normalized (#5935)
* GLAM: Do legacy non-normalized aggregations in the same pass as normalized

* Add missing eof space

* Fix tests
2024-07-18 13:16:06 -04:00
Anna Scholtz 57bd939905
Fully qualified identifiers in SQL queries (#5764)
* Add fully-qualified identifiers when formatting queries

* Fully-qualified identifiers for queries in sql/

* Check in only formatted SQL to generated-sql branch

* Add comment

* Fully qualify more tables

* Fully qualify test files

* Formatting improvements around CTEs and unit tests

* Option to skip auto qualifying queries
2024-06-27 09:53:33 -07:00
Alekhya efc6aaced3
Add normalized_app_name_os to mobile search aggregate table (#5850)
* Add normalized_app_name_os to mobile search aggregate table

* Fix CI

Fix CI
2024-06-27 10:58:46 -04:00
Alekhya 37791da059
RS 959 - Add OS breakouts (#5767)
* RS 959 - Add OS breakouts

* Add mobile OS breakouts

Add mobile OS breakouts

* Add mobile OS breakouts

* Add tests for desktop

* Add mobile tests

* Fix tests

* Fix CI issue
2024-06-24 21:48:08 -04:00
Chelsey Beck 5923deadc4
Deng 4083 fix checks schedule (#5790)
* moving schedule back 30 minutes

* removing v2 snapshot and updating description

* remove uniqueness check

* adding check to snapshot

* updating description
2024-06-17 14:55:48 -07:00
m-d-bowerman 5507129f6e
Newtab table default UI pocket (#5784)
* Add pocket tile ID

* Add new weather, wallpaper telemetry

* Update newtab_visits_v1 tests

* Fix tests again

* Update metadata

* Add new boolean to test JSON

* Fix tests

* Add category metric to test values

* test

---------

Co-authored-by: Chelsey Beck <64881557+chelseybeck@users.noreply.github.com>
2024-06-17 12:26:59 -07:00
Sean Rose 4b8574f3bd
Fix bugs in `derived_view_schemas` SQL generator (#5592)
* Limit `derived_view_schemas` SQL generator to actual view directories.

* Fix the `derived_view_schemas` SQL generator to get the view schemas by dry-running their latest SQL and/or from their latest `schema.yaml` file.  Getting the schema from the currently deployed view wasn't appropriate because it wouldn't reflect the latest view code.

* Rename `View.view_schema` to `View.schema`.

* Change `View` so its schema dry-runs use the cloud function (CI doesn't have permission to run dry-run queries directly).

* Apply partition column filters in view dry-run queries when possible for speed/efficiency.

* Don't allow missing fields to prevent view schema enrichment.

* Only copy column descriptions during view schema enrichment.

* Only try enriching view schemas from their reference table `schema.yaml` files if those files actually exist.

* Change `main_1pct` view to select directly from the `main_remainder_1pct_v1` table, so the `derived_view_schemas` SQL generator can detect the partition column to use and successfully dry-run the view to determine its schema.

* Formalize the order `bqetl generate all` runs the SQL generators in.

* Have `bqetl generate all` run `derived_view_schemas` last, in case other SQL generators create derived views.

* Fix `Schema._traverse()` to only recurse if both fields are records.
2024-06-07 19:05:36 -07:00
Winnie Chan c58c7fbd5c
DENG-3869 Create schema before deploying backfill staging table (#5643)
* Added schema from query file

---------

Co-authored-by: Alexander <anicholson@mozilla.com>
2024-06-07 12:47:09 -07:00
Sean Rose 5a7d7985c1
Fix `format_timedelta` function's parsing of negative timedeltas (#5740)
* Fix `format_timedelta` function's parsing of negative timedeltas.

The entire timedelta can be negative.

* Refactor to use a single timedelta regular expression.

* Fix typo in `format_timedelta` function argument.
2024-06-05 09:05:12 -07:00
Curtis Morales 8f474c79a4
Hardcode normalized_os as iOS in event_aggregates (#5725)
2024-06-05 10:40:14 -04:00
Preethi Issac 3691aff7c9
RS_1233 add is_enterprise_policies to search_clients_daily_v8 and search_aggregates (#5714)
* RS_1233_Add payload.processes.parent.scalars.policies_is_enterprise to search aggregates and search clients daily table

Add payload.processes.parent.scalars.policies_is_enterprise to
- search_derived/search_aggregates
- search_derived/search_clients_daily_v8

* Update query.sql

* update to schema.yaml

* Fix CI issues

* Fix the tests

Fix tests issue

---------

Co-authored-by: Alekhya Kommasani <akommasani@mozilla.com>
Co-authored-by: Alekhya <88394696+alekhyamoz@users.noreply.github.com>
2024-06-05 10:21:18 -04:00
Eduardo Filho 076a77947a
fix(geckoview_version): Replace geckoview.version field with valid gecko.version (#5736)
2024-06-04 18:00:11 -04:00
Winnie Chan 11b891febb
Added dataset id (#5721)
2024-06-04 13:16:21 -07:00
Preethi Issac 3e639762de
adding search with ads organic to mobile_search_aggregates and search_revenue_levers_daily table (#5682)
* updates to mobile_search_aggregates and search_revenue_levers_table

adding search_with_ads_organic to mobile_search_aggregates and search_revenue_levers_table

* Update query.sql

* Fix schema.yaml and test files

* Fix CI issue

Fix CI issue

---------

Co-authored-by: Alekhya Kommasani <akommasani@mozilla.com>
Co-authored-by: Alekhya <88394696+alekhyamoz@users.noreply.github.com>
2024-06-04 10:54:55 -04:00
Anna Scholtz 1857068763
Fix mobile_search_clients_daily (#5702)
* Update mobile_search_clients_daily template

* Update mobile_search_clients_daily tests
2024-06-03 10:10:22 -04:00
Anna Scholtz 5ed5d2b396
Revert "adding search_with_ads_organic to mobile_clients_daily table (#5683)" (#5701)
This reverts commit 71bbaa2ac8.
2024-05-31 10:05:51 -07:00
Alexander 40c2b58482
fix(deploy): skip (instead of fail) deploys with explicitly null destination_table (#5700)
2024-05-31 12:50:47 -04:00
Alexander 0cd4295478
chore: refactor schema deploys, add and use deploy utils (#5674)
* chore: refactor schema deploys, add and use utils

* Update tests

* Add deploy tests

* Use string representation of table object in log statements
2024-05-30 16:23:53 -04:00
Preethi Issac 71bbaa2ac8
adding search_with_ads_organic to mobile_clients_daily table (#5683)
* adding search_with_ads_organic to mobile_clients_daily table

* Fix the CI tests

---------

Co-authored-by: Alekhya Kommasani <akommasani@mozilla.com>
Co-authored-by: Alekhya <88394696+alekhyamoz@users.noreply.github.com>
2024-05-30 14:19:17 -04:00
kik-kik 11e7ca30a9
feat: update firefox_android_clients_v1 to pull distribution_id only from the baseline ping (#5685)
* feat: update firefox_android_clients_v1 to pull distribution_id only from the baseline ping

* feat: update firefox_android_clients_v1 baseline test schema to include distribution_id

* fix: resolve distribution_id not in baseline error

---------

Co-authored-by: Katie Windau <153020235+kwindau@users.noreply.github.com>
2024-05-30 12:43:53 -05:00
Chelsey Beck 6cfcbb6efc
removing acoustic last engaged timestamp from models and tests (#5678)
* removing acoustic last engaged timestamp

* removing suppression list model and tests

* removing dependency until frequency is increased

* formatting

* formatting
2024-05-30 08:12:07 -07:00
Alekhya 53ebe05a13
Revert "RS-788 Add support for organic searches with ads to the mobile search…" (#5676)
This reverts commit 9814059423.
2024-05-29 19:29:45 -04:00
Preethi Issac 9814059423
RS-788 Add support for organic searches with ads to the mobile search counts tables (#5598)
* adding organic searches with ads to this table

* updating mobile_search_aggregates table with search_with_ads_organic column

* updating the search revenue lever table

- include search_with_ads_organic columns for Bing, Google and DDG

* Fix CI issues

* Fix tests CI failure

* fix tests

* Fix test sql failure

* Update query.sql

reverting back to original code for search_revenue_levers table

---------

Co-authored-by: Alekhya Kommasani <akommasani@mozilla.com>
Co-authored-by: Alekhya <88394696+alekhyamoz@users.noreply.github.com>
2024-05-29 13:29:50 -04:00