* Add default values in template to fix sqlglot parsing error.
* Adding backfill_date to exception message. Formatting.
* Improve getting column dtypes by including the schema files. Rename custom_query to custom_query_path for readability.
* DENG_4733. Improve getting column dtypes by including the schema files. Rename custom_query to custom_query_path for readability.
* Formatting.
* Set values back to NULL where appropriate. Improve output information.
* Rename custom_query to custom_query_path to match the expected parameter.
* Add missing import.
* Use larger wildcards to reduce the chance of collision with actual values.
* Auxiliary functions required to generate the query for a backfill with shredder mitigation.
* Exception handling.
* isort & docstrings.
* Apply flake8 to test file.
* Remove variable assignment to different types.
* Make search case insensitive in function.
* Add test cases for function and update the naming of a function's parameters for clarity.
* Update bigquery_etl/backfill/shredder_mitigation.py
Co-authored-by: Leli <33942105+lelilia@users.noreply.github.com>
* Add test cases for missing or mismatched parameters. Minimize the calls to get_bigquery_type().
* Encapsulate actions to generate and run custom queries to generate the subsets for shredder mitigation.
* Query template for shredder mitigation.
* Query template for shredder mitigation and formatting.
* Add check for "GROUP BY 1, 2, 3", improve code readability, remove unnecessary properties in classes.
* Test coverage. Check for "GROUP BY 1, 2, 3", improve readability, remove unneeded properties in class Subset.
* Increase test coverage. Expand DataType INTEGER required for UNION queries.
* Separate INTEGER and NUMERIC types.
* Move util functions and convert method to property, both to resolve a circular import. Adjust tests. Update function return and tests.
* Adding backfill_date to exception message. Formatting.
---------
Co-authored-by: Leli <33942105+lelilia@users.noreply.github.com>
* Add parameter for custom query to managed backfills.
* Implement 'sidefill' using custom query and flag to run a backfill with shredder mitigation.
* Fix test.
* Fix tests.
* chore(deploys): Catch NotFound for missing dataset/project in table deploys
* Update bigquery_etl/deploy.py
Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>
* update test exception text match
---------
Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>
* Add topic selection fields to newtab_visits
* Remove duplicate from schema
* Add new metric to test schema
* Fix event name in test yaml
* Test yaml fix
* Add visit id to test case
---------
Co-authored-by: Chelsey Beck <64881557+chelseybeck@users.noreply.github.com>
* Add is_sap_monetizable to aggregates table
* Fix tests
* add is_acer_cohort
* Add normalized_default_search_engine column for mobile and desktop
* Fix test cases
* Auxiliary functions required to generate the query for a backfill with shredder mitigation.
* Exception handling.
* isort & docstrings.
* Apply flake8 to test file.
* Remove variable assignment to different types.
* Make search case insensitive in function.
* Add test cases for function and update the naming of a function's parameters for clarity.
* Update bigquery_etl/backfill/shredder_mitigation.py
Co-authored-by: Leli <33942105+lelilia@users.noreply.github.com>
* Add test cases for missing or mismatched parameters. Minimize the calls to get_bigquery_type().
---------
Co-authored-by: Leli <33942105+lelilia@users.noreply.github.com>
* Remove percentiles
* Remove tests that test percentiles
* Make refresh scripts insert NULL for the new percentiles
* Remove percentile columns from queries and schemas
* Delete more percentile tables
* Formatting
* histogram_cast_struct's keys are strings
* Re-add test after fixing failure cause
* DENG-4201 remove blogs_goals_v1 since now in shared prod
* DENG-4201 remove blogs_sessions_v1 from mktg prod since now in shared prod
* DENG-4201 remove www_site_hits_v1 from mktg prod since now in shared prod
* DENG-4201 remove firefox_whatsnew_summary_v1 from mktg prod since now in shared prod
* DENG-4201 remove www_site_downloads_v1 from mktg prod since now in shared prod
* DENG-4201 remove www_site_events_metrics_v1 from mktg prod since now moved to shared prod
* DENG-4201 remove www_site_landing_page_metrics_v1 from mktg prod since now moved to shared prod
* DENG-4201 remove www_site_page_metrics_v1 from mktg prod since now moved to shared prod
* DENG-4201 remove blogs_daily_summary_v1 from mktg prod since now moved to shared prod
* DENG-4201 remove blogs_landing_page_summary_v1 from mktg prod since now moved to shared prod
* DENG-4201 remove blogs_empty_check_v1
* DENG-4201 remove www_site_empty_check_v1
* DENG-4201 remove www_site_metrics_summary_v1 from mktg prod since now moved to shared prod
* DENG-4201 remove dl with attr v1 and v2 from mktg prod since now in shared prod instead
* DENG-4201 remove tests from mktg prod since tables no longer live in mktg prod
* Add fully-qualified identifiers when formatting queries
* Fully-qualified identifiers for queries in sql/
* Check in only formatted SQL to generated-sql branch
* Add comment
* Fully qualify more tables
* Fully qualify test files
* Formatting improvements around CTEs and unit tests
* Option to skip auto qualifying queries
* RS 959 - Add OS breakouts
* Add mobile OS breakouts
* Add tests for desktop
* Add mobile tests
* Fix tests
* Fix CI issue
* Limit `derived_view_schemas` SQL generator to actual view directories.
* Fix the `derived_view_schemas` SQL generator to get the view schemas by dry-running their latest SQL and/or from their latest `schema.yaml` file. Getting the schema from the currently deployed view wasn't appropriate because it wouldn't reflect the latest view code.
* Rename `View.view_schema` to `View.schema`.
* Change `View` so its schema dry-runs use the cloud function (CI doesn't have permission to run dry-run queries directly).
* Apply partition column filters in view dry-run queries when possible for speed/efficiency.
* Don't allow missing fields to prevent view schema enrichment.
* Only copy column descriptions during view schema enrichment.
* Only try enriching view schemas from their reference table `schema.yaml` files if those files actually exist.
* Change `main_1pct` view to select directly from the `main_remainder_1pct_v1` table, so the `derived_view_schemas` SQL generator can detect the partition column to use and successfully dry-run the view to determine its schema.
* Formalize the order `bqetl generate all` runs the SQL generators in.
* Have `bqetl generate all` run `derived_view_schemas` last, in case other SQL generators create derived views.
* Fix `Schema._traverse()` to only recurse if both fields are records.
* Fix `format_timedelta` function's parsing of negative timedeltas.
The entire timedelta can be negative.
* Refactor to use a single timedelta regular expression.
* Fix typo in `format_timedelta` function argument.
* updates to mobile_search_aggregates and search_revenue_levers_table
adding search_with_ads_organic to mobile_search_aggregates and search_revenue_levers_table
* Update query.sql
* Fix schema.yaml and test files
* Fix CI issue
---------
Co-authored-by: Alekhya Kommasani <akommasani@mozilla.com>
Co-authored-by: Alekhya <88394696+alekhyamoz@users.noreply.github.com>
* feat: update firefox_android_clients_v1 to pull distribution_id only from the baseline ping
* feat: update firefox_android_clients_v1 baseline test schema to include distribution_id
* fix: resolve distribution_id not in baseline error
---------
Co-authored-by: Katie Windau <153020235+kwindau@users.noreply.github.com>
* removing acoustic last engaged timestamp
* removing suppression list model and tests
* removing dependency until frequency is increased
* formatting
* adding organic searches with ads to this table
* updating mobile_search_aggregates table with search_with_ads_organic column
* updating the search revenue lever table
- include search_with_ads_organic columns for Bing, Google and DDG
* Fix CI issues
* Fix tests CI failure
* fix tests
* Fix test sql failure
* Update query.sql
reverting back to original code for search_revenue_levers table
---------
Co-authored-by: Alekhya Kommasani <akommasani@mozilla.com>
Co-authored-by: Alekhya <88394696+alekhyamoz@users.noreply.github.com>