* Switch to baseline beginning Aug 1st, 2024
* Modify the search_engine_daily view to handle the cutoff
* Fix SQL format
* Skip mobile_clients_daily_v2 from dry run
* Remove sql_generator files populating v1
* Add tests for mobile_search_clients_daily_v2
* Remove unwanted tests
* Parallelize dependency graph
* Use GCP API to get table schema when not using cloud function
* Reuse GCP credentials
* Update dependency tests
* Remove print
* Adding sponsored tiles configured telemetry to newtab clients and visits
* ran bqetl format
* Adding new field to tests
* fix tests
---------
Co-authored-by: m-d-bowerman <mbowerman@mozilla.com>
* feat: tweak glean_usage generator to include install_source in derived baseline tables
* Apply suggestions from code review
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
* Update sql_generators/glean_usage/templates/clients_last_seen_joined.query.sql
* feat: update the test definition to include install_source
---------
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
* feat: add BigEye RunMetricOperator to DAG generation
* Fix Bigeye DAG generation and move configuration to bqetl_project.yaml
* Reformatting; test fixing
* Install Airflow override requirements
---------
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
* Remove deleted table from skip list
* Parse schema file in validate_shredder_mitigation function so it works on priv-bqetl
* Clean up test
* Reference to the shredder mitigation process during backfills.
* missing dash
* Auto-generate and run data checks. Validate shredder_mitigation label.
* Checks template.
* Fix tests.
* Update checks to include additional EXCEPT, use table_id in staging dataset, and ensure that tests run and generate a failure that stops the backfill.
* Larger wildcards to reduce the chance of collision with actual values.
* Formatting
* Update bigquery_etl/metadata/validate_metadata.py
Co-authored-by: Ben Wu <12437227+BenWu@users.noreply.github.com>
* Add test for validate_metadata.validate, add profile_id and profile_group_id to id-level_columns file.
* Update tests/cli/test_cli_metadata.py
Co-authored-by: Ben Wu <12437227+BenWu@users.noreply.github.com>
---------
Co-authored-by: Ben Wu <12437227+BenWu@users.noreply.github.com>
* Add default values in template to fix sqlglot parsing error.
* Adding backfill_date to exception message. Formatting.
* Improve getting column dtypes by including the schema files. Rename custom_query to custom_query_path for readability.
* DENG_4733. Improve getting column dtypes by including the schema files. Rename custom_query to custom_query_path for readability.
* Formatting.
* Set values back to NULL where appropriate. Improve output information.
* Rename custom_query to custom_query_path to match the expected parameter.
* Missing import
* Larger wildcards to reduce the chance of collision with actual values.
* Auxiliary functions required to generate the query for a backfill with shredder mitigation.
* Exception handling.
* isort & docstrings.
* Apply flake8 to test file.
* Remove variable assignment to different types.
* Make search case insensitive in function.
* Add test cases for function and update naming of a function's parameters for clarity.
* Update bigquery_etl/backfill/shredder_mitigation.py
Co-authored-by: Leli <33942105+lelilia@users.noreply.github.com>
* Add test cases for missing or mismatched parameters where expected. Minimize the calls to get_bigquery_type().
* Encapsulate actions to generate and run custom queries to generate the subsets for shredder mitigation.
* Query template for shredder mitigation.
* Query template for shredder mitigation and formatting.
* Add check for "GROUP BY 1, 2, 3", improve code readability, remove unnecessary properties in classes.
* Test coverage. Check for "GROUP BY 1, 2, 3", improve readability, remove unneeded properties in class Subset.
* Increase test coverage. Expand DataType INTEGER required for UNION queries.
* Separate INTEGER and NUMERIC types.
* Move util functions and convert method to property, both to resolve a circular import. Adjust tests. Update function return and tests.
* Adding backfill_date to exception message. Formatting.
---------
Co-authored-by: Leli <33942105+lelilia@users.noreply.github.com>
* Add parameter for custom query to managed backfills.
* Implement 'sidefill' using custom query and flag to run a backfill with shredder mitigation.
* Fix test.
* Fix tests.
* chore(deploys): Catch NotFound for missing dataset/project in table deploys
* Update bigquery_etl/deploy.py
Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>
* Update test exception text match
---------
Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>
* Add topic selection fields to newtab_visits
* Remove duplicate from schema
* Add new metric to test schema
* Fix event name in test yaml
* Test yaml fix
* Add visit id to test case
---------
Co-authored-by: Chelsey Beck <64881557+chelseybeck@users.noreply.github.com>
* Add is_sap_monetizable to aggregates table
* Fix tests
* add is_acer_cohort
* Add normalized_default_search_engine column for mobile and desktop
* Fix test cases
* Auxiliary functions required to generate the query for a backfill with shredder mitigation.
* Exception handling.
* isort & docstrings.
* Apply flake8 to test file.
* Remove variable assignment to different types.
* Make search case insensitive in function.
* Add test cases for function and update naming of a function's parameters for clarity.
* Update bigquery_etl/backfill/shredder_mitigation.py
Co-authored-by: Leli <33942105+lelilia@users.noreply.github.com>
* Add test cases for missing or mismatched parameters where expected. Minimize the calls to get_bigquery_type().
---------
Co-authored-by: Leli <33942105+lelilia@users.noreply.github.com>
* Remove percentiles
* Remove tests that test percentiles
* Refresh scripts insert NULL into new percentile columns
* Remove percentile columns from queries and schemas
* Delete more percentile tables
* Formatting
* histogram_cast_struct's keys are strings
* Re-add test after fixing failure cause
* DENG-4201 remove blogs_goals_v1 since now in shared prod
* DENG-4201 remove blogs_sessions_v1 from mktg prod since now in shared prod
* DENG-4201 remove www_site_hits_v1 from mktg prod since now in shared prod
* DENG-4201 remove firefox_whatsnew_summary_v1 from mktg prod since now in shared prod
* DENG-4201 remove www_site_downloads_v1 from mktg prod since now in shared prod
* DENG-4201 remove www_site_events_metrics_v1 from mktg prod since now moved to shared prod
* DENG-4201 remove www_site_landing_page_metrics_v1 from mktg prod since now moved to shared prod
* DENG-4201 remove www_site_page_metrics_v1 from mktg prod since now moved to shared prod
* DENG-4201 remove blogs_daily_summary_v1 from mktg prod since now moved to shared prod
* DENG-4201 remove blogs_landing_page_summary_v1 from mktg prod since now moved to shared prod
* DENG-4201 remove blogs_empty_check_v1
* DENG-4201 remove www_site_empty_check_v1
* DENG-4201 remove www_site_metrics_summary_v1 from mktg prod since now moved to shared prod
* DENG-4201 remove dl with attr v1 and v2 from mktg prod since now in shared prod instead
* DENG-4201 remove tests from mktg prod since tables no longer live in mktg prod
* Add fully-qualified identifiers when formatting queries
* Fully-qualified identifiers for queries in sql/
* Check in only formatted SQL to generated-sql branch
* Add comment
* Fully qualify more tables
* Fully qualify test files
* Formatting improvements around CTEs and unit tests
* Option to skip auto qualifying queries