Some historical Stripe tax transactions contain exchange rates with 16 digits after the decimal, which exceeds the normal NUMERIC type's maximum scale of 9 digits after the decimal.
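
A minimal sketch of the scale problem described above, assuming the exchange rates arrive as decimal strings. The sample rate, the constant name, and the helpers `scale_of` and `fits_numeric` are hypothetical and are not taken from the repository; they only illustrate why a 16-digit fractional part cannot be stored in NUMERIC without rounding.

```python
from decimal import Decimal

# BigQuery NUMERIC keeps at most 9 digits after the decimal point.
NUMERIC_MAX_SCALE = 9


def scale_of(value: str) -> int:
    """Return the number of digits after the decimal point of a finite decimal string."""
    exponent = Decimal(value).as_tuple().exponent
    return max(0, -exponent)


def fits_numeric(value: str) -> bool:
    """Return True if the value's scale fits within NUMERIC's maximum scale."""
    return scale_of(value) <= NUMERIC_MAX_SCALE


# Hypothetical exchange rate with 16 digits after the decimal point.
rate = "0.0123456789012345"
print(scale_of(rate), fits_numeric(rate))
```

Values that fail this check would need a wider type (BIGNUMERIC is one candidate) or explicit rounding; which of the two the change adopts is not stated here.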
* Switch to baseline beginning Aug 1st 2024
* Modify the search_engine_daily view to handle the cut off
* Fix sql format
* Skip mobile_clients_daily_v2 from dry run
* Remove sql_generator files populating v1
* Add tests for mobile_search_clients_daily_v2
* remove unwanted tests
* Fix logging in backfill validation
* validate backfill should fail
* validate backfill should still fail
* revert entry, move validate-backfills to after generate-sql
* Parallelize dependency graph
* Use GCP API to get table schema when not using cloud function
* Reuse GCP credentials
* Update dependency tests
* Remove print
* feat: add BigEye RunMetricOperator to DAG generation
* Fix Bigeye DAG generation and move configuration to bqetl_project.yaml
* Reformatting; test fixing
* Install Airflow override requirements
---------
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
* Give reader role to dry run account for datasets in stage
* add sql changes to deploy
* hardcode service accounts
* Revert "add sql changes to deploy"
* Remove func in dryrun
* f
* Allow specifying a collection for monitored tables
* Update bigquery_etl/cli/monitoring.py
Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>
* Format bqetl
---------
Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>
* Remove deleted table from skip list
* Parse schema file in validate_shredder_mitigation function so it works on priv-bqetl
* Clean up test
* Reference to the shredder mitigation process during backfills.
* missing dash
* Auto-generate and run data checks. Validate shredder_mitigation label.
* Checks template.
* Fix tests.
* Update checks to include additional EXCEPT, use table_id in staging dataset, and ensure that tests run and generate a failure that stops the backfill.
* Add `column_exists_in_schema` field to structured_missing_columns
* Add column_exists_in_schema to telemetry_missing_columns
* Add UDF to convert column names to be compatible with schema conventions
* Add UDF test for snake_case_columns
* Fix stage deploys for INFORMATION_SCHEMA
* Fix UDF test
* Code review feedback
* Review feedback
* Larger wildcards to reduce the chance of collision with actual values.
* Formatting
* Update bigquery_etl/metadata/validate_metadata.py
Co-authored-by: Ben Wu <12437227+BenWu@users.noreply.github.com>
* Add test for validate_metadata.validate, add profile_id and profile_group_id to id-level_columns file.
* Update tests/cli/test_cli_metadata.py
Co-authored-by: Ben Wu <12437227+BenWu@users.noreply.github.com>
---------
Co-authored-by: Ben Wu <12437227+BenWu@users.noreply.github.com>
* fix(glam_fog): Create function histogram_filter_high_values to prevent INT64 overflow
* fix(glam_fog): Use histogram_filter_high_values to avoid overflow
* Generate Bigeye monitoring configs in CI
* Ensure Bigconfig files are loaded exactly once
* Avoid duplicate validation of Bigconfig files
* Authentication using API key to Bigeye
* Remove api-key option for Bigeye and rely on env var instead
* Add default values in template to fix sqlglot parsing error.
* Adding backfill_date to exception message. Formatting.
* Improve getting column dtypes by including the schema files. Change custom_query for custom_query_path for readability.
* DENG_4733. Improve getting column dtypes by including the schema files. Change custom_query for custom_query_path for readability.
* Formatting.
* Set values back to NULL where appropriate. Improve output information.
* Rename custom_query to custom_query_path to match the expected parameter.
* Missing import
* Larger wildcards to reduce the chance of collision with actual values.