* Update bqetl_google_analytics_derived_ga4 yaml configs
* Switch to using countifs for more readability
* reformat and switch to countifs for readability
* Initial draft
* adding new GA4 dag
* fixing source for browser
* work in progress
* change group by to be explicit
* adding non_fx_sessions
* adding non-fx-sessions
* adding campaign to query
* adding new column to group by
* added ad_content column
* Adding downloads column
* Adding non fx downloads
* reformat the SQL
* Add owner to new DAG
* Update data types in the schema file
* Add missing comma
* Update query.sql
* Update start date and reduce from tier1 -> tier2
* Update sql/moz-fx-data-marketing-prod/ga_derived/www_site_metrics_summary_v2/metadata.yaml
Co-authored-by: Frank Bertsch <frank.bertsch@gmail.com>
* Update sql/moz-fx-data-marketing-prod/ga_derived/www_site_metrics_summary_v2/metadata.yaml
Co-authored-by: Frank Bertsch <frank.bertsch@gmail.com>
* Update sql/moz-fx-data-marketing-prod/ga_derived/www_site_metrics_summary_v2/query.sql
Co-authored-by: Frank Bertsch <frank.bertsch@gmail.com>
* Update sql/moz-fx-data-marketing-prod/ga_derived/www_site_metrics_summary_v2/query.sql
Co-authored-by: Frank Bertsch <frank.bertsch@gmail.com>
* Update sql/moz-fx-data-marketing-prod/ga_derived/www_site_metrics_summary_v2/query.sql
Co-authored-by: Frank Bertsch <frank.bertsch@gmail.com>
* update source and medium columns to use last touch attribution instead of first touch attribution
* update group by
---------
Co-authored-by: Frank Bertsch <frank.bertsch@gmail.com>
* DENG-2262 - add new DAG bqetl_desktop_installs_v1
* Initial commit for DENG-2262
* DENG-2262 - reformatted query.sql
* DENG-2262 add empty new line to end of view.sql
* update
* Modify the code following suggestions in https://github.com/mozilla/bigquery-etl/pull/4467. This pull request updates the code that creates four mobile feature usage tables
* updated code following comments from kik in 72d6d71910
* update feature usage table code following suggestions
* added new data to mobile feature usage table (logins data for iOS) and modified to add unnest for nested values
* update code based on suggestions from https://github.com/mozilla/bigquery-etl/pull/4648/files
* update metadata.yaml and dags.yaml according to suggestions
* update metadata.yaml and dags.yaml according to suggestions
* remove clustering and references in all the metadata files, add LEFT JOIN to all the UNNEST, modify SQL files to follow the incremental submission date format, and rename dau as events(metrics)_ping_client_count
* updated schema to reflect the name change for dau
* update according to comments on dec 15
* update the distinct client count name
* fix the yaml errors
* fix no new line error for dags.yaml
---------
Co-authored-by: Ruoxi Zhao <rzhao@rzhao-37509.local>
Co-authored-by: Ruoxi Zhao <rzhao@rzhao-37509.lan>
* Add support for assigning Airflow tasks to task groups
* Generate separate Airflow tasks for glean_usage
* Remove Airflow dependencies from old glean_usage tasks
* Update scheduler of aggregates to run after upstreams.
* Update dags for new scheduler of analytics_aggregates
* Update dag bqetl_search
* Remove DAG.
---------
Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
* feat: urlbar events final release
* feat: new result types
* feat: add interaction and group
* fix: date
* fix: use BQ builtin for UUIDs
* Add the view_v2
* Add new table to the DAG
* fix CI error
* remove teon brooks
* Incorporate feedback by Curtis
Incorporate feedback from Curtis
---------
Co-authored-by: Alekhya Kommasani <akommasani@mozilla.com>
Co-authored-by: Alekhya <88394696+alekhyamoz@users.noreply.github.com>
* New job to calculate newtab visits -> activity stream sessions
* Removing newline chars at end of file
* Removing newline chars at end of file
* Removing newline chars at end of file
* Addressing comment suggestions
* Format
* Add bqetl_ads DAG
* Add ACL to nt_visits_to_sessions_conversion_factors_daily_v1
* Add metadata files
* Add view to dry_run skip list
* Oops, fix the view
---------
Co-authored-by: Curtis Morales <cmorales@mozilla.com>
* Add review checker derived datasets
* Add bqetl_review_checker dag
Fix
* Fix CI validate dag step
* Incorporate feedback from Alex
* Fix CI
* change client last seen to clients first seen
* fix dag
* DENG-1314 Implement changes to bqetl and create default DAG.
* DENG-1314. Update Documentation.
* DENG-1314. Dummy query to enable generating DAG and run tests.
* DENG-1314. Update tests.
* Update bigquery_etl/cli/query.py
Raise exception when scheduling information is missing.
Co-authored-by: Daniel Thorn <dthorn@mozilla.com>
* DENG-1314. Update tests.
* DS-3054. Update query creation to set bqetl_default as default value for --dag. Update tests.
* Default task and tests update.
* Default task and tests update.
* 3650 - Remove default DAG option, update DAG template comment & tests.
* 3650 - Condition for DAG warning.
* 3650 - Update docs.
* Clarification on sql/moz-fx-data-shared-prod/analysis/bqetl_default_task_v1/metadata.yaml
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
* Update docs/cookbooks/creating_a_derived_dataset.md
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
---------
Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
Co-authored-by: Daniel Thorn <dthorn@mozilla.com>
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
* Setup query and metadata templates to generate the queries for active_users_aggregates_deletion_request tables.
* Separate the active_users KPIs from the calculation based on deletion requests.
* Move mobile view outside of the browsers loop.
* Update DAG, move partition_date to the end of the query.
* Update DAG, move partition_date to the end of the query, change from parameters in the metadata to filter dates in the query.
* Revert change to fenix_derived.
* Remove 7-day window update and update tests. Fix indentation.
* Set query parameters using Jinja template.
* Set query parameters using Jinja template.
* Using parameters in the query.
* Formatting.
* Update sql_generators/active_users_deletion_requests/templates/mobile_deletion_request_query.sql
Co-authored-by: Daniel Thorn <dthorn@mozilla.com>
* Formatting.
* Search table's date filter.
* Query only searches for clients with deletion request. Add required filter on submission_timestamp for table deletion_request.
* Format
---------
Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
Co-authored-by: Daniel Thorn <dthorn@mozilla.com>
* ensuring double_opt_in field is of type INTEGER and reduced the number of retries for this job
* rerun dag creation
* change retry count in metadata and rerun dag creation
* changed the number of retries for acoustic DAGs to 1
---------
Co-authored-by: Leli Schiestl <lschiestl@mozilla.com>
* take rbaffourawuah@mozilla.com off of email list for DAG
* remove rbaffourawuah@mozilla.com from dags.yaml
* fix formatting and remove another instance of rbaffourawuah@mozilla.com from DAG
* fix order of emails
* modify name to be more explicit
* rename view and folder
* update view name
* update view location
* update view name
* add metadata, schema yamls and query.py
* created adjust_derived namespace
* add query.py, metadata, schema, dataset for testing
* delete extraneous file, update DAG name
* modify bqetl_adjust DAG redux
* update DAG name, take out '_derived'
* update table name in view
* standardize table names across files
* regenerate DAG
* update schema in both locations
* add query.py, metadata, schema yaml files
* take out extraneous print statements, update datasets to be 'adjust' or 'adjust_derived'
* add submission date to date_partition_parameter
* update table name to be just one table
* add DAG for adjust_derived
* add bq_etl adjust_derived DAG to yaml file
* add note about API token
* revert changes to bqetl.adjust.py
* use proper task_id
* fix start dates
* add python command and docker image
* add python command and docker image
* delete extraneous code
* comment out docker part in old adjust dag
* add whitespace, delete extraneous code
* Update sql/moz-fx-data-shared-prod/adjust/adjust_derived/view.sql
Co-authored-by: Lucia <30448600+lucia-vargas-a@users.noreply.github.com>
* Update sql/moz-fx-data-shared-prod/adjust_derived/adjust_derived_v1/query.py
Co-authored-by: Lucia <30448600+lucia-vargas-a@users.noreply.github.com>
* Update sql/moz-fx-data-shared-prod/adjust_derived/adjust_derived_v1/query.py
Co-authored-by: kik-kik <42538694+kik-kik@users.noreply.github.com>
* updated logic to check if response dictionary is not empty, moved view out of nested folder, added token ownership statement to metadata file, turned off email retry in dags.yaml, separated out clean up of json to its own function
* take out extraneous if statement and move else statement
* reorder where comment is to make more sense
* more description as to why we're using mhirose's API token
* take out periods
* Update sql/moz-fx-data-shared-prod/adjust_derived/adjust_derived_v1/metadata.yaml
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
* combine adjust DAGs
* change logic for query_export check loop continuance, adapt metadata.yamls
* add blank parameters test
* Update sql/moz-fx-data-shared-prod/adjust_derived/adjust_derived_v1/metadata.yaml
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
* add arguments to metadata.yaml
* remove external table reference
* refactor to add date parameter
* refactor based on Circle CI's advice
* Update sql/moz-fx-data-shared-prod/adjust_derived/adjust_derived_v1/query.py
Co-authored-by: kik-kik <42538694+kik-kik@users.noreply.github.com>
* Update sql/moz-fx-data-shared-prod/adjust_derived/adjust_derived_v1/query.py
Co-authored-by: kik-kik <42538694+kik-kik@users.noreply.github.com>
* take out TODO comment
---------
Co-authored-by: kik-kik <kignasiak@mozilla.com>
Co-authored-by: Lucia <30448600+lucia-vargas-a@users.noreply.github.com>
Co-authored-by: kik-kik <42538694+kik-kik@users.noreply.github.com>
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
* added apple_ads_derived for copying over apple_ad data from the fivetran dataset, and apple_ads views now read from it
* added bqetl_fivetran_apple_ads.py DAG responsible for copying apple_ads data from the fivetran project over to moz-fx-data-shared-prod
* now dryrun skips apple_ads_derived instead of apple_ads as the query now accesses restricted dataset
* added schema files for apple_ads_derived datasets
* added descriptions to schema.yaml files for apple_ads_derived namespace
* added dataset_metadata for apple_ads_derived to include a link to the dbt transformations
* fixed apple_ads view definitions
* removed application label and referenced_tables section inside metadata.yaml for apple_ads as requested by srose in PR#3847
* corrected source project for apple_ads views
* renamed apple_ads_derived to apple_ads_external
* added * to apple_ads_external namespace name to skip in the dryrun due to integration test deployment
* made tweaks to apple_ads and apple_ads_external datasets/namespaces as requested by whd
* updated apple_ads_external skip rule to the way it is meant to be defined, this will work once a fix is rolled out for dryrun
* fixed dag bqetl_fivetran_apple_ads description and updated the schedule to run once a day
* DENG-775 Added session_id to JOIN between GA data and stub_attr.stdout. Also expanded date range on GA session data to [download_date - 2 days, download_date + 1 day]
* Updated query to handle missing GA download_session_id. It effectively applies V1 logic to the MISSING_GA_CLIENT dl_tokens.
* Update domain metadata dag.
* Remove from triage with tags
* Remove telemetry-alerts email
* Add date formatting for monthly partition id
* Add support for `table_partition_format` in dag generation
* Don't add partition format if there's already a destination table
* use the correct name
* Add partition templates for all time partitioning types
* lint fixes
* more docs
* update all dags to include `table_partition_template` parameter.
* don't set if we have a partition offset
* don't add the parameter for the default 'day' partitioning scheme
* added mdn_yari_derived namespace along with mdn_popularities_v1 query to support mdn_popularities DAG inside telemetry-airflow
* added query.py as an alternative to exporting the data
* removed query.sql for mdn_yari.mdn_popularities_v1
* Updated query.py for mdn_popularities_v1 to move the blob to target location and clean up after
* made changes as requested in PR#3598 by akkomar