* Create tables that have state values per day
* Change Airflow DAG
* Move markov states to cols rather than array
* Move bot/bad client filter to materialized table
* Add install_source and consecutive_days_seen features
* Add field to CTE
* Use jinja vars instead of sql variables
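bqetl renders queries with Jinja, so parameters arrive as template variables substituted before the query runs, rather than as SQL `DECLARE` variables evaluated inside it. As a rough, hypothetical illustration of that substitution step (the renderer below is a stand-in, not bqetl's actual implementation):

```python
import re

def render(query: str, **params: str) -> str:
    """Substitute {{ name }} placeholders, loosely mimicking how a
    Jinja-templated query gets its parameters injected before execution."""
    def sub(match):
        name = match.group(1)
        if name not in params:
            raise KeyError(f"missing template parameter: {name}")
        return params[name]
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", sub, query)

templated = "SELECT * FROM states WHERE submission_date = '{{ submission_date }}'"
print(render(templated, submission_date="2023-11-01"))
```

The practical upside is that the rendered SQL is plain and self-contained: it can be dry-run, diffed, and scheduled without any session-level variable setup.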
* Use correct UDF incantation
* added firefox_ios_derived.clients_activation_v1 and corresponding view
* fixing a missing separator in firefox_ios_derived.clients_activation_v1 checks
* adding firefox_ios_derived.clients_activation_v1 to shredder configuration
* removed is_suspicious_device_client as it should not be there, thanks bani for pointing this out
* fixed black formatting error inside shredder/config.py
* applied bqetl formatting
* minor styling tweak as suggested by bani in PR#4631
* deleting fenix_derived/firefox_android_clients_v2, v1 will remain the active model
* removed fenix_derived.firefox_android_clients_v2 from shredder config
* Add support for assigning Airflow tasks to task groups
* Generate separate Airflow tasks for glean_usage
* Remove Airflow dependencies from old glean_usage tasks
* add code for cohort_daily_statistics using clients_first_seen_v2, including its new columns
* take out extra sample_id
* Update sql/moz-fx-data-shared-prod/telemetry_derived/cohort_daily_stats_clients_frst_seen_v2/query.sql
switching column names: the originals were swapped
Co-authored-by: Alexander <anicholson@mozilla.com>
* update column names: change cohort_date to first_seen_date to be more descriptive; take out client_id and sample_id in the final table; take out extraneous columns that are not used in the final table
* fix group by - days_seen_bits not days_interacted_bits
* take out second_seen_date, irrelevant
* change date_activity to submission_date
* replace submission_date_activity with client_activity
* add new line at end of schema.yaml file
* refactor code to use clients_first_seen_v2; originally committed cohorts_daily_statistics_v1 code in the v2 file
* add cohort_daily_statistics_v2 job to DAG
* add cohort_daily_statistics_v2 job to DAG, take out submission_date and add activity_date to query.sql
* delete now needless dags folder
* correct alias of table
* change submission_date to activity_date
* fix column name apple_model to apple_model_id
* add days_seen_dau_bits and other calculations based on this
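Metrics derived from a days-seen bit pattern typically follow the convention that the least significant bit is the most recent day. A sketch of the kind of calculations such a column supports (function names and the 28-day window are illustrative, not the repo's UDF signatures):

```python
def days_since_seen(bits):
    """Days since last activity: index of the lowest set bit
    (0 = active on the most recent day). None if never seen."""
    if bits == 0:
        return None
    # bits & -bits isolates the lowest set bit; bit_length - 1 is its index
    return (bits & -bits).bit_length() - 1

def active_days_in_window(bits, window=28):
    """Count of active days within the trailing window."""
    return bin(bits & ((1 << window) - 1)).count("1")

bits = 0b101  # active today and two days ago
print(days_since_seen(bits))        # 0
print(active_days_in_window(bits))  # 2
```

Keeping the raw bits in the table means retention, frequency, and recency metrics can all be recomputed downstream without re-scanning daily activity data.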
* add attribution_dlsource to table
* take out underscore from column name, attribution_dlsource
* revise comment - 196 days not 180 days
* add all the other columns from clients_first_seen_v2, update schema.yaml file with new columns
* take out sample_id, fix schema
* take out document_id, dl_token, app_build_id columns, rename activity_date to submission_date, rename cohort_date to first_seen_date to match clients_first_seen_28_days_later
* move files from cohort_daily_statistics_v2 to desktop_cohort_daily_retention_v1 to reflect the name change; take out extraneous columns such as xpcom_abi, attribution_dlsource, and the engine_data columns
---------
Co-authored-by: Alexander <anicholson@mozilla.com>
* Add desktop_acquisition_funnel view
* Update reference
* Update view.sql
Took out some of the TODO comments around naming to stay consistent with the table it reads from, and to reduce the effort of changing the spoke-default view that is currently set up with test data.
---------
Co-authored-by: gkabbz <gkabbz@gmail.com>
* Added a filter to only include playstore data
To keep the bottom of the funnel consistent with the upper funnel, we must include only Play Store installs in the bottom-of-funnel metrics
* for the fenix_derived.funnel_retention_clients_week_* tables, making sure we only include Play Store users
* updating the changes as requested by soGaussian: expose the install_source field to users to enable filtering
---------
Co-authored-by: richard baffour <baffour345@gmail.com>
* Glam - fix legacy windows & release probes' sample count going fwd
* Glam FOG accounts for sampling when calculating total_sample for windows & release probes
* fog - fix client count and sample count
* Add channel filtering for fog
Previously, rows with NULL values in the join keys never matched, resulting
in duplicate rows. This change coalesces those keys to empty strings for
the join and converts them back to NULL (via NULLIF) in the view.
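The fix relies on SQL's three-valued logic: `NULL = NULL` is not true, so NULL join keys never match. A small Python simulation of that behavior and of the COALESCE workaround (the table shapes are invented for illustration):

```python
def join(left, right, key):
    """Inner join mimicking SQL semantics: a NULL (None) key never matches."""
    return [
        (l, r) for l in left for r in right
        if l[key] is not None and r[key] is not None and l[key] == r[key]
    ]

def coalesce_key(rows, key):
    """COALESCE(key, '') so NULL keys become joinable empty strings."""
    return [{**row, key: row[key] if row[key] is not None else ""} for row in rows]

left = [{"channel": None, "clients": 10}]
right = [{"channel": None, "sample": 100}]

print(join(left, right, "channel"))  # [] - NULL keys never match
print(join(coalesce_key(left, "channel"),
           coalesce_key(right, "channel"), "channel"))
```

The NULLIF step in the view then turns the sentinel empty strings back into NULLs, so consumers see the original values.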
* Add ga_clients_v1 table & view
- Query from ga_sessions
- Fix tests
* Use correct scheduling parameters
Co-authored-by: Alexander <anicholson@mozilla.com>
* Move HAVING clause to WHERE
Co-authored-by: Alexander <anicholson@mozilla.com>
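Moving a predicate from HAVING to WHERE is safe when it references only non-aggregated columns, and it lets the engine discard rows before grouping instead of after. A toy equivalence check (data invented):

```python
from collections import defaultdict

rows = [
    {"country": "DE", "clients": 5},
    {"country": "US", "clients": 7},
    {"country": "US", "clients": 3},
]

def group_sum(rows, key, val):
    """GROUP BY key, SUM(val)."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row[val]
    return dict(totals)

# HAVING-style: aggregate everything, then drop groups.
having = {k: v for k, v in group_sum(rows, "country", "clients").items() if k == "US"}
# WHERE-style: drop rows first, then aggregate only what's needed.
where = group_sum([r for r in rows if r["country"] == "US"], "country", "clients")

print(having == where)  # True: the predicate uses no aggregates, so results match
```

If the predicate referenced an aggregate (e.g. `SUM(clients) > 5`), it would have to stay in HAVING; that is the one case where this refactor changes meaning.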
* Change CTE name
Co-authored-by: Alexander <anicholson@mozilla.com>
---------
Co-authored-by: Alexander <anicholson@mozilla.com>
* migrates old pingcentre onboarding artifacts to new firefox_desktop view
* generate event rollup dag
* generate review checker dag
* update messaging system dag
* incl project in table names
---------
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
* Add gclid_conversions table & view
This table will support the desktop conversion events.
Each valid GCLID will have any associated conversion events.
See the decision brief:
https://docs.google.com/document/d/1T8ArA9r8HDMTj1ES9NHfJFv2gUWo7w0MjG07iXtuUOI
* Use correct table name
* Use new stub attribution dataset; clarify activity_date
* Use correct date_partition_parameter
Co-authored-by: Alexander <anicholson@mozilla.com>
* Include activity_date as parameter
* Use INNER instead of LEFT joins
* Update doc strings to clarify GCLID vs GA Session
---------
Co-authored-by: Alexander <anicholson@mozilla.com>
* Add derived stub attribution logs
This table keeps triplets from the stub attribution logs.
The triplet of (dl_token, ga_client_id, stub_session_id)
will only ever appear once here.
See the associated decision brief:
https://docs.google.com/document/d/1L4vOR0nCGawwSRPA9xiR8Hmu_8ozCGUecXAtBWmGGA0/edit
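The "will only ever appear once" guarantee amounts to deduplicating rows on the full triplet. A minimal sketch of that invariant with invented rows (the real table is of course built in SQL):

```python
def dedupe_triplets(rows):
    """Keep the first occurrence of each (dl_token, ga_client_id, stub_session_id)."""
    seen = set()
    out = []
    for row in rows:
        triplet = (row["dl_token"], row["ga_client_id"], row["stub_session_id"])
        if triplet not in seen:
            seen.add(triplet)
            out.append(row)
    return out

logs = [
    {"dl_token": "t1", "ga_client_id": "c1", "stub_session_id": "s1"},
    {"dl_token": "t1", "ga_client_id": "c1", "stub_session_id": "s1"},  # duplicate
    {"dl_token": "t1", "ga_client_id": "c1", "stub_session_id": "s2"},
]
print(len(dedupe_triplets(logs)))  # 2
```

Note the same dl_token can still appear with different session ids; uniqueness holds only for the complete triplet.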
* Move stub attribution table to new dataset
In order to ensure limited access to the stub attribution service
data without significantly decreasing developer velocity, we
move these tables to a new dataset. That dataset has the defaults
we want for all stub attribution log data:
- Defaults to just read access to data-science/DUET workgroup
- No read/write access for DE
We will backfill via the bqetl_backfill DAG.
* Rename view
* Use correct dataset name in view
* Skip dryrun; no access
* Fix checks to filter on partitions
* Don't print "missing checks file" on success
Previously, the message that checks.sql files were missing was
printed on every execution of the for statement, because an "else"
clause on a "for" loop runs whenever the loop completes without a
"break" - including when the loop body did execute.
Instead, we want to print only when there are no files.
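The Python pitfall here is easy to reproduce. A minimal sketch of the bug and the fix (function and message names invented for illustration):

```python
def buggy(files):
    messages = []
    for f in files:
        messages.append(f"checking {f}")
    else:
        # BUG: for/else runs whenever the loop finishes without break,
        # so this fires even when files were found
        messages.append("missing checks.sql file")
    return messages

def fixed(files):
    messages = []
    for f in files:
        messages.append(f"checking {f}")
    if not files:
        messages.append("missing checks.sql file")
    return messages

print(buggy(["checks.sql"]))  # the spurious "missing" message appears
print(fixed(["checks.sql"]))  # only the check message
```

for/else is really a "no break occurred" clause; guarding on the emptiness of the iterable is the idiomatic way to express "nothing was found".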