Commit graph

349 commits

Author SHA1 Message Date
Ben Wu 59e49ea36c
Remove hardcoded dataset in baseline clients last seen check (#6514)
* Remove hardcoded dataset in baseline clients last seen check

* Remove extra .

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

---------

Co-authored-by: Anna Scholtz <anna@scholtzan.net>
2024-11-18 20:38:11 +00:00
Lucia 92b7098f50
Prepare active_users_aggregates for a backfill with shredder mitigation. (#6349)
* Prepare active_users_aggregates for a backfill with shredder mitigation. Rename columns [first_seen_date, os_version, segment] to cascade upstream changes during the backfill for stable numbers.

* Update CODEOWNERS with the new version.

* Avoid modifying the active_users_aggregates view until the new version is backfilled. Remove the view generation from the new version which conflicts with previous version.

* Generate DAG with version.

* Adjust schema to query.

* Ensure attribution columns are present in mobile's schema.

* Convert NULLS in city to '??' as required by existing data.

* Remove uri_count, active_hours from views. Missing 'AS' added to query.
2024-11-18 20:20:22 +00:00
Anna Scholtz ed7fc6b151
Update bigconfig.yaml to use tag_deployments (#6449) 2024-11-06 17:29:51 +00:00
Ben Wu 6e1e917317
Move event_monitoring_aggregates_v1 to run in on-demand billing (#6447) 2024-11-05 17:12:01 +00:00
rzhao 695ec38dad
Create iOS_onboarding.toml (#6316)
* Create iOS_onboarding.toml

Added iOS_onboarding.toml - query for generating the underlying table in the iOS onboarding dashboard.

* Update iOS_onboarding.toml

* feat: remove locale from dimension list, it does not exist in the source table

* Delete sql_generators/funnels/configs/iOS_onboarding.toml

* Create ios_onboarding.toml

updated to lower case file name and removed "locale" since it doesn't exist in firefox_ios_clients

* add a line in bqetl_project.yaml to avoid dry run

---------

Co-authored-by: kik-kik <kignasiak@mozilla.com>
2024-11-05 16:47:22 +00:00
Alekhya 7f72235eb4
Switch to baseline beginning Aug 01st 2024 (#6425)
* Switch to baseline beginning Aug 01st 2024

* Modify the search_engine_daily view to handle the cut off

* Fix sql format

* Skip mobile_clients_daily_v2 from dry run

* Remove sql_generator files populating v1

* Add tests for mobile_search_clients_daily_v2

* remove unwanted tests
2024-10-31 19:11:36 +00:00
m-d-bowerman 44cb706ff7
remove provider- from engine strings (#6419) 2024-10-30 15:59:19 +00:00
Anna Scholtz 35c92f086c
Filter overactive clients for `baseline_clients_daily` and `sponsored_tiles_clients_daily` (#6413)
* Filter overactive clients from baseline_clients_daily

* Filter overactive clients for sponsored_tiles_clients_daily

* Increase sponsored_tiles_clients_daily overactive threshold

* Update sql/moz-fx-data-shared-prod/telemetry_derived/sponsored_tiles_clients_daily_v1/query.sql

Co-authored-by: Ben Wu <12437227+BenWu@users.noreply.github.com>

---------

Co-authored-by: Ben Wu <12437227+BenWu@users.noreply.github.com>
2024-10-28 20:37:25 +00:00
Ben Wu 7386ab9574
[DENG-5931] Add partitioning to experiment_events_live_v1 (#6414) 2024-10-28 19:59:18 +00:00
Ben Wu 9386a728a5
Add date partitioning to experiment_search_events_live_v1 materialized views (#6393) 2024-10-24 15:44:01 +00:00
Katie Windau fc11532b8a
DENG-4768 add 3 new columns to glean stable table views (#6390)
* DENG-4768 add 3 new columns to glean stable table views

* DENG-4768 update comment in init.py

* Updated config.yml

* Updated config.yml
2024-10-23 13:53:52 -07:00
Ben Wu d7c8a12d9e
Bug 1905938 Support events with no metrics in glean_usage generator (#6358)
* Bug 1905938 Support events with no metrics in glean_usage generator

* fix event_error_monitoring

* not

* Apply suggestions from code review

Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>

---------

Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>
2024-10-16 15:30:28 -07:00
Alekhya bf391d7869
Search Mobile - Fix engine coding from baseline tables (#6357)
* Search Mobile - Fix engine coding from baseline tables

* fix the query to use existing udf

---------

Co-authored-by: Alekhya <alekhya@Mac.lan>
2024-10-16 18:44:08 +00:00
kik-kik 0e5b7c3d88
feat: add install_source to new_profiles selection as baseline ping now contains this info on the fenix app (#6356) 2024-10-16 13:57:37 +00:00
kik-kik e886d4ceed
fix: add missing project id prefix in the bigconfig templates in the fq_table_name (#6352)
* fix: add missing project id prefix in the bigconfig templates in the fq_table_name

* feat: update the bigeye collection to use the non-kpi collection
2024-10-15 18:14:32 +00:00
kik-kik 433b954cfc
feat: add bigconfig.yml to all mobile_kpi_metric tables only (#6337)
* feat: add bigconfig.yml to all mobile_kpi_metric tables only

* fix: update the named schedule to be valid (exist inside BigEye)
2024-10-15 16:37:57 +00:00
kik-kik 405402e368
feat: tweak glean_usage generator to include install_source in derived baseline tables (#6173)
* feat: tweak glean_usage generator to include install_source in derived baseline tables

* Apply suggestions from code review

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

* Update sql_generators/glean_usage/templates/clients_last_seen_joined.query.sql

* feat: update the test definition to include install_source

---------

Co-authored-by: Anna Scholtz <anna@scholtzan.net>
2024-10-15 13:10:39 +00:00
Anna Scholtz dd82bdae35
Add partitioning and clustering for event_monitoring materialized views (#6286)
* Add partitioning and clustering for event_monitoring materialized views

* event_monitoring partitioning on submission_date; additional clustering
2024-10-02 15:22:12 -07:00
kik-kik c9b8f4ef66
feat: update engagement, and retention to use the new attribution table (#6279) 2024-10-02 15:28:03 +00:00
Ben Wu 66443eee29
Bug 1920544 Create view to union firefox desktop crashes (#6257) 2024-09-27 17:48:44 +00:00
kik-kik 5765a27382
fix(DENG-4904): mobile kpi generator attribution clients duplication (#6265)
* fix: client duplication inside attribution_clients (mobile_kpi_support_metric generator)

* feat: add a uniqueness check to attribution_clients inside mobile_kpi_support_metrics generator
2024-09-26 13:25:00 +00:00
Sean Rose 725d0922a7
Reduce overactive clients threshold in `events_daily_v1` ETLs (bug 1921117). (#6263) 2024-09-25 22:25:54 +00:00
Katie Windau 26dbc47a93
DENG-4962 Add legacy telemetry client ID to events_stream (#6262)
* DENG-4962 add legacy telemetry client ID to events_stream

* DENG-4962 add if/then logic for adding legacy telemetry client ID
2024-09-25 18:02:26 +00:00
kik-kik 03b8fa6320
feat: add more granular attribution fields to the mobile_kpi_metric generator (#6254)
* feat: add more granular attribution fields to the mobile_kpi_metric generator

* fix: trailing comma in the EXCEPT union view

* fix: new_profile_clients omitting timestamp field

* feat: for now remove attribution_clients from union
2024-09-25 11:39:25 +00:00
kik-kik 5bc2904263
feat(): add distribution id to attribution clients template (#6251)
* feat: add distribution_id to attribution table

* feat: move the distribution_id configuration to init.py

* feat: add distribution_id field description
2024-09-24 12:55:38 +00:00
Sean Rose bfdc0a38c6
Exclude `firefox_crashreporter` Glean app from baseline ping checks (bug 1920582). (#6249) 2024-09-23 23:06:03 +00:00
akkomar 69c6c9d261
DENG-3274 Switch all accounts funnels to event ping (#6224) 2024-09-19 18:14:49 +00:00
Alexander e29adb36c3
chore: update anicholson ownership (#6220) 2024-09-19 13:25:40 +00:00
akkomar e4ab56da51
DENG-3274 Switch accounts login funnel to event ping (#6222) 2024-09-19 08:08:15 +00:00
Katie Windau 528df9c542
DENG-4684 add profile_group_id to feature_usage_v2 (#6200) 2024-09-13 19:01:46 +00:00
Anna Scholtz d326cd54b6
Fix skipping of apps in glean_usage (#6191)
* Fix skipping of apps in glean_usage

* Fix event error monitoring skip apps
2024-09-12 15:29:35 -07:00
Arkadiusz Komarzewski c53af44abd
Allow skipping apps in event monitoring generator
Also skip ads_backend because its dataset is restricted.
2024-09-12 17:50:07 +02:00
kik-kik 1570bf1682
feat: include is_daily_user filter to ensure only new profiles are included (#6042)
* feat: include is_daily_user filter to ensure only new profiles are included and use submission_date as first_seen_date so that the downstream query date filter works correctly and limits processing

* feat: additional guard clause to include only new_profile true rows where submission_date = first_seen_date
2024-09-05 17:48:48 +02:00
Katie Windau 65183f5324
Include both profile group IDs, from baseline and metric, in separate columns (#6157) 2024-09-04 15:14:16 -05:00
Katie Windau e4d806639a
DENG-4718 Add profile_group_id to baseline_clients_daily_v1, baseline_clients_last_seen_v1 (#6156) 2024-09-04 13:56:29 -05:00
akkomar 463807c661
Bug 1915728 - Reduce overactive clients filter value in events_daily queries (#6147) 2024-09-04 13:28:24 +02:00
Sean Rose 23d6f9e826
Increase view SQL length limit to 1024k characters. (#6055)
Google has increased the view SQL character limit in the projects used by our BigQuery ETLs (Google support case 52448716).
2024-08-28 17:34:56 -07:00
Katie Windau c1a635a421
DENG-4631 - add profile_group_id to metrics_clients_last_seen_v1 (#6117)
* DENG-4631 - add profile_group_id to metrics_clients_last_seen_v1

* DENG-4631 add schema.yaml
2024-08-27 17:24:13 -05:00
Katie Windau 44eb0a638c
DENG-4632 - add profile_group_id to firefox desktop & format query.sql (#6112) 2024-08-27 09:41:13 -05:00
Alekhya a3498659b8
Fix mobile os version for search related tables (#6111)
* Fix mobile os version

* Fix tests
2024-08-26 23:48:48 -04:00
ksiegler1 c1e6c97bd0
Add login funnel by entrypoint (#6106)
Co-authored-by: Kimberly Siegler <kimberlysiegler@Kimberlys-MacBook-Pro-2.local>
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
2024-08-26 14:49:12 -07:00
ksiegler1 464f9e183e
Add account settings engagement funnel (#6105)
Co-authored-by: Kimberly Siegler <kimberlysiegler@Kimberlys-MacBook-Pro-2.local>
2024-08-26 10:51:42 -07:00
Katie Windau 90284ccae5
DENG-4613 update serp_events_v2 generator to include profile_group_id (#6104) 2024-08-26 11:43:36 -05:00
Katie Windau 997b18d2ee
DENG-4560 Add profile_group_id to events_stream_v1 (#6099) 2024-08-23 16:08:51 -05:00
Katie Windau 5614bb673a
DENG-4607 add profile_group_id to urlbar_events_v2 (#6098) 2024-08-23 12:50:07 -05:00
ksiegler1 f54d847725
Add account deletion funnel (#6089)
Co-authored-by: Kimberly Siegler <kimberlysiegler@Kimberlys-MacBook-Pro-2.local>
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
2024-08-22 14:53:24 -07:00
Ben Wu 4d6990dbbe
Include tables with no expirations in table_partition_expirations (#6084) 2024-08-22 18:04:31 +01:00
akkomar bfc2eb1583
Allow field addition in events_stream tables with struct metrics (#6082) 2024-08-22 14:55:43 +02:00
Ben Wu 03a42ae334
Sort datasets in table_partition_expirations sql generator (#6076) 2024-08-20 17:55:06 +01:00
akkomar 806fc84b13
Filter out overactive clients in all events_daily queries (#6074)
In https://github.com/mozilla/bigquery-etl/pull/2333 we started filtering out overactive clients from desktop events_daily query. Back then I opted for not adding this filter for Glean queries as their event counts were significantly lower than desktop.

`bqetl_event_rollup.mozilla_vpn_derived__events_daily__v1` is now failing with the error `Cannot query rows larger than 100MB limit.` We'll fix it by extending the `client_event_count` filter to all queries.

A 3M threshold seems safe and a good first value to try - I have tested this query with this threshold on `2024-08-18` and got the same number of rows in the output table as currently in production (10598).

I tested `2024-08-19` by running:
```
bqetl generate events_daily --use_cloud_function=False --output_dir=sql_test_events_daily
cat sql_test_events_daily/moz-fx-data-shared-prod/mozilla_vpn_derived/events_daily_v1/query.sql | bq query --project_id=moz-fx-data-shared-prod --parameter=submission_date:DATE:2024-08-19 --use_legacy_sql=false --max_rows=0 --dataset_id=mozdata:tmp --destination_table=akomar_vpn_events_daily_test --replace
```
2024-08-20 16:10:21 +02:00