Commit Graph

5873 Commits

Author SHA1 Message Date
Marlene Hirose 8068144049
replace events_metrics_v2 metadata.yaml file with the proper one. Previous one was the page_metrics metadata.yaml file (#6114) 2024-08-27 08:58:07 -07:00
Katie Windau 44eb0a638c
DENG-4632 - add profile_group_id to firefox desktop & format query.sql (#6112) 2024-08-27 09:41:13 -05:00
Alekhya a3498659b8
Fix mobile os version for search related tables (#6111)
* Fix mobile os version

* Fix tests
2024-08-26 23:48:48 -04:00
Anna Scholtz d077ea046e
Limit the projects scanned for routines (#6101) 2024-08-26 15:28:49 -07:00
ksiegler1 c1e6c97bd0
Add login funnel by entrypoint (#6106)
Co-authored-by: Kimberly Siegler <kimberlysiegler@Kimberlys-MacBook-Pro-2.local>
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
2024-08-26 14:49:12 -07:00
m-d-bowerman e958f73e26
add thumb_voting event name to pocket events (#6110)
* add thumb_voting event name

* set up backfill
2024-08-26 13:17:28 -07:00
Katie Windau 453c2e25ee
DENG-4626 add profile_group_id to review_checker_events_v1 (#6108)
Co-authored-by: Alekhya <88394696+alekhyamoz@users.noreply.github.com>
2024-08-26 14:44:42 -05:00
Katie Windau 06d7b97be1
DENG-4620 add profile_group_id to review_checker_clients_v1 (#6107) 2024-08-26 14:27:30 -05:00
Eduardo Filho 23c803fc38
(GLAM) Cast scalar values as NUMERIC to avoid SUM overflow (#6109) 2024-08-26 15:09:44 -04:00
ksiegler1 464f9e183e
Add account settings engagement funnel (#6105)
Co-authored-by: Kimberly Siegler <kimberlysiegler@Kimberlys-MacBook-Pro-2.local>
2024-08-26 10:51:42 -07:00
Chelsey Beck 8ce9289a28
DENG-3503 updating to use v2 table (#6097) 2024-08-26 10:12:32 -07:00
Katie Windau 90284ccae5
DENG-4613 update serp_events_v2 generator to include profile_group_id (#6104) 2024-08-26 11:43:36 -05:00
Alekhya a2854663df
Complete 08-15-2024 backfill search aggregates (#6100) 2024-08-23 17:43:20 -04:00
Katie Windau 997b18d2ee
DENG-4560 Add profile_group_id to events_stream_v1 (#6099) 2024-08-23 16:08:51 -05:00
Katie Windau 5614bb673a
DENG-4607 add profile_group_id to urlbar_events_v2 (#6098) 2024-08-23 12:50:07 -05:00
Katie Windau c1ddbf8c2a
DENG-4556 Add profile_group_id to telemetry_derived.newtab_interactions_v1 (#6096)
* DENG-4556 update schema.yaml

* DENG-4556 add profile_group_id to newtab_interactions_v1
2024-08-23 12:06:08 -05:00
Katie Windau c33846fc18
DENG-4588 add profile_group_id to newtab_visits_v1 (#6095)
* DENG-4588 add profile_group_id to newtab_visits_v1

* DENG-4588 update tests for newtab_visits_v1
2024-08-23 11:33:43 -05:00
Alexander d0d84cd882
Revert "chore(data-privacy-mapping): DSRE-1690 - ignore tables contaiing RECO…" (#6090)
This reverts commit 5dd4c45ab5.
2024-08-23 12:08:23 -04:00
Katie Windau bfbe603535
DENG-4568 fix bug in query (#6093) 2024-08-23 10:37:06 -05:00
Alexander 385b0103ed
fix(dryrun): table_metadata used before reference (#6094) 2024-08-23 11:17:26 -04:00
Chelsey Beck c8593c9d8f
updating query to match schema (#6091)
* updating query to match schema

* adding new columns to test

---------

Co-authored-by: akkomar <akkomar@users.noreply.github.com>
2024-08-23 12:07:19 +02:00
Alekhya 81a931ea67
Backfill search_aggregates_v8 table (#6065)
Co-authored-by: skahmann3 <16420065+skahmann3@users.noreply.github.com>
2024-08-22 21:26:11 -04:00
ksiegler1 f54d847725
Add account deletion funnel (#6089)
Co-authored-by: Kimberly Siegler <kimberlysiegler@Kimberlys-MacBook-Pro-2.local>
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
2024-08-22 14:53:24 -07:00
whd 345ed707dc
chore(ci): stop publishing to dockerhub (#5137) 2024-08-22 19:57:02 +00:00
Katie Windau 936a22379f
DENG-4568 add profile_group_id to sponsored_tiles_clients_daily_v1 (#6085) 2024-08-22 13:01:53 -05:00
Chelsey Beck efbfa05ebb
updating files according to workbook (#6086)
* updating files according to workbook

* formatting yaml

* reformatting

* adding s to sheet
2024-08-22 10:46:35 -07:00
Ben Wu 4d6990dbbe
Include tables with no expirations in table_partition_expirations (#6084) 2024-08-22 18:04:31 +01:00
Chelsey Beck 92b77d0049
adding tag for triage to record only (#6083) 2024-08-22 18:36:10 +02:00
akkomar bfc2eb1583
Allow field addition in events_stream tables with struct metrics (#6082) 2024-08-22 14:55:43 +02:00
Anna Scholtz 7271c771dd
Remove data-observability-dev as the datasets got removed from BigQuery (#6081)
* Remove data-observability-dev as the datasets got removed from BigQuery

* Remove bqetl_data_observability_test_data_copy
2024-08-21 16:40:50 -07:00
m-d-bowerman e21a96f60e
Add topic, Pocket, default ui fields (#6069) 2024-08-21 14:40:02 -07:00
Chelsey Beck 302e32b5f3
Adding python script to export bigquery table as json objects to GCS for Merino (#6058)
* adding python script to export bigquery table as json objects to GCS

* updating to delete files older than 3 days

* renaming directory to add version

* updating description

* reformatting and ignoring type on storage

* updating description

* moving script to query.py, adding exception, and adding .ndjson extension to temp file

* updating bucket name and adding file prefix

* adding days old as parameter
2024-08-21 12:16:34 -07:00
Katie Windau c1c690e3a6
DENG-4564 add profile_group_id to urlbar_clients_daily_v1 (#6079)
* DENG-4564 add profile_group_id to urlbar_clients_daily_v1

* DENG-4564 Add description for profile_group_id

* DENG-4564 add profile_group_id to the urlbar_clients_daily_v1 sql tests
2024-08-21 12:23:27 -05:00
Katie Windau b541625b7f
DENG-4554 - add profile_group_id to telemetry_derived.addons_v2 (#6078) 2024-08-20 15:33:39 -05:00
Katie Windau fb498e6e68
DENG-4546 - update events_v1 view and add schema.yaml for existing table so CI checks pass (#6077) 2024-08-20 13:58:18 -05:00
Ben Wu 03a42ae334
Sort datasets in table_partition_expirations sql generator (#6076) 2024-08-20 17:55:06 +01:00
akkomar 806fc84b13
Filter out overactive clients in all events_daily queries (#6074)
In https://github.com/mozilla/bigquery-etl/pull/2333 we started filtering out overactive clients from desktop events_daily query. Back then I opted for not adding this filter for Glean queries as their event counts were significantly lower than desktop.

We are now having `bqetl_event_rollup.mozilla_vpn_derived__events_daily__v1` failing with `Cannot query rows larger than 100MB limit.` error. We'll fix it by extending the `client_event_count` filter to all queries.

3M threshold seems safe and a good first value to try - I have tested this query with this threshold on `2024-08-18` and got the same number of rows in the output table as currently in production (10598).

I tested `2024-08-19` by running:
```
bqetl generate events_daily --use_cloud_function=False --output_dir=sql_test_events_daily
cat sql_test_events_daily/moz-fx-data-shared-prod/mozilla_vpn_derived/events_daily_v1/query.sql | bq query --project_id=moz-fx-data-shared-prod --parameter=submission_date:DATE:2024-08-19 --use_legacy_sql=false --max_rows=0 --dataset_id=mozdata:tmp --destination_table=akomar_vpn_events_daily_test --replace
```
2024-08-20 16:10:21 +02:00
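The overactive-client filter described in commit 806fc84b13 can be illustrated outside of SQL. This is a minimal sketch and not the generated query itself: the function name, the `(client_id, event)` tuple shape, and the inclusive boundary of the comparison are assumptions made for illustration. Only the idea of a per-client event count checked against a 3M threshold comes from the commit message.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Assumed constant mirroring the "3M threshold" mentioned in the commit message.
MAX_CLIENT_EVENT_COUNT = 3_000_000


def filter_overactive_clients(
    events: Iterable[Tuple[str, str]],
    threshold: int = MAX_CLIENT_EVENT_COUNT,
) -> List[Tuple[str, str]]:
    """Drop every event that belongs to a client with more than `threshold` events.

    `events` is a hypothetical iterable of (client_id, event_name) pairs.
    Whether the real query uses `<` or `<=` is not stated in the commit,
    so the inclusive comparison here is an assumption.
    """
    events = list(events)
    # Count events per client, analogous to a client_event_count column.
    counts = Counter(client_id for client_id, _ in events)
    # Keep only events from clients at or below the threshold.
    return [(cid, name) for cid, name in events if counts[cid] <= threshold]
```

With a small threshold the behaviour is easy to check by hand: a client with three events is removed entirely when `threshold=2`, while a client with one event is kept.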
Ryan VanderMeulen b7f47c3b2e
Add Windows 11 23H2 & 24H2 updates to aggregates (#5883)
Co-authored-by: Katie Windau <153020235+kwindau@users.noreply.github.com>
2024-08-20 04:12:17 -07:00
Katie Windau 244d97ef2b
DENG-4514 add profile group ID to clients_first_seen_v3 (#6072) 2024-08-19 11:35:12 -05:00
Katie Windau 36739bd0b5
DENG-4515 Add profile_group_id to search_clients_daily_v8 (#6064)
Co-authored-by: Marlene Hirose <92952117+Marlene-M-Hirose@users.noreply.github.com>
2024-08-19 07:04:40 -05:00
Marlene Hirose 331c1469ae
replace previous code that was the wrong code that was copied over from marketing_prod. Copy the right code over from marketing_prod to shared_prod (#6068) 2024-08-17 13:41:11 -07:00
Katie Windau af1e182c02
DENG-3096 Add 6 more countries to all 3 Cloudflare (#6070)
* DENG-3096 Added 6 more countries to pull browser usage data for

* DENG-3096 Add 6 more countries to device usage

* DENG-3096 Add 6 countries to OS usage
2024-08-16 19:18:37 +01:00
Alexander 4cfa504e49
chore(deploys): Catch and raise for missing dataset/project in table deploys (#6067)
* chore(deploys): Catch NotFound for missing dataset/project in table deploys

* Update bigquery_etl/deploy.py

Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>

* update test exception text match

---------

Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>
2024-08-15 13:56:36 -04:00
Chelsey Beck 5a5087a031
DENG-3503 add braze subscription map v2 (#6066) 2024-08-15 09:59:26 -07:00
m-d-bowerman 435422c94e
Add topic selection fields to newtab_visits (#6004)
* Add topic selection fields to newtab_visits

* Remove duplicate from schema

* Add new metric to test schema

* Fix event name in test yaml

* Test yaml fix

* Add visit id to test case

---------

Co-authored-by: Chelsey Beck <64881557+chelseybeck@users.noreply.github.com>
2024-08-15 09:32:23 -07:00
Alekhya 32399c5c50
Fix search_aggregates_v8 schema (#6063)
* Fix search_aggregates_v8 schema

* Fix the order of columns in schema.yaml

* Fix schema.yaml format issue
2024-08-15 10:34:19 -04:00
Katie Windau 659d01bfb9
DENG-4515 fix cast null as string to match schema (#6062) 2024-08-15 07:09:19 -05:00
Alekhya 47a72c9f42
Add is_sap_monetizable and normalize the search engine (#5907)
* Add is_sap_monetizable to aggregates table

* Fix tests


* add is_acer_cohort


* Add normalized_default_search_engine column for mobile and desktop

* Fix test cases
2024-08-14 18:00:58 -04:00
Katie Windau ccd0f25c52
DENG-4515 - part 1, add column to schema (#6052) 2024-08-14 14:53:34 +01:00
Alekhya 9d2ba6e20d
Separate DAU metrics into `search_dau_aggregates` table (#6000)
* Separate DAU metrics

* Combine desktop + mobile DAU queries

* Fix CI failure

Fix CI failure

* Remove eligible markets, reduce group bys

* Update desktop DAU and revert new mobile bug

* Fix sql file format

---------

Co-authored-by: skahmann3 <16420065+skahmann3@users.noreply.github.com>
2024-08-13 20:52:38 -04:00