Commit Graph

204 Commits

Author SHA1 Message Date
Leli 4135c7adbd
Deng 3597 marketing suppression list remove ctms (#5481)
* DENG-3600 Acoustic suppression list add timestamp

* DENG-3597 Marketing Suppression List remove ctms

---------

Co-authored-by: Chelsey Beck <64881557+chelseybeck@users.noreply.github.com>
2024-05-02 16:37:35 -07:00
Leli 38cab3f16c
DENG-3607 update braze_derived suppression list (#5484)
* DENG-3607 update braze_derived suppression list

* update table reference
2024-05-02 16:23:49 -07:00
Leli f371aabd48
DENG-3600 Acoustic suppression list add timestamp (#5479) 2024-05-02 21:44:46 +02:00
Chelsey Beck 8e8d502b35
removing active subscriptions count attribute (#5471)
* removing active subscriptions count attribute

* updating test to remove count attribute
2024-05-02 07:58:34 -07:00
Chelsey Beck f53bbad76b
updating products and adding changed products model (#5450)
* updating products and adding changed products model

* updating user profiles to join on external id

* updating tests to match new schema

* reverting naming

* updating test

* updating products model to include additional attributes and updating tests

* updating order and tests

* updating format

* updating source table

* updating schema

* updating user profiles table schema to match new products array
2024-04-29 19:17:06 -07:00
Chelsey Beck fc057b00ca
Adding new models for braze sync (#5445)
* adding models for changed and deleted users

* adding filter to changed waitlists model

* updating tests to include has_fxa field

* updating tests to include an unchanged user

* adding new line

* removing unnecessary cross join
2024-04-26 13:32:11 -07:00
Chelsey Beck d8edb09f32
DENG-3008 Updating Braze Models (#5405)
* updating create statement

* joining on users table to filter for active and ensuring there is at least one subscription

* joining on users to filter for active

* adding dev subscription group

* removing fxa_id in favor of has_fxa

* bringing in update timestamp for downstream use

* updating formatting and adding filter for active users

* adding filter for one active newsletter

* updating tests

* adding fxa id back to users table to join to products

* updating query

* updating values

* updating tests

* fix test for subscriptions

* changing schema to array

* updating format

* updating to pull in all subscriptions with statuses

* removing create statement

* updating subscriptions query to make it an array and updating associated tests

* updating formatting and comment

---------

Co-authored-by: Leli Schiestl <lschiestl@mozilla.com>
2024-04-23 12:41:44 -07:00
Leli b3ffb3adaf
Deng 3453 braze models unit tests (#5394)
* DENG-3453 braze models add unit tests

* fix test

* add more tests
2024-04-19 15:16:04 +02:00
Anna Scholtz eac0ac80c2
Remove telemetry_derived init.sql files (#5342)
* Remove init.sql files for telemetry_derived queries

* Remove init.sql for events_daily

* Remove init.sql from skip lists

* Remove init.sql references from tooling

* Add schema for baseline_clients_first_seen
2024-04-10 15:36:30 -07:00
Lucia 3705f4e1d1
Add distribution_id to firefox_android_clients. (#5325)
* Add distribution_id to firefox_android_clients.

* Update tests schemas.
2024-04-04 21:23:10 +02:00
Anna Scholtz 963a9792d4
Remove search query init.sql (#5306)
* Remove init.sql files for fenix and use is_init() instead

* Remove init.sql files for search_derived datasets

* Simplify is_init() for acer_cohort_v1
2024-04-03 15:20:54 -07:00
Winnie Chan 33f9017c75
DENG-2823: Added deprecate cli command (#5219)
* Added deprecate cli command

* Fixed typo

* Fixed failed tests

* Fixed deletion date label

* Update bigquery_etl/metadata/parse_metadata.py

Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>

* Fixed deletion date

* Fixed arguments optional

* Added return back

* Added invalid deletion date test

---------

Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>
2024-03-19 11:17:32 -07:00
m-d-bowerman 765865eaa3
Add `locale` and non-search engage flag (#5224)
* Add `locale` and non-search engage flag

* Fix test failure

* Fix test expect
2024-03-18 14:22:17 -07:00
Anna Scholtz 13fc230fd4
Fix test files for dl_token_ga_attribution_lookup_v1 (#5183) 2024-03-07 14:08:57 -08:00
Katie Windau bdb2361b18
DENG-2979 Update dl_token_ga_attribution_lookup_v1 to use new GA4 client ID after switch to GA4 (#5181)
* DENG-2979 - add logic to use GA4 client ID after it starts coming through instead of old GA3 one, & start updating tests

* DENG-2979 fix SQL formatting

* DENG-2979 - add temp table to do a final distinct

* DENG-2979 add a data check to make sure the code errors out if someone tries to run prior to 8-25-2023

* DENG-2979 update the date of the tests to be more recent and not fall in the data check fail period

* DENG-2979 update test cases
2024-03-07 14:00:29 -06:00
Lucia 84ee88e2b9
Dependabot/pip/black 24.1.1 fix (#5027)
* Bump black from 23.10.1 to 24.1.1

Bumps [black](https://github.com/psf/black) from 23.10.1 to 24.1.1.
- [Release notes](https://github.com/psf/black/releases)
- [Changelog](https://github.com/psf/black/blob/main/CHANGES.md)
- [Commits](https://github.com/psf/black/compare/23.10.1...24.1.1)

---
updated-dependencies:
- dependency-name: black
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

* Reformat files with black to fix dependabot update.

* Reformat with black 24.1.1. Update test dag with required space.

* Update test dags.

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-19 15:27:34 +01:00
Curtis Morales 0a9ee2434a
AD-178 Update event_aggregates tables to pull query_type from the Ads team's table (#4967)
* Pull query_type from adm.blocks table

* Do the same in event_aggregates_v1

* Update tests
2024-02-06 13:37:45 -05:00
kik-kik 9788ff7293
feat(DENG-2481): Add fenix install referrer to fenix firefox android clients (#4940)
* Tweaking firefox_android_clients_v1 to also include play_store attribution fields

* removed additional logic used for testing found within _previous CTE

* removed firefox_android_clients_v1 init.sql in favour of templating via is_init() inside the query

* Made changes as suggested by fbertsch in PR#4940

* Fixing sql tests
2024-02-02 17:42:52 +01:00
Alexander acfbcbfbea
Revert "Update clients_last_seen_v1 and clients_first_seen to clients_first_s…" (#4948)
This reverts commit 39a89a95ea.
2024-02-01 15:30:27 -05:00
Curtis Morales d019d4ebc0
AD-178 Add query_type to event_aggregates_suggest and event_aggregates (#4927)
* Add query_type to event_aggregates_suggest

* Format

* Add query_type to event_aggregates table

* Update event_aggregates tests
2024-01-31 11:27:12 -05:00
Alexander 39a89a95ea
Update clients_last_seen_v1 and clients_first_seen to clients_first_seen_v2 (#4915)
* Update to clients_first_seen_v1

* Update tests
2024-01-30 16:33:01 -05:00
Sean Rose a70b2aa689
Support symlinks (#4881)
* Avoid using `Path.glob()` or `Path.rglob()` for recursive file searches.

Because they don't currently support following symlinks (they will in Python 3.13).

* Specify `followlinks=True` as necessary when calling `os.walk()`.
2024-01-24 13:02:43 -08:00
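The symlink change described in the commit above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the helper name `find_files` and its parameters are hypothetical, and only `os.walk(..., followlinks=True)` comes from the commit message.

```python
import os
from pathlib import Path
from typing import Iterator


def find_files(root: Path, filename: str) -> Iterator[Path]:
    """Yield every file named `filename` under `root` (hypothetical helper).

    os.walk() does not descend into symlinked directories by default,
    so followlinks=True is passed explicitly. Note that a symlink cycle
    would make the walk recurse indefinitely; this sketch does not guard
    against that.
    """
    for dirpath, _dirnames, filenames in os.walk(root, followlinks=True):
        if filename in filenames:
            yield Path(dirpath) / filename
```

Calling `sorted(find_files(Path("sql"), "query.sql"))` would then also return files that are only reachable through a symlinked directory, which a recursive `Path.rglob()` search misses on Python versions before 3.13.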
Sergio E. Betancourt 4661be0248
[AD-145] Adding topsite dismissals to newtab_* jobs. (#4771)
* Adding topsite dismissals to newtab_* jobs.

* Update test to include new dismissal fields

---------

Co-authored-by: Curtis Morales <cmorales@mozilla.com>
2024-01-05 15:17:59 -06:00
Daniel Thorn 25c18112b5
DENG-1352 - Migrate contextual services ETL to desktop glean pings (#4474) 2023-12-07 16:12:07 -08:00
akkomar 82308b91d5
Bug 1866469 - Exclude use_counters from GLAM ETL (#4603)
* Bug 1866469 - Exclude use_counters from GLAM ETL

* Attempt to fix tests

---------

Co-authored-by: Eduardo Filho <edugomfilho@gmail.com>
2023-11-24 16:00:24 -05:00
Frank Bertsch 73b86960fc
Account for NULL handling in joins (#4590)
Previously, NULL values in the join keys didn't join, resulting
in duplicate rows. This change will coalesce those to empty
strings and NULLIFY them in the view.
2023-11-21 08:55:03 -05:00
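The NULL behaviour described in the commit above can be reproduced in a small, self-contained sketch. It uses SQLite through Python's standard library rather than BigQuery, and the table and column names (`left_t`, `right_t`, `client_id`, `campaign`) are hypothetical. It also simplifies the commit's approach by applying `NULLIF` in the same query instead of in a separate view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE left_t (client_id TEXT, campaign TEXT);
    CREATE TABLE right_t (client_id TEXT, campaign TEXT);
    INSERT INTO left_t VALUES ('a', NULL), ('b', 'x');
    INSERT INTO right_t VALUES ('a', NULL), ('b', 'x');
    """
)

# Plain equality: NULL = NULL is not true, so the ('a', NULL) rows never pair up.
plain = conn.execute(
    """
    SELECT l.client_id, l.campaign
    FROM left_t AS l
    JOIN right_t AS r
      ON l.client_id = r.client_id
     AND l.campaign = r.campaign
    ORDER BY l.client_id
    """
).fetchall()

# Coalesced keys: NULL is treated as '' for matching, and NULLIF turns the
# empty string back into NULL in the output.
coalesced = conn.execute(
    """
    SELECT l.client_id, NULLIF(COALESCE(l.campaign, ''), '') AS campaign
    FROM left_t AS l
    JOIN right_t AS r
      ON l.client_id = r.client_id
     AND COALESCE(l.campaign, '') = COALESCE(r.campaign, '')
    ORDER BY l.client_id
    """
).fetchall()

print(plain)
print(coalesced)
```

One trade-off of this technique is that a genuine empty string in a key column becomes indistinguishable from NULL after the round trip.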
Frank Bertsch 27f96027dc
Add ga_clients_v1 table & view (#4560)
* Add ga_clients_v1 table & view

- Query from ga_sessions
- Fix tests

* Use correct scheduling parameters

Co-authored-by: Alexander <anicholson@mozilla.com>

* Move HAVING clause to WHERE

Co-authored-by: Alexander <anicholson@mozilla.com>

* Change CTE name

Co-authored-by: Alexander <anicholson@mozilla.com>

---------

Co-authored-by: Alexander <anicholson@mozilla.com>
2023-11-20 15:45:04 -05:00
Frank Bertsch 05fed88b07
Include GA intraday sessions tables (#4582)
* Include GA intraday sessions tables

* Update doc string on backfilling ga_sessions

* Don't dryrun stub_attribution view
2023-11-20 11:58:45 -05:00
Frank Bertsch 104ece82d9
Add derived stub attribution logs (#4557)
* Add derived stub attribution logs

This table keeps triplets from the stub attribution logs.
The triplet of (dl_token, ga_client_id, stub_session_id)
will only ever appear once here.

See the associated decision brief:
https://docs.google.com/document/d/1L4vOR0nCGawwSRPA9xiR8Hmu_8ozCGUecXAtBWmGGA0/edit

* Move stub attribution table to new dataset

In order to ensure limited access to the stub attribution service
data without significantly decreasing developer velocity, we
move these tables to a new dataset. That dataset has the defaults
we want for all stub attribution log data:
- Defaults to just read access to data-science/DUET workgroup
- No read/write access for DE

We will backfill via the bqetl_backfill DAG.

* Rename view

* Use correct dataset name in view

* Skip dryrun; no access
2023-11-17 16:36:48 -05:00
Frank Bertsch cbb843e455
Add ga_sessions_v1 table & view (#4554)
* Add ga_sessions_v1 table & view

This table aggregates session-level data from GA.

* Rename nullify string func

* Apply suggestions from code review

Co-authored-by: Alexander <anicholson@mozilla.com>

* Add upstream backfill deps

* Move depends_on to correct section

---------

Co-authored-by: Alexander <anicholson@mozilla.com>
2023-11-16 15:58:33 -05:00
Alexander 8a233917ee
Bug 1864722 - Fix column name typo (#4567) 2023-11-14 16:43:15 -05:00
Alexander b6929713ea
DENG-1705 - Add startup_profile_selection_reason from first ping to clients_daily, clients_first_seen_v2 and downstream (#4482)
* DENG-1705 - Add startup_profile_selection_reason to clients_first_seen

* Add startup_profile_selection_reason_first_ping_only

* Query typo

* Update test schema

* Update sql/moz-fx-data-shared-prod/telemetry_derived/clients_first_seen_28_days_later_v1/schema.yaml

Co-authored-by: Lucia <30448600+lucia-vargas-a@users.noreply.github.com>

---------

Co-authored-by: Lucia <30448600+lucia-vargas-a@users.noreply.github.com>
2023-11-13 15:03:23 -05:00
Alexander 7dc6014e70
Rename main_v4 -> main_v5 in ssl_ratios tests (#4536) 2023-11-08 15:42:06 -05:00
Alexander f500ba9b7a
DENG-1705 - Add missing client attribution columns to clients daily/first-seen (#4505)
* DENG-1705 Add missing client attribution columns to clients daily/firstseen
* Update clients_last_seen_joined
2023-11-08 15:13:17 -05:00
Alexander 1be564e23c
DENG-1935 Change data ordering from pings in clients-first-seen-v2 (#4533)
* DENG-1935 Change data ordering from pings in clients-first-seen-v2
* Added main ping for client-3, maintain chosen ping
2023-11-08 10:49:17 -05:00
Alexander c5dd137f26
Fix error_aggregates sample data references (#4462) 2023-10-19 15:43:47 -04:00
Anna Scholtz 34c8cf35e7
Fix error_aggregates tests to reference main_v5 (#4448) 2023-10-17 13:16:40 -07:00
Eduardo Filho a0d1e16101
Remove use_counters from queries and test file referencing main_v5 (#4410) 2023-10-12 11:16:52 -04:00
Eduardo Filho 88a67e0c28
Update glam tests after switching to main_v5 (#4408) 2023-10-11 16:29:44 -04:00
Lucia 6826cbe395
DS-2947 clients_first_seen_v2 (#3962)
* DENG-850 Add test setup.

* DS-2947. Create new dataset and tests for Firefox Desktop Clients.

* DS-2947. Update dataset name to clients_first_seen_v2.

* DS-2947. Dataset name to clients_first_seen_v2.

* DS-2947. Updating tests.

* DS-2947. Schema for clients_first_seen_v2.

* DS-2947. Tests update.

* Tests update

* Restore test files

* DS-2947. Get data from main and new profile ping. Get first dltoken and dlsource available. Update tests.

* DS-2947. Use main ping's submission_timestamp_min to find the earliest ping.

* Remove app_display_version as it is normalised in app_version. Update fields on a 7 day window. Retrieve data from the ping with the earliest NOT NULL value to remove NULLS when main ping is not available.

* Remove app_display_version as it is normalised in app_version. Update fields on a 7 day window. Retrieve data from the ping with the earliest NOT NULL value to remove NULLS when main ping is not available.

* Update schemas, remove duplicated columns from query and init. Adapt existing unit test and add unit test for 7-day window updates. Include scheduler in DAG bqetl_analytics_tables.

* Update to enable initialize from query.sql. Remove init.sql.

* Update DAGs dependencies.

* DAG bqetl_main_summary updated.

* Query and tests update to join with sample_id.

* Refactor metadata fields in query and tests.

* Schema and descriptions updated. Remove filter to query the existing table. Remove the DATETIME, the first_seen_date is equivalent.

* Column required to be explicit in the query to match the schema.

* Test fix.

* Tests tmp changes.

* Remove 7-day window update and update tests.

* Add second_seen_date to the query

* DS-3037 Add second_seen_date and tests.

* DS-3037 Add is_init to calculate second_seen_date.

* remove files in analysis dataset

* DS-3037 Add is_init to calculate second_seen_date. Formatting.

* DS-2986 Add initialize script. Change submission_timestamp_min to submission_date due to NULL values in that field.

* DENG-1314. Update metadata reported pings and tests in the query.

* DS-3054. Update bqetl initialize command and query to support parallel run.

* DS-3054. Update query to use submission_timestamp_min from main ping where available for precision in source ping for first_seen_date, add source ping of second_seen_date, get only first_seen date from new_profile and shutdown ping due to 16% clients with more than one new_profile ping. Add capability to run in parallel in bqetl. Update tests.

* DS-3054. Remove initialize.py.

* Reset unrelated formatting changes from this branch to match the main branch.

* Correct jira template.

* DS-2947. Update naming for attribution dltoken and dlsource.

* DS-2947. Update column names and tests and clarity for the initialization command.

* DS-2986. Create table with schema and metadata in command initialize.

* Document what is the result expected from each subquery.

* Documentation update.

* DS-3145. Include user agent and the source ping. Add query documentation.

* DS-3145. Update tests.

* DS-3146 Update logic to get attributes only from the ping that reports the first_seen_date, include locale, update the source for app_build_id and collect second_seen_date only from main ping.

* DS-3146 Update logic to get attributes only from the ping that reports the first_seen_date, include locale, update the source for app_build_id and collect second_seen_date only from main ping.

* DS-3054. Updates and save initialization.

* DS-3054. Table name required for DAG generation.

* DS-2947_implement_bigquery_changes_in_another_PR.

* DS-2947 Naming

* Add clients_first_seen_v2 to skip dry-run.

---------

Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
2023-09-25 16:39:04 +02:00
kik-kik 8782b6e196
removing firefox_accounts_derived fxa_users_daily_v1 test as it appears broken (#4296) 2023-09-14 16:13:40 +02:00
Anna Scholtz 3f79cc5151
Generate soft etl checks (#4268)
* Add markers to check cli command to differentiate warning from hard failures

* Fix CI issues

* Fix dag generation

* Incorporate Feedback

* Generate Airflow tasks for #fail and #warn checks

---------

Co-authored-by: Alekhya Kommasani <akommasani@mozilla.com>
Co-authored-by: Alekhya <88394696+alekhyamoz@users.noreply.github.com>
2023-09-13 10:22:39 -07:00
Alekhya 2e916eb856
DENG1381 - Add bqetl support for deprecation metadata (#4213)
* Support bq dataset deprecation process (metadata)

* Add bqetl metadata cli command

* Initial draft for adding deprecation support to bqetl

* Incorporate Anna's feedback

* Fix based on whd's feedback

* Fix ci issues

* Remove unnecessary logic from metadata.py

* Add dataset metadata yaml for ga_derived

* Ignore dirs that do not have dataset_metadata yaml

* Remove unwanted dataset metadata yamls

* Update bigquery_etl/cli/metadata.py

Co-authored-by: whd <whd@users.noreply.github.com>

---------

Co-authored-by: whd <whd@users.noreply.github.com>
2023-09-12 18:47:54 +00:00
m-d-bowerman e319825e83
Add sov branch to newtab visit topsites (#4274)
* Add sov branch to newtab visit topsites

* Update query to match schema

* Add newtab_sov to test schema
2023-09-12 07:41:25 -07:00
Lucia e6a46bbef9
Rename tests schema for activations. (#4197)
Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
2023-08-16 12:16:22 -07:00
Lucia c22850ec7a
Query the channel from the first ping reporting it to avoid NULLs. Get the metadata.reported_baseline_ping from the subset that is used to find the baseline ping's first_seen_date. (#4195)
Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
2023-08-16 18:55:46 +02:00
Winnie Chan 994ef2ed4b
DENG-1187 Added reported baseline ping column to fenix android (#4073)
* Added reported baseline ping column
---------

Co-authored-by: Lucia <30448600+lucia-vargas-a@users.noreply.github.com>
Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
2023-08-15 12:18:51 -07:00
Winnie Chan 1743e70139
DENG-1188: Include beta and nightly in firefox_android_clients (#4080)
* Added channel column

* Fixed channel typo

* Removed channel filter from activations

* Added tests

* Removed todo comment

* Updated metrics init date

* Removed channel grouping

* Reset tests

* Added baseline channels test

* Update tests/sql/moz-fx-data-shared-prod/fenix_derived/firefox_android_clients_v1/test_baseline_channels/moz-fx-data-shared-prod.fenix.baseline_clients_daily.yaml

Co-authored-by: Lucia <30448600+lucia-vargas-a@users.noreply.github.com>

* Update tests/sql/moz-fx-data-shared-prod/fenix_derived/firefox_android_clients_v1/test_baseline_channels/expect.yaml

Co-authored-by: Lucia <30448600+lucia-vargas-a@users.noreply.github.com>

* Update tests/sql/moz-fx-data-shared-prod/fenix_derived/firefox_android_clients_v1/test_baseline_channels/expect.yaml

Co-authored-by: Lucia <30448600+lucia-vargas-a@users.noreply.github.com>

* Fixed failed seq test

---------

Co-authored-by: Lucia <30448600+lucia-vargas-a@users.noreply.github.com>
2023-08-14 19:36:58 +02:00
Winnie Chan fadc1eb1b9
DENG-801: Fix firefox_android_clients activations (#4093)
* Added activations inner join

* Fixed unit tests

* Added activation tests

* Update sql/moz-fx-data-shared-prod/fenix_derived/firefox_android_clients_v1/query.sql

Co-authored-by: Lucia <30448600+lucia-vargas-a@users.noreply.github.com>

* Removed comment

* Removed inner join

* Minor fixes

---------

Co-authored-by: Lucia <30448600+lucia-vargas-a@users.noreply.github.com>
2023-08-14 19:13:11 +02:00
kik-kik b927ed22be
feat(DENG-949): Added `render` subcommand and `--dry-run` flag to the bqetl check command (#4045)
* added render subcommand to the bqetl check command

* added a dry_run flag to bqetl check run command

* added a test to make sure run command exits with status code 0

* added test for check render subcommand

* fixing linter checks

* attempting using an alternative way of testing the render command

* fixing render test by testing the _render() directly rather than the render cli wrapper

* removed dead test

* Apply suggestions from code review by ascholtz

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

* fixed black and mypy errors

* fixed app_store_funnel_v1 check formatting

* reformatted tests checks

---------

Co-authored-by: Anna Scholtz <anna@scholtzan.net>
2023-08-09 16:39:47 +02:00
Winnie Chan 61b9b2ffe3
DENG-772: Fenix population with first_session ping and min seq (#4090)
* Changed to min seq and capture null seq
2023-07-28 08:55:39 -07:00
Eduardo Filho 25d4ab4042
Bug 1844886: Add non-norm cols to glam scalar tbls (#4111)
* Add non-norm cols to scalar tbls

* Add missing schema to scalar_percentiles_v1

* Add missing schema to scalar_percentiles_v1
2023-07-25 11:13:20 -04:00
Eduardo Filho 144a508ee6
Glam non norm agg (#3873)
* glam: Partition clients_histogram_aggregates by sample_id (has been running like this since April 3 from a different branch)

* glam: Non normalized aggregations to legacy histograms

* glam: add non-normalized aggs to probe counts extract

* glam: add init.sql to relevant tbls for non-norm aggs

* glam: ignore dryrun histogram_percentiles

* glam: add description and eol to init

* glam: Partition clients_histogram_aggregates by sample_id (has been running like this since April 3 from a different branch)

* glam: Non normalized aggregations to legacy histograms

* glam: add non-normalized aggs to probe counts extract

* glam: add init.sql to relevant tbls for non-norm aggs

* glam: ignore dryrun histogram_percentiles

* glam: add description and eol to init

* fix schema files

* fix clients_histogram_probe_counts schema

* remove another init.sql

* fix dryrun ignore order

* fix table name

* change dryrun ignore order to try avoiding fenix for being on path

* another change in dryrun

* Move glam queries from dryrun to bqetl_project.yaml to ignore

* add tbl deps on tests
2023-07-19 17:01:58 -04:00
Rebecca BurWei 4ffdb8484a
Add has_adblocker_addon to search_clients_daily (#3558)
* feat: adblocker addons field

* Update sql/moz-fx-data-shared-prod/search_derived/search_clients_daily_v8/query.sql

Co-authored-by: Curtis Morales <cmorales@mozilla.com>

* fix: use private table

* fix: where clause for private table

* Reference static addons table

* Switch to new monetization_blocking_addons table

* Drop project name in reference to monetization_blocking_addons

* Don't dry-run search_clients_daily

* Add has_adblocker_addons to search_clients_daily_v8 tests

---------

Co-authored-by: Curtis Morales <cmorales@mozilla.com>
2023-07-19 12:04:38 -04:00
Sean Rose a4fdf3c65e
Include new fields for SubPlat in FxA events views (DENG-1006) (#3926)
* Include new fields for SubPlat in `fxa_content_auth_stdout_events`.

* Include new fields for SubPlat in `nonprod_fxa_content_auth_stdout_events`.

* Include new fields for SubPlat in `fxa_all_events`.

* Move new `time` column to be by the other timestamp columns.

* Keep `subscribed_plan_ids` as a string so it's accessible in Looker.

* Add `schema.yaml` files for FxA events ETLs.

So the tables can be successfully staged for CI for downstream ETLs/views to pass.

* Fully qualify view in `fxa_users_daily_v1` to try to get test to pass.

* Rename `time` column `event_time`.

* Include new fields for SubPlat in `nonprod_fxa_all_events`.

---------

Co-authored-by: Daniel Thorn <dthorn@mozilla.com>
2023-06-23 15:41:13 +00:00
Alexander 8423c7ad2e
Use baseline_clients_daily instead of ping and first_seen for fenix_android_clients (#3910)
* Change source for first_seen and baseline to baseline_clients_daily

* Edit tests and schemas

* Update to fenix.baseline_clients_daily
2023-06-08 14:50:14 +00:00
Alexander 505c895f62
GROWTH-41 Add last_seen columns to firefox_android_clients (#3863)
* Added last_reported columns

* Fixed tests

* Added missing locale field
2023-06-05 16:37:59 +00:00
Lucia cbe42ab9a9
Deng 850 firefox android clients reported ping (#3789)
* DENG-850 Retrieve FALSE instead of NULL in the metadata when there isn't first_session or metrics ping.

* DENG-850 Unit test for no first session ping.

* DENG-850 syntax fix

* DENG-850 Tests for first session ping, and no baseline ping.

* DENG-850 Tests suite.

* DENG-850 YAML fixes.

* DENG-850 Adjustment to the case of reported first_session and metrics ping. The unit tests are adjusted to get the value for reported pings.

* DENG-850 Add sample id to test.

---------

Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
2023-05-26 12:50:08 +02:00
Alexander db604e4b3d
DENG-796 - newtab_visits (#3762)
Add new table - Newtab Visits
2023-05-25 09:32:25 -04:00
Glenda Leonard a31072d408
DENG-775 downloads_with_attribution_v2 (#3716)
* DENG-775 Added session_id to JOIN between GA data and stub_attr.stdout.  Also expanded date range on GA session data to [download_date - 2 days, download_date + 1 day]

* Updated query to handle missing GA download_session_id.  It effectively applies V1 logic to the MISSING_GA_CLIENT dl_tokens.
2023-04-26 14:47:34 -04:00
Anna Scholtz 48d8c7603d
Metric hub integration - rewrite SSL ratios to use metrics (#3698)
* Add metrics.data_source()

* Rewrite SSL ratios to use metrics

* Fix docs formatting
2023-04-04 15:41:44 -07:00
Glenda Leonard b13f45bc63
DENG-658 - Initial table definitions for dl_token processing. (#3644)
* Initial table definitions for dl_token processing.  Includes update to sql pytest_plugin to account for tablenames with date suffixes.

* Removed cluster reference and shortened description

* Added sql/moz-fx-data-marketing-prod/ga_derived/downloads_with_attribution_v1/query.sql to dryrun skip

* Added time_on_site

* Moved country_names sample test data file.

* Update bigquery_etl/pytest_plugin/sql.py

Co-authored-by: Daniel Thorn <dthorn@mozilla.com>

* Update sql/moz-fx-data-marketing-prod/ga_derived/downloads_with_attribution_v1/query.sql

Co-authored-by: Frank Bertsch <frank.bertsch@gmail.com>

* Update sql/moz-fx-data-marketing-prod/ga_derived/downloads_with_attribution_v1/query.sql

Co-authored-by: Frank Bertsch <frank.bertsch@gmail.com>

* Updated based on PR feedback.  Added LEFT JOIN to ensure sessions without pageviews are not dropped.

* Set has_ga_download_event = null if exception=GA_UNRESOLVABLE

* Standardized logic for time_on_site

* - Added test for multiple downloads for 1 session
- Added detailed description of table.

* Updated to use mode_last_retain_nulls instead of ANY_VALUE

* Set pageviews, unique_pageviews = 0 if null.

* Added boolean additional_download_occurred to indicate if another download occurred in the same session.

---------

Co-authored-by: Daniel Thorn <dthorn@mozilla.com>
Co-authored-by: Frank Bertsch <frank.bertsch@gmail.com>
2023-03-23 16:45:58 -04:00
kik-kik ed1317cb7d
bug(): added org_mozilla_firefox_derived/client_deduplication_v1 to dry run skip (#3667)
* added org_mozilla_firefox_derived/client_deduplication_v1 to dry run skip

* fixing fxa_users_daily_v1 test
2023-03-16 15:42:04 +00:00
Rebecca BurWei 26366152b9
RS-595 (#3518)
* feat: new field in search clients daily - is_sap_monetizable

* Added column to tests

---------

Co-authored-by: Alexander Nicholson <anicholson@mozilla.com>
2023-03-10 12:38:52 -05:00
Alexander 60c85e7c54
Revert CI changes for private UDFs and add stub documentation - DENG-735 (#3652)
* Revert "CI fixes for supporting private UDFs in bigquery-etl - DENG-735 (#3631)"

This reverts commit edcfe758f7.

* Added stub UDF for monetized_search

* Add docs for using a private internal UDF
2023-03-10 11:41:46 -05:00
Alexander edcfe758f7
CI fixes for supporting private UDFs in bigquery-etl - DENG-735 (#3631)
* Minimize stub normalize_search_engine UDF and usage in search_clients_last_seen tests
* Move sql tests downstream of private-generate-sql and copy UDFs into sql-dir for tests
2023-03-09 16:54:42 -05:00
m-d-bowerman 69c0c1d5fc
[DS-2566] Add sidebar search probes to clients_daily tables (#3629)
* Add sidebar search probes to clients_daily tables

* update downstream schemas

* Format fix

* Add new field to test

* Update schemas with hist fields

* Remove duplicated field from schema

---------

Co-authored-by: Glenda Leonard <75265513+gleonard-m@users.noreply.github.com>
2023-03-08 18:14:58 -05:00
Leli 05975e88ff
Fivetran remove dev (#3619)
* remove dev destination

* fivetran - daily_connector_costs restructure CTEs for clearer GROUP BY
2023-03-02 13:00:48 +01:00
Leli 5aebf9cae9
fivetran costs change tables (#3599)
* fivetran costs change tables

* fivetran costs - incorporate code review

* fivetran costs - change monthly_costs back to coalesce

* fivetran costs - fix query
2023-02-27 19:26:21 +01:00
Leli 5108c2d307
add daily_active_rows to fivetran_costs (#3570)
* add daily_active_rows to fivetran_costs

* regenerate dag

* regenerate dag with black 23.1.0
2023-02-08 13:45:49 +01:00
Daniel Thorn ac053c326a
Update dependencies missed by dependabot (#3566) 2023-02-06 12:14:32 -08:00
Curtis Morales 570f3bbab3
Fix urlbar_clients_daily_v1 (#3549)
* Fix urlbar_clients_daily_v1

* Rename test file

* Fix new name

* Rename one more test case file
2023-02-01 10:57:57 -05:00
Leli d161c02ef1
Fix Fivetran Costs calculations (#3547)
* Add fivetran_costs

* Add fivetran_costs - adding schemas to all tables and addressing other suggestions

* Add fivetran_costs - renaming tables, adding to the schema

* Add fivetran_costs - adding fivetran-dev

* Add fivetran_costs - adding tests

* Add fivetran_costs - adding tests

* Add fivetran_costs - adding tests

* Add fivetran_costs - adding tests

* implementing suggestions

* rerun dag creation

* fixing the tests

* fixing errors

* change position of rounding
2023-02-01 13:30:09 +01:00
Leli 8b0158c8dc
Add fivetran_costs (#3509)
* Add fivetran_costs

* Add fivetran_costs - adding schemas to all tables and addressing other suggestions

* Add fivetran_costs - renaming tables, adding to the schema

* Add fivetran_costs - adding fivetran-dev

* Add fivetran_costs - adding tests

* Add fivetran_costs - adding tests

* Add fivetran_costs - adding tests

* Add fivetran_costs - adding tests

* implementing suggestions

* rerun dag creation
2023-01-31 16:27:29 +01:00
kik-kik 794ae7b22e Update sql/moz-fx-data-shared-prod/firefox_accounts_derived/funnel_events_source_v1/query.sql
Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>
2022-12-15 13:11:07 +01:00
Chelsea Troy a1a155a22f
Add urlbar_persisted to query for daily search client table (#3349)
* Add urlbar_persisted to query for daily search client table

* Add column to schema (not backward breaking)

* Address test expectations

* Update schemata and queries for companion tables

* Make adjustments that Alex identified in the PR to make sure the new fields get ingested properly

* Run schema update for clients_daily_v6
2022-12-05 15:33:43 -05:00
Anna Scholtz a05e0dd80c Update event_aggregates_v1 tests 2022-11-17 08:34:36 -08:00
Alekhya 56983f3e8d
Add Glean iOS Focus and Klar to search metrics (#3285)
correct the column for default search engine

add tests and iOS to views
2022-10-18 13:52:53 -04:00
Frank Bertsch ce992ca411
Bug 1791580 - Add qualified use to clients_last_seen (#3232)
* Bug 1791580 - Add qualified use to clients_last_seen

* Add new fields for private/normal browsing URIs

* Update clients_last_seen_joined schema

* Reformat view

* Update and extend tests
2022-10-11 09:27:40 -04:00
Alexander 588d468dc8
Hoist schemas in SQL tests up to table dir (#3145) 2022-08-17 13:11:24 -04:00
wil stuckey 4034b1a93d
Update table declarations to include session_id, sequence_no in sanitized and impression tables (#3092)
Co-authored-by: whd <whd@users.noreply.github.com>
2022-08-09 16:40:09 +00:00
Nan Jiang 2b67d988d7
CONSVC 1898: Add suggest_data_sharing_enabled to event_aggregates (#3116) 2022-07-26 16:55:49 -04:00
Alexander f99f112336
Android Focus search ETL - DO-824, Bug 1749833 (#2682)
Added glean data for Focus on Android to `mobile_search_clients_daily_v1`
2022-07-26 16:25:32 -04:00
Nan Jiang 8aa9596ff4
CONSVC 1813: Include iOS data into Contextual Services derived dataset (#3049) 2022-06-30 11:05:04 -04:00
wil stuckey a78127a050
#1775029 Update the suggest_impression_sanitized_v3 query (#3041)
* Update the suggest_impression_sanitized_v3 query

* exclude region and country when preparing for the join
* filter `impressions.request_id` to non null to drop queries without a
  corresponding impression.
* Add tests? Haven't figured out bootstrapping issues on my M1 yet so not
  sure how well these will work. TO CI!

* Swap the left and right and remove the conditional on the final join

* Align expectations

* Will this fix the tests? tune in to find out.

* Fix expectations AGAIN

* Update based on review comments and formatter changes
2022-06-23 15:50:16 +00:00
Anna Scholtz 2f5c6ac41a Generate ExternalTaskMarkers for Airflow downstream dependencies 2022-06-22 11:05:25 -07:00
akkomar ceda6dd35f
Use approximate client count in GLAM scalar_percentiles_v1 (#3039)
This is a follow-up to https://github.com/mozilla/bigquery-etl/pull/3037 which unblocked `scalar_bucket_counts_v1`.
`scalar_percentiles_v1` uses the same source table (`clients_scalar_aggregates_v1`) and started failing today with the same error (disk/memory limits exceeded for shuffle operations).

`APPROX_COUNT_DISTINCT`, used here, runs HLL under the hood. We use it because the quantiles calculation prevents us from splitting the aggregation into two stages as in the aforementioned PR.

I have run this query locally and confirmed that it works.
2022-06-21 10:55:08 -04:00
Nan Jiang 97f676cc8e
CONSVC 1800: Add os to the Contextual Services derived dataset (#3027)
* CONSVC 1800: Add os to the Contextual Services derived dataset

* Review fixes

Co-authored-by: Jeff Klukas <jklukas@mozilla.com>
2022-06-16 23:31:17 +00:00
Nan Jiang e4b180dbbc
Bug 1757768 (follow-up): Fix test failures (#3017) 2022-06-13 21:34:04 +00:00
Nan Jiang 44399cd7af
Bug 1757768: add match_type to contextual services derived dataset (#2897)
* Bug 1757768: add match_type to contextual services derived dataset

* f test
2022-06-08 17:52:41 +00:00
Rebecca BurWei 997708a74f
Add country to urlbar_clients_daily (#3009)
* Add country to urlbar_clients_daily
2022-06-03 15:26:59 -04:00
Jeff Klukas 3db4633376
CONSVC-1681 Add mobile data to contextual services event_aggregates (#2805)
* CONSVC-1681 Add mobile data to contextual services event_aggregates

See https://mozilla-hub.atlassian.net/browse/CONSVC-1681

* Use 'phone' instead of 'mobile'

* Update init.sql

* Commentary on filter

* Aggregation test update

* Update overactive filter test

* Dry run exemptions

* Update sql/moz-fx-data-shared-prod/contextual_services_derived/event_aggregates_v1/query.sql

* format
2022-03-31 10:53:29 -04:00
Rebecca BurWei 78885a77dd
Add experiments field from clients_daily to urlbar_clients_daily (#2814)
* Add experiments field

* Updated urlbar_clients_daily schema

* Updated tests

Co-authored-by: Jeff Klukas <jklukas@mozilla.com>
Co-authored-by: Alexander Nicholson <anicholson@mozilla.com>
2022-03-23 16:48:16 -04:00
Alekhya a76cd01efa
add minimum client count for fenix (#2642)
add minimum client count for fenix
2022-01-12 11:49:59 -05:00
Alekhya 4b178bc49b
Minimum client counts (#2628)
* added minimum client count for desktop

* Update the sql query

* Updated the total_users threshold to > 375 rather than 100
2022-01-06 14:50:29 -05:00
Jeff Klukas 59e16919aa
Lowercase and trim in suggest_impressions_sanitized_v2 (#2581)
This better matches the current client behavior for matching.
We're currently getting `<disallowed>` in results and results with uppercase
letters. I don't think preserving these differences has analytical value,
and it makes the results harder to work with.
2021-12-16 08:16:19 -05:00
Jeff Klukas ec7e68f213
ROAD-85 Simple sanitization job for Merino logs (#2522)
* ROAD-85 Simple sanitization job for Merino logs

See https://mozilla-hub.atlassian.net/browse/ROAD-85

This uses the adM allowlist of queries for sanitization, so it can be expressed
entirely in a single query. Future iterations will involve Python logic and
will likely need to be held elsewhere.

* Separate external query and query to copy data into shared-prod
2021-12-14 16:18:24 -05:00
Alekhya 2f1413fee1
Revert "correcting minimum client count - desktop and fenix (#2544)" (#2566)
This reverts commit 5b743090b4.
2021-12-10 10:15:52 -05:00
Alekhya 5b743090b4
correcting minimum client count - desktop and fenix (#2544)
* correcting minimum client count - desktop and fenix

* corrected test cases for desktop

* corrected the join for desktop
2021-12-06 10:14:42 -05:00
Alexander Nicholson 48d3ac3f60
Bug 1742183 Added iOS probes to mobile_search_clients_daily (#2526) 2021-11-25 17:19:49 -05:00