Commit graph

5967 commits

Author SHA1 Message Date
Katie Windau f57ce59d89
DENG-4847 revert earlier change (#6214) 2024-09-17 19:40:10 +00:00
Anna Scholtz 13349e8589
Generate Bigeye monitoring configs in CI (#6194)
* Generate Bigeye monitoring configs in CI

* Ensure Bigconfig files are loaded exactly once

* Avoid duplicate validation of Bigconfig files

* Authentication using API key to Bigeye

* Remove api-key option for Bigeye and rely on env var instead
2024-09-17 16:15:20 +00:00
kik-kik 1cbc8a5001
feat: Bump up some of the project dependencies (#6211) 2024-09-17 15:28:56 +00:00
dependabot[bot] 3e43192a36
Bump types-python-dateutil from 2.9.0.20240316 to 2.9.0.20240906 (#6209)
Bumps [types-python-dateutil](https://github.com/python/typeshed) from 2.9.0.20240316 to 2.9.0.20240906.
- [Commits](https://github.com/python/typeshed/commits)

---
updated-dependencies:
- dependency-name: types-python-dateutil
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-17 14:03:27 +00:00
Lucia 68159a7f1d
Test mitigation (#6205)
* Add default values in template to fix sqlglot parsing error.

* Adding backfill_date to exception message. Formatting.

* Improve getting collumn dtypes by including the schema files. Change custom_query for custom_query_path for readibility.

* DENG_4733. Improve getting column dtypes by including the schema files. Change custom_query for custom_query_path for readibility.

* Formatting.

* Set values back to NULL were corresponds. Improve output information.

* Rename custom_query to custom_query_path to match the expected parameter.

* Missing import

* Larger wildcards to reduce the chance of collision with actual values.
2024-09-17 12:09:17 +00:00
akkomar c30cbfbf5a
Fix Shredder monitoring query (#6210)
Curly braces need to be escaped.

Fixes https://github.com/mozilla/bigquery-etl/pull/6197/files#diff-57d8ebcb18d45fb7886e62272db614409d5d34e58e9a77d421a94130f3ca1dcd
2024-09-17 10:18:00 +00:00
dependabot[bot] dc0d39edc1
Bump mkdocs-material from 9.5.12 to 9.5.13 (#5178)
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 9.5.12 to 9.5.13.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/9.5.12...9.5.13)

---
updated-dependencies:
- dependency-name: mkdocs-material
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-16 20:08:26 +00:00
Katie Windau 3bd53ce175
feat(DENG-4847): Add profile_group_id to telemetry_derived.clients_daily_scalar_aggregates_v1 (#6207)
* DENG-4847 update test schema json for main_v5 to include profile_group_id

* DENG-4847 add profile_group_id to clients_daily_scalar_aggregates_v1
2024-09-16 18:59:09 +00:00
Ben Wu 53f385510c
[DENG-4641] Add support for shredding per sample id to shredder (#6197)
* [DENG-4641] Add support for shredding per sample id to shredder

* Add comment explaining connection_pool_max_size
2024-09-16 17:34:38 +00:00
Marlene Hirose 460d5e9367
remove backfill file (#6204) 2024-09-13 20:24:40 +00:00
Marlene Hirose ea3e84a28d
Deng 4177 create desktop retention aggregates v2 (#6203)
* Desktop_retention_aggregates_v2 changes the data source from clients_first_seen_v2 to v3 as well as adding distribution_id to table

* add owner1 label

* add backfill.yaml script to initiate backfill on this table
2024-09-13 19:04:46 +00:00
Katie Windau 528df9c542
DENG-4684 add profile_group_id to feature_usage_v2 (#6200) 2024-09-13 19:01:46 +00:00
Leli e2f89e0373
Change owners on lelis things (#6198)
* add sean

* add anna

* add chelsey

* add frank
2024-09-13 18:02:33 +00:00
Marlene Hirose ce267f6209
mark backfill as complete (#6201) 2024-09-13 17:48:58 +00:00
dependabot[bot] b23448ecdc
Bump aiohttp from 3.9.4 to 3.10.2 (#6192)
Bumps [aiohttp](https://github.com/aio-libs/aiohttp) from 3.9.4 to 3.10.2.
- [Release notes](https://github.com/aio-libs/aiohttp/releases)
- [Changelog](https://github.com/aio-libs/aiohttp/blob/master/CHANGES.rst)
- [Commits](https://github.com/aio-libs/aiohttp/compare/v3.9.4...v3.10.2)

---
updated-dependencies:
- dependency-name: aiohttp
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-13 16:51:04 +00:00
Marlene Hirose ea2ad740a7
Desktop_retention_aggregates_v2 changes the data source from clients_first_seen_v2 to v3 as well as adding distribution_id to table (#6199) 2024-09-13 16:30:38 +00:00
Eduardo Filho 26d15abc18
chore: update PR template (#6186)
* Update PR template

* Make link to checklist bold
2024-09-13 14:23:25 +00:00
Katie Windau 9594cf8e47
DENG-4807 Add profile_group_id to events_live view (#6195) 2024-09-13 01:29:48 +00:00
Marlene Hirose 6e988cc446
add backfill.yaml file (#6196) 2024-09-12 22:19:10 +00:00
Anna Scholtz d326cd54b6
Fix skipping of apps in glean_usage (#6191)
* Fix skipping of apps in glean_usage

* Fix event error monitoring skip apps
2024-09-12 15:29:35 -07:00
Marlene Hirose b346a23afe
Desktop_retention_clients_v2 changes the data source from clients_fir… (#6193)
* Desktop_retention_clients_v2 changes the data source from clients_first_seen_v2 to v3. There are changes to logic for is_desktop, os, os_version as well as adding profile_group_id and days_active_bits

* add owner1 name to metadata.yaml
2024-09-12 21:18:12 +00:00
Anna Scholtz 47b9a51972
Add CLI commands to deploy BigConfig files (#6169)
* Add CLI commands to deploy BigConfig files

* Review feedback

* Ignore bigeye_credentials changes
2024-09-12 20:16:35 +00:00
Katie Windau bd212e89be
DSRE-1681 add sensor (#6188) 2024-09-12 19:30:01 +00:00
Katie Windau 1e5e3ff926
DSRE-1681 Create GA3 and GA4 SUMO data tables in shared-prod (#6179)
* initial commit work in progress

* DSRE-1681 - create new tables for GA3 and GA4 SUMO data, copies of raw

* DSRE-1681 remove clustering from both metadata.yaml files
2024-09-12 16:45:00 +00:00
Lucia a770744e63
Enhancements to shredder mitigation (#6174)
* Add default values in template to fix sqlglot parsing error.

* Adding backfill_date to exception message. Formatting.

* Improve getting collumn dtypes by including the schema files. Change custom_query for custom_query_path for readibility.

* DENG_4733. Improve getting column dtypes by including the schema files. Change custom_query for custom_query_path for readibility.

* Formatting.
2024-09-12 16:09:51 +00:00
Arkadiusz Komarzewski c53af44abd
Allow skipping apps in event monitoring generator
Also skip ads_backend because its dataset is restricted.
2024-09-12 17:50:07 +02:00
Ben Wu acbbe849ae
Add error message to failed deploys (#6182) 2024-09-12 16:29:47 +01:00
Leli b90f24729b
remove leli from braze dag docs (#6181)
* remove leli from braze dag docs

* also remove email
2024-09-12 17:25:43 +02:00
akkomar 0842f32d8b
Unskip ads_backend in glean_usage generator (#6180)
This will generate events_stream and event_monitoring_live tables for ads_backend. We originally skipped this app because there was a delay in its ping table deployment.
2024-09-12 14:13:46 +02:00
Sergio E. Betancourt bcc0644719
[RS-1278] Desktop tiles forecasting inputs table (#5929)
* Create derived table containing data inputs into desktop tiles forecasts.

* Create derived table containing data inputs into desktop tiles forecasts.

* Query adjustments and better data descriptions

* Making sure table descriptions are identical.

* Correcting partitioning field

* updated to avoid manual lookback and removed session columns

* Update sql/moz-fx-data-shared-prod/ads_derived/desktop_tiles_forecast_inputs_v1/metadata.yaml

Co-authored-by: Curtis Morales <cmorales@mozilla.com>

* Apply suggestions from code review

Co-authored-by: Curtis Morales <cmorales@mozilla.com>

* move max date filter

* add dag for job

* column name change _3 to _1to3

* remove monthly run and add the new table to bqetl_ads

* Apply suggestions from code review

Co-authored-by: Curtis Morales <cmorales@mozilla.com>

* fix name change in schema

---------

Co-authored-by: Jared Snyder <jsnyder@mozilla.com>
Co-authored-by: Jared Snyder <jaredssnyder@gmail.com>
Co-authored-by: Curtis Morales <cmorales@mozilla.com>
2024-09-11 20:34:03 -05:00
Corban Cloud a26ce8b19e
DSRE-1683: Enables OIDC for GCR Artifact Push (#6178)
* Add tested context to private GCR, add OIDC flag

* upgrade orbs
2024-09-11 16:18:05 -07:00
Marlene Hirose 0f0ab635b1
add udf for paid vs organic attribution (#6177)
* add udf for paid vs organic attribution

* fix formatting of query file
2024-09-11 13:27:13 -07:00
Katie Windau a9a65d2511
DSRE-1681 Create 2 new datasets, sumo_ga_derived & sumo_ga (#6172) 2024-09-11 14:00:12 -05:00
Preethi Issac 842eaf9f0c
updating udf.sql with fakespot suggest results (#6176)
* updating udf.sql with fakespot suggest results

* format updates
2024-09-11 14:47:41 -04:00
Marlene Hirose 97ec89723d
remove first_seen_year from query and schema (#6175)
* remove first_seen_year from query and schema

* take out commented out column

* format query.sql
2024-09-11 10:10:14 -07:00
Marlene Hirose 7a4c673c99
reformat query file (#6171) 2024-09-10 16:23:09 -07:00
Marlene Hirose a86a7ff632
add CTEs, change join to get all the data, add coalesce for is_dau (#6170) 2024-09-10 14:17:56 -07:00
Katie Windau 08f9ad97ec
DENG-4754 update ga sessions v2 logic (#6168)
* DENG-4754 adjust logic for query.sql

* DENG-4754 update logic
2024-09-10 15:38:44 -05:00
Katie Windau 93597594a9
DENG-4754 Add 2 new columns to ga_sessions_v2 (#6167) 2024-09-10 14:28:56 -05:00
Lucia 7cc941a5cb
Add default values in template to fix sqlglot parsing error. (#6166) 2024-09-10 08:47:11 -05:00
akkomar 0d9975ad85
Bug 1915724 - Update accounts events SQL check (#6165) 2024-09-10 12:12:21 +02:00
Marlene Hirose 78a5582e43
initial add of query.sql and metadata.yaml files. (#6161)
* initial add of query.sql and metadata.yaml files

* format sql file to stop Circle CI error

* label count of client_ids, add schema.yaml file

* add dag_name to metadata.yaml

* take out extraneous columns, reorder to match query

* add date partition parameter as first_seen_date

* add period to end of descriptions

* take out explicit dependency

* make query more efficient by setting first_seen_date to equal @submission_date and explicity use Inner join instead of left
2024-09-09 13:07:11 -07:00
skahmann3 a767e65a21
Complete search revenue levers daily backfill (#6160)
* Need proper aggregation when merging to fix DAU

* Initiate another managed backfill

* reformatting

* finish backfill

* Fix sql format'

'

---------

Co-authored-by: Alekhya Kommasani <akommasani@mozilla.com>
Co-authored-by: Alekhya <88394696+alekhyamoz@users.noreply.github.com>
2024-09-09 15:15:40 -04:00
Marlene Hirose eef6802e70
update comment to be clients_first_seen_v2 to clients_first_seen_v3 (#6162) 2024-09-09 10:13:35 -07:00
Katie Windau 5ae7e91a4b
DENG-4152 fix SQL to read from new shared-prod path (#6164) 2024-09-09 11:00:38 -05:00
Abhishek Aggarwal b5f4adb19e
DENG-4691: Add view for fakespot cost data in dataservices prod project (#6163)
* DENG-4691: View for fakespot cost data in "dataservices-high-prod" project
    - Only Kuberay created Ray serve app costs are included
2024-09-09 16:42:54 +02:00
Katie Windau ae96d4e981
DENG-4152 - move GA4 tables (#6136)
and views to shared-prod

Co-authored-by: Marlene Hirose <92952117+Marlene-M-Hirose@users.noreply.github.com>
2024-09-06 10:39:55 -07:00
Lucia 9fc513ba9f
Generate query with shredder mitigation (#6060)
* Auxiliary functions required to generate the query for a backfill with shredder mitigation.

* Exception handling.

* isort & docstrings.

* Apply flake8 to test file.

* Remove variable assignment to different types.

* Make search case insensitive in function.

* Add test cases for function and update naming in a funcion's parameters for clarity.

* Update bigquery_etl/backfill/shredder_mitigation.py

Co-authored-by: Leli <33942105+lelilia@users.noreply.github.com>

* Add test cases for missing parameters or not matching parameters where expected. minimize the calls for get_bigquery_type().

* Encapsulate actions to generate and run custom queries to generate the subsets for shredder mitigation.

* Query template for shredder mitigation.

* Query template for shredder mitigation and formatting.

* Add check for "GROUP BY 1, 2, 3", improve code readibility, remove unnecesary properties in classes.

* Test coverage. Check for "GROUP BY 1, 2, 3", improve readibility, remove unrequired properties in class Subset.

* Increase test coverage. Expand DataType INTEGER required for UNION queries.

* Increase test coverage. Expand DataType INTEGER required for UNION queries.

* Separate INTEFER and NUMERIC types.

* Move util functions and convert method to property, both to resolve a circular import. Adjust tests. Update function return and tests.

* Adding backfill_date to exception message. Formatting.

* Adding backfill_date to exception message. Formatting.

---------

Co-authored-by: Leli <33942105+lelilia@users.noreply.github.com>
2024-09-06 16:21:56 +02:00
kik-kik 1570bf1682
feat: include is_daily_user filter to ensure only new profiles are included (#6042)
* feat: include is_daily_user filter to ensure only new profiles are included and use submission_date as first_seen_date so that the downstream query date filter works correctly and limits processing

* feat: additional guard clause to include only new_profile true rows where submission_date = first_seen_date
2024-09-05 17:48:48 +02:00
Eduardo Filho 2a5b8323b6
fix(GLAM): Fix bucket_counts template generation (#6159) 2024-09-05 10:33:23 -04:00