25 Commits

Author SHA1 Message Date
Anna Scholtz 3f79cc5151
Generate soft etl checks (#4268)
* Add markers to check cli command to differentiate warning from hard failures

* Fix CI issues

* Fix dag generation

* Incorporate Feedback

* Generate Airflow tasks for #fail and #warn checks

---------

Co-authored-by: Alekhya Kommasani <akommasani@mozilla.com>
Co-authored-by: Alekhya <88394696+alekhyamoz@users.noreply.github.com>
2023-09-13 10:22:39 -07:00
Glenda Leonard acdacb5095
Reduced threshold from 250,000 to 50,000 (#4037) 2023-07-11 10:37:57 -04:00
Lucia 4330e029d2
DS-2965 Update UDF get_earliest_value() (#3956)
* DS-2947. Update UDF get_earliest_value with an additional input and output value.

* DS-2947. Formatting.

* DS-2947. Formatting.

* DS-2947. Tests fix.

---------

Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
2023-06-16 17:58:06 +00:00
Glenda Leonard c69fee0b5f
DENG-941 initial impl of check rendering and execution. (#3885)
* initial impl

* Updated based on PR feedback

* Moved check from query to separate command

* Expanded from --partition option to generic --parameter option

* Removed `query check` command (check moved to new command)

* Update bigquery_etl/cli/check.py

remove date param format check

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

* Removed 'parameter' parameter, everything is passed through ctx.args and then converted to a dict for Jinja rendering.  There are no restrictions on ctx.args values.

* Merge error

---------

Co-authored-by: Anna Scholtz <anna@scholtzan.net>
2023-06-13 19:31:59 +00:00
Lucia 38731a440b
RS-722 Remove task name from printed message in DAG generation (#3859)
* RS-722 Remove task_name from dag generation when it is not available.

* RS-722 Reformat files.

---------

Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
2023-05-25 12:07:41 -04:00
Glenda Leonard 51248b2ca1
DENG-658 Fixed TABLE_SUFFIX parsing to handle GA intraday tables (#3745)
* Fixed TABLE_SUFFIX parsing to handle GA intraday tables
2023-04-27 18:01:07 -04:00
Glenda Leonard a31072d408
DENG-775 downloads_with_attribution_v2 (#3716)
* DENG-775 Added session_id to JOIN between GA data and stub_attr.stdout.  Also expanded date range on GA session data to [download_date - 2 days, download_date + 1 day]

* Updated query to handle missing GA download_session_id.  It effectively applies V1 logic to the MISSING_GA_CLIENT dl_tokens.
2023-04-26 14:47:34 -04:00
Glenda Leonard f182a1c140
Fixed schema types (#3684)
* Updated formatting.
2023-03-24 13:15:54 -04:00
Glenda Leonard b13f45bc63
DENG-658 - Initial table definitions for dl_token processing. (#3644)
* Initial table definitions for dl_token processing.  Includes update to sql pytest_plugin to account for tablenames with date suffixes.

* Removed cluster reference and shortened description

* Added sql/moz-fx-data-marketing-prod/ga_derived/downloads_with_attribution_v1/query.sql to dryrun skip

* Added time_on_site

* Moved country_names sample test data file.

* Update bigquery_etl/pytest_plugin/sql.py

Co-authored-by: Daniel Thorn <dthorn@mozilla.com>

* Update sql/moz-fx-data-marketing-prod/ga_derived/downloads_with_attribution_v1/query.sql

Co-authored-by: Frank Bertsch <frank.bertsch@gmail.com>

* Update sql/moz-fx-data-marketing-prod/ga_derived/downloads_with_attribution_v1/query.sql

Co-authored-by: Frank Bertsch <frank.bertsch@gmail.com>

* Updated based on PR feedback.  Added LEFT JOIN to ensure sessions without pageviews are not dropped.

* Set has_ga_download_event = null if exception=GA_UNRESOLVABLE

* Standardized logic for time_on_site

* - Added test for multiple downloads for 1 session
- Added detailed description of table.

* Updated to use mode_last_retain_nulls instead of ANY_VALUE

* Set pageviews, unique_pageviews = 0 if null.

* Added boolean additional_download_occurred to indicate if another download occurred in the same session.

---------

Co-authored-by: Daniel Thorn <dthorn@mozilla.com>
Co-authored-by: Frank Bertsch <frank.bertsch@gmail.com>
2023-03-23 16:45:58 -04:00
richard baffour 15977eaac0
Add Firefox "what's new" page summary ETL (#3535)
* wnp hits and bounces sql

* wnp hits and bounces sql

* Add Firefox whats new page summary ETL

* Delete sql/moz-fx-data-marketing-prod/ga_derived/wnp_hits_v1 directory

* fixed the bounce flag error

* dag regeneration, and edits to the query

* cleaning up version with udf and updating the schema

---------

Co-authored-by: soGaussian <rbaffourawuah@mozilla.com>
2023-02-07 16:08:41 -05:00
Sean Rose 2d84a1d3b7
Change `bqetl format` to improve readability of `CASE` statements (#3546)
* Indent `WHEN` and `ELSE` clauses one level more than `CASE`.
* Indent `THEN` clauses one level more than the corresponding `WHEN` clause.
* Have the content of `WHEN`, `THEN`, and `ELSE` clauses start on the same line as the clause keyword.
* Allow an alias, comma, or dot right after a `CASE` statement's `END`.
2023-02-03 14:35:59 -08:00
Sean Rose a46678ed96
Change `bqetl format` to uppercase built-in function names (#3536) 2023-01-26 16:07:10 -08:00
Sean Rose 495ddcf39f
Correct BigQuery partitioning/clustering metadata in ETL `metadata.yaml` files (#3500)
* Add missing BigQuery partitioning/clustering metadata.
* Correct existing BigQuery partitioning/clustering metadata.
* Allow partition `field` metadata field to be omitted.
* List partition `type` metadata field first.
2023-01-12 13:58:53 -08:00
Ben Wu f4f4005efd
Update query and dag owners (#2292) 2021-09-01 19:51:29 +00:00
Daniel Thorn dfeea39ac5
Enforce more yaml lint rules (#1878) 2021-03-09 17:25:01 -05:00
Ben Wu 8cc9203b37
Use is_mozilla_browser udf in www landing page metrics (#1710) 2021-01-22 15:59:53 -05:00
William Lachance b93c8b236c
Site Metrics: Expand definition of what "Firefox" is (#1692) 2021-01-21 19:14:09 -05:00
Ben Wu 2692ebf1d7
Create script to copy ga_sessions tables between projects (#1565) 2020-11-20 11:58:58 -05:00
Ben Wu 662bd6c72c
Add submission_date to ga query (#1548) 2020-11-12 18:59:09 -05:00
Ben Wu b15c6c7686
Rename GA standardized country names table (#1540) 2020-11-10 18:12:47 -05:00
Anna Scholtz 3b2a672f78
Support python script query scheduling 2020-11-10 14:36:07 -08:00
Ben Wu 1e66f1e048
Add derived GA tables for www site (#1524) 2020-11-09 13:51:29 -05:00
Rhys 1ace0fe2b7
Ran YAMLlint on all yaml files and resolved linting issues (fixes #1297) (#1481)
* "Ran YAMLlint on all yaml files"

* "Moved product info metadata table to README file"

* "Reformatted yaml lists"

* "Updated line breaks so script runs"

* "Updated line breaks so script runs"

* "Undid line breaks"

* "Created custom config file"

* "Removed base document id"

* "Undid line breaks"

* "Reformatted code"

* "Trimmed whitespace"

* "Undid line break"

* "Introduced newline"

* "Trimmed whitespace"

* "Added yamillint to config file"

* "Added yamllint to config file"

* "Moved up yamllint test"

* "Trimmed whitespace"

* "Trimmed whitespace"

* "Trimmed whitespace"

* "Trimmed whitespace"

* "Removing hyphen to fix CI error"

* "Indentation to remove CI error"

* "Included yamllint install in build run"

* "Added yamllint in requirements.txt and .in file"

* "Moved install yamllint step to its own stage"

* "Updated yamllint test"

* "Updated circleci step"

* "Reformatted code"

* "Added yamllint to circleci steps"

* "Added checkout block to yamllint step"

* "Trimmed whitespace"

* "Undid yamllint step"

* "Specified directory name for yamllint test"

* "Fixed yamlint errors"

* "Fixed yamllint errors"

* "Fixed yamllint errors"

* "Fixed yamllint errors"

* "Ignore pathway in linting"

* "Added ignore venv pathway during linting"

* "Updated ignore block"

* "Updated ignore block"

* "Removed ignore block"

* "Updated ignore block"

* "Indented base as a list"

* "Indented base item"

* Update tests/sql/moz-fx-data-shared-prod/search_derived/mobile_search_clients_last_seen_v1/test_day_bit_shifting/expect.yaml

Co-authored-by: Anthony Miyaguchi <acmiyaguchi@gmail.com>

* "Resolved linting errors"

* "Referenced tables put back on same line"

* "Fixed linting error"

* Update sql/moz-fx-data-shared-prod/account_ecosystem_derived/fxa_logging_users_daily_v1/metadata.yaml

Co-authored-by: Anthony Miyaguchi <acmiyaguchi@gmail.com>

* "Fixed linting error"

Co-authored-by: Anthony Miyaguchi <acmiyaguchi@gmail.com>
2020-10-29 17:24:55 -07:00
Ben Wu 3d8131dd7b
Change entryPage to entry_page in blogs GA (#1500) 2020-10-29 16:34:42 -04:00
Ben Wu 1529510c6e
Add derived tables for blog.m.o google analytics (#1492) 2020-10-28 17:40:45 -04:00