Commit Graph

94 Commits

Author SHA1 Message Date
Sean Rose 4bbbc32a5b
Put assert UDFs in `mozfun` project (#4367)
* Put assert UDFs in `mozfun` project.

* Tweak syntax in `assert.array_equals()` to avoid SQLGlot parsing error.
  https://github.com/tobymao/sqlglot/issues/2348

* Fix SQL syntax error in `assert.struct_equals()` tests.

* Fix UDF dependency file path logic when deploying to stage.

* Change regular expressions in `parse_routine` module to allow quotes around routines' dataset and name.
2023-10-13 10:58:42 -07:00
kik-kik 42bfa1409e
Minor tweaks to the data checks docs (#4309) 2023-09-20 10:57:39 +02:00
Anna Scholtz 0b9b833ce3
Add more docs for updating clients_daily (#4316) 2023-09-18 11:02:06 -07:00
Alekhya 01ff67a29f
Rename min_rows check macro to min_row_count (#4301) 2023-09-14 15:06:41 -04:00
Anna Scholtz 9a6a2df03b
Updating and merging data check docs (#4293) 2023-09-13 14:48:47 -07:00
Mike Williams 4bb4c5b52f
fix DENG-1510: add dryrun flag to schema update command (#4270) 2023-09-08 14:58:44 -04:00
Lucia 27262acdfd
Default DAG for bqetl queries (#4143)
* DENG-1314 Implement changes to bqetl and create default DAG.

* DENG-1314. Update Documentation.

* DENG-1314. Dummy query to enable generating DAG and run tests.

* DENG-1314. Update tests.

* Update bigquery_etl/cli/query.py

Raise exception when scheduling information is missing.

Co-authored-by: Daniel Thorn <dthorn@mozilla.com>

* DENG-1314. Update tests.

* DS-3054. Update query creation to set bqetl_default as default value for --dag. Update tests.

* Default task and tests update.

* Default task and tests update.

* 3650 - Remove default DAG option, update DAG template comment & tests.

* 3650 - Condition for DAG warning.

* 3650 - Update docs.

* Clarification on sql/moz-fx-data-shared-prod/analysis/bqetl_default_task_v1/metadata.yaml

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

* Update docs/cookbooks/creating_a_derived_dataset.md

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

---------

Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
Co-authored-by: Daniel Thorn <dthorn@mozilla.com>
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
2023-08-29 14:32:52 +02:00
kik-kik 7afc4c44f1
docs(DENG-960): bqetl data checks cli docs (#4200)
* Small tweaks made to the cli cmds comments / help display for data checks

* added usage docs to data_checks reference docs

* Apply suggestions from code review provided by scholtzan

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

---------

Co-authored-by: Anna Scholtz <anna@scholtzan.net>
2023-08-24 17:24:23 +02:00
kik-kik 6c27be8a19
added data checks docs to mkdocs.yml to ensure they are included in our docs page (#4189) 2023-08-16 17:48:58 +02:00
kik-kik de35707b1d
docs(DENG-961): Added reference doc for data checks and a cookbook for adding new data checks (#4162)
* added reference doc for data checks and a cookbook for adding new data checks

* Apply suggestions from code review by ascholtz

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

---------

Co-authored-by: Anna Scholtz <anna@scholtzan.net>
2023-08-09 17:12:55 +02:00
Anna Scholtz d5a6dc97f4
Docs for new `bqetl_project.yaml` (#4018)
* Add ConfigLoader and move dry run skip to bqetl_project.yaml

* Update bqetl configuration docs
2023-07-10 09:50:22 -07:00
Curtis Morales b7b1b835ba
Add trigger_rule as an option for generated airflow tasks (#3772)
* Add trigger_rule as an option for generated airflow tasks

* Add test

* Move trigger rule options to enum and add to documentation
2023-06-27 13:58:52 -04:00
Leli d98ae6bf2e
changing the formatting for documentation in cookbooks (#3970)
* changing the formatting for documentation in cookbooks

* change to 1. 1. 1.
2023-06-27 18:41:33 +02:00
Lucia 5dc9e405a8
Deng 1000 docs how to change control (#3903)
* DENG-1000. Update docs with change control guide.

* DENG-1000. Update docs with change control guide.

* DENG-1000. Add sample PR to docs.

* DENG-1000. Add sample PR to docs.

* DENG-1000. Fix syntax of MD file.

---------

Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
2023-06-05 15:53:16 +00:00
Lucia dfef07ee40
DENG-1000. Update docs with change control guide. (#3901)
* DENG-1000. Update docs with change control guide.

* DENG-1000. Update docs with change control guide.

* DENG-1000. Add sample PR to docs.

---------

Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
2023-06-05 14:53:20 +00:00
Winnie Chan 071c53e4cb
DENG-803/805: Create & Validate backfill cli commands (#3760)
* Added backfill create and validate cli command

---------

Co-authored-by: Alexander <anicholson@mozilla.com>
Co-authored-by: kik-kik <42538694+kik-kik@users.noreply.github.com>
2023-06-01 10:06:09 -07:00
Daniel Thorn a0d810275b
Remove java dependency in favor of sqlglot (#3755) 2023-05-17 14:56:42 -07:00
Sean Rose 45836abc69
Fix metrics docs formatting (#3704)
* Remove trailing whitespace to satisfy pre-commit check.

* Remove blank line that's breaking docs formatting.

* Re-add blank lines lost in copy+paste.
2023-04-05 08:07:13 -07:00
Anna Scholtz 48d8c7603d
Metric hub integration - rewrite SSL ratios to use metrics (#3698)
* Add metrics.data_source()

* Rewrite SSL ratios to use metrics

* Fix docs formatting
2023-04-04 15:41:44 -07:00
Anna Scholtz eb5d63a8f5
Metric-hub doc tweaks (#3700) 2023-04-04 10:18:06 -07:00
Anna Scholtz d737925da1
Metric hub integration docs (#3697)
* Add metrics.data_source()

* Docs for referencing metrics

* Add docs for using metrics in queries
2023-04-04 09:56:47 -07:00
Anna Scholtz 34f778c090
Stage deploy docs (#3648)
* Docs for stage deploys

* Update cookbook for updating table schemas

* Docs for running stage deploys locally

* Update common workflow failures
2023-03-10 09:10:32 -08:00
Alexander 60c85e7c54
Revert CI changes for private UDFs and add stub documentation - DENG-735 (#3652)
* Revert "CI fixes for supporting private UDFs in bigquery-etl - DENG-735 (#3631)"

This reverts commit edcfe758f7.

* Added stub UDF for monetized_search

* Add docs for using a private internal UDF
2023-03-10 11:41:46 -05:00
kik-kik 414a3c5d77
Fixed typo in airflow_tags.md (#3589)
* Fixed typo in airflow_tags.md

As Sean pointed out, the tag provided in the docs was incorrect. This change fixes that.

* Apply suggestions from code review

Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>

---------

Co-authored-by: Sean Rose <1994030+sean-rose@users.noreply.github.com>
2023-02-16 11:10:58 +01:00
kik-kik 67ac75a04b
Added description for triage/record_only tag (#3576) 2023-02-10 10:00:31 -05:00
Leli 17c4fb35b1
add to documentation cookbook for testing (#3544) 2023-01-30 18:52:41 +01:00
akkomar 41fb32d7ef
Add a documentation note on faster single test execution (#3541) 2023-01-30 15:09:00 +01:00
Anna Scholtz 22e54ccb5f
[Bug 1812301] Publish only string typed labels (#3530)
* [Bug 1812301] Publish only string typed labels

* Document label publishing
2023-01-26 09:24:49 -08:00
Sean Rose d01ce2fc8a
Fix docs formatting (#3523)
* Use `mdx_truly_sane_lists` to fix nested list formatting in the docs (by default, mkdocs requires nested lists to be indented by four spaces, which isn't how most nested lists in our docs have been formatted).
* Enable syntax highlighting for code blocks in the docs.
* Add support for GitHub Flavored Markdown task lists in the docs.
2023-01-24 08:24:35 -08:00
Sean Rose afea1c35b9
Remove dependencies on the `mozdata` project from ETL (#3496)
Having dependencies on things in `mozdata` can cause issues with deployments, as deploying things to `mozdata` is usually a separate secondary step.
2023-01-23 08:59:04 -08:00
Sean Rose 495ddcf39f
Correct BigQuery partitioning/clustering metadata in ETL `metadata.yaml` files (#3500)
* Add missing BigQuery partitioning/clustering metadata.
* Correct existing BigQuery partitioning/clustering metadata.
* Allow partition `field` metadata field to be omitted.
* List partition `type` metadata field first.
2023-01-12 13:58:53 -08:00
Frank Bertsch 0b6f1a6243
Update creating BQ dataset docs (#3487) 2023-01-06 08:23:23 -05:00
Anna Scholtz 5612836323 Automatically generate tables for Glean apps 2022-12-08 11:09:09 -08:00
Alexander f043631c9d
Update docs, Deploying clients_last_seen now requires skipping dryrun (#3405) 2022-12-06 09:07:59 -05:00
Lucia d92a732802
DENG-561 UDF to find earliest of a set of values (#3385)
* UDF to retrieve the earliest not null value in an array.

* Test validated.

* Documentation upgrade for running the tests.

* Update sql/mozfun/norm/get_earliest_value/metadata.yaml

Co-authored-by: Frank Bertsch <frank.bertsch@gmail.com>

* Update documentation.

Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
Co-authored-by: Frank Bertsch <frank.bertsch@gmail.com>
2022-12-01 23:22:18 +01:00
Daniel Thorn c281400486
Enforce isort via pytest (#3384) 2022-11-30 11:45:05 -08:00
Winnie Chan 2f9119c056
Updated file paths (#3305) 2022-10-25 14:26:47 -07:00
Anna Scholtz bc1ceb16be Add runbook for adding new Glean apps 2022-10-25 09:15:09 -07:00
Frank Bertsch 00a04b699c
Move to python 3.10 (#3236)
* WIP: Move to python 3.10

* Remove netlify
2022-09-22 15:39:40 -04:00
Rebecca BurWei 59cd108a78
docs for data scientists (#3192)
* docs

* Update docs/cookbooks/common_workflows.md

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

* Update docs/cookbooks/common_workflows.md

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

Co-authored-by: Anna Scholtz <anna@scholtzan.net>
2022-08-31 11:45:06 -05:00
Rebecca BurWei 98def555fd
Add workflow docs for data scientists (#3179)
* Update common_workflows.md

* Update docs/cookbooks/common_workflows.md

Co-authored-by: Anna Scholtz <anna@scholtzan.net>

Co-authored-by: Anna Scholtz <anna@scholtzan.net>
2022-08-30 11:14:05 -05:00
Anna Scholtz 81fbdf6c01 Move doc validation and generation commands to CLI 2022-08-25 08:51:41 -07:00
Lucia af64945584
DENG-127 Update docs for backfills (#3150)
* DENG-127 Update docs to include that backfills are documented in Bugzilla.

* DENG-127 Add query for backfill validation.

* DENG-127 Add query for backfill validation.

Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
2022-08-18 18:22:16 +02:00
Alexander 588d468dc8
Hoist schemas in SQL tests up to table dir (#3145) 2022-08-17 13:11:24 -04:00
Anna Scholtz 9a5a98779a Remove scripts that have been replaced by CLI 2022-07-19 08:08:49 -07:00
Anna Scholtz e7d1e1243c Add task marker docs 2022-06-22 11:05:25 -07:00
kik-kik de2c7ca1dd
removed link to Google Docs as public should not try to access it (#3000) 2022-06-01 14:45:37 +00:00
Lucia dfef0030cf
Update docs to remove the default mozilla-public-data as default in docs (#2960)
* Update docs to remove the default instruction to use `mozilla-public-data` as the project to create a new table. Clarify to specify the project.

* Update documentation to create a derived dataset. Replace the use of init files for the use of schemas.

* Small corrections

* Small corrections

Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
2022-05-23 11:23:17 +02:00
Lucia da36b8b3f8
Update bigquery-etl cookbooks (#2935)
* Ignore .idea/ folder in pycharm

* Make instructions to add field generic with example. Add recommendations to delete field.

* Reformat numbered list for consistency.

* Update instructions based on PR feedback.

Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
2022-05-04 11:10:34 +02:00
Sean Rose 3fb6ebc160
Specify pyjnius in `requirements.in` (#2924) 2022-04-29 12:23:47 -07:00