Commit graph

32 commits

Author SHA1 Message Date
Linh Nguyen c1c73e690e
Make sure that metadata `friendly_name` and `description` are not None (#4513)
* Fill empty description

* Assign a friendly name if the table doesn't have one

* Update metadata tests

* Update bigquery_etl/metadata/parse_metadata.py

Co-authored-by: Alexander <anicholson@mozilla.com>

* update test again

---------

Co-authored-by: Alexander <anicholson@mozilla.com>
2023-11-17 11:48:11 -05:00
Alekhya 2e916eb856
DENG1381 - Add bqetl support for deprecation metadata (#4213)
* Support bq dataset deprecation process (metadata)

* Add bqetl metadata cli command

* Initial draft for adding deprecation support to bqetl

* Incorporate Anna's feedback

* Fix based on whd's feedback

* Fix ci issues

* Remove unnecessary logic from metadata.py

* Add dataset metadata yaml for ga_derived

* Ignore dirs that do not have dataset_metadata yaml

* Remove unwanted dataset metadata yamls

* Update bigquery_etl/cli/metadata.py

Co-authored-by: whd <whd@users.noreply.github.com>

---------

Co-authored-by: whd <whd@users.noreply.github.com>
2023-09-12 18:47:54 +00:00
Alexander f9ff8022d8
Publish dag name as a label (#4084) 2023-07-17 11:58:20 -04:00
Curtis Morales eb02488f34
Fix google sheets metadata and change from "google_sheet" to "google_sheets" for consistency with google (#3914) 2023-06-07 19:25:31 +00:00
Marlene Hirose c08f21c2d5
add csv recognition to tooling (#3881)
Co-authored-by: Anna Scholtz <anna@scholtzan.net>
2023-06-01 09:45:44 -07:00
Alexander 1c3ba13b40
Skip non-emails in owner labels (#3763) 2023-05-11 17:02:46 -04:00
Lucia c34653778f
DENG-774 Change Control for active_users_aggregates (#3687)
* DENG-774 Add change control to active_users_aggregates and test.

* DENG-774 Add test coverage.
---------

Co-authored-by: Lucia Vargas <lvargas@mozilla.com>
2023-04-14 15:49:19 +02:00
Winnie Chan 33a7ab0921
DENG-590 Added owner to labels metadata (#3637)
* Added owner to labels metadata

* Update bigquery_etl/metadata/parse_metadata.py

Co-authored-by: Alexander <anicholson@mozilla.com>

---------

Co-authored-by: Alexander <anicholson@mozilla.com>
2023-03-06 15:08:50 -08:00
Sean Rose 495ddcf39f
Correct BigQuery partitioning/clustering metadata in ETL `metadata.yaml` files (#3500)
* Add missing BigQuery partitioning/clustering metadata.
* Correct existing BigQuery partitioning/clustering metadata.
* Allow partition `field` metadata field to be omitted.
* List partition `type` metadata field first.
2023-01-12 13:58:53 -08:00
Anna Scholtz 64e6613b83 Support multiple source_uris for external data 2022-12-07 15:28:18 -08:00
Anna Scholtz 0ec61ba3a0 Add support for external data 2022-12-07 15:28:18 -08:00
dependabot[bot] f6b20988b5
Bump cattrs from 22.1.0 to 22.2.0 (#3251) 2022-10-05 16:12:55 -07:00
Alexander Nicholson 45655229d3
Added sql_generators script to create schema.yaml files for derived views (#2657) 2022-01-13 15:46:06 -05:00
Anna Scholtz b3651eb338 Partition expiration in days 2021-12-02 08:56:56 -08:00
Anna Scholtz f368335c85 Allow to specify partition expiration in metadata 2021-12-01 12:53:51 -08:00
Anna Scholtz cd32f756ef Support multiple derived_from sources 2021-05-19 12:51:11 -07:00
Anna Scholtz 14175f5426 Update all downstream dependencies 2021-05-19 12:51:11 -07:00
Anna Scholtz 8f9e9d5286 Update derived_from schema 2021-05-19 12:51:11 -07:00
Jeff Klukas e81d4ff988
Bug 1708264 - Add initial support for dataset_metadata.yaml (#1987)
* Bug 1708264 - Add initial support for dataset_metadata.yaml

This adds the machinery to create `dataset_metadata.yaml` as part of
`bqetl query create` and to validate that these files are valid, but these
are not yet actually used for anything.

The next step will be to fill out `dataset_metadata.yaml` for all
existing derived and user-facing datasets.

* Linting and test updates

* Sort before compare in tests

* activity_stream_bi is not user-facing

Co-authored-by: whd <whd@users.noreply.github.com>

Co-authored-by: whd <whd@users.noreply.github.com>
2021-04-29 14:30:34 -04:00
Daniel Thorn a190e18264
Automatically sort python imports (#1840) 2021-02-24 17:11:52 -05:00
Anna Scholtz d8c0ec7707 Move registering yaml literal representation to global scope 2021-02-19 09:34:15 -08:00
Anna Scholtz 99beace6f1 metadata.yaml formatting 2021-02-19 09:34:15 -08:00
Anna Scholtz de644c906e Refactor schema file handling 2021-02-19 09:34:15 -08:00
Anna Scholtz 47dbfe545d Update query schema CLI 2021-02-19 09:34:15 -08:00
Anna Scholtz 3b2a672f78 Support python script query scheduling 2020-11-10 14:36:07 -08:00
jailang 4d3b92528c Change review_bug from string to list 2020-10-28 09:01:43 -07:00
Linh Nguyen 2a1454c309
Update metadata validation logic (issue #924) (#1463)
* Validate metadata with attr

* Update and add tests

* Add check if file exists and update tests

* Format code and update validate metadata

* Revert changes to is_metadata_file()

* Remove format error: whitespaces

* Format test files

Co-authored-by: Anna Scholtz <anna@scholtzan.net>
2020-10-20 16:03:50 -07:00
Anna Scholtz e49618fdf5 Add query schedule command to CLI 2020-08-21 11:10:58 -07:00
Anna Scholtz 3a85538e7f Add CLI command for creating queries 2020-07-29 08:26:24 -07:00
Anna Scholtz 72a60c3bb3 Fix tests 2020-05-28 14:12:24 -07:00
Anna Scholtz 368020f37d Generate Airflow code for tasks 2020-05-28 14:12:24 -07:00
Anna Scholtz e7bd2fb50c Move all metadata related scripts to a metadata directory 2020-04-29 14:18:53 -07:00