Commit graph

10 commits

Author SHA1 Message Date
Mikaël Ducharme 447d0f8717
fix(backfill): fix BackfillParams import issues (#1587)
* fix(backfill): fix BackfillParams import issues
2022-11-17 16:11:16 -05:00
kik-kik a552e0517a
feat: Adds Airflow DAG for running some of the DIM checks (#1583)
* Add data monitoring dag

* made final tweaks to the data_monitoring DAG

* made changes as requested by @alekhyamoz in PR#1583

Co-authored-by: Alekhya Kommasani <akommasani@mozilla.com>
2022-11-15 13:06:51 +01:00
kik-kik 1dbfb6ad5c
Step to validate airflow dags as part of CI (#1446)
* script to validate dag tags and step to circle ci

* trying out dag tag validation through parsing

* added missing tag so that tag check does not fail

* Using SQL approach for validation, added extra logging and clean up

* added check to make sure all DAGs have tags

* fixed 3 DAGs missing tags

* implemented suggestions by @haroldwoo in #1446

Co-authored-by: = <=>
2022-01-14 13:04:32 +01:00
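The commit above adds a CI step ensuring every Airflow DAG declares tags. As a rough illustration of the "validation through parsing" idea mentioned in the commit body, the hypothetical sketch below uses Python's `ast` module to flag any `DAG(...)` call that lacks a `tags` keyword; the function name and approach are assumptions for illustration, not the repository's actual implementation (which the commit says later moved to a SQL-based check).

```python
# Hypothetical sketch: detect DAG definitions missing a `tags` kwarg by
# parsing the file's AST instead of importing it. Not the repo's actual code.
import ast


def dag_missing_tags(source: str) -> bool:
    """Return True if the source contains a DAG(...) call without a `tags` kwarg."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            func = node.func
            # Handle both `DAG(...)` and `airflow.DAG(...)` call styles.
            name = func.attr if isinstance(func, ast.Attribute) else getattr(func, "id", None)
            if name == "DAG" and not any(kw.arg == "tags" for kw in node.keywords):
                return True
    return False
```

A CI job could run this over every file under `dags/` and fail the build if any file returns `True`.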
Harold Woo e1518a5ff5 [DSRE-6] Upgrade Airflow (wtmo) to 2.1.1 2021-10-18 12:00:34 -07:00
Harold Woo e6456b0132 Upgrade airflow WTMO to 1.10.12 2020-11-09 13:43:08 -08:00
Harold Woo 44a159ea47 Removes bqetl_* dags. We will sync these to our airflow deployment from github bigquery-etl/dags and symlink them to dags/bigquery-etl-dags 2020-07-28 12:05:32 -07:00
Harold Woo 115fcc673c Fixing Backfill ui plugin 2020-04-01 09:49:39 -07:00
Victor Ng c103f3eee4
Features/gcp taar amodump (#628)
* Added gitignore directives for artifacts left by tests and staging the airflow instance locally

* Converted the taar_lite_guidguid job to use SubDagOperator and moz_dataproc_pyspark_runner.
Converted AMO jobs to use GKEPodOperator

* deleted broken connection stub for `google_cloud_derived_datasets`

* changed job to use temp bucket while testing whole DAG

* Downsampled the taar-lite GUID GUID job to 5%

* swapped test gcs location for pyspark job with production location
2019-10-10 15:49:31 -04:00
Roberto Agostino Vitillo 7a51acabf0 EMR release version should be configurable
As we want to support multiple versions of Spark, we have to allow users
to configure the EMR release they want their tasks to run on.
2016-09-15 14:11:17 +01:00
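The commit above makes the EMR release label configurable per task rather than hard-coded. A minimal sketch of that pattern, assuming a boto3-style `run_job_flow` request; `DEFAULT_EMR_RELEASE` and `build_job_flow` are hypothetical names, not taken from this repository:

```python
# Hedged sketch: expose the EMR release label as a parameter with a default,
# so each task can pick the Spark/EMR version it needs. Names are illustrative.
DEFAULT_EMR_RELEASE = "emr-5.0.0"  # assumed default, not from the repo


def build_job_flow(name: str, emr_release: str = DEFAULT_EMR_RELEASE) -> dict:
    """Build a run_job_flow-style request dict with a configurable EMR release."""
    return {
        "Name": name,
        "ReleaseLabel": emr_release,  # callers override this per task
        "Instances": {"InstanceCount": 1, "MasterInstanceType": "m4.large"},
    }
```

A caller wanting a different Spark version would pass, e.g., `build_job_flow("telemetry-job", emr_release="emr-4.7.2")`.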
Mark Reid 4e801e7914 Add a git ignore file. 2016-06-29 12:07:48 -03:00