Commit graph

63 commits

Author SHA1 Message Date
Mikaël Ducharme 2efe003555
fix(plugins): Use tuple for timetable plugin registration (#2053) 2024-07-29 13:45:46 -04:00
Mikaël Ducharme 33370d8f4b
chore: Replace Black with Ruff and remove diff-based CI. (#2052) 2024-07-29 11:57:46 -04:00
Mikaël Ducharme 9097e5a10f
feat(plugins): Add Dockerflow-style version endpoint. (#1994) 2024-05-24 12:06:37 -04:00
Mikaël Ducharme ff3a2fa82f
chore(plugins): Remove unused plugins and dependencies. (#1911) 2024-02-09 12:31:02 -05:00
akkomar 872b0566f6
Add link to GKE cluster in top bar menu (#1840)
This is a follow up to https://github.com/mozilla/telemetry-airflow/pull/1838
2023-10-18 17:37:29 +02:00
akkomar 61007ed8fc
Add link to GKE cluster in top bar menu (#1838) 2023-10-17 20:53:38 +02:00
Mikaël Ducharme 63beb2f23f
feat: Move utils, operators and glam_subdags out of dags directory (#1807) 2023-10-17 12:36:51 -04:00
Daniel Thorn d15c87b143
Add missing serialization for MultiWeekTimetable (#1782) 2023-08-15 14:58:17 -07:00
Daniel Thorn c443fb32f5
Create custom timetable to run shredder every 4 weeks (#1771) 2023-08-14 15:55:39 -07:00
kik-kik 72b10914ad
docs(): updating airflow triage reference link + black formatting on mozmenu.py (#1676)
* Updated the menu link to the new version of Airflow Triage Guide

* black formatting applied on plugins/mozmenu.py
2023-04-12 14:39:26 +02:00
Harold Woo 4cf845b224 [DSRE-1135] Fix backfill plugin ui clear infinite loop 2023-01-05 11:50:08 -08:00
mikaeld 3cc49d4090 fix deprecation warnings, clean up and update for 2.3.3 2022-12-12 13:24:03 -05:00
Mikaël Ducharme 0b4b4eb418
Revert "feat(airflow): upgrade airflow from 2.1.4 to 2.3.3 [DSRE-1039] " (#1612)
* Revert "update airflow config for 2.3.3"

This reverts commit d19cc711aa.

* Revert "fix deprecation warnings, clean up and update for 2.3.3"

This reverts commit e80472ab9a.

* Revert "update requirements, introduce constraints file and clean up for 2.3.3"

This reverts commit 8e60dba783.
2022-12-07 18:25:40 -05:00
mikaeld e80472ab9a fix deprecation warnings, clean up and update for 2.3.3 2022-12-07 13:30:42 -05:00
Mikaël Ducharme 447d0f8717
fix(backfill): fix BackfillParams import issues (#1587)
* fix(backfill): fix BackfillParams import issues
2022-11-17 16:11:16 -05:00
Mikaël Ducharme 22705cfa68
feat: add backfill dag (#1580) 2022-11-14 16:38:24 -05:00
Harold Woo b8d66ebac3 [DSRE-1101] Fix backfill to not pickle so that newly added dag tasks can run via backfill 2022-11-07 17:03:39 -08:00
Harold Woo 0d1eaaa9d8 Fix backfill plugin clearing tasks for non background (missed in first pass) 2022-09-08 10:20:13 -07:00
haroldwoo 7633f47bbe Fix backfill UI plugin
airflow tasks clear -c no longer exists, was changed to -y
2022-09-01 12:48:09 -07:00
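
A minimal sketch of the flag change described in the commit above, assuming the Airflow 2 `airflow tasks clear` syntax where `-y` skips the confirmation prompt. The helper name `build_clear_command` and its parameters are hypothetical and are not taken from the backfill plugin itself.

```python
from typing import List, Optional


def build_clear_command(
    dag_id: str,
    start_date: str,
    end_date: str,
    task_regex: Optional[str] = None,
) -> List[str]:
    """Build an argv list for clearing task instances without a prompt.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    cmd = ["airflow", "tasks", "clear", "-s", start_date, "-e", end_date]
    if task_regex:
        # Restrict clearing to tasks whose id matches the regex.
        cmd += ["-t", task_regex]
    # Airflow 2 uses -y (--yes) where Airflow 1.10 used -c (--no_confirm).
    cmd.append("-y")
    cmd.append(dag_id)
    return cmd


print(build_clear_command("example_dag", "2022-08-01", "2022-08-02"))
```

Passing the result to `subprocess.Popen` as a list avoids shell quoting issues with the regex argument.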
kik-kik da87b75958
Adding few extra links to the Mozilla menu button for Airflow (#1462)
* Adding few extra links to the Mozilla menu button for Airflow

Hopefully this is a step in making Airflow triage quicker and easier.

* Changes made suggested by @jklukas
2022-02-18 09:54:47 -08:00
Harold Woo e1518a5ff5 [DSRE-6] Upgrade Airflow (wtmo) to 2.1.1 2021-10-18 12:00:34 -07:00
Harold Woo 9bd76d28a5 Upgrade Airflow to 1.10.15 bridge release prior to 2.0 2021-07-20 11:34:26 -07:00
Jeff Klukas fd54979f70
Add links to WTMO dev guide in README and menubar (#1274) 2021-03-29 09:07:42 -04:00
Anthony Miyaguchi ab3d31b548
Prune unused Airflow code related to AWS EMR and Databricks (#1240)
* Remove mozetl and mozdatabricks

* Remove moz_emr

* Remove emr_spark_operator

* Remove tox.ini and unused docker variables

* Remove email on schema change operator

* Add utility script for exporting AWS credentials into env

* Fix broken imports

* Fix docker-compose.yml file with invalid values
2021-02-01 09:59:39 -08:00
Jeff Klukas 76e3b6bd2e Replace line with decoded version 2020-10-15 13:55:14 -04:00
Jeff Klukas d9b84f5cce Fix python 3 string handling issue with backfill UI
I've not had success over the past few days getting the backfill UI to work.
I submit jobs, but they never produce output or result in BQ output.

I checked in cloud logging and found ERROR messages with tracebacks ([example](https://console.cloud.google.com/logs/query;pinnedLogId=2020-10-15T13:57:27.221935242Z%2Fr7d2ll39deckufib8;query=backfill%0Aseverity%3DERROR?project=moz-fx-data-airflow-prod-88e0)):

```
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/gunicorn/workers/base_async.py", line 56, in handle
    self.handle_request(listener_name, req, client, addr)
  File "/usr/local/lib/python3.7/site-packages/gunicorn/workers/ggevent.py", line 160, in handle_request
    addr)
  File "/usr/local/lib/python3.7/site-packages/gunicorn/workers/base_async.py", line 114, in handle_request
    for item in respiter:
  File "/usr/local/lib/python3.7/site-packages/werkzeug/wsgi.py", line 506, in __next__
    return self._next()
  File "/usr/local/lib/python3.7/site-packages/werkzeug/wrappers/base_response.py", line 45, in _iter_encoded
    for item in iterable:
  File "/app/pvmount/telemetry-airflow/plugins/backfill/main.py", line 133, in read_process
    result = re.match(pattern, line)
  File "/usr/local/lib/python3.7/re.py", line 175, in match
    return _compile(pattern, flags).match(string)
TypeError: cannot use a string pattern on a bytes-like object
```

So looks like a Python 3 migration bug, and submitted tasks are generating
exceptions before they're able to run.
2020-10-15 13:55:14 -04:00
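
The `TypeError` in the traceback above is what Python 3 raises when a `str` regex pattern is matched against `bytes`, which is what a subprocess pipe yields by default. A minimal sketch of the usual fix is to decode each line before matching. The function name `match_output_line` and the sample pattern are assumptions for illustration, not the plugin's actual code.

```python
import re
from typing import Optional, Union


def match_output_line(pattern: str, raw_line: Union[bytes, str]) -> Optional[re.Match]:
    """Match a str pattern against a line that may arrive as bytes."""
    if isinstance(raw_line, bytes):
        # Decode first; errors="replace" keeps malformed bytes from raising.
        raw_line = raw_line.decode("utf-8", errors="replace")
    return re.match(pattern, raw_line)


# Reproduce the failure mode: str pattern against a bytes line.
try:
    re.match(r"(\d+) tasks", b"12 tasks cleared")
except TypeError as exc:
    print("without decoding:", exc)

# With decoding, the same pattern matches.
match = match_output_line(r"(\d+) tasks", b"12 tasks cleared")
print("with decoding:", match.group(1))
```

An alternative with the same effect is to keep the line as bytes and use a bytes pattern such as `rb"(\d+) tasks"`.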
Anthony Miyaguchi 6bd9f926e5
Bug 1641935 - Remove statuspage operator usage (#1006)
* Bug 1641935 - Remove statuspage operator usage

* Remove dataset operator from example
2020-05-29 10:29:14 -07:00
Anthony Miyaguchi ca879c1a64
Bug 1632591 - Reflect operational state of job on success by default (#965)
* Remove dataset alerts since they are outdated

* Remove s3fs check operator

* Bug 1632591 - Reflect operational state of job on success by default

* Update doc with default behavior

* Update test to reflect new default state of register_status
2020-05-01 10:12:40 -07:00
Harold Woo 115fcc673c Fixing Backfill ui plugin 2020-04-01 09:49:39 -07:00
Harold Woo 0cf7f3097a Add UI plugin for backfilling and clearing dags 2020-03-28 13:49:19 -07:00
William Lachance 39f4982d1d Remove some unused code and references
* events-to-amplitude
* telemetry-streaming
2020-01-08 13:46:48 -05:00
Anthony Miyaguchi 9b93656b63
Bug 1593149 - Fix broken DAGs in local Airflow environment (#670)
* Add dummy values to gcp extras section and dummy AWS credentials

* Delay uploading mozetl runner until execution time

* Add list of aws and gcp credentials for initializing in dev

* Use realistic values for dummy GCP keyfile

Co-Authored-By: Sunah Suh <github@sunahsuh.com>
2019-11-01 11:01:49 -07:00
Anthony Miyaguchi 40d8eea240
Remove backported databricks operator and hook (#536)
* Remove backported databricks operator and hook

* Update imports to point to airflow.contrib
2019-06-24 09:30:07 -07:00
Victor Ng 7820515428 Enabled the taar_ensemble weekly job (#522)
* split off taar weekly jobs into a separate script

* added ExternalTaskSensor dependency on main_summary

* fixed dependency to point to clients_daily instead of main_summary

* fixes as per review

renamed `dag_weekly` to `taar_weekly` for weekly taar dag

corrected external_task_id and external_dag_id

* Added a `start_date` argument to the task

* removed Frank as owner and set myself as the owner of the task

removed frank from alert recipient
2019-06-05 12:34:35 -07:00
Anthony Miyaguchi b8a44075d0
Add mozetl-runner for external mozetl-compatible modules (#480)
* Add initial function for generating the mozetl runner

* Add tests for generate_runner

* Generate a runner for external modules

* Add missing changes to test_mozetl
2019-05-01 16:07:39 -07:00
Victor Ng 7c6dc0c56f added pypi library support to match python_mozetl repo (#495)
Support for passing in PyPI libraries was added to the
python_mozetl/bin/mozetl-databricks runner in:

b1789afbd4
2019-05-01 13:23:22 -07:00
Anthony Miyaguchi c2488e6528
Fix #433 - Add support for alternative git path and branch in mozetl jobs (#436)
* Add python3 support to Databricks clusters

* Set default python version to 3

* Update moz_databricks test pattern to include json payload

* Refactor mock_hook into a fixture

* Add test asserting value of `PYSPARK_VERSION`

* Fix environment variable name to PYSPARK_PYTHON

* Fix #433 - Add support for alternative git path and branch in mozetl jobs

* Add test for setting alternative repo path

* Move churn_v2 job to moz_databricks

* Downgrade python version for churn-v2
2019-02-26 16:10:50 -08:00
Anthony Miyaguchi 81c8afa1d9
Revert "Set default python version to 2 (#441)" (#443)
This reverts commit edf9647432.
2019-02-26 13:36:59 -08:00
Anthony Miyaguchi edf9647432
Set default python version to 2 (#441) 2019-02-22 17:17:30 -08:00
Anthony Miyaguchi b62f508112
Add python3 support to moz_databricks and set it to default (#432)
* Add python3 support to Databricks clusters

* Set default python version to 3

* Update moz_databricks test pattern to include json payload

* Refactor mock_hook into a fixture

* Add test asserting value of `PYSPARK_VERSION`

* Fix environment variable name to PYSPARK_PYTHON

* Address review by improving error handling and comments
2019-02-22 12:50:20 -08:00
Anthony Miyaguchi 68c13f6da0
Increase databricks retries and retry delay to avoid api errors (#430)
* Increase databricks retries and retry delay to avoid api errors

* Use the databricks plugin instead of built-in hooks and operators

* Add basic test for mozdatabricks

* Fix error in moz_databricks with keyword argument

* Reword comment
2019-02-19 14:40:06 -08:00
Anthony Miyaguchi fdbd048efa
Revert "Increase databricks retries and retry delay to avoid api errors (#421)" (#427)
This reverts commit 8e4be5fbc0.
2019-02-06 20:37:36 -08:00
Anthony Miyaguchi 8e4be5fbc0
Increase databricks retries and retry delay to avoid api errors (#421)
* Increase databricks retries and retry delay to avoid api errors

* Use the databricks plugin instead of built-in hooks and operators
2019-02-06 14:06:16 -08:00
Anthony Miyaguchi 6ec3165317
Fix relative import in plugin (#426) 2019-02-06 13:49:05 -08:00
Anthony Miyaguchi 9d400a310d
Backport Databricks hook and operator from apache/airflow:v1-10-stable (#419)
* Source databricks code from apache/airflow:v1-10-stable

* Update imports and mocks for plugin structure

* Export backported Databricks plugin from v1-10-stable

* Add comments to backported databricks plugin

* Update links to point to revisions instead of branches
2019-02-01 12:48:18 -08:00
Anthony Miyaguchi 313e631268
Add create_incident flag to DatasetStatusOperator (#412)
* Remove AirflowException as an unreachable code path

* Add create_incident flag to DatasetStatusOperator

* Address initial review

* Set dataset_alerts to automatically open incidents on failure
2019-01-31 12:33:50 -08:00
Anthony Miyaguchi bf1e5f7cb7
Add sensor to operators (#414)
Sensors are not part of the plugin API in airflow 1.9
2019-01-14 13:31:53 -08:00
Anthony Miyaguchi 672983c2e1
Add an operator to check for _SUCCESS files (#406)
* Add a S3FSCheckSuccessOperator and fix imports

* Remove py36 from envlist due to snakebite import in sensors

* Fix #404 - Set the number of expected partitions as a lower bound
2019-01-14 11:04:59 -08:00
Anthony Miyaguchi c0b691c9f1
Create incidents within DatasetStatusHook (#411)
* Update the statuspage dataset_client with a method for creating incidents

* Fixed typo

* Add tests for dataset_status_client

* Fix review issues

* Update tests/test_dataset_status_client.py
2019-01-11 12:47:45 -08:00
Arkadiusz Komarzewski 33adb8acba
Allow configuring driver node instance type in Databricks operator 2019-01-07 18:40:03 +01:00