* Revert "update airflow config for 2.3.3"
This reverts commit d19cc711aa.
* Revert "fix deprecation warnings, clean up and update for 2.3.3"
This reverts commit e80472ab9a.
* Revert "update requirements, introduce constraints file and clean up for 2.3.3"
This reverts commit 8e60dba783.
* Add a few extra links to the Mozilla menu button for Airflow
Hopefully this is a step in making Airflow triage quicker and easier.
* Apply changes suggested by @jklukas
I've not had success over the past few days getting the backfill UI to work.
I submit jobs, but they never produce output or result in BQ output.
I checked in cloud logging and found ERROR messages with tracebacks ([example](https://console.cloud.google.com/logs/query;pinnedLogId=2020-10-15T13:57:27.221935242Z%2Fr7d2ll39deckufib8;query=backfill%0Aseverity%3DERROR?project=moz-fx-data-airflow-prod-88e0)):
```
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/gunicorn/workers/base_async.py", line 56, in handle
    self.handle_request(listener_name, req, client, addr)
  File "/usr/local/lib/python3.7/site-packages/gunicorn/workers/ggevent.py", line 160, in handle_request
    addr)
  File "/usr/local/lib/python3.7/site-packages/gunicorn/workers/base_async.py", line 114, in handle_request
    for item in respiter:
  File "/usr/local/lib/python3.7/site-packages/werkzeug/wsgi.py", line 506, in __next__
    return self._next()
  File "/usr/local/lib/python3.7/site-packages/werkzeug/wrappers/base_response.py", line 45, in _iter_encoded
    for item in iterable:
  File "/app/pvmount/telemetry-airflow/plugins/backfill/main.py", line 133, in read_process
    result = re.match(pattern, line)
  File "/usr/local/lib/python3.7/re.py", line 175, in match
    return _compile(pattern, flags).match(string)
TypeError: cannot use a string pattern on a bytes-like object
```
So it looks like a Python 3 migration bug: submitted tasks are raising
exceptions before they can run.
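This class of `TypeError` comes from matching a `str` pattern against `bytes` (in Python 3, subprocess pipes yield bytes). A minimal sketch of the fix; the helper name and pattern here are illustrative, not the plugin's actual code:

```python
import re

def read_process_line(line, pattern=r"Backfill done"):
    # Hypothetical helper: decode bytes output from a subprocess pipe
    # before matching a str pattern, avoiding the TypeError above.
    if isinstance(line, bytes):
        line = line.decode("utf-8", errors="replace")
    return re.match(pattern, line)
```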
* Remove dataset alerts since they are outdated
* Remove s3fs check operator
* Bug 1632591 - Reflect operational state of job on success by default
* Update doc with default behavior
* Update test to reflect new default state of register_status
* Add dummy values to gcp extras section and dummy AWS credentials
* Delay uploading mozetl runner until execution time
* Add list of aws and gcp credentials for initializing in dev
* Use realistic values for dummy GCP keyfile
Co-Authored-By: Sunah Suh <github@sunahsuh.com>
* split off taar weekly jobs into a separate script
* added ExternalTaskSensor dependency on main_summary
* fixed dependency to point to clients_daily instead of main_summary
* fixes as per review
renamed `dag_weekly` to `taar_weekly` for the weekly TAAR DAG
corrected external_task_id and external_dag_id
* Added a `start_date` argument to the task
* removed Frank as owner and set myself as the owner of the task
removed Frank from alert recipients
* Add initial function for generating the mozetl runner
* Add tests for generate_runner
* Generate a runner for external modules
* Add missing changes to test_mozetl
* Add python3 support to Databricks clusters
* Set default python version to 3
* Update moz_databricks test pattern to include json payload
* Refactor mock_hook into a fixture
* Add test asserting value of `PYSPARK_VERSION`
* Fix environment variable name to PYSPARK_PYTHON
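On Databricks, the Python interpreter is chosen via the `PYSPARK_PYTHON` environment variable in the cluster spec (there is no `PYSPARK_VERSION` variable). A hedged sketch of what the corrected cluster configuration might look like; the helper name, Spark version, and node type are illustrative values, not this repo's actual settings:

```python
def build_cluster_spec(python_version=3):
    # Hypothetical builder for the new_cluster block of a Databricks
    # submit-run payload. Defaults to Python 3 per the commits above.
    spec = {
        "spark_version": "4.3.x-scala2.11",  # illustrative value
        "node_type_id": "c3.4xlarge",        # illustrative value
        "num_workers": 2,
        "spark_env_vars": {},
    }
    if python_version == 3:
        # PYSPARK_PYTHON points executors at the cluster's Python 3.
        spec["spark_env_vars"]["PYSPARK_PYTHON"] = "/databricks/python3/bin/python3"
    return spec
```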
* Fix #433 - Add support for alternative git path and branch in mozetl jobs
* Add test for setting alternative repo path
* Move churn_v2 job to moz_databricks
* Downgrade python version for churn-v2
* Address review by improving error handling and comments
* Increase databricks retries and retry delay to avoid api errors
* Use the databricks plugin instead of built-in hooks and operators
* Add basic test for mozdatabricks
* Fix error in moz_databricks with keyword argument
* Reword comment
* Source databricks code from apache/airflow:v1-10-stable
* Update imports and mocks for plugin structure
* Export backported Databricks plugin from v1-10-stable
* Add comments to backported databricks plugin
* Update links to point to revisions instead of branches
* Remove AirflowException as an unreachable code path
* Add create_incident flag to DatasetStatusOperator
* Address initial review
* Set dataset_alerts to automatically open incidents on failure
* Add a S3FSCheckSuccessOperator and fix imports
* Remove py36 from envlist due to snakebite import in sensors
* Fix #404 - Set the number of expected partitions as a lower bound
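Treating the expected partition count as a lower bound rather than an exact match means late-arriving extra partitions no longer fail the check. A minimal sketch of the idea; the function name and error message are illustrative, not the operator's actual code:

```python
def check_partitions(found, expected):
    # Lower-bound check: succeed when at least `expected` partitions
    # exist, instead of requiring an exact count.
    if found < expected:
        raise ValueError(
            f"expected at least {expected} partitions, found {found}"
        )
    return True
```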