TaskInstances are sometimes instantiated outside core Airflow with naive datetimes. When this happens, we now default to the time zone of the DAG if one is available, or to the default system time zone otherwise.
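A minimal sketch of that fallback, assuming the DAG exposes a `timezone` attribute; the helper name is hypothetical and this is not the exact Airflow implementation:

```python
from dateutil import tz

def make_aware(dt, dag=None):
    """Attach a time zone to a naive datetime (illustrative only)."""
    if dt.tzinfo is not None:
        return dt  # already timezone-aware, nothing to do
    # Prefer the DAG's time zone when one is set, otherwise fall
    # back to the system's local time zone.
    timezone = getattr(dag, "timezone", None) or tz.tzlocal()
    return dt.replace(tzinfo=timezone)
```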
Closes #2946 from bolkedebruin/AIRFLOW-1927
This enables Airflow and Celery Flower to live below root, i.e. to be served under a URL prefix. Draws on the work of Gaetan Semet (@Stibbons).
This closes #2723 and closes #2818.
Closes #2952 from bolkedebruin/AIRFLOW-1755
Update Celery to 4.0.2 to fix the error `TypeError: '<=' not supported between instances of 'NoneType' and 'int'`
Hi all,
I'd like to update Celery to version 4.0.2. While updating my Docker container to version 1.9, I caught this error:
```
worker_1 | [2018-01-03 10:34:29,934: CRITICAL/MainProcess] Unrecoverable error: TypeError("'<=' not supported between instances of 'NoneType' and 'int'",)
worker_1 | Traceback (most recent call last):
worker_1 |   File "/usr/local/lib/python3.6/site-packages/celery/worker/worker.py", line 203, in start
worker_1 |     self.blueprint.start(self)
worker_1 |   File "/usr/local/lib/python3.6/site-packages/celery/bootsteps.py", line 115, in start
worker_1 |     self.on_start()
worker_1 |   File "/usr/local/lib/python3.6/site-packages/celery/apps/worker.py", line 143, in on_start
worker_1 |     self.emit_banner()
worker_1 |   File "/usr/local/lib/python3.6/site-packages/celery/apps/worker.py", line 159, in emit_banner
worker_1 |     string(self.colored.reset(self.extra_info() or '')),
worker_1 |   File "/usr/local/lib/python3.6/site-packages/celery/apps/worker.py", line 188, in extra_info
worker_1 |     if self.loglevel <= logging.INFO:
worker_1 | TypeError: '<=' not supported between instances of 'NoneType' and 'int'
```
This is because I've been running Python 2 in my local environments, and the Docker image is Python 3: https://github.com/puckel/docker-airflow/pull/143/files
This is the issue in Celery: https://github.com/celery/celery/blob/0dde9df9d8dd5dbbb97ef75a81757bc2d9a4b33e/Changelog#L145
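For context, the failing comparison is legal on Python 2 but raises on Python 3, which is why the bug only surfaces in the Python 3 image. A minimal reproduction:

```python
# Python 2 allows ordering comparisons between None and int;
# Python 3 raises TypeError instead.
loglevel = None
try:
    if loglevel <= 20:  # 20 == logging.INFO
        pass
except TypeError as exc:
    print(exc)  # '<=' not supported between instances of 'NoneType' and 'int'
```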
Make sure you have checked _all_ steps below.
### JIRA
- [x] My PR addresses the following [Airflow JIRA](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "[AIRFLOW-XXX] My Airflow PR"
  - https://issues.apache.org/jira/browse/AIRFLOW-XXX
### Description
- [x] Here are some details about my PR, including screenshots of any UI changes:
### Tests
- [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:
### Commits
- [x] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
1. Subject is separated from body by a blank line
2. Subject is limited to 50 characters
3. Subject does not end with a period
4. Subject uses the imperative mood ("add", not
"adding")
5. Body wraps at 72 characters
6. Body explains "what" and "why", not "how"
- [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
Closes #2914 from Fokko/AIRFLOW-1967-update-celery
Since S3Hook has been reimplemented on top of AwsHook using boto3, its package dependencies need to be updated as well.
Closes #2790 from m1racoli/fix-setup-s3
python-daemon declares its docutils dependency in a setup_requires clause, so `python setup.py install` fails because that dependency is missing at setup time.
Closes #2765 from wrp/docutils
https://github.com/spulec/moto/pull/1048 introduced `docker` as a dependency of Moto, causing a conflict because Airflow uses `docker-py`. Since the two packages cannot be installed together, Moto is pinned to the version prior to that change.
JayDeBeApi made a backwards-incompatible change. This updates the JDBC hook's implementation and raises the required JayDeBeApi version to >= 1.1.1.
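For reference, a hedged sketch of a connection under the newer-style `jaydebeapi.connect()` signature; the driver class, URL, credentials, and jar path below are illustrative:

```python
import jaydebeapi

conn = jaydebeapi.connect(
    "org.postgresql.Driver",           # JDBC driver class
    "jdbc:postgresql://host/mydb",     # connection URL, a separate argument
    ["username", "password"],          # driver args (credentials)
    "/path/to/postgresql-jdbc.jar",    # driver jar on the classpath
)
cursor = conn.cursor()
cursor.execute("SELECT 1")
```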
Closes #2651 from r-richmond/AIRFLOW-926
The Celery config is currently part of the Celery executor definition, which is very inflexible for users who want to change it. In addition, Celery 4 is moving to lowercase setting names.
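A sketch of what an externalized, user-overridable config module could look like, using Celery 4's documented lowercase setting names; the values are illustrative, not the actual defaults:

```python
# Illustrative defaults only; real broker/backend URLs would come
# from airflow.cfg.
DEFAULT_CELERY_CONFIG = {
    'broker_url': 'redis://localhost:6379/0',      # was BROKER_URL
    'result_backend': 'redis://localhost:6379/0',  # was CELERY_RESULT_BACKEND
    'worker_concurrency': 16,                      # was CELERYD_CONCURRENCY
    'task_default_queue': 'default',               # was CELERY_DEFAULT_QUEUE
}
```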
Closes #2542 from bolkedebruin/upgrade_celery
By default `find_packages()` will find _any_ valid Python package, including things under tests. We don't want to install the tests packages into the Python path, so exclude those.
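A minimal sketch of the pattern (the actual setup.py has many more arguments):

```python
from setuptools import find_packages, setup

setup(
    name='apache-airflow',
    # Exclude the tests packages so they are not installed into
    # site-packages alongside the real code.
    packages=find_packages(exclude=['tests', 'tests.*']),
)
```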
Closes #2597 from ashb/AIRFLOW-1594-dont-install-tests
Clean up the way logging is done within Airflow. Remove the old logging.py and move to the airflow.utils.log.* interface. Stop setting up logging outside of the settings/configuration code. Move away from eager string formatting to logging_function(msg, *args).
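For illustration, the difference between the two styles (the logger name and message are hypothetical):

```python
import logging

log = logging.getLogger(__name__)
task_id = "example_task"

# Preferred: interpolation is deferred until the record is
# actually emitted.
log.info("Executing %s", task_id)

# Avoided: the string is built eagerly, even when INFO-level
# records are filtered out.
log.info("Executing {}".format(task_id))
```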
Closes #2592 from Fokko/AIRFLOW-1582-Improve-logging-structure
Make the Druid operator and hook more specific. This allows us to have a more flexible configuration, for example to ingest Parquet. Also get rid of the PyDruid extension, since it is more focused on querying Druid than on ingesting data; plain requests is sufficient to submit an indexing job. Add a test for the hive_to_druid operator to make sure it behaves as we expect. Furthermore, cleaned up the docstrings a bit.
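To illustrate why plain requests suffices, a hedged sketch of submitting an indexing task to the Druid overlord; the host and ingestion spec are assumptions:

```python
import requests

# Overlord URL and spec are illustrative; the real ingestion spec
# depends on the data source.
DRUID_TASK_ENDPOINT = "http://druid-overlord:8090/druid/indexer/v1/task"

index_spec = {
    "type": "index_hadoop",
    "spec": {},  # the actual ingestion spec goes here
}

response = requests.post(DRUID_TASK_ENDPOINT, json=index_spec)
response.raise_for_status()
print(response.json())  # e.g. {"task": "<task-id>"}
```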
Closes #2378 from Fokko/AIRFLOW-1324-make-more-general-druid-hook-and-operator
1. Upgrade qds_sdk to the latest version
2. Add support for running Zeppelin notebooks
3. Move QuboleHook initialization out of `__init__()` (see the sketch after this list)
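A minimal sketch of the lazy-initialization pattern from point 3; the class name is hypothetical and this is not the actual operator code:

```python
from airflow.contrib.hooks.qubole_hook import QuboleHook
from airflow.models import BaseOperator


class IllustrativeQuboleOperator(BaseOperator):
    """Sketch only: the hook is created on first use, not in __init__."""

    def get_hook(self):
        # Deferred construction: no hook (or connection lookup)
        # happens until the operator actually needs it.
        if getattr(self, "_hook", None) is None:
            self._hook = QuboleHook()
        return self._hook
```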
Closes #2322 from msumit/AIRFLOW-1192
For now, SecurityTests.test_csrf_rejection fails because the flask-wtf version specified in setup.py is too old. This PR fixes that.
Closes #2280 from sekikn/AIRFLOW-1180
Per Apache requirements, Airflow should be branded Apache Airflow. It is impossible to provide a forward-compatible automatic update path, so users will be required to upgrade manually.
Closes #2172 from bolkedebruin/AIRFLOW-1000
Add DatabricksSubmitRun Operator
In this PR, we contribute a DatabricksSubmitRun operator and a Databricks hook. This operator enables easy integration of Airflow with Databricks. In addition to the operator, we have created a databricks_default connection, an example DAG using this DatabricksSubmitRunOperator, and matching documentation.
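A short usage sketch in the spirit of the example DAG; the cluster spec and notebook path are illustrative:

```python
from airflow.contrib.operators.databricks_operator import (
    DatabricksSubmitRunOperator,
)

# In a real DAG this task would be created inside a DAG context.
notebook_run = DatabricksSubmitRunOperator(
    task_id="notebook_run",
    json={
        "new_cluster": {
            "spark_version": "2.1.0-db3-scala2.11",
            "num_workers": 2,
        },
        "notebook_task": {
            "notebook_path": "/Users/someone@example.com/PrepareData",
        },
    },
)
```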
Closes #2202 from andrewmchen/databricks-operator-squashed
This PR implements a hook to interface with Azure storage over wasb:// via azure-storage; adds sensors to check for blobs or prefixes; and adds an operator to transfer a local file to Blob Storage. The design is similar to that of the S3Hook in airflow.operators.S3_hook.
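Roughly what the hook wraps, sketched directly against azure-storage; the account, container, and paths are assumptions:

```python
from azure.storage.blob import BlockBlobService

service = BlockBlobService(account_name="myaccount", account_key="...")

# What the new sensors poll for: blob existence.
blob_exists = service.exists("mycontainer", "data/file.csv")

# What the transfer operator does: upload a local file to a blob.
service.create_blob_from_path("mycontainer", "data/file.csv", "/tmp/file.csv")
```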
Closes #2216 from hgrif/AIRFLOW-1065
We add the Apache-licensed bleach library and use it to sanitize HTML passed to Markup (which is supposed to be already escaped). This avoids some XSS issues with unsanitized user input being displayed.
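The gist of the fix, as a sketch (the input string is illustrative):

```python
import bleach
from markupsafe import Markup

user_input = '<script>alert("xss")</script><b>bold is fine</b>'

# Markup() marks a string as safe without escaping it, so the
# string must be sanitized first; bleach escapes disallowed tags.
safe_html = Markup(bleach.clean(user_input))
```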
Closes #2193 from saguziel/aguziel-xss
This PR includes a redis_hook and a redis_key_sensor to enable checking for key existence in Redis. It also updates the documentation and adds the relevant unit tests.
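A minimal sketch of the existence check the sensor performs, using the redis client directly; the host and key are assumptions:

```python
import redis

client = redis.StrictRedis(host="localhost", port=6379, db=0)

# exists() returns a truthy value when the key is present, which is
# the condition the sensor waits for.
if client.exists("my_key"):
    print("key found; the sensor condition would be met")
```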
- [x] Opened a PR on Github
- [x] My PR addresses the following Airflow JIRA issues:
  - https://issues.apache.org/jira/browse/AIRFLOW-999
- [x] The PR title references the JIRA issues. For example, "[AIRFLOW-1] My Airflow PR"
- [x] My PR adds unit tests
- [ ] __OR__ my PR does not need testing for this extremely good reason:
- [x] Here are some details about my PR:
- [ ] Here are screenshots of any UI changes, if appropriate:
- [x] Each commit subject references a JIRA issue. For example, "[AIRFLOW-1] Add new feature"
- [x] Multiple commits addressing the same JIRA issue have been squashed
- [x] My commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
1. Subject is separated from body by a blank line
2. Subject is limited to 50 characters
3. Subject does not end with a period
4. Subject uses the imperative mood ("add", not
"adding")
5. Body wraps at 72 characters
6. Body explains "what" and "why", not "how"
Closes #2165 from msempere/AIRFLOW-999/support-for-redis-database