Since S3Hook has been reimplemented on top of AwsHook using boto3, its package dependencies need to be updated as well.
Closes #2790 from m1racoli/fix-setup-s3
python-daemon declares its docutils dependency in a setup_requires clause, so 'python setup.py install' fails because that dependency is missing.
Closes #2765 from wrp/docutils
https://github.com/spulec/moto/pull/1048 introduced `docker` as a dependency in Moto, causing a conflict because Airflow uses `docker-py`. Since the two packages cannot be installed together, Moto is pinned to the version prior to that change.
JayDeBeApi made a backwards-incompatible change. This updates the JDBC hook's implementation and changes the required JayDeBeApi version to >= 1.1.1.
Closes #2651 from r-richmond/AIRFLOW-926
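For illustration, a minimal sketch of the connect() call style required by JayDeBeApi >= 1.1.1, assuming the JDBC URL became a separate positional argument in the 1.x line (the driver class, URL, credentials, and jar path below are hypothetical placeholders):

    import jaydebeapi

    # JayDeBeApi >= 1.1 takes the driver class, the JDBC URL, the
    # credentials, and the path to the driver jar as separate arguments.
    conn = jaydebeapi.connect(
        "org.postgresql.Driver",           # hypothetical driver class
        "jdbc:postgresql://host/db",       # hypothetical JDBC URL
        ["user", "password"],              # hypothetical credentials
        "/path/to/postgresql-jdbc.jar",    # hypothetical jar location
    )
    curs = conn.cursor()
    curs.execute("SELECT 1")
    print(curs.fetchall())
    curs.close()
    conn.close()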
The celery config is currently part of the celery executor definition. This is really inflexible for users wanting to change it. In addition, Celery 4 is moving to lowercase setting names.
Closes #2542 from bolkedebruin/upgrade_celery
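For illustration, a minimal sketch of a Celery 4 style configuration using the new lowercase setting names (the broker and backend URLs are hypothetical; the old uppercase Celery 3 names appear in the comments):

    from celery import Celery

    app = Celery("airflow.executors.celery_executor")
    app.conf.update(
        broker_url="redis://localhost:6379/0",  # formerly BROKER_URL
        result_backend="rpc://",                # formerly CELERY_RESULT_BACKEND
        task_default_queue="default",           # formerly CELERY_DEFAULT_QUEUE
    )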
By default `find_packages()` will find _any_ valid Python package, including things under tests. We don't want to install the tests packages into the Python path, so exclude those.
Closes #2597 from ashb/AIRFLOW-1594-dont-install-tests
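A minimal sketch of the corresponding setup.py change (the package name and exclude patterns here are illustrative):

    from setuptools import find_packages, setup

    setup(
        name="apache-airflow",
        # Exclude the test packages so they are not installed
        # into site-packages alongside the real code.
        packages=find_packages(exclude=["tests", "tests.*"]),
    )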
Clean up the way logging is done within Airflow. Remove the old logging.py and move to the airflow.utils.log.* interface. Remove setting up the logging outside of the settings/configuration code. Move away from string formatting to logging_function(msg, *args).
Closes #2592 from Fokko/AIRFLOW-1582-Improve-logging-structure
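For illustration, the difference between the two logging styles; with %-style arguments the message is only formatted if the record is actually emitted:

    import logging

    log = logging.getLogger(__name__)
    dag_id = "example_dag"

    # Old style: the string is always formatted, even if the level is disabled.
    log.info("Processing dag {}".format(dag_id))
    # New style: formatting is deferred until the record is emitted.
    log.info("Processing dag %s", dag_id)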
Make the Druid operator and hook more specific. This allows us to have a more flexible configuration, for example to ingest Parquet. Also get rid of the PyDruid extension, since it is more focused on querying Druid than on ingesting data; plain requests is sufficient to submit an indexing job. Add a test to the hive_to_druid operator to make sure it behaves as we expect. Furthermore, the docstrings were cleaned up a bit.
Closes #2378 from Fokko/AIRFLOW-1324-make-more-general-druid-hook-and-operator
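For illustration, a minimal sketch of what submitting an indexing task with plain requests can look like, assuming the standard Druid overlord task endpoint (the host, port, and ingestion spec below are hypothetical):

    import requests

    # Hypothetical overlord location; /druid/indexer/v1/task is the
    # standard Druid indexer API path.
    url = "http://overlord.example.com:8090/druid/indexer/v1/task"

    ingestion_spec = {"type": "index_hadoop", "spec": {}}  # placeholder spec

    response = requests.post(url, json=ingestion_spec)
    response.raise_for_status()
    print(response.json())  # contains the task id on success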
1. Upgrade qds_sdk version to latest
2. Add support to run Zeppelin Notebooks
3. Move out initialization of QuboleHook from init()
Closes #2322 from msumit/AIRFLOW-1192
For now, SecurityTests.test_csrf_rejection fails because the flask-wtf version specified in setup.py is too old. This PR fixes it.
Closes #2280 from sekikn/AIRFLOW-1180
Per Apache requirements, Airflow should be branded Apache Airflow. It is impossible to provide a forward-compatible automatic update path, and users will be required to upgrade manually.
Closes #2172 from bolkedebruin/AIRFLOW-1000
Add DatabricksSubmitRun Operator
In this PR, we contribute a DatabricksSubmitRun operator and a Databricks hook. This operator enables easy integration of Airflow with Databricks. In addition to the operator, we have created a databricks_default connection, an example_dag using this DatabricksSubmitRunOperator, and matching documentation.
Closes #2202 from andrewmchen/databricks-operator-squashed
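For illustration, a minimal sketch of the operator in a DAG (the cluster spec and notebook path below are hypothetical):

    from datetime import datetime

    from airflow import DAG
    from airflow.contrib.operators.databricks_operator import (
        DatabricksSubmitRunOperator,
    )

    dag = DAG("databricks_demo", start_date=datetime(2017, 1, 1))

    # Hypothetical cluster spec and notebook path.
    notebook_run = DatabricksSubmitRunOperator(
        task_id="notebook_run",
        databricks_conn_id="databricks_default",
        json={
            "new_cluster": {"spark_version": "2.1.0-db3-scala2.11",
                            "num_workers": 2},
            "notebook_task": {"notebook_path": "/Users/someone/demo"},
        },
        dag=dag,
    )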
This PR implements a hook to interface with Azure storage over wasb:// via azure-storage; adds sensors to check for blobs or prefixes; and adds an operator to transfer a local file to the Blob Storage. Design is similar to that of the S3Hook in airflow.operators.S3_hook.
Closes #2216 from hgrif/AIRFLOW-1065
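For illustration, a minimal sketch of the sensor and transfer operator in a DAG, assuming the contrib module paths below (the container, blob, and file names are hypothetical):

    from datetime import datetime

    from airflow import DAG
    from airflow.contrib.operators.file_to_wasb import FileToWasbOperator
    from airflow.contrib.sensors.wasb_sensor import WasbBlobSensor

    dag = DAG("wasb_demo", start_date=datetime(2017, 1, 1))

    # Wait until a (hypothetical) blob appears in the container.
    wait_for_blob = WasbBlobSensor(
        task_id="wait_for_blob",
        container_name="mycontainer",
        blob_name="input/data.csv",
        dag=dag,
    )

    # Upload a local file to Blob Storage once the sensor succeeds.
    upload_file = FileToWasbOperator(
        task_id="upload_file",
        file_path="/tmp/report.csv",
        container_name="mycontainer",
        blob_name="output/report.csv",
        dag=dag,
    )

    upload_file.set_upstream(wait_for_blob)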
We add the Apache-licensed bleach library and use it to sanitize HTML passed to Markup (which is supposed to be already escaped). This avoids some XSS issues with unsanitized user input being displayed.
Closes #2193 from saguziel/aguziel-xss
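For illustration, a minimal sketch of the sanitize-then-mark-safe pattern (the allowed tags below are illustrative):

    import bleach
    from markupsafe import Markup

    user_input = '<b>hello</b><script>alert("xss")</script>'

    # bleach.clean escapes anything outside the tag whitelist, so the
    # <script> payload is neutralized before Markup marks the string
    # as safe for rendering.
    safe_html = Markup(bleach.clean(user_input, tags=["b", "i", "a"]))
    print(safe_html)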
This PR includes a redis_hook and a redis_key_sensor to enable checking for key existence in Redis. It also updates the documentation and adds the relevant unit tests.
Closes #2165 from msempere/AIRFLOW-999/support-for-redis-database
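For illustration, a minimal sketch of the key sensor in a DAG, assuming it lives under airflow.contrib.sensors (the key name and connection id are hypothetical):

    from datetime import datetime

    from airflow import DAG
    from airflow.contrib.sensors.redis_key_sensor import RedisKeySensor

    dag = DAG("redis_demo", start_date=datetime(2017, 1, 1))

    # Pokes until the (hypothetical) key exists in the configured
    # Redis database, then lets downstream tasks proceed.
    wait_for_key = RedisKeySensor(
        task_id="wait_for_key",
        redis_conn_id="redis_default",
        key="jobs:nightly:done",
        dag=dag,
    )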
This silences deprecation warnings, e.g. airflow/airflow/utils/dag_processing.py:578: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead.
Closes #2082 from imbaczek/bug871
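For illustration, the one-line change this implies in calling code:

    import logging

    log = logging.getLogger(__name__)

    log.warn("deprecated spelling")      # emits a DeprecationWarning
    log.warning("preferred spelling")    # the supported method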
Submitting on behalf of plypaul.
Please accept this PR that addresses the following issues:
- https://issues.apache.org/jira/browse/AIRFLOW-219
- https://issues.apache.org/jira/browse/AIRFLOW-398
Testing Done:
- Running on Airbnb prod (though on a different mergebase) for many months
Credits:
Impersonation work: georgeke did most of the work, but plypaul did quite a bit of work too.
Cgroups: plypaul did most of the work; I just did some touch-up/bug fixes (see commit history, the cgroups + impersonation commit is actually plypaul's, not mine).
Closes #1934 from aoen/ddavydov/cgroups_and_impersonation_after_rebase
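As a usage illustration of the impersonation feature, a minimal sketch assuming the run_as_user task attribute this work introduced (the task and unix user name are hypothetical):

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash_operator import BashOperator

    dag = DAG("impersonation_demo", start_date=datetime(2017, 1, 1))

    # With impersonation, the command runs as the given unix user
    # instead of the user running the worker process.
    report = BashOperator(
        task_id="generate_report",
        bash_command="whoami",
        run_as_user="analytics",   # hypothetical unix account
        dag=dag,
    )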
This implements a framework for API calls to Airflow. Currently all access is done via the CLI or web UI. Especially in the context of the CLI this raises security concerns, which can be alleviated with a secured API call over the wire. Secondly, integration with other systems is a bit harder if you have to call a CLI. JSON is used for the public-facing endpoints. As an example, the trigger_dag functionality is now made into an API call.
Backwards compatibility is retained by switching to a LocalClient.
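For illustration, a minimal sketch of triggering a DAG through the client interface, assuming the LocalClient mentioned above is exposed as airflow.api.client.local_client.Client with an API base URL and auth argument (the dag id and conf payload are hypothetical):

    from airflow.api.client.local_client import Client

    # An in-process client; the same interface can be served over the
    # wire by the JSON endpoints. Constructor arguments are assumptions.
    client = Client(api_base_url=None, auth=None)

    # Hypothetical dag id and configuration payload.
    result = client.trigger_dag(dag_id="example_dag",
                                conf='{"date": "2017-01-01"}')
    print(result)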