Commit graph

391 commits

Author SHA1 Message Date
Joy Gao f5115b7e6a [AIRFLOW-2458] Add cassandra-to-gcs operator
Closes #3354 from jgao54/cassandra-to-gcs
2018-05-18 02:02:57 +01:00
Marcus Rehm 7c233179e9 [AIRFLOW-2420] Azure Data Lake Hook
Add AzureDataLakeHook as a first step to enable
Airflow connect to
Azure Data Lake.

The hook has a simple interface to upload and
download files with all
parameters available in Azure Data Lake sdk and
also a check_for_file
to query if a file exists in data lake.

[AIRFLOW-2420] Add functionality for Azure Data
Lake

Make sure you have checked _all_ steps below.

### JIRA
- [x] My PR addresses the following [Airflow JIRA](https://issues.apache.org/jira/browse/AIRFLOW-2420) issues and references them in the PR title.
    - https://issues.apache.org/jira/browse/AIRFLOW-2420

### Description
- [x] Here are some details about my PR, including
screenshots of any UI changes:
       This PR creates Azure Data Lake hook
(adl_hook.AdlHook) and all the setup required to
create a new Azure Data Lake connection.

### Tests
- [x] My PR adds the following unit tests __OR__
does not need testing for this extremely good
reason:
       Adds tests to airflow.hooks.adl_hook.py in
tests.hooks.test_adl_hook.py

### Commits
- [x] My commits all reference JIRA issues in
their subject lines, and I have squashed multiple
commits if they address the same issue. In
addition, my commits follow the guidelines from
"[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not
"adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

### Documentation
- [x] In case of new functionality, my PR adds
documentation that describes how to use it.
    - When adding new operators/hooks/sensors, the
autoclass documentation generation needs to be
added.

### Code Quality
- [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`

Closes #3333 from marcusrehm/master
2018-05-15 10:30:54 -07:00
alberto.calderari 6c19468e0b [AIRFLOW-2110][AIRFLOW-2122] Enhance Http Hook
- Use a header in passed in the "extra" argument and
  add tenacity retry
- Fix the tests with proper mocking

Closes #3071 from albertocalderari/master
2018-05-14 21:52:22 +02:00
Bolke de Bruin 648e1e6930 [AIRFLOW-2425] Add lineage support
Add lineage support by having inlets and outlets
that are made available to dependent upstream or
downstream tasks.

If configured to do so, it can send lineage data to a
backend. Apache Atlas is supported out of the box.

Closes #3321 from bolkedebruin/lineage_exp
2018-05-14 09:09:25 +02:00
Joy Gao df693cd278 [AIRFLOW-2457] Update FAB version requirement
Closes #3349 from jgao54/update-fab-version
2018-05-12 09:03:12 +02:00
Jordan Zucker 4d43b78f11 [AIRFLOW-2333] Add Segment Hook and TrackEventOperator
Add support for Segment with an accompanying hook
and an
operator for sending track events

Closes #3335 from jzucker2/add-segment-support
2018-05-11 09:25:19 +02:00
Luke Bodeen e5f2a38d6a [AIRFLOW-1978] Add WinRM windows operator and hook
Closes #3316 from cloneluke/winrm_connector2
2018-05-08 11:12:59 -07:00
Fokko Driesprong c5ed7b1fbd [AIRFLOW-2418] Bump Flask-WTF
Flask-appbuilder needs at least 0.14.2

Closes #3313 from Fokko/AIRFLOW-2418
2018-05-06 11:25:10 +02:00
Fokko Driesprong 0ff434a9b7 Revert "[AIRFLOW-2391] Fix to Flask 0.12.2"
This reverts commit 3368f4258c.
2018-04-30 10:35:05 +02:00
Tao feng 700c0f488f [AIRFLOW-2389] Create a pinot db api hook
Closes #3274 from feng-tao/pinot_db_hook
2018-04-30 08:41:43 +02:00
Kaxil Naik e9b74b68aa [AIRFLOW-2266][AIRFLOW-2343] Remove google-cloud-dataflow dependency
This is caused due to the fact that the latest
release (2.4) for apache-beam[gcp] is not
available for Python 3.x. Also as we are using
Google's discovery based API for all google cloud
related commands we don't require to import
google-cloud-dataflow package

Closes #3273 from kaxil/patch-3
2018-04-28 21:24:15 +02:00
Fokko Driesprong 3368f4258c [AIRFLOW-2391] Fix to Flask 0.12.2
Flask 0.12.3 has issues with Airflow and needs to
be fixed.
Therefore lock the version to 0.12.2.

Closes #3277 from Fokko/fd-fix-master-ci
2018-04-28 20:25:13 +02:00
Giovanni Lanzani 6c45b8c5f2 [AIRFLOW-2336] Use hmsclient in hive_hook
The package hmsclient is Python2/3 compatible and
offers a handy context
manager to handle opening and closing connections.

Closes #3239 from gglanzani/AIRFLOW-2336
2018-04-25 12:23:59 +02:00
Sam Garrett efc316d2ad [AIRFLOW-2345] pip is not used in this setup.py
Closes #3241 from sinemetu1/patch-1
2018-04-20 10:03:08 +02:00
Marius van Niekerk e95a1251b7 [AIRFLOW-2240][DASK] Added TLS/SSL support for the dask-distributed scheduler.
As of 0.17.0 dask distributed has support for
TLS/SSL.

[dask] Added TLS/SSL support for the
dask-distributed scheduler.

As of 0.17.0 dask distributed has support for
TLS/SSL.

Add a test for tls under dask distributed

Closes #2683 from mariusvniekerk/dask-ssl
2018-04-18 09:45:52 -07:00
Kengo Seki 6e82f1d7c9 [AIRFLOW-2299] Add S3 Select functionarity to S3FileTransformOperator
Currently, S3FileTransformOperator downloads the
whole file from S3
before transforming and uploading it. Adding
extraction feature using
S3 Select to this operator improves its efficiency
and usability.

Closes #3227 from sekikn/AIRFLOW-2299
2018-04-17 10:53:05 +02:00
Bolke de Bruin c7a472ed6b [AIRFLOW-2287] Fix incorrect ASF headers
Closes #3219 from bolkedebruin/fix_header
2018-04-14 09:13:23 +02:00
Kevin Yang ec38ba9594 [AIRFLOW-1325] Add ElasticSearch log handler and reader
Closes #3214 from yrqls21/kevin_yang_add_es_task_handler
2018-04-13 11:09:50 +02:00
Kengo Seki 5cb530b455 [AIRFLOW-2293] Fix S3FileTransformOperator to work with boto3
S3FileTransformOperator doesn't work for now since
it uses a function
which is no longer supported by boto3. This PR
replaces it with a
valid function and adds a unit test for this
operator.

Closes #3200 from sekikn/AIRFLOW-2293
2018-04-12 09:28:18 +02:00
devinXL8 c4ba1051a7 [AIRFLOW-2200] Add snowflake operator with tests 2018-04-04 15:10:00 -04:00
Bolke de Bruin 68bbffd315 [AIRFLOW-1430] Solve GPL dependency
One of the dependencies was pulling in
a GPL library by default. With the new
release of python-nvd3 this is now solved.

Closes #3160 from bolkedebruin/legal
2018-03-27 16:08:53 +02:00
Fabrice 821ced78e7 [AIRFLOW-2060] Update pendulum version to 1.4.4
This fixes a task clearing issue with deep copy

Closes #3154 from cinhil/AIRFLOW-2060
2018-03-24 09:01:59 +01:00
Joy Gao 05e1861e24 [AIRFLOW-1433][AIRFLOW-85] New Airflow Webserver UI with RBAC support
Closes #3015 from jgao54/rbac
2018-03-23 09:18:48 +01:00
Tao feng 7a880a7e98 [AIRFLOW-2183] Refactor DruidHook to enable sql
Refactor DruidHook to be able to issue druid sql query to druid broker

Closes #3105 from feng-tao/airflow-2183
2018-03-14 09:20:20 +01:00
biellls f80138486e [AIRFLOW-442] Add SFTPHook
Closes #2487 from sdiazb/sftp_hook
2018-03-10 15:12:07 +01:00
Kaxil Naik 0ef6361f4b [AIRFLOW-2187] Fix Broken Travis CI due to AIRFLOW-2123
- Added the packages to ignore for Python 3
2018-03-07 00:23:16 +00:00
Fokko Driesprong 976fd1245a [AIRFLOW-2123] Install CI dependencies from setup.py
Install the dependencies from setup.py so we keep all the dependencies
in one single place

Closes #3054 from Fokko/fd-fix-ci-2
2018-03-05 22:46:45 +00:00
Moe Nadal 667a26ce49 [AIRFLOW-1551] Add operator to trigger Jenkins job
Closes #2553 from moe-nadal-ck/AIRFLOW-1551/AddJenkinsOperator
2018-02-27 11:51:49 +01:00
Marcos Bernardelli beadcd327c [AIRFLOW-2125] Using binary package psycopg2-binary
Closes #3055 from bern4rdelli/AIRFLOW-2125
2018-02-24 16:16:00 +01:00
Bolke de Bruin 772dbae298 [AIRFLOW-1927] Convert naive datetimes for TaskInstances
TaskInstances are sometimes instantiated outside
core
Airflow with naive datetimes. In case this happens
we
now default to using the time zone of the DAG if
that
is available or the default system time zone.

Closes #2946 from bolkedebruin/AIRFLOW-1927
2018-02-06 17:26:08 +01:00
Fokko Driesprong e76cda0ff5 [AIRFLOW-2038] Add missing kubernetes dependency for dev
When doing initdb, it fails on the kubernetes
dependency from the
examples

Closes #2978 from Fokko/fd-fix-dependencies
2018-02-05 20:52:28 +01:00
Bolke de Bruin 1e36b37b68 [AIRFLOW-1755] Allow mount below root
This enables Airflow and Celery Flower to live
below root. Draws on the work of Geatan Semet
(@Stibbons).

This closes #2723 and closes #2818

Closes #2952 from bolkedebruin/AIRFLOW-1755
2018-01-19 18:54:26 +01:00
Bolke de Bruin 88130a5d7e [AIRFLOW-2003] Use flask-caching instead of flask-cache
Flask-cache has been unmaintained for over three
years,
flask-caching is the community supported version.

Closes #2944 from bolkedebruin/AIRFLOW-2003
2018-01-15 21:12:03 +01:00
Bolke de Bruin 1abe7f6d54 Merge pull request #2853 from dimberman/Airflow_1517_kubenetes_operator 2018-01-12 19:02:52 +01:00
fenglu-g cc9295fe37 [AIRFLOW-1953] Add labels to dataflow operators
Closes #2913 from fenglu-g/master
2018-01-03 11:16:39 -08:00
Fokko Driesprong b9f4a7437e [AIRFLOW-1967] Update Celery to 4.0.2
Update Celery to 4.0.2 for fixing error
TypeError: '<=' not supported between instances of
'NoneType' and 'int'

Hi all,

I'd like to update Celery to version 4.0.2. While
updating my Docker container to version 1.9, I
caught this error:
```
worker_1     | [2018-01-03 10:34:29,934:
CRITICAL/MainProcess] Unrecoverable error:
TypeError("'<=' not supported between instances of
'NoneType' and 'int'",)
worker_1     | Traceback (most recent call last):
worker_1     |   File "/usr/local/lib/python3.6
/site-packages/celery/worker/worker.py", line 203,
in start
worker_1     |     self.blueprint.start(self)
worker_1     |   File "/usr/local/lib/python3.6
/site-packages/celery/bootsteps.py", line 115, in
start
worker_1     |     self.on_start()
worker_1     |   File "/usr/local/lib/python3.6
/site-packages/celery/apps/worker.py", line 143,
in on_start
worker_1     |     self.emit_banner()
worker_1     |   File "/usr/local/lib/python3.6
/site-packages/celery/apps/worker.py", line 159,
in emit_banner
worker_1     |
string(self.colored.reset(self.extra_info() or
'')),
worker_1     |   File "/usr/local/lib/python3.6
/site-packages/celery/apps/worker.py", line 188,
in extra_info
worker_1     |     if self.loglevel <=
logging.INFO:
worker_1     | TypeError: '<=' not supported
between instances of 'NoneType' and 'int'
```

This is because I've been running Python 2 in my
local environments, and the Docker image is Python
3:
https://github.com/puckel/docker-airflow/pull/143/files

This is the issue in Celery:
https://github.com/celery/celery/blob/0dde9df9d8dd5dbbb97ef75a81757bc2d9a4b33e/Changelog#L145

Make sure you have checked _all_ steps below.

### JIRA
- [x] My PR addresses the following [Airflow JIRA](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For
example, "[AIRFLOW-XXX] My Airflow PR"
    - https://issues.apache.org/jira/browse/AIRFLOW-XXX

### Description
- [x] Here are some details about my PR, including
screenshots of any UI changes:

### Tests
- [x] My PR adds the following unit tests __OR__
does not need testing for this extremely good
reason:

### Commits
- [x] My commits all reference JIRA issues in
their subject lines, and I have squashed multiple
commits if they address the same issue. In
addition, my commits follow the guidelines from
"[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not
"adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

- [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`

Closes #2914 from Fokko/AIRFLOW-1967-update-celery
2018-01-03 09:20:00 -08:00
Bolke de Bruin 5e4d7d8d7d [AIRFLOW-XXX] Pin sqlalchemy dependency 2017-12-29 12:28:51 +01:00
Daniel Imberman 78ff2fc180 [AIRFLOW-1517] Kubernetes Operator 2017-12-26 08:45:31 -08:00
William Pursell 355135bcb8 [AIRFLOW-1915] Relax flask-wtf dependency specification
Closes #2876 from wrp/flask-wtf
2017-12-22 13:57:47 +01:00
William Pursell a5ca8cdce9 [AIRFLOW-1938] Clean up unused exception
There is no longer the possibility of a
GitCommandError (since cc4404b5f7)

Closes #2898 from wrp/setup
2017-12-22 13:49:37 +01:00
Chris Riccomini cc4404b5f7 [AIRFLOW-1938] Remove tag version check in setup.py
Closes #2889 from criccomini/AIRFLOW-1938
2017-12-19 12:29:39 -08:00
Kamil Chmielewski 2dbd81fa81 [AIRFLOW-1896] FIX bleach <> html5lib incompatibility
Running airflow with bleach 2.0.0 can cause:
`ImportError: No module named base`
https://github.com/mozilla/bleach/issues/267

This was resolved in https://github.com/mozilla/bleach/releases/tag/v2.1.2

Closes #2858 from kamilchm/patch-1
2017-12-09 09:34:34 +01:00
Bolke de Bruin 518a41acf3 [AIRFLOW-1826] Update views to use timezone aware objects 2017-11-27 15:54:27 +01:00
Bolke de Bruin 2f168634aa [AIRFLOW-1807] Force use of time zone aware db fields
This change will check if all date times being stored are
indeed timezone aware.
2017-11-27 15:54:27 +01:00
Bolke de Bruin c857436b75 [AIRFLOW-1808] Convert all utcnow() to time zone aware
datetime.utcnow() does not set time zone information.
2017-11-27 15:54:20 +01:00
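The distinction behind the AIRFLOW-1808 entry above can be illustrated with the standard library alone. This is a minimal sketch, not the code from that commit; the helper name `aware_utcnow` is hypothetical and is not taken from Airflow.

```python
import warnings
from datetime import datetime, timezone


def aware_utcnow():
    """Return the current UTC time with tzinfo attached (hypothetical helper)."""
    return datetime.now(timezone.utc)


with warnings.catch_warnings():
    # datetime.utcnow() is deprecated on recent Python versions; silence the
    # warning so the naive value can still be shown for comparison.
    warnings.simplefilter("ignore", DeprecationWarning)
    naive = datetime.utcnow()

aware = aware_utcnow()

print(naive.tzinfo)  # no tzinfo: nothing marks this value as UTC
print(aware.tzinfo)  # tzinfo is set, so the value is unambiguous
```

A naive value like `naive` cannot be compared with an aware one without first deciding which zone it belongs to, which is the ambiguity the entry describes.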
Bolke de Bruin a47255fb2d [AIRFLOW-1804] Add time zone configuration options
Time zone defaults to UTC as is the default now in order
to maintain backwards compatibility.
2017-11-27 15:53:03 +01:00
Cedrik Neumann 5157b5a763 [AIRFLOW-1817] use boto3 for s3 dependency
Since S3Hook is reimplemented based on the AwsHook
using boto3, its package dependencies need to be
updated as well.

Closes #2790 from m1racoli/fix-setup-s3
2017-11-15 11:55:13 +01:00
Brian Charous cbb00d4055 [AIRFLOW-1102] Upgrade Gunicorn >=19.4.0
Closes #2775 from briancharous/upgrade-gunicorn
2017-11-10 08:49:41 +01:00
William Pursell 9425d359b0 [AIRFLOW-646] Add docutils to setup_requires
python-daemon declares its docutils dependency in a setup_requires
clause, and 'python setup.py install' fails since it misses
that dependency.

Closes #2765 from wrp/docutils
2017-11-09 09:15:12 -08:00
Stefanie Grunwald a61d9444cd [AIRFLOW-1669] Fix Docker and pin Moto to 1.1.19
https://github.com/spulec/moto/pull/1048 introduced `docker` as a
dependency in Moto, causing a conflict as Airflow uses `docker-py`. As
both packages don't work together, Moto is pinned to the version
prior to that change.
2017-11-02 14:23:32 +01:00
Sumit Maheshwari c5776375fd [AIRFLOW-1315] Add Qubole File & Partition Sensors
Closes #2401 from msumit/AIRFLOW-1315
2017-10-31 19:32:07 +05:30
r-richmond 21257e8f06 [AIRFLOW-926] Fix JDBC Hook
JayDeBeApi made a backwards incompatible change
This updates the JDBC Hook's implementation
and changes the required JayDeBeApi to >= 1.1.1

Closes #2651 from r-richmond/AIRFLOW-926
2017-10-22 20:04:30 +02:00
fenglu-g 7cb818bbac [AIRFLOW-1723] Support sendgrid in email backend
Closes #2695 from fenglu-g/master
2017-10-18 12:27:14 -07:00
Bolke de Bruin 65f3b468a2 [AIRFLOW-1527] Refactor celery config
The celery config is currently part of the celery executor definition.
This is really inflexible for users wanting to change it. In addition
Celery 4 is moving to lowercase.

Closes #2542 from bolkedebruin/upgrade_celery
2017-09-25 11:19:16 -07:00
Bolke de Bruin fa1dc1eb20 Revert "[AIRFLOW-1368] Automatically remove Docker container on exit"
This reverts commit 46c86a5cd2.
2017-09-24 19:35:28 +02:00
Nathaniel Varona 46c86a5cd2 [AIRFLOW-1368] Automatically remove Docker container on exit
Closes #2411 from nathanielvarona/docker-operator
2017-09-22 10:15:23 -07:00
Ash Berlin-Taylor a6b23a36e0 [AIRFLOW-1594] Don't install test packages into python root.[]
By default `find_packages()` will find _any_ valid
python package,
including things under tests. We don't want to
install the tests
packages into the python path, so exclude those.

Closes #2597 from ashb/AIRFLOW-1594-dont-install-tests
2017-09-13 10:11:39 +02:00
Fokko Driesprong a7a518902d [AIRFLOW-1582] Improve logging within Airflow
Clean the way of logging within Airflow. Remove
the old logging.py and
move to the airflow.utils.log.* interface. Remove
setting the logging
outside of the settings/configuration code. Move
away from the string
format to logging_function(msg, *args).

Closes #2592 from Fokko/AIRFLOW-1582-Improve-logging-structure
2017-09-13 09:36:58 +02:00
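The move to `logging_function(msg, *args)` mentioned in the AIRFLOW-1582 entry above can be sketched with the standard `logging` module. This is an illustration under assumptions, not code from the commit: the logger name, the `ListHandler` class and the `dag_id` value are all hypothetical.

```python
import logging


class ListHandler(logging.Handler):
    """Collect log records in memory so they can be inspected (hypothetical)."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


log = logging.getLogger("example.lazy_logging")  # hypothetical logger name
log.setLevel(logging.INFO)
log.propagate = False
handler = ListHandler()
log.addHandler(handler)

dag_id = "example_dag"  # hypothetical value

# The template and its arguments are passed separately; the logging module
# only interpolates them when a record is actually emitted.
log.info("Processing DAG %s", dag_id)

# Below the INFO threshold: no record is created, so nothing is formatted.
log.debug("Skipped details for %s", dag_id)

record = handler.records[0]
print(record.msg, record.args, record.getMessage())
```

Keeping the template separate from its arguments also leaves `record.msg` constant across calls, which makes messages easier to group downstream.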
Dan Fuller aa95f25796 [AIRFLOW-1573] Remove `thrift < 0.10.0` requirement
Closes #2574 from dan-disqus/Thrift
2017-09-11 13:14:47 +02:00
Fokko Driesprong de99aa20f4 [AIRFLOW-1324] Generalize Druid operator and hook
Make the druid operator and hook more specific.
This allows us to
have a more flexible configuration, for example
ingest parquet.
Also get rid of the PyDruid extension since it is
more focussed on
querying druid, rather than ingesting data. Just
requests is
sufficient to submit an indexing job. Add a test
to the hive_to_druid
operator to make sure it behaves as we expect.
Furthermore cleaned
up the docstring a bit

Closes #2378 from Fokko/AIRFLOW-1324-make-more-general-druid-hook-and-operator
2017-08-18 21:34:03 +02:00
Jay fe0edeaab5 [AIRFLOW-756][AIRFLOW-751] Replace ssh hook, operator & sftp operator with paramiko based
Closes #1999 from jhsenjaliya/AIRFLOW-756
2017-07-20 22:07:45 +02:00
Feng Lu cf2605d3e5 [AIRFLOW-1338] Fix incompatible GCP dataflow hook
Closes #2388 from fenglu-g/master
2017-06-23 13:26:16 -07:00
Kengo Seki 0f55477ccb [AIRFLOW-1172] Support nth weekday of the month cron expression
Closes #2321 from sekikn/AIRFLOW-1172
2017-06-14 17:59:02 -07:00
Sumit Maheshwari 6be02475f8 [AIRFLOW-1192] Some enhancements to qubole_operator
1. Upgrade qds_sdk version to latest
2. Add support to run Zeppelin Notebooks
3. Move out initialization of QuboleHook from
init()

Closes #2322 from msumit/AIRFLOW-1192
2017-06-07 09:09:50 +02:00
Stanislav Kudriashev d2d3e49ca0 [AIRFLOW-1201] Update deprecated 'nose-parameterized'
The 'parameterized' package should be used now.

Closes #2298 from skudriashev/airflow-1201
2017-05-16 11:34:52 +02:00
Kengo Seki ae61987945 [AIRFLOW-1180] Fix flask-wtf version for test_csrf_rejection
For now, SecurityTests.test_csrf_rejection fails
because flask-wtf version specified in setup.py is
too old.
This PR fixes it.

Closes #2280 from sekikn/AIRFLOW-1180
2017-05-13 21:03:51 +02:00
Niels Zeilemaker ac9ccb1518 [AIRFLOW-1179] Fix Pandas 0.2x breaking Google BigQuery change
Closes #2279 from NielsZeilemaker/AIRFLOW-1179
2017-05-09 09:42:32 -07:00
Richard Lee f5bfda0d64 [AIRFLOW-945][AIRFLOW-941] Remove psycopg2 connection workaround
Closes #2272 from dlackty/AIRFLOW-945
2017-05-04 21:34:38 +02:00
Bolke de Bruin 4fb05d8cc7 [AIRFLOW-1000] Rebrand distribution to Apache Airflow
Per Apache requirements Airflow should be branded
Apache Airflow.
It is impossible to provide a forward compatible
automatic update
path and users will be required to manually
upgrade.

Closes #2172 from bolkedebruin/AIRFLOW-1000
2017-04-17 10:09:47 +02:00
Andrew Chen 53ca508456 [AIRFLOW-1028] Databricks Operator for Airflow
Add DatabricksSubmitRun Operator

In this PR, we contribute a DatabricksSubmitRun operator and a
Databricks hook. This operator enables easy integration of Airflow
with Databricks. In addition to the operator, we have created a
databricks_default connection, an example_dag using this
DatabricksSubmitRunOperator, and matching documentation.

Closes #2202 from andrewmchen/databricks-operator-squashed
2017-04-06 08:30:33 -07:00
Henk Griffioen f1bc5f38ac [AIRFLOW-1065] Add functionality for Azure Blob Storage over wasb://
This PR implements a hook to interface with Azure
storage over wasb://
via azure-storage; adds sensors to check for blobs
or prefixes; and
adds an operator to transfer a local file to the
Blob Storage.

Design is similar to that of the S3Hook in
airflow.operators.S3_hook.

Closes #2216 from hgrif/AIRFLOW-1065
2017-04-05 09:56:23 +02:00
Alex Guziel fe9ebe3ccf [AIRFLOW-1047] Sanitize strings passed to Markup
We add the Apache-licensed bleach library and use
it to sanitize html
passed to Markup (which is supposed to be already
escaped). This avoids
some XSS issues with unsanitized user input being
displayed.

Closes #2193 from saguziel/aguziel-xss
2017-03-28 16:40:32 -07:00
MSempere 8de8501626 [AIRFLOW-999] Add support for Redis database
This PR includes a redis_hook and a redis_key_sensor to enable
checking for key existence in redis. It also updates the
documentation and add the relevant unit tests.

- [x] Opened a PR on Github

- [x] My PR addresses the following Airflow JIRA
issues:
    -
https://issues.apache.org/jira/browse/AIRFLOW-999
- [x] The PR title references the JIRA issues. For
example, "[AIRFLOW-1] My Airflow PR"

- [x] My PR adds unit tests
- [ ] __OR__ my PR does not need testing for this
extremely good reason:

- [x] Here are some details about my PR:
- [ ] Here are screenshots of any UI changes, if
appropriate:

- [x] Each commit subject references a JIRA issue.
For example, "[AIRFLOW-1] Add new feature"
- [x] Multiple commits addressing the same JIRA
issue have been squashed
- [x] My commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
  1. Subject is separated from body by a blank line
  2. Subject is limited to 50 characters
  3. Subject does not end with a period
  4. Subject uses the imperative mood ("add", not
"adding")
  5. Body wraps at 72 characters
  6. Body explains "what" and "why", not "how"

Closes #2165 from msempere/AIRFLOW-999/support-for-redis-database
2017-03-20 11:11:31 -07:00
Sean Cronin f3924696ff [AIRFLOW-954] Fix configparser ImportError
Fixes support for Python 2.7 since
https://github.com/apache/incubator-airflow/pull/2091 was merged
2017-03-07 20:52:48 -05:00
Bolke de Bruin 4f52db317f [AIRFLOW-911] Add coloring and timing to tests
Closes #2106 from bolkedebruin/profile_tests
2017-02-25 22:10:14 +01:00
Bolke de Bruin 784b3638c5 [AIRFLOW-895] Address Apache release incompliancies
* Fixes missing licenses in NOTICE
* Corrects license header
* Removes HighCharts left overs.

Closes #2098 from bolkedebruin/AIRFLOW-895
2017-02-23 23:48:03 +01:00
Jeremiah Lowin 50902d0736 [AIRFLOW-887] Support future v0.16 2017-02-18 18:39:01 -05:00
Marek Baczynski 21d775a9a4 [AIRFLOW-871] change logging.warn() into warning()
This silences deprecation warnings, e.g.

airflow/airflow/utils/dag_processing.py:578:
DeprecationWarning: The
'warn' method is deprecated, use 'warning' instead

Closes #2082 from imbaczek/bug871
2017-02-18 11:12:14 -05:00
Jeremiah Lowin 6e22102782 [AIRFLOW-862] Add DaskExecutor
Adds a DaskExecutor for running Airflow tasks
in Dask clusters.

Closes #2067 from jlowin/dask-executor
2017-02-12 16:06:31 -05:00
Bolke de Bruin a2b0ea3226 Merge pull request #2010 from gsakkis/fixes 2017-01-22 17:36:04 +01:00
Amin Ghadersohi 2acb10a814 [AIRFLOW-776] Add missing cgroups devel dependency
Closes #2009 from aminghadersohi/master
2017-01-22 16:57:02 +01:00
George Sakkis cce6ffcf07 [AIRFLOW-784] Pin funcsigs to 1.0.0 2017-01-21 15:07:58 +02:00
George Sakkis bccb9e2cb9 [AIRFLOW-624] Fix setup.py to not import airflow.version as version 2017-01-21 15:07:37 +02:00
Dan Davydov b56cb5cc97 [AIRFLOW-219][AIRFLOW-398] Cgroups + impersonation
Submitting on behalf of plypaul

Please accept this PR that addresses the following
issues:
-
https://issues.apache.org/jira/browse/AIRFLOW-219
-
https://issues.apache.org/jira/browse/AIRFLOW-398

Testing Done:
- Running on Airbnb prod (though on a different
mergebase) for many months

Credits:
Impersonation Work: georgeke did most of the work
but plypaul did quite a bit of work too.
Cgroups: plypaul did most of the work, I just did
some touch up/bug fixes (see commit history,
cgroups + impersonation commit is actually
plypaul's, not mine)

Closes #1934 from aoen/ddavydov/cgroups_and_impersonation_after_rebase
2017-01-18 18:11:06 -08:00
Jay 44798e0d4d [AIRFLOW-683] Add jira hook, operator and sensor
Closes #1950 from jhsenjaliya/AIRFLOW-683
2017-01-16 17:46:21 +01:00
Bolke de Bruin 19ed9001b9 [AIRFLOW-740] Pin jinja2 to < 2.9.0
Jinja2 2.9.1 seems to have a conflict with flask-admin.
2017-01-07 19:53:01 +01:00
Bolke de Bruin 6fb94630c1 Merge branch 'api_v3' 2016-11-27 20:13:26 +01:00
Bolke de Bruin d5ac6bd9d0 [AIRFLOW-489] Add API Framework
This implements a framework for API calls to Airflow. Currently
all access is done by cli or web ui. Especially in the context
of the cli this raises security concerns which can be alleviated
with a secured API call over the wire.

Secondly integration with other systems is a bit harder if you have
to call a cli. For public facing endpoints JSON is used.

As an example the trigger_dag functionality is now made into a
API call.

Backwards compat is retained by switching to a LocalClient.
2016-11-27 19:44:31 +01:00
Siddharth Anand 41490f9c4b [AIRFLOW-651] Hotfix setup.py
Closes #1902 from r39132/hotfix
2016-11-23 10:18:46 -08:00
Li Xuanji dedc54eeaf [AIRFLOW-640] Install and enable nose-ignore-docstring
Closes #1896 from zodiac/nose-ignore-docstring
2016-11-20 17:38:24 -08:00
Li Xuanji ca6dbc6485 [AIRFLOW-639] Alphasort package names
Closes #1895 from zodiac/alphasort_requirements
2016-11-20 17:06:47 -08:00
Giovanni Briggs b93e6519cc [AIRFLOW-628] Adding SalesforceHook to contrib/hooks
Also added a salesforce option to setup.py

Closes #1881 from Jalepeno112/feature/salesforce-hook
2016-11-18 11:10:32 -08:00
jedipi 12e48b4c62 [AIRFLOW-629] stop pinning lxml
Closes #1882 from jedipi/improvement/stop-pinning-lxml
2016-11-14 23:26:25 -08:00
gtoonstra d8383038ac [AIRFLOW-591] Add datadog hook & sensor
Closes #1851 from gtoonstra/contrib_datadog
2016-11-14 07:21:08 +00:00
jedipi 39499e8aa8 [AIRFLOW-551] pin flask to >=0.11, <0.12
Closes #1825 from jedipi/improvement/upgrade-flask
2016-11-02 00:45:45 -07:00
jedipi 61f92b7f2b [AIRFLOW-552] upgrade funcsigs to 1.0.2
Closes #1826 from jedipi/improvements/upgrade-funcsigs
2016-11-02 00:40:11 -07:00
George Leslie-Waksman edf033be65 [AIRFLOW-198] Implement latest_only_operator
Dear Airflow Maintainers,

Please accept this PR that addresses the following
issues:
-
https://issues.apache.org/jira/browse/AIRFLOW-198

Testing Done:
- Local testing of dag operation with
LatestOnlyOperator
- Unit test added

Closes #1752 from gwax/latest_only
2016-09-27 17:07:14 -07:00
Alex Van Boxel 247955d422 [AIRFLOW-468] Update Panda requirement to 0.17.1
BigQuery Hook requires at least Panda 0.17.1

Closes #1767 from alexvanboxel/feature/panda-upgrade
2016-09-04 15:16:40 +02:00
Maxime Beauchemin 216e5c3c6b Bump version number to v1.7.2 2016-08-31 13:02:09 -07:00
Norman Mu d200f60084 [AIRFLOW-412] Fix lxml dependency
Dear Airflow Maintainers,

Please accept this PR that addresses the following issues:
- https://issues.apache.org/jira/browse/AIRFLOW-412

Testing Done:
-None

Closes #1722 from normster/lxml
2016-08-10 17:13:20 -07:00
Paul Yang fdb7e94914 [AIRFLOW-160] Parse DAG files through child processes
Instead of parsing the DAG definition files in the same process as the
scheduler, this change parses the files in a child process. This helps
to isolate the scheduler from bad user code.

Closes #1636 from plypaul/plypaul_schedule_by_file_rebase_master
2016-07-31 12:49:39 -07:00
Kevin Deldycke 00a591f7bf [AIRFLOW-340] Remove unused dependency on Babel
Closes #1668 from kdeldycke/remove-unused-babel-deps
2016-07-21 10:40:55 -07:00
Rob Froetscher 9f49f12853 [AIRFLOW-247] Add EMR hook, operators and sensors. Add AWS base hook
Closes #1630 from rfroetscher/emr
2016-06-30 15:50:27 -07:00
Maxime Beauchemin 4a84a578a5 Add an Apache Incubator Disclaimer and mocking modules
Closes #1634 from mistercrunch/mock_docs

Adding an Apache Incubator Disclaimer and mocking modules
2016-06-29 13:39:15 -07:00
jlowin 45b735baea [AIRFLOW-31] Add zope dependency
Closes #1608 from jlowin/standard-imports-2.
Also closes AIRFLOW-257.
2016-06-20 12:40:27 -04:00
Bolke de Bruin 0a460081bc [AIRFLOW-6] Remove dependency on Highcharts
Highcharts' license is not compatible with the Apache 2.0
license. This patch removes Highcharts in favor of d3,
however some charts are not supported anymore.

* This brings Maxime Beauchemin's work to master
2016-06-20 14:53:30 +02:00
jlowin 851adc5547 [AIRFLOW-31] Use standard imports for hooks/operators 2016-06-16 14:55:07 -04:00
Maxime Beauchemin 54b361d2a1 [AIRFLOW-238] Make compatible with flask-admin 1.4.1
The new flask-admin==1.4.1 release on 2016-06-13 breaks the Airflow
release currently in Pypi (1.7.1.2). This fixes the edge case triggered
by this new release.

* Closes #1588 on github
2016-06-14 12:22:32 +02:00
Dan Davydov 88f895aa63 Bump version to unblock pypi release 2016-05-20 17:49:28 -07:00
Maxime Beauchemin 893da2add9 1.7.1.1 2016-05-20 17:16:22 -07:00
Maxime Beauchemin 317ad5dda0 Pointing setup.py to the new repo 2016-05-20 17:14:21 -07:00
jlowin 98f10d5444 Merge pull request #1515 from jlowin/PR-MERGE 2016-05-20 14:40:14 -04:00
jlowin 7e56bd4f94 [AIRFLOW-134] Add PR merge script 2016-05-20 14:39:49 -04:00
Dan Davydov af11640592 [AIRFLOW-150] setup.py classifiers dict should be list 2016-05-19 17:46:19 -07:00
Siddharth Anand aedb667d50 Make enhancements to VersionView 2016-05-19 19:25:35 +00:00
Dan Davydov 5e40d9858d Merge branch '1522' 2016-05-19 11:12:13 -07:00
Dan Davydov 0b3d101ff5 [AIRFLOW-52] 1.7.1 version bump and changelog 2016-05-19 10:56:59 -07:00
Siddharth Anand 7d32c174f3 Add a version view to display airflow version info 2016-05-19 06:24:34 +00:00
Alex Van Boxel b7f0245e36 AIRFLOW-21 upgrade GCP client lib 2016-05-05 08:56:50 +02:00
Chris Riccomini 844eb2c8d0 AIRFLOW-15: Remove gcloud 2016-04-28 13:45:09 -07:00
Matt Pelland 11c34c4353 Implement a Cloudant hook 2016-04-19 16:11:54 -04:00
Chris Riccomini afb826aec2 Add PyOpenSSL to Google cloud gcp_api. 2016-04-12 12:22:49 -07:00
bolkedebruin 4865ee66ba Merge pull request #855 from bolkedebruin/ISSUE-852
Use proper signal handling and cascade signals to children (Fix #852)
2016-04-06 20:41:24 +02:00
Bolke de Bruin e8c1144bb8 Add consistent and thorough signal handling and logging
Airflow spawns child processes in the form of a webserver, scheduler, and executors.
If the parent gets terminated (SIGTERM) it needs to properly propagate the
signals to the children, otherwise these will get orphaned and end up as
zombie processes. This patch resolves that issue.

In addition Airflow does not store the PID of its services so they can be
managed by traditional unix systems services like rc.d / upstart / systemd
and the likes. This patch adds the "--pid" flag. By default it stores the
PID in ~/airflow/airflow-<service>.pid

Lastly, the patch adds support for different log file locations: log,
stdout, and stderr (respectively: --log-file, --stdout, --stderr). By
default these are stored in ~/airflow/airflow-<service>.log/out/err.

* Resolves ISSUE-852
2016-04-06 20:40:43 +02:00
Paul Rhodes 81ff5cccb7 Allow Operators to specify SKIPPED status internally
* Added ability to skip DAG elements based on raised Exception

* Added nose-parameterized to test dependencies

* Fix for broken mysql test - provided by jlowin
2016-04-06 13:23:49 -04:00
Jeremiah Lowin 6581858703 Missing comma in setup.py 2016-04-05 08:21:43 -04:00
Jeremiah Lowin 10ee622330 Deprecate *args and **kwargs in BaseOperator
BaseOperator silently accepts any arguments. This deprecates the
behavior with a warning that says it will be forbidden in Airflow 2.0.

This PR also turns on DeprecationWarnings by default, which in turn
revealed that inspect.getargspec is deprecated. Here it is replaced by
`inspect.signature` (Python 3) or `funcsigs.signature` (Python 2).

Lastly, this brought to attention that example_http_operator was
passing an illegal argument.
2016-04-05 10:04:55 +02:00
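The replacement of `inspect.getargspec` described in the entry above can be sketched as follows. This is a minimal illustration, not the code from the commit; `example_operator_init` is a hypothetical stand-in for an operator constructor, and the `funcsigs` fallback is only reached on Python 2.

```python
try:
    from inspect import signature  # Python 3
except ImportError:
    from funcsigs import signature  # Python 2 backport named in the message


def example_operator_init(task_id, retries=0, *args, **kwargs):
    """Hypothetical stand-in for an operator constructor."""


params = signature(example_operator_init).parameters

# Parameter names in declaration order.
names = list(params)

# True when the callable would silently swallow unknown keyword arguments,
# which is the behaviour the commit message says is being deprecated.
accepts_var_kwargs = any(p.kind == p.VAR_KEYWORD for p in params.values())

print(names, accepts_var_kwargs)
```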
Bolke de Bruin da43737e73 Add pypi meta data and sync version number 2016-03-31 21:42:30 +02:00
bolkedebruin f347ee709f Merge pull request #1128 from bolkedebruin/hivemeta_sasl
Add GSSAPI SASL to HiveMetaStoreHook.
2016-03-30 09:17:50 +02:00
jlowin 1981f6cb6e Fix required gcloud version 2016-03-28 18:33:39 -04:00
Bolke de Bruin 657aebbd0e Merge branch 'master' into hivemeta_sasl 2016-03-28 12:07:47 +02:00
Arthur Wiedmer a7c9c4a11a Changes to Contributing to reflect more closely the current state of development. 2016-03-21 15:12:08 -07:00
Bolke de Bruin 1b75315cd5 Merge remote-tracking branch 'upstream/master' into minicluster 2016-03-20 10:20:43 +01:00
Bolke de Bruin 5603afa5f7 Use unicodecsv to make it py3 compatible 2016-03-19 13:21:44 +01:00
Jiasi Zeng 4af24ee2df Allow users to set hdfs_namenode_principal in HDFSHook config
snakebite library just added the support to specify hdfs_namenode_principal
for Kerberos auth method, and this PR allows users to pass in this config from HDFSHook

Also bump the version of snakebite
2016-03-18 11:25:02 -07:00
Bolke de Bruin ee9d372d5a Merge branch 'impyla' into minicluster 2016-03-18 14:07:49 +01:00
Chris Riccomini 4c73677bf1 Merge pull request #1137 from jlowin/rls3
Remote Log Storage (take 2)
2016-03-17 16:08:53 -07:00
jlowin e2336faa79 version cap for gcp_api 2016-03-17 17:49:02 -04:00
Maxime Beauchemin cbf139cfb6 Cranking up slackclient dep to 1.0 2016-03-08 17:26:19 -05:00
Bolke de Bruin 166c78d9c9 Add GSSAPI SASL to HiveMetaStoreHook.
Will probably only work with python 2.7 until thrift 1.0 is released
2016-03-05 22:18:31 +01:00
Bolke de Bruin e45372fb55 ISSUE-1123 Use impyla instead of pyhs2 2016-03-05 20:15:41 +01:00
jlowin 7e9fb21bb8 Merge branch 'gcp_api' into gcp 2016-03-04 18:46:44 -05:00
jlowin a4b9e5849f rename gcp -> gcloud 2016-03-04 17:47:11 -05:00
jlowin bdd8ade782 add google-api-python-client to extras 2016-03-04 16:37:53 -05:00
jlowin 596270fcde add gcp extras 2016-03-04 16:26:04 -05:00
jlowin 4d7c75e1c5 move oauth2client<2 and httplib2 out of requirements 2016-03-04 16:07:04 -05:00
Chris Riccomini 668b879c2f Add direct dependencies for Google cloud contribs
Switch freeze version to use setup.py like other requirements do.
2016-02-17 07:56:40 -08:00
Maxime Beauchemin c4e4b0606a Adding mock lib to devel extras_require 2016-02-07 13:07:11 -08:00
Maxime Beauchemin eac1edd8ae Merge pull request #882 from asnir/docker_operator
Docker operator
2016-01-29 09:27:08 -08:00
Sumit Maheshwari 5779c18632 typos and xcom changes 2016-01-18 16:14:49 +05:30