Commit graph

4853 commits

Author SHA1 Message Date
Bolke de Bruin 305a787e33 Bump version 2018-04-23 19:26:40 +02:00
Andrew Chen 2b030699de [AIRFLOW-1652] Push DatabricksRunSubmitOperator metadata into XCOM
Push run_id and run_page_url into XCom so callbacks and other tasks can reference this information.

address comments

Closes #2641 from andrewmchen/databricks-xcom
2018-04-23 19:14:27 +02:00
Bolke de Bruin 1e82e11aed Merge pull request #3257 from artwr/awiedmer-fix-issue-with-jdbc-autocommit 2018-04-23 19:08:24 +02:00
Kengo Seki 65b6ceae74 [AIRFLOW-2234] Enable insert_rows for PrestoHook
PrestoHook.insert_rows() currently raises NotImplementedError.
However, Presto 0.126+ allows specifying column names in INSERT queries, so we can leverage DbApiHook.insert_rows() almost as is.
This PR enables this function.

Closes #3146 from sekikn/AIRFLOW-2234
2018-04-23 19:01:38 +02:00
Ian Suvak ed93290175 [AIRFLOW-2208][Airflow-22208] Link to same DagRun graph from TaskInstance view
Allow graph view to accept blank execution_date and pass it through when it's available.

Closes #3132 from iansuvak/persistent_graph
2018-04-23 18:59:58 +02:00
Alan Ma 09bbe24772 [AIRFLOW-1153] Allow HiveOperators to take hiveconfs
HiveOperator can only replace variables via Jinja, and the replacements are global to the DAG through the context and user_defined_macros.
It would be much more flexible to open up hive_conf at the HiveOperator level so Hive scripts can be recycled at the task level, leveraging HiveHook's existing hive_conf param and _prepare_hiveconf function.

Closes #3136 from wolfier/AIRFLOW-1153
2018-04-23 18:56:29 +02:00
Arthur Wiedmer 97954e2122 [AIRFLOW-775] Fix autocommit settings with Jdbc hook 2018-04-23 09:52:08 -07:00
Arthur Wiedmer a33b29c851 [AIRFLOW-2364] Warn when setting autocommit on a connection which does not support it 2018-04-23 09:52:08 -07:00
Fokko Driesprong e30a1f451a [AIRFLOW-2357] Add persistent volume for the logs
The logs are kept inside of the worker pod. By attaching a persistent disk we keep the logs and make them available for the webserver.

- Remove the requirements.txt since we don't want to maintain another dependency file
- Fix some small casing stuff
- Removed some unused code
- Add missing shebang lines
- Started on some docs
- Fixed the logging

Closes #3252 from Fokko/airflow-2357-pd-for-logs
2018-04-23 18:43:24 +02:00
Arthur Wiedmer 3450f526ce [AIRFLOW-766] Skip conn.commit() when in Auto-commit 2018-04-23 09:42:10 -07:00
Bolke de Bruin 1d3bb54707 [AIRFLOW-2351] Check for valid default_args start_date
A bug existed when default_args contained start_date but it was set to None, which caused DAG instantiation to fail.

Closes #3256 from bolkedebruin/AIRFLOW-2351
2018-04-23 16:37:55 +02:00
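The start_date check described in the commit above can be illustrated with a minimal sketch. It assumes the failure came from reading an attribute of a `start_date` that was present but `None`. The function name `pick_timezone` and the `fallback` parameter are hypothetical and are not taken from the Airflow code base.

```python
from datetime import datetime, timedelta, timezone


def pick_timezone(default_args, fallback=timezone.utc):
    """Return the tzinfo of default_args['start_date'], or `fallback`.

    Hypothetical helper for illustration only; not Airflow's implementation.
    """
    start_date = (default_args or {}).get("start_date")
    # The key may be present but explicitly set to None, so test the value
    # itself instead of only testing whether the key exists.
    if start_date is not None and start_date.tzinfo is not None:
        return start_date.tzinfo
    return fallback


cet = timezone(timedelta(hours=1))
aware = pick_timezone({"start_date": datetime(2018, 4, 23, tzinfo=cet)})
missing = pick_timezone({"start_date": None})
```

Testing only for the presence of the key would let the `None` value through to the attribute access, which is the kind of failure the commit message describes.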
William Pursell 6da88bb420 [AIRFLOW-1433] Set default rbac to initdb
05e1861e24 breaks the API for any program directly importing airflow.utils. This sets a reasonable default.

Closes #3240 from wrp/initdb
2018-04-23 14:35:43 +02:00
Winston Huang a704b541fe [AIRFLOW-2270] Handle removed tasks in backfill
Fix an issue with backfill jobs of DAGs, where tasks in the removed state are not run but are still considered pending, causing an infinite loop.

Closes #3176 from ji-han/AIRFLOW-2270_dag_backfill_removed_tasks
2018-04-23 12:14:52 +02:00
Kengo Seki 0d199e5f37 [AIRFLOW-2344] Fix `connections -l` to work with pipe/redirect
`airflow connections -l` uses the 'tabulate' package with the fancy_grid format, which outputs box-drawing characters.
This can raise UnicodeEncodeError when the output is piped or redirected, since the default encoding for Python 2.x is ASCII.
This PR fixes it and contains some flake8-related fixes.

Closes #3244 from sekikn/AIRFLOW-2344
2018-04-23 11:12:52 +02:00
Kengo Seki 49826af108 [AIRFLOW-2300] Add S3 Select functionality to S3ToHiveTransfer
To improve efficiency and usability, this PR adds S3 Select functionality to S3ToHiveTransfer.
It also contains some minor fixes for documents and comments.

Closes #3243 from sekikn/AIRFLOW-2300
2018-04-23 08:57:23 +02:00
Daniel Imberman a15b7c5b79 [AIRFLOW-1314] Cleanup the config
Closes #2414 from bloomberg:airflow-kubernetes-executor
2018-04-22 10:24:18 +02:00
Fokko Driesprong d807830fe9 [AIRFLOW-1314] Polish some of the Kubernetes docs/config 2018-04-22 10:23:06 +02:00
Jordan Zucker 317b6c7bd5 [AIRFLOW-1314] Improve error handling
Handle too old resource versions and throw exceptions on errors

- K8s API errors will now throw Airflow exceptions
- Add scheduler uuid to worker pod labels to match the two
2018-04-22 10:23:06 +02:00
fenglu-g cdb43cb87c [AIRFLOW-1999] Add per-task GCP service account support 2018-04-22 10:23:06 +02:00
Daniel Imberman b9a87a07e3 [AIRFLOW-1314] Rebasing against master 2018-04-22 10:23:06 +02:00
Benjamin Goldberg 309f764aa3 [AIRFLOW-1314] Small cleanup to address PR comments (#24)
* Small cleanup to address PR comments

* Remove use of enum

* Change back to 3.4
2018-04-22 10:23:06 +02:00
Grant Nicholas c0920efc01 [AIRFLOW-1314] Add executor_config and tests
* Added in executor_config to the task_instance table and the base_operator table

* Fix test; bump up number of examples

* Fix up comments from PR

* Exclude the kubernetes example dag from a test

* Fix dict -> KubernetesExecutorConfig

* fixed up executor_config comment and type hint
2018-04-22 10:23:06 +02:00
fenglu-g ad4e67ce1b [AIRFLOW-1314] Improve k8s support
Add kubernetes config section in airflow.cfg and Inject GCP secrets upon executor start. (#17)
Update Airflow to Pass configuration to k8s containers, add some Py3 … (#9)

* Update Airflow to Pass configuration to k8s containers, add some Py3 compat., create git-sync pod

* Undo changes to display-source config setter for to_dict

* WIP Secrets and Configmaps

* Improve secrets support for multiple secrets. Add support for registry secrets. Add support for RBAC service accounts.

* Swap order of variables, overlooked very basic issue

* Secret env var names must be upper

* Update logging

* Revert spothero test code in setup.py

* WIP Fix tests

* Worker should be using local executor

* Consolidate worker setup and address code review comments

* reconfigure airflow script to use new secrets method
2018-04-22 10:23:06 +02:00
grantnicholas a9d90dc9a5 [AIRFLOW-1314] Use VolumeClaim for transporting DAGs
- fix issue where watcher process randomly dies
- fixed alembic head, was pointing to two tips
2018-04-22 10:22:44 +02:00
dimberman 29daa58ec0 [AIRFLOW-1314] Create integration testing environment 2018-04-22 10:17:39 +02:00
grantnicholas bb1e05c3fa [AIRFLOW-1314] Git Mode to pull in DAGs for Kubernetes Executor 2018-04-22 10:17:39 +02:00
nyeganeh c177d6e863 [AIRFLOW-1314] Add support for volume mounts & Secrets in Kubernetes Executor
2018-04-22 10:17:39 +02:00
dimberman 5821320880 [AIRFLOW-1314] Basic Kubernetes Mode 2018-04-22 10:17:39 +02:00
Berislav Lopac f520990fe0 [AIRFLOW-2326][AIRFLOW-2222] remove contrib.gcs_copy_operator
Closes #3232 from berislavlopac/AIRFLOW-2326
2018-04-21 08:46:11 +02:00
Guillermo Rodríguez Cano b5f758bb61 [AIRFLOW-2328] Fix empty GCS blob in S3ToGoogleCloudStorageOperator
Closes #3231 from wileeam/fix-empty-blob-in-s3-to-gcs-operator
2018-04-21 08:36:38 +02:00
DerekRoy 8e83e2b3ef [AIRFLOW-2350] Fix grammar in UPDATING.md
Closes #3248 from r39132/patch-1
2018-04-21 08:34:16 +02:00
r39132 2c1052d100 closes apache/incubator-airflow#3187 *Closed for inactivity* 2018-04-20 09:35:32 -07:00
Kengo Seki 4c02ad76a6 [AIRFLOW-2302] Fix documentation
Closes #3226 from sekikn/AIRFLOW-2302
2018-04-20 10:05:35 +02:00
Sam Garrett efc316d2ad [AIRFLOW-2345] pip is not used in this setup.py
Closes #3241 from sinemetu1/patch-1
2018-04-20 10:03:08 +02:00
r39132 e6145784e6 closes apache/incubator-airflow#3225 *Closed for inactivity* 2018-04-19 18:37:34 -07:00
Sid Anand f1e65c4897 [AIRFLOW-2347] Add Banco de Formaturas to Readme
Make sure you have checked _all_ steps below.

### JIRA
- [x] My PR addresses the following [Airflow JIRA](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
    - https://issues.apache.org/jira/browse/AIRFLOW-2347
    - In case you are fixing a typo in the documentation you can prepend your commit with \[AIRFLOW-XXX\], code changes always need a JIRA issue.

### Description
- [x] Here are some details about my PR, including screenshots of any UI changes:
Add a company to the README

### Tests
- [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: N/A -- documentation update only

### Commits
- [x] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

### Documentation
- [x] In case of new functionality, my PR adds documentation that describes how to use it.
    - When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added.

### Code Quality
- [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`

Closes #3242 from r39132/Add_banco_Formaturas_to_readme
2018-04-19 18:33:33 -07:00
Sven Varkel c208a56682 [AIRFLOW-2346] Add Investorise as official user of Airflow
Closes #3238 from svenvarkel/master
2018-04-19 18:20:37 -07:00
Berislav Lopac 17d3d1d9dc [AIRFLOW-2330] Do not append destination prefix if not given
Closes #3233 from berislavlopac/AIRFLOW-2330
2018-04-19 10:26:23 +02:00
Marius van Niekerk e95a1251b7 [AIRFLOW-2240][DASK] Added TLS/SSL support for the dask-distributed scheduler.
As of 0.17.0 dask distributed has support for TLS/SSL.

Add a test for tls under dask distributed

Closes #2683 from mariusvniekerk/dask-ssl
2018-04-18 09:45:52 -07:00
John Arnold (AZURE) 3fa55db90c [AIRFLOW-2309] Fix duration calculation on TaskFail
Closes #3208 from johnarnold/duration
2018-04-18 10:28:13 +02:00
Daniel Imberman 0f8507ae35 [AIRFLOW-2335] fix issue with jdk8 download for ci
Make sure you have checked _all_ steps below.

- [x] My PR addresses the following [Airflow JIRA](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
    - https://issues.apache.org/jira/browse/AIRFLOW-2335
    - In case you are fixing a typo in the documentation you can prepend your commit with \[AIRFLOW-XXX\], code changes always need a JIRA issue.

- [x] Here are some details about my PR, including screenshots of any UI changes:

There is an issue with travis pulling jdk8 that is preventing CI jobs from running. This blocks further development of the project.

Reference: https://github.com/travis-ci/travis-ci/issues/9512#issuecomment-382235301

- [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:

This PR can't be unit tested since it is just configuration. However, the fact that unit tests run successfully should show that it works.

- [ ] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

- [ ] In case of new functionality, my PR adds documentation that describes how to use it.
    - When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added.

- [ ] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`

Closes #3236 from dimberman/AIRFLOW-2335_travis_issue
2018-04-17 21:57:42 -07:00
Tao feng 3f1bfd38cd [AIRFLOW-2184] Add druid_checker_operator
Closes #3228 from feng-tao/airflow-2184
2018-04-17 11:12:41 +02:00
Kengo Seki 6e82f1d7c9 [AIRFLOW-2299] Add S3 Select functionality to S3FileTransformOperator
Currently, S3FileTransformOperator downloads the whole file from S3 before transforming and uploading it. Adding an extraction feature using S3 Select to this operator improves its efficiency and usability.

Closes #3227 from sekikn/AIRFLOW-2299
2018-04-17 10:53:05 +02:00
Sathyaprakash Govindasamy a148043107 [AIRFLOW-2254] Put header as first row in unload
Currently, data is ordered by the first column in descending order.
The header row comes first only if the first column is an integer.
This fix puts the header as the first row regardless of the first column's data type.

Closes #3180 from sathyaprakashg/AIRFLOW-2254
2018-04-16 10:21:22 +02:00
Carl Johan Gustavsson 32c5f445e4 [AIRFLOW-610] Respect _cmd option in config before defaults
The command versions of config parameters were overridden by the default config. E.g. sql_alchemy_conn got the default value even when sql_alchemy_conn_cmd was specified.

Closes #3029 from cjgu/airflow-610
2018-04-16 10:12:08 +02:00
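The lookup order described in the commit above can be sketched roughly as follows. This is an illustration under stated assumptions, not Airflow's configuration code: the function name `resolve_option`, the plain dictionaries standing in for the config, and the exact precedence (explicit value, then `_cmd`, then default) are all hypothetical. The example command relies on an `echo` executable, so it assumes a POSIX-like system.

```python
import shlex
import subprocess


def resolve_option(key, user_config, defaults):
    """Return the value for `key`, consulting defaults only as a last resort.

    Hypothetical lookup order, for illustration only:
      1. an explicit value in the user config
      2. the output of the command stored under `<key>_cmd`
      3. the built-in default
    """
    if key in user_config:
        return user_config[key]
    cmd = user_config.get(key + "_cmd")
    if cmd is not None:
        # Run the command and use its trimmed stdout as the option value.
        output = subprocess.check_output(shlex.split(cmd))
        return output.decode("utf-8").strip()
    return defaults.get(key)


defaults = {"sql_alchemy_conn": "sqlite:///default.db"}
user_config = {"sql_alchemy_conn_cmd": "echo postgresql://example/airflow"}
value = resolve_option("sql_alchemy_conn", user_config, defaults)
```

The point of the ordering is that the default is reached only when neither the plain option nor its `_cmd` variant was supplied by the user.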
Bolke de Bruin c7a472ed6b [AIRFLOW-2287] Fix incorrect ASF headers
Closes #3219 from bolkedebruin/fix_header
2018-04-14 09:13:23 +02:00
Rui Lopes b3c0ae0030 [AIRFLOW-XXX] Add Zego as an Apache Airflow user
Closes #3224 from ruimffl/add-zego-as-user
2018-04-13 15:55:35 +02:00
Sam Sen a27ea11d64 [AIRFLOW-952] fix save empty extra field in UI
Closes #3222 from luckytaxi/AIRFLOW-952-fix-empty-extra-field
2018-04-13 11:15:49 +02:00
Kevin Yang ec38ba9594 [AIRFLOW-1325] Add ElasticSearch log handler and reader
Closes #3214 from yrqls21/kevin_yang_add_es_task_handler
2018-04-13 11:09:50 +02:00
Guillermo Rodriguez Cano 34f827f04c [AIRFLOW-2301] Sync files of an S3 key with a GCS path
Closes #3216 from wileeam/s3-to-gcs-operator
2018-04-13 09:32:22 +02:00