Commit Graph

4496 Commits

Author SHA1 Message Date
niels c458a22cfd [AIRFLOW-1760] Password auth for experimental API
Modified the Password authentication to support
HTTP Basic auth

Closes #2730 from NielsZeilemaker/AIRFLOW-1760
2018-02-06 11:22:21 +01:00
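For context on AIRFLOW-1760: HTTP Basic auth carries the credentials in an `Authorization` header as the base64 encoding of `username:password`. The sketch below only illustrates that encoding with the standard library. The helper names `basic_auth_header` and `parse_basic_auth` are hypothetical and are not taken from the Airflow code.

```python
import base64


def basic_auth_header(username, password):
    """Build the Authorization header for HTTP Basic auth (hypothetical helper)."""
    raw = "{}:{}".format(username, password).encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


def parse_basic_auth(header_value):
    """Return (username, password) from a 'Basic ...' header value, or None."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic" or not token:
        return None
    try:
        decoded = base64.b64decode(token).decode("utf-8")
    except ValueError:  # covers malformed base64 and invalid UTF-8
        return None
    username, sep, password = decoded.partition(":")
    return (username, password) if sep else None
```

A server-side check would then compare the parsed pair against its stored credentials; how Airflow does that is not shown here.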
Fokko Driesprong e76cda0ff5 [AIRFLOW-2038] Add missing kubernetes dependency for dev
When doing initdb, it fails on the kubernetes dependency from the examples

Closes #2978 from Fokko/fd-fix-dependencies
2018-02-05 20:52:28 +01:00
Fabrice 0e748ad86a [AIRFLOW-2040] Escape special chars in task instance logs URL
Closes #2982 from cinhil/AIRFLOW-2040
2018-02-05 16:31:02 +01:00
Stephen Baynham 82a65eeca9 [AIRFLOW-1968][AIRFLOW-1520] Add role_arn and aws_account_id/aws_iam_role support back to aws hook
In PR2532 (AIRFLOW-1520), the AWS credential code was refactored into a general AWS hook. When that change was made, the existing assume-role code was removed, leaving only ID/Secret credentials as an option. Our dags rely on role assumption to access external S3 buckets, so this code re-adds role assumption via STS.

Additionally, in order to make this a bit easier, I changed _get_credentials to return a functioning boto3 session which is used by the public methods to initialize clients/resources/whatever. This seemed a better route than adding another return value to an already long list.

Closes #2918 from CannibalVox/aws_hook_support_sts
2018-02-05 14:15:23 +01:00
Matthew Bowden 1bf5411808 [AIRFLOW-2048] Fix task instance failure string formatting
Closes #2990 from AetherUnbound/bugfix/job-error-collection
2018-02-05 13:39:41 +01:00
Toby Dacre 670658f30d [AIRFLOW-2046] Fix kerberos error to work with python 3.x
Because Popen returns bytes rather than str in Python 3, the output must be joined using a bytes separator. This simple fix ensures that we join with bytes. Python 2.7 is unaffected by this.

Closes #2988 from tobes/kerberos-python3
2018-02-05 13:35:36 +01:00
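The bytes-versus-str issue described in AIRFLOW-2046 can be reproduced without Kerberos. The following is a minimal sketch, not the actual Airflow fix: `join_output` and `run_and_collect_stderr` are hypothetical helper names chosen for illustration.

```python
import subprocess
import sys


def join_output(lines):
    """Join raw bytes lines from a subprocess pipe and decode once at the end."""
    # The separator must be bytes: "\n".join(lines) raises TypeError on Python 3
    # because the pipe yields bytes objects, not str.
    return b"\n".join(line.rstrip() for line in lines).decode("utf-8", "replace")


def run_and_collect_stderr(cmd):
    """Run cmd and return its stderr as text (hypothetical helper)."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    _, err = proc.communicate()  # err is bytes on Python 3
    return join_output(err.splitlines())


if __name__ == "__main__":
    child = [sys.executable, "-c", "import sys; sys.stderr.write('oops')"]
    print(run_and_collect_stderr(child))
```

On Python 2.7 `bytes` is an alias for `str`, which is why the same join works there unchanged.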
Kaxil Naik f4e3e352e1 [AIRFLOW-2063] Add missing docs for GCP
- Add missing operator in `code.rst` and
`integration.rst`
- Fix documentation in DataProc operator
- Minor doc fix in GCS operators

- Fixed codeblocks & links in docstrings for
BigQuery, DataProc, DataFlow, MLEngine, GCS hooks
& operators

Closes #3003 from kaxil/doc_update
2018-02-05 10:48:04 +01:00
cbockman 49ac26dad3 [AIRFLOW-XXX] Fix typo in docs
Closes #3004 from cbockman/patch-1
2018-02-05 10:45:42 +01:00
kodieg 6f60304529 [AIRFLOW-1793] Use docker_url instead of invalid base_url
Closes #2998 from kodieg/patch-3
2018-02-04 14:39:14 +01:00
Matthew Bowden 15b8b7a851 [AIRFLOW-2055] Elaborate on slightly ambiguous documentation
Closes #2999 from AetherUnbound/bugfix/var-doc-reference
2018-02-03 20:04:01 +01:00
Yu ISHIKAWA 3f50f6bd5b [AIRFLOW-2039] BigQueryOperator supports priority property
Closes #2980 from yu-iskw/modify-bqoperator
2018-02-02 09:33:54 +01:00
Kaxil Naik fd4360b9f0 [AIRFLOW-2053] Fix quote character bug in BQ hook
Modified the condition to check whether quote_character is set. This allows `quote_character` to be set to an empty string when the data doesn't contain quoted sections.

Closes #2996 from kaxil/bq_hook_quote_fix
2018-02-02 09:31:35 +01:00
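The distinction behind AIRFLOW-2053 is between a truthiness test and an explicit `None` test: an empty string is falsy, so `if quote_character:` silently drops it. A minimal sketch, assuming a hypothetical `build_load_config` helper and a `"quote"` key that stand in for the real hook code:

```python
def build_load_config(quote_character=None):
    """Return a load configuration dict (hypothetical stand-in for the hook logic)."""
    config = {}
    # `if quote_character:` would skip "", because the empty string is falsy.
    # Comparing against None keeps "" as a deliberate, valid setting.
    if quote_character is not None:
        config["quote"] = quote_character
    return config
```

With this check, `build_load_config("")` keeps the empty quote character, while `build_load_config()` leaves the key out entirely.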
Matthew Housley ba0b1978d3 [AIRFLOW-2057] Add Overstock to list of companies
Closes #3001 from mhousley/add-overstock-to-list
2018-02-01 17:17:24 -08:00
Austin Gibbons 6d88744bee [AIRFLOW-XXX] Add Plaid to Airflow users
Closes #2995 from AustinBGibbons/master
2018-02-01 10:04:14 +01:00
Debdutto Chakraborty 6ee4bbd4b1 [AIRFLOW-2044] Add SparkSubmitOperator to documentation
Added community contributed SparkSubmitOperator to
API documentation

Closes #2987 from Debdutto/docs/added-spark-submit-operator
2018-01-31 12:58:34 +01:00
Kaxil Naik 80d2ee8acc [AIRFLOW-2037] Add methods to get Hash values of a GCS object
- Added `get_md5hash` and `get_crc32c` in
`gcs_hook` to aid in Data integrity validations.

Closes #2977 from kaxil/hashing_gcs_hook
2018-01-31 12:52:13 +01:00
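As an illustration of the data integrity validation that AIRFLOW-2037 mentions: GCS reports an object's MD5 as a base64-encoded digest, so a local file can be checked by computing the same encoding. This sketch uses only the standard library; `local_md5_base64` and `matches_remote_md5` are hypothetical names, and the remote value is assumed to come from something like `get_md5hash`.

```python
import base64
import hashlib


def local_md5_base64(data):
    """Return the base64-encoded MD5 digest of data (bytes)."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")


def matches_remote_md5(data, remote_md5):
    """Compare local bytes against a base64 MD5 string reported by the store."""
    return local_md5_base64(data) == remote_md5
```

A CRC32C check would follow the same pattern, but it needs a third-party CRC32C implementation, so it is left out here.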
Fokko Driesprong 48202ad5bd [AIRFLOW-2050] Fix Travis permission problem
The travis build has issues with reading the Flask
Admin wheel.
Explicitly delete the wheel for now.

Closes #2993 from Fokko/fd-fix-permissions-travis
2018-01-31 11:06:51 +01:00
Sasa Brankovic afa6818a13 [AIRFLOW-2043] Add Intercom to list of companies
Closes #2985 from fox/add-intercom
2018-01-30 13:25:24 +01:00
Dan Davydov 61ff29e578 [AIRFLOW-2023] Add debug logging around number of queued files
Add debug logging around number of queued files to
process in the
scheduler. This makes it easy to see when there
are bottlenecks due to parallelism and how long it
takes for all files to be processed.

Closes #2968 from aoen/ddavydov--add_more_scheduler_metrics
2018-01-29 12:02:02 -08:00
Romain NIO da0e628fa8 [AIRFLOW-XXX] Add Pernod-ricard as a airflow user
Closes #2983 from romain-nio/AddPernodRicardAsAirflowUser
2018-01-29 13:38:52 +01:00
Swalloow 6b2ca2280d [AIRFLOW-1453] Add 'steps' into template_fields in EmrAddSteps
Closes #2981 from Swalloow/emr-add-steps
2018-01-28 20:45:34 +01:00
Yati Sagade efd8338dc8 [AIRFLOW-2015] Add flag for interactive runs
We capture the standard output and error streams
so that they're handled
by the configured logger. However, sometimes, when
developing dags or
Airflow code itself, it is useful to put pdb
breakpoints in code
triggered using an `airflow run`. Such a flow
would of course require
not redirecting the output and error streams to
the logger.

This patch enables that by adding a flag to the
`airflow run`
subcommand. Note that this does not require
`--local`.

Closes #2957 from yati-sagade/ysagade/airflow-2015
2018-01-28 20:41:14 +01:00
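The behaviour described in AIRFLOW-2015 can be pictured as an optional redirect of standard output into a logger. The following is a rough sketch under stated assumptions, not the Airflow implementation: `LoggerWriter`, `run_task`, and the `interactive` parameter are hypothetical names, and only stdout is redirected for brevity.

```python
import contextlib
import io
import logging


class LoggerWriter(io.TextIOBase):
    """File-like object that forwards each non-empty written line to a logger."""

    def __init__(self, logger, level=logging.INFO):
        super().__init__()
        self._logger = logger
        self._level = level

    def writable(self):
        return True

    def write(self, message):
        for line in message.splitlines():
            if line.strip():
                self._logger.log(self._level, line)
        return len(message)


def run_task(task, logger, interactive=False):
    """Call task(); capture its stdout in the logger unless interactive is set."""
    if interactive:
        # Leave the streams alone so a pdb prompt can use the real terminal.
        return task()
    with contextlib.redirect_stdout(LoggerWriter(logger)):
        return task()
```

With `interactive=True` the task's prints and any debugger prompt reach the terminal directly, which is the case the new flag is meant to enable.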
Bolke de Bruin a1d5551777 [AIRFLOW-1895] Fix primary key integrity for mysql
sla_miss and task_instances cannot have NULL execution_dates. The timezone migration scripts forgot to set this properly. In addition, to make sure MySQL does not set "ON UPDATE CURRENT_TIMESTAMP" or MariaDB "DEFAULT 0000-00-00 00:00:00", we now check whether explicit_defaults_for_timestamp is turned on and otherwise fail the database upgrade.

Closes #2969, #2857

Closes #2979 from bolkedebruin/AIRFLOW-1895
2018-01-27 09:01:10 +01:00
Manish Untwal 0565bdc4ea [AIRFLOW-2030] Fix KeyError:`i` in DbApiHook for insert
Closes #2972 from untwal/dbahook/fix_key_error
2018-01-26 20:03:05 +01:00
Kaxil Naik e1bf38942f [AIRFLOW-1943] Add External BigQuery Table feature
Add ability to create a BigQuery External Table.
- Add new method create_external_table() in
BigQueryHook()
- Add parameters to existing
GoogleCloudStorageToBigQueryOperator()

Closes #2948 from kaxil/external_table
2018-01-26 11:57:12 +01:00
Kaxil Naik f9ddb36df1 [AIRFLOW-2033] Add Google Cloud Storage List Operator
Added an operator to get object names in a GCS
bucket filtered by prefix and delimiter with
example.

Closes #2974 from kaxil/gcs_list_op
2018-01-26 11:28:14 +01:00
Daniel Imberman 55f2674925 [AIRFLOW-2006] Add local log catching to kubernetes operator
Closes #2947 from dimberman/AIRFLOW-2006-kubernetes-log-aggregation
2018-01-25 22:37:52 +01:00
Kaxil Naik 3dbfdafd7a [AIRFLOW-2031] Add missing gcp_conn_id in the example in DataFlow docstrings
- Added `gcp_conn_id` parameter in the examples
provided in docstrings for DataFlow operators.

Closes #2973 from kaxil/patch-3
2018-01-25 20:51:22 +01:00
fenglu-g bfbdeca653 [AIRFLOW-2029] Fix AttributeError in BigQueryPandasConnector
Closes #2971 from fenglu-g/master
2018-01-25 10:35:27 -08:00
Siddharth Anand cbc02da48f Kick mirroring 2018-01-24 15:46:05 -08:00
Stéphanie Baltus 2b6d112271 [AIRFLOW-2028] Add JobTeaser to official users list
Closes #2966 from stefani75/master
2018-01-24 15:39:24 -08:00
Siddharth Anand f662822772 Closes apache/incubator-airflow#1839 *No movement from submitter* 2018-01-24 15:16:45 -08:00
Siddharth Anand b780f39ea4 Closes apache/incubator-airflow#730 *No movement from submitter* 2018-01-24 15:16:07 -08:00
Dan Sedov 0990ba8c02 [AIRFLOW-2016] Add support for Dataproc Workflow Templates
Closes #2958 from DanSedov/master
2018-01-24 07:23:48 -08:00
Kristian Jones 18d09a9481 [AIRFLOW-2025] Reduced Logging verbosity
Closes #2967 from britishbadger/reduce_logging_verbosity
2018-01-23 09:06:04 -08:00
Winston Huang 1021f68031 [AIRFLOW-1267][AIRFLOW-1874] Add dialect parameter to BigQueryHook
Allows a default BigQuery dialect to be specified
at the hook level, which is threaded through to
the
underlying cursors.

This allows standard SQL dialect to be used,
while maintaining compatibility with the
`DbApiHook` interface.

Addresses AIRFLOW-1267 and AIRFLOW-1874

Closes #2964 from ji-han/master
2018-01-22 18:27:40 +01:00
Daniel Francis 24bb2b7b6d [AIRFLOW-XXX] Fixed a typo
Closes #2954 from DanielWFrancis/patch-1
2018-01-22 10:34:36 +01:00
Max Countryman 375ed75ff1 [AIRFLOW-XXX] Typo node to nodes
Closes #2950 from maxcountryman/patch-1
2018-01-22 10:32:17 +01:00
Ivan Wirawan 2794819687 [AIRFLOW-2019] Update DataflowHook for updating Streaming type job
Closes #2965 from ivanwirawan/AIRFLOW-2019
2018-01-22 09:50:11 +01:00
Ace Haidrey 97ca9791c3 [AIRFLOW-2017][Airflow 2017] adding query output to PostgresOperator
Make sure you have checked _all_ steps below.

### JIRA
- [x] My PR addresses the following [Airflow 2017](https://issues.apache.org/jira/browse/AIRFLOW-2017/) issues and references them in the PR title. For example, "[AIRFLOW-2017] My Airflow PR"
    - https://issues.apache.org/jira/browse/AIRFLOW-2017

### Description
- [x] Here are some details about my PR, including
screenshots of any UI changes:
Currently we're not getting the output logs of the postgres operator that you would get otherwise if you ran a psql command. It's because the postgres conn has an attribute called [notices](http://initd.org/psycopg/docs/connection.html#connection.notices) which contains this information.
We need to just print the results of this to get that output in the airflow logs, which makes it easy to debug amongst other things.

I've included some images for before and after
pictures.
**BEFORE**
<img width="1146" alt="screen shot 2018-01-19 at 4 46 59 pm" src="https://user-images.githubusercontent.com/10408007/35178405-6f6a1da8-fd3d-11e7-8f50-0dbd567d8ab4.png">
**AFTER**
<img width="1147" alt="screen shot 2018-01-19 at 4 46 25 pm" src="https://user-images.githubusercontent.com/10408007/35178406-74ea4ae6-fd3d-11e7-9551-631eac6bfe7b.png">

### Tests
- [x] My PR adds the following unit tests __OR__
does not need testing for this extremely good
reason:
There isn't anything to test, there is nothing
changing to the current implementation besides an
addition of logging.

### Commits
- [x] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

- [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`

Closes #2959 from Acehaidrey/AIRFLOW-2017
2018-01-20 10:05:16 +01:00
Fokko Driesprong 33c7204212 [AIRFLOW-1889] Split sensors into separate files
Moving the sensors to separate files increases readability of the code. It also reduces the amount of code in the big core.py file.

Closes #2875 from Fokko/AIRFLOW-1889-move-sensors-to-separate-package
2018-01-19 18:59:08 +01:00
Beau Barker e7c118da22 [AIRFLOW-1950] Optionally pass xcom_pull task_ids
Changes the `task_ids` parameter of xcom_pull from
required to optional.

This parameter has always allowed None to be
passed, but since it's a
required parameter, it must be specified as such.

With this change, we're no longer forced to pass
it.

Closes #2902 from bcb/make-xcom-pull-task-ids-optional
2018-01-19 18:56:16 +01:00
Bolke de Bruin 1e36b37b68 [AIRFLOW-1755] Allow mount below root
This enables Airflow and Celery Flower to live
below root. Draws on the work of Geatan Semet
(@Stibbons).

This closes #2723 and closes #2818

Closes #2952 from bolkedebruin/AIRFLOW-1755
2018-01-19 18:54:26 +01:00
Ace Haidrey c3c4a8fdce [AIRFLOW-511][Airflow 511] add success/failure callbacks on dag level
Closes #2934 from Acehaidrey/AIRFLOW-511
2018-01-19 18:53:27 +01:00
wongwill86 dd2bc8cb97 [AIRFLOW-192] Add weight_rule param to BaseOperator
Improved task generation performance significantly
by using sets of
task_ids and dag_ids instead of lists when
calculating total priority
weight.

Closes #2941 from wongwill86/performance-latest
2018-01-18 16:09:46 +01:00
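The performance point in AIRFLOW-192 (sets of task ids instead of lists) can be sketched on a toy graph. This is an illustration only, assuming a plain dict-based DAG and that a task's total priority weight is its own weight plus the weights of all its downstream tasks; `downstream_ids` and `priority_weight_total` are hypothetical helpers, not Airflow's methods.

```python
def downstream_ids(graph, task_id):
    """Collect every task reachable from task_id, each exactly once."""
    seen = set()  # set membership is O(1); a list would make each check O(n)
    stack = list(graph.get(task_id, ()))
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(graph.get(current, ()))
    return seen


def priority_weight_total(graph, weights, task_id):
    """Own weight plus the weight of every distinct downstream task."""
    return weights[task_id] + sum(weights[t] for t in downstream_ids(graph, task_id))
```

In a diamond-shaped graph the shared descendant is reached along two paths; the set ensures it is visited and counted only once.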
Bolke de Bruin fbba5ef7c3 [AIRFLOW-2008] Use callable for python column defaults
We were using timezone.utcnow() instead of a
callable. This
resulted in inserts with all the same values.

Closes #2949 from bolkedebruin/AIRFLOW-2008
2018-01-18 13:57:00 +01:00
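The pitfall fixed in AIRFLOW-2008 is general: passing `f()` as a default stores one value computed at definition time, while passing `f` lets the value be computed for each insert. The sketch below is library-free and uses a hypothetical `Column` stand-in plus a counter instead of a clock so the effect is deterministic; SQLAlchemy's `Column(default=...)` similarly accepts either a scalar or a callable.

```python
import itertools

_ticks = itertools.count()


def fake_utcnow():
    """Stand-in for timezone.utcnow(): returns a new value on every call."""
    return next(_ticks)


class Column:
    """Minimal stand-in for an ORM column with a default (hypothetical)."""

    def __init__(self, default):
        self.default = default

    def resolve_default(self):
        # A callable is invoked per insert; anything else is reused as-is.
        return self.default() if callable(self.default) else self.default


eager = Column(default=fake_utcnow())  # evaluated once, at definition time
lazy = Column(default=fake_utcnow)     # evaluated on every insert
```

Every row using `eager` gets the same value, which matches the symptom in the commit message; `lazy` yields a fresh value per call.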
Siddharth Anand bc72231b13 Closes apache/incubator-airflow#2873 *Not merging due to inactivity* 2018-01-17 17:37:22 -08:00
Siddharth Anand c69080576b Closes apache/incubator-airflow#2919 *Not merging* 2018-01-17 17:35:13 -08:00
Richard Baron Penman 59e3598190 [AIRFLOW-1984] Fix to AWS Batch operator
Correct key is "container" rather than "attempts":
https://docs.aws.amazon.com/batch/latest/APIReference/API_DescribeJobs.html

Closes #2927 from richardpenman/master
2018-01-16 19:25:48 +01:00
fenglu-g f6a1c3cf7f [AIRFLOW-2000] Support non-main dataflow job class
Closes #2942 from fenglu-g/master
2018-01-16 09:32:32 -08:00