Commit Graph

4481 Commits

Author SHA1 Message Date
Kaxil Naik 80d2ee8acc [AIRFLOW-2037] Add methods to get Hash values of a GCS object
- Added `get_md5hash` and `get_crc32c` in
`gcs_hook` to aid in Data integrity validations.

Closes #2977 from kaxil/hashing_gcs_hook
2018-01-31 12:52:13 +01:00
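The integrity check this commit is meant to support can be sketched as follows. Google Cloud Storage reports an object's MD5 as a base64-encoded digest, so a local file's digest has to be encoded the same way before the two are compared. The helper names `local_md5_base64` and `matches_remote` are hypothetical and not part of `gcs_hook`; the remote value is assumed to be the string returned by `get_md5hash`.

```python
import base64
import hashlib


def local_md5_base64(path, chunk_size=1024 * 1024):
    """Return the base64-encoded MD5 digest of a local file (hypothetical helper)."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return base64.b64encode(digest.digest()).decode("ascii")


def matches_remote(path, remote_md5):
    """Compare a local file with the hash string reported for the remote object."""
    return local_md5_base64(path) == remote_md5
```

A caller would pass the value obtained from the hook as `remote_md5` and treat a `False` result as a failed upload or download.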
Fokko Driesprong 48202ad5bd [AIRFLOW-2050] Fix Travis permission problem
The travis build has issues with reading the Flask
Admin wheel.
Explicitly delete the wheel for now.

Closes #2993 from Fokko/fd-fix-permissions-travis
2018-01-31 11:06:51 +01:00
Sasa Brankovic afa6818a13 [AIRFLOW-2043] Add Intercom to list of companies
Closes #2985 from fox/add-intercom
2018-01-30 13:25:24 +01:00
Dan Davydov 61ff29e578 [AIRFLOW-2023] Add debug logging around number of queued files
Add debug logging around number of queued files to process in the
scheduler. This makes it easy to see when there are bottlenecks due to
parallelism and how long it takes for all files to be processed.

Closes #2968 from aoen/ddavydov--add_more_scheduler_metrics
2018-01-29 12:02:02 -08:00
Romain NIO da0e628fa8 [AIRFLOW-XXX] Add Pernod-ricard as an Airflow user
Closes #2983 from romain-nio/AddPernodRicardAsAirflowUser
2018-01-29 13:38:52 +01:00
Swalloow 6b2ca2280d [AIRFLOW-1453] Add 'steps' into template_fields in EmrAddSteps
Closes #2981 from Swalloow/emr-add-steps
2018-01-28 20:45:34 +01:00
Yati Sagade efd8338dc8 [AIRFLOW-2015] Add flag for interactive runs
We capture the standard output and error streams so that they're handled
by the configured logger. However, sometimes, when developing dags or
Airflow code itself, it is useful to put pdb breakpoints in code
triggered using an `airflow run`. Such a flow would of course require
not redirecting the output and error streams to the logger.

This patch enables that by adding a flag to the `airflow run`
subcommand. Note that this does not require `--local`.

Closes #2957 from yati-sagade/ysagade/airflow-2015
2018-01-28 20:41:14 +01:00
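The behaviour described in this commit message can be illustrated with a small sketch. This is not Airflow's implementation: `run_task` and its `interactive` parameter are hypothetical names, and the logger is replaced by an in-memory buffer. It only shows the idea of skipping stream redirection when an interactive session (for example one that hits a pdb breakpoint) needs the real terminal.

```python
import contextlib
import io


def run_task(task, interactive=False):
    """Run `task`, capturing its output unless `interactive` is set.

    Returns the captured text, or None when the streams were left untouched.
    """
    if interactive:
        # Leave stdout/stderr attached to the terminal so a debugger can use them.
        task()
        return None
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer), contextlib.redirect_stderr(buffer):
        task()
    return buffer.getvalue()
```

With `interactive=False` the caller receives the captured text and can forward it to a logger; with `interactive=True` nothing is intercepted.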
Bolke de Bruin a1d5551777 [AIRFLOW-1895] Fix primary key integrity for mysql
sla_miss and task_instances cannot have NULL execution_dates. The
timezone migration scripts forgot to set this properly. In addition, to
make sure MySQL does not set "ON UPDATE CURRENT_TIMESTAMP" or MariaDB
"DEFAULT 0000-00-00 00:00:00", we now check whether
explicit_defaults_for_timestamp is turned on and otherwise fail the
database upgrade.

Closes #2969, #2857

Closes #2979 from bolkedebruin/AIRFLOW-1895
2018-01-27 09:01:10 +01:00
Manish Untwal 0565bdc4ea [AIRFLOW-2030] Fix KeyError:`i` in DbApiHook for insert
Closes #2972 from untwal/dbahook/fix_key_error
2018-01-26 20:03:05 +01:00
Kaxil Naik e1bf38942f [AIRFLOW-1943] Add External BigQuery Table feature
Add ability to create a BigQuery External Table.
- Add new method create_external_table() in
BigQueryHook()
- Add parameters to existing
GoogleCloudStorageToBigQueryOperator()

Closes #2948 from kaxil/external_table
2018-01-26 11:57:12 +01:00
Kaxil Naik f9ddb36df1 [AIRFLOW-2033] Add Google Cloud Storage List Operator
Added an operator to get object names in a GCS
bucket filtered by prefix and delimiter with
example.

Closes #2974 from kaxil/gcs_list_op
2018-01-26 11:28:14 +01:00
Daniel Imberman 55f2674925 [AIRFLOW-2006] Add local log catching to kubernetes operator
Closes #2947 from dimberman/AIRFLOW-2006-kubernetes-log-aggregation
2018-01-25 22:37:52 +01:00
Kaxil Naik 3dbfdafd7a [AIRFLOW-2031] Add missing gcp_conn_id in the example in DataFlow docstrings
- Added `gcp_conn_id` parameter in the examples
provided in docstrings for DataFlow operators.

Closes #2973 from kaxil/patch-3
2018-01-25 20:51:22 +01:00
fenglu-g bfbdeca653 [AIRFLOW-2029] Fix AttributeError in BigQueryPandasConnector
Closes #2971 from fenglu-g/master
2018-01-25 10:35:27 -08:00
Siddharth Anand cbc02da48f Kick mirroring 2018-01-24 15:46:05 -08:00
Stéphanie Baltus 2b6d112271 [AIRFLOW-2028] Add JobTeaser to official users list
Closes #2966 from stefani75/master
2018-01-24 15:39:24 -08:00
Siddharth Anand f662822772 Closes apache/incubator-airflow#1839 *No movement from submitter* 2018-01-24 15:16:45 -08:00
Siddharth Anand b780f39ea4 Closes apache/incubator-airflow#730 *No movement from submitter* 2018-01-24 15:16:07 -08:00
Dan Sedov 0990ba8c02 [AIRFLOW-2016] Add support for Dataproc Workflow Templates
Closes #2958 from DanSedov/master
2018-01-24 07:23:48 -08:00
Kristian Jones 18d09a9481 [AIRFLOW-2025] Reduced Logging verbosity
Closes #2967 from britishbadger/reduce_logging_verbosity
2018-01-23 09:06:04 -08:00
Winston Huang 1021f68031 [AIRFLOW-1267][AIRFLOW-1874] Add dialect parameter to BigQueryHook
Allows a default BigQuery dialect to be specified
at the hook level, which is threaded through to
the underlying cursors.

This allows standard SQL dialect to be used,
while maintaining compatibility with the
`DbApiHook` interface.

Addresses AIRFLOW-1267 and AIRFLOW-1874

Closes #2964 from ji-han/master
2018-01-22 18:27:40 +01:00
Daniel Francis 24bb2b7b6d [AIRFLOW-XXX] Fixed a typo
Closes #2954 from DanielWFrancis/patch-1
2018-01-22 10:34:36 +01:00
Max Countryman 375ed75ff1 [AIRFLOW-XXX] Typo node to nodes
Closes #2950 from maxcountryman/patch-1
2018-01-22 10:32:17 +01:00
Ivan Wirawan 2794819687 [AIRFLOW-2019] Update DataflowHook for updating Streaming type job
Closes #2965 from ivanwirawan/AIRFLOW-2019
2018-01-22 09:50:11 +01:00
Ace Haidrey 97ca9791c3 [AIRFLOW-2017][Airflow 2017] adding query output to PostgresOperator
Make sure you have checked _all_ steps below.

### JIRA
- [x] My PR addresses the following [Airflow 2017](https://issues.apache.org/jira/browse/AIRFLOW-2017/) issues and references them in the PR title. For example, "[AIRFLOW-2017] My Airflow PR"
    - https://issues.apache.org/jira/browse/AIRFLOW-2017

### Description
- [x] Here are some details about my PR, including
screenshots of any UI changes:
Currently we're not getting the output logs of the
postgres operator that you would get if you ran a
psql command. That output is held in an attribute of
the postgres conn called [notices](http://initd.org/psycopg/docs/connection.html#connection.notices).
We just need to print the contents of this attribute
to get that output in the airflow logs, which makes
it easy to debug, among other things.

I've included some images for before and after
pictures.
**BEFORE**
<img width="1146" alt="screen shot 2018-01-19 at 4 46 59 pm" src="https://user-images.githubusercontent.com/10408007/35178405-6f6a1da8-fd3d-11e7-8f50-0dbd567d8ab4.png">
**AFTER**
<img width="1147" alt="screen shot 2018-01-19 at 4 46 25 pm" src="https://user-images.githubusercontent.com/10408007/35178406-74ea4ae6-fd3d-11e7-9551-631eac6bfe7b.png">

### Tests
- [x] My PR adds the following unit tests __OR__
does not need testing for this extremely good
reason:
There isn't anything to test: nothing changes in
the current implementation besides the addition of
logging.

### Commits
- [x] My commits all reference JIRA issues in
their subject lines, and I have squashed multiple
commits if they address the same issue. In
addition, my commits follow the guidelines from
"[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

- [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`

Closes #2959 from Acehaidrey/AIRFLOW-2017
2018-01-20 10:05:16 +01:00
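The logging addition described in this PR can be sketched roughly as below. psycopg2 connections expose server messages as a list of strings in `connection.notices`; everything else here is an assumption for illustration. `FakeConnection` stands in for a real connection so the sketch runs without a database, and `log_notices` is a hypothetical helper, not the code merged into `PostgresOperator`.

```python
import logging

log = logging.getLogger(__name__)


class FakeConnection:
    """Stand-in for a psycopg2 connection; only the `notices` list is modeled."""

    def __init__(self, notices):
        self.notices = notices


def log_notices(conn, logger=log):
    """Write each server notice collected on the connection to the logger."""
    lines = [notice.rstrip() for notice in conn.notices]
    for line in lines:
        logger.info(line)
    return lines
```

Called after a statement has executed, this surfaces the same messages a `psql` session would print.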
Fokko Driesprong 33c7204212 [AIRFLOW-1889] Split sensors into separate files
Moving the sensors to separate files increases
readability of the
code. It also reduces the amount of code in the big
core.py file.

Closes #2875 from Fokko/AIRFLOW-1889-move-sensors-to-separate-package
2018-01-19 18:59:08 +01:00
Beau Barker e7c118da22 [AIRFLOW-1950] Optionally pass xcom_pull task_ids
Changes the `task_ids` parameter of xcom_pull from
required to optional.

This parameter has always allowed None to be
passed, but since it's a
required parameter, it must be specified as such.

With this change, we're no longer forced to pass
it.

Closes #2902 from bcb/make-xcom-pull-task-ids-optional
2018-01-19 18:56:16 +01:00
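The difference between a required parameter that accepts None and an optional one can be shown with two toy functions. These are illustrative stand-ins, not the real `xcom_pull` signature; the `key` parameter and its default are assumptions added only to make the example look like a realistic call.

```python
def xcom_pull_required(task_ids, key="return_value"):
    """Before: None is accepted, but the caller must still pass it explicitly."""
    return (task_ids, key)


def xcom_pull_optional(task_ids=None, key="return_value"):
    """After: omitting task_ids is equivalent to passing None."""
    return (task_ids, key)
```

With the first form, `xcom_pull_required()` raises a `TypeError`; with the second, `xcom_pull_optional()` behaves like `xcom_pull_optional(None)`.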
Bolke de Bruin 1e36b37b68 [AIRFLOW-1755] Allow mount below root
This enables Airflow and Celery Flower to live
below root. Draws on the work of Geatan Semet
(@Stibbons).

This closes #2723 and closes #2818

Closes #2952 from bolkedebruin/AIRFLOW-1755
2018-01-19 18:54:26 +01:00
Ace Haidrey c3c4a8fdce [AIRFLOW-511][Airflow 511] add success/failure callbacks on dag level
Closes #2934 from Acehaidrey/AIRFLOW-511
2018-01-19 18:53:27 +01:00
wongwill86 dd2bc8cb97 [AIRFLOW-192] Add weight_rule param to BaseOperator
Improved task generation performance significantly
by using sets of
task_ids and dag_ids instead of lists when
calculating total priority
weight.

Closes #2941 from wongwill86/performance-latest
2018-01-18 16:09:46 +01:00
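Why sets help here can be seen in a minimal sketch, assuming the total priority weight of a task is the sum of its own weight and the weights of every task downstream of it. The function name and the `downstream` / `weights` mappings are hypothetical and do not mirror Airflow's data model. Tracking visited task ids in a set makes each membership test constant time and keeps shared descendants from being counted twice.

```python
def total_priority_weight(task_id, downstream, weights):
    """Sum the weight of a task and of every task reachable downstream of it."""
    seen = set()
    stack = [task_id]
    while stack:
        current = stack.pop()
        if current in seen:  # constant-time check; a list would scan linearly
            continue
        seen.add(current)
        stack.extend(downstream.get(current, ()))
    return sum(weights[t] for t in seen)
```

In a diamond-shaped graph the shared leaf is visited once, so its weight contributes once to the total.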
Bolke de Bruin fbba5ef7c3 [AIRFLOW-2008] Use callable for python column defaults
We were using timezone.utcnow() instead of a
callable. This
resulted in inserts with all the same values.

Closes #2949 from bolkedebruin/AIRFLOW-2008
2018-01-18 13:57:00 +01:00
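The bug class described here is easy to reproduce in miniature. The sketch below uses a toy `Column` class and a fake clock (`fake_utcnow`, which returns 1, 2, 3, ... on successive calls) rather than the real ORM and `timezone.utcnow`, so all names are assumptions. Passing the result of a call freezes one value at definition time, while passing the callable defers evaluation to each insert.

```python
import itertools

_ticks = itertools.count(1)


def fake_utcnow():
    """Stand-in clock that returns 1, 2, 3, ... on successive calls."""
    return next(_ticks)


class Column:
    """Toy column that resolves its default at insert time."""

    def __init__(self, default):
        self.default = default

    def value_for_insert(self):
        # Call the default if it is callable, otherwise reuse the stored value.
        return self.default() if callable(self.default) else self.default


eager = Column(default=fake_utcnow())  # evaluated once, when the column is defined
lazy = Column(default=fake_utcnow)     # evaluated again on every insert

eager_values = [eager.value_for_insert() for _ in range(3)]
lazy_values = [lazy.value_for_insert() for _ in range(3)]
```

Here `eager_values` holds the same value three times, which is the "inserts with all the same values" symptom, while `lazy_values` advances with the clock.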
Siddharth Anand bc72231b13 Closes apache/incubator-airflow#2873 *Not merging due to inactivity* 2018-01-17 17:37:22 -08:00
Siddharth Anand c69080576b Closes apache/incubator-airflow#2919 *Not merging* 2018-01-17 17:35:13 -08:00
Richard Baron Penman 59e3598190 [AIRFLOW-1984] Fix to AWS Batch operator
Correct key is "container" rather than "attempts":
https://docs.aws.amazon.com/batch/latest/APIReference/API_DescribeJobs.html

Closes #2927 from richardpenman/master
2018-01-16 19:25:48 +01:00
fenglu-g f6a1c3cf7f [AIRFLOW-2000] Support non-main dataflow job class
Closes #2942 from fenglu-g/master
2018-01-16 09:32:32 -08:00
Bolke de Bruin 88130a5d7e [AIRFLOW-2003] Use flask-caching instead of flask-cache
Flask-cache has been unmaintained for over three
years,
flask-caching is the community supported version.

Closes #2944 from bolkedebruin/AIRFLOW-2003
2018-01-15 21:12:03 +01:00
Bolke de Bruin a34a4865b1 [AIRFLOW-2002] Do not swallow exception on logging import
Closes #2945 from bolkedebruin/AIRFLOW-2002
2018-01-15 21:06:59 +01:00
Bolke de Bruin 7cf7cd7cae [AIRFLOW-2004] Import flash from flask not flask.login
Closes #2943 from bolkedebruin/AIRFLOW-2004
2018-01-15 21:04:38 +01:00
Bolke de Bruin 1abe7f6d54 Merge pull request #2853 from dimberman/Airflow_1517_kubenetes_operator 2018-01-12 19:02:52 +01:00
Kaxil Naik b48bbbd6f1 [AIRFLOW-1997] Fix GCP operator doc strings
Closes #2939 from kaxil/docstring_fix
2018-01-12 08:59:47 -08:00
Ivan Wirawan a2bb2d70af [AIRFLOW-1996] Update DataflowHook waitfordone for Streaming type job[]
AIRFLOW-1996 Update DataflowHook waitfordone for
Streaming type job

fix flake8

Closes #2938 from ivanwirawan/AIRFLOW-1996
2018-01-12 15:55:14 +01:00
Ace Haidrey 147472b99b [AIRFLOW-1995][Airflow 1995] add on_kill method to SqoopOperator
Closes #2936 from Acehaidrey/AIRFLOW-1995
2018-01-12 15:08:29 +01:00
Alan Ma eb994d683f [AIRFLOW-1770] Allow HiveOperator to take in a file
Clarify and upgrade HiveOperator. Include
description of hql parameter being able to
take in a relative path from the dag file
of a hive script, templated or not. Add
ability to template hiveconf variables. Add
default value to the map reduce job name as
well as add updated hiveconf var for queue.

Closes #2752 from wolfier/AIRFLOW-1770
2018-01-12 11:01:11 +01:00
Alan Ma c208a41fc2 [AIRFLOW-1994] Change background color of Scheduled state Task Instances
On Task Instances page 'Scheduled' state TIs are
not-visible due to white
background. Changing background color to tan for
better UX.

Closes #2933 from wolfier/AIRFLOW-1994
2018-01-12 09:36:25 +01:00
Swalloow 404bee8d85 [AIRFLOW-1436][AIRFLOW-1475] EmrJobFlowSensor considers Cancelled step as Successful
Closes #2937 from Swalloow/master
2018-01-12 09:26:10 +01:00
GRANT NICHOLAS 7fb5906e68 [AIRFLOW-1517] Kubernetes operator PR fixes
Fix python flake8 linting issues and AIRFLOW license issues
2018-01-11 15:29:34 -08:00
Daniel Imberman d5b13a3dad [AIRFLOW-1517] addressed PR comments 2018-01-11 15:29:27 -08:00
Daniel Imberman 12b725df15 [AIRFLOW-1517] started documentation of k8s operator 2018-01-11 15:29:17 -08:00
GRANT NICHOLAS 28d9d7f00f [AIRFLOW-1517] Restore authorship of resources
Collaboration authors got destroyed when splitting up a PR; this commit adds back the code that was removed in the previous commit to restore authorship
2018-01-11 15:29:17 -08:00
GRANT NICHOLAS 540b724a0d [AIRFLOW-1517] Remove authorship of resources
Collaboration authors got destroyed when splitting up a PR; this commit removes code that will be re-added in the next commit to restore authorship
2018-01-11 15:29:16 -08:00