Commit Graph

4104 Commits

Author SHA1 Message Date
Ash Berlin-Taylor a6b23a36e0 [AIRFLOW-1594] Don't install test packages into python root.
By default `find_packages()` will find _any_ valid python package,
including things under tests. We don't want to install the tests
packages into the python path, so exclude those.

Closes #2597 from ashb/AIRFLOW-1594-dont-install-tests
2017-09-13 10:11:39 +02:00
Fokko Driesprong a7a518902d [AIRFLOW-1582] Improve logging within Airflow
Clean up logging within Airflow. Remove the old logging.py and
move to the airflow.utils.log.* interface. Remove setting the logging
outside of the settings/configuration code. Move away from the string
format to logging_function(msg, *args).

Closes #2592 from Fokko/AIRFLOW-1582-Improve-logging-structure
2017-09-13 09:36:58 +02:00
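The move from string formatting to `logging_function(msg, *args)` mentioned in the commit above can be illustrated with the standard `logging` module. This is only a minimal sketch, not Airflow's code: the logger name, the `build_logger` helper, and the `task_id` value are assumptions made for the example.

```python
import io
import logging


def build_logger(stream):
    """Return a logger that writes bare messages to `stream` (hypothetical helper)."""
    logger = logging.getLogger("example.lazy_formatting")  # assumed name
    logger.setLevel(logging.INFO)
    logger.addHandler(logging.StreamHandler(stream))
    logger.propagate = False  # keep the example output isolated
    return logger


stream = io.StringIO()
log = build_logger(stream)
task_id = "load_table"  # illustrative value

# Eager style: the string is built even though DEBUG records are discarded.
log.debug("Starting task {}".format(task_id))

# Lazy style: the arguments are only interpolated if the record is emitted.
log.info("Starting task %s", task_id)

output = stream.getvalue()
```

With the lazy form, the logging framework receives the template and its arguments separately, so records that are filtered out by level never pay the formatting cost.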
Bolke de Bruin 5de632e07b Merge branch 'pr_nicer' 2017-09-11 15:24:11 +02:00
Maxime Beauchemin da76ac72e8 [AIRFLOW-1476] add INSTALL instruction for source releases
Closes #2492 from mistercrunch/install
2017-09-11 15:23:29 +02:00
Bolke de Bruin f9dcc7d6e0 [AIRFLOW-XXX] Save username and password in airflow-pr 2017-09-11 15:19:58 +02:00
Alex Guziel 8a2d24856b [AIRFLOW-1522] Increase text size for var field in variables for MySQL
Closes #2535 from saguziel/aguziel-increase-text
2017-09-11 15:18:26 +02:00
Swalloow 01be025c5e [AIRFLOW-950] Missing AWS integrations on documentation::integrations
Closes #2552 from Swalloow/master
2017-09-11 15:03:40 +02:00
Maxime Beauchemin 728817f4ce [AIRFLOW-XXX] 1.8.2 release notes
Closes #2562 from mistercrunch/release_182
2017-09-11 13:29:54 +02:00
Dan Fuller aa95f25796 [AIRFLOW-1573] Remove `thrift < 0.10.0` requirement
Closes #2574 from dan-disqus/Thrift
2017-09-11 13:14:47 +02:00
Dan Davydov 17ac070b29 [AIRFLOW-1584] Remove insecure /headers endpoint
Closes #2588 from aoen/ddavydov--remove_headers_endpoint
2017-09-11 13:12:20 +02:00
Mike Ghen e83012589b [AIRFLOW-1586] Add mapping for date type to mysql_to_gcs operator
Closes #2589 from mikeghen/bug/airflow-1586
2017-09-11 13:10:37 +02:00
Daniel Lee 5b978b28bc [AIRFLOW-1579] Adds support for jagged rows in Bigquery hook for BQ load jobs
Closes #2582 from DannyLee12/master
2017-09-08 11:37:42 -07:00
Andrew Chen c2c51518e8 [AIRFLOW-1577] Add token support to DatabricksHook
Closes #2579 from andrewmchen/token
2017-09-08 11:24:14 -07:00
Fokko Driesprong ea9ab96cb7 [AIRFLOW-1580] Error in string formatting
The string formatting should be done on the string, and not on the
exception that is being raised.

Closes #2583 from Fokko/AIRFLOW-1580-error-in-checkout-operator
2017-09-08 14:45:38 +02:00
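The class of bug described in the commit above, formatting the exception instead of the string, can be reproduced in a few lines. This is a hedged sketch: `CheckError`, `broken`, `fixed`, and `outcome_of` are hypothetical names, and the actual operator code is not shown on this page.

```python
class CheckError(Exception):
    """Hypothetical exception type used only for this illustration."""


def broken(value):
    # Bug: .format() is looked up on the exception instance, not the string,
    # so an AttributeError is raised before CheckError can be.
    raise CheckError("Unexpected value: {}").format(value)


def fixed(value):
    # Fix: format the message first, then build the exception from it.
    raise CheckError("Unexpected value: {}".format(value))


def outcome_of(func, value):
    """Call func(value) and report which exception type and message came out."""
    try:
        func(value)
    except Exception as exc:
        return type(exc).__name__, str(exc)


broken_result = outcome_of(broken, 3)
fixed_result = outcome_of(fixed, 3)
```

In the broken variant the caller sees an unrelated `AttributeError`, which hides the message the author intended to report.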
Younghee Kwon f3258bb539 [AIRFLOW-1567] Updated docs for Google ML Engine operators/hooks
Closes #2573 from yk5/master
2017-09-07 14:04:16 -07:00
Ace Haidrey 2d40694482 [AIRFLOW-1574] add 'to' attribute to templated vars of email operator
The `to` field sometimes needs to be templatable, for example when a DAG
uses XCom to find the user to send the information to (e.g. a user
submits a form and, based on the LDAP user, we send that specific user
the information). It is an easy fix to add the 'to' field to the
templated options.

Closes #2577 from Acehaidrey/AIRFLOW-1574
2017-09-07 11:53:20 -07:00
Adam Boscarino 8f1ec4dee6 [AIRFLOW-1572] add carbonite to company list
Closes #2571 from ajbosco/add_carbonite
2017-09-06 16:51:34 -07:00
Felipe Sabino 5a4870e186 Add ContaAzul as an Airflow user
Closes #2566 from sabino/patch-1
2017-09-06 16:48:30 -07:00
Joy Gao 03af6105e8 [AIRFLOW-1568] Fix typo in BigQueryHook
Closes #2575 from jgao54/ds-import-export
2017-09-06 16:44:31 -07:00
Alex Guziel b2e1753f5b [AIRFLOW-1493][AIRFLOW-XXXX][WIP] fixed dumb thing
Closes #2505 from saguziel/aguziel-fix-double-trigger
2017-09-06 13:49:34 -07:00
Younghee Kwon af91e2ac06 [AIRFLOW-1567][Airflow-1567] Renamed cloudml hook and operator to mlengine
Closes #2567 from yk5/cmle
2017-09-06 09:51:17 -07:00
Joy Gao 86063ba4e9 [AIRFLOW-1568] Add datastore export/import operators
Closes #2568 from jgao54/ds-import-export
2017-09-06 09:45:56 -07:00
Niels Zeilemaker 4c674ccffd [AIRFLOW-1564] Use Jinja2 to render logging filename
Still backwards compatible with python format

Closes #2565 from NielsZeilemaker/AIRFLOW-1564
2017-09-06 13:41:20 +02:00
Fokko Driesprong 32750601ad [AIRFLOW-1562] Spark-sql logging contains deadlock
Logging in SparkSqlOperator does not work as intended. Spark-sql
internally redirects all logs to stdout (including stderr), which causes
the current two-iterator logging to get stuck on the stderr pipe. This
can lead to a deadlock: the stderr buffer can fill up and block until it
is consumed, which only happens when the process ends, so the process
stalls.

Closes #2563 from Fokko/AIRFLOW-1562-Spark-sql-loggin-contains-deadlock
2017-09-06 13:37:31 +02:00
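One common way to avoid the kind of pipe deadlock described in the commit above is to merge stderr into stdout and drain a single stream. The sketch below assumes that approach and is not taken from the SparkSqlOperator itself; `run_and_collect` and the child command are hypothetical.

```python
import subprocess
import sys

# Hypothetical child process that writes to both stdout and stderr.
CHILD = "import sys; print('out line'); sys.stderr.write('err line\\n')"


def run_and_collect(cmd):
    """Run cmd, reading stdout and stderr through one pipe.

    Because stderr is redirected into stdout, no second pipe can fill up
    unread while the parent is blocked on the first one.
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    lines = [line.rstrip("\n") for line in proc.stdout]
    returncode = proc.wait()
    return returncode, lines


returncode, lines = run_and_collect([sys.executable, "-c", CHILD])
```

The relative order of the two lines depends on the child's buffering, so callers should not rely on it.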
Rajiv Bharadwaja 9df0ac64c0 [AIRFLOW-1556][Airflow 1556] Add support for SQL parameters in BigQueryBaseCursor
Closes #2557 from rajivpb/sql-parameters
2017-09-01 12:59:11 -07:00
Sid Anand de593216d9 [AIRFLOW-108] Add CreditCards.com to companies list
Dear Airflow maintainers,

Please accept this PR. I understand that it will
not be reviewed until I have checked off all the
steps below!

### JIRA
- [/] My PR addresses the following [Airflow JIRA](https://issues.apache.org/jira/browse/AIRFLOW/)
issues and references them in the PR title. For
example, "[AIRFLOW-XXX] My Airflow PR"
    - https://issues.apache.org/jira/browse/AIRFLOW-108

### Description
- [/] Here are some details about my PR, including
screenshots of any UI changes:
Adding an entry to the companies list in README.md
file.

### Tests
- [/] My PR adds the following unit tests __OR__
does not need testing for this extremely good
reason: Documentation change only.

### Commits
- [/] My commits all reference JIRA issues in
their subject lines, and I have squashed multiple
commits if they address the same issue. In
addition, my commits follow the guidelines from
"[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Closes #2554 from r39132/master
2017-08-30 18:17:11 -07:00
Ace Haidrey 9450d8db6f [AIRFLOW-1541] Add channel to template fields of slack_operator
Closes #2549 from Acehaidrey/AIRFLOW-1541
2017-08-30 11:52:48 -07:00
fenglu-g b1f902e63e [AIRFLOW-1535] Add service account/scopes in dataproc
Closes #2546 from fenglu-g/master
2017-08-30 09:13:43 -07:00
Siddharth Anand 5c8075526e closes apache/incubator-airflow#1186 *Won't fix* 2017-08-29 22:44:26 -07:00
Siddharth Anand 73e28feb45 closes apache/incubator-airflow#1382 *Won't fix* 2017-08-29 22:43:03 -07:00
Siddharth Anand d445bbb517 closes apache/incubator-airflow#1415 *Won't fix* 2017-08-29 22:41:16 -07:00
Siddharth Anand 401bc82ab0 closes apache/incubator-airflow#1444 *Won't fix* 2017-08-29 22:39:45 -07:00
Varun 66a95d02cb [AIRFLOW-1384] Add to README.md CaDC/ARGO
Added to "currently **officially** using Airflow":

[California Data Collaborative](http://californiadatacollaborative.org) powered by [ARGO Labs](http://www.argolabs.org)

Dear Airflow maintainers,

Please accept this PR. I understand that it will
not be reviewed until I have checked off all the
steps below!

- [x] My PR addresses the following [Airflow JIRA]
**https://issues.apache.org/jira/browse/AIRFLOW-1384**

- The California Data Collaborative is a unique
coalition of forward thinking municipal water
managers in California who along with ARGO, a
startup non-profit that builds, operates, and
maintains data infrastructures, are pioneering new
standards of collaborating around and
administering water data for millions of
Californians.

ARGO has deployed a hosted version of Airflow on
AWS and it is used to orchestrate data pipelines
to parse water use data from participating
utilities to power analytics. Furthermore, ARGO
also uses Airflow to power a data infrastructure
for citywide street maintenance via
https://github.com/ARGO-SQUID

- [x] My PR adds the following unit tests __OR__
does not need testing for this extremely good
reason:
Change to README.md does not require unit testing.

- [x] My commits all reference JIRA issues in
their subject lines, and I have squashed multiple
commits if they address the same issue. In
addition, my commits follow the guidelines from
"[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Update README.md

added to  currently **officially** using Airflow
section of README.md

[California Data Collaborative](https://github.com/California-Data-Collaborative) powered by [ARGO Labs](http://www.argolabs.org)

Added CaDC/ARGO Labs to README.md

Please consider adding [Argo
Labs](www.argolabs.org) to the Airflow users
section.
**Context**
- The California Data Collaborative is a unique
coalition of forward thinking municipal water
managers in California who along with ARGO, a
startup non-profit that builds, operates, and
maintains data infrastructures, are pioneering new
standards of collaborating around and
administering water data for millions of
Californians.

- ARGO has deployed a hosted version of Airflow on
AWS and it is used to orchestrate data pipelines
to parse water use data from participating
utilities to power analytics. Furthermore, ARGO
also uses Airflow to power a data infrastructure
for citywide street maintenance via
https://github.com/ARGO-SQUID

Closes #2421 from vr00n/patch-3
2017-08-29 21:13:34 -07:00
Maxime Beauchemin 7cc3461557 [AIRFLOW-1546] add Zymergen to org list in README
Closes #2512 from mistercrunch/add_zymergen
2017-08-29 21:05:06 -07:00
Siva Pandeti fe051cf85f [AIRFLOW-1545] Add Nextdoor to companies list
Add Nextdoor to company list

Add Nextdoor to companies list

Closes #2448 from SivaPandeti/master
2017-08-29 21:01:13 -07:00
Kevin Gao 0e49871571 [AIRFLOW-1544] Add DataFox to companies list
Closes #2544 from sudowork/datafox-companies
2017-08-29 14:54:01 -07:00
Guillermo Rodriguez Cano 4a4b024cb1 [AIRFLOW-1529] Add logic supporting quoted newlines in Google BigQuery load jobs
Closes #2545 from wileeam/bq-allow-quoted-nl
2017-08-23 14:36:49 -07:00
Richard Garcia 1e2d237389 add Grand Rounds to companies list
Closes #2533 from richddr/add-grand-rounds-to-company-list
2017-08-21 10:48:19 -07:00
Moe Nadal f1a7c00510 [AIRFLOW-1521] Fix template rendering for BigqueryTableDeleteOperator
The list of template_fields contained only one entry and was
interpreted by Python as a list of characters, which broke the
render_template function (see the AIRFLOW-1521 ticket).

Closes #2534 from moe-nadal-ck/AIRFLOW-1521/fix_table_delete_operator_template_fields_list
2017-08-18 15:25:00 -07:00
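The pitfall behind the commit above is a general Python one: parentheses around a single string do not make a tuple. The following sketch uses a hypothetical field name, `table_name`, and a hypothetical `fields_to_render` helper rather than the operator's real attributes.

```python
# Parentheses alone do not create a tuple: this is just the string "table_name".
broken_template_fields = ("table_name")

# The trailing comma is what makes a one-element tuple.
fixed_template_fields = ("table_name",)


def fields_to_render(template_fields):
    """Iterate over the declared fields the way a renderer might (hypothetical)."""
    return [name for name in template_fields]


broken_fields = fields_to_render(broken_template_fields)
fixed_fields = fields_to_render(fixed_template_fields)
```

Iterating over the broken declaration yields single characters, so a renderer would look for attributes named `t`, `a`, `b`, and so on instead of the intended field.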
Fokko Driesprong de99aa20f4 [AIRFLOW-1324] Generalize Druid operator and hook
Make the druid operator and hook more specific. This allows us to
have a more flexible configuration, for example to ingest parquet.
Also get rid of the PyDruid extension since it is more focused on
querying druid, rather than ingesting data. Just requests is
sufficient to submit an indexing job. Add a test to the hive_to_druid
operator to make sure it behaves as we expect. Furthermore, cleaned
up the docstring a bit.

Closes #2378 from Fokko/AIRFLOW-1324-make-more-general-druid-hook-and-operator
2017-08-18 21:34:03 +02:00
Edgar Rodriguez d22340aab0 [AIRFLOW-1516] Fix error handling getting fernet
There were unhandled cases for exceptions when importing fernet in
models.py. This seems to be a remnant of a previous refactor,
replacing logic that would depend on the definition of a global variable
for Fernet if it was imported correctly.

Generally catching all exceptions from the get_fernet function, given that
other functions are already handling it that way and the only
error handling case here is to not use encryption.

Closes #2527 from edgarRd/erod-fernet-error-handling
2017-08-18 11:31:27 -07:00
George Leslie-Waksman ea86895d5b [AIRFLOW-1420][AIRFLOW-1473] Fix deadlock check
Update the deadlock check to prevent false positives on upstream
failure or skip conditions.

Closes #2506 from gwax/fix_dead_dagruns
2017-08-17 15:19:52 -07:00
Edgar Rodriguez 67b47c9589 [AIRFLOW-1495] Fix migration on index on job_id
There was a merge conflict on the migration hash for down revision
at the time that two commits including migrations were merged.

This commit restores the chain of revisions for the migrations,
pointing to the last one. The job_id index migration was regenerated
from the top migration.

Closes #2524 from edgarRd/erod-ti-jobid-index-fix
2017-08-15 15:27:06 -07:00
Edgar Rodriguez 04bfba3aa9 [AIRFLOW-1483] Making page size consistent in list
Views showing model listings had large page sizes which made page
loading really slow client-side, mostly due to DOM processing and
JS plugin rendering.
Also, the page size was inconsistent across some listings.

This commit introduces a configurable page size, and by default
it'll use a page_size = 100. Also, the same page size is applied to
all the model views controlled by flask_admin to be consistent.

Closes #2497 from edgarRd/erod-ui-page-size-conf
2017-08-15 15:01:19 -07:00
Edgar Rodriguez e1772c008d [AIRFLOW-1495] Add TaskInstance index on job_id
Column job_id is unindexed in TaskInstance, although it was used as the
default sort column in TaskInstanceView.

This commit adds the required migration to add the index on
task_instance.job_id on future db upgrades.

Closes #2520 from edgarRd/erod-ti-jobid-index
2017-08-15 14:57:28 -07:00
Dan Davydov 4cf904cf5a [AIRFLOW-855] Replace PickleType with LargeBinary in XCom
PickleType in XCom allows remote code execution. In order to deprecate
it without changing the MySQL table schema, change PickleType to
LargeBinary, because they both map to the blob type in MySQL. Add
"enable_pickling" to the function signature to control using either
pickle or JSON. "enable_pickling" should also be added to the core
section of airflow.cfg.

Picked up where https://github.com/apache/incubator-airflow/pull/2132
left off. Took this PR, fixed merge conflicts, added
documentation/tests, fixed broken tests/operators, and fixed the
python3 issues.

Closes #2518 from aoen/disable-pickle-type
2017-08-15 12:24:07 -07:00
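The idea in the commit above, storing raw bytes and letting an `enable_pickling` flag choose between pickle and JSON, can be sketched as follows. Only the flag name comes from the commit message; `serialize_value` and `deserialize_value` are hypothetical helpers, and this is not Airflow's actual implementation.

```python
import json
import pickle


def serialize_value(value, enable_pickling=True):
    """Turn a value into bytes suitable for a binary (blob) column."""
    if enable_pickling:
        return pickle.dumps(value)
    # JSON is limited to plain data, which is what makes it safe to load.
    return json.dumps(value).encode("UTF-8")


def deserialize_value(blob, enable_pickling=True):
    """Inverse of serialize_value; both sides must agree on the flag."""
    if enable_pickling:
        # Unpickling untrusted bytes can execute arbitrary code.
        return pickle.loads(blob)
    return json.loads(blob.decode("UTF-8"))


payload = {"rows": 3, "table": "events"}  # illustrative value
json_blob = serialize_value(payload, enable_pickling=False)
round_tripped = deserialize_value(json_blob, enable_pickling=False)
```

Because both encodings produce bytes, the column type can stay the same while the serialization format is switched by configuration.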
Trevor Edwards 984a87c0cb [AIRFLOW-1505] Document when Jinja substitution occurs
Closes #2523 from TrevorEdwards/airflow-1505
2017-08-15 10:35:46 -07:00
Trevor Edwards 1cd6c4b0e8 [AIRFLOW-1504] Log dataproc cluster name
Closes #2517 from TrevorEdwards/dataproc_log_clustername
2017-08-15 10:22:03 -07:00
Ace Haidrey 42cad60698 [AIRFLOW-1239] Fix unicode error for logs in base_task_runner
A PR for this JIRA already exists
(https://github.com/apache/incubator-airflow/pull/2318). The issue is
that in python 2.7 not all literals are automatically unicode like they
are in python 3. That is the root cause, and it can be fixed by
explicitly stating that all literals should be treated as unicode,
which is an import from the `__future__` module.
https://stackoverflow.com/questions/3235386/python-using-format-on-a-unicode-escaped-string
also explains this same solution, which I found helpful.

Closes #2496 from Acehaidrey/master
2017-08-14 15:13:02 -07:00
Stanislav Kudriashev 565423a397 [AIRFLOW-1280] Fix Gantt chart height
Closes #2502 from skudriashev/airflow-1280
2017-08-14 14:19:49 -07:00