Commit Graph

3548 Commits

Author SHA1 Message Date
Siddharth Anand 4d567f439b Merge branch '1822' 2016-10-05 08:33:46 -07:00
Ilya Rakoshes c970b09c4c [AIRFLOW-539] Updated BQ hook and BQ operator to support Standard SQL.
Closes #1820 from illop/master
2016-10-05 08:09:21 -07:00
Pablo Seibelt 3bc49f69df Add Auth0 to companies using Airflow
We are using airflow and loving it :)
2016-10-05 10:22:33 -03:00
Daniel van der Ende 573fb991f4 [AIRFLOW-378] Add string casting to params of spark-sql operator
For parameters num_executors and executor_cores
add casts to strings to prevent issues when these
parameters are passed as integers (as comments specify).
Also fix minor typo that breaks the use of num-executors param.

Closes #1694 from danielvdende/spark_sql_operator_bugfixes
2016-10-05 12:56:52 +02:00
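The string casting described in the AIRFLOW-378 entry above can be sketched as follows. This is a minimal illustration, not the operator's actual code: the function name `build_spark_sql_command` and its parameters are hypothetical, and it only shows why integer values need an explicit `str()` cast before they are placed in a command-line argument list.

```python
def build_spark_sql_command(sql, num_executors=None, executor_cores=None):
    """Build an argument list for a spark-sql call (hypothetical helper)."""
    cmd = ["spark-sql", "-e", sql]
    if num_executors is not None:
        # Process argument lists must contain only strings, so an integer
        # passed by the caller is cast explicitly.
        cmd += ["--num-executors", str(num_executors)]
    if executor_cores is not None:
        cmd += ["--executor-cores", str(executor_cores)]
    return cmd


command = build_spark_sql_command("SELECT 1", num_executors=4, executor_cores=2)
```

Without the casts, a list such as `["--num-executors", 4]` would be rejected when handed to `subprocess`, which is the kind of failure the commit message describes.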
Dave Paola 2aaa629a71 [AIRFLOW-545] Add Bloc as Airflow user[]
Closes #1819 from dpaola2/patch-1
https://issues.apache.org/jira/browse/AIRFLOW-545
2016-10-03 15:57:22 -07:00
Sumit Maheshwari 2d07a161d8 [AIRFLOW-544] Add Pause/Resume toggle button
Add Pause/Resume toggle button to DAG details page, so one does not
need to go back and forth to view the details and do the action.

Closes #1818 from msumit/AIRFLOW-544
2016-10-03 14:17:40 -07:00
George Leslie-Waksman eb5982d4aa [AIRFLOW-333][AIRFLOW-258] Fix non-module plugin components
* Distinguish between module and non-module plugin components
* Fix handling of non-module plugin components

  * admin views, flask blueprints, and menu links need to not be
    wrapped in modules

* Fix improper use of zope.deprecation.deprecated

  * zope.deprecation.deprecated does NOT support classes as
    first parameter
  * deprecating classes must be handled by calling the deprecate
    function on the class name

* Added tests for plugin loading
* Updated plugin documentation to match test plugin
* Updated executors to always load plugins
* More logging

Closes #1738 from gwax/plugin_module_fixes
2016-10-01 23:43:20 -07:00
mmaia c37740f531 [AIRFLOW-542] Add tooltip to DAGs links icons
Closes #1817 from mmmaia/master
2016-10-01 23:36:41 -07:00
Daniel Zohar c02425d483 [AIRFLOW-530] Update docs to reflect connection environment var has to be in uppercase
Dear Airflow Maintainers,

Please accept this PR that addresses the following issues:
https://issues.apache.org/jira/browse/AIRFLOW-530

Right now, the documentation does not clearly state that connection
names are converted to uppercase form when searched in the environment
(https://github.com/apache/incubator-airflow/blob/master/airflow/hooks/base_hook.py#L60-L60).
This is confusing as the best practice in Airflow seems to be to define
connections in lower case form.

Closes #1811 from danielzohar/connection_env_var
2016-10-01 00:33:50 -07:00
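The uppercase conversion mentioned in the AIRFLOW-530 entry above can be illustrated with a short sketch. It assumes the `AIRFLOW_CONN_` environment-variable prefix; the helper name `connection_uri_from_env` and its `environ` parameter are hypothetical and exist only to make the example self-contained.

```python
import os

# Assumed prefix for connection environment variables.
CONN_ENV_PREFIX = "AIRFLOW_CONN_"


def connection_uri_from_env(conn_id, environ=None):
    """Look up a connection URI in the environment (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    # The connection id is upper-cased before the lookup, so a connection
    # defined as "my_postgres" is searched as AIRFLOW_CONN_MY_POSTGRES.
    return environ.get(CONN_ENV_PREFIX + conn_id.upper())


env = {"AIRFLOW_CONN_MY_POSTGRES": "postgres://user:secret@host:5432/db"}
uri = connection_uri_from_env("my_postgres", env)
```

A variable exported as `AIRFLOW_CONN_my_postgres` would not be found by this lookup, which is the confusion the documentation change addresses.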
Kevin Mullen 84f6599a67 [AIRFLOW-537] Add WiseBanyan as Airflow user[]
Closes #1815 from kevinjmullen/AIRFLOW-537-WiseBanyan
2016-09-30 18:54:54 -07:00
Sumit Maheshwari 8ca8f66d8f [AIRFLOW-525] Update template_fields in Qubole Op
There were a couple more fields in the Qubole Operator that require
Jinja templating support, so these missed fields were added to
template_fields as well. Also added a missing doc (about notify) and an
example of using macros.

Closes #1808 from msumit/AIRFLOW-525
2016-09-29 16:57:42 -07:00
Alexander Volkmann f0db42c627 [AIRFLOW-480] Support binary file download from GCS
Allow for binary file download from Google Cloud Storage

Closes #1793 from al-xv/feature/allow_binary_input_gcs_hook
2016-09-29 15:55:07 -07:00
Jakob Homan 89dc501582 [AIRFLOW-535][AIRFLOW-1] Add OfferUp as an Airflow user.[]
Closes #1814 from jghoman/AIRFLOW-535
2016-09-29 15:45:41 -07:00
Siddharth Anand 081fd00884 [AIRFLOW-531] Add T2 Systems as Airflow User
Closes #1812 from r39132/master
2016-09-28 08:13:14 -07:00
Siddharth Anand c72c0b760e closes apache/incubator-airflow#1810 *no movement from submitter* 2016-09-28 07:59:00 -07:00
Siddharth Anand 5a8a448f33 closes apache/incubator-airflow#1562 *fixed by another pr* 2016-09-27 17:32:00 -07:00
George Leslie-Waksman edf033be65 [AIRFLOW-198] Implement latest_only_operator
Dear Airflow Maintainers,

Please accept this PR that addresses the following issues:
- https://issues.apache.org/jira/browse/AIRFLOW-198

Testing Done:
- Local testing of dag operation with LatestOnlyOperator
- Unit test added

Closes #1752 from gwax/latest_only
2016-09-27 17:07:14 -07:00
Pedro M Duarte d4013f9190 [AIRFLOW-523] Add AltX as Airflow user
Dear Airflow Maintainers,

Please accept this PR that addresses the following issues:
- [AIRFLOW-523](https://issues.apache.org/jira/browse/AIRFLOW-523)

Testing Done:
- Tests passing in Travis CI

Closes #1807 from PedroMDuarte/add-altx-airflow-user
2016-09-21 18:44:59 -07:00
aj e54a855779 [AIRFLOW-521] Add IFTTT as Airflow user
We at IFTTT currently use Airflow to monitor and
schedule all our Data Pipelines and ETLs.

Closes #1805 from apurvajoshi/patch-1
2016-09-20 13:14:12 -07:00
Felipe Benevides 5df21cf213 [AIRFLOW-519] Add 99 as an Airflow user
Closes #1795 from fbenevides/patch-1
2016-09-20 11:18:56 -07:00
Casey Ching b28cedb98d [AIRFLOW-91] Add SSL config option for the webserver
SSL can now be enabled by providing certificate and key in the usual
ways (config file or CLI options). Providing the cert and key will
automatically enable SSL. The web server port will not automatically
change.

The Security page in the docs now includes an SSL section with basic
setup information.

Closes #1760 from caseyching/master
2016-09-19 15:55:10 +02:00
Alexander Shorin 4905a5563d [AIRFLOW-191] Fix connection leak with PostgreSQL backend
This issue happens because the job falls asleep during heartbeat without
closing its session, which holds a connection. This turns the database
connection into the IDLE state but doesn't release it for other clients,
so when the connection pool gets exhausted, they are blocked for roughly
the heartbeat timeframe, causing global performance degradation.

Closes #1790 from kxepal/AIRFLOW-191-postgresql-connection-leak
2016-09-19 15:44:39 +02:00
David Gingrich ff45d8f221 [AIRFLOW-512] Fix 'bellow' typo in docs & comments
Dear Airflow Maintainers,

Please accept this PR that addresses the following issues:
- https://issues.apache.org/jira/browse/AIRFLOW-512

Testing Done:
- N/A, but ran core tests: `./run_unit_tests.sh tests.core:CoreTest -s`

Closes #1800 from dgingrich/master
2016-09-16 09:45:12 -07:00
Ilya Rakoshes 0e3ed447b5 [AIRFLOW-509][AIRFLOW-1] Create operator to delete tables in BigQuery
We have a use case to delete BigQuery tables and views. This patch
adds a delete operator that allows us to do so.

Closes #1798 from illop/BigQueryDeleteOperator
2016-09-15 10:17:04 -07:00
Julian V. Modesto fb7b98b636 [AIRFLOW-498] Remove hard-coded gcp project id
Closes #1786 from julianvmodesto/bug/AIRFLOW-498--fix-gcp-dataflow-hook
2016-09-14 08:35:35 -04:00
jlowin 8fba0acab6 [AIRFLOW-505] Support unicode characters in authors' names
Closes #1792 from jlowin/pr-tool-ascii
2016-09-14 08:32:22 -04:00
Bolke de Bruin 1a0d07e496 Do not use migrations in coverage 2016-09-13 23:29:13 +02:00
Bolke de Bruin c21b11416e Do not include testing and directories in coverage reporting 2016-09-13 17:28:19 +02:00
Hervé afed622d83 [AIRFLOW-469] Add MFG Labs as Airflow user
Hello.

We'd like to be added as official Airflow users.

Regards

Closes #1768 from dud225/patch-1
2016-09-11 12:17:12 -04:00
Dan Davydov daa326cb4d [AIRFLOW-494] Add per-operator success/failure metrics
Adds metrics for success/failure rates of each operator, so that when we
e.g. do a new release we will have some signal if there is a regression
in an operator. It will also be useful if e.g. a user wants to upgrade
their infrastructure and make sure that all of the operators still work
as expected.

Testing Done:
- Local staging, making sure that several operators' successes/failures
  were accurately reflected

Closes #1785 from aoen/ddavydov/add_per_operator_success_fail_metrics
2016-09-09 10:37:32 -07:00
Bolke de Bruin 3a1be4aacf Revert "[AIRFLOW-78] airflow clear leaves dag_runs"
This reverts commit 197c9050ef.

Regressions were observed and tasks were not scheduled in case of
max_dag_runs reached.
2016-09-09 11:34:46 +02:00
forevernull 32f3c1c5d4 [AIRFLOW-488] Fix test_simple fail
Make unittest test_simple pass on all platforms including macOS.

Closes #1782 from forevernull/master
2016-09-09 12:19:53 +05:30
Alex Van Boxel 247955d422 [AIRFLOW-468] Update Pandas requirement to 0.17.1
BigQuery Hook requires at least Pandas 0.17.1

Closes #1767 from alexvanboxel/feature/panda-upgrade
2016-09-04 15:16:40 +02:00
Alex Van Boxel c08b52aadb [AIRFLOW-159] Add cloud integration section + GCP documentation
Closes #1773 from alexvanboxel/feature/gcp-docs
2016-09-04 15:15:07 +02:00
Alex Van Boxel 86fe23f111 [AIRFLOW-477][AIRFLOW-478] Restructure security section for clarity
Closes #1775 from alexvanboxel/docs/security
2016-09-04 15:13:18 +02:00
Alex Van Boxel c6cc01f4e9 [AIRFLOW-467] Allow defining of project_id in BigQueryHook
Introduced a top-level table splitter that also allows for defining a
project_id as a suffix separated by the legacy colon or the new dot. If
the project is not defined, the default project_id will be used.

As it's so common to all the BigQuery classes (Hook, Cursor, ...), the
same splitter is used across all those classes.

The documentation is adapted to allow defining the suffix separated
with the legacy colon as well as with the new SQL dotted notation.

The change is 100% backwards compatible. Unit tests are added for all
scenarios, including negative and compatibility.

Closes #1781 from alexvanboxel/feature/bq_tablename
2016-09-04 15:09:15 +02:00
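The table splitter described in the AIRFLOW-467 entry above could look roughly like the sketch below. It is an illustration under assumptions, not the BigQueryHook implementation: the function name `split_tablename`, its signature, and its error handling are hypothetical, and project ids that themselves contain a colon or dot are not handled.

```python
def split_tablename(table_input, default_project_id):
    """Split a table reference into (project_id, dataset_id, table_id).

    Hypothetical helper. Accepted forms:
      project:dataset.table   (legacy colon notation)
      project.dataset.table   (dotted SQL notation)
      dataset.table           (falls back to default_project_id)
    """
    if ":" in table_input:
        project_id, rest = table_input.split(":", 1)
    elif table_input.count(".") == 2:
        project_id, rest = table_input.split(".", 1)
    else:
        project_id, rest = default_project_id, table_input

    parts = rest.split(".")
    if not project_id or len(parts) != 2 or not all(parts):
        raise ValueError("Expected [project(:|.)]dataset.table, got %r" % table_input)
    return project_id, parts[0], parts[1]


legacy = split_tablename("my-project:my_dataset.my_table", "default-project")
dotted = split_tablename("my-project.my_dataset.my_table", "default-project")
short = split_tablename("my_dataset.my_table", "default-project")
```

Keeping the parsing in one function is what lets the hook, cursor, and other classes share identical behaviour, as the commit message notes.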
alexf 2de790a988 [AIRFLOW-483] Change print to logging statement
Dear Airflow Maintainers,

Please accept this PR that addresses the following
issues:

https://issues.apache.org/jira/browse/AIRFLOW-483

Testing Done:

This fix prevented the stdout from being spammed
by the file content.

Closes #1780 from skogsbaeck/fix/gcs_download_operator
2016-09-02 23:59:30 +08:00
Georg Walther 0bdcdbe556 [AIRFLOW-481] Add Markovian
Dear Airflow Maintainers,

Please accept this PR that addresses the following issues:
- https://issues.apache.org/jira/browse/AIRFLOW-481

Closes #1777 from waltherg/patch-2
2016-09-01 11:09:36 -07:00
Alex Van Boxel 6b6faeecbf [AIRFLOW-476] Add link to Apache Incubation page
Closes #1774 from alexvanboxel/docs/incub-status
2016-09-01 15:29:33 +05:30
Hongbo Zeng 0e568602f8 [AIRFLOW-475] make the segment granularity in Druid hook configurable
The Druid hook currently hardcodes `segmentGranularity` to "DAY"; it
needs to be configurable for different use cases.

mistercrunch aoen plypaul

Closes #1771 from hongbozeng/hongbo/segment_granularity
2016-08-31 15:59:18 -07:00
Bolke de Bruin e6cc1c0551 Merge pull request #1772 from mistercrunch/v1_7_2 2016-08-31 22:19:54 +02:00
Maxime Beauchemin 0e95d67ca4 Dropping .txt extension on repo's root files 2016-08-31 13:09:01 -07:00
Maxime Beauchemin 7bb750d7d2 Bumping to v1.7.2.dev0 2016-08-31 13:02:18 -07:00
Maxime Beauchemin 216e5c3c6b Bump version number to v1.7.2 2016-08-31 13:02:09 -07:00
Maxime Beauchemin 12fa55db3c Adding DISCLAIMER.txt file to the repo 2016-08-31 13:00:45 -07:00
Maxime Beauchemin 259c634434 Removing highchart reference from NOTICE.txt 2016-08-31 13:00:35 -07:00
Tamas Szuromi 6b9ad4bf47 [AIRFLOW-472] Add liligo as an Airflow user
Dear Airflow Maintainers,

Could you pls add liligo to the Airflow users list
in the Readme?

Thanks in advance!

Please accept this PR that addresses the following issues:
- https://issues.apache.org/jira/browse/AIRFLOW-472

Closes #1769 from tromika/liligo
2016-08-30 21:07:59 -07:00
Alex Van Boxel 465dfd9ba2 [AIRFLOW-466] Add Vente-Exclusive.com as an official Airflow user
Please accept this PR that addresses the following issues:
(https://issues.apache.org/jira/browse/AIRFLOW-466/).

Trivial doc add

Closes #1766 from alexvanboxel/feature/vex-as-user
2016-08-30 21:04:03 -07:00
Sumit Maheshwari d649cfa86c [AIRFLOW-463] Link Airflow icon to landing page
Dear Airflow Maintainers,

Please accept this PR that addresses the following issues:
- *https://issues.apache.org/jira/browse/AIRFLOW-463*

As of now, the Airflow image icon at the top left doesn't lead users
anywhere. It should take users to the initial landing page, as it does
on most other sites.

Closes #1764 from msumit/AIRFLOW-463
2016-08-28 10:17:27 -07:00
Dan Davydov f360414774 [AIRFLOW-149] Task Dependency Engine + Why Isn't My Task Running View
Here is the original PR with Max's LGTM:
https://github.com/aoen/incubator-airflow/pull/1
Since then I have made some fixes but this PR is essentially the same.
It could definitely use more eyes as there are likely still issues.

**Goals**
- Simplify, consolidate, and make consistent the logic of whether or not
  a task should be run
- Provide a view/better logging that gives insight into why a task
  instance is not currently running (no more viewing the scheduler logs
  to find out why a task instance isn't running for the majority of
  cases):
![image](https://cloud.githubusercontent.com/assets/1592778/17637621/aa669f5e-6099-11e6-81c2-d988d2073aac.png)

**Notable Functional Changes**
- Webserver view + task_failing_deps CLI command to explain why a given
  task instance isn't being run by the scheduler
- Running a backfill in the command line and running a task in the UI
  will now display detailed error messages based on which dependencies
  were not met for a task instead of appearing to succeed but actually
  failing silently
- Backfill now has the equivalent of the old force flag to run even for
  successful tasks
- Maximum task concurrency and pools are now respected by backfills.
  This will break one use case:
  Using pools to restrict some resource on airflow executors themselves
  (rather than an external resource like a DB), e.g. some task uses 60%
  of cpu on a worker so we restrict that task's pool size to 1 to
  prevent two of the tasks from running on the same host. When
  backfilling a task of this type, now the backfill will wait on the
  pool to have slots open up before running the task even though we
  don't need to do this if backfilling on a different host outside of
  the pool. I think breaking this use case is OK since the use case is a
  hack due to not having a proper resource isolation solution (e.g.
  mesos should be used in this case instead).
- To make things less confusing for users, there is now an "ignore all
  dependencies" option for running tasks, "ignore dependencies" has been
  renamed to "ignore task dependencies", and "force" has been renamed to
  "ignore task instance state". The new "ignore all dependencies" flag
  will ignore the following:
  - task instance's pool being full
  - execution date for a task instance being in the future
  - a task instance being in the retry waiting period
  - the task instance's task ending prior to the task instance's
    execution date
  - task instance is already queued
  - task instance has already completed
  - task instance is in the shutdown state
  - WILL NOT IGNORE task instance is already running
- SLA miss emails will now include all tasks that did not finish for a
  particular DAG run, even if the tasks didn't run because
  depends_on_past was not met for a task
- Tasks with pools won't get queued automatically the first time they
  reach a worker; if they are ready to run they will be run immediately
- Running a task via the UI or via the command line (backfill/run
  commands) will now log why a task could not get run if one of its
  dependencies isn't met. For tasks kicked off via the web UI this
  means that tasks don't silently fail to get queued despite a
  successful message in the UI.
- Queuing a task into a pool that doesn't exist will now get stopped in
  the scheduler instead of a worker

**Follow Up Items**
- Update the docs to reference the new explainer views/CLI command

Closes #1729 from aoen/ddavydov/blockedTIExplainerRebasedMaster
2016-08-26 15:07:44 -07:00
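The idea behind the dependency engine in the AIRFLOW-149 entry above, collecting the reasons a task instance cannot run instead of failing silently, can be sketched as follows. Everything here is hypothetical and greatly simplified: the `TaskInstance` tuple, the `failing_deps` function, and the particular checks are illustrative stand-ins rather than Airflow's actual classes. The sketch only mirrors the stated rule that "ignore all dependencies" skips every check except "task instance is already running".

```python
from collections import namedtuple
from datetime import datetime

# Hypothetical, minimal stand-in for a task instance.
TaskInstance = namedtuple("TaskInstance", "state execution_date pool_open_slots")


def failing_deps(ti, now, ignore_all_deps=False):
    """Return human-readable reasons why the task instance cannot run."""
    reasons = []
    # This check is never ignored, even with ignore_all_deps.
    if ti.state == "running":
        reasons.append("task instance is already running")
    if ignore_all_deps:
        return reasons
    if ti.execution_date > now:
        reasons.append("execution date is in the future")
    if ti.pool_open_slots <= 0:
        reasons.append("task instance's pool is full")
    if ti.state == "success":
        reasons.append("task instance has already completed")
    return reasons


now = datetime(2016, 8, 26)
blocked = TaskInstance("success", datetime(2016, 8, 27), 0)
for reason in failing_deps(blocked, now):
    print(reason)
```

Returning a list of reasons, rather than a single boolean, is what makes a "why isn't my task running" view or CLI command possible.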