Commit graph

3142 commits

Author SHA1 Message Date
Chris Riccomini 959d1fe100 Merge branch '1508' 2016-05-18 19:48:32 -07:00
Chris Riccomini 20a536cc77 Merge branch '1507' 2016-05-18 15:49:15 -07:00
Eric Stern 4b25a7d34e [AIRFLOW-125] Add file to GCS operator
Adds an operator to upload a file to Google Cloud Storage. Used as follows:
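```py
import os

from airflow.contrib.operators.file_to_gcs import FileToGoogleCloudStorageOperator

# `dag` is assumed to be a DAG object defined earlier in the same file.
gcs = FileToGoogleCloudStorageOperator(
    bucket='a-bucket-i-have-access-to-on-gcs',
    dag=dag,
    task_id='upload_stuff',
    google_cloud_storage_conn_id='an-airflow-bigquery-connection',
    # Local file to upload, resolved relative to this DAG file.
    src=os.path.join(os.path.dirname(__file__), 'csv/some_file.csv'),
    # Destination object name inside the bucket.
    dst='project/some_file.csv')
```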
2016-05-18 15:32:57 -07:00
jlowin af43db5af0 [AIRFLOW-86] Wrap dict.items() in list for Py3 compatibility
Author: jlowin <jlowin@users.noreply.github.com>

Closes #1483 from jlowin/AIRFLOW-86.
2016-05-18 14:26:43 -04:00
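A minimal sketch of the Python 3 behavior that typically motivates a change like AIRFLOW-86, assuming the affected code mutates a dictionary while looping over it (the commit message does not say which code was touched; `prune` and `prune_unsafe` are hypothetical helpers, not Airflow functions). In Python 3, `dict.items()` returns a live view, so deleting keys mid-loop raises `RuntimeError`; wrapping the view in `list()` iterates over a snapshot instead.

```python
def prune(mapping, unwanted):
    """Remove entries whose value equals `unwanted`, in place."""
    # list(...) takes a snapshot of the items, so deleting keys from
    # `mapping` inside the loop is safe on both Python 2 and Python 3.
    for key, value in list(mapping.items()):
        if value == unwanted:
            del mapping[key]
    return mapping


def prune_unsafe(mapping, unwanted):
    """Same loop without the list() wrapper; raises RuntimeError on Python 3."""
    for key, value in mapping.items():
        if value == unwanted:
            del mapping[key]  # changes the dict's size during iteration
    return mapping
```

On Python 2 both versions behave the same, because `items()` already returns a list there; only the wrapped form is portable.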
Bolke de Bruin 4343234836 Merge branch 'dag_run' 2016-05-18 19:49:47 +02:00
ppotu f01854a4bb Adding Nerdwallet to the list of Currently officially using Airflow: 2016-05-18 10:25:17 -07:00
Hervé Werner 843a22f74c [AIRFLOW-127] Makes filter_by_owner aware of multi-owner DAG 2016-05-18 12:31:12 +02:00
Bolke de Bruin cb56289743 AIRFLOW-124 Implement create_dagrun
This adds the create_dagrun function to DAG and the staticmethod
DagRun.find. create_dagrun will create a dagrun including its tasks.

By having taskinstances created at dagrun instantiation time,
deadlocks that were tested for will not take place anymore. Tests
have been adjusted accordingly.

In addition, integrity has been improved by a bugfix to add_task
of the BaseOperator to make sure to always assign a Dag if it is
present to a task.

DagRun.find is a convenience function that returns the DagRuns
for a given dag. It provides a single place for finding
dagruns.
2016-05-18 12:30:50 +02:00
Siddharth Anand 4d8578398c Merge branch '1478' 2016-05-17 21:35:13 +00:00
Chris Riccomini abc43c1445 Merge branch '1493' 2016-05-17 08:14:30 -07:00
Hervé Werner 150568228b [AIRFLOW-121] Documenting dag doc_md feature 2016-05-17 09:57:23 +02:00
Arthur Wiedmer 6f4696ba2e [AIRFLOW-109] Fix try catch handling in PrestoHook
This addresses the issue with executing the SQL statement outside of
the try block. In the case of a syntax error in the statement, the
underlying library raises a DatabaseError which was meant to be
handled (i.e., json parsed) by the catch.
2016-05-16 14:12:12 -07:00
Dan Davydov db07e04f97 Merge branch '1503' 2016-05-16 13:15:49 -07:00
Hongbo Zeng 199e07a455 change TARGET_PARTITION_SIZE to DEFAULT_TARGET_PARTITION_SIZE 2016-05-16 11:20:49 -07:00
Bolke de Bruin 72ab63e83d Use incubating instead of incubator in title 2016-05-16 17:52:36 +02:00
Bolke de Bruin fb1616a3c9 Merge branch '1502' 2016-05-16 16:27:52 +02:00
Hongbo Zeng b565ef9952 use targetPartitionSize as the default partition spec 2016-05-14 17:00:42 -07:00
Maxime Beauchemin 0b1c7ffcaa [AIRFLOW-117] fix links in README.md 2016-05-14 16:26:38 -07:00
Dan Davydov 07fe7d7b4a Merge branch '1498' 2016-05-13 15:04:23 -07:00
Dan Davydov 17bcf10fe5 [AIRFLOW-112] no-op README change to close this jira's PR 2016-05-13 14:57:01 -07:00
Dan Davydov 1feac380d5 [AIRFLOW-112] Change default DAG view from tree view to graph view 2016-05-13 13:55:06 -07:00
Dan Davydov 30608b8ba9 Change default DAG view from tree view to graph view 2016-05-13 13:36:19 -07:00
Siddharth Anand 10d70d9d7e Merge branch '1490' 2016-05-13 01:58:22 +00:00
Siddharth Anand ab5d445992 Fix : Don't treat premature tasks as could_not_run tasks 2016-05-13 01:39:39 +00:00
Arthur Wiedmer d18a782f3b Move presto.execute inside try catch to handle error
This commit fixes an issue where malformed SQL would raise a
DatabaseError outside of the try catch block in the hook. This
should now raise a PrestoException as expected.
2016-05-12 16:15:15 -07:00
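The fix described in this commit can be sketched as follows. This is an illustration under stated assumptions, not the actual PrestoHook code: `DatabaseError`, `PrestoException`, `FakeCursor`, and `get_records` are hypothetical stand-ins defined here so the example runs without a Presto driver. The point is only that `execute()` must sit inside the `try` block for the handler to see the driver's error.

```python
import json


class DatabaseError(Exception):
    """Stand-in for the driver's error type (hypothetical)."""


class PrestoException(Exception):
    """Stand-in for the hook's own exception type (hypothetical)."""


class FakeCursor:
    """Hypothetical cursor that rejects anything not starting with SELECT."""

    def execute(self, sql):
        if not sql.upper().startswith("SELECT "):
            # The error payload is JSON-encoded, as the handler expects.
            raise DatabaseError(json.dumps({"message": "syntax error"}))
        self._rows = [(1,)]

    def fetchall(self):
        return self._rows


def get_records(cursor, sql):
    try:
        # execute() is inside the try block, so a syntax error raised by
        # the driver is caught here and re-raised as PrestoException.
        cursor.execute(sql)
        return cursor.fetchall()
    except DatabaseError as exc:
        message = json.loads(str(exc))["message"]
        raise PrestoException(message)
```

If `cursor.execute(sql)` were placed before the `try`, the malformed statement would surface as a raw `DatabaseError` and the JSON parsing in the handler would never run.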
Chris Riccomini 31f01b8380 Revert "ssl gunicorn support"
This reverts commit e332f63620.
2016-05-12 10:35:28 -07:00
Stanilovsky Evgeny e332f63620 ssl gunicorn support 2016-05-12 10:07:45 +03:00
Bolke de Bruin dddfd3b5bf AIRFLOW-92 Avoid unneeded upstream_failed session closes apache/incubator-airflow#1485 2016-05-10 15:08:11 +02:00
jlowin 40b3fffa07 Merge pull request #1378 from jlowin/queued-tasks 2016-05-09 19:20:37 -04:00
jlowin 385add2bf3 AIRFLOW-52 Warn about overwriting tasks in a DAG 2016-05-09 18:53:24 -04:00
jlowin c1aa93f1a7 Add logic to lock DB and avoid race condition
The scheduler can encounter a queued task twice before the
task actually starts to run -- this locks the task and avoids
that condition.
2016-05-09 17:19:02 -04:00
jlowin 43bdd7a4c8 Handle queued tasks from multiple jobs/executors
When the Scheduler is run with `--num-runs`, there can be multiple
Schedulers and Executors all trying to run tasks. For queued tasks,
the Scheduler previously tried to run only the tasks that it had
queued itself, but that doesn't work if the Scheduler is restarting. This PR
reverts that behavior and adds two types of "best effort" executions:
before running a TI, executors check if it is already running, and
before ending, executors call sync() one last time.
2016-05-09 17:18:58 -04:00
Stephen Cattaneo 61f35782fa [AIRFLOW-80] Move example_twitter dag to contrib/example_dags as it requires hive 2016-05-09 12:06:08 -07:00
apapanico eb09609751 [AIRFLOW-75] Fix bug in S3 config file parsing 2016-05-09 10:40:06 -07:00
Waldemar Hummer 7a1fa7b104 AIRFLOW-77: Enable UI toggle whether to apply 'clear' operation recursively to sub-DAGs or not 2016-05-09 12:04:24 +10:00
aaur0 181d37321a Use getfqdn to make sure urls are fully qualified
gethostname only resolves host part while often fully qualified domain names are required.

* Resolves #1437
2016-05-08 19:13:33 +02:00
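A short sketch of the difference this commit relies on, using only the standard library. `socket.gethostname()` and `socket.getfqdn()` are real functions; `base_url` and its default port are hypothetical and serve only to show where a fully qualified name matters when building a link.

```python
import socket


def base_url(port=8080):
    """Build a URL from the fully qualified host name (hypothetical helper)."""
    # getfqdn() tries to resolve the full domain name, whereas
    # gethostname() often returns only the short host part.
    return "http://{}:{}".format(socket.getfqdn(), port)


short_name = socket.gethostname()
full_name = socket.getfqdn()
```

What the two calls return depends on the machine's resolver configuration; on a host without a configured domain they can be identical.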
Jeremiah Lowin ce220e0c38 [AIRFLOW-52] Fix bottlenecks when working with many tasks
Dag hash function tried (and failed) to hash the list of tasks, then fell back on repr-ing the list, which took forever. Instead, hash tuple(task_dict.keys()). In addition this replaces two slow list comprehensions with much faster hash lookups (using the new task_dict).
2016-05-07 13:20:47 +02:00
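The idea in this commit can be illustrated with a small sketch. `MiniDag` is a hypothetical stand-in, not Airflow's `DAG` class; it assumes tasks are stored in a dict keyed by task id, as the name `task_dict` in the message suggests. A list is unhashable, so `hash()` on it raises `TypeError`, while a tuple of string keys hashes directly, and membership tests against the dict avoid scanning a list.

```python
class MiniDag:
    """Hypothetical DAG-like container keyed by task id."""

    def __init__(self, dag_id):
        self.dag_id = dag_id
        self.task_dict = {}

    def add_task(self, task_id, task):
        self.task_dict[task_id] = task

    def has_task(self, task_id):
        # Dict membership is a hash lookup, not a scan over a task list.
        return task_id in self.task_dict

    def __hash__(self):
        # Hash the task ids as a tuple; hashing a list would raise TypeError.
        return hash((self.dag_id, tuple(self.task_dict.keys())))
```

Hashing only the keys keeps the cost proportional to the number of tasks, without building a string representation of every task object.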
Bence Nagy aff5d8c8a2 Add bulk_dump abstract method to DbApiHook (#1471) 2016-05-06 09:21:10 -07:00
Jeremiah Lowin 415b363eb8 Fix corner case with joining processes/queues (#1473)
If a process places items in a queue and the process is joined before the queue is emptied, it can lead to a deadlock under some circumstances. Closes AIRFLOW-61.

See for example: https://docs.python.org/3/library/multiprocessing.html#all-start-methods ("Joining processes that use queues")
http://stackoverflow.com/questions/31665328/python-3-multiprocessing-queue-deadlock-when-calling-join-before-the-queue-is-em
http://stackoverflow.com/questions/31708646/process-join-and-queue-dont-work-with-large-numbers
http://stackoverflow.com/questions/19071529/python-multiprocessing-125-list-never-finishes
2016-05-06 12:11:16 -04:00
Maxime Beauchemin d2f3fb4366 [AIRFLOW-53] Adding DagBag stats report to CLI's list_dags (#1468)
Adding DagBag stats report to CLI's list_dags

Removing logging call in favor of CLI, on-demand based approach

Addressing Dan's feedback
2016-05-06 08:54:31 -07:00
Sid Anand c3614d1867 Merge pull request #1466 from r39132/master
[AIRFLOW-39] Don't insert dag_runs beyond the min task end_date
2016-05-05 19:09:44 +00:00
Siddharth Anand e15a92b669 Don't insert dag_runs beyond the min task end_date 2016-05-05 18:53:54 +00:00
Sid Anand 93538f4682 Merge pull request #1464 from geeknam/feature/cx_oracle_bulk_insert
[AIRFLOW-50] Add bulk_insert_rows() to OracleHook for more performant inserts.
2016-05-05 17:22:25 +00:00
Chris Riccomini 36be57e990 Merge pull request #1453 from alexvanboxel/feature/AIRFLOW-21-upgrade-gcp-lib
AIRFLOW-21 upgrade GCP client lib
2016-05-05 10:05:23 -07:00
Alex Van Boxel b7f0245e36 AIRFLOW-21 upgrade GCP client lib 2016-05-05 08:56:50 +02:00
Nam Ngo bece6af289 Add bulk_insert_rows() for more performant inserts. 2016-05-04 18:21:51 +10:00
Maxime Beauchemin aeb5a07ff9 Docs tweaks while generating the docs 2016-05-03 22:13:35 -07:00
Maxime Beauchemin 3c3f5a67ff [AIRFLOW-42] Adding logging.debug DagBag loading stats (#1460)
* Adding logging.debug DagBag loading stats

* Linting

* Fix py3

* Tweaks
2016-05-03 14:21:21 -07:00
Jeremiah Lowin d6f4d7c063 Merge pull request #1457 from jlowin/docker-import
[AIRFLOW-38] Gracefully fail unit tests when docker-py isn't installed
2016-05-03 15:39:53 -04:00
Sid Anand 2a2b7e8c2b Merge pull request #1430 from whummer/success_recursive
[AIRFLOW-35] Enable UI feature to recursively set success for nested DAG operators
2016-05-03 11:33:04 -07:00