Chris Riccomini
959d1fe100
Merge branch '1508'
2016-05-18 19:48:32 -07:00
Chris Riccomini
20a536cc77
Merge branch '1507'
2016-05-18 15:49:15 -07:00
Eric Stern
4b25a7d34e
[AIRFLOW-125] Add file to GCS operator
...
Adds an operator to upload a file to Google Cloud Storage. Used as follows:
```py
import os

from airflow.contrib.operators.file_to_gcs import FileToGoogleCloudStorageOperator

# `dag` is assumed to be an existing DAG object defined elsewhere.
gcs = FileToGoogleCloudStorageOperator(
    bucket='a-bucket-i-have-access-to-on-gcs',
    dag=dag,
    task_id='upload_stuff',
    google_cloud_storage_conn_id='an-airflow-bigquery-connection',
    src=os.path.join(os.path.dirname(__file__), 'csv/some_file.csv'),
    dst='project/some_file.csv')
```
2016-05-18 15:32:57 -07:00
jlowin
af43db5af0
[AIRFLOW-86] Wrap dict.items() in list for Py3 compatibility
...
Author: jlowin <jlowin@users.noreply.github.com>
Closes #1483 from jlowin/AIRFLOW-86.
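A minimal sketch of why the wrapping matters, assuming the loop modifies the dict it iterates over. The function name and sample data are hypothetical and are not taken from the Airflow code base.

```python
def drop_none_values(d):
    """Remove keys whose value is None, mutating d in place."""
    # list(...) takes a snapshot of the items, so the dict can be changed
    # inside the loop. On Python 3, d.items() is a live view, and deleting
    # keys while iterating it directly can raise RuntimeError.
    for key, value in list(d.items()):
        if value is None:
            del d[key]
    return d


config = {"a": 1, "b": None, "c": 3}
drop_none_values(config)
```

On Python 2, `items()` already returned a list, so the wrapper only costs an extra copy there.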
2016-05-18 14:26:43 -04:00
Bolke de Bruin
4343234836
Merge branch 'dag_run'
2016-05-18 19:49:47 +02:00
ppotu
f01854a4bb
Adding Nerdwallet to the list of Currently officially using Airflow:
2016-05-18 10:25:17 -07:00
Hervé Werner
843a22f74c
[AIRFLOW-127] Makes filter_by_owner aware of multi-owner DAG
2016-05-18 12:31:12 +02:00
Bolke de Bruin
cb56289743
AIRFLOW-124 Implement create_dagrun
...
This adds the create_dagrun function to DAG and the staticmethod
DagRun.find. create_dagrun will create a dagrun including its tasks.
By having taskinstances created at dagrun instantiation time,
deadlocks that were tested for will not take place anymore. Tests
have been adjusted accordingly.
In addition, integrity has been improved by a bugfix to add_task
of the BaseOperator to make sure to always assign a Dag if it is
present to a task.
DagRun.find is a convenience function that returns the DagRuns
for a given dag. It provides a single place for looking up
dagruns.
2016-05-18 12:30:50 +02:00
Siddharth Anand
4d8578398c
Merge branch '1478'
2016-05-17 21:35:13 +00:00
Chris Riccomini
abc43c1445
Merge branch '1493'
2016-05-17 08:14:30 -07:00
Hervé Werner
150568228b
[AIRFLOW-121] Documenting dag doc_md feature
2016-05-17 09:57:23 +02:00
Arthur Wiedmer
6f4696ba2e
[AIRFLOW-109] Fix try catch handling in PrestoHook
...
This addresses the issue with executing the SQL statement outside of
the try block. In the case of a syntax error in the statement, the
underlying library raises a DatabaseError, which was meant to be
handled (i.e., json parsed) by the catch.
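The pattern described above can be sketched as follows. This is an illustration only: `DatabaseError`, `PrestoException`, `FakeCursor`, `run_query`, and the JSON error payload are hypothetical stand-ins, not the actual PrestoHook code or the driver's real error format.

```python
import json


class DatabaseError(Exception):
    """Stand-in for the driver's error type (hypothetical)."""


class PrestoException(Exception):
    """Stand-in for the hook's own exception type (hypothetical)."""


class FakeCursor:
    """Tiny fake cursor that rejects anything that is not a SELECT."""

    def execute(self, sql):
        if not sql.upper().startswith("SELECT "):
            payload = {"errorName": "SYNTAX_ERROR", "message": "bad statement"}
            raise DatabaseError(json.dumps(payload))

    def fetchall(self):
        return [(1,)]


def run_query(cursor, sql):
    try:
        # execute() sits inside the try block, so a syntax error raised
        # here is caught and translated below instead of escaping raw.
        cursor.execute(sql)
        return cursor.fetchall()
    except DatabaseError as e:
        info = json.loads(str(e))
        raise PrestoException("{}: {}".format(info["errorName"], info["message"]))
```

With the call outside the `try`, the malformed statement would surface as the driver's error rather than the hook's.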
2016-05-16 14:12:12 -07:00
Dan Davydov
db07e04f97
Merge branch '1503'
2016-05-16 13:15:49 -07:00
Hongbo Zeng
199e07a455
change TARGET_PARTITION_SIZE to DEFAULT_TARGET_PARTITION_SIZE
2016-05-16 11:20:49 -07:00
Bolke de Bruin
72ab63e83d
Use incubating instead of incubator in title
2016-05-16 17:52:36 +02:00
Bolke de Bruin
fb1616a3c9
Merge branch '1502'
2016-05-16 16:27:52 +02:00
Hongbo Zeng
b565ef9952
use targetPartitionSize as the default partition spec
2016-05-14 17:00:42 -07:00
Maxime Beauchemin
0b1c7ffcaa
[AIRFLOW-117] fix links in README.md
2016-05-14 16:26:38 -07:00
Dan Davydov
07fe7d7b4a
Merge branch '1498'
2016-05-13 15:04:23 -07:00
Dan Davydov
17bcf10fe5
[AIRFLOW-112] no-op README change to close this jira's PR
2016-05-13 14:57:01 -07:00
Dan Davydov
1feac380d5
[AIRFLOW-112] Change default DAG view from tree view to graph view
2016-05-13 13:55:06 -07:00
Dan Davydov
30608b8ba9
Change default DAG view from tree view to graph view
2016-05-13 13:36:19 -07:00
Siddharth Anand
10d70d9d7e
Merge branch '1490'
2016-05-13 01:58:22 +00:00
Siddharth Anand
ab5d445992
Fix : Don't treat premature tasks as could_not_run tasks
2016-05-13 01:39:39 +00:00
Arthur Wiedmer
d18a782f3b
Move presto.execute inside try catch to handle error
...
This commit fixes an issue where malformed SQL would raise a
DatabaseError outside of the try catch block in the hook. This
should now raise a PrestoException as expected.
2016-05-12 16:15:15 -07:00
Chris Riccomini
31f01b8380
Revert "ssl gunicorn support"
...
This reverts commit e332f63620.
2016-05-12 10:35:28 -07:00
Stanilovsky Evgeny
e332f63620
ssl gunicorn support
2016-05-12 10:07:45 +03:00
Bolke de Bruin
dddfd3b5bf
AIRFLOW-92 Avoid unneeded upstream_failed session closes apache/incubator-airflow#1485
2016-05-10 15:08:11 +02:00
jlowin
40b3fffa07
Merge pull request #1378 from jlowin/queued-tasks
2016-05-09 19:20:37 -04:00
jlowin
385add2bf3
AIRFLOW-52 Warn about overwriting tasks in a DAG
2016-05-09 18:53:24 -04:00
jlowin
c1aa93f1a7
Add logic to lock DB and avoid race condition
...
The scheduler can encounter a queued task twice before the
task actually starts to run -- this locks the task and avoids
that condition.
2016-05-09 17:19:02 -04:00
jlowin
43bdd7a4c8
Handle queued tasks from multiple jobs/executors
...
When the Scheduler is run with `--num-runs`, there can be multiple
Schedulers and Executors all trying to run tasks. For queued tasks,
the Scheduler previously tried to run only the tasks that it had
queued itself, but that doesn't work if the Scheduler is restarting.
This PR reverts that behavior and adds two types of "best effort"
execution: before running a TI, executors check whether it is already
running, and before ending, executors call sync() one last time.
2016-05-09 17:18:58 -04:00
Stephen Cattaneo
61f35782fa
[AIRFLOW-80] Move example_twitter dag to contrib/example_dags as it requires hive
2016-05-09 12:06:08 -07:00
apapanico
eb09609751
[AIRFLOW-75] Fix bug in S3 config file parsing
2016-05-09 10:40:06 -07:00
Waldemar Hummer
7a1fa7b104
AIRFLOW-77: Enable UI toggle whether to apply 'clear' operation recursively to sub-DAGs or not
2016-05-09 12:04:24 +10:00
aaur0
181d37321a
Use getfqdn to make sure urls are fully qualified
...
gethostname only resolves host part while often fully qualified domain names are required.
* Resolves #1437
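A short sketch of the difference, using only the standard library. The helper name `build_base_url` and the port number are assumptions for illustration; they do not come from the commit.

```python
import socket


def build_base_url(port=8080):
    """Build a URL from the fully qualified domain name of this host."""
    # gethostname() may return only the short host part (for example "worker1"),
    # while getfqdn() tries to resolve the full name (for example
    # "worker1.example.com") and falls back to the host name if it cannot.
    fqdn = socket.getfqdn()
    return "http://{}:{}".format(fqdn, port)


short_name = socket.gethostname()
base_url = build_base_url()
```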
2016-05-08 19:13:33 +02:00
Jeremiah Lowin
ce220e0c38
[AIRFLOW-52] Fix bottlenecks when working with many tasks
...
Dag hash function tried (and failed) to hash the list of tasks, then fell back on repr-ing the list, which took forever. Instead, hash tuple(task_dict.keys()). In addition this replaces two slow list comprehensions with much faster hash lookups (using the new task_dict).
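A rough sketch of the two ideas, assuming a simplified stand-in class. `MiniDag` and its methods are hypothetical and much smaller than the real DAG class.

```python
class MiniDag:
    """Simplified stand-in that keeps tasks in a dict keyed by task_id."""

    def __init__(self, dag_id):
        self.dag_id = dag_id
        self.task_dict = {}

    def add_task(self, task_id, task):
        self.task_dict[task_id] = task

    def has_task(self, task_id):
        # Dict membership is a hash lookup, not a scan over a task list.
        return task_id in self.task_dict

    def __hash__(self):
        # A tuple of task ids is hashable and cheap to build; a list of
        # task objects is not hashable and is slow to repr for large DAGs.
        return hash((self.dag_id, tuple(self.task_dict.keys())))


dag_a = MiniDag("etl")
dag_b = MiniDag("etl")
for name in ("extract", "transform", "load"):
    dag_a.add_task(name, object())
    dag_b.add_task(name, object())
```

Two instances with the same id and the same task ids in the same order hash to the same value here, regardless of the task objects themselves.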
2016-05-07 13:20:47 +02:00
Bence Nagy
aff5d8c8a2
Add bulk_dump abstract method to DbApiHook (#1471)
2016-05-06 09:21:10 -07:00
Jeremiah Lowin
415b363eb8
Fix corner case with joining processes/queues (#1473)
...
If a process places items in a queue and the process is joined before the queue is emptied, it can lead to a deadlock under some circumstances. Closes AIRFLOW-61.
See for example: https://docs.python.org/3/library/multiprocessing.html#all-start-methods ("Joining processes that use queues")
http://stackoverflow.com/questions/31665328/python-3-multiprocessing-queue-deadlock-when-calling-join-before-the-queue-is-em
http://stackoverflow.com/questions/31708646/process-join-and-queue-dont-work-with-large-numbers
http://stackoverflow.com/questions/19071529/python-multiprocessing-125-list-never-finishes
2016-05-06 12:11:16 -04:00
Maxime Beauchemin
d2f3fb4366
[AIRFLOW-53] Adding DagBag stats report to CLI's list_dags (#1468)
...
Adding DagBag stats report to CLI's list_dags
Removing logging call in favor of CLI, on-demand based approach
Addressing Dan's feedback
2016-05-06 08:54:31 -07:00
Sid Anand
c3614d1867
Merge pull request #1466 from r39132/master
...
[AIRFLOW-39] Don't insert dag_runs beyond the min task end_date
2016-05-05 19:09:44 +00:00
Siddharth Anand
e15a92b669
Don't insert dag_runs beyond the min task end_date
2016-05-05 18:53:54 +00:00
Sid Anand
93538f4682
Merge pull request #1464 from geeknam/feature/cx_oracle_bulk_insert
...
[AIRFLOW-50] Add bulk_insert_rows() to OracleHook for more performant inserts.
2016-05-05 17:22:25 +00:00
Chris Riccomini
36be57e990
Merge pull request #1453 from alexvanboxel/feature/AIRFLOW-21-upgrade-gcp-lib
...
AIRFLOW-21 upgrade GCP client lib
2016-05-05 10:05:23 -07:00
Alex Van Boxel
b7f0245e36
AIRFLOW-21 upgrade GCP client lib
2016-05-05 08:56:50 +02:00
Nam Ngo
bece6af289
Add bulk_insert_rows() for more performant inserts.
2016-05-04 18:21:51 +10:00
Maxime Beauchemin
aeb5a07ff9
Docs tweaks while generating the docs
2016-05-03 22:13:35 -07:00
Maxime Beauchemin
3c3f5a67ff
[AIRFLOW-42] Adding logging.debug DagBag loading stats (#1460)
...
* Adding logging.debug DagBag loading stats
* Linting
* Fix py3
* Tweaks
2016-05-03 14:21:21 -07:00
Jeremiah Lowin
d6f4d7c063
Merge pull request #1457 from jlowin/docker-import
...
[AIRFLOW-38] Gracefully fail unit tests when docker-py isn't installed
2016-05-03 15:39:53 -04:00
Sid Anand
2a2b7e8c2b
Merge pull request #1430 from whummer/success_recursive
...
[AIRFLOW-35] Enable UI feature to recursively set success for nested DAG operators
2016-05-03 11:33:04 -07:00