Executes a task in a Kubernetes pod in the specified Google Kubernetes
Engine cluster. This makes it easier to interact with the GCP Kubernetes
Engine service because it encapsulates acquiring credentials.
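For illustration only (not part of the original commit), a minimal usage sketch might look like the following; the class name GKEPodOperator, its import path, and the project/cluster values are assumptions, and a `dag` object is presumed to exist:
```
# Hypothetical sketch: class name, module path, and all argument values are
# assumptions; a `dag` object is assumed to be defined elsewhere.
from airflow.contrib.operators.gcp_container_operator import GKEPodOperator

run_in_gke = GKEPodOperator(
    task_id='run_in_gke_pod',
    project_id='my-gcp-project',      # hypothetical GCP project
    location='us-central1-a',
    cluster_name='my-gke-cluster',    # hypothetical GKE cluster
    name='example-pod',
    namespace='default',
    image='ubuntu:16.04',
    cmds=['echo'],
    arguments=['hello from GKE'],
    dag=dag,
)
```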
The documentation page "Scheduling & Triggers"
only mentions the CLI method to
manually trigger a DAG run.
However, the manual-trigger feature in the Web UI
should be mentioned as well
(it may be used even more frequently).
Just like the partition sensor for Hive,
this PR adds a sensor that waits for
a table to be created in a Cassandra cluster.
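For illustration (not taken from the PR itself), usage might look roughly like this; the class name CassandraTableSensor, its import path, the connection id, and the keyspace/table are all assumptions:
```
# Hedged sketch: import path, class name, and argument names are assumptions;
# a `dag` object is assumed to be defined elsewhere.
from airflow.contrib.sensors.cassandra_sensor import CassandraTableSensor

wait_for_table = CassandraTableSensor(
    task_id='wait_for_cassandra_table',
    cassandra_conn_id='cassandra_default',   # hypothetical connection id
    table='my_keyspace.my_table',            # hypothetical keyspace.table
    poke_interval=60,
    dag=dag,
)
```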
Closes#3518 from sekikn/AIRFLOW-2640
When Airflow was populating a DagBag from a .zip
file, if a single
file in the root directory did not contain the
strings 'airflow' and
'DAG', it would ignore the entire .zip file.
Also added a small amount of logging so as not to
bombard the user with info
about skipping their .py files.
Closes#3505 from Noremac201/dag_name
Add Google Kubernetes Engine create_cluster,
delete_cluster operators
This allows users to use Airflow to create or
delete clusters on
Google Cloud Platform.
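As a hedged sketch of how such operators might be wired into a DAG (the class names, parameter names, and values below are assumptions, not taken from this commit):
```
# Hypothetical sketch only: class and parameter names are assumptions;
# a `dag` object is assumed to be defined elsewhere.
from airflow.contrib.operators.gcp_container_operator import (
    GKEClusterCreateOperator,
    GKEClusterDeleteOperator,
)

create_cluster = GKEClusterCreateOperator(
    task_id='create_gke_cluster',
    project_id='my-gcp-project',
    location='us-central1-a',
    body={'name': 'ephemeral-cluster', 'initial_node_count': 1},
    dag=dag,
)

delete_cluster = GKEClusterDeleteOperator(
    task_id='delete_gke_cluster',
    project_id='my-gcp-project',
    location='us-central1-a',
    name='ephemeral-cluster',
    dag=dag,
)

create_cluster >> delete_cluster
```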
Closes#3477 from Noremac201/gke_create
* Updates the GCP hooks to use the google-auth
library and removes
dependencies on the deprecated oauth2client
package.
* Removes inconsistent handling of the scope
parameter for different
auth methods.
Note: using google-auth for credentials requires a
newer version of the
google-api-python-client package, so this commit
also updates the
minimum version for that.
To avoid some annoying warnings about the
discovery cache not being
supported, the discovery cache is disabled
explicitly, as recommended here:
https://stackoverflow.com/a/44518587/101923
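For reference, the pattern suggested in that answer is to pass cache_discovery=False when building the API client; a minimal sketch (the service name and the authorized `http` object are placeholders, not this commit's code):
```
# Illustrative only: `http` stands in for an already-authorized Http object
# obtained from the hook; the service name/version is a placeholder.
from googleapiclient.discovery import build

service = build('dataflow', 'v1b3', http=http, cache_discovery=False)
```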
Tested by running:
nosetests
tests/contrib/operators/test_dataflow_operator.py
\
tests/contrib/operators/test_gcs*.py \
tests/contrib/operators/test_mlengine_*.py \
tests/contrib/operators/test_pubsub_operator.py \
tests/contrib/hooks/test_gcp*.py \
tests/contrib/hooks/test_gcs_hook.py \
tests/contrib/hooks/test_bigquery_hook.py
and also tested by running some GCP-related DAGs
locally, such as the
Dataproc DAG example at
https://cloud.google.com/composer/docs/quickstart
Closes#3488 from tswast/google-auth
Make sure you have checked _all_ steps below.
### JIRA
- [x] My PR addresses the following [Airflow JIRA]
(https://issues.apache.org/jira/browse/AIRFLOW/)
issues and references them in the PR title. For
example, "\[AIRFLOW-XXX\] My Airflow PR"
-
https://issues.apache.org/jira/browse/AIRFLOW-2526
- In case you are fixing a typo in the
documentation you can prepend your commit with
\[AIRFLOW-XXX\], code changes always need a JIRA
issue.
### Description
- [x] Here are some details about my PR, including
screenshots of any UI changes:
params can be overridden by the dictionary passed
through `airflow backfill -c`
```
from airflow.operators.bash_operator import BashOperator

templated_command = """
    echo "text = {{ params.text }}"
"""
bash_operator = BashOperator(
    task_id='bash_task',
    bash_command=templated_command,
    dag=dag,
    # default value; can be overridden from the CLI
    params={"text": "normal processing"},
)
```
In daily processing it prints:
```
normal processing
```
When backfilling with `airflow backfill -c
'{"text": "override success"}'`, it prints
```
override success
```
### Tests
- [ ] My PR adds the following unit tests __OR__
does not need testing for this extremely good
reason:
### Commits
- [x] My commits all reference JIRA issues in
their subject lines, and I have squashed multiple
commits if they address the same issue. In
addition, my commits follow the guidelines from
"[How to write a good git commit
message](http://chris.beams.io/posts/git-
commit/)":
1. Subject is separated from body by a blank line
2. Subject is limited to 50 characters
3. Subject does not end with a period
4. Subject uses the imperative mood ("add", not
"adding")
5. Body wraps at 72 characters
6. Body explains "what" and "why", not "how"
### Documentation
- [x] In case of new functionality, my PR adds
documentation that describes how to use it.
- When adding new operators/hooks/sensors, the
autoclass documentation generation needs to be
added.
### Code Quality
- [x] Passes `git diff upstream/master -u --
"*.py" | flake8 --diff`
Closes#3422 from milton0825/params-overridden-
through-cli
Make sure you have checked _all_ steps below.
### JIRA
- [x] My PR addresses the following [Airflow JIRA]
(https://issues.apache.org/jira/browse/AIRFLOW/)
issues and references them in the PR title. For
example, "\[AIRFLOW-XXX\] My Airflow PR"
-
https://issues.apache.org/jira/browse/AIRFLOW-2538
- In case you are fixing a typo in the
documentation you can prepend your commit with
\[AIRFLOW-XXX\], code changes always need a JIRA
issue.
### Description
- [x] Here are some details about my PR, including
screenshots of any UI changes:
Update the FAQ doc on how to reduce Airflow
scheduler latency. This comes from our internal
production settings, which also align with Maxime's
email (https://lists.apache.org/thread.html/%3CCAHEEp7WFAivyMJZ0N+0Zd1T3nvfyCJRudL3XSRLM4utSigR3dQmail.gmail.com%3E).
### Tests
- [ ] My PR adds the following unit tests __OR__
does not need testing for this extremely good
reason:
### Commits
- [ ] My commits all reference JIRA issues in
their subject lines, and I have squashed multiple
commits if they address the same issue. In
addition, my commits follow the guidelines from
"[How to write a good git commit
message](http://chris.beams.io/posts/git-
commit/)":
1. Subject is separated from body by a blank line
2. Subject is limited to 50 characters
3. Subject does not end with a period
4. Subject uses the imperative mood ("add", not
"adding")
5. Body wraps at 72 characters
6. Body explains "what" and "why", not "how"
### Documentation
- [ ] In case of new functionality, my PR adds
documentation that describes how to use it.
- When adding new operators/hooks/sensors, the
autoclass documentation generation needs to be
added.
### Code Quality
- [ ] Passes `git diff upstream/master -u --
"*.py" | flake8 --diff`
Closes#3434 from feng-tao/update_faq
Add docs to faq.rst explaining how to deal with
the MySQL exception: "Global variable
explicit_defaults_for_timestamp needs to be on (1)".
Closes#3429 from milton0825/fix-docs
I'd like to have how-to guides for all connection
types, or at least the
different categories of connection types. I found
it difficult to figure
out how to manage a GCP connection, so this commit
adds a how-to guide for
that.
Also, since creating and editing connections
really aren't all that
different, the PR renames the "creating
connections" how-to to "managing
connections".
Closes#3419 from tswast/howto
KubernetesPodOperator now accepts a dict-type
parameter called "affinity", which represents a
group of affinity scheduling rules (nodeAffinity,
podAffinity, podAntiAffinity).
API reference: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.10/#affinity-v1-core
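For illustration (not taken from the PR), passing the new parameter might look like this; the affinity dict follows the Kubernetes Affinity v1 schema linked above, and the node label, image, and names are placeholders:
```
# Hedged sketch: node label, image, and other values are placeholders;
# a `dag` object is assumed to be defined elsewhere.
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator

affinity = {
    'nodeAffinity': {
        'requiredDuringSchedulingIgnoredDuringExecution': {
            'nodeSelectorTerms': [{
                'matchExpressions': [{
                    'key': 'disktype',
                    'operator': 'In',
                    'values': ['ssd'],
                }],
            }],
        },
    },
}

pod_task = KubernetesPodOperator(
    task_id='pod_with_affinity',
    name='pod-with-affinity',
    namespace='default',
    image='ubuntu:16.04',
    cmds=['echo', 'hello'],
    affinity=affinity,
    dag=dag,
)
```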
Closes#3369 from imroc/AIRFLOW-2397
Add AzureDataLakeHook as a first step to enable
Airflow to connect to
Azure Data Lake.
The hook has a simple interface to upload and
download files, exposing all
parameters available in the Azure Data Lake SDK, and
also a check_for_file
to query whether a file exists in the data lake.
[AIRFLOW-2420] Add functionality for Azure Data
Lake
Make sure you have checked _all_ steps below.
### JIRA
- [x] My PR addresses the following [Airflow JIRA]
(https://issues.apache.org/jira/browse/AIRFLOW-2420)
issues and references them in the PR title.
-
https://issues.apache.org/jira/browse/AIRFLOW-2420
### Description
- [x] Here are some details about my PR, including
screenshots of any UI changes:
This PR creates Azure Data Lake hook
(adl_hook.AdlHook) and all the setup required to
create a new Azure Data Lake connection.
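For illustration only, a rough usage sketch based on the description above; the module path, connection id argument, method signatures, and file paths are assumptions, not verified API:
```
# Hedged sketch: module path, constructor argument, method signatures, and
# file paths are assumptions based on the PR description.
from airflow.hooks.adl_hook import AdlHook

hook = AdlHook(azure_data_lake_conn_id='azure_data_lake_default')

if not hook.check_for_file('raw/2018-06-01/events.csv'):
    hook.upload_file(local_path='/tmp/events.csv',
                     remote_path='raw/2018-06-01/events.csv')
```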
### Tests
- [x] My PR adds the following unit tests __OR__
does not need testing for this extremely good
reason:
Adds tests for airflow.hooks.adl_hook.py in
tests.hooks.test_adl_hook.py
### Commits
- [x] My commits all reference JIRA issues in
their subject lines, and I have squashed multiple
commits if they address the same issue. In
addition, my commits follow the guidelines from
"[How to write a good git commit
message](http://chris.beams.io/posts/git-
commit/)":
1. Subject is separated from body by a blank line
2. Subject is limited to 50 characters
3. Subject does not end with a period
4. Subject uses the imperative mood ("add", not
"adding")
5. Body wraps at 72 characters
6. Body explains "what" and "why", not "how"
### Documentation
- [x] In case of new functionality, my PR adds
documentation that describes how to use it.
- When adding new operators/hooks/sensors, the
autoclass documentation generation needs to be
added.
### Code Quality
- [x] Passes `git diff upstream/master -u --
"*.py" | flake8 --diff`
Closes#3333 from marcusrehm/master
Add lineage support by having inlets and outlets
that
are made available to dependent upstream or
downstream
tasks.
If configured to do so, Airflow can send lineage data to a
backend. Apache Atlas is supported out of the box.
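For illustration (not part of this commit message), declaring inlets and outlets on a task might look roughly like this; the File dataset class and the dict-shaped inlets/outlets arguments are assumptions about the lineage API, and the paths are placeholders:
```
# Hedged sketch: the File dataset class and the inlets/outlets argument shape
# are assumptions; a `dag` object is assumed to be defined elsewhere.
from airflow.lineage.datasets import File
from airflow.operators.bash_operator import BashOperator

f_in = File('/tmp/input_data/')
f_out = File('/tmp/output/{{ execution_date }}')

run_this = BashOperator(
    task_id='transform_data',
    bash_command='echo 1',
    inlets={'datasets': [f_in]},
    outlets={'datasets': [f_out]},
    dag=dag,
)
```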
Closes#3321 from bolkedebruin/lineage_exp
This PR adds a previously undocumented AWS-related
operator to the "Integration" section and fixes some
obsolete descriptions.
Closes#3340 from sekikn/AIRFLOW-2446
Searching through all the documentation, I couldn't
find anywhere
that explained what file format is expected for
uploading settings.
Closes#2802 from bovard/variable_files_are_json
I parsed it with the ol' eyeball compiler. Someone
could flake8 it better, perhaps.
Changes:
- correct `def` syntax on line 50
- use literal dict on line 67
Closes#2479 from 0atman/patch-1
In its current form, MesosExecutor schedules tasks
on Mesos slaves as
plain Airflow commands, assuming that the Mesos slaves
already
have Airflow installed and configured on them.
This assumption goes
against the Mesos philosophy of having a
heterogeneous cluster.
Since Mesos provides an option to pull a Docker
image before
running the actual task/command, this
improvement changes
mesos_executor.py to accept an optional Docker
image containing
Airflow, which can be pulled on slaves before
running the actual
Airflow command. This also opens the door for an
optimization of
resources in a future PR, by allowing the
specification of the CPU and
memory needed for each Airflow task.
Closes#3008 from agrajm/AIRFLOW-2068
The logs are kept inside the worker pod. By
attaching a persistent
disk we keep the logs and make them available to
the webserver.
- Remove the requirements.txt since we don't want
to maintain another
dependency file
- Fix some small casing issues
- Remove some unused code
- Add missing shebang lines
- Start on some docs
- Fix the logging
Closes#3252 from Fokko/airflow-2357-pd-for-logs