Commit Graph

645 Commits

Author SHA1 Message Date
Tao Feng d151f75d9d [AIRFLOW-XXX] Update committer list based on latest TLP discussion (#4427) 2019-01-05 14:10:31 +00:00
Tao Feng 67572025cc [AIRFLOW-3612] Remove incubation/incubator mention (#4419) 2019-01-05 14:05:25 +00:00
Jarek Potiuk c31c2bde25 [AIRFLOW-3480] Add GCP Spanner Database Operators (#4353) 2019-01-05 12:06:56 +00:00
Dariusz Aniszewski d0233ba643 [AIRFLOW-3446] Add Google Cloud BigTable operators (#4354) 2019-01-04 13:50:15 +00:00
Tao Feng 02106e77ca [AIRFLOW-XXX] Add a doc on how to add a new role in RBAC UI (#4426) 2019-01-03 11:34:49 -08:00
Kaxil Naik 55ec82439e [AIRFLOW-3560] Add DayOfWeek Sensor (#4363)
* [AIRFLOW-3560] Add WeekEnd & DayOfWeek Sensors

* Change to using Enum

* Fix Docstring

* Refactor into a Single Sensor
2019-01-03 10:35:35 +01:00
Jimmy Cao 9266c0fb60 [AIRFLOW-2548] Output plugin import errors to web UI (#3930) 2019-01-02 14:11:15 +01:00
Riccardo Bini 9de9721b48 [AIRFLOW-3281] Fix Kubernetes operator with git-sync (#3770)
* Refactor Kubernetes operator with git-sync

Currently the implementation of git-sync is broken because:
- git-sync clones the repository in /tmp and not in airflow-dags volume
- git-sync adds a link to point to the revision required but it is not
taken into account in AIRFLOW__CORE__DAGS_FOLDER

Dags/logs hostPath volume has been added (needed if airflow runs in
kubernetes in a local environment)

To avoid false positives in CI `load_examples` is set to `False`,
otherwise DAGs from `airflow/example_dags` are always loaded. This
makes it possible to test `import` in DAGs

Remove `worker_dags_folder` config:
`worker_dags_folder` is redundant and can lead to confusion.
In WorkerConfiguration `self.kube_config.dags_folder` defines the path of
the dags and can be set in the worker using airflow_configmap
Refactor worker_configuration.py
Use a docker container to run setup.py
Compile web assets
Fix codecov application path

* Fix kube_config.dags_in_image
2018-12-30 21:03:32 -08:00
yileic 319a659dda [AIRFLOW-XXX] Update tutorial.rst (#4336) 2018-12-26 16:45:19 -08:00
Omeed Musavi 01880dcb3f [AIRFLOW-2568] Azure Container Instances operator (#4121)
Add an operator to create a Docker container in Azure Container
Instances. Azure Container Instances hosts a container and abstracts
away the infrastructure around orchestration of a container service.

Operator supports creating an ACI container and pulling an image from Azure
Container Registry or public Docker registries.
2018-12-26 21:06:28 +01:00
eladkal 69c75b117c [AIRFLOW-1684] - Branching based on XCom variable (Docs) (#4365)
Elaborate how to use branching with xcoms
2018-12-26 20:56:53 +01:00
Tao Feng 0e36566595 [AIRFLOW-850] Add a PythonSensor (#4349) 2018-12-22 18:13:39 +00:00
BasPH c9a82d48a3 [AIRFLOW-3458] Move models.Connection into separate file (#4335) 2018-12-20 13:15:37 +01:00
Szymon Przedwojski 16d69a272b [AIRFLOW-3398] Google Cloud Spanner instance database query operator (#4314) 2018-12-19 21:41:53 +00:00
Felix c68ec18a6b [AIRFLOW-XXX] Add missing remote logging field (#4333) 2018-12-17 18:05:21 +00:00
Kaxil Naik e44fc88397 [AIRFLOW-3447] Add 2 options for ts_nodash Macro (#4323) 2018-12-15 23:13:36 +00:00
tal181 d7c954c97e [AIRFLOW-3411] Add OpenFaaS hook (#4267) 2018-12-13 01:23:47 +00:00
Szymon Przedwojski a8386233aa [AIRFLOW-3310] Google Cloud Spanner deploy / delete operators (#4286) 2018-12-13 01:15:43 +00:00
Kaxil Naik 437196917f [AIRFLOW-XXX] Fix Minor issues with Azure Cosmos Operator (#4289)
- Fixed Documentation in integration.rst
- Fixed Incorrect type in docstring of `AzureCosmosInsertDocumentOperator`
- Added the Hook, Sensor and Operator in code.rst
- Updated the name of example DAG and its filename to follow the convention
2018-12-08 14:31:39 -08:00
yangaws 788bd6fcb3 [AIRFLOW-2524] Add SageMaker doc to AWS integration section (#4278) 2018-12-06 19:51:11 +00:00
Tom Miller 9a80ab04f5 [AIRFLOW-3406] Implement an Azure CosmosDB operator (#4265)
Add an operator and hook to manipulate and use Azure
CosmosDB documents, including creation, deletion, and
updating documents and collections.

Includes sensor to detect documents being added to a
collection.
2018-12-06 10:18:44 -08:00
Kaxil Naik 9dce1f0740 [AIRFLOW-3408] Remove outdated info from Systemd Instructions (#4269) 2018-12-05 12:50:16 -08:00
Jarek Potiuk 4cde579d12 [AIRFLOW-XXX] GCP operators documentation clarifications (#4273) 2018-12-05 20:35:29 +00:00
Szymon Przedwojski 6fda8f4ff1 [AIRFLOW-2440] Google Cloud SQL import/export operator (#4251) 2018-12-05 20:33:00 +00:00
Gabriel Nicolas Avellaneda d4f545fbe3 [AIRFLOW-XXX] Add Kubernetes Dependency in Extra Packages Doc (#4281) 2018-12-05 19:55:38 +00:00
SUNNY c9f987173d [AIRFLOW-XXX] Update kubernetes.rst (#4280)
import modules to complete the example set.
2018-12-05 10:17:55 -08:00
Ash Berlin-Taylor 1bbf219b49 [AIRFLOW-3431] Document how to report security vulnerabilities. (#4262)
Wording based on Kafka's

[ci-skip]
2018-12-03 10:01:53 +00:00
Ash Berlin-Taylor 22bd7c6da5 [AIRFLOW-XXX] Fix display of SageMaker operators/hook docs (#4263) 2018-12-02 20:51:30 +00:00
Kaxil Naik 77e233e525 [AIRFLOW-XXX] Add missing GCP operators to Docs (#4260) 2018-12-02 11:08:26 +00:00
tal181 24963d10e6 [AIRFLOW-3403] Add AWS Athena Sensor (#4244) 2018-12-01 21:14:20 +00:00
Kaxil Naik 53a365365a [AIRFLOW-3410] Add feature to allow Host Key Change for SSH Op (#4249) 2018-11-28 16:52:15 +00:00
Fokko Driesprong 2fd409d194 [AIRFLOW-XXX] Replace airflow with apache-airflow (#4246) 2018-11-27 19:53:58 +00:00
Iuliia Volkova be6d35d3a4 [AIRFLOW-3395] added the REST API endpoints to the doc (#4236) 2018-11-26 10:58:33 +01:00
Benji Visser 44df8a1d47 [AIRFLOW-XXX] Remove quotes from domains in Google Oauth (#4226)
Related SO: https://stackoverflow.com/a/52528091/10638329
2018-11-26 10:12:10 +01:00
rmn36 5955db1c76 [AIRFLOW-3336] Add new TriggerRule for 0 upstream failures (#4182)
Add new TriggerRule that triggers only if all upstream do not fail (success or skipped tasks are allowed)
2018-11-23 18:41:04 +00:00
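The rule described in AIRFLOW-3336 can be sketched in a few lines. This is only an illustration of the stated condition, not the code from the commit: the function name, the state strings, and the choice to treat `upstream_failed` as a failure are assumptions made for the example.

```python
# Hypothetical state names, chosen for illustration only.
FAILED_STATES = {"failed", "upstream_failed"}


def none_failed(upstream_states):
    """Return True when no upstream task ended in a failed state.

    Success and skipped upstream tasks are both allowed.
    """
    return not any(state in FAILED_STATES for state in upstream_states)
```

With this sketch, a task whose upstream tasks are a mix of succeeded and skipped would run, while a single failed upstream task would block it.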
Brandon Kvarda 6dee66f466 [AIRFLOW-3213] Create ADLS to GCS operator (#4134) 2018-11-21 23:18:40 +00:00
Bartosz Ługowski 69eeab4e69 [AIRFLOW-3380] Add metrics documentation (#4219) 2018-11-21 09:44:43 -08:00
BasPH ad2c968fc8 [AIRFLOW-3375] Support returning multiple tasks with BranchPythonOperator (#4215) 2018-11-21 15:59:19 +01:00
Gabriel Nicolas Avellaneda e56e625152 [AIRFLOW-XXX] Better instructions for airflow flower (#4214)
* Better instructions for airflow flower

It is not clear in the documentation that you need to have flower installed to successfully run airflow flower. If you don't have flower installed, running airflow flower will show the following error, which is not of much help:

airflow flower
[2018-11-20 17:01:14,836] {__init__.py:51} INFO - Using executor SequentialExecutor
Traceback (most recent call last):
  File "/mnt/secondary/workspace/f4/typo-backend/pipelines/model-pipeline/airflow/bin/airflow", line 32, in <module>
    args.func(args)
  File "/mnt/secondary/workspace/f4/typo-backend/pipelines/model-pipeline/airflow/lib/python3.6/site-packages/airflow/utils/cli.py", line 74, in wrapper
    return f(*args, **kwargs)
  File "/mnt/secondary/workspace/f4/typo-backend/pipelines/model-pipeline/airflow/lib/python3.6/site-packages/airflow/bin/cli.py", line 1221, in flower
    broka, address, port, api, flower_conf, url_prefix])
  File "/mnt/secondary/workspace/f4/typo-backend/pipelines/model-pipeline/airflow/lib/python3.6/os.py", line 559, in execvp
    _execvpe(file, args)
  File "/mnt/secondary/workspace/f4/typo-backend/pipelines/model-pipeline/airflow/lib/python3.6/os.py", line 604, in _execvpe
    raise last_exc.with_traceback(tb)
  File "/mnt/secondary/workspace/f4/typo-backend/pipelines/model-pipeline/airflow/lib/python3.6/os.py", line 594, in _execvpe
    exec_func(fullname, *argrest)
FileNotFoundError: [Errno 2] No such file or directory

* Update use-celery.rst
2018-11-20 15:13:02 -08:00
BasPH fa69f22c78 [AIRFLOW-XXX] Remove spots in all Airflow logos (#4206) 2018-11-17 18:51:02 +00:00
Joshua Carp 6b68f08edd [AIRFLOW-3346] Add hook and operator for GCP transfer service (#4189) 2018-11-16 23:54:50 +00:00
phanindhra 7d23dd83a2 [AIRFLOW-3266] Add AWS Athena Hook and Operator (#4111)
Provides AWS Athena hook and operator to submit Athena(presto) queries on AWS.

Authored-by: Phanindhra <phani8996@gmail.com>
2018-11-16 13:08:37 +00:00
ron819 8668ef869d [AIRFLOW-3350] Explain how to use Bitshift Composition with lists (#4191) 2018-11-15 20:33:47 +00:00
Szymon Przedwojski 0f7eca2b60 [AIRFLOW-3345] Add Google Cloud Storage (GCS) operators for ACL (#4192)
Add 2 operators for adding ACL entries to GCS buckets and objects:
- GoogleCloudStorageBucketCreateAclEntryOperator
- GoogleCloudStorageObjectCreateAclEntryOperator
2018-11-15 20:27:37 +00:00
Alvin Ali Khaled 6e9f16f310 [AIRFLOW-XXX] Revise template variables documentation (#4172)
Updated documentation to elaborate on the (yesterday|tomorrow)_.*
variables' relations to the execution date.
2018-11-14 16:29:30 +05:30
Joshua Carp 456d955953 [AIRFLOW-XXX] Fix typo in plugin docs (#4183) 2018-11-14 16:26:56 +05:30
Xiaodong 86a83bfff3 [AIRFLOW-3323] Support HTTP basic authentication for Airflow Flower (#4166)
The current `airflow flower` doesn't come with any authentication.
This may expose essential information in an untrusted environment.

This commit adds support for HTTP basic authentication for Airflow Flower

Ref:
https://flower.readthedocs.io/en/latest/auth.html
2018-11-13 14:48:23 +00:00
Ash Berlin-Taylor b9fc03ea1a [AIRFLOW-2779] Add license headers to doc files (#4178)
This adds ASF license headers to all the .rst and .md files with the
exception of the Pull Request template (as that is included verbatim
when opening a Pull Request on Github which would be messy)
2018-11-13 15:01:44 +01:00
Jarek Potiuk d7e80d60f9 [AIRFLOW-3275] Add Google Cloud SQL Query operator (#4170) 2018-11-13 09:43:36 +00:00
Vincent Ketelaars b81cceb8bd [AIRFLOW-XXX] Correct schedule_interval in Scheduler docs (#4157) 2018-11-13 11:56:03 +05:30
Jarek Potiuk 7ee30b6fac [AIRFLOW-3220] Add Instance Group Manager Operators for GCE (#4167) 2018-11-12 23:32:19 +00:00
Szymon Biliński 4462021da6 [AIRFLOW-XXX] Add missing docs for SNS classes (#4155) 2018-11-11 22:26:29 +00:00
Felix eb7f959858 [AIRFLOW-3315] Add ImapAttachmentSensor (#4161)
- update license header in imap_hook and test_imap_hook
2018-11-11 22:24:56 +00:00
bolkedebruin 2c4b0eab7d [AIRFLOW-3164] Verify server certificate when connecting to LDAP (#4006) 2018-11-09 13:58:34 +00:00
yangaws 52ee34c5fc [AIRFLOW-2524] More AWS SageMaker operators, sensors for model, endpoint-config and endpoint (#4126) 2018-11-08 21:59:38 +00:00
Sumit Maheshwari 3483b9743f [AIRFLOW-XXX] Update chat channel details from gitter to slack (#4149) 2018-11-07 09:51:27 +00:00
Felix 44c43363f3 [AIRFLOW-2780] Add IMAP Hook to retrieve email attachments (#4119)
[AIRFLOW-2780] Add IMAP Hook to retrieve email attachments
- Add has_mail_attachments to check if there are mail attachments in the given mailbox with the given attachment name
- Add retrieve_mail_attachments to download the attachments to a local directory
- Add some test cases but more are coming
- Add license header
- Change retrieve_mail_attachments to download_mail_attachments
- Add retrieve_mail_attachments that returns a list of tuples containing the attachments found
- Change IMAP4_SSL close() method to be called after retrieving the attachments and not before logging out
- Change test_connect to not check for close method because no mail folder will be opened when only connecting
- Add some test cases that are still in WIP
- Fixes a bug causing multiple attachments in a single mail not being correctly added to the all mails attachments
- Fixes a bug where MailPart is_attachment always returns None
- Add logging when an attachment has been found that matches the name
- Add more test cases with sample mail
2018-11-05 19:41:14 +01:00
Jarek Potiuk b5ecb8a78b [AIRFLOW-3268] Better handling of extras field in MySQL connection (#4113) 2018-11-05 16:22:17 +00:00
Sumit Maheshwari 55a62b419d [AIRFLOW-3294] Update connections form and integration docs (#4129) 2018-11-05 09:56:50 +00:00
Szymon Przedwojski 92cb5c74d8 [AIRFLOW-3276] Cloud SQL: database create / patch / delete operators (#4124) 2018-11-02 13:38:31 +00:00
yangaws bc3108edc1 [AIRFLOW-2524] Update SageMaker hook and operators (#4091)
This re-works the SageMaker functionality in Airflow to be more complete, and more useful for the kinds of operations that SageMaker supports.

We removed some files and operators here, but these were only added after the last release so we don't need to worry about any sort of back-compat.
2018-11-01 20:31:59 +00:00
Brandon Kvarda 66cad8d6a0 [AIRFLOW-3236] Create AzureDataLakeStorageListOperator (#4094) 2018-10-31 23:08:52 +00:00
Szymon Przedwojski 5e248ba3f2 [AIRFLOW-3231] Basic operators for Google Cloud SQL (#4097)
Add CloudSqlInstanceInsertOperator, CloudSqlInstancePatchOperator and CloudSqlInstanceDeleteOperator.

Each operator includes:
- core logic
- input params validation
- unit tests
- presence in the example DAG
- docstrings
- How-to and Integration documentation

Additionally, small improvements to GcpBodyFieldValidator were made:
- add simple list validation capability (type="list")
- introduced parameter allow_empty, which can be set to False
	to test for non-emptiness of a string instead of specifying
	a regexp.

Co-authored-by: sprzedwojski <szymon.przedwojski@polidea.com>
Co-authored-by: potiuk <jarek.potiuk@polidea.com>
2018-10-31 23:00:46 +00:00
Xiaodong c4e5151bcd [AIRFLOW-XXX] Minor fix of docs/scheduler.rst (#4115)
DaskExecutor is not mentioned in docs/scheduler.rst,
while it's listed as one of the main executors in 
`airflow/config_templates/default_airflow.cfg`.
2018-10-30 09:32:49 +00:00
Jarek f95c7b4639 [AIRFLOW-1970] Let empty Fernet key or special `no encryption` phrase. (#4038)
Once the user has installed the Fernet package, the application enforces setting a valid Fernet key.
This change alters that behavior to allow an empty Fernet key or the special `no encryption` phrase, interpreting both cases as meaning that no encryption is desired.
2018-10-26 14:29:00 +02:00
Kevin Yang 75e2288a3f [Airflow-2760] Decouple DAG parsing loop from scheduler loop (#3873) 2018-10-26 09:37:10 +01:00
BasPH 62b21d7475 [AIRFLOW-3237] Refactor example DAGs (#4071) 2018-10-26 09:19:52 +01:00
Olivier Morissette 1c76e8a9a5 [AIRFLOW-2744] Allow RBAC to accept plugins for views and links. (#4036)
Airflow Users that wish to create plugins for the new www_rbac UI
can not add plugin views or links. This PR fixes that by letting
a user specify their plugins for www_rbac and maintains backwards
compatibility with the existing plugins system.
2018-10-23 17:21:30 +01:00
Jarek Potiuk f5e3b03aa6 [AIRFLOW-3232] More readable GCF operator documentation (#4067) 2018-10-18 17:49:08 +01:00
BasPH 9681954c97 [AIRFLOW-XXX] Removed spot in Airflow logo (#4065) 2018-10-17 20:37:56 +02:00
Kaxil Naik bf7045364a [AIRFLOW-XXX] Add `BigQueryGetDataOperator` to Integration Docs (#4063) 2018-10-17 14:46:39 +01:00
Siddharth Gupta b862ca8e1f [AIRFLOW-3202] add missing documentation for AWS hooks/operator (#4048)
Add missing documentation for AWS hooks/operator
2018-10-13 09:16:40 +02:00
PaulVelthuis93 7ae0fe2bb4 [AIRFLOW-3187] Update airflow.gif file with a slower version (#4033) 2018-10-12 14:13:58 +02:00
Ash Berlin-Taylor c4f3f6b199 [AIRFLOW-3178] Handle percents signs in configs for airflow run (#4029)
* [AIRFLOW-3178] Don't mask defaults() function from ConfigParser

ConfigParser (the base class for AirflowConfigParser) expects defaults()
to be a function - so when we re-assign it to be a property some of the
methods from ConfigParser no longer work.

* [AIRFLOW-3178] Correctly escape percent signs when creating temp config

Otherwise we have a problem when we come to use those values.

* [AIRFLOW-3178] Use os.chmod instead of shelling out

There's no need to run another process for a built in Python function.

This also removes a possible race condition that would make the temporary
config file readable by more than the airflow or run-as user.
The exact behaviour would depend on the umask we run under and the
primary group of our user; likely this would mean the file was readable
by members of the airflow group (which in most cases would be just the
airflow user). To remove any such possibility we chmod the file
before we write to it.
2018-10-12 11:13:05 +02:00
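The two ideas in AIRFLOW-3178 (escaping percent signs and restricting permissions before writing) can be illustrated with the standard library alone. The sketch below is not the code from the commit; the function name `write_temp_config` and the shape of the `settings` argument are assumptions made for the example.

```python
import os
import tempfile
from configparser import ConfigParser


def write_temp_config(settings):
    """Write settings ({section: {option: value}}) to a private temp file.

    Hypothetical helper for illustration; returns the path of the file.
    """
    parser = ConfigParser()
    for section, options in settings.items():
        parser.add_section(section)
        for option, value in options.items():
            # A literal '%' must be doubled, otherwise ConfigParser treats
            # it as the start of an interpolation reference.
            parser.set(section, option, value.replace("%", "%%"))

    fd, path = tempfile.mkstemp(suffix=".cfg")
    # Restrict permissions with a Python call (no subprocess) and do it
    # before any content is written to the file.
    os.chmod(path, 0o600)
    with os.fdopen(fd, "w") as handle:
        parser.write(handle)
    return path
```

Reading the file back with a plain `ConfigParser` should return the original values with single percent signs, since interpolation collapses `%%` to `%`.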
Justin Holmes 3f0fb0582c [AIRFLOW-2956] Add kubernetes tolerations (#3806) 2018-10-12 08:06:13 +02:00
Szymon Przedwojski cdbdcae7c0 [AIRFLOW-3078] Basic operators for Google Compute Engine (#4022)
Add GceInstanceStartOperator, GceInstanceStopOperator and GceSetMachineTypeOperator.

Each operator includes:
- core logic
- input params validation
- unit tests
- presence in the example DAG
- docstrings
- How-to and Integration documentation

Additionally, in GceHook error checking if response is 200 OK was added:

Some types of errors are only visible in the response's "error" field
and the overall HTTP response is 200 OK.

That is why apart from checking if status is "done" we also check
if "error" is empty, and if not an exception is raised with error
message extracted from the "error" field of the response.

In this commit we also separated out Body Field Validator to
separate module in tools - this way it can be reused between
various GCP operators, it has proven to be usable in at least
two of them now.

Co-authored-by: sprzedwojski <szymon.przedwojski@polidea.com>
Co-authored-by: potiuk <jarek.potiuk@polidea.com>
2018-10-10 10:49:57 +01:00
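The error check described for GceHook in AIRFLOW-3078 amounts to inspecting the response body as well as its status. A minimal sketch follows, assuming the response has already been parsed into a dict with `status` and `error` keys; the function name and the exception type are hypothetical and not taken from the commit.

```python
def operation_succeeded(response):
    """Return True when an operation finished without errors.

    `response` is assumed to be a dict parsed from an HTTP 200 reply.
    Returns False while the operation is still running and raises if
    the body carries a non-empty "error" field.
    """
    if str(response.get("status", "")).lower() != "done":
        return False

    error = response.get("error")
    if error:
        # The HTTP call itself succeeded, so the failure is only
        # visible inside the response body.
        raise RuntimeError("Operation finished with errors: %r" % (error,))
    return True
```

A caller would typically poll until this returns True, letting the exception propagate if the body reports an error.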
Ash Berlin-Taylor 76ad6f0938 [AIRFLOW-3173] Add _cmd options for password config options (#4024)
There were a few more "password" config options added over the last few
months that didn't have _cmd options. Any config option that is a
password should be able to be provided via a _cmd version.
2018-10-10 10:46:28 +01:00
Joshua Carp 1f3c95b368 [AIRFLOW-3086] Add extras group for google auth to setup.py. (#3917)
To clarify installation instructions for the google auth backend, add an
install group to `setup.py` that installs the google auth dependencies via
`pip install apache-airflow[google_auth]`.
2018-10-09 16:14:07 +01:00
ron819 b8be322d3b [AIRFLOW-XXX] Update manage-connections.rst (#4020)
Explain how to connect with MySQL
2018-10-08 09:01:35 -07:00
Songkran Nethan c23db51531 [AIRFLOW-XXX] Fix wrong {{ next_ds }} description (#4017) 2018-10-08 17:48:05 +05:30
bolkedebruin 9bea6228d9 [AIRFLOW-3165] Document interpolation of '%' and warn (#4007) 2018-10-06 21:51:13 +01:00
akshayi1 5c18a28722 [AIRFLOW-3159] Update GCS logging docs for latest code (#3952) 2018-10-05 09:47:48 +01:00
Seth Woodworth 2e606f3167 [AIRFLOW-3088] Include slack-compatible emoji image 2018-10-04 19:58:05 -07:00
Joshua Carp 45ed3cec82 [AIRFLOW-3137] Make ProxyFix middleware optional. (#3983)
The ProxyFix middleware should only be used when airflow is running
behind a trusted proxy. This patch adds a `USE_PROXY_FIX` flag that
defaults to `False`.
2018-10-02 16:16:55 +01:00
Sumit Maheshwari e8f90657b7 [AIRFLOW-3062] Add Qubole in integration docs (#3946) 2018-10-01 11:58:27 +05:30
Jarek Potiuk 8595530865 [AIRFLOW-2912] Add Deploy and Delete operators for GCF (#3969)
Both Deploy and Delete operators interact with Google
Cloud Functions to manage functions. Both are idempotent
and make use of GcfHook - hook that encapsulates
communication with GCP over GCP API.
2018-09-29 08:49:31 +01:00
Kaxil Naik 4c572a4b2d [AIRFLOW-3130] Add CLI docs for users command 2018-09-28 08:49:36 -07:00
Xiaodong 4968a52116 [AIRFLOW-3104] Add .airflowignore info into doc (#3939)
.airflowignore is a nice feature, but it was not mentioned at all in the documentation.
2018-09-28 13:43:55 +01:00
Xiaodong 481daeec0b [AIRFLOW-3127] Fix out-dated doc for Celery SSL (#3967)
Now in `airflow.cfg`, for Celery-SSL, the item names are
"ssl_active", "ssl_key", "ssl_cert", and "ssl_cacert".
(since PR https://github.com/apache/incubator-airflow/pull/2806/files)

But in the documentation
https://airflow.incubator.apache.org/security.html?highlight=celery
or
https://github.com/apache/incubator-airflow/blob/master/docs/security.rst,
it's "CELERY_SSL_ACTIVE", "CELERY_SSL_KEY", "CELERY_SSL_CERT", and
"CELERY_SSL_CACERT", which is out-dated and may confuse readers.
2018-09-28 09:56:43 +01:00
Brylie Christopher Oxley 17a9a0fbe9 [AIRFLOW-3117] Add instructions to allow GPL dependency (#3949)
The installation instructions failed to mention how to proceed with the GPL dependency. For those who are not concerned by GPL, it is useful to know how to proceed with GPL dependency.
2018-09-26 10:28:47 +01:00
Fokko Driesprong 491fd743da [AIRFLOW-2918] Remove unused imports 2018-09-21 13:21:42 -07:00
Iuliia Volkova dd85126f26 [AIRFLOW-2887] Added BigQueryCreateEmptyDatasetOperator and create_emty_dataset to bigquery_hook (#3876) 2018-09-21 15:46:59 +01:00
Xiaodong 4c1282c43b [AIRFLOW-XXX] Fix a wrong sample bash command, a display issue & a few typos (#3924) 2018-09-21 11:24:13 +01:00
Trevor Edwards 17767498ce [AIRFLOW-1441] Fix inconsistent tutorial code (#2466) 2018-09-19 20:46:59 +02:00
Xiaodong 2f50083c8d [AIRFLOW-3073] Add note-Profiling feature not supported in new webserver (#3909)
Adhoc queries and Charts features are no longer supported in new
FAB-based webserver and UI. But this is not mentioned at all in the doc
"Data Profiling" (https://airflow.incubator.apache.org/profiling.html)

This commit adds a note to remind users of this.
2018-09-17 20:59:13 -07:00
Xiaodong 7194c81f6d [AIRFLOW-3070] Refine web UI authentication-related docs (#3863) 2018-09-16 13:38:09 +01:00
Xiaodong e71bac99dd [AIRFLOW-XXX] Fix typo in docs/timezone.rst (#3904) 2018-09-16 13:33:16 +01:00
Sid Anand 4c30d402c4 [AIRFLOW-3052] Add logo options to Airflow (#3892) 2018-09-13 15:08:06 +01:00
Thejas Babu b63cb61b47 [AIRFLOW-XXX] Update kubernetes.rst docs (#3875)
Update kubernetes.rst with correct KubernetesPodOperator inputs
for the volumes.
2018-09-10 15:13:23 +02:00
Xiaodong ee17a033e0 [AIRFLOW-2985] Operators for S3 object copying/deleting (#3823)
1. Copying:
Under the hood, it's `boto3.client.copy_object()`.
It can only handle the situation in which the
S3 connection used can access both source and
destination bucket/key.

2. Deleting:
2.1 Under the hood, it's `boto3.client.delete_objects()`.
It supports either deleting one single object or
multiple objects.
2.2 If users try to delete a non-existent object, the
request will still succeed, but there will be an
entry 'Errors' in the response. There may also be
other reasons which may cause similar 'Errors' (
request itself would succeed without explicit
exception). So an argument `silent_on_errors` is added
to let users decide if this sort of 'Errors' should
fail the operator.

The corresponding methods are added into S3Hook, and
these two operators are 'wrappers' of these methods.
2018-09-10 10:06:28 +01:00
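The `silent_on_errors` behaviour described in AIRFLOW-2985 can be sketched as a check on the response dictionary. This is an illustration only: the helper name is hypothetical, and the response is assumed to have the `Deleted` and `Errors` lists that `boto3.client.delete_objects()` returns.

```python
def check_delete_response(response, silent_on_errors=True):
    """Return the deleted keys, optionally failing on reported errors.

    The request can succeed while still listing per-object problems
    under "Errors"; the flag decides whether those should raise.
    """
    errors = response.get("Errors", [])
    if errors and not silent_on_errors:
        raise RuntimeError("S3 reported errors while deleting: %r" % (errors,))
    return [item["Key"] for item in response.get("Deleted", [])]
```

With `silent_on_errors=True` the function returns whatever was deleted; with `False` any entry under `Errors` fails the call.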
r39132 55633fa78b [AIRFLOW-3028] Update Text & Images in Readme.md 2018-09-09 09:44:20 -07:00
Naman Bhalla 7cd3a85469 [AIRFLOW-XXX] Remove redundant space in Kerberos (#3866) 2018-09-08 18:10:27 +02:00
ziheng 08ecca4786 [AIRFLOW-XXX] Redirect FAQ `airflow[crypto]` to How-to Guides. 2018-09-07 14:40:36 -07:00
Kaxil Naik b7f33a7310 [AIRFLOW-3006] Add note on using None for schedule_interval 2018-09-05 10:14:55 -07:00
Kaxil Naik 1f88a4c044 [AIRFLOW-3005] Replace 'Airbnb Airflow' with 'Apache Airflow' (#3845) 2018-09-05 01:04:42 +01:00
Kaxil Naik 93c6a50352 [AIRFLOW-3007] Update backfill example in Scheduler docs
The scheduler docs at https://airflow.apache.org/scheduler.html#backfill-and-catchup use a deprecated way of passing `schedule_interval`. `schedule_interval` should be passed to the DAG as a separate parameter and not as a default arg.
2018-09-04 16:57:40 -07:00
Xiaodong f255bd80fd [AIRFLOW-XXX] Fix typos in faq.rst (#3837) 2018-09-03 09:36:44 +01:00
Tao Feng 3b8d036990 [AIRFLOW-2983] Add prev_ds_nodash and next_ds_nodash macro (#3821) 2018-08-30 20:01:28 -07:00
bolkedebruin 6c8d35c44f [AIRFLOW-2984] Convert operator dates to UTC (#3822)
Tasks can have start_dates or end_dates separately
from the DAG. These need to be converted to UTC, otherwise
we cannot use them for calculating the next execution
date.
2018-08-30 14:26:11 +02:00
pengc 751a332414 [AIRFLOW-2854] kubernetes_pod_operator add more configuration items (#3697)
* kubernetes_pod_operator add more configuration items
* fix test_kubernetes_pod_operator test_faulty_service_account failure case
* fix review comment issues
* pod_operator add hostnetwork config
* add doc example
2018-08-29 09:41:08 +02:00
Håvard Wall 8100f1f27b [AIRFLOW-XXX] Typo in the write-logs.rst (#3781)
Fixed airflow_tas_runner -> airflow_task_runner
2018-08-22 13:02:43 +02:00
Taylor D. Edmiston b7f63c59d7 [AIRFLOW-XXX] Fix some operator names in the docs (#3778) 2018-08-21 23:01:25 +01:00
Tim Swast 94fe9f11cc [AIRFLOW-2915] Add example DAG for GoogleCloudStorageToBigQueryOperator (#3763)
* Don't add DAG if GCP operators aren't installed.
* Remove extra blank lines for flake8.
2018-08-20 13:57:50 -07:00
Robin Edwards 404be4b021 [AIRFLOW-XXX] Specify email domain in documentation (#3771)
This makes it less ambiguous
2018-08-20 11:42:55 +02:00
Taylor D. Edmiston e346f89435 [AIRFLOW-XXX] Clean up installation extra packages table (#3750)
Sort the extra packages table, use official product names, improve
capitalization, and make table whitespace consistent.
2018-08-15 07:09:26 +02:00
Taylor D. Edmiston 9d516c7134 [AIRFLOW-XXX] Make pip install commands consistent (#3752)
- Replace airflow PyPI package name with apache-airflow
- Remove unnecessary quotes on pip install commands
- Make install extras commands consistent with pip docs [1] (e.g.,
alpha sort order without spaces or quotes)

[1]: https://pip.pypa.io/en/stable/reference/pip_install/#examples
2018-08-14 17:02:56 -07:00
Kazuhiro Sera b78c7fb851 [AIRFLOW-2889] Fix typos detected by github.com/client9/misspell (#3732) 2018-08-11 21:11:19 -07:00
Xiaodong 3157287e8c [AIRFLOW-2821] Refine Doc "Plugins" (#3664) 2018-08-05 19:24:20 +01:00
Xiaodong c0c63ae2a4 [AIRFLOW-2839] Refine Doc Concepts->Connections (#3678) 2018-08-05 19:08:15 +01:00
Tao Feng da4f254283 [AIRFLOW-XXX] Add Feng Tao to committers list (#3689) 2018-08-03 20:42:30 +01:00
Cameron Moberg b4f43e6c48 [AIRFLOW-2658] Add GCP specific k8s pod operator (#3532)
Executes a task in a Kubernetes pod in the specified Google Kubernetes
Engine cluster. This makes it easier to interact with GCP kubernetes
engine service because it encapsulates acquiring credentials.
2018-08-02 20:44:16 +01:00
Xiaodong b120427b65 [AIRFLOW-2820] Add Web UI trigger in doc "Scheduling & Triggers"
In documentation page "Scheduling & Triggers",
it only mentioned the CLI method to
manually trigger a DAG run.

However, the manual trigger feature in Web UI
should be mentioned as well
(it may be even more frequently used by users).
2018-08-01 14:08:21 -07:00
Marcus Rehm 9983466fd1 [AIRFLOW-2795] Oracle to Oracle Transfer Operator (#3639) 2018-07-31 21:22:40 +02:00
Bolke de Bruin af15f1150d [AIRFLOW-2816] Fix license text in docs/license.rst 2018-07-28 13:20:26 +02:00
Amir Shahatit 98c7080361 Fix Typo in Scheduler documentation
Closes #3618 from amir656/patch-1
2018-07-21 13:33:29 +01:00
Marcus Rehm 52c745da71 [AIRFLOW-2596] Add Oracle to Azure Datalake Transfer Operator
Closes #3613 from
marcusrehm/oracle_to_azure_datalake_transfer
2018-07-20 22:46:59 +02:00
Ivan Arozamena ee4fc35774 [AIRFLOW-2749] Add feature to delete BQ Dataset
Closes #3598 from MENA1717/Add-bq-op
2018-07-17 13:56:05 +01:00
Matthew Thorley 6b7645261b [AIRFLOW-2710] Clarify fernet key value in documentation
Closes #3574 from padwasabimasala/AIRFLOW-2710
2018-07-08 20:52:51 +02:00
Tim Swast 89c1f530da [AIRFLOW-2682] Add how-to guides for bash and python operators
Closes #3552 from tswast/airflow-2682-bash-python-
how-to
2018-06-29 14:15:16 +02:00
Kevin Yang 284dbdb60a [AIRFLOW-2359] Add set failed for DagRun and task in tree view
Closes #3255 from
yrqls21/kevin_yang_add_set_failed
2018-06-28 13:30:36 -07:00
Kaxil Naik 7961ee8f08 [AIRFLOW-2663] Add instructions to install SSH dependencies
Closes #3536 from kaxil/patch-1
2018-06-22 16:35:48 +02:00
Kengo Seki 5f49ebf018 [AIRFLOW-2640] Add Cassandra table sensor
Just like a partition sensor for Hive,
this PR adds a sensor that waits for
a table to be created in Cassandra cluster.

Closes #3518 from sekikn/AIRFLOW-2640
2018-06-20 20:36:32 +02:00
niels 3dade5413f [AIRFLOW-2559] Azure Fileshare hook
Closes #3457 from NielsZeilemaker/fileshare_hook
2018-06-18 22:23:53 +01:00
Cameron Moberg dc38b2f46d [AIRFLOW-2613] Fix Airflow searching .zip bug
When Airflow was populating a DagBag from a .zip file, if a single
file in the root directory did not contain the strings 'airflow' and
'DAG' it would ignore the entire .zip file.

Also added a small amount of logging to not bombard user with info
about skipping their .py files.

Closes #3505 from Noremac201/dag_name
2018-06-17 19:16:12 +01:00
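One plausible reading of the behaviour after AIRFLOW-2613 is that each root-level file in the archive is judged on its own, so one helper module no longer hides the whole .zip. The sketch below illustrates that per-file check with the standard `zipfile` module; the function name is hypothetical and this is not the code from the commit.

```python
import io
import zipfile


def candidate_dag_files(archive):
    """List root-level .py files in a zip that mention 'airflow' and 'DAG'.

    `archive` may be a path or a file-like object. Files that fail the
    check are skipped individually rather than rejecting the archive.
    """
    candidates = []
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            name = info.filename
            # Only look at Python files in the archive's root directory.
            if "/" in name or not name.endswith(".py"):
                continue
            content = zf.read(name)
            if b"airflow" in content and b"DAG" in content:
                candidates.append(name)
    return candidates


# Build a small in-memory archive to try the function on.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr("my_dag.py", "from airflow import DAG\n")
    zf.writestr("helpers.py", "def add(a, b):\n    return a + b\n")
demo.seek(0)
print(candidate_dag_files(demo))
```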
Kengo Seki 4d153ad4e8 [AIRFLOW-2627] Add a sensor for Cassandra
Closes #3510 from sekikn/AIRFLOW-2627
2018-06-17 19:10:48 +01:00
Cameron Moberg 7255589f95 [AIRFLOW-2562] Add Google Kubernetes Engine Operators
Add Google Kubernetes Engine create_cluster, delete_cluster operators
This allows users to use airflow to create or delete clusters in the
google cloud platform

Closes #3477 from Noremac201/gke_create
2018-06-15 20:44:29 +01:00
Tim Swast 0f4d681f6f [AIRFLOW-2512][AIRFLOW-2522] Use google-auth instead of oauth2client
* Updates the GCP hooks to use the google-auth library and removes
  dependencies on the deprecated oauth2client package.
* Removes inconsistent handling of the scope parameter for different
  auth methods.

Note: using google-auth for credentials requires a newer version of the
google-api-python-client package, so this commit also updates the
minimum version for that.

To avoid some annoying warnings about the discovery cache not being
supported, disable the discovery cache explicitly as recommended here:
https://stackoverflow.com/a/44518587/101923

Tested by running:

    nosetests tests/contrib/operators/test_dataflow_operator.py \
        tests/contrib/operators/test_gcs*.py \
        tests/contrib/operators/test_mlengine_*.py \
        tests/contrib/operators/test_pubsub_operator.py \
        tests/contrib/hooks/test_gcp*.py \
        tests/contrib/hooks/test_gcs_hook.py \
        tests/contrib/hooks/test_bigquery_hook.py

and also tested by running some GCP-related DAGs locally, such as the
Dataproc DAG example at
https://cloud.google.com/composer/docs/quickstart

Closes #3488 from tswast/google-auth
2018-06-12 23:53:21 +01:00
renzofrigato be3d551f72 [AIRFLOW-1115] fix github oauth api URL
Closes #3469 from renzofrigato/airflow_1115
2018-06-11 15:14:02 -07:00
Andy Cooper 9e1d8ee837 [AIRFLOW-83] add mongo hook and operator
Closes #3440 from andscoop/AIRFLOW_83_add_mongo_hooks_and_operators
2018-06-05 23:30:02 +01:00
Charles Caygill 817296a7be [AIRFLOW-XXX] Fix doc typos
Closes #3459 from ccayg-sainsburys/master
2018-06-04 11:15:38 -07:00
Chao-Han Tsai 2800c8e556 [AIRFLOW-2526] dag_run.conf can override params
Make sure you have checked _all_ steps below.

### JIRA
- [x] My PR addresses the following [Airflow JIRA](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
    - https://issues.apache.org/jira/browse/AIRFLOW-2526
    - In case you are fixing a typo in the documentation you can prepend your commit with \[AIRFLOW-XXX\], code changes always need a JIRA issue.

### Description
- [x] Here are some details about my PR, including screenshots of any UI changes:
params can be overridden by the dictionary passed through `airflow backfill -c`

```
templated_command = """
    echo "text = {{ params.text }}"
"""

bash_operator = BashOperator(
    task_id='bash_task',
    bash_command=templated_command,
    dag=dag,
    params= {
        "text" : "normal processing"
    })
```

In daily processing it prints:
```
normal processing
```

In backfill processing `airflow trigger_dag -c '{"text": "override success"}'`, it prints:
```
override success
```
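
The override behaviour shown above amounts to a dictionary merge in which the run-level conf wins. A minimal sketch, assuming the task-level `params` are the defaults and the conf is an optional dictionary; `resolve_params` is a hypothetical name and this is not Airflow's implementation:

```python
from typing import Any, Dict, Optional


def resolve_params(
    task_params: Dict[str, Any],
    dag_run_conf: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Merge task-level params with a run-level conf; conf values win.

    Hypothetical helper for illustration only.
    """
    merged = dict(task_params)  # copy so the task defaults are not mutated
    if dag_run_conf:
        merged.update(dag_run_conf)
    return merged


defaults = {"text": "normal processing"}
print(resolve_params(defaults)["text"])
print(resolve_params(defaults, {"text": "override success"})["text"])
```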

### Tests
- [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:

### Commits
- [x] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

### Documentation
- [x] In case of new functionality, my PR adds documentation that describes how to use it.
    - When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added.

### Code Quality
- [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`

Closes #3422 from milton0825/params-overridden-through-cli
2018-06-01 11:22:10 -07:00
Tao feng b81bd08a33 [AIRFLOW-2538] Update faq doc on how to reduce airflow scheduler latency
Make sure you have checked _all_ steps below.

### JIRA
- [x] My PR addresses the following [Airflow JIRA](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
    - https://issues.apache.org/jira/browse/AIRFLOW-2538
    - In case you are fixing a typo in the documentation you can prepend your commit with \[AIRFLOW-XXX\], code changes always need a JIRA issue.

### Description
- [x] Here are some details about my PR, including screenshots of any UI changes:
Update the faq doc on how to reduce airflow scheduler latency. This comes from our internal production setting which also aligns with Maxime's email (https://lists.apache.org/thread.html/%3CCAHEEp7WFAivyMJZ0N+0Zd1T3nvfyCJRudL3XSRLM4utSigR3dQmail.gmail.com%3E).

### Tests
- [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:

### Commits
- [ ] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

### Documentation
- [ ] In case of new functionality, my PR adds documentation that describes how to use it.
    - When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added.

### Code Quality
- [ ] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`

Closes #3434 from feng-tao/update_faq
2018-05-31 22:01:59 -07:00
Chao-Han Tsai d5d97dc971 [AIRFLOW-2536] docs about how to deal with airflow initdb failure
Add docs to faq.rst on how to deal with the error
"Exception: Global variable explicit_defaults_for_timestamp needs to be
on (1) for mysql".

Closes #3429 from milton0825/fix-docs
2018-05-29 20:29:27 +01:00
Tim Swast 4c0d67f0d0 [AIRFLOW-2523] Add how-to for managing GCP connections
I'd like to have how-to guides for all connection types, or at least the
different categories of connection types. I found it difficult to figure
out how to manage a GCP connection, so this commit adds a how-to guide
for this.

Also, since creating and editing connections really aren't all that
different, the PR renames the "creating connections" how-to to "managing
connections".

Closes #3419 from tswast/howto
2018-05-25 09:37:29 +01:00
Chao-Han Tsai 66f00bbf7b [AIRFLOW-2510] Introduce new macros: prev_ds and next_ds
Closes #3418 from milton0825/introduce-next_ds-prev_ds
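
As a rough illustration of what `prev_ds` and `next_ds` from AIRFLOW-2510 could evaluate to, the sketch below assumes they are `YYYY-MM-DD` strings for the previous and next execution dates and that the schedule is a fixed interval. The function `ds_macros` is hypothetical and is not Airflow's implementation, which also has to handle cron schedules.

```python
from datetime import datetime, timedelta
from typing import Dict


def ds_macros(execution_date: datetime, interval: timedelta) -> Dict[str, str]:
    """Return date-stamp strings around an execution date.

    Hypothetical helper; assumes a fixed schedule interval.
    """
    fmt = "%Y-%m-%d"
    return {
        "prev_ds": (execution_date - interval).strftime(fmt),
        "ds": execution_date.strftime(fmt),
        "next_ds": (execution_date + interval).strftime(fmt),
    }


print(ds_macros(datetime(2018, 5, 25), timedelta(days=1)))
# {'prev_ds': '2018-05-24', 'ds': '2018-05-25', 'next_ds': '2018-05-26'}
```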
2018-05-25 10:13:49 +02:00
Kengo Seki e4e7b55ad7 [AIRFLOW-2518] Fix broken ToC links in integration.rst
Closes #3412 from sekikn/AIRFLOW-2518
2018-05-24 21:55:19 +01:00
Tim Swast 084bc91367 [AIRFLOW-2509] Separate config docs into how-to guides
Also moves how-to style instructions for logging from the "integration"
page to a "Writing Logs" how-to.

Closes #3400 from tswast/howto
2018-05-23 10:08:53 +01:00
roc fff87b5cfd [AIRFLOW-2397] Support affinity policies for Kubernetes executor/operator
KubernetesPodOperator now accepts a dict-type parameter called
"affinity", which represents a group of affinity scheduling rules
(nodeAffinity, podAffinity, podAntiAffinity).

API reference: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.10/#affinity-v1-core
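
A dictionary of this kind might look like the sketch below. The field names follow the Kubernetes Affinity object from the API reference above; the label key, label values, and the `ALLOWED_AFFINITY_KEYS` check are illustrative assumptions, not part of the commit.

```python
from typing import Any, Dict

# Top-level rule groups named in the commit message.
ALLOWED_AFFINITY_KEYS = {"nodeAffinity", "podAffinity", "podAntiAffinity"}

affinity: Dict[str, Any] = {
    "nodeAffinity": {
        # Hard requirement: schedule only on nodes matching the expression.
        "requiredDuringSchedulingIgnoredDuringExecution": {
            "nodeSelectorTerms": [
                {
                    "matchExpressions": [
                        {
                            "key": "disktype",  # hypothetical node label
                            "operator": "In",
                            "values": ["ssd"],
                        }
                    ]
                }
            ]
        }
    },
    "podAntiAffinity": {
        # Soft preference: avoid nodes already running matching pods.
        "preferredDuringSchedulingIgnoredDuringExecution": [
            {
                "weight": 100,
                "podAffinityTerm": {
                    "labelSelector": {
                        "matchLabels": {"app": "airflow-worker"}  # hypothetical label
                    },
                    "topologyKey": "kubernetes.io/hostname",
                },
            }
        ]
    },
}

# Simple sanity check on the top-level keys (illustrative only).
unknown = set(affinity) - ALLOWED_AFFINITY_KEYS
print("unknown keys:", sorted(unknown))
```

Per the commit message, a dict like this would be passed as the `affinity` argument of KubernetesPodOperator.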

Closes #3369 from imroc/AIRFLOW-2397
2018-05-19 00:47:53 +02:00
Tao feng 8a2cd08ce8 [AIRFLOW-2479] Improve doc FAQ section
Closes #3373 from feng-tao/airflow-2478
2018-05-19 00:38:27 +02:00
Joy Gao f5115b7e6a [ARIFLOW-2458] Add cassandra-to-gcs operator
Closes #3354 from jgao54/cassandra-to-gcs
2018-05-18 02:02:57 +01:00