- change the order of arguments for `has_mail_attachment`, `retrieve_mail_attachments` and `download_mail_attachments`
- add `get_conn` function
- refactor code
- fix pylint issues
- add `imap_mail_filter` arg to `ImapAttachmentToS3Operator`
- add `mail_filter` arg to `ImapAttachmentSensor` (see the usage sketch after this list)
- remove superfluous tests
- change the order of arguments in the sensors' and operators' `__init__`
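A hedged usage sketch of the new `mail_filter` argument (contrib import path as of this change; parameter names may differ in your version, so check the sensor's signature):

```python
from airflow.contrib.sensors.imap_attachment_sensor import ImapAttachmentSensor

# Poke the mailbox for an attachment, searching only unseen mails via the
# new mail_filter argument (an IMAP search criterion; values illustrative).
wait_for_report = ImapAttachmentSensor(
    task_id="wait_for_report",
    attachment_name="report.csv",
    mail_filter="Unseen",
)
```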
Set the `autodetect` default value from False to True to avoid breaking
downstream services using GoogleCloudStorageToBigQueryOperator that are not
aware of the newly added `autodetect` field.
This fixes the regression introduced by #3880.
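Downstream DAGs that still want the old behaviour can opt out explicitly; a minimal sketch (contrib import path as of this change, values illustrative):

```python
from airflow.contrib.operators.gcs_to_bq import GoogleCloudStorageToBigQueryOperator

load_csv = GoogleCloudStorageToBigQueryOperator(
    task_id="gcs_to_bq",
    bucket="my-bucket",
    source_objects=["data/*.csv"],
    destination_project_dataset_table="my_project.my_dataset.my_table",
    autodetect=False,  # opt out of the new True default, e.g. when supplying a schema
)
```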
The recent commit 3724c2aa to master introduced an `__init__.py` file in
the project root folder, which breaks all imports in local development
(`pip install -e .`) by turning the project root into a package.
[ci skip]
For large DAGs, iterating over template fields to find template files can be
time-intensive. Skip this work for task fields whose values do not end with a
template file extension.
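A minimal sketch of the guard, with `_should_resolve_template_file` as a hypothetical helper name:

```python
def _should_resolve_template_file(content, template_ext):
    # Only strings ending in a known template extension (e.g. ".sql", ".sh")
    # warrant the expensive template-file lookup; everything else is skipped.
    return isinstance(content, str) and content.endswith(tuple(template_ext))
```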
* [AIRFLOW-4843] Allow orchestration via Docker Swarm (SwarmOperator)
Add support for running Docker containers via Docker Swarm,
which allows a task to run on any machine (node) that is
part of your Swarm cluster.
More details: https://issues.apache.org/jira/browse/AIRFLOW-4843
Built with <3 at Agoda!
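A hedged usage sketch (contrib import path as of this change; the parameters mirror `DockerOperator`, and all values are illustrative):

```python
from airflow.contrib.operators.docker_swarm_operator import DockerSwarmOperator

# Runs the command as a one-off Docker Swarm service, letting the Swarm
# scheduler place it on any node in the cluster.
train_model = DockerSwarmOperator(
    task_id="train_model",
    image="my-registry/trainer:latest",
    command="python train.py",
    docker_url="unix://var/run/docker.sock",
    auto_remove=True,
)
```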
`GCPTransferServiceHook.wait_for_transfer_job` defaults its `timeout`
parameter to 60 and assumes it is an integer, or at least comparable to
one. This is a problem because some of the built-in operators that use it,
such as `S3ToGoogleCloudStorageTransferOperator` and
`GoogleCloudStorageToGoogleCloudStorageTransferOperator`, default their
`timeout` param to `None`, and calling this method with that default
causes an error. Fix this by allowing `wait_for_transfer_job` to accept a
timeout of `None` and fill in an appropriate default. This also lets it
take a `timedelta` instead of an integer, allows seconds to be any real
number (there is no need for them to be an integer), and makes the
timeout clock a bit more accurate.
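A minimal sketch of the normalization, with hypothetical helper names (the hook's actual code differs in detail):

```python
import time
from datetime import timedelta

DEFAULT_TIMEOUT = timedelta(minutes=1)  # assumed default, matching the old 60s

def _normalize_timeout(timeout):
    # Accept None (fall back to the default), a timedelta, or any real
    # number of seconds -- integers are no longer required.
    if timeout is None:
        return DEFAULT_TIMEOUT
    if isinstance(timeout, timedelta):
        return timeout
    return timedelta(seconds=timeout)

def wait_until(poll, timeout=None, interval=10):
    # Count elapsed time against a monotonic deadline rather than
    # accumulating sleep intervals, which is slightly more accurate.
    deadline = time.monotonic() + _normalize_timeout(timeout).total_seconds()
    while time.monotonic() < deadline:
        if poll():
            return
        time.sleep(interval)
    raise TimeoutError("Transfer job did not complete before the timeout.")
```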
The GCS Transfer Service REST API requires that a schedule be set, even for
one-time immediate runs. This adds code to
`S3ToGoogleCloudStorageTransferOperator` and
`GoogleCloudStorageToGoogleCloudStorageTransferOperator` to set a default
one-time immediate run schedule when no `schedule` argument is passed.
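A sketch of such a default, assuming the REST API's `Date` representation and the convention that equal start and end dates yield a single immediate run (hypothetical helper name):

```python
from datetime import date

def _one_time_schedule():
    # With scheduleStartDate == scheduleEndDate == today, the Storage
    # Transfer Service runs the job once, immediately.
    today = date.today()
    day = {"year": today.year, "month": today.month, "day": today.day}
    return {"scheduleStartDate": day, "scheduleEndDate": day}
```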
Use `from google.api_core.gapic_v1.client_info import ClientInfo`
because this class inherits from
`google.api_core.client_info.ClientInfo`
and implements one additional method used by the Python SDKs.
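For illustration (the extra method on the `gapic_v1` variant is, to our understanding, `to_grpc_metadata()`, used when building gRPC request metadata):

```python
from google.api_core.gapic_v1.client_info import ClientInfo

# The gapic_v1 variant subclasses google.api_core.client_info.ClientInfo
# and additionally exposes to_grpc_metadata() for gRPC-based clients.
client_info = ClientInfo(client_library_version="airflow")
print(client_info.to_grpc_metadata())  # ('x-goog-api-client', '...')
```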
The scheduler calls `list_py_file_paths` to find DAGs to schedule. It does so
without passing any parameters other than the directory. This means it
*won't* discover DAGs that are missing the words "airflow" and "DAG", even
if `DAG_DISCOVERY_SAFE_MODE` is disabled.
Since `list_py_file_paths` already falls back to the configuration when
`include_examples` is not provided, it makes sense to give `safe_mode` the
same behaviour.
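A minimal sketch of the fallback (the real function takes more parameters; config keys as used elsewhere in Airflow):

```python
from airflow.configuration import conf

def list_py_file_paths(directory, safe_mode=None, include_examples=None):
    # Mirror the include_examples behaviour: when a flag is not passed,
    # read it from the configuration instead of hard-coding a default.
    if include_examples is None:
        include_examples = conf.getboolean("core", "LOAD_EXAMPLES")
    if safe_mode is None:
        safe_mode = conf.getboolean("core", "DAG_DISCOVERY_SAFE_MODE")
    ...
```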
- Refactors `BaseOperator.render_template()` and removes `render_template_from_field()`. The functionality could be greatly simplified into a single `render_template()` function.
- Removes `six` usage.
- Improves performance by removing two `hasattr` calls and avoiding recreating Jinja environments.
- Removes the argument `attr` to `render_template()` which wasn't used.
- Squashes multiple similar tests into two parameterized tests.
- Adheres to the 110-character line length.
- Adds support for templating sets (see the sketch after this list).
- Adds docstrings.
- Adds typing.
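A simplified sketch of the single recursive function, including the new set support (not the exact implementation):

```python
import jinja2

def render_template(value, context, jinja_env):
    # Strings are rendered through Jinja; containers (including sets) are
    # walked recursively; anything else passes through unchanged.
    if isinstance(value, str):
        return jinja_env.from_string(value).render(**context)
    if isinstance(value, (list, tuple)):
        return type(value)(render_template(v, context, jinja_env) for v in value)
    if isinstance(value, set):
        return {render_template(v, context, jinja_env) for v in value}
    if isinstance(value, dict):
        return {k: render_template(v, context, jinja_env) for k, v in value.items()}
    return value

env = jinja2.Environment()
print(render_template({"{{ a }}", "b"}, {"a": 1}, env))  # {'1', 'b'}
```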
* AIRFLOW-5139 Allow custom ES configs
While attempting to create a self-signed TLS connection between Airflow
and ES, we discovered that Airflow does not allow users to modify the
SSL state of the `ElasticsearchTaskHandler`. This commit allows users
to define ES settings in `airflow.cfg`.
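A hedged example of what such settings might look like in `airflow.cfg` (section and key names are assumptions based on this change's intent; verify against your Airflow version):

```ini
[elasticsearch_configs]
# Forwarded to the underlying Elasticsearch client by the task handler.
use_ssl = True
verify_certs = True
```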