- changes the order of arguments for `has_mail_attachment`, `retrieve_mail_attachments` and `download_mail_attachments`
- add `get_conn` function
- refactor code
- fix pylint issues
- add `imap_mail_filter` arg to ImapAttachmentToS3Operator
- add `mail_filter` arg to ImapAttachmentSensor (see the sketch after this list)
- remove superfluous tests
- change the order of arguments in the sensor's and operator's `__init__`
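A minimal sketch of the new `mail_filter` argument; the import path and the `attachment_name` and `conn_id` parameter names are assumptions about the sensor's interface, not taken from this change:
```
from datetime import datetime

from airflow import DAG
from airflow.contrib.sensors.imap_attachment_sensor import ImapAttachmentSensor  # assumed path

dag = DAG("imap_example", start_date=datetime(2019, 1, 1), schedule_interval=None)

# Wait for an attachment, narrowing the search with the new mail_filter argument.
wait_for_report = ImapAttachmentSensor(
    task_id="wait_for_report",
    attachment_name="report.csv",  # assumed parameter name
    mail_filter="UNSEEN",          # the argument added by this change
    conn_id="imap_default",        # assumed parameter name / default connection
    dag=dag,
)
```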
Change SubDagOperator to use Airflow scheduler to schedule
tasks in subdags instead of backfill.
In the past, SubDagOperator relied on the backfill scheduler
to schedule tasks in the subdags. Tasks in the parent DAG
were scheduled via the Airflow scheduler while tasks in
a subdag were scheduled via backfill, which complicated
the scheduling logic and made it difficult to maintain
the two scheduling code paths.
This PR simplifies how tasks in subdags are scheduled.
SubDagOperator is responsible for creating a DagRun for the subdag
and waiting until all the tasks in the subdag finish. The Airflow
scheduler picks up the DagRun created by SubDagOperator and
creates and schedules the tasks accordingly.
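A minimal usage sketch for context; the operator's interface is unchanged by this PR, only the way the subdag's tasks get scheduled:
```
from datetime import datetime

from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator
from airflow.operators.subdag_operator import SubDagOperator

parent = DAG("parent", start_date=datetime(2019, 1, 1), schedule_interval="@daily")

# The subdag's dag_id must be "<parent_dag_id>.<task_id>".
subdag = DAG("parent.section_1", start_date=parent.start_date,
             schedule_interval=parent.schedule_interval)
DummyOperator(task_id="inner_task", dag=subdag)

# SubDagOperator now creates a DagRun for "parent.section_1" and waits for it to
# finish; the Airflow scheduler schedules the subdag's tasks instead of backfill.
SubDagOperator(task_id="section_1", subdag=subdag, dag=parent)
```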
Note: The order of arguments has changed for `check_for_prefix`.
The `bucket_name` is now optional. It falls back to the `connection schema` attribute (see the sketch below).
- refactor code
- complete docs
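A minimal sketch of the new call, assuming the reordered signature is `check_for_prefix(prefix, delimiter, bucket_name=None)`:
```
from airflow.hooks.S3_hook import S3Hook

hook = S3Hook(aws_conn_id="aws_default")

# Explicit bucket:
hook.check_for_prefix(prefix="data/2019/", delimiter="/", bucket_name="my-bucket")

# bucket_name omitted: falls back to the schema attribute of the connection.
hook.check_for_prefix(prefix="data/2019/", delimiter="/")
```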
Support for different Celery pool implementations was added
in https://github.com/apache/airflow/pull/4308,
but it is not reflected in default_airflow.cfg yet, even though that file
is the main entry point to config options for most users.
`non_pooled_task_slot_count` and `non_pooled_backfill_task_slot_count`
are removed in favor of a real pool, e.g. `default_pool`.
By default, tasks run in `default_pool`.
`default_pool` is initialized with 128 slots and users can change the
number of slots through the UI/CLI. `default_pool` cannot be removed.
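A minimal sketch of how tasks end up in pools; the DAG, task, and pool names are illustrative:
```
from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

dag = DAG("pool_example", start_date=datetime(2019, 1, 1), schedule_interval=None)

# No pool specified: the task runs in default_pool (128 slots by default).
BashOperator(task_id="in_default_pool", bash_command="echo hi", dag=dag)

# Explicit pool: the task competes for slots in that pool instead.
BashOperator(task_id="in_reporting_pool", bash_command="echo hi",
             pool="reporting_pool", dag=dag)
```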
google-storage-client 1.16 introduced a breaking change where the
signature of client.get_bucket changed from (bucket_name) to
(bucket_or_name). Calls with named arguments to this method now fail.
This commit makes all calls positional to work around this.
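A minimal sketch of the difference, assuming the `google.cloud.storage` client at version 1.16 or later:
```
from google.cloud import storage

client = storage.Client()

# Works on both old and new client versions: pass the bucket name positionally.
bucket = client.get_bucket("my-bucket")

# Fails on 1.16+, where the parameter was renamed from bucket_name to bucket_or_name:
# bucket = client.get_bucket(bucket_name="my-bucket")
```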
When using offsets potentially larger than JavaScript can handle, they can get parsed incorrectly on the client, resulting in the offset query getting stuck on a certain number. This patch ensures that we return a string to the client to avoid it being parsed as a number. When we run the query, we ensure the offset is cast to an integer.
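A minimal sketch of the pattern with hypothetical helper names; offsets above JavaScript's safe-integer limit (2**53 - 1) would otherwise lose precision in the browser:
```
def metadata_for_client(next_offset):
    # Hand the offset back as a string so the browser never parses it as a number.
    return {"offset": str(next_offset)}

def read_logs(offset):
    # Cast the echoed-back offset to an integer before using it in the query.
    offset = int(offset)
    ...
```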
Add unnecessary prefix_ in config for elasticsearch section
- upgrade cloudant version from `>=0.5.9,<2.0` to `>=2.0`
- remove the use of the `schema` attribute in the connection
- remove `db` function since the database object can also be retrieved by calling `cloudant_session['database_name']` (see the sketch after this list)
- update docs
- refactor code
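A minimal sketch of the new usage; the connection id parameter name and the `database_name` are illustrative assumptions:
```
from airflow.contrib.hooks.cloudant_hook import CloudantHook

hook = CloudantHook(cloudant_conn_id="cloudant_default")

# get_conn() returns a cloudant session; the removed db() helper is no longer
# needed because the session can be indexed by database name directly.
with hook.get_conn() as cloudant_session:
    database = cloudant_session["database_name"]
```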
- update the default version used for connecting to the Admin API from v1beta1 to v1
- move the establishment of the connection to the function calls instead of the hook init
- change get_conn signature to be able to pass an is_admin arg to set an admin connection
- rename GoogleCloudBaseHook._authorize function to GoogleCloudBaseHook.authorize
- rename the `partialKeys` argument of the `allocate_ids` function to `partial_keys` (see the sketch after this list).
- add tests
- update docs
- refactor code
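A minimal sketch of the renamed argument; the hook's import path and the partial-key payload are illustrative assumptions:
```
from airflow.contrib.hooks.datastore_hook import DatastoreHook

hook = DatastoreHook()

# Before this change the keyword was camelCase: hook.allocate_ids(partialKeys=keys)
keys = hook.allocate_ids(partial_keys=[{"path": [{"kind": "Task"}]}])
```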
Move the version attribute from `get_conn` to `__init__`
- revert renaming of authorize function
- improve docs
- refactor code
* [AIRFLOW-4172] Fix changes for driver class path option in Spark Submit Operator
Some commands for installing extra packages, like
`pip install apache-airflow[devel]`, cause errors
in special situations/shells. We should fix them
by adding quotation marks, like
`pip install 'apache-airflow[devel]'`.
There were a few ways of getting the AIRFLOW_HOME directory used
throughout the code base, giving possibly conflicting answers if they
weren't kept in sync:
- the AIRFLOW_HOME environment variable
- core/airflow_home from the config
- settings.AIRFLOW_HOME
- configuration.AIRFLOW_HOME
Since the home directory is used to compute the default path of the
config file to load, specifying the home directory again in the config
file didn't make any sense to me, and I have deprecated that.
This commit makes everything in the code base use
`settings.AIRFLOW_HOME` as the source of truth, and deprecates the
core/airflow_home config option.
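A minimal sketch of the single remaining lookup:
```
from airflow import settings

# settings.AIRFLOW_HOME is now the source of truth; reading core/airflow_home
# from the config or configuration.AIRFLOW_HOME is deprecated.
print(settings.AIRFLOW_HOME)
```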
There was an import cycle from settings -> logging_config ->
module_loading -> settings that needed to be broken on Python 2, so I
have moved all adjusting of sys.path into the settings module.
(This issue caused me a problem where the RBAC UI wouldn't work, as it
didn't find the right webserver_config.py.)
Some commands for installing extra packages, such as
`pip install apache-airflow[devel]`, should be
written with quotation marks, i.e.
`pip install 'apache-airflow[devel]'`.
[ci skip]
* [AIRFLOW-4122] Remove chain function
Bitshift operators like `>>` and `<<` are the suggested
way to set dependencies, as they are visual and easier to
explain, and having multiple ways to do the same thing is
confusing (see the sketch after this message).
* change UPDATING.md as recommended
[ci skip]
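A minimal sketch of the preferred bitshift style that replaces `chain`:
```
from datetime import datetime

from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator

dag = DAG("deps_example", start_date=datetime(2019, 1, 1), schedule_interval=None)

op1 = DummyOperator(task_id="op1", dag=dag)
op2 = DummyOperator(task_id="op2", dag=dag)
op3 = DummyOperator(task_id="op3", dag=dag)

# Instead of chain(op1, op2, op3): op1 runs before op2, which runs before op3.
op1 >> op2 >> op3
```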
This will not change the existing behavior of the `Variable` class's functions. If
the variable `foo` doesn't exist:
```
foo = Variable.get("foo")
-> KeyError
```
If `default_var=None` is passed to `get`, `None` is returned instead:
```
foo = Variable.get("foo", default_var=None)
if foo is None:
    handle_missing_foo()
```