We were seeing an intermittent issue where the executor reported a task instance as finished while the task itself said it was still in the queued state. It was caused by a race condition: while the scheduler was clearing the event_buffer in the _process_executor_events method in jobs.py, the executor was about to mark the next retry of a task (which had failed in the previous try) as running. So we added retry_number as a member of the TaskInstance key property.
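
A minimal sketch of the idea, assuming a simplified key tuple; the names below are illustrative rather than the exact Airflow internals:

    from collections import namedtuple

    # Hypothetical simplified key: adding retry_number means an event left
    # over from a failed try can no longer be confused with the queued retry.
    TaskKey = namedtuple("TaskKey",
                         ["dag_id", "task_id", "execution_date", "retry_number"])

    event_buffer = {
        TaskKey("example_dag", "example_task", "2018-01-01", 1): "failed",
    }

    # The retry runs under retry_number=2, so the stale "failed" event for
    # retry_number=1 no longer matches its key.
    retry_key = TaskKey("example_dag", "example_task", "2018-01-01", 2)
    assert retry_key not in event_buffer
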
Caused by an update in PR #3740.

execute_command.apply_async(args=command, ...)

- command is a list of short unicode strings, and the code above passes
  multiple arguments to a function defined as taking only one argument.
- command = ["airflow", "run", "dag323", ...]
- args = command = ["airflow", "run", "dag323", ...]
- execute_command("airflow", "run", "dag323", ...) raises an error and exits.
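
A minimal sketch of the failure mode and the fix, using a plain Celery task for illustration (the broker URL is a placeholder):

    from celery import Celery

    app = Celery("tasks", broker="memory://")  # placeholder broker

    @app.task
    def execute_command(command):
        # Defined as taking a single argument: the whole command list.
        print(command)

    command = ["airflow", "run", "dag323"]

    # Buggy: args=command unpacks the list element-wise, calling
    # execute_command("airflow", "run", "dag323") -> TypeError.
    # execute_command.apply_async(args=command)

    # Fixed: wrap the list so it arrives as one positional argument.
    execute_command.apply_async(args=[command])
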
Bug-1:
If a task state becomes SUCCESS, FAILURE, or REVOKED, the task is removed from self.tasks and self.last_state. However, because line 108 is not indented properly, the task is added back to self.last_state again.

Bug-2:
When the state is updated, the code refers to the latest state task.state rather than the variable state. This may result in a deadlock if the state changes from STARTED to SUCCESS after the if-elif-else block has started.

The test case is updated for the fix to Bug-1.
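
A minimal sketch of the corrected sync loop, assuming dict-based tasks/last_state bookkeeping; this is illustrative, not the exact executor code:

    SUCCESS, FAILURE, REVOKED, STARTED = "SUCCESS", "FAILURE", "REVOKED", "STARTED"

    class FakeResult:
        def __init__(self, state):
            self.state = state

    class SketchExecutor:
        def __init__(self):
            self.tasks = {"t1": FakeResult(SUCCESS), "t2": FakeResult(STARTED)}
            self.last_state = {"t1": STARTED, "t2": STARTED}

        def sync(self):
            for key, task in list(self.tasks.items()):
                # Bug-2 fix: read the state once, so a concurrent
                # STARTED -> SUCCESS transition cannot change it mid-branch.
                state = task.state
                if self.last_state[key] == state:
                    continue
                if state in (SUCCESS, FAILURE, REVOKED):
                    # Finished: drop from both dicts and do NOT touch
                    # last_state afterwards (Bug-1 fix: the update below
                    # stays inside the else branch).
                    del self.tasks[key]
                    del self.last_state[key]
                else:
                    self.last_state[key] = state

    ex = SketchExecutor()
    ex.sync()
    assert "t1" not in ex.last_state  # finished task stays removed
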
[dask] Added TLS/SSL support for the dask-distributed scheduler.

As of 0.17.0, dask distributed has support for TLS/SSL. Also adds a test for TLS under dask distributed.

Closes #2683 from mariusvniekerk/dask-ssl
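
A minimal sketch of connecting a dask distributed client over TLS, assuming distributed >= 0.17.0; the certificate paths and scheduler address are placeholders:

    from distributed import Client
    from distributed.security import Security

    security = Security(
        tls_ca_file="/path/to/ca.pem",
        tls_client_cert="/path/to/client-cert.pem",
        tls_client_key="/path/to/client-key.pem",
        require_encryption=True,
    )

    # Note the tls:// scheme on the scheduler address.
    client = Client("tls://dask-scheduler:8786", security=security)
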
The command versions of config parameters were overridden by the default config. E.g. sql_alchemy_conn got the default value even when sql_alchemy_conn_cmd was specified.

Closes #3029 from cjgu/airflow-610
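
A minimal sketch of the intended precedence: a *_cmd option, when present, runs a shell command and its stdout wins over the plain default. The option names and defaults here are illustrative:

    import subprocess

    defaults = {"sql_alchemy_conn": "sqlite:////tmp/airflow.db"}
    user_config = {"sql_alchemy_conn_cmd": "echo postgresql://user:pw@host/db"}

    def get(option):
        cmd = user_config.get(option + "_cmd")
        if cmd is not None:
            # The _cmd variant takes precedence: run it and use its stdout.
            return subprocess.check_output(cmd, shell=True).decode().strip()
        return user_config.get(option, defaults[option])

    print(get("sql_alchemy_conn"))  # -> postgresql://user:pw@host/db
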
Explicitly set the celery backend from the config, and align the config with the celery config, as the mismatch might be confusing.

Closes #2806 from Fokko/AIRFLOW-1840-Fix-celery-config
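
A minimal sketch of passing the backend explicitly when creating the Celery app, instead of relying on an implicit default; the URLs and app name are placeholders:

    from celery import Celery

    broker_url = "redis://localhost:6379/0"                   # placeholder
    result_backend = "db+postgresql://user:pw@host/airflow"   # placeholder

    app = Celery("airflow.executors.celery_executor",
                 broker=broker_url,
                 backend=result_backend)
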
While we do handle the executor state in backfills, we do not in the scheduler. In case there is an unspecified error (e.g. a timeout or an airflow command failure), tasks can get stuck.

Closes #2715 from bolkedebruin/AIRFLOW-1641
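
An illustrative sketch of the scheduler-side reconciliation described above: if the executor reports a task as failed while the scheduler still thinks it is queued or running, mark it failed instead of leaving it stuck. The state names and dicts are hypothetical:

    QUEUED, RUNNING, FAILED, SUCCESS = "queued", "running", "failed", "success"

    task_states = {"task_a": QUEUED}
    executor_events = {"task_a": FAILED}  # e.g. the airflow command itself died

    for key, executor_state in executor_events.items():
        if executor_state == FAILED and task_states[key] in (QUEUED, RUNNING):
            # Without this reconciliation the task would stay queued forever.
            task_states[key] = FAILED

    assert task_states["task_a"] == FAILED
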
Before, if unlimited parallelism was used by passing 0 for the parallelism value, the local executor would stall execution since no worker was being created, violating the BaseExecutor contract on the parallelism option. Now, if unbounded parallelism is used, processes are created on demand for each task submitted for execution.

Closes #2658 from edgarRd/erod-localexecutor-fix
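
A minimal sketch of the on-demand strategy for parallelism == 0; this is illustrative, not the actual LocalExecutor code:

    import multiprocessing

    def run_task(command):
        print("executing:", command)

    def execute_async(command, parallelism=0):
        if parallelism == 0:
            # Unlimited parallelism: spawn a dedicated process per task
            # instead of dispatching to a (previously empty) fixed pool.
            proc = multiprocessing.Process(target=run_task, args=(command,))
            proc.start()
            return proc
        # Bounded parallelism would enqueue to a fixed pool of workers.
        raise NotImplementedError("pool-based path omitted in this sketch")

    if __name__ == "__main__":
        p = execute_async(["airflow", "run", "example_dag", "task_1"])
        p.join()
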
The celery config is currently part of the celery executor definition. This is really inflexible for users wanting to change it. In addition, Celery 4 is moving to lowercase config keys.

Closes #2542 from bolkedebruin/upgrade_celery

In all the popular languages, the variable name log is the de facto standard for logging. Rename LoggingMixin.py to logging_mixin.py to comply with the Python module-naming standard. When the .logger attribute is used, a deprecation warning will be emitted.

Closes #2604 from Fokko/AIRFLOW-1604-logger-to-log
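
A minimal sketch of the backwards-compatible rename: .log is the new attribute and .logger emits a DeprecationWarning. Illustrative, not the exact mixin:

    import logging
    import warnings

    class LoggingMixin:
        @property
        def log(self):
            return logging.getLogger(self.__class__.__module__ + "."
                                     + self.__class__.__name__)

        @property
        def logger(self):
            warnings.warn("logger is deprecated, use log instead",
                          DeprecationWarning, stacklevel=2)
            return self.log
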
The refactor to use dag runs in backfills caused a regression in task execution performance, as dag runs were executed sequentially. Next to that, backfills were non-deterministic due to the random execution order of tasks, causing root tasks to be added to the not-ready list too soon.

This updates the backfill logic as follows:
* Parallelize the execution of tasks (see the sketch below)
* Use a leaf-first execution model
* Replace state updates from the executor by task-based updates only

Closes #2107 from bolkedebruin/AIRFLOW-910
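
An illustrative sketch of the parallelized dispatch: across all dag runs, every task whose upstream tasks are done becomes runnable in the same scheduling pass, instead of dag runs executing one-by-one. The run and task names are hypothetical:

    dag_runs = {
        "run_1": {"extract": [], "transform": ["extract"], "load": ["transform"]},
        "run_2": {"extract": [], "transform": ["extract"], "load": ["transform"]},
    }
    done = {("run_1", "extract")}  # completion tracked per (run, task)

    def ready_tasks():
        for run_id, tasks in dag_runs.items():
            for task_id, upstream in tasks.items():
                if (run_id, task_id) in done:
                    continue
                if all((run_id, up) in done for up in upstream):
                    yield run_id, task_id

    # Tasks from *both* runs are runnable in the same pass, instead of
    # run_2 waiting for run_1 to finish.
    print(sorted(ready_tasks()))
    # [('run_1', 'transform'), ('run_2', 'extract')]
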