Commit Graph

130 Commits

Author SHA1 Message Date
Tomek 6fffa5b0d7 [AIRFLOW-6343] Make tests/* pylint compatible (#6899) 2019-12-26 21:30:43 +01:00
Andrey Klochkov 702005fe35 [AIRFLOW-6171] Apply .airflowignore to correct subdirectories (#6784)
Fix the defect that applied .airflowignore rules from one subdirectory
to all other subdirectories scanned later.
2019-12-12 18:01:49 +00:00
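A minimal sketch of the scoping idea in the commit above, not the actual Airflow implementation: each ignore file's patterns apply only to its own directory and the directories beneath it, never to sibling directories scanned later. The function name `list_unignored_files` and the choice to match patterns against paths relative to the root are assumptions made for illustration.

```python
import os
import re


def list_unignored_files(root, ignore_file_name=".airflowignore"):
    """Walk `root`, applying each ignore file only to its own subtree.

    Hypothetical helper for illustration; returns paths relative to `root`.
    """
    root = os.path.abspath(root)
    found = []
    # Maps each visited directory to the patterns in effect inside it.
    patterns_by_dir = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        # Inherit patterns from the parent directory only, never from siblings.
        patterns = list(patterns_by_dir.get(os.path.dirname(dirpath), []))
        ignore_path = os.path.join(dirpath, ignore_file_name)
        if os.path.isfile(ignore_path):
            with open(ignore_path) as handle:
                patterns += [re.compile(line.strip()) for line in handle if line.strip()]
        patterns_by_dir[dirpath] = patterns
        for name in filenames:
            if name == ignore_file_name:
                continue
            relative_path = os.path.relpath(os.path.join(dirpath, name), root)
            if any(pattern.search(relative_path) for pattern in patterns):
                continue
            found.append(relative_path)
    return sorted(found)
```

Keying the patterns by directory is what keeps a rule found in one subdirectory from leaking into the next one that the walk visits.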
Daniel Imberman f3bb4c31b8 [AIRFLOW-6175] Fixes bug when tasks get stuck in "scheduled" state (#6732)
There is a bug caused by scheduler_jobs refactor which leads to task failure
and scheduler locking.

Essentially, when there is an overflow of tasks going into the scheduler, the
tasks are set back to scheduled, but are not removed from the executor's
queued_tasks queue.

This means that the executor will attempt to run tasks that are in the scheduled
state, but those tasks will fail dependency checks. Eventually the queue is
filled with scheduled tasks, and the scheduler can no longer run.

Co-Authored-By: Kaxil Naik <kaxilnaik@gmail.com>, Kevin Yang <kevin.yang@airbnb.com>
2019-12-10 11:17:30 +00:00
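The failure mode described in the commit above can be sketched with a toy model. This is an illustration under stated assumptions, not the scheduler's real code: `ToyExecutor`, `enqueue_with_limit`, and the plain string states are all hypothetical. It shows why a task that is set back to "scheduled" also has to leave the executor's `queued_tasks` mapping.

```python
class ToyExecutor:
    """Hypothetical stand-in for an executor that keeps a queued_tasks mapping."""

    def __init__(self):
        self.queued_tasks = {}

    def queue(self, key):
        self.queued_tasks[key] = "command for " + key


def enqueue_with_limit(executor, task_states, keys, open_slots):
    """Queue every task, then return the overflow to the 'scheduled' state."""
    for key in keys:
        task_states[key] = "queued"
        executor.queue(key)
    overflow = keys[open_slots:]
    for key in overflow:
        task_states[key] = "scheduled"
        # Without this removal the executor would still try to run a task
        # whose state is 'scheduled', and that task would fail its
        # dependency checks.
        executor.queued_tasks.pop(key, None)
    return overflow
```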
Kamil Breguła c9a97baa86
[AIRFLOW-6135] Extract DAG processing from SchedulerJob into separate class (#6697) 2019-12-09 00:29:46 +01:00
Tomek e61025e1ac [AIRFLOW-6058] Running tests with pytest (#6472)
This commit runs Airflow's test suite using pytest.
2019-12-05 10:40:28 +01:00
Qingping Hou dbf81df244 [AIRFLOW-5902] avoid unnecessary sleep to maintain local task job heart rate (#6553)
The sleep that maintains the heart rate is already done by the heartbeat() call.
2019-12-04 10:43:38 +00:00
Jarek Potiuk a36cfe049a
[AIRFLOW-6004] Untangle Executors class to avoid cyclic imports (#6596)
There are cyclic imports detected seemingly randomly by pylint checks when some
of the PRs are run in CI.

It was not deterministic because pylint usually uses as many processors as
are available and splits the list of .py files between the separate
pylint processes. Depending on how the split is done, the pylint check might
or might not detect the cycle. The cycle is always detected when all files are used.

In order to make it more deterministic, all pylint and mypy errors were resolved
in the whole executors package and in dag_processor.

At the same time, plugins_manager was also moved out of the executors
and all of the operators/hooks/sensors/macros because it was also causing
cyclic dependencies, and it is far easier to untangle those dependencies
in the executors when the initialisation of all plugins is moved to plugins_manager.

Additionally, require_serial is set in the pre-commit configuration to
make sure cycle detection is deterministic.
2019-12-03 16:02:20 +01:00
Kamil Breguła 4a344f13d2
[AIRFLOW-6001] Lazy load CLI commands (#6594)
* [AIRFLOW-YYY] Lazy load API Client

* [AIRFLOW-YYY] Introduce order in CLI's function names

* [AIRFLOW-YYY] Create cli package

* [AIRFLOW-YYY] Move user and roles command to separate files

* [AIRFLOW-YYY] Move sync_perm command to separate file

* [AIRFLOW-YYY] Move task commands to separate file

* [AIRFLOW-YYY] Move pool commands to separate file

* [AIRFLOW-YYY] Move variable commands to separate file

* [AIRFLOW-YYY] Move db commands to separate file

* fixup! [AIRFLOW-YYY] Move variable commands to separate file

* [AIRFLOW-YYY] Move connection commands to separate file

* [AIRFLOW-YYY] Move version command to separate file

* [AIRFLOW-YYY] Move scheduler command to separate file

* [AIRFLOW-YYY] Move worker command to separate file

* [AIRFLOW-YYY] Move webserver command to separate file

* [AIRFLOW-YYY] Move dag commands to separate file

* [AIRFLOW-YYY] Move serve logs command to separate file

* [AIRFLOW-YYY] Move flower command to separate file

* [AIRFLOW-YYY] Move kerberos command to separate file

* [AIRFLOW-YYY] Lazy load CLI commands

* [AIRFLOW-YYY] Fix migration

* fixup! [AIRFLOW-YYY] Fix migration

* fixup! fixup! [AIRFLOW-YYY] Fix migration
2019-11-19 17:12:40 +01:00
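Lazy loading of CLI commands, as named in the commit above, can be sketched roughly as follows. The helper `lazy_load_command` and the `COMMANDS` table are hypothetical names, and standard-library functions stand in for real command modules; the point is only that the module behind a command is imported when the command is invoked, not when the command table is built.

```python
import importlib


def lazy_load_command(import_path):
    """Return a callable that imports `import_path` only when it is invoked."""
    module_name, _, attr_name = import_path.rpartition(".")

    def command(*args, **kwargs):
        # The import happens here, on first use, not at table-build time.
        module = importlib.import_module(module_name)
        return getattr(module, attr_name)(*args, **kwargs)

    command.__name__ = attr_name
    return command


# Stand-in command table; a real CLI would point at its own command modules.
COMMANDS = {
    "dumps": lazy_load_command("json.dumps"),
    "sqrt": lazy_load_command("math.sqrt"),
}
```

Calling `COMMANDS["sqrt"](9)` imports `math` at that moment and returns `3.0`, so starting the CLI does not pay the import cost of commands that are never run.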
Qingping Hou 6bcbd48792 [AIRFLOW-5811] add metric for externally killed task count (#6466) 2019-11-06 12:46:58 -08:00
Chao-Han Tsai bc53412234
[AIRFLOW-5714] Collect SLA miss emails only from tasks missed SLA (#6384)
Currently when a task in the DAG missed the SLA,
Airflow would traverse through all the tasks in the DAG
and collect all the task-level emails. Then Airflow would
send an SLA miss email to all those collected emails,
which can add unnecessary noise for task owners whose tasks
did not contribute to the SLA miss.

This change collects emails only
from the tasks that missed the SLA.
2019-10-22 11:37:22 -07:00
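A rough sketch of the narrowed collection described in the commit above, assuming task emails are available as a plain mapping from task id to an address or list of addresses. The function `collect_sla_miss_emails` and its arguments are hypothetical and do not mirror Airflow's internal data model.

```python
def collect_sla_miss_emails(task_emails, missed_task_ids):
    """Collect recipient addresses only from tasks that missed their SLA."""
    recipients = set()
    for task_id in missed_task_ids:
        value = task_emails.get(task_id) or []
        # A task may declare one address or a list of addresses.
        if isinstance(value, str):
            value = [value]
        recipients.update(value)
    return sorted(recipients)
```

Iterating over the missed task ids, not over every task in the DAG, is what keeps owners of unrelated tasks off the recipient list.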
Ash Berlin-Taylor 68b8ec5f41
[AIRFLOW-5102] Worker jobs should terminate themselves if they can't heartbeat (#6284)
If a LocalTaskJob fails to heartbeat for
scheduler_zombie_task_threshold, it should shut itself down.

However, at some point, a change was made to catch exceptions inside the
heartbeat, so the LocalTaskJob thought it had managed to heartbeat
successfully.

This effectively means that zombie tasks don't shut themselves down.
When the scheduler reschedules the job, this means we could have two
instances of the task running concurrently.
2019-10-08 17:00:15 +01:00
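The behaviour the commit above describes can be modelled with a small toy class. This is a sketch under stated assumptions: `ToyTaskJob`, the injected `clock`, and the `terminated` flag are hypothetical and far simpler than a real LocalTaskJob. The key point is that a swallowed exception must not count as a successful heartbeat.

```python
class ToyTaskJob:
    """Hypothetical job that shuts itself down after missing heartbeats."""

    def __init__(self, zombie_threshold, clock):
        self.zombie_threshold = zombie_threshold
        self.clock = clock
        self.last_success = clock()
        self.terminated = False

    def heartbeat(self, callback):
        try:
            callback()
            # Only a heartbeat that actually succeeded refreshes the timestamp.
            self.last_success = self.clock()
        except Exception:
            # The error is swallowed, but last_success stays unchanged.
            pass
        if self.clock() - self.last_success > self.zombie_threshold:
            self.terminated = True
```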
Kevin Yang d719e1fd67 [AIRFLOW-5362] Reorder imports (#5944) 2019-10-02 16:30:03 +01:00
Hao Liang f497d1d5aa [AIRFLOW-4858] Deprecate "Historical convenience functions" in airflow.configuration (#5495)
1. Issue old conf method deprecation warnings properly and remove current old conf method usages.
2. Unify the way to use conf as `from airflow.configuration import conf`
2019-09-03 17:08:55 +01:00
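One common way to deprecate a module-level convenience function in favour of a method on a shared object is to keep a thin wrapper that warns and delegates. The sketch below illustrates that pattern only; `ToyConf`, its stored values, and the wording of the warning are assumptions and are not taken from `airflow.configuration`.

```python
import warnings


class ToyConf:
    """Hypothetical configuration object with a get() method."""

    def __init__(self, values):
        self._values = values

    def get(self, section, key):
        return self._values[(section, key)]


conf = ToyConf({("core", "executor"): "SequentialExecutor"})


def get(section, key):
    """Deprecated module-level wrapper that delegates to conf.get()."""
    warnings.warn(
        "Calling the module-level get() is deprecated; use conf.get() instead.",
        DeprecationWarning,
        stacklevel=2,  # Point the warning at the caller, not at this wrapper.
    )
    return conf.get(section, key)
```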
Tomek 23d104203e [AIRFLOW-5309] Use assert_called_once or has_calls in tests (#5912)
Using the mock.assert_called_with method can result in flaky tests
(e.g. iterating through a dict in Python 3.5, which does not
preserve the order of elements). That's why it's better to
use the assert_called_once_with or assert_has_calls methods.
2019-08-26 07:03:43 +02:00
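In `unittest.mock`, the relevant helpers are `assert_called_with`, `assert_called_once_with`, and `assert_has_calls`. The short example below, which uses a made-up `client.send` mock, shows the difference in strictness: `assert_called_with` compares only the most recent call, so it passes even when the mock was called more often than intended.

```python
from unittest import mock

client = mock.Mock()
client.send("a")
client.send("b")

# Passes: only the most recent call is compared.
client.send.assert_called_with("b")

# Fails: the mock was called twice, not once.
try:
    client.send.assert_called_once_with("b")
    once_check_failed = False
except AssertionError:
    once_check_failed = True

# Passes: checks the full sequence of calls, in order.
client.send.assert_has_calls([mock.call("a"), mock.call("b")])
```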
Bas Harenslak 5196db38f2 [AIRFLOW-5241] Make all test class names consistent (#5847)
Make all test class names consistent by starting with Test
2019-08-22 14:14:25 +02:00
Kevin Yang 203e6e891e
[AIRFLOW-4285] Update task dependency context definition and usage (#5079) 2019-08-15 15:06:12 -07:00
Kevin Yang e07e30460c [AIRFLOW-4956] Fix LocalTaskJob heartbeat log spamming (#5589) 2019-08-08 08:59:53 +02:00
Chao-Han Tsai 0be39219cd [AIRFLOW-4509] SubDagOperator using scheduler instead of backfill (#5498)
Change SubDagOperator to use Airflow scheduler to schedule
tasks in subdags instead of backfill.

In the past, SubDagOperator relied on the backfill scheduler
to schedule tasks in the subdags. Tasks in the parent DAG
are scheduled via the Airflow scheduler while tasks in
a subdag are scheduled via backfill, which complicates
the scheduling logic and makes the two scheduling
code paths harder to maintain.

This PR simplifies how tasks in subdags are scheduled.
SubDagOperator is responsible for creating a DagRun for the subdag
and waiting until all the tasks in the subdag finish. The Airflow
scheduler picks up the DagRun created by SubDagOperator, then
creates and schedules the tasks accordingly.
2019-08-07 21:17:50 +02:00
Du Jiangfan 05c01a9749 [AIRFLOW-4822] Fix bug where parent-dag task instances are wrongly cleared (#5444)
Full matching is required in this case, so the regex should start with "^"
and end with "$". Partial matching might result in unrelated task instances
being cleared.

Also in this commit:

* Added independent test dag: `clear_subdag_test_dag`
* Polished related unit test: `test_subdag_clear_parentdag_downstream_clear`
2019-07-30 13:06:17 +01:00
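The effect of anchoring can be seen with Python's `re` module. This is an illustration only: the DAG ids are made up, and `re.search` stands in for whatever matching mechanism the real clearing code uses.

```python
import re

# Made-up DAG ids: a parent, one of its subdags, and an unrelated DAG.
dag_ids = ["parent", "parent.child", "other_parent"]


def matching(pattern):
    """Return the ids that the pattern matches anywhere in the string."""
    return [dag_id for dag_id in dag_ids if re.search(pattern, dag_id)]


loose = matching("parent")    # matches all three ids
exact = matching("^parent$")  # matches only "parent"
```

Without the anchors, a pattern meant for one DAG also selects every id that merely contains it.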
Kamil Breguła 96933b0797 [AIRFLOW-4952] Remove unused arguments in tests (#5586) 2019-07-20 11:00:26 +02:00
Joshua Carp 30defe130d [AIRFLOW-3998] Use nested commands in cli. (#4821) 2019-07-19 08:40:14 +01:00
Jarek Potiuk 2d086d77f1
[AIRFLOW-4117] Travis CI uses multi-stage images to run tests (#4938) 2019-07-17 22:42:43 +02:00
Kamil Breguła 7d08cacfbb [AIRFLOW-4954] Remove unused variables from tests (#5588) 2019-07-15 21:42:42 +02:00
Kamil Breguła 18e60f4a29 [AIRFLOW-4942] Drop six.next (#5576) 2019-07-14 14:23:16 +02:00
Kamil Breguła b5fb370446 [AIRFLOW-4943] Replace six assertion method with native (#5577) 2019-07-14 07:12:20 +02:00
Ash Berlin-Taylor 6ae5f2b69a
[AIRFLOW-4864] Remove calls to load_test_config (#5502)
We already set the environment variable in the test runner so that
airflow.configuration will do this -- we don't need to do it again
2019-07-01 16:43:03 +01:00
Chao-Han Tsai 2c99ec624b [AIRFLOW-4591] Make default_pool a real pool (#5349)
`non_pooled_task_slot_count` and `non_pooled_backfill_task_slot_count`
are removed in favor of a real pool, i.e. `default_pool`.

By default, tasks run in `default_pool`.
`default_pool` is initialized with 128 slots, and users can change the
number of slots through the UI/CLI. `default_pool` cannot be removed.
2019-06-20 10:16:50 -07:00
Kamil Breguła e77d78db9e [AIRFLOW-4817] Remove deprecated methods from tests (#5438) 2019-06-20 04:56:44 +02:00
Ash Berlin-Taylor 3dcfe2801c
[AIRFLOW-4343] Show warning in UI if scheduler is not running (#5127)
Now that the webserver is more stateless, if the scheduler is not
running the list of dags won't populate, making it harder for new
starters to work out what is going on.

New dep is BSD-2 which is Cat-A under ASF
2019-05-29 10:50:07 +01:00
Chao-Han Tsai d337c93c28 [AIRFLOW-4535] Break jobs.py into multiple files (#5303) 2019-05-19 20:43:54 +02:00