It was possible to "block" the scheduler such that it would not
schedule or queue tasks for a DAG if you triggered a DAG run while
the DAG was already at its max active runs.
This approach works around the problem for now, but a better
long-term fix would be to introduce a "queued" state for DagRuns:
when manually creating DAG runs (or clearing), set them to queued,
and only have the scheduler set DagRuns to running, nothing else --
this would mean we wouldn't need to examine active runs in the TI
part of the scheduler loop, only in the DagRun creation part.
Fixes #11582
...so that whenever the Airflow server restarts, it does not leave rogue ECS Tasks. Instead, the operator will look for any running instance and attach to it.
We've implemented the capability of running the tests in smaller
chunks and selectively running only some of them, but this
capability was disabled by mistake by the default setting of
TEST_TYPE to "All" and by not removing it when TEST_TYPES is set
to the sets of tests that should be run.
This should speed up many of our tests and also hopefully
lower the chance of EXIT 137 errors.
This is an improvement to the UI response time when clearing dozens of DagRuns of large DAGs (thousands of tasks) containing many ExternalTaskSensor + ExternalTaskMarker pairs. In the current implementation, clearing tasks can get slow, especially if the user chooses to clear with Future, Downstream and Recursive all selected.
This PR speeds it up. There are two major improvements:
Updating self._task_group in dag.sub_dag() is improved to not deep copy _task_group, because that is a waste of time. Instead, handle it like dag.task_dict: set it to None first and then copy it explicitly (a sketch of this pattern follows below).
Pass the TaskInstances already visited down the recursive calls of dag.clear() as visited_external_tis. This speeds up the example in test_clear_overlapping_external_task_marker almost fivefold.
For real large DAGs containing 500 tasks set up in a similar manner, the time it takes to clear 30 DagRuns is cut from around 100s to less than 10s.
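A minimal sketch of the deep-copy avoidance pattern described above, using hypothetical names rather than the actual DAG internals: pre-seed the deepcopy memo so the heavy attribute is skipped, then rebuild only the part that is needed.
```
import copy


class Graph:
    """Hypothetical container with one attribute that is expensive to deep copy."""

    def __init__(self, nodes, heavy_index):
        self.nodes = nodes                # e.g. list of task ids
        self.heavy_index = heavy_index    # big structure we don't want deep-copied

    def partial_copy(self, keep):
        # Seed the memo so deepcopy replaces the heavy attribute with None
        # instead of walking it, then copy just the subset we need explicitly.
        memo = {id(self.heavy_index): None}
        new = copy.deepcopy(self, memo)
        new.nodes = [n for n in self.nodes if n in keep]
        new.heavy_index = {n: self.heavy_index[n] for n in new.nodes}
        return new
```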
This was messing up the "max_active_runs" calculation, and this fix is a
"hack" until we take the better approach of adding a queued state to
DagRuns -- at which point we won't have to do this calculation at
all.
This reverts commit 02ce45cafe.
That refactored the Celery worker to be compatible with 5.0. However, this
introduced some incompatibilities.
Closes: #11622
Closes: #11697
Some contexts try to close their reference to the stderr stream at logging shutdown; this change ensures those contexts don't break.
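For illustration only, a hypothetical stream wrapper (not the actual Airflow logging code) that tolerates a close() call on the shared stream:
```
class NonClosingStream:
    """Proxy that forwards writes to the wrapped stream but ignores close()."""

    def __init__(self, stream):
        self._stream = stream

    def write(self, data):
        return self._stream.write(data)

    def flush(self):
        self._stream.flush()

    def close(self):
        # Deliberately a no-op: another context "closing" its reference at
        # logging shutdown must not close the shared stderr for everyone else.
        pass
```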
* Make pylint happy
An explicit `pass` is better here, but the docstring _is_ a statement.
Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
The tests for connection export failed when CLI tests were
run in isolation. The problem was the non-deterministic
sequence of rows returned by the connection export query.
Rather than fixing the test to accept the non-deterministic
sequence, it is a better idea to always return the rows in
connection_id order. This does not change functionality and
is backwards compatible, but at the same time it gives stability
to the export, which might be important if someone uses the export
to determine, for example, whether some connections were added or removed.
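A sketch of the ordering meant here, assuming the export reads connections via SQLAlchemy; the query shape is illustrative, not the exact CLI code.
```
from airflow.models import Connection
from airflow.utils.session import create_session

with create_session() as session:
    # Deterministic export: always return connections sorted by conn_id.
    connections = session.query(Connection).order_by(Connection.conn_id).all()
```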
* fix: 🐛 Float to Int columns conversion
The `_fix_int_dytpes` method applies the `astype` transformation to
the return value of a `np.where` call. I added an extra step to the method in
order to apply this to the whole pd.Series. Note that Int64Dtype must be
used as an instance, since Pandas will raise an exception if the class is
used.
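A minimal standalone illustration of the dtype-instance requirement (not the hook's actual code):
```
import numpy as np
import pandas as pd

s = pd.Series([1.0, 2.0, np.nan])
# Cast the whole Series to the nullable integer dtype; the dtype has to be
# passed as an instance, pd.Int64Dtype(), not as the class itself.
s = s.astype(pd.Int64Dtype())
print(s.dtype)  # Int64
```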
* test: Add dtype test for integers
* style: Change line length
So far Breeze used in-container storage for persisting data
(MySQL, Redis, Postgres). This meant that the data was kept only
as long as the containers were running. If you stopped Breeze via
the `stop` command, the data was always deleted.
This changes the behaviour - each of the Breeze containers now has
a named volume where data is kept. Those volumes are still deleted
by default when Breeze is stopped, but you can choose to preserve
them by adding ``--preserve-volumes`` when you run the ``stop`` or
``restart`` command.
Fixes: #11625
Fixes random failures when processes are still running
on teardown of some webserver tests. We simply ignore that
after we send SIGKILL to those processes.
Fixes #11615
This approach is documented in https://docs.python.org/3.6/library/enum.html#others:
```
While IntEnum is part of the enum module, it would be very simple to
implement independently:

class IntEnum(int, Enum):
    pass
```
We just extend this to str -- this means SQLAlchemy has no trouble
putting these into queries, and `"scheduled" == DagRunType.SCHEDULED`
is true.
This change makes it simpler to use `dagrun.run_type`.
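A condensed sketch of the resulting pattern (the members shown are illustrative, not the full enum):
```
from enum import Enum


class DagRunType(str, Enum):
    """The str mixin makes members compare equal to plain strings."""

    SCHEDULED = "scheduled"
    MANUAL = "manual"


assert "scheduled" == DagRunType.SCHEDULED
assert DagRunType.SCHEDULED.value == "scheduled"
```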
This PR introduces a creating_job_id column in the DagRun table that links a
DagRun to the job that created it. Part of #11302
Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
Although these lists are short, there's no need to re-create them each
time, and also no need for them to be a method.
I have made them lowercase (`finished`, `running`) instead of uppercase
(`FINISHED`, `RUNNING`) to distinguish them from the actual states.
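A rough sketch of the shape described, with the state constants abridged (not the full State class):
```
class State:
    SUCCESS = "success"
    FAILED = "failed"
    RUNNING = "running"
    QUEUED = "queued"

    # Built once at class-definition time; lowercase names distinguish these
    # collections from the individual state constants above.
    finished = frozenset([SUCCESS, FAILED])
    running = frozenset([RUNNING, QUEUED])
```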
- The dag_run argument is only there for test mocks, and only to access a static method. Removing it simplifies the function and reduces confusion.
- Give optional arguments a default value, and reduce the indentation of the arg list to the PEP / Black standard.
- Clean up tests for readability
If the `kubernetes.client` import fails, then `airflow.kubernetes.pod_generator` also can't be imported, and there won't be attributes on `k8s` to use in `isinstance()` calls.
Instead of setting `k8s` to `None`, use an explicit flag so later code can disable kubernetes-specific branches explicitly.
Also, de-serializing a Kubernetes pod when the kubernetes library is not installed is now an error.
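A rough sketch of the guarded-import idea; the flag name is illustrative, not necessarily the one used in the serialization code.
```
try:
    from kubernetes.client import models as k8s

    HAS_KUBERNETES = True
except ImportError:
    k8s = None
    HAS_KUBERNETES = False


def looks_like_pod(obj):
    # Only touch attributes of k8s when the import actually succeeded;
    # checking an explicit flag is clearer than probing whether k8s is None.
    return HAS_KUBERNETES and isinstance(obj, k8s.V1Pod)
```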
This adds the prefix DAG: to newly created dag permissions. It supports checking permissions on both prefixed and un-prefixed DAG permission names.
This will make it easier to identify permissions that relate to granular dag access.
This PR does not modify existing dag permission names to use the new prefixed naming scheme. That will come in a separate PR.
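An illustrative helper (names hypothetical, not the security-manager API) showing the dual lookup over prefixed and un-prefixed names:
```
DAG_PREFIX = "DAG:"


def candidate_permission_names(dag_id: str):
    """Return both the new prefixed permission name and the legacy one."""
    if dag_id.startswith(DAG_PREFIX):
        return [dag_id, dag_id[len(DAG_PREFIX):]]
    return [f"{DAG_PREFIX}{dag_id}", dag_id]
```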
Related to issue #10469
* Add reset_dag_run option on dagrun_operator so that the user can clear the target dag run if it exists (see the sketch after this list).
* Logging coding style changes.
* Make pylint check pass.
* Make pylint check pass.
* Make pylint check pass on unit test file.
* Make static check pass.
* Use settings.STORE_SERIALIZED_DAGS
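A minimal usage sketch of the new option; the import path and surrounding arguments are assumed from the 2.0-era operator and should be treated as illustrative.
```
from airflow.operators.trigger_dagrun import TriggerDagRunOperator

trigger = TriggerDagRunOperator(
    task_id="trigger_target",
    trigger_dag_id="target_dag",
    # With reset_dag_run=True, an existing run of the target DAG for the same
    # execution date is cleared and re-run instead of raising an error.
    reset_dag_run=True,
)
```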
Co-authored-by: Kaz Ukigai <kukigai@apple.com>
Created a new Airflow tutorial to use Decorated Flows (a.k.a. functional
DAGs). Also created a DAG to perform the same operations without using
functional DAGs to be compatible with Airflow 1.10.x and to show the
difference.
* Apply suggestions from code review
It makes sense to simplify the return values being passed around rather than needlessly converting them to JSON and then converting back.
* Update tutorial_functional_etl_dag.py
Fixed data passing between tasks to be more natural, without converting to JSON and back to variables (see the sketch after this list).
* Updated dag options and task doc formatting
Based on feedback on the PR, updated the DAG options (including the schedule) and fixed the task documentation to avoid indentation issues.
* Added documentation file for functional dag tutorial
Added the tutorial documentation to the docs directory. Fixed linting errors in the example dags.
Tweaked some doc references in the example dags for inclusion into the tutorial documentation.
Added the example dags to example tests.
* Removed multiple_outputs from task defn
Had multiple_outputs=True defined in the Extract task definition, which was unnecessary; removed based on feedback.
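A minimal sketch, assuming the Airflow 2.0 TaskFlow decorators, of the natural data passing mentioned above; task names and values are illustrative, not the tutorial's actual code.
```
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule_interval=None, start_date=datetime(2021, 1, 1), catchup=False)
def tutorial_sketch():
    @task()
    def extract():
        # Return a plain Python dict; no manual JSON round-trip needed.
        return {"a": 1, "b": 2}

    @task()
    def load(totals: dict):
        print(sum(totals.values()))

    load(extract())


tutorial_sketch_dag = tutorial_sketch()
```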
Co-authored-by: Gerard Casas Saez <casassg@users.noreply.github.com>
Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>