This approach is documented in https://docs.python.org/3.6/library/enum.html#others:
```
While IntEnum is part of the enum module, it would be very simple to
implement independently:

    class IntEnum(int, Enum):
        pass
```
We just extend this to `str` -- this means SQLAlchemy has no trouble
putting these into queries, and `"scheduled" == DagRunType.SCHEDULED`
is true.
This change makes it simpler to use `dagrun.run_type`.
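A minimal sketch of the resulting pattern (the MANUAL member and the __str__ override here are illustrative):
```
from enum import Enum

class DagRunType(str, Enum):
    SCHEDULED = "scheduled"
    MANUAL = "manual"

    def __str__(self) -> str:
        return self.value

# Because each member *is* a str, plain-string comparison and DB binding both work:
assert "scheduled" == DagRunType.SCHEDULED
assert isinstance(DagRunType.SCHEDULED, str)
```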
This PR introduces a creating_job_id column in the DagRun table that links a
DagRun to the job that created it. Part of #11302
Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
This adds the prefix DAG: to newly created DAG permissions. It supports checking permissions on both prefixed and un-prefixed DAG permission names.
This will make it easier to identify permissions related to granular DAG access.
This PR does not modify existing dag permission names to use the new prefixed naming scheme. That will come in a separate PR.
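As an illustration only (the helper name and the permission-set representation are assumptions, not the code in this PR), the dual check could look roughly like:
```
RESOURCE_PREFIX = "DAG:"

def can_access_dag(granted_resources: set, dag_id: str) -> bool:
    # Accept both the new prefixed name ("DAG:example_dag") and the
    # legacy un-prefixed name ("example_dag") until existing
    # permissions are migrated to the new scheme.
    return f"{RESOURCE_PREFIX}{dag_id}" in granted_resources or dag_id in granted_resources
```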
Related to issue #10469
* Fully support running more than one scheduler concurrently.
This PR implements scheduler HA as proposed in AIP-15. The high level
design is as follows:
- Move all scheduling decisions into SchedulerJob (requiring DAG
serialization in the scheduler)
- Use row-level locks to ensure schedulers don't stomp on each other
(`SELECT ... FOR UPDATE`)
- Use `SKIP LOCKED` for better performance when multiple schedulers are
running. (MySQL < 8 and MariaDB don't support this)
- Scheduling decisions are not tied to the parsing speed, but can
operate just on the database
*DagFileProcessorProcess*:
Previously this component was responsible for more than just parsing the
DAG files, despite what its name might imply. It was also responsible for
creating DagRuns and for making scheduling decisions about TIs, moving
them from "None" to "scheduled" state.
This commit changes it so that the DagFileProcessorProcess now will
update the SerializedDAG row for this DAG, and make no scheduling
decisions itself.
To make the scheduler's job easier (so that it can make as many
decisions as possible without having to load the possibly-large
SerializedDAG row) we store/update some columns on the DagModel table:
- `next_dagrun`: The execution_date of the next dag run that should be created (or
None)
- `next_dagrun_create_after`: The earliest point at which the next dag
run can be created
Pre-computing these values (and updating them every time the DAG is
parsed) reduces the overall load on the DB, as many decisions can be taken
by selecting just these two columns/the small DagModel row.
When max_active_runs is reached, or for `@once` DAGs, these columns will be
set to null, meaning "don't create any dag runs"
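A much-simplified sketch of that pre-computation (assuming a cron-style schedule handled by croniter; the real code also deals with timezones, catchup and max_active_runs):
```
from datetime import datetime
from croniter import croniter

def compute_next_dagrun(schedule, latest_execution_date: datetime):
    """Return (next_dagrun, next_dagrun_create_after), or (None, None)."""
    if schedule in (None, "@once"):
        return None, None
    it = croniter(schedule, latest_execution_date)
    next_dagrun = it.get_next(datetime)               # execution_date of the next run
    next_dagrun_create_after = it.get_next(datetime)  # end of that schedule interval
    return next_dagrun, next_dagrun_create_after
```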
*SchedulerJob*
The SchedulerJob used to queue/send tasks to the executor only after
they were parsed and returned from the DagFileProcessorProcess.
This PR breaks the link between parsing and enqueuing of tasks: instead
of looking at DAGs as they are parsed, we now:
- store a new datetime column, `last_scheduling_decision` on DagRun
table, signifying when a scheduler last examined a DagRun
- Each time around the loop the scheduler will get (and lock) the next
  _n_ DagRuns via `DagRun.next_dagruns_to_examine`, prioritising DagRuns
  which haven't been touched by a scheduler in the longest period (a
  sketch of this query follows below)
- SimpleTaskInstance etc have been almost entirely removed now, as we
use the serialized versions
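A sketch of the locking query mentioned above (not the actual `DagRun.next_dagruns_to_examine` implementation; the filter, ordering and session handling are simplified assumptions):
```
from sqlalchemy import nullsfirst

from airflow.models import DagRun
from airflow.utils.session import provide_session
from airflow.utils.state import State

@provide_session
def next_dagruns_to_examine(limit, session=None):
    return (
        session.query(DagRun)
        .filter(DagRun.state == State.RUNNING)
        # never-examined runs (NULL) first, then the longest-untouched
        .order_by(nullsfirst(DagRun.last_scheduling_decision))
        .limit(limit)
        # row-level lock; SKIP LOCKED lets other schedulers pass over locked rows
        .with_for_update(skip_locked=True)
        .all()
    )
```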
* Move callbacks execution from Scheduler loop to DagProcessorProcess
* Don’t run verify_integrity if the Serialized DAG hasn’t changed
dag_run.verify_integrity is slow, and we don't want to call it every time -- only when the DAG structure changes (which we can now know thanks to DAG Serialization)
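Roughly (a sketch, assuming the serialized DAG row exposes a content hash and the DagRun stores the last hash it verified against; names here are illustrative):
```
def maybe_verify_integrity(dag_run, latest_serialized_dag_hash, session):
    # Only pay the cost of verify_integrity when the DAG structure changed
    if dag_run.dag_hash != latest_serialized_dag_hash:
        dag_run.dag_hash = latest_serialized_dag_hash
        dag_run.verify_integrity(session=session)
```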
* Add escape hatch to disable newly added "SELECT ... FOR UPDATE" queries
We are worried that these extra uses of row-level locking will cause
problems on MySQL 5.x (most likely deadlocks) so we are providing users
an "escape hatch" to be able to make these queries non-locking -- this
means that only a single scheduler should be run, but being able to run
one is better than having the scheduler crash.
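A sketch of what the escape hatch could look like in airflow.cfg (the option name here is an assumption):
```
[scheduler]
# Fall back to non-locking queries; safe only when a single scheduler is running
use_row_level_locking = False
```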
Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
This can have *extremely* bad consequences. After this change, a jinja2
template like the one below will cause the task instance to fail, if the
DAG being executed is not a sub-DAG. This may also display an error on
the Rendered tab of the Task Instance page.
task_instance.xcom_pull('z', key='return_value', dag_id=dag.parent_dag.dag_id)
Prior to the change in this commit, the above template would pull the
latest value for task_id 'z', for the given execution_date, from *any DAG*.
If your task_ids between DAGs are all unique, or if DAGs using the same
task_id always have different execution_date values, this will appear to
act like dag_id=None.
Our current theory is that SQLAlchemy/Python doesn't behave as expected
when comparing `jinja2.Undefined` to `None`.
The __lshift__ and __rshift__ methods should return `other`, not `self`.
This PR fixes XComArg implementation to support chain like this one:
BaseOperator >> XComArg >> BaseOperator
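A stripped-down illustration (not the real BaseOperator/XComArg code) of why returning `other` is what makes `a >> b >> c` chain correctly:
```
class Node:
    def __init__(self, name):
        self.name = name
        self.downstream = []

    def __rshift__(self, other):
        self.downstream.append(other)
        return other   # returning ``other`` lets the next ``>>`` act on it

    def __lshift__(self, other):
        other.downstream.append(self)
        return other

a, b, c = Node("a"), Node("b"), Node("c")
a >> b >> c                   # evaluated as (a >> b) >> c
assert b.downstream == [c]    # only true because __rshift__ returned ``other``
```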
Related to: #10153
The `@provide_session` wrapper will already commit the transaction when
returned, unless an explicit session is passed in -- removing this
parameter changes the behaviour to be:
- If session explicitly passed in: don't commit (caller's
responsibility)
- If no session passed in, `@provide_session` will commit for us already.
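In other words (a sketch with an illustrative function, assuming the Airflow 2.x import path for the decorator):
```
from airflow.utils.session import provide_session

@provide_session
def set_duration(task_instance, duration, session=None):
    task_instance.duration = duration
    session.merge(task_instance)
    # No explicit session.commit() here:
    #  - called without a session, @provide_session commits on return
    #  - called with an explicit session, the caller owns the transaction
```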
We have already fixed a lot of problems that were marked
with those; also, IntelliJ has gotten a bit smarter about not
detecting false positives, as well as understanding more
pylint annotations. Wherever the problem remained
we replaced it with # noqa comments - as these are
also well understood by IntelliJ.
This change will allow users to throw exceptions other than `DagCycleException` (namely `AirflowClusterPolicyViolation`) as part of Cluster Policies.
This can be helpful for running checks on tasks / DAGs (e.g. asserting that a task has a non-airflow owner) and refusing to run tasks that aren't compliant with these checks.
This is meant more as a tool for Airflow admins to prevent user mistakes (especially in shared Airflow infrastructure with newbies) than as a strong technical control for security/compliance posture.
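For example, a policy like the following in airflow_local_settings.py could reject tasks that keep the default owner (the task_policy hook name and the owner check are illustrative):
```
from airflow.exceptions import AirflowClusterPolicyViolation

def task_policy(task):
    # Fail DAG import for tasks that keep the default "airflow" owner
    if task.owner == "airflow":
        raise AirflowClusterPolicyViolation(
            f"Task {task.task_id} in DAG {task.dag_id} must set a non-default owner"
        )
```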
While doing a trigger_dag from the UI, the DagRun gets created first and then the WebServer starts creating TIs. Meanwhile, the Scheduler also picks up the DagRun and starts creating the TIs, which results in an IntegrityError as the primary key constraint gets violated. This happens when a DAG has a good number of tasks.
Also, this replaces the TIs list with a set for faster lookups in DAGs with many tasks.
Before this change, if DAG Serialization was enabled the Webserver would not update the DAGs once they were fetched from the DB. The default worker_refresh_interval was `30`, so whenever the gunicorn workers were restarted they used to pull the updated DAGs when needed.
This change will allow us to have a larger worker_refresh_interval (e.g. 30 mins or even 1 day)
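e.g. in airflow.cfg (assuming the value stays expressed in seconds):
```
[webserver]
# refresh gunicorn workers every 30 minutes instead of every 30 seconds
worker_refresh_interval = 1800
```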
We should not update the "last_updated" column unnecessarily. This is the first of a few optimizations to DAG Serialization that would also aid DAG Versioning.
PR https://github.com/apache/airflow/pull/9554 introduced this error, and because of a GitHub issue at the time (GitHub was down / had degraded performance) the CI didn't run fully.
It is slower to call e.g. dict() than to use the empty literal, because the name dict must be looked up in the global scope in case it has been rebound. The same goes for the other two types, list() and tuple().
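A quick way to see the difference with timeit (absolute numbers vary by machine, but the literal consistently wins):
```
import timeit

print(timeit.timeit("dict()", number=1_000_000))  # global name lookup + call
print(timeit.timeit("{}", number=1_000_000))      # single BUILD_MAP bytecode op
print(timeit.timeit("list()", number=1_000_000))
print(timeit.timeit("[]", number=1_000_000))
```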
The issue was caused because the `rendered_task_instance_fields` table did not have precision on its timestamp column, hence causing `_mysql_exceptions.IntegrityError`.
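A sketch of the kind of fix this implies (the column definition is illustrative, not the exact migration): give the MySQL timestamp fractional-second precision so rows created within the same second don't collide on the primary key:
```
import sqlalchemy as sa
from sqlalchemy.dialects import mysql

# fsp=6 keeps microseconds on MySQL instead of truncating to whole seconds
execution_date = sa.Column(
    sa.TIMESTAMP(timezone=True).with_variant(mysql.TIMESTAMP(fsp=6), "mysql"),
    primary_key=True,
)
```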
closes https://github.com/apache/airflow/issues/9148
* Resolve upstream tasks when template field is XComArg
closes: #8054
* fixup! Resolve upstream tasks when template field is XComArg
* Resolve task relations in DagRun and DagBag
* Add tests for serialized DAG
* Set dependencies only in bag_dag, refactor tests
* Traverse template_fields attribute
* Use provide_test_dag_bag in all tests
* fixup! Use provide_test_dag_bag in all tests
* Use metaclass + setattr
* Add prepare_for_execution method
* Check signature of __init__ not class
* Apply suggestions from code review
Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
* Update airflow/models/baseoperator.py
Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>