Added an initial FAQ question and answer to the Upgrading to 2 doc
about needing providers to be installed before connection types
show up in the UI.
(cherry picked from commit 8e0db6eae3)
* pass image_pull_policy to V1Container
image_pull_policy was not being passed to the V1Container in
KubernetesPodOperator. This commit fixes that.
* add test for image_pull_policy not set
image_pull_policy should default to IfNotPresent if
it's not set. The test ensures the correct value is passed
to the V1Container object.
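A minimal sketch of the intended behavior, assuming a simplified helper rather than the actual KubernetesPodOperator internals:
```
from kubernetes.client import models as k8s

def build_base_container(image, image_pull_policy=None):
    # Hypothetical helper: forward image_pull_policy to the V1Container
    # instead of dropping it, defaulting to "IfNotPresent" when unset.
    return k8s.V1Container(
        name="base",
        image=image,
        image_pull_policy=image_pull_policy or "IfNotPresent",
    )

container = build_base_container("python:3.8-slim")
assert container.image_pull_policy == "IfNotPresent"
```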
(cherry picked from commit 7a560ab6de)
* Updated TaskFlow API doc to show dependency with sensor
Updated the TaskFlow API tutorial document to show how to set up a
dependency from a classic FileSensor task to a Python-based
decorated task.
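A hedged sketch of the pattern added to the tutorial (DAG id, file path, and task names here are illustrative):
```
from airflow.decorators import dag, task
from airflow.sensors.filesystem import FileSensor
from airflow.utils.dates import days_ago

@dag(schedule_interval=None, start_date=days_ago(1))
def file_then_process():
    # Classic sensor task waiting for a file to land
    wait_for_file = FileSensor(task_id="wait_for_file", filepath="/tmp/order_data.csv")

    @task
    def process_file():
        print("file is ready, processing")

    # Make the decorated task downstream of the classic FileSensor
    order_data = process_file()
    wait_for_file >> order_data

example_dag = file_then_process()
```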
(cherry picked from commit df11a1d7dc)
In #13923, all permissions were removed from the Public role. This adds a test to ensure that the default Public role doesn't have any permissions.
related: #13923
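A hedged sketch of such a test (app construction here is illustrative; the real test uses the webserver test fixtures):
```
from airflow.www.app import create_app

def test_default_public_role_has_no_permissions():
    # After #13923, the default "Public" role should carry no permissions.
    app = create_app(testing=True)
    public_role = app.appbuilder.sm.find_role("Public")
    assert public_role.permissions == []
```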
(cherry picked from commit a52e77d0b4)
K8s pod names follow the DNS_SUBDOMAIN naming convention, which can be
broken down into one or more DNS_LABELs separated by `.`.
While the max length of a pod name (DNS_SUBDOMAIN) is 253, each label
component (DNS_LABEL) of the name cannot be longer than 63. Pod names
generated by the k8s executor currently contain only one label, which means
the total effective name length cannot be greater than 63.
This patch concatenates a uuid to pod_id using `.` to generate the pod name,
thus extending the max name length to 63 + len(uuid).
Reference: https://github.com/kubernetes/kubernetes/blob/release-1.1/docs/design/identifiers.md
Relevant discussion: https://github.com/kubernetes/kubernetes/issues/79351#issuecomment-505228196
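A minimal sketch of the naming scheme (the helper name is illustrative):
```
import uuid

MAX_LABEL_LEN = 63  # per-label (DNS_LABEL) limit within a DNS_SUBDOMAIN name

def make_pod_name(pod_id: str) -> str:
    # Keep pod_id as its own DNS_LABEL (<= 63 chars) and append the uuid
    # as a second label separated by ".", so the total name length can
    # exceed 63 while each individual label stays within the limit.
    safe_pod_id = pod_id[:MAX_LABEL_LEN].rstrip("-.")
    return f"{safe_pod_id}.{uuid.uuid4().hex}"

print(make_pod_name("my-rather-long-dag-id-and-task-id-combination"))
```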
(cherry picked from commit 862443f6d3)
Resolves Issue #10186 (Move Tips & Tricks for Oracle shops should be moved to Airflow Docs)
Fixes a broken link, adds a link to the UI connection documentation, and adds connection tips.
(cherry picked from commit 74da0faa7b)
By default SQLAlchemy passes query parameters as-is to the DB dialect drivers
for query execution. This causes inconsistent evaluation of query
parameters across different DB drivers. For example, MySQLdb will
convert `DagRunType.SCHEDULED` to the string `'DagRunType.SCHEDULED'`
instead of `'scheduled'`.
See https://github.com/apache/airflow/pull/11621 for the relevant
discussion.
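A small illustration of why the raw enum member is ambiguous (toy reproduction, not the actual query code):
```
from enum import Enum

class DagRunType(str, Enum):
    SCHEDULED = "scheduled"

# What a driver that stringifies the parameter would bind:
print(str(DagRunType.SCHEDULED))    # 'DagRunType.SCHEDULED'
# The value we actually want bound into the query:
print(DagRunType.SCHEDULED.value)   # 'scheduled'
```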
(cherry picked from commit 53e8283871)
closes https://github.com/apache/airflow/issues/13667
The following error happens when a Serialized DAG exists in the Webserver or Scheduler but has just been removed from the serialized_dag table,
mainly because the DAG file was removed.
```
Traceback (most recent call last):
File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1275, in _execute
self._run_scheduler_loop()
File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1377, in _run_scheduler_loop
num_queued_tis = self._do_scheduling(session)
File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1516, in _do_scheduling
self._schedule_dag_run(dag_run, active_runs_by_dag_id.get(dag_run.dag_id, set()), session)
File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1629, in _schedule_dag_run
dag = dag_run.dag = self.dagbag.get_dag(dag_run.dag_id, session=session)
File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/airflow/utils/session.py", line 62, in wrapper
return func(*args, **kwargs)
File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/airflow/models/dagbag.py", line 187, in get_dag
if sd_last_updated_datetime > self.dags_last_fetched[dag_id]
```
A simple fix is to check whether `sd_last_updated_datetime` is `None`, i.e. whether the Serialized DAG for that dag_id still exists.
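A hedged sketch of the guard, written as a standalone helper that paraphrases the DagBag.get_dag logic rather than quoting the exact diff:
```
from airflow.models.serialized_dag import SerializedDagModel

def refresh_dag_if_stale(dagbag, dag_id, session):
    # If the serialized row is gone (e.g. the DAG file was removed),
    # drop the cached DAG instead of comparing None against a datetime.
    sd_last_updated = SerializedDagModel.get_last_updated_datetime(
        dag_id=dag_id, session=session
    )
    if sd_last_updated is None:
        dagbag.dags.pop(dag_id, None)
        dagbag.dags_last_fetched.pop(dag_id, None)
        return None
    if sd_last_updated > dagbag.dags_last_fetched[dag_id]:
        dagbag._add_dag_from_db(dag_id=dag_id, session=session)
    return dagbag.dags.get(dag_id)
```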
(cherry picked from commit 8958d125cd)
When running `airflow dags unpause` with a DAG that does not exist, it
currently shows this error
```
root@6f086ba87198:/opt/airflow# airflow dags unpause example_bash_operatoredd
Traceback (most recent call last):
File "/usr/local/bin/airflow", line 33, in <module>
sys.exit(load_entry_point('apache-airflow', 'console_scripts', 'airflow')())
File "/opt/airflow/airflow/__main__.py", line 40, in main
args.func(args)
File "/opt/airflow/airflow/cli/cli_parser.py", line 48, in command
return func(*args, **kwargs)
File "/opt/airflow/airflow/utils/cli.py", line 92, in wrapper
return f(*args, **kwargs)
File "/opt/airflow/airflow/cli/commands/dag_command.py", line 160, in dag_unpause
set_is_paused(False, args)
File "/opt/airflow/airflow/cli/commands/dag_command.py", line 170, in set_is_paused
dag.set_is_paused(is_paused=is_paused)
AttributeError: 'NoneType' object has no attribute 'set_is_paused'
```
This commit changes it to show a helpful error message instead:
```
root@6f086ba87198:/opt/airflow# airflow dags unpause example_bash_operatoredd
DAG: example_bash_operatoredd does not exist in 'dag' table
```
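A hedged sketch of the check (approximate, not the exact dag_command.py diff):
```
from airflow.models import DagModel

def set_is_paused(is_paused: bool, args) -> None:
    dag = DagModel.get_dagmodel(args.dag_id)
    if dag is None:
        # Fail with a readable message instead of an AttributeError on None
        raise SystemExit(f"DAG: {args.dag_id} does not exist in 'dag' table")
    dag.set_is_paused(is_paused=is_paused)
    print(f"Dag: {args.dag_id}, paused: {is_paused}")
```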
(cherry picked from commit 8723b1feb8)
closes https://github.com/apache/airflow/issues/13504
Currently, the DagFileProcessor parses the DAG files, writes them to the
`dag` table and then writes them to the `serialized_dag` table.
At the same time, the scheduler loop is constantly looking for the next
DAGs to process based on the `next_dagrun_create_after` column of the `dag`
table.
It might happen that as soon as the DagFileProcessor writes a DAG to the `dag`
table, the scheduling loop in the Scheduler picks up the DAG for processing.
However, because the DagFileProcessor has not yet written to the `serialized_dag` table,
the scheduler errors out with a "Serialized Dag not Found" error.
This mainly happens with dynamic DAGs, where the result of one DAG
creates multiple DAGs.
This commit changes the order of writing the DAG and the Serialized DAG, so that
a DAG is written to the `serialized_dag` table before it is written to the `dag` table.
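A hedged sketch of the reordered sync (the helper is illustrative of the idea, not the exact DagBag.sync_to_db change):
```
from airflow.models.dag import DAG
from airflow.models.serialized_dag import SerializedDagModel

def sync_dag_to_db(dag: DAG, session) -> None:
    # Write the serialized representation first, so that by the time the
    # scheduler sees the row in the `dag` table (via next_dagrun_create_after)
    # the matching `serialized_dag` row already exists.
    SerializedDagModel.write_dag(dag, session=session)
    DAG.bulk_write_to_db([dag], session=session)
```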
(cherry picked from commit b9eb51a0fb)
Some users were not aware that we are not releasing images from the
`stable` branch. This change clarifies the branching strategy used
and what users can expect from the reference images published on
DockerHub.
(cherry picked from commit 0e540ab28d)