A DAG will accept, in its default_args, a start_date given as a simple string such as "2019-06-01". If it detects
a string, it converts it to a richer type. However, it did not accept a similar string for
end_date; instead, an exception was thrown.
That's a very confusing user experience. end_date should be as permissive as start_date.
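The intended behavior can be sketched with a small stand-in for the coercion that default_args applies (coerce_date here is a hypothetical helper, not Airflow's real implementation):

```python
from datetime import datetime

def coerce_date(value):
    """Hypothetical helper mirroring the string-to-datetime coercion that
    default_args applies to start_date (and, after the fix, end_date)."""
    if isinstance(value, str):
        return datetime.strptime(value, "%Y-%m-%d")
    return value

default_args = {"start_date": "2019-06-01", "end_date": "2019-12-31"}
# Both keys are coerced the same way; end_date no longer raises.
coerced = {key: coerce_date(value) for key, value in default_args.items()}
```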
* [AIRFLOW-4781] Added the ability to specify ports in kubernetesOperator
  - added docstring
  - added type hints (Co-Authored-By: Fokko Driesprong <fokko@driesprong.frl>)
  - fixed docstrings and type hints
* In tests using the Bash operator repeatedly, env is populated with the contents of the environment.
* On subsequent runs, render_templates tries to render the contents of env.
* This produces unpredictable behavior: a missing-template error may be thrown, or env paths may be replaced with "template file" contents.
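The failure mode can be illustrated with a simplified, hypothetical render step that treats any string ending in a template extension as a template file to load from disk:

```python
# Hypothetical sketch: env now holds environment-derived values, and a later
# render pass mistakes plain strings for template *files* on disk.
env = {"PATH": "/usr/bin:/bin", "SCRIPT": "run.sh"}

def render(value):
    # Simplified stand-in for template-extension handling.
    if isinstance(value, str) and value.endswith(".sh"):
        raise FileNotFoundError(f"template file {value!r} not found")
    return value

error = None
try:
    rendered = {key: render(value) for key, value in env.items()}
except FileNotFoundError as exc:
    error = str(exc)  # the unpredictable missing-template error
```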
* [AIRFLOW-3211] Reattach to GCP Dataproc jobs upon Airflow restart
This change allows Airflow to reattach to existing Dataproc jobs upon
scheduler restart, preventing duplicate job submissions. Previously,
if the Airflow scheduler restarts while it's running a job on GCP
Dataproc, it'll lose track of that job, mark the task as failed, and
eventually retry. However, the job may still be running on Dataproc
and may even finish successfully. So when Airflow retries and reruns
the job, the same job runs twice. This can result in issues like
delayed workflows, increased costs, and duplicate data.
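The reattach idea can be sketched as follows, with hypothetical names (DataprocStub, run_or_reattach): derive a deterministic job id and look for an existing job under that id before submitting a new one.

```python
# Minimal sketch, not the real hook API: a deterministic job id lets a
# restarted scheduler find and reattach to the job it already submitted.
class DataprocStub:
    def __init__(self):
        self.submitted = []

    def find_job(self, job_id):
        return job_id if job_id in self.submitted else None

    def submit_job(self, job_id):
        self.submitted.append(job_id)
        return job_id

def run_or_reattach(client, job_id):
    existing = client.find_job(job_id)
    if existing is not None:
        return existing              # reattach: the job survived the restart
    return client.submit_job(job_id)

client = DataprocStub()
run_or_reattach(client, "dag1-task1-20190601")  # first run submits
run_or_reattach(client, "dag1-task1-20190601")  # after restart: reattaches
```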
* [AIRFLOW-3211] Fixed flake8 formatting
* Update test with new GCP_PROJECT convention
* More flake8 cleanups
* Viewing the Gantt chart of a running DagRun with a task that failed initially but was cleared would result in a JSON encoding error.
* When you clear a TI, only its state is nulled out, not its start_date. So these cleared TIs would still be added to `tis`, and thus to the Gantt chart, but would have no state, and JSON conversion does not accept None for state.
* We should check state in addition to start_date to handle this case.
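The guard can be sketched with plain dicts standing in for task instances: filtering on start_date alone lets a cleared TI (start_date set, state None) reach the JSON encoder, so both fields are checked.

```python
# Illustrative data, not real TI objects: "b" was cleared (state nulled,
# start_date kept), "c" never ran.
tis = [
    {"task_id": "a", "start_date": "2019-06-01", "state": "success"},
    {"task_id": "b", "start_date": "2019-06-01", "state": None},
    {"task_id": "c", "start_date": None, "state": None},
]

# Check state in addition to start_date before building the Gantt payload.
gantt_tis = [ti for ti in tis if ti["start_date"] and ti["state"]]
```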
* Two new variables are added to the template context: prev_execution_date_success and prev_start_date_success.
* These return the execution / start dates for the same task in the prior successful DAG run, without regard to TI status.
* Lazy evaluation is employed so that the query to look up prev_ti is not executed unnecessarily.
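The lazy-evaluation idea can be sketched with a hypothetical wrapper (Lazy, lookup_prev_ti are illustrative names): the lookup runs only if the value is actually accessed, and at most once.

```python
# Minimal sketch of deferring an expensive lookup until first access.
class Lazy:
    def __init__(self, fetch):
        self._fetch = fetch
        self._value = None
        self._done = False

    def get(self):
        if not self._done:
            self._value = self._fetch()  # the (expensive) query runs here
            self._done = True
        return self._value

queries = []
def lookup_prev_ti():
    queries.append("SELECT ...")  # stand-in for the prev_ti database query
    return "2019-06-01T00:00:00"

prev_execution_date_success = Lazy(lookup_prev_ti)
first = prev_execution_date_success.get()
second = prev_execution_date_success.get()  # cached, no second query
```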
Setting a `dagrun` to success or failure calls set_state for each task in
the dag, running multiple database queries for each one. We can reduce
the number of queries, and improve performance for the associated
endpoints, by setting the states of all relevant tasks in the same query.
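The single-query approach can be sketched with an in-memory SQLite table (schema and names simplified, not Airflow's real models): one UPDATE over all relevant task ids replaces a per-task round trip.

```python
import sqlite3

# One UPDATE ... WHERE task_id IN (...) instead of one query per task.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task_instance (task_id TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO task_instance VALUES (?, ?)",
    [("t1", "running"), ("t2", "queued"), ("t3", "up_for_retry")],
)

task_ids = ["t1", "t2", "t3"]
placeholders = ", ".join("?" for _ in task_ids)
conn.execute(
    f"UPDATE task_instance SET state = ? WHERE task_id IN ({placeholders})",
    ["success", *task_ids],
)
states = [row[0] for row in conn.execute("SELECT state FROM task_instance")]
```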
google-cloud-storage 1.16 introduced a breaking change where the
signature of client.get_bucket changed from (bucket_name) to
(bucket_or_name). Calls that pass the old name as a keyword argument now fail.
This commit makes all calls positional to work around this.
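The workaround can be shown with a stand-in for the client method (get_bucket below is a stub, not the real google.cloud.storage client): a positional call works against both the old and the new parameter name, while the old keyword breaks.

```python
# Stub with the 1.16+ signature: the parameter was renamed
# bucket_name -> bucket_or_name.
def get_bucket(bucket_or_name):
    return f"<bucket {bucket_or_name}>"

bucket = get_bucket("my-bucket")  # positional: safe on old and new versions

keyword_failed = False
try:
    get_bucket(bucket_name="my-bucket")  # old keyword: TypeError on 1.16+
except TypeError:
    keyword_failed = True
```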
This commit adds a .github/SECURITY.md file that defines the
contents of the "Policy" tab in the new "Security" section of
the GitHub interface.
Currently the Policy tab obtains its content from the
docs/security.rst file, which contains technical, non-policy-related
information. This commit retains the
"Reporting Vulnerabilities" section of docs/security.rst, which
is relevant, and strips the extraneous content.
Allow the user to specify the temporary directory to use on the host machine;
the default settings cause an error on OS X because the standard
temporary directory is not shared with Docker.
Based on PR #2418 by benjamin@techcitylabs.com. Closes #2418, closes #4315.
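The change can be sketched with a stub in place of the real operator (DockerTaskStub and its parameter names are illustrative): the host-side path used for the volume bind becomes a parameter instead of a hard-coded default, so macOS users can pick a directory Docker actually shares.

```python
import tempfile

# Minimal sketch: host_tmp_dir is configurable, tmp_dir is the
# in-container path; together they form the volume binding.
class DockerTaskStub:
    def __init__(self, tmp_dir="/tmp/airflow", host_tmp_dir=None):
        self.tmp_dir = tmp_dir
        self.host_tmp_dir = host_tmp_dir or tempfile.gettempdir()

    def volume_bind(self):
        return f"{self.host_tmp_dir}:{self.tmp_dir}"

# On OS X, point the host side at a directory shared with Docker.
task = DockerTaskStub(host_tmp_dir="/tmp")
```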