* Use the first start date when running a rescheduled task; this also fixes
the duration. Use the actual start date to record reschedule requests.
* Simplify the gantt view code now that start date and duration are correct
in the `task_instance` table.
In cases where the success callback takes a variable amount of
time, it's possible for it to be interrupted by the heartbeat process.
This is because the heartbeat process looks for tasks that are no
longer in the "running" state but are still executing and reaps them.
This commit reverses the order of callback invocation and state
updating so that the "SUCCESS" state for the task isn't committed
to the database until after the success callback has finished.
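A minimal sketch of the reordering (the function names here are illustrative stand-ins, not Airflow's actual API):

```python
# Illustrative sketch: invoke the success callback first, then commit
# the terminal state, so the heartbeat reaper never sees a task that is
# SUCCESS in the database while its callback is still running.
events = []

def run_success_callback():
    # stand-in for a user-supplied on_success_callback
    events.append("callback")

def commit_state(state):
    # stand-in for persisting the task instance state to the database
    events.append("state:" + state)

def finish_task():
    # old order: commit_state("SUCCESS") first, callback second
    run_success_callback()   # new order: the callback completes first
    commit_state("SUCCESS")

finish_task()
print(events)  # ['callback', 'state:SUCCESS']
```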
`following_schedule` converts to naive time by using the
local time zone. In case of a DST transition, say 3AM -> 2AM
("summer time to winter time") we incorrectly re-applied
the timezone information which meant that a "CEST -> CEST"
could happen instead of a "CEST -> CET". This resulted
in infinite loops.
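The ambiguity can be reproduced with the standard library alone; this sketch (independent of Airflow's own code) shows the same wall-clock time mapping to two different offsets during the fall-back transition:

```python
from datetime import datetime
from zoneinfo import ZoneInfo

paris = ZoneInfo("Europe/Paris")
# On 2018-10-28 the clocks in Paris went back at 03:00 CEST, so the
# local time 02:30 occurs twice: first in CEST (+02:00), then in CET
# (+01:00). The `fold` attribute disambiguates the two occurrences.
ambiguous = datetime(2018, 10, 28, 2, 30, tzinfo=paris)
first = ambiguous.replace(fold=0)   # first occurrence: still CEST
second = ambiguous.replace(fold=1)  # second occurrence: already CET
print(first.utcoffset())   # 2:00:00
print(second.utcoffset())  # 1:00:00
```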
Full URL decoding is now performed when parsing the different
components of a connection URI. This makes it possible to configure
paths to sockets containing special characters (for example ":");
previously only '/' (%2f) was decoded, and only in the hostname.
Note that this is a potentially breaking change if someone uses
% in any of their AIRFLOW_CONN_ defined connections.
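A small stdlib illustration of the idea (the URI and credentials are made up; Airflow's actual parsing code differs in detail):

```python
from urllib.parse import unquote, urlsplit

# Hypothetical connection URI with percent-encoded '/' characters in
# both the password and the hostname (e.g. a socket path).
uri = "mysql://user:pass%2Fword@host%2Ftmp%2Fmysql.sock:3306/schema"
parts = urlsplit(uri)

# Decode every component, not just %2f in the hostname:
host = unquote(parts.hostname) if parts.hostname else None
password = unquote(parts.password) if parts.password else None
print(host)      # host/tmp/mysql.sock
print(password)  # pass/word
```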
Once the user has installed the Fernet package, the application
enforces setting a valid Fernet key. This change relaxes that
behavior: an empty Fernet key or the special `no encryption` phrase
is now accepted, and both cases are interpreted as meaning that no
encryption is desired.
I noticed that many of the DagBag tests operate on a specific DAG
only, and don't need to load the example or test dags. Not loading
the dags we don't need shaves about 10-20s off the test time.
A `shutdown` task is not considered to be `unfinished`, so a dag run
can deadlock when all `unfinished` downstream tasks are waiting on a
task that's in the `shutdown` state. Fix this by considering
`shutdown` to be `unfinished`, since it's not truly a terminal state.
Existing Airflow only changes the dag_run table's end_date value when
a user terminates a dag in the web UI. The end_date is not updated
when Airflow detects that a dag has finished and updates its state.
This commit adds an end_date update to DagRun's set_state function to
fix the problem mentioned above.
Tasks can have start_dates or end_dates set separately
from the DAG's. These need to be converted to UTC, otherwise
we cannot use them for calculating the next execution
date.
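A simplified stand-in for the conversion (Airflow's own timezone helpers differ; the `to_utc` helper and `default_tz` parameter are assumptions for illustration):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def to_utc(dt, default_tz):
    """Attach default_tz to naive datetimes, then convert to UTC."""
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=default_tz)
    return dt.astimezone(timezone.utc)

# A task-level start_date written as naive local time...
local_tz = ZoneInfo("Europe/Amsterdam")
start = to_utc(datetime(2018, 6, 1, 8, 0), local_tz)
print(start)  # 2018-06-01 06:00:00+00:00 (CEST is UTC+2 in June)
```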
A DAG can be imported as a .py script properly,
but the cron expression inside it, used as "schedule_interval", may be
invalid, e.g. "0 100 * * *".
This commit checks the validity of the cron expression in DAG
files (.py) and packaged DAG files (.zip), and shows the
exception messages in the web UI by adding these exceptions to the
"import_error" metadata.
- Dictionary creation rewritten with dictionary literals
- Python's default arguments are evaluated once, when the function is defined, not each time the function is called (unlike, say, Ruby). This means that if you use a mutable default argument and mutate it, you will have mutated that object for all future calls to the function as well.
- Set construction via function calls replaced by set literals where possible
- List construction likewise replaced by list literals
- Some static methods had not been declared static; mark them as such
- Remove redundant parentheses
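The mutable-default pitfall from the list above, as a runnable illustration (the function names are made up):

```python
def append_bad(item, acc=[]):      # default list is created only once
    acc.append(item)
    return acc

def append_good(item, acc=None):   # idiomatic fix: default to None
    if acc is None:
        acc = []                   # fresh list on every call
    acc.append(item)
    return acc

print(append_bad(1), append_bad(2))    # [1, 2] [1, 2] -- shared state!
print(append_good(1), append_good(2))  # [1] [2]
```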
Make sure you have checked _all_ steps below.
### JIRA
- [x] My PR addresses the following [Airflow JIRA]
(https://issues.apache.org/jira/browse/AIRFLOW/)
issues and references them in the PR title. For
example, "\[AIRFLOW-XXX\] My Airflow PR"
-
https://issues.apache.org/jira/browse/AIRFLOW-2267
- In case you are fixing a typo in the
documentation you can prepend your commit with
\[AIRFLOW-XXX\], code changes always need a JIRA
issue.
### Description
- [x] Here are some details about my PR, including
screenshots of any UI changes:
Provide DAG level access for airflow. The detailed
design can be found at
https://docs.google.com/document/d/1qs26lE9kAuCY0Qa0ga-80EQ7d7m4s-590lhjtMBjmxw/edit#
### Tests
- [x] My PR adds the following unit tests __OR__
does not need testing for this extremely good
reason:
Unit tests are added.
### Commits
- [x] My commits all reference JIRA issues in
their subject lines, and I have squashed multiple
commits if they address the same issue. In
addition, my commits follow the guidelines from
"[How to write a good git commit
message](http://chris.beams.io/posts/git-
commit/)":
1. Subject is separated from body by a blank line
2. Subject is limited to 50 characters
3. Subject does not end with a period
4. Subject uses the imperative mood ("add", not
"adding")
5. Body wraps at 72 characters
6. Body explains "what" and "why", not "how"
- [x] Passes `git diff upstream/master -u --
"*.py" | flake8 --diff`
Closes #3197 from feng-tao/airflow-2267
When Airflow was populating a DagBag from a .zip
file, if a single
file in the root directory did not contain the
strings 'airflow' and
'DAG' it would ignore the entire .zip file.
Also added a small amount of logging so the user is
not bombarded with info
about skipping their .py files.
Closes #3505 from Noremac201/dag_name
Fix to provide a proper TI context when calling ti.handle_failure during
kill_zombies: without the context, handle_failure is of no use and is
equivalent to marking those TIs as failed directly.
This patch had conflicts when merged, resolved by
Committer: Ash Berlin-Taylor
<ash_github@firemirror.com>
Closes #1796 from msumit/AIRFLOW-437-2
Make sure you have checked _all_ steps below.
### JIRA
- [x] My PR addresses the following [Airflow JIRA]
(https://issues.apache.org/jira/browse/AIRFLOW/)
issues and references them in the PR title. For
example, "\[AIRFLOW-XXX\] My Airflow PR"
-
https://issues.apache.org/jira/browse/AIRFLOW-2526
- In case you are fixing a typo in the
documentation you can prepend your commit with
\[AIRFLOW-XXX\], code changes always need a JIRA
issue.
### Description
- [x] Here are some details about my PR, including
screenshots of any UI changes:
params can be overridden by the dictionary passed
through `airflow backfill -c`
```
templated_command = """
    echo "text = {{ params.text }}"
"""
bash_operator = BashOperator(
    task_id='bash_task',
    bash_command=templated_command,
    dag=dag,
    params={"text": "normal processing"},
)
```
In daily processing it prints:
```
normal processing
```
In backfill processing `airflow trigger_dag -c
'{"text": "override success"}'`, it prints
```
override success
```
### Tests
- [ ] My PR adds the following unit tests __OR__
does not need testing for this extremely good
reason:
### Commits
- [x] My commits all reference JIRA issues in
their subject lines, and I have squashed multiple
commits if they address the same issue. In
addition, my commits follow the guidelines from
"[How to write a good git commit
message](http://chris.beams.io/posts/git-
commit/)":
1. Subject is separated from body by a blank line
2. Subject is limited to 50 characters
3. Subject does not end with a period
4. Subject uses the imperative mood ("add", not
"adding")
5. Body wraps at 72 characters
6. Body explains "what" and "why", not "how"
### Documentation
- [x] In case of new functionality, my PR adds
documentation that describes how to use it.
- When adding new operators/hooks/sensors, the
autoclass documentation generation needs to be
added.
### Code Quality
- [x] Passes `git diff upstream/master -u --
"*.py" | flake8 --diff`
Closes #3422 from milton0825/params-overridden-through-cli
Currently, if you have an operator with a template_fields argument
that is a dictionary, e.g.:
template_fields = ([dict_args])
And you populate that dictionary with a field that
is an integer in a DAG, e.g.:
...
dict_args = {'ds': '{{ ds }}', 'num_times': 5}
...
Then airflow will give you the following error:
{base_task_runner.py:95} INFO - Subtask:
airflow.exceptions.AirflowException: Type '<type
'int'>' used for parameter 'dict_args[num_times]'
is not supported for templating
This fix resolves that issue by
immediately returning numbers without attempting
to template them.
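A hedged sketch of the approach (`str.format` stands in for Jinja rendering here, and `render` is an illustrative helper, not Airflow's actual code):

```python
def render(value, context):
    """Recursively template strings, leaving other leaf types as-is."""
    if isinstance(value, str):
        return value.format(**context)  # stand-in for Jinja rendering
    if isinstance(value, dict):
        return {k: render(v, context) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return type(value)(render(v, context) for v in value)
    # Numbers (and any other plain type) pass through untouched
    # instead of raising "is not supported for templating".
    return value

dict_args = {"ds": "{ds}", "num_times": 5}
print(render(dict_args, {"ds": "2018-01-01"}))
# {'ds': '2018-01-01', 'num_times': 5}
```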
Closes #3410 from ArgentFalcon/support_numeric_template_fields
A bug existed when default_args contained
start_date but it was set to None, which caused DAG
instantiation to fail.
Closes #3256 from bolkedebruin/AIRFLOW-2351
When a task instance exists in the database but
its corresponding task
no longer exists in the DAG, the scheduler marks
the task instance as
REMOVED. Once removed, task instances stayed
removed forever, even if
the task were to be added back to the DAG.
This change allows for the restoration of REMOVED
task instances. If a
task instance is in state REMOVED but the
corresponding task is present
in the DAG, restore the task instance by setting
its state to NONE.
A new unit test simulates the removal and
restoration of a task from a
DAG and verifies that the task instance is
restored:
`./run_unit_tests.sh tests.models:DagRunTest`
JIRA: https://issues.apache.org/jira/browse/AIRFLOW-1460
Closes #3137 from astahlman/airflow-1460-restore-tis
Moved from adding_task to when the dag is being bagged.
This changes DAG import runtime from polynomial to somewhat linear.
Closes #3116 from wongwill86:dag_import_speed