* Refactored get_email_address_list to give a cleaner
separation between string handling and other iterables.
* Explicitly cast the get_email_address_list argument
to a list when it is a non-string iterable. This
enables direct support for tuples, sets and the like
(see the sketch after this list).
* Fixed type annotation of email parameter of
BaseOperator to show that iterables are directly
supported.
* Added docstring entries for email, email_on_retry,
email_on_failure and queue in BaseOperator.
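A minimal sketch of how the helper might now separate the two cases; the delimiter handling shown here is an assumption, only the string-vs-iterable split mirrors the change described above:

```python
from typing import Iterable, List, Union


def get_email_address_list(addresses: Union[str, Iterable[str]]) -> List[str]:
    """Return a list of email addresses from a string or any iterable (sketch)."""
    if isinstance(addresses, str):
        # Assumed behaviour: split a comma- or semicolon-separated string.
        for delimiter in (",", ";"):
            if delimiter in addresses:
                return [address.strip() for address in addresses.split(delimiter)]
        return [addresses.strip()]
    # Any other iterable (tuple, set, generator, ...) is explicitly cast to a list.
    return list(addresses)
```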
Since we switched to using sub-processes to parse the DAG files sometime
back in 2016(!) the metrics we have been emitting about dag bag size and
parsing have been incorrect.
We have also been emitting metrics from the webserver, which will
become wrong as we move towards a stateless webserver.
To fix both of these issues I have stopped emitting the metrics from
models.DagBag and only emit them from inside the
DagFileProcessorManager.
(There was also a bug in the `dag.loading-duration.*` metric we were
emitting from the DagBag code, where the "dag_file" part of that metric
was empty. I have fixed that even though that metric is now deprecated.
The webserver was emitting the right metric though, so many people
wouldn't have noticed.)
1. Issue deprecation warnings properly for the old conf methods and remove remaining usages of them.
2. Unify conf usage as `from airflow.configuration import conf` (see the example after this list).
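For example, the unified style looks like this (the options read below are only illustrative):

```python
from airflow.configuration import conf

# Read settings through the shared `conf` object instead of the
# deprecated module-level helpers.
dags_folder = conf.get("core", "dags_folder")
parallelism = conf.getint("core", "parallelism")
```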
Using the mock.assert_called_with method can result in flaky tests
(e.g. when iterating through a dict in Python 3.5, which does not
preserve the order of elements). That's why it's better to use the
assert_called_once_with or assert_has_calls methods.
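A small illustration with the standard library mock API (the mocked call and its arguments are made up):

```python
from unittest import mock

m = mock.Mock()
m(task_id="t1")

# Exact assertions that check the full call history, not just the last call:
m.assert_called_once_with(task_id="t1")
m.assert_has_calls([mock.call(task_id="t1")])
```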
* AIRFLOW-5139 Allow custom ES configs
While attempting to create a self-signed TLS connection between Airflow
and ES, we discovered that Airflow does not allow users to modify the
SSL state of the ElasticsearchTaskHandler. This commit will allow users
to define ES settings in airflow.cfg.
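A rough sketch of the idea; the config section and option names below are assumptions, the point is only that SSL-related client settings come from airflow.cfg rather than being hard-coded in the handler:

```python
from airflow.configuration import conf
from elasticsearch import Elasticsearch

# Hypothetical config section/keys; the real names may differ.
es_kwargs = {
    "use_ssl": conf.getboolean("elasticsearch_configs", "use_ssl"),
    "verify_certs": conf.getboolean("elasticsearch_configs", "verify_certs"),
}
client = Elasticsearch(["https://es.example.com:9200"], **es_kwargs)
```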
Under heavy load (and exacerbated by having `--run-duration 600`) we
found that the multiprocessing.Manager processes could be left alive.
They would consume no CPU as they were just polling on a socket, but
they would consume memory.
Instead of trying to track down all the places we might have leaked a
process, I have just removed Managers from the scheduler entirely, and
rewritten the multiprocessing the way I would if I were writing this
with Go channels - passing objects/messages over a single channel, and
shutting down when done (so we don't need a "Done" signal, and we can
.poll() on the channel to see if there is anything to receive).
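A toy sketch of that channel style with a multiprocessing Pipe (names and messages are illustrative, not the scheduler's actual code):

```python
import multiprocessing


def worker(channel):
    # Send results over the single channel, then close it; the closed
    # connection doubles as the "done" signal.
    for path in ("dag_a.py", "dag_b.py"):
        channel.send(("parsed", path))
    channel.close()


if __name__ == "__main__":
    read_end, write_end = multiprocessing.Pipe(duplex=False)
    proc = multiprocessing.Process(target=worker, args=(write_end,))
    proc.start()
    write_end.close()  # only the worker holds the write end now

    while True:
        # .poll() tells us whether there is anything to receive without blocking.
        if read_end.poll(timeout=0.1):
            try:
                print(read_end.recv())
            except EOFError:
                break  # worker closed its end: nothing more to receive
    proc.join()
```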
Moved query to fetch zombies from DagFileProcessorManager to DagBag class. Changed query to only look for DAGs of the current DAG bag. The query now uses index ti_dag_state instead of ti_state. Removed no longer required zombies parameters from many function signatures.
The query is now executed on every call to DagBag.kill_zombies, which is called when the DAG file is processed, at a frequency that depends on scheduler_heartbeat_sec and processor_poll_interval (AFAIU). The query is faster than the previous one (see also stats below). Its cost is also negligible IMHO because during DAG file processing many other queries are executed (DAG runs and task instances are created, task instance dependencies are checked).
When using potentially larger offsets than JavaScript can handle, they can get parsed incorrectly on the client, resulting in the offset query getting stuck on a certain number. This patch ensures that we return a string to the client so it is not parsed as a number. When we run the query, we ensure the offset is set as an integer.
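Roughly, the round-trip looks like this (a sketch, not the actual view/query code):

```python
import json


def build_response(rows, next_offset):
    # Serialise the offset as a string so a JavaScript client never
    # parses it into a lossy 64-bit float.
    return json.dumps({"rows": rows, "offset": str(next_offset)})


def parse_offset(offset_param):
    # Cast back to an integer before using it in the database query.
    return int(offset_param)
```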
Add unnecessary prefix_ in config for Elasticsearch section
It was possible to end up with invalid JS-in-script if you created
certain structures. This fixes it.
`|tojson|safe` is the method recommended by Flask for dealing with this
sort of case: http://flask.pocoo.org/docs/1.0/templating/#standard-filters
The json_ser function is not used anymore (Flask needs an encoder class)
so it and the tests have been removed. The AirflowJsonEncoder behaves
the same on dates, and has extra behaviour too.
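For reference, a minimal self-contained example of the recommended `|tojson|safe` pattern (template and data are made up):

```python
from flask import Flask, render_template_string

app = Flask(__name__)
TEMPLATE = "<script>var taskData = {{ data|tojson|safe }};</script>"

with app.app_context():
    # |tojson escapes the value safely for use inside <script>, and
    # |safe stops Jinja from HTML-escaping the JSON a second time.
    print(render_template_string(TEMPLATE, data={"state": "</script><script>alert(1)"}))
```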
In the master branch, we have already decommissioned the Flask-Admin UI.
In model definitions, User and Chart are only applicable for the
"old" UI based on Flask-Admin.
Hence we should decommission these two models as well.
Related docs are updated in this commit as well.
The different UtcDateTime implementations all have issues.
Either they replace tzinfo directly without converting
or they do not convert to UTC at all.
We also ensure all MySQL connections are in UTC
in order to keep things sane, as MySQL will ignore the
timezone of a field when inserting/updating.
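A sketch of the kind of type decorator this implies (simplified; the real implementation may differ in details):

```python
import datetime

from sqlalchemy.types import DateTime, TypeDecorator


class UtcDateTime(TypeDecorator):
    """Store timezone-aware datetimes, always converted to UTC (sketch)."""

    impl = DateTime(timezone=True)

    def process_bind_param(self, value, dialect):
        if value is None:
            return value
        if value.tzinfo is None:
            raise ValueError("naive datetime is not allowed")
        # Convert to UTC instead of just replacing tzinfo.
        return value.astimezone(datetime.timezone.utc)
```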
Make sure you have checked _all_ steps below.
### JIRA
- [x] My PR addresses the following [Airflow JIRA](https://issues.apache.org/jira/browse/AIRFLOW/)
issues and references them in the PR title. For
example, "\[AIRFLOW-XXX\] My Airflow PR"
  - https://issues.apache.org/jira/browse/AIRFLOW-2267
- In case you are fixing a typo in the
documentation you can prepend your commit with
\[AIRFLOW-XXX\], code changes always need a JIRA
issue.
### Description
- [x] Here are some details about my PR, including
screenshots of any UI changes:
Provide DAG level access for airflow. The detailed
design can be found at
https://docs.google.com/document/d/1qs26lE9kAuCY0Qa0ga-80EQ7d7m4s-590lhjtMBjmxw/edit#
### Tests
- [x] My PR adds the following unit tests __OR__
does not need testing for this extremely good
reason:
Unit tests are added.
### Commits
- [x] My commits all reference JIRA issues in
their subject lines, and I have squashed multiple
commits if they address the same issue. In
addition, my commits follow the guidelines from
"[How to write a good git commit
message](http://chris.beams.io/posts/git-commit/)":
1. Subject is separated from body by a blank line
2. Subject is limited to 50 characters
3. Subject does not end with a period
4. Subject uses the imperative mood ("add", not
"adding")
5. Body wraps at 72 characters
6. Body explains "what" and "why", not "how"
- [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
Closes #3197 from feng-tao/airflow-2267
* Add test that verifies that database schema and SQLAlchemy model are in sync
* Add exception for users.password that doesn't exist in model and tables created by other tests
* Add migration script to merge the two heads
* Add migration script to fix not-null constraints for MySQL that were lost by 0e2a74e0fc9f_add_time_zone_awareness
* Add migration script to fix FK constraint for existing SQLite DBs
* Enable ForeignKey support for SQLite, otherwise 2e82aab8ef20_rename_user_table won't change FK in chart and known_event tables
on_kill methods were not triggered, due to processes not being
properly terminated. This was because the runners use a shell
which is then replaced by the child pid, which is unknown to
Airflow.
Closes #3204 from bolkedebruin/AIRFLOW-1623
Cache inspect.signature for the wrapper closure to avoid calling it at
every decorated invocation. A separate sig_cache is created per
decoration, i.e. each function decorated using apply_defaults will have
a different sig_cache.
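In outline, the caching looks something like this (a simplified sketch, not the full apply_defaults implementation):

```python
import functools
import inspect


def apply_defaults(func):
    # Computed once per decorated function, not on every invocation.
    sig_cache = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig_cache.bind(*args, **kwargs)
        bound.apply_defaults()
        return func(*bound.args, **bound.kwargs)

    return wrapper
```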
This allows hostnames to be overridden to
facilitate service discovery requirements in
common production deployments.
Closes #3036 from thekashifmalik/hostnames
In a previous change we removed the airflow.task.raw handler (which
printed to stdout directly) and replaced it with one that wrote to the
log file itself. The problem is that Python automatically calls
`logging.shutdown()` itself on clean process exit. This ended up
uploading the log file twice: once from the end of `airflow run --raw`,
and then again from the explicit shutdown() call at the end of cli's
`run()`.
Since logging is automatically shut down, this change adds an explicit
flag to control whether the GCS and S3 handlers should upload the file
or not, and we tell them not to when running with `--raw`.
Closes #2880 from ashb/AIRFLOW-1916-dont-upload-logs-twice
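Conceptually the handlers gain a flag along these lines (a sketch; attribute and method names are assumptions):

```python
import logging


class RemoteTaskHandler(logging.FileHandler):
    """Sketch: uploads its log file on close unless told not to."""

    def __init__(self, filename, upload_on_close=True):
        super().__init__(filename)
        self.upload_on_close = upload_on_close
        self._uploaded = False

    def close(self):
        super().close()
        if self.upload_on_close and not self._uploaded:
            self._uploaded = True
            self.upload()  # push the local file to remote storage here

    def upload(self):
        pass
```

For `airflow run --raw` the flag would be set to False so that the automatic `logging.shutdown()` does not upload a second copy.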
Due to the change in AIRFLOW-1873 we inadvertently changed the
behaviour such that task logs for a try wouldn't show up in the UI
until after the task run had completed.
Closes #2859 from ashb/AIRFLOW-1897-view-logs-for-running-instance
Rather than having try_number+1 in various places, try_number will now
automatically contain the right value for when the TI will next be run,
and handle the case where try_number is accessed when the task is
currently running.
This showed up as a bug where the logs from running operators would
show up in the next log file (2.log for the first try).
Closes #2832 from ashb/AIRFLOW-1873-task-operator-log-try-number
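The resulting behaviour is roughly this property (a simplified, self-contained sketch of the idea, not the TaskInstance code itself):

```python
class TaskInstanceSketch:
    def __init__(self):
        self._try_number = 0
        self.state = None

    @property
    def try_number(self):
        # While the task is running, the stored value already is the
        # current try; otherwise report the try that would run next.
        if self.state == "running":
            return self._try_number
        return self._try_number + 1
```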
Previously setting the context was not propagated to the parent
loggers. Unfortunately, in the case of a logger that is not explicitly
defined, the returned logger is shallow, i.e. it does not have handlers
defined. So to set the context it is required to walk the tree.
Closes #2831 from bolkedebruin/fix_logging
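Setting the context therefore walks up the logger hierarchy, roughly like this (a sketch; handlers that don't know about task context are simply skipped):

```python
def set_context(logger, value):
    """Call set_context on every capable handler up the logger tree (sketch)."""
    while logger:
        for handler in logger.handlers:
            try:
                handler.set_context(value)
            except AttributeError:
                pass  # plain handlers have no set_context
        logger = logger.parent if logger.propagate else None
```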
Converting to naive time is required in order to make sure
crons run at exact times.
E.g. if you specify to run at 8:00pm every day, you do not
want it suddenly to run at 7:00pm due to DST.
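For example, with croniter the schedule arithmetic happens on naive wall-clock time and the result is localised afterwards (a sketch with the standard library and croniter; Airflow itself uses pendulum):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

from croniter import croniter

tz = ZoneInfo("Europe/Amsterdam")
aware = datetime(2019, 3, 30, 20, 0, tzinfo=tz)  # the evening before a DST switch

# Do the cron arithmetic on naive wall-clock time...
naive = aware.replace(tzinfo=None)
next_naive = croniter("0 20 * * *", naive).get_next(datetime)

# ...then attach the timezone again: still 20:00 local despite the DST jump,
# whereas doing the arithmetic in UTC would have shifted it by an hour.
next_run = next_naive.replace(tzinfo=tz)
print(next_run)  # 2019-03-31 20:00:00+02:00
```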
The change from boto2 to boto3 in S3Hook caused this to break (the
return type of `hook.get_key()` changed). There's a better method
designed for that which we should use anyway.
This wasn't caught by the tests as the mocks weren't updated. Rather
than mocking the return of the hook I have changed it to use "moto"
(already in use elsewhere in the tests) to mock at the S3 layer, not
our hook.
Closes #2773 from ashb/AIRFLOW-1756-s3-logging-boto3-fix
Fixes two bugs in the Task Instances view: batch clear is not working
for task instances in the RUNNING state, and all batch operations
cannot work when manually triggered task instances are selected,
because they have a different execution date format.
Closes #2759 from yrqls21/fix-ti-batch-clear-n-set-state-bugs
The new logging framework was not properly capturing stdout/stderr
output. Redirection to the correct logging facility is required.
Closes #2745 from bolkedebruin/redirect_std
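In standard-library terms the redirection boils down to something like this (a simplified sketch, not the exact Airflow helper):

```python
import contextlib
import logging


class StreamToLogger:
    """File-like object that forwards writes to a logger (sketch)."""

    def __init__(self, logger, level=logging.INFO):
        self.logger = logger
        self.level = level

    def write(self, message):
        if message.strip():
            self.logger.log(self.level, message.rstrip())

    def flush(self):
        pass


log = logging.getLogger("airflow.task")
with contextlib.redirect_stdout(StreamToLogger(log)), \
        contextlib.redirect_stderr(StreamToLogger(log, logging.WARNING)):
    print("this ends up in the task log instead of raw stdout")
```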
Until now, the DAG processor had its own logging implementation,
making it hard to adjust for certain use cases like working in a
container.
This patch moves everything to the standard logging framework.
Closes #2728 from bolkedebruin/AIRFLOW-1018
Change the logging configuration to make use of the standard Python
logging and make it easily configurable. Some of the settings are no
longer needed since they can easily be implemented in the config file.
Closes #2631 from Fokko/AIRFLOW-1611-customize-logging-in-airflow
Celery config loading was broken as it was just passing
a string. This fixes it by loading it as a module with an
attribute. Inspired by Django's module loading.
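The loading pattern is essentially the import-by-dotted-path idiom (a sketch; the dotted path shown is illustrative):

```python
from importlib import import_module


def load_celery_config(dotted_path):
    """Load e.g. 'my_package.celery_config.CELERY_CONFIG' as a module attribute."""
    module_path, attr_name = dotted_path.rsplit(".", 1)
    module = import_module(module_path)
    return getattr(module, attr_name)
```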
In all the popular languages the variable name log is the de facto
standard for logging. Rename LoggingMixin.py to logging_mixin.py to
comply with the Python standard.
When using .logger a deprecation warning will be emitted.
Closes #2604 from Fokko/AIRFLOW-1604-logger-to-log
Clean up the way of logging within Airflow. Remove the old logging.py
and move to the airflow.utils.log.* interface. Remove setting up
logging outside of the settings/configuration code. Move away from
string formatting to logging_function(msg, *args).
Closes #2592 from Fokko/AIRFLOW-1582-Improve-logging-structure
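The last point means preferring the logging framework's own argument interpolation over pre-formatted strings, for example:

```python
import logging

log = logging.getLogger(__name__)
dag_id = "example_dag"

# Discouraged: the string is built even if the log level is disabled.
log.info("Processing DAG {}".format(dag_id))

# Preferred: formatting is deferred to the logging framework.
log.info("Processing DAG %s", dag_id)
```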
This PR adds configurable task logging to Airflow.
Please refer to #2422 for previous discussions.
This is the first step of making the entire Airflow logging
configurable ([AIRFLOW-1454](https://issues.apache.org/jira/browse/AIRFLOW-1454)).
Closes #2464 from AllisonWang/allison--log-abstraction
This PR splits logs based on try number and adds
tabs to display different task instance tries.
**Note this PR is a temporary change for
separating task attempts. The code in this PR will
be refactored in the future. Please refer to #2422
for Airflow logging abstractions redesign.**
Testing:
1. Added unit tests.
2. Tested on localhost.
3. Tested on production environment with S3 remote
storage, MySQL database, Redis, one Airflow
scheduler and two airflow workers.
Closes #2383 from AllisonWang/allison--add-task-attempt
The kill_process_tree function comments state that it uses SIGKILL
when it actually uses SIGTERM. We should update this to be correct,
as well as log results.
Closes #2241 from saguziel/aguziel-kill-processes
Avoid unnecessary backfills by having start dates of just a few days
ago. Adds a utility function airflow.utils.dates.days_ago().
Closes #2068 from jlowin/example-start-date
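Typical usage in an example DAG (the DAG id and schedule here are illustrative):

```python
from airflow import DAG
from airflow.utils.dates import days_ago

# A rolling start date a couple of days in the past avoids a long
# backfill when an example DAG is first enabled.
dag = DAG(
    dag_id="example_days_ago",
    start_date=days_ago(2),
    schedule_interval="@daily",
)
```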