Currently, data is ordered by the first column in
descending order, and the header row comes first
only if the first column is an integer. This fix
puts the header as the first row regardless of the
first column's data type.
Closes #3180 from sathyaprakashg/AIRFLOW-2254
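As an illustration only (the operator touched by this change is not shown here, and the function below is a hypothetical sketch), the idea is to emit the header row before any data rows instead of relying on the sort order of the result set:

    import csv

    def write_results_to_csv(cursor, path):
        """Write query results to CSV, emitting the header first so its
        position no longer depends on how the data rows are ordered."""
        with open(path, "w", newline="") as f:
            writer = csv.writer(f)
            # The header row always goes first, regardless of the first
            # column's data type or the ordering of the result set.
            writer.writerow([col[0] for col in cursor.description])
            for row in cursor:
                writer.writerow(row)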
The Google Cloud operators use both
google_cloud_storage_default and google_cloud_default
as a default conn_id. This is confusing, and the
google_cloud_storage_default conn_id isn't
initialized by default in db.py. Therefore we rename
google_cloud_storage_default to google_cloud_default
for simplicity and convenience.
Closes #3141 from Fokko/airflow-2226
sla_miss and task_instances cannot have NULL
execution_dates. The timezone migration scripts
forgot to set this properly. In addition, to make
sure MySQL does not set "ON UPDATE CURRENT_TIMESTAMP"
or MariaDB "DEFAULT 0000-00-00 00:00:00", we now
check whether explicit_defaults_for_timestamp is
turned on and otherwise fail the database upgrade.
Closes #2969, #2857. Closes #2979 from
bolkedebruin/AIRFLOW-1895
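A minimal sketch of the kind of guard this describes, assuming a SQLAlchemy engine pointing at MySQL (the function name is illustrative, not the actual migration code):

    from sqlalchemy import create_engine, text

    def assert_explicit_defaults_for_timestamp(sqlalchemy_uri):
        """Fail the upgrade early if MySQL would silently add
        ON UPDATE CURRENT_TIMESTAMP or zero-date defaults."""
        engine = create_engine(sqlalchemy_uri)
        with engine.connect() as conn:
            value = conn.execute(
                text("SELECT @@explicit_defaults_for_timestamp")
            ).scalar()
        if not value:
            raise RuntimeError(
                "Set explicit_defaults_for_timestamp=1 in your MySQL "
                "config before upgrading the Airflow metadata database."
            )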
Explicitly set the Celery backend from the config
and align the config with the Celery config, as the
current naming might be confusing.
Closes #2806 from Fokko/AIRFLOW-1840-Fix-celery-config
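A rough sketch of what explicitly passing the backend to Celery looks like; in Airflow these values come from the [celery] section of airflow.cfg, but they are hardcoded here to keep the example self-contained:

    from celery import Celery

    # In Airflow these come from the configuration file; the values
    # below are placeholders.
    broker_url = "redis://localhost:6379/0"
    result_backend = "db+postgresql://airflow:airflow@localhost/airflow"

    app = Celery(
        "airflow.executors.celery_executor",
        broker=broker_url,
        backend=result_backend,
    )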
In the migration of S3Hook to boto3 the connection
ID parameter changed to `aws_conn_id`. This fixes
the uses of `s3_conn_id` in the code base and adds
a note to UPDATING.md about the change.
While correcting the tests for S3ToHiveTransfer I
noticed that S3Hook.get_key was returning a
dictionary, rather than the S3.Object mentioned in
its docstring. The important thing that was missing
was the ability to get the key name from the return
value of a call to get_wildcard_key.
Closes #2795 from ashb/AIRFLOW-1795-s3hook_boto3_fixes
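A usage sketch of the renamed parameter and the documented return type; the bucket and key names are made up:

    from airflow.hooks.S3_hook import S3Hook

    # The connection parameter is now aws_conn_id (previously s3_conn_id).
    hook = S3Hook(aws_conn_id="aws_default")

    # get_key returns a boto3 S3.Object, so the key name is available
    # directly on the returned object.
    obj = hook.get_key("data/part-0000.csv", bucket_name="my-bucket")
    print(obj.key)

    # The same applies to wildcard lookups.
    match = hook.get_wildcard_key("data/part-*", bucket_name="my-bucket")
    if match is not None:
        print(match.key)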
Before initializing the logging framework, we want
to set the Python path so the logging config can be
found.
Closes #2721 from Fokko/AIRFLOW-1731-import-pythonpath
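A minimal sketch of the ordering this enforces; the directory and module names are illustrative, not Airflow's actual defaults:

    import logging.config
    import os
    import sys

    # Make the user's config directory importable *before* trying to
    # load a custom logging configuration from it.
    config_dir = os.path.expanduser("~/airflow/config")
    if config_dir not in sys.path:
        sys.path.append(config_dir)

    try:
        # "log_config.LOGGING_CONFIG" is an illustrative name for a
        # user-supplied logging dict, not a guaranteed Airflow module.
        from log_config import LOGGING_CONFIG
    except ImportError:
        LOGGING_CONFIG = None

    if LOGGING_CONFIG:
        logging.config.dictConfig(LOGGING_CONFIG)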
Change the logging configuration to make use of the
standard Python logging framework and make it easily
configurable. Some settings are no longer needed
since they can easily be implemented in the config
file.
Closes #2631 from Fokko/AIRFLOW-1611-customize-logging-in-airflow
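For context, the standard-library mechanism this builds on is logging.config.dictConfig; a minimal, self-contained example (not Airflow's actual default logging config):

    import logging
    import logging.config

    LOGGING_CONFIG = {
        "version": 1,
        "formatters": {
            "airflow": {
                "format": "[%(asctime)s] {%(name)s} %(levelname)s - %(message)s",
            },
        },
        "handlers": {
            "console": {
                "class": "logging.StreamHandler",
                "formatter": "airflow",
            },
        },
        "root": {"handlers": ["console"], "level": "INFO"},
    }

    logging.config.dictConfig(LOGGING_CONFIG)
    logging.getLogger("airflow.task").info("logging configured in one place")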
Clean up the way logging is done within Airflow.
Remove the old logging.py and move to the
airflow.utils.log.* interface. Remove setting up
logging outside of the settings/configuration code.
Move away from string formatting to
logging_function(msg, *args).
Closes #2592 from Fokko/AIRFLOW-1582-Improve-logging-structure
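The last point amounts to deferring string interpolation to the logging framework; a small before/after sketch:

    import logging

    log = logging.getLogger(__name__)
    task_id = "example_task"

    # Old style: the message is always formatted, even when the log
    # level is disabled.
    log.debug("Executing task {}".format(task_id))

    # New style: arguments are only interpolated if the record is
    # actually emitted.
    log.debug("Executing task %s", task_id)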
PickleType in XCom allows remote code execution. In
order to deprecate it without changing the MySQL
table schema, change PickleType to LargeBinary,
because both map to the blob type in MySQL. Add
"enable_pickling" to the function signature to
control using either the pickle type or JSON.
"enable_pickling" should also be added to the core
section of airflow.cfg.
Picked up where
https://github.com/apache/incubator-airflow/pull/2132
left off. Took this PR, fixed merge conflicts, added
documentation/tests, fixed broken tests/operators,
and fixed the Python 3 issues.
Closes #2518 from aoen/disable-pickle-type
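A simplified sketch of the serialization switch the flag controls (not the actual XCom model code):

    import json
    import pickle

    def serialize_value(value, enable_pickling):
        """Store the value as a pickled blob only when explicitly
        allowed; otherwise store JSON-encoded bytes, which are safe
        to deserialize."""
        if enable_pickling:
            return pickle.dumps(value)
        return json.dumps(value).encode("utf-8")

    def deserialize_value(blob, enable_pickling):
        if enable_pickling:
            return pickle.loads(blob)
        return json.loads(blob.decode("utf-8"))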
This PR updates the Airflow configuration
documentation to include a recent change that splits
task logs by try number (#2383).
Closes #2467 from AllisonWang/allison--update-doc
subprocess.Popen forks before doing execv. This makes it difficult
for some manager daemons (like supervisord) to send kill signals.
This patch uses os.execve directly. os.execve takes over the current
process and thus responds correctly to signals.
* Resolves residue in ISSUE-852
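For reference, replacing the fork-then-exec pattern with a direct exec looks roughly like this; the command is a placeholder and the executable is resolved manually because os.execve does not search PATH:

    import os
    import shutil

    # Placeholder command for illustration.
    executable = shutil.which("airflow") or "/usr/local/bin/airflow"
    args = [executable, "worker"]

    # Replace the current process image in place (no fork), so a manager
    # daemon such as supervisord keeps signalling the same PID it started.
    os.execve(executable, args, os.environ)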
Airflow spawns children in the form of a webserver, scheduler, and executors.
If the parent gets terminated (SIGTERM) it needs to properly propagate the
signals to the children, otherwise these will get orphaned and end up as
zombie processes. This patch resolves that issue.
In addition, Airflow does not store the PID of its services, so they cannot be
managed by traditional unix system services like rc.d / upstart / systemd
and the like. This patch adds the "--pid" flag. By default it stores the
PID in ~/airflow/airflow-<service>.pid
Lastly, the patch adds support for different log file locations: log,
stdout, and stderr (respectively: --log-file, --stdout, --stderr). By
default these are stored in ~/airflow/airflow-<service>.log/out/err.
* Resolves ISSUE-852
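A simplified sketch of the two mechanisms described above, signal propagation to the process group and writing a PID file; the paths mirror the defaults mentioned, but the code itself is illustrative:

    import os
    import signal

    def write_pid_file(service, pid_dir=os.path.expanduser("~/airflow")):
        """Store the PID so init systems (rc.d / upstart / systemd) can
        manage the service; mirrors the default --pid location."""
        pid_file = os.path.join(pid_dir, "airflow-{}.pid".format(service))
        with open(pid_file, "w") as f:
            f.write(str(os.getpid()))
        return pid_file

    def handle_sigterm(signum, frame):
        """Propagate SIGTERM to the whole process group so children do
        not end up orphaned as zombie processes."""
        # Restore the default handler first so the group-wide signal
        # does not re-enter this handler on the parent.
        signal.signal(signal.SIGTERM, signal.SIG_DFL)
        os.killpg(os.getpgid(os.getpid()), signal.SIGTERM)

    signal.signal(signal.SIGTERM, handle_sigterm)
    write_pid_file("scheduler")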
BaseOperator silently accepts any arguments. This deprecates the
behavior with a warning that says it will be forbidden in Airflow 2.0.
This PR also turns on DeprecationWarnings by default, which in turn
revealed that inspect.getargspec is deprecated. Here it is replaced by
`inspect.signature` (Python 3) or `funcsigs.signature` (Python 2).
Lastly, this brought to light that example_http_operator was
passing an illegal argument.
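A minimal sketch of the deprecation behaviour described here, using inspect.signature on Python 3 (not the actual BaseOperator code, and the warning text is illustrative):

    import inspect
    import warnings

    class ExampleOperator:
        def __init__(self, task_id, retries=0, **kwargs):
            if kwargs:
                warnings.warn(
                    "Invalid arguments were passed to {}: {}. Support for "
                    "passing such arguments will be dropped in Airflow 2.0.".format(
                        type(self).__name__, ", ".join(sorted(kwargs))
                    ),
                    category=DeprecationWarning,
                )
            self.task_id = task_id
            self.retries = retries

    # inspect.signature replaces the deprecated inspect.getargspec when
    # the accepted constructor parameters need to be introspected.
    print(inspect.signature(ExampleOperator.__init__))

    ExampleOperator(task_id="t1", unknown_arg=42)  # emits a DeprecationWarning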