Commit graph

37 commits

Author SHA1 Message Date
Tao feng 64d9501667 [AIRFLOW-74] SubdagOperators can consume all celeryd worker processes
Closes #3251 from feng-tao/airflow-74
2018-04-24 10:13:25 -07:00
DerekRoy 8e83e2b3ef [AIRFLOW-2350] Fix grammar in UPDATING.md
Closes #3248 from r39132/patch-1
2018-04-21 08:34:16 +02:00
Sathyaprakash Govindasamy a148043107 [AIRFLOW-2254] Put header as first row in unload
Currently, data is ordered by the first column in descending order,
and the header row comes first only if the first column is an integer.
This fix puts the header as the first row regardless of the first
column's data type.

Closes #3180 from sathyaprakashg/AIRFLOW-2254
2018-04-16 10:21:22 +02:00
Dan Davydov 3c4f1fd9e6 [AIRFLOW-2027] Only trigger sleep in scheduler after all files have parsed
Closes #2986 from aoen/ddavydov--open_source_disable_unecessary_sleep_in_scheduler_loop
2018-04-09 10:22:11 +02:00
Taylor D. Edmiston 9bdcc4760a [AIRFLOW-2282] Fix grammar in UPDATING.md
Also remove trailing whitespace.
2018-04-04 17:28:53 -04:00
Tao feng bf86b89439 [AIRFLOW-2233] Update updating.md to include the info of hdfs_sensors renaming
Closes #3145 from feng-tao/airflow-2233
2018-03-31 11:16:46 +02:00
Joy Gao 05e1861e24 [AIRFLOW-1433][AIRFLOW-85] New Airflow Webserver UI with RBAC support
Closes #3015 from jgao54/rbac
2018-03-23 09:18:48 +01:00
Fokko Driesprong bb287ecf5b [AIRFLOW-2226] Rename google_cloud_storage_default to google_cloud_default
The Google Cloud operators use both google_cloud_storage_default and
google_cloud_default as a default conn_id. This is confusing, and the
google_cloud_storage_default conn_id isn't initialized by default in
db.py. Therefore we rename google_cloud_storage_default to
google_cloud_default for simplicity and convenience.

Closes #3141 from Fokko/airflow-2226
2018-03-19 22:02:10 +01:00
Bolke de Bruin a1d5551777 [AIRFLOW-1895] Fix primary key integrity for mysql
sla_miss and task_instances cannot have NULL execution_dates; the
timezone migration scripts forgot to set this properly. In addition,
to make sure MySQL does not set "ON UPDATE CURRENT_TIMESTAMP" or
MariaDB "DEFAULT 0000-00-00 00:00:00", we now check whether
explicit_defaults_for_timestamp is turned on and otherwise fail the
database upgrade.

Closes #2969, #2857

Closes #2979 from bolkedebruin/AIRFLOW-1895
2018-01-27 09:01:10 +01:00
fenglu-g cc9295fe37 [AIRFLOW-1953] Add labels to dataflow operators
Closes #2913 from fenglu-g/master
2018-01-03 11:16:39 -08:00
Joy Gao c0dffb57c2 [AIRFLOW-1821] Enhance default logging config by removing extra loggers
Closes #2793 from jgao54/logging-enhancement
2017-12-22 14:07:29 +01:00
Fokko Driesprong 30076f1e45 [AIRFLOW-1840] Make celery configuration congruent with Celery 4
Explicitly set the celery backend from the config and align our
config naming with Celery's, as the mismatch can be confusing.

Closes #2806 from Fokko/AIRFLOW-1840-Fix-celery-config
2017-12-11 18:56:29 +01:00
Ash Berlin-Taylor 98df0d6e3b [AIRFLOW-1795] Correctly call S3Hook after migration to boto3
In the migration of S3Hook to boto3 the connection ID parameter
changed to `aws_conn_id`. This fixes the uses of `s3_conn_id` in the
code base and adds a note to UPDATING.md about the change.

In correcting the tests for S3ToHiveTransfer I noticed that
S3Hook.get_key was returning a dictionary rather than the S3.Object
mentioned in its docstring. The important thing that was missing was
the ability to get the key name from the return of a call to
get_wildcard_key.

Closes #2795 from ashb/AIRFLOW-1795-s3hook_boto3_fixes
2017-11-18 14:07:38 +01:00
Fokko Driesprong 635ab01a76 [AIRFLOW-1731] Set pythonpath for logging
Before initializing the logging framework, we want
to set the python
path so the logging config can be found.

Closes #2721 from Fokko/AIRFLOW-1731-import-pythonpath
2017-10-27 16:02:56 +02:00
Dan Davydov 21e94c7d15 [AIRFLOW-1697] Mode to disable charts endpoint 2017-10-10 11:33:50 -07:00
Chris Riccomini ebe715c565 [AIRFLOW-1691] Add better Google cloud logging documentation
Closes #2671 from criccomini/fix-log-docs
2017-10-09 10:32:34 -07:00
Crystal Qian dd861f8cd0 [AIRFLOW-1323] Made Dataproc operator parameter names consistent
Closes #2636 from cjqian/1323
2017-10-03 11:15:27 +02:00
Fokko Driesprong 3c3a65a3fe [AIRFLOW-1611] Customize logging
Change the logging configuration to make use of the standard Python
logging and make it easily configurable. Some settings are no longer
needed since they can easily be implemented in the config file.

Closes #2631 from Fokko/AIRFLOW-1611-customize-logging-in-airflow
2017-10-02 17:14:01 +02:00
Fokko Driesprong a7a518902d [AIRFLOW-1582] Improve logging within Airflow
Clean up the way of logging within Airflow. Remove the old logging.py
and move to the airflow.utils.log.* interface. Stop setting up
logging outside of the settings/configuration code. Move away from
string formatting to logging_function(msg, *args).

Closes #2592 from Fokko/AIRFLOW-1582-Improve-logging-structure
2017-09-13 09:36:58 +02:00
Dan Davydov 4cf904cf5a [AIRFLOW-855] Replace PickleType with LargeBinary in XCom
PickleType in XCom allows remote code execution. In order to
deprecate it without changing the MySQL table schema, change
PickleType to LargeBinary, because both map to the blob type in
MySQL. Add "enable_pickling" to the function signature to control
using either pickle or JSON; "enable_pickling" should also be added
to the core section of airflow.cfg.

Picked up where https://github.com/apache/incubator-airflow/pull/2132
left off. Took this PR, fixed merge conflicts, added
documentation/tests, fixed broken tests/operators, and fixed the
Python 3 issues.

Closes #2518 from aoen/disable-pickle-type
2017-08-15 12:24:07 -07:00
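The serialization switch this commit describes can be sketched roughly as follows. This is a minimal illustration, not Airflow's actual XCom code; `serialize_value`/`deserialize_value` are hypothetical names, and the only assumption taken from the commit is that both serializers must emit bytes so the value fits a LargeBinary (BLOB) column:

```python
import json
import pickle

def serialize_value(value, enable_pickling):
    # Both code paths return bytes, matching the LargeBinary column.
    # "enable_pickling" mirrors the [core] airflow.cfg switch from the
    # commit message; pickle stays available but can be disabled in
    # favor of JSON, which cannot execute code on load.
    if enable_pickling:
        return pickle.dumps(value)
    return json.dumps(value).encode("utf-8")

def deserialize_value(blob, enable_pickling):
    if enable_pickling:
        return pickle.loads(blob)
    return json.loads(blob.decode("utf-8"))
```

Either mode round-trips plain JSON-compatible values; only the pickle mode carries the remote-code-execution risk being deprecated here.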
AllisonWang 6825d97b82 [AIRFLOW-1443] Update Airflow configuration documentation
This PR updates the Airflow configuration documentation to include a
recent change that splits task logs by try number (#2383).

Closes #2467 from AllisonWang/allison--update-doc
2017-08-09 14:49:56 -07:00
Bolke de Bruin 3927723263 Fix new SSH documentation 2017-07-20 22:12:31 +02:00
Jay fe0edeaab5 [AIRFLOW-756][AIRFLOW-751] Replace ssh hook, operator & sftp operator with paramiko based
Closes #1999 from jhsenjaliya/AIRFLOW-756
2017-07-20 22:07:45 +02:00
Younghee Kwon c450b60878 [AIRFLOW-1338][AIRFLOW-782] Add GCP dataflow hook runner change to UPDATING.md
Closes #2326 from yk5/df-python
2017-06-23 15:07:45 -07:00
Chris Riccomini cb336464cc [AIRFLOW-XXX] Updating CHANGELOG, README, and UPDATING after 1.8.1 release 2017-05-09 13:20:31 -07:00
Jeremiah Lowin 4da3611c46 [AIRFLOW-886] Pass result to post_execute() hook
The post_execute() hook should receive
the Operator result in addition to the
execution context.
2017-02-18 18:38:58 -05:00
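The changed hook contract above can be sketched as a toy example. These are hypothetical names, not Airflow's actual classes; the point is only that the runner now passes the `execute()` return value into `post_execute()` alongside the context:

```python
class BaseTask:
    """Toy operator illustrating the post_execute(result) contract."""

    def execute(self, context):
        return {"rows": 3}

    def post_execute(self, context, result=None):
        # The hook now receives the execute() result as well as the
        # execution context.
        self.last_result = result

def run(task, context):
    result = task.execute(context)
    task.post_execute(context, result)  # result passed alongside context
    return result
```

Defaulting `result=None` keeps older hooks that only accept a context from breaking immediately, at the cost of a looser signature.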
Jeremiah Lowin 6e22102782 [AIRFLOW-862] Add DaskExecutor
Adds a DaskExecutor for running Airflow tasks
in Dask clusters.

Closes #2067 from jlowin/dask-executor
2017-02-12 16:06:31 -05:00
Bolke de Bruin b56e642247 Add known issue of 'num_runs' 2017-02-10 14:54:46 +01:00
Bolke de Bruin e63cb1fced Add pool upgrade issue description 2017-02-09 16:10:17 +01:00
Bolke de Bruin c64832718b [AIRFLOW-789] Update UPDATING.md
Closes #2011 from bolkedebruin/AIRFLOW-789
2017-02-01 15:52:50 +00:00
Alex Van Boxel 7e691d3f60 Update upgrade documentation for Google Cloud
Closes #1979 from alexvanboxel/pr/doc_gcloud
2017-01-10 09:03:44 +01:00
Jeremiah Lowin 9a61a5bd58 [AIRFLOW-31][AIRFLOW-200] Add note to updating.md
AIRFLOW-31 and AIRFLOW-200 deprecated the old import mechanism and should be noted in UPDATING.md

Closes #1643 from jlowin/patch-1
2016-07-06 10:41:46 +02:00
Rob Froetscher 8d501b0cea [AIRFLOW-171] Add upgrade notes on email and S3 to 1.7.1.2
Closes #1587 from rfroetscher/upgrading_readme
2016-06-14 12:27:58 +02:00
Bolke de Bruin bd414161da Use os.execvp instead of subprocess.Popen for the webserver
subprocess.Popen forks before doing execv. This makes it difficult
for some manager daemons (like supervisord) to send kill signals.
This patch uses os.execvp directly; os.execvp takes over the current
process and thus responds correctly to signals.

* Resolves residue in ISSUE-852
2016-04-21 16:23:11 +02:00
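The rationale above can be shown with a short sketch. `exec_service` is a hypothetical name, not the actual Airflow CLI code; it only demonstrates the os.execvp behavior the commit relies on:

```python
import os

def exec_service(cmd):
    # subprocess.Popen forks and then execs, so a process manager such
    # as supervisord ends up signalling a short-lived wrapper process
    # rather than the service itself. os.execvp instead replaces *this*
    # process image, so signals from the manager reach the service
    # directly.
    os.execvp(cmd[0], cmd)  # never returns on success
```

A caller that forks first can observe that the child's exit status is the exec'd command's status, not a wrapper's.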
Bolke de Bruin e8c1144bb8 Add consistent and thorough signal handling and logging
Airflow spawns children in the form of a webserver, scheduler, and
executors. If the parent gets terminated (SIGTERM) it needs to
properly propagate the signals to the children, otherwise these will
get orphaned and end up as zombie processes. This patch resolves
that issue.

In addition, Airflow does not store the PID of its services, so they
cannot be managed by traditional unix service managers like rc.d /
upstart / systemd and the like. This patch adds the "--pid" flag. By
default it stores the PID in ~/airflow/airflow-<service>.pid

Lastly, the patch adds support for different log file locations: log,
stdout, and stderr (respectively: --log-file, --stdout, --stderr). By
default these are stored in ~/airflow/airflow-<service>.log/out/err.

* Resolves ISSUE-852
2016-04-06 20:40:43 +02:00
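The signal propagation and PID-file handling described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Airflow's actual code; `CHILD_PIDS`, `propagate_signal`, and `write_pid_file` are hypothetical names, and only the default PID path mirrors the commit:

```python
import os
import signal

CHILD_PIDS = []  # pids of spawned webserver/scheduler/executor children

def propagate_signal(signum, frame):
    # Forward the termination signal to every child so none are
    # orphaned and left behind as zombie processes.
    for pid in CHILD_PIDS:
        try:
            os.kill(pid, signum)
        except ProcessLookupError:
            pass  # child already exited

def write_pid_file(service, pid_dir=os.path.expanduser("~/airflow")):
    # Default location mirrors the commit: ~/airflow/airflow-<service>.pid
    os.makedirs(pid_dir, exist_ok=True)
    path = os.path.join(pid_dir, "airflow-%s.pid" % service)
    with open(path, "w") as f:
        f.write(str(os.getpid()))
    return path

# Install the handler in the parent before spawning any children.
signal.signal(signal.SIGTERM, propagate_signal)
```

With the PID on disk, an init system like systemd or supervisord can send SIGTERM to the right process, and the handler fans it out to the children.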
Jeremiah Lowin 10ee622330 Deprecate *args and **kwargs in BaseOperator
BaseOperator silently accepts any arguments. This deprecates the
behavior with a warning that says it will be forbidden in Airflow 2.0.

This PR also turns on DeprecationWarnings by default, which in turn
revealed that inspect.getargspec is deprecated. Here it is replaced
by `inspect.signature` (Python 3) or `funcsigs.signature` (Python 2).

Lastly, this brought to light that example_http_operator was passing
an illegal argument.
2016-04-05 10:04:55 +02:00
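The deprecation pattern above can be sketched with a toy class. This is an illustrative sketch, not Airflow's real BaseOperator; the warning text loosely paraphrases the commit's intent:

```python
import warnings

class BaseOperator:
    """Toy operator: unknown arguments still work, but now warn
    instead of being silently swallowed."""

    def __init__(self, task_id, *args, **kwargs):
        if args or kwargs:
            warnings.warn(
                "Invalid arguments were passed to BaseOperator "
                "(*args: %r, **kwargs: %r). Support for passing such "
                "arguments will be dropped in Airflow 2.0." % (args, kwargs),
                category=DeprecationWarning,
            )
        self.task_id = task_id
```

Emitting a DeprecationWarning rather than raising gives existing DAGs a release cycle to clean up their arguments before the hard failure lands.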
Bence Nagy e1fd48b2ec Set dags_are_paused_at_creation's default value to True 2016-03-31 10:53:43 +02:00