There are still celeryd_concurrency occurrences left in the code. These need to be renamed to worker_concurrency to make the config consistent with Celery.
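A hedged sketch of reading the renamed option; the [celery] section name and the getint call reflect Airflow's config API, but treat the exact names as assumptions:

    from airflow import configuration as conf

    # The option was renamed from celeryd_concurrency to worker_concurrency
    # so that the name matches Celery's own setting.
    worker_concurrency = conf.getint('celery', 'worker_concurrency')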
Closes #2870 from Fokko/AIRFLOW-1911-update-airflow-config
Options were set to the visibility timeout instead of broker_options directly. Furthermore, options should be int, float, bool, or string, not all strings.
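A minimal sketch of the kind of coercion this implies; the helper below is hypothetical, not the actual implementation:

    def coerce(value):
        # Turn a raw config string into a bool, int, float, or string.
        if value.lower() in ('true', 'false'):
            return value.lower() == 'true'
        for cast in (int, float):
            try:
                return cast(value)
            except ValueError:
                pass
        return value

    raw_options = {'visibility_timeout': '21600'}
    # {'visibility_timeout': '21600'} becomes {'visibility_timeout': 21600}
    broker_options = {k: coerce(v) for k, v in raw_options.items()}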
Closes #2867 from bolkedebruin/AIRFLOW-1908
Explicitly set the celery backend from the config, and align the config with the celery config, as the mismatch might be confusing.
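A hedged sketch of wiring the backend from Airflow's config into the Celery app; the option names are assumptions based on this change:

    from celery import Celery
    from airflow import configuration as conf

    app = Celery(
        'airflow.executors.celery_executor',
        broker=conf.get('celery', 'broker_url'),
        # The result backend is now set explicitly from the config
        # instead of falling back to Celery's defaults.
        backend=conf.get('celery', 'result_backend'))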
Closes #2806 from Fokko/AIRFLOW-1840-Fix-celery-config
https://github.com/spulec/moto/pull/1048 introduced `docker` as a dependency in Moto, causing a conflict because Airflow uses `docker-py`. As the two packages cannot be installed together, Moto is pinned to the version prior to that change.
In the very early days, the Airflow scheduler needed to be restarted every so often to properly take new DAG_FOLDERS mutations into account. This is no longer required.
Closes #2677 from mistercrunch/scheduler_runs
The celery config is currently part of the celery executor definition. This is really inflexible for users wanting to change it. In addition, Celery 4 is moving to lowercase setting names.
Closes #2542 from bolkedebruin/upgrade_celery
In all the popular languages, the variable name `log` is the de facto standard for logging. Rename LoggingMixin.py to logging_mixin.py to comply with the Python standard. When using `.logger`, a deprecation warning is emitted.
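A minimal sketch of how such a deprecation shim can work (illustrative, not necessarily the exact code):

    import logging
    import warnings

    class LoggingMixin(object):
        @property
        def log(self):
            # The de facto standard name for the logger.
            return logging.getLogger(
                self.__class__.__module__ + '.' + self.__class__.__name__)

        @property
        def logger(self):
            # Deprecated alias kept for backwards compatibility.
            warnings.warn(
                'logger has been renamed to log', DeprecationWarning,
                stacklevel=2)
            return self.log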
Closes #2604 from Fokko/AIRFLOW-1604-logger-to-log
Make the druid operator and hook more specific. This allows us to have a more flexible configuration, for example to ingest parquet. Also get rid of the PyDruid extension, since it is more focused on querying druid than on ingesting data; plain requests is sufficient to submit an indexing job. Add a test to the hive_to_druid operator to make sure it behaves as we expect. Furthermore, cleaned up the docstring a bit.
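Submitting an ingestion task with plain requests looks roughly like this; the Overlord URL and the spec are placeholders, while /druid/indexer/v1/task is Druid's standard indexing endpoint:

    import json
    import requests

    # POST an ingestion spec to the Druid Overlord.
    url = 'http://druid-overlord:8090/druid/indexer/v1/task'
    ingest_spec = {'type': 'index_hadoop', 'spec': {}}  # placeholder spec
    response = requests.post(
        url, data=json.dumps(ingest_spec),
        headers={'Content-Type': 'application/json'})
    response.raise_for_status()
    task_id = response.json()['task']  # Druid returns the task id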
Closes #2378 from Fokko/AIRFLOW-1324-make-more-general-druid-hook-and-operator
1. Upgrade qds_sdk version to latest
2. Add support to run Zeppelin Notebooks
3. Move initialization of QuboleHook out of __init__() (see the sketch below)
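Moving hook construction out of the constructor is the usual lazy-initialization pattern; a hypothetical sketch (class name and call signatures are illustrative only):

    from airflow.models import BaseOperator
    from airflow.contrib.hooks.qubole_hook import QuboleHook

    class ExampleQuboleOperator(BaseOperator):  # hypothetical name
        def __init__(self, **kwargs):
            super(ExampleQuboleOperator, self).__init__(**kwargs)
            self.hook = None  # the hook is no longer built here

        def execute(self, context):
            # Build the hook lazily, at run time rather than at DAG
            # parse time, keeping DAG parsing cheap and side-effect free.
            if self.hook is None:
                self.hook = QuboleHook()
            return self.hook.execute(context)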
Closes #2322 from msumit/AIRFLOW-1192
Rename all unit tests under tests/contrib to start with test_* and fix broken unit tests so that they run for both the Python 2 and 3 builds.
Closes #2234 from hgrif/AIRFLOW-1094
This PR implements a hook to interface with Azure storage over wasb:// via azure-storage; adds sensors to check for blobs or prefixes; and adds an operator to transfer a local file to Blob Storage. The design is similar to that of the S3Hook in airflow.operators.S3_hook.
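A hedged usage sketch; the class and method names below were introduced by this change, so treat the exact signatures as assumptions:

    from airflow.contrib.hooks.wasb_hook import WasbHook

    hook = WasbHook(wasb_conn_id='wasb_default')
    # Upload a local file to Azure Blob Storage over wasb://
    hook.load_file('/tmp/report.csv', container_name='mycontainer',
                   blob_name='reports/report.csv')
    # The sensors build on checks like this one:
    assert hook.check_for_blob('mycontainer', 'reports/report.csv')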
Closes #2216 from hgrif/AIRFLOW-1065
The ShortCircuitOperator, BranchPythonOperator, and LatestOnlyOperator were arbitrarily changing the states of TaskInstances without locking them in the database. As the scheduler checks the state of dag runs asynchronously, the dag run state could be set to failed while the operators were still updating the downstream tasks.
A better fix would be to use the dag run itself in the context of the operator.
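In SQLAlchemy terms, locking the rows while updating amounts to a SELECT ... FOR UPDATE; a minimal sketch (the query is an illustration, not the fix itself):

    from airflow.models import TaskInstance
    from airflow.utils.db import provide_session

    @provide_session
    def skip_downstream(task_ids, execution_date, session=None):
        # with_for_update() locks the selected rows until commit, so the
        # scheduler cannot act on a half-updated set of task instances.
        tis = (session.query(TaskInstance)
               .filter(TaskInstance.task_id.in_(task_ids),
                       TaskInstance.execution_date == execution_date)
               .with_for_update()
               .all())
        for ti in tis:
            ti.state = 'skipped'
        session.commit()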
The output from the subprocess is bytes when universal_newlines is set to False (the default). This fails on Python 3 but works fine on Python 2. Fixed, with a working unit test.
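The incompatibility in a nutshell (illustrative):

    import subprocess

    # With universal_newlines=False (the default), the output is bytes.
    out = subprocess.check_output(['echo', 'hello'])
    line = out.decode('utf-8').strip()  # required on Python 3

    # Alternatively, ask for text directly:
    out = subprocess.check_output(['echo', 'hello'], universal_newlines=True)
    line = out.strip()  # already str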
Closes #2158 from abij/AIRFLOW-840
We add the Apache-licensed bleach library and use it to sanitize html passed to Markup (which is supposed to be already escaped). This avoids some XSS issues with unsanitized user input being displayed.
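Roughly, the pattern is (illustrative):

    import bleach
    from markupsafe import Markup

    user_input = '<script>alert("xss")</script><b>hello</b>'
    # Sanitize before wrapping in Markup; Markup marks the string as
    # safe and skips escaping, so it must never see raw user input.
    safe = Markup(bleach.clean(user_input))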
Closes #2193 from saguziel/aguziel-xss
Avoid unnecessary backfills by having start dates of just a few days ago. Adds a utility function airflow.utils.dates.days_ago().
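Typical usage in a DAG definition:

    from airflow.utils.dates import days_ago

    default_args = {
        'owner': 'airflow',
        # A recent start date avoids a large accidental backfill.
        'start_date': days_ago(2),
    }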
Closes #2068 from jlowin/example-start-date
Submitting on behalf of plypaul.
Please accept this PR that addresses the following issues:
- https://issues.apache.org/jira/browse/AIRFLOW-219
- https://issues.apache.org/jira/browse/AIRFLOW-398
Testing Done:
- Running on Airbnb prod (though on a different mergebase) for many months
Credits:
- Impersonation work: georgeke did most of the work, but plypaul did quite a bit of work too.
- Cgroups: plypaul did most of the work; I just did some touch-up/bug fixes (see commit history; the cgroups + impersonation commit is actually plypaul's, not mine).
Closes #1934 from aoen/ddavydov/cgroups_and_impersonation_after_rebase
Extend SchedulerJob to instrument the execution performance of task instances contained in each DAG. We want to know if any DAG is starved of resources, and this will be reflected in the stats printed out at the end of the test run.
This test is for instrumenting the operational impact of
https://github.com/apache/incubator-airflow/pull/1906.
Closes #1919 from vijaysbhat/scheduler_perf_tool
This implements a framework for API calls to Airflow. Currently all access is done via the cli or the web ui. Especially in the context of the cli this raises security concerns, which can be alleviated with a secured API call over the wire. Secondly, integration with other systems is harder if you have to call a cli. For public facing endpoints, JSON is used.
As an example, the trigger_dag functionality is now made into an API call.
Backwards compatibility is retained by switching to a LocalClient.
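A hedged sketch of triggering a DAG through the client layer; the module path and signatures are assumptions based on this change:

    from airflow.api.client.local_client import Client

    # The LocalClient keeps the old in-process behaviour; a JSON client
    # can talk to the secured HTTP endpoint instead.
    client = Client(api_base_url=None, auth=None)
    client.trigger_dag(dag_id='example_dag', conf='{"param": 1}')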
Both utcnow() and now() return fractional seconds. These are sometimes used in primary keys (e.g. in task_instance). If MySQL is not configured to store these fractional seconds, a primary key might fail (e.g. at session.merge), resulting in a duplicate entry being added, or worse. Postgres does store fractional seconds if left unconfigured; SQLite needs to be examined.
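The failure mode, illustratively:

    from datetime import datetime

    # Two distinct timestamps within the same second...
    t1 = datetime(2016, 1, 1, 12, 0, 0, 250000)
    t2 = datetime(2016, 1, 1, 12, 0, 0, 750000)

    # ...collide once the database drops the microseconds, so a primary
    # key that includes the timestamp raises a duplicate entry error.
    assert t1.replace(microsecond=0) == t2.replace(microsecond=0)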
Dear Airflow Maintainers,
Please accept this PR that addresses the following issues:
- https://issues.apache.org/jira/browse/AIRFLOW-512
Testing Done:
- N/A, but ran core tests: `./run_unit_tests.sh tests.core:CoreTest -s`
Closes #1800 from dgingrich/master
The Travis cache can contain faulty files. This results in builds that fail, as they depend on certain components being available, e.g. hive. This addresses the issue for hive by re-downloading if unpacking fails.
- Tell gunicorn to prepend `[ready]` to the worker process name once the worker is ready (to serve requests) - in particular this happens after the DAGs folder is parsed
- The Airflow cli runs gunicorn as a child process instead of `execvp`-ing over itself
- The Airflow cli monitors gunicorn worker processes and restarts them by sending TTIN/TTOU signals to the gunicorn master process (see the sketch after these notes)
- Fix bug where `conf.get('webserver', 'workers')` and `conf.get('webserver', 'webserver_worker_timeout')` were ignored
- Alternatively, https://github.com/apache/incubator-airflow/pull/1684/files does the same thing, but the worker-restart script is provided separately for the user to run
- Start airflow, observe that workers are restarted
- Add new dags to dags folder and check that they show up
- Run `siege` against airflow while server is restarting and confirm that all requests succeed
- Run with configuration set to `batch_size = 0`, `batch_size = 1` and `batch_size = 4`
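A minimal sketch of the rolling-restart mechanism described above (pid handling and readiness checks simplified):

    import os
    import signal
    import time

    def roll_workers(gunicorn_master_pid, num_workers):
        # Spawn fresh workers first, then retire the old ones.
        for _ in range(num_workers):
            os.kill(gunicorn_master_pid, signal.SIGTTIN)  # +1 worker
            time.sleep(1)  # wait for the new worker to report [ready]
        for _ in range(num_workers):
            os.kill(gunicorn_master_pid, signal.SIGTTOU)  # -1 worker
            time.sleep(1)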
Closes #1685 from zodiac/xuanji_gunicorn_rolling_restart_2
Instead of parsing the DAG definition files in the same process as the
scheduler, this change parses the files in a child process. This helps
to isolate the scheduler from bad user code.
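Conceptually (a simplified sketch, not the scheduler's actual code):

    import multiprocessing

    def parse_dag_file(file_path, result_queue):
        # Runs in a child process: a crash or hang in user code
        # cannot take down the scheduler itself.
        dags = []  # placeholder for the DAG objects parsed from file_path
        result_queue.put((file_path, dags))

    queue = multiprocessing.Queue()
    proc = multiprocessing.Process(
        target=parse_dag_file, args=('/dags/my_dag.py', queue))
    proc.start()
    proc.join(timeout=30)  # bound the damage from pathological files
    if proc.is_alive():
        proc.terminate()  # kill parsers that exceed the timeout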
Closes #1636 from plypaul/plypaul_schedule_by_file_rebase_master