Commit Graph

470 Commits

Author SHA1 Message Date
Jarek Potiuk f710a0db49
[AIRFLOW-4757] Selectively disable missing docstrings for tests (#5400) 2019-06-10 17:51:35 +02:00
Jarek Potiuk 4b7667d2ad
[AIRFLOW-4659] Fix pylint problems for api module (#5398) 2019-06-10 13:53:52 +02:00
Bas Harenslak 6dd3f31a04 [AIRFLOW-4689] Make setup.py Pylint compatible (#5395) 2019-06-09 08:35:22 -07:00
Bas Harenslak 189bbfd85d [AIRFLOW-4670] Make airflow/example_dags Pylint compatible (#5361) 2019-06-09 07:36:33 -07:00
Jarek Potiuk 3891de68af
[AIRFLOW-4752] Add missing * in build exclusion and generated config (#5392) 2019-06-09 07:35:17 -07:00
Bas Harenslak 02ef974e4b [AIRFLOW-4669] Make airflow/dag Pylint compatible (#5362) 2019-06-09 00:26:44 -07:00
Andrii Soldatenko 0da976a0e1 [AIRFLOW-3370] Add stdout output options to Elasticsearch task log handler (#5048)
When using potentially larger offsets than JavaScript can handle, they can get parsed incorrectly on the client, resulting in the offset query getting stuck on a certain number. This patch ensures that we return a string to the client to avoid being parsed. When we run the query, we ensure the offset is set as an integer.

Add unnecessary prefix_ in config for elastic search section
2019-06-04 22:50:26 +01:00
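The offset handling described in the commit above can be illustrated with a minimal sketch. It is not the Airflow implementation: the function names `encode_offset` and `decode_offset` and the payload shape are hypothetical, chosen only to show why an integer beyond the double-precision safe range survives a round trip when sent as a string.

```python
import json

# Largest integer a JavaScript Number can represent exactly (2**53 - 1).
JS_MAX_SAFE_INTEGER = 2**53 - 1


def encode_offset(offset: int) -> str:
    """Serialize the offset as a JSON string so the client never parses it as a Number."""
    return json.dumps({"offset": str(offset)})


def decode_offset(payload: str) -> int:
    """Convert the offset back to an integer on the server before running the query."""
    return int(json.loads(payload)["offset"])


# An offset just beyond the safe range (hypothetical value for illustration).
big_offset = JS_MAX_SAFE_INTEGER + 2

# Passing the value through a double, as a JavaScript client would, loses precision.
lossy_offset = int(float(big_offset))

# Passing it as a string keeps every digit.
payload = encode_offset(big_offset)
roundtrip_offset = decode_offset(payload)
```

Here the lossy conversion stands in for what a browser does when it parses a large JSON number; the string form avoids that step entirely.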
Philippe Gagnon d1626d80b5 [AIRFLOW-4573] Import airflow_local_settings after prepare_classpath (#5330)
Moves the airflow_local_settings import code into a dedicated function
in settings.py and adds a call to it in initialize after prepare_syspath
2019-06-04 10:50:45 +01:00
Bas Harenslak 669b026c0b [AIRFLOW-4364] Add Pylint to CI (#5238) 2019-05-30 22:02:09 +02:00
Daniel Imberman f153bf5367
[AIRFLOW-4487] Move k8s executor from contrib folder to main project (#5261)
* Move k8s executor from contrib folder

Considering that the k8s executor is now fully supported by core
committers, we should move it from contrib to the primary executor
directory.
2019-05-09 16:05:31 -07:00
Ash Berlin-Taylor 188b3193c7
[AIRFLOW-4436] Don't build the same docker image twice in tests (#5209) 2019-04-30 13:43:24 +01:00
Chao-Han Tsai 9daad7ecd1 [AIRFLOW-4361] Fix flaky test_integration_run_dag_with_scheduler_failure (#5182) 2019-04-27 23:48:49 -07:00
Jiajie Zhong 0ac501faa9 [AIRFLOW-4296] Remove py2 in ci process (#5090) 2019-04-20 20:05:07 +02:00
Kaxil Naik e26e340e7c [AIRFLOW-4313] Remove the Mesos executor (#5115)
* [AIRFLOW-4313] Remove the Mesos executor

* Update UPDATING.md
2019-04-17 18:28:58 +08:00
Jarek Potiuk 5b53d9987f [AIRFLOW-4115] Multi-staging Airflow Docker image (#4936) 2019-04-13 15:03:02 +02:00
Chao-Han Tsai bd08034811 [AIRFLOW-XXX] how to setup test env with mysql (#4898) 2019-04-06 23:02:12 -07:00
Chao-Han Tsai 9d8719e17d [AIRFLOW-XXX] 1-setup-env.sh should only run in docker (#5003)
[AIRFLOW-XXX] 1-setup-env.sh should only run in docker
2019-03-29 10:55:46 -07:00
Ash Berlin-Taylor 1c43cde65c
[AIRFLOW-3743] Unify different methods of working out AIRFLOW_HOME (#4705)
There were a few ways of getting the AIRFLOW_HOME directory used
throughout the code base, giving possibly conflicting answers if they
weren't kept in sync:

- the AIRFLOW_HOME environment variable
- core/airflow_home from the config
- settings.AIRFLOW_HOME
- configuration.AIRFLOW_HOME

Since the home directory is used to compute the default path of the
config file to load, specifying the home directory again in the config
file didn't make any sense to me, and I have deprecated that.

This commit makes everything in the code base use
`settings.AIRFLOW_HOME` as the source of truth, and deprecates the
core/airflow_home config option.

There was an import cycle from settings -> logging_config ->
module_loading -> settings that needed to be broken on Python 2, so I
have moved all adjusting of sys.path into the settings module

(This issue caused me a problem where the RBAC UI wouldn't work as it
didn't find the right webserver_config.py)
2019-03-25 11:10:28 +00:00
Peter van 't Hof c1f98a89a4 [AIRFLOW-XXX] Remove verbose test on nosetests (#4913)
Something about the tests or how we run them changed and we ended up 
with a lot more lines appearing in the output, taking us over Travis' "will 
display in the UI" limit, making it harder to debug failures. This isn't a long
term fix, but improves things while we fix the tests for the better.
2019-03-17 18:17:17 +00:00
Ash Berlin-Taylor 01b6d1d8da [AIRFLOW-4053] Fix KubePodOperator Xcom on Kube 1.13.0 (#4883)
Newer versions of Kube return "failed" events for the side car container
when the ^C causes the python process to exit with 1

Kube 1.13 runs a different number of kube-dns pods (2 by default, 1.9
and 1.10 ran only 1) so the setup scripts needed changing a little bit.

To get a Kube 1.13 cluster I had to upgrade minikube, and it no longer
works on a dist without systemd installed (#systemdsucks) so I had to
update the travis dist to xenial which is no bad thing!

This version of minikube doesn't need the localkube bootstrapper set
anymore, it handles driver=none much more gracefully, and some of the
permissions set up for context files/keys needed to be updated.
2019-03-08 22:17:48 -08:00
Xiaodong 6abcdfd496 [AIRFLOW-3793] Decommission configuration items for Flask-Admin web UI & related codes (#4637) 2019-03-04 15:13:29 +00:00
Chao-Han Tsai bbe711640f [AIRFLOW-4001] Update docs about how to run tests (#4826)
fix docs
2019-03-03 21:24:11 -08:00
Chao-Han Tsai 484adb0daa [AIRFLOW-3992] 1-setup-env.sh should be re-runable (#4817) 2019-03-01 14:51:09 -08:00
Stijn De Haes b51712ca93 [AIRFLOW-3766] Add support for kubernetes annotations (#4589) 2019-03-01 00:13:09 -08:00
Ash Berlin-Taylor 28b53e2073 [AIRFLOW-XXX] Pin version of Pip in tests to work around pypa/pip#6163 (#4576)
There is a bug or a new feature that causes a number of our dependencies
to fail to install.
2019-01-23 16:21:24 -08:00
Verdan Mahmood c030729dcb [AIRFLOW-3303] Deprecate old UI in favor of FAB (#4339) 2019-01-14 14:33:45 +00:00
bolkedebruin 69adaee25c [AIRFLOW-3692] Remove ENV variables to avoid GPL (#4506) 2019-01-13 12:34:00 +00:00
Peter van 't Hof e2c22fe70a [AIRFLOW-3673] Add official dockerfile (#4483) 2019-01-12 14:06:23 +01:00
Fokko Driesprong 327860fe4f [AIRFLOW-3515] Remove the run_duration option (#4320) 2019-01-08 10:40:10 +00:00
Kaxil Naik 94bfeef7cb
[AIRFLOW-3612] Remove remaining incubator mention & Fix CI Behaviour (#4441) 2019-01-05 16:32:12 +00:00
Tao Feng 67572025cc [AIRFLOW-3612] Remove incubation/incubator mention (#4419) 2019-01-05 14:05:25 +00:00
Kevin Pullin 8bdf6c4ab9 [AIRFLOW-3402] Support global k8s affinity and toleration configs (#4247)
* Support setting global k8s affinity and toleration configuration in the airflow config file.

* Copy annotations as dict, not list

* Update airflow/contrib/kubernetes/pod.py

Co-Authored-By: kppullin <kevin.pullin@gmail.com>
2019-01-02 21:40:14 +01:00
ianbillett c5c0705de8 [AIRFLOW-XXX] Adds image code comment (#4413) 2019-01-01 23:45:47 -08:00
Daniel Imberman 6bb88fab1e [AIRFLOW-360] Launch custom images to Airflow CI tests (#4416)
To help move away from Minikube, we need to remove the dependency on
a local docker registry and move towards a solution that can be used
in any kubernetes cluster. Custom image names allow users to use
systems like docker, artifactory and gcr
2019-01-02 08:40:22 +01:00
Daniel Imberman f4f55fdc2c [AIRFLOW-3609] Fix bug in volumes readWriteMany (#4417)
When running integration tests on a k8s cluster vs. Minikube
I discovered that we were actually using an invalid permission
structure for our persistent volume. This commit fixes that.
2019-01-01 21:55:43 -08:00
Riccardo Bini 9de9721b48 [AIRFLOW-3281] Fix Kubernetes operator with git-sync (#3770)
* Refactor Kubernetes operator with git-sync

Currently the implementation of git-sync is broken because:
- git-sync clones the repository in /tmp and not in the airflow-dags volume
- git-sync adds a link pointing to the required revision, but it is not
taken into account in AIRFLOW__CORE__DAGS_FOLDER

A Dags/logs hostPath volume has been added (needed if Airflow runs in
Kubernetes in a local environment)

To avoid false positives in CI, `load_examples` is set to `False`;
otherwise DAGs from `airflow/example_dags` are always loaded. This
makes it possible to test `import` in DAGs

Remove `worker_dags_folder` config:
`worker_dags_folder` is redundant and can lead to confusion.
In WorkerConfiguration `self.kube_config.dags_folder` defines the path of
the dags and can be set in the worker using airflow_configmap
Refactor worker_configuration.py
Use a docker container to run setup.py
Compile web assets
Fix codecov application path

* Fix kube_config.dags_in_image
2018-12-30 21:03:32 -08:00
Kevin Pullin 067b9671e9 [AIRFLOW-2770] Read `dags_in_image` config value as a boolean (#4319)
* Read `dags_in_image` config value as a boolean

This PR is a minor fix for #3683

The dags_in_image config value is read as a string. However, the existing code expects this to be a boolean.

For example, in worker_configuration.py there is the statement: if not self.kube_config.dags_in_image:

Since the value is a non-empty string ('False') and not a boolean, this evaluates to true (since non-empty strings are truthy)
and skips the logic to add the dags_volume_claim volume mount.

This results in the CI tests failing because the dag volume is missing in the k8s pod definition.

This PR reads the dags_in_image using the conf.getboolean to fix this error.

Rebased on 457ad83e4e, before the previous
dags_in_image commit was reverted.

* Revert "Revert  [AIRFLOW-2770] [AIRFLOW-3505] (#4318)"

This reverts commit 77c368fd22.
2018-12-16 23:05:26 -08:00
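The string-versus-boolean problem described in the commit above can be reproduced with the standard library alone. This is a sketch, assuming a plain `configparser.ConfigParser` in place of Airflow's own configuration object; the section and option names mirror the commit message, and the variable names are illustrative.

```python
from configparser import ConfigParser

# A stand-in config file with the option set to the text "False".
parser = ConfigParser()
parser.read_string("[kubernetes]\ndags_in_image = False\n")

# get() returns the raw text, which is a non-empty and therefore truthy string.
raw_value = parser.get("kubernetes", "dags_in_image")

# getboolean() interprets the text and returns a real bool.
bool_value = parser.getboolean("kubernetes", "dags_in_image")

# With the raw string, `not raw_value` is False, so the volume mount branch is skipped.
adds_mount_with_string = not raw_value

# With the parsed boolean, `not bool_value` is True, so the mount is added as intended.
adds_mount_with_bool = not bool_value
```

The two final variables show the difference in the `if not ...` check: only the parsed boolean takes the branch that adds the volume mount.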
Tao Feng 77c368fd22
Revert [AIRFLOW-2770] [AIRFLOW-3505] (#4318)
* Revert "[AIRFLOW-3505] replace 'dags_in_docker' with 'dags_in_image' (#4311)"

This reverts commit 457ad83e4e.

* Revert "[AIRFLOW-2770] kubernetes: add support for dag folder in the docker image (#3683)"

This reverts commit e9a09d408e.
2018-12-13 10:21:39 -08:00
Daniel Imberman 457ad83e4e [AIRFLOW-3505] replace 'dags_in_docker' with 'dags_in_image' (#4311)
As kubernetes is moving away from docker to OCI, it will be more correct to use the
'dags_in_image' name to be more container system agnostic
2018-12-12 21:03:41 -08:00
Rurui e9a09d408e [AIRFLOW-2770] kubernetes: add support for dag folder in the docker image (#3683) 2018-12-12 09:43:58 -08:00
Kaxil Naik 9dce1f0740 [AIRFLOW-3408] Remove outdated info from Systemd Instructions (#4269) 2018-12-05 12:50:16 -08:00
Paweł Graczyk 35da9e6b0f [AIRFLOW-3250] Fix for Redis Hook for not authorised connection calls (#4090)
The password stays the None value rather than the string 'None' when no password is set through the webadmin interface.
This fixes connections to Redis servers that do not expect authorisation from clients.
2018-11-25 23:01:09 +01:00
Xiaodong 86a83bfff3 [AIRFLOW-3323] Support HTTP basic authentication for Airflow Flower (#4166)
The current `airflow flower` doesn't come with any authentication.
This may make essential information exposed in an untrusted environment.

This commit adds support for HTTP basic authentication for Airflow Flower

Ref:
https://flower.readthedocs.io/en/latest/auth.html
2018-11-13 14:48:23 +00:00
Ash Berlin-Taylor b9fc03ea1a [AIRFLOW-2779] Add license headers to doc files (#4178)
This adds ASF license headers to all the .rst and .md files with the
exception of the Pull Request template (as that is included verbatim
when opening a Pull Request on Github which would be messy)
2018-11-13 15:01:44 +01:00
Fokko Driesprong 0e8394fd23 [AIRFLOW-3190] Make flake8 compliant (#4035)
Enforce Flake8 over the entire project
2018-10-12 22:22:52 +01:00
Ash Berlin-Taylor c4f3f6b199 [AIRFLOW-3178] Handle percents signs in configs for airflow run (#4029)
* [AIRFLOW-3178] Don't mask defaults() function from ConfigParser

ConfigParser (the base class for AirflowConfigParser) expects defaults()
to be a function - so when we re-assign it to be a property some of the
methods from ConfigParser no longer work.

* [AIRFLOW-3178] Correctly escape percent signs when creating temp config

Otherwise we have a problem when we come to use those values.

* [AIRFLOW-3178] Use os.chmod instead of shelling out

There's no need to run another process for a built in Python function.

This also removes a possible race condition that would make the temporary
config file readable by more than the airflow or run-as user.
The exact behaviour would depend on the umask we run under and the
primary group of our user; likely this would mean the file was readable
by members of the airflow group (which in most cases would be just the
airflow user). To remove any such possibility we chmod the file
before we write to it.
2018-10-12 11:13:05 +02:00
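Two of the points in the commit above, escaping percent signs and restricting permissions before any content is written, can be sketched as follows. This is an illustration under stated assumptions, not the Airflow code: the helper name `write_private_config`, the `core` section, and the sample connection string are hypothetical, and the permission check is meaningful only on POSIX systems.

```python
import os
import stat
import tempfile
from configparser import ConfigParser


def write_private_config(values: dict) -> str:
    """Write values to a temporary config file that only the owner can read."""
    parser = ConfigParser()
    parser.add_section("core")
    for key, value in values.items():
        # ConfigParser treats "%" as interpolation syntax, so literal percent
        # signs must be doubled before they are stored.
        parser.set("core", key, value.replace("%", "%%"))

    fd, path = tempfile.mkstemp(suffix=".cfg")
    with os.fdopen(fd, "w") as handle:
        # Restrict permissions with os.chmod before writing any content,
        # instead of shelling out to a separate chmod process afterwards.
        os.chmod(path, 0o600)
        parser.write(handle)
    return path


# Hypothetical value containing a percent sign.
original = {"sql_alchemy_conn": "mysql://user:p%ss@localhost/airflow"}
config_path = write_private_config(original)

# Reading the file back yields the original value with a single percent sign.
reader = ConfigParser()
reader.read(config_path)
restored = reader.get("core", "sql_alchemy_conn")
file_mode = stat.S_IMODE(os.stat(config_path).st_mode)
os.remove(config_path)
```

Note that `tempfile.mkstemp` already creates the file with owner-only permissions on POSIX; the explicit `os.chmod` is kept to mirror the ordering the commit describes.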
dima-asana 7115883faf [AIRFLOW-3168] More resilient database use in CI (#4014)
Make sure mysql is available before calling it in CI
2018-10-11 09:55:15 +01:00
Kaxil Naik dd4e3cfa77
[AIRFLOW-2952] Fix Kubernetes CI (#3957)
- Update outdated cli command to create user
- Remove `airflow/example_dags_kubernetes` as the dag already exists in `contrib/example_dags/`
- Update the path to copy K8s dags
2018-09-28 11:51:04 +01:00
johnhofman 6a7f388746 [AIRFLOW-XXX] Fix PythonVirtualenvOperator tests (#3968)
The recent update to the CI image changed the default
python from python2 to python3. The PythonVirtualenvOperator
tests expected python2 as default and fail due to
serialisation errors.
2018-09-28 11:04:29 +01:00
Holden Karau 010924a06f [AIRFLOW-3100][AIRFLOW-3101] Improve docker compose local testing (#3933) 2018-09-27 23:26:03 +01:00
Fokko Driesprong 491fd743da [AIRFLOW-2918] Remove unused imports 2018-09-21 13:21:42 -07:00
Fokko Driesprong d277888a04 [AIRFLOW-3076] Remove preloading of MySQL testdata (#3911)
One of the things for tests is being self contained. This means that
it should not depend on anything external, such as loading data.

This PR will use the setUp and tearDown to load the data into MySQL
and remove it afterwards. This removes the actual bash mysql commands
and will make it easier to dockerize the whole testsuite in the future
2018-09-21 15:37:55 +01:00
Riccardo Bini 8038f88b60 AIRFLOW-2952 Fix Kubernetes CI (#3922)
The current dockerised CI pipeline doesn't run minikube and the
Kubernetes integration tests. This starts a Kubernetes cluster 
using minikube and runs k8s integration tests using docker-compose.
2018-09-21 14:36:09 +02:00
Fokko Driesprong 0e5eee83b1 [AIRFLOW-3068] Remove deprecated imports 2018-09-16 09:11:34 -07:00
yrqls21 9b82fcb5fb [AIRFLOW-2156] Parallelize Celery Executor task state fetching (#3830) 2018-09-11 09:12:18 -07:00
Fokko Driesprong 1db6ebe1eb
[AIRFLOW-3003] Pull the krb5 image instead of building (#3844)
Pull the image instead of building it, this will speed up the CI
process since we don't have to build it every time.
2018-09-05 21:52:12 +02:00
Fokko Driesprong acf1378d77 [AIRFLOW-2933] Enable Codecov on Docker-CI Build (#3780)
- Add missing variables and use codecov instead of coveralls.
  It wasn't working because of missing environment variables.
  The codecov library heavily depends on the environment variables in
  the CI to determine how to push the reports to codecov.

- Remove the explicit passing of the variables in the `tox.ini`
  since it is already done in the `docker-compose.yml`,
  having to maintain this at two places makes it brittle.

- Removed the empty Codecov yml since codecov was complaining that
  it was unable to parse it
2018-08-25 18:50:16 +01:00
Gerardo Curiel ede67299c4 [AIRFLOW-2499] Dockerise CI pipeline (#3393)
Airflow tests depend on many external services and other custom setup,
which makes it hard for contributors to work on this codebase. CI
builds have also been unreliable, and it is hard to reproduce the
causes. Having contributors trying to emulate the build environment
every time makes it easier to get to an "it works on my machine" sort
of situation.

This implements a dockerised version of the current build pipeline.
This setup has a few advantages:

* TravisCI tests are reproducible locally
* The same build setup can be used to create a local development environment
2018-08-22 10:26:54 +02:00
Dan Davydov 7142ae0732 [AIRFLOW-2895] Prevent scheduler from spamming heartbeats/logs
Reverts most of AIRFLOW-2027 until the issues with
it can be fixed.

Closes #3747 from
aoen/revert_min_file_parsing_time_commit
2018-08-20 09:14:36 -04:00
Kazuhiro Sera b78c7fb851 [AIRFLOW-2889] Fix typos detected by github.com/client9/misspell (#3732) 2018-08-11 21:11:19 -07:00
Kaxil Naik 120f4856cd [AIRFLOW-2867] Refactor Code to conform standards (#3714)
- Dictionary creation should be written by dictionary literal
- Python’s default arguments are evaluated once when the function is defined, not each time the function is called (like it is in say, Ruby). This means that if you use a mutable default argument and mutate it, you will have mutated that object for all future calls to the function as well.
- Functions calling sets which can be replaced by set literal are now replaced by set literal
- Replace list literals
- Some of the static methods haven't been set static
- Remove redundant parentheses
2018-08-07 16:18:42 -07:00
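The mutable default argument behaviour mentioned in the commit above can be shown with a short sketch. The function names `append_shared` and `append_fresh` are illustrative and do not come from the Airflow code base.

```python
def append_shared(item, bucket=[]):
    """Buggy pattern: the default list is created once, at definition time."""
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    """Common fix: use None as the default and create a new list on each call."""
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


# Both calls mutate and return the very same default list object.
shared_first = append_shared("a")
shared_second = append_shared("b")

# Each call gets its own list.
fresh_first = append_fresh("a")
fresh_second = append_fresh("b")
```

After these calls, `shared_first` and `shared_second` refer to one list holding both items, while the two `fresh_*` results are independent single-item lists.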
Aldo Giambelluca 90e88dfe82 [AIRFLOW-2755] Added `kubernetes.worker_dags_folder` configuration (#3612)
It was previously hardcoded to `/tmp/dags`.
This causes problems with python import of modules in the DAGs folder.
2018-08-06 22:44:48 +02:00
Kengo Seki 84a55f3e54 [AIRFLOW-2811] Fix scheduler_ops_metrics.py to work (#3653)
This PR fixes timezone problem in
scheduler_ops_metrics.py and makes
its timeout configurable.
2018-08-06 09:42:02 -07:00
Kengo Seki 0d193ada44 [AIRFLOW-2829] Brush up the CI script for minikube
Fix scripts/ci/kubernetes/minikube/start_minikube.sh
as follows:

- Make minikube version configurable via
  environment variable
- Remove unused variables for readability
- Reorder some lines to remove warnings
- Replace ineffective `return` with `exit`
- Add -E to `sudo minikube` so that non-root
  users can use this script locally
2018-08-01 14:30:51 -07:00
bolkedebruin c37fc0b6ba
[AIRFLOW-2817] Force explicit choice on GPL dependency (#3660)
By default one of Apache Airflow's dependencies pulls in a GPL
library. Airflow should not install (and upgrade) without an explicit choice.

This is part of the Apache requirements as we cannot depend on Category X
software.
2018-08-01 11:25:31 +02:00
Taylor D. Edmiston 6d31c9e416 [AIRFLOW-2832] Lint and resolve inconsistencies in Markdown files (#3670)
Clean up the Markdown files and make the formatting consistent
2018-08-01 09:50:23 +02:00
Kevin Yang 216beacd5b [AIRFLOW-2648] Update mapred job name in HiveOperator
Closes #3534 from
yrqls21/keivn_yang_reorder_mapred
2018-06-25 13:31:34 +02:00
pengchen adb648c949 [AIRFLOW-2662][AIRFLOW-2397] Add k8s node_selectors and affinity
Add the ability to set the node selection and the affinity
for the k8s executor

Closes #3535 from Cplo/affinity
2018-06-25 13:09:16 +02:00
Kengo Seki b9cb54f873 [AIRFLOW-2634][AIRFLOW-2534] Remove dependency for impyla
Closes #3514 from sekikn/AIRFLOW-2634
2018-06-16 13:35:41 +01:00
pengchen 2fa155fe8b [AIRFLOW-2617] add imagePullPolicy config for kubernetes executor
Closes #3500 from Cplo/k8sexecutor
2018-06-15 11:38:19 +02:00
Tim Swast 0f4d681f6f [AIRFLOW-2512][AIRFLOW-2522] Use google-auth instead of oauth2client
* Updates the GCP hooks to use the google-auth
library and removes
  dependencies on the deprecated oauth2client
package.
* Removes inconsistent handling of the scope
parameter for different
  auth methods.

Note: using google-auth for credentials requires a
newer version of the
google-api-python-client package, so this commit
also updates the
minimum version for that.

To avoid some annoying warnings about the discovery cache not being
supported, this commit disables the discovery cache explicitly, as recommended here:
https://stackoverflow.com/a/44518587/101923

Tested by running:

    nosetests tests/contrib/operators/test_dataflow_operator.py \
        tests/contrib/operators/test_gcs*.py \
        tests/contrib/operators/test_mlengine_*.py \
        tests/contrib/operators/test_pubsub_operator.py \
        tests/contrib/hooks/test_gcp*.py \
        tests/contrib/hooks/test_gcs_hook.py \
        tests/contrib/hooks/test_bigquery_hook.py

and also tested by running some GCP-related DAGs
locally, such as the
Dataproc DAG example at
https://cloud.google.com/composer/docs/quickstart

Closes #3488 from tswast/google-auth
2018-06-12 23:53:21 +01:00
Andy Cooper 9e1d8ee837 [AIRFLOW-83] add mongo hook and operator
Closes #3440 from
andscoop/AIRFLOW_83_add_mongo_hooks_and_operators
2018-06-05 23:30:02 +01:00
roc 7945854cc4 [AIRFLOW-2532] Support logs_volume_subpath for KubernetesExecutor
The kubernetes section in the configuration file
supports
logs_volume_subpath for KubernetesExecutor.

Closes #3430 from imroc/AIRFLOW-2532
2018-05-30 11:14:13 +02:00
Daniel Imberman 8fa0bbd56e [AIRFLOW-2460] Users can now use volume mounts and volumes
When launching pods using k8s operator

Closes #3356 from dimberman/k8s-mounts
2018-05-14 21:59:59 +02:00
alberto.calderari 6c19468e0b [AIRFLOW-2110][AIRFLOW-2122] Enhance Http Hook
- Use a header passed in the "extra" argument and
  add tenacity retry
- Fix the tests with proper mocking

Closes #3071 from albertocalderari/master
2018-05-14 21:52:22 +02:00
Bolke de Bruin 648e1e6930 [AIRFLOW-2425] Add lineage support
Add lineage support by having inlets and outlets that
are made available to dependent upstream or downstream
tasks.

If configured to do so can send lineage data to a
backend. Apache Atlas is supported out of the box.

Closes #3321 from bolkedebruin/lineage_exp
2018-05-14 09:09:25 +02:00
Daniel Imberman 5de22d7fa0 [AIRFLOW-2424] Add dagrun status endpoint and increased k8s test coverage
[AIRFLOW-2424] Add dagrun status endpoint and
increase k8s test coverage

[AIRFLOW-2424] Added minikube fixes by @kimoonkim

[AIRFLOW-2424] modify endpoint to remove 'status'

Closes #3320 from dimberman/add-kubernetes-test
2018-05-10 19:32:17 +02:00
Fokko Driesprong 16bae5634d [AIRFLOW-1899] Fix Kubernetes tests
[AIRFLOW-1899] Add full deployment

- Made home directory configurable
- Documentation fix
- Add licenses

[AIRFLOW-1899] Tests for the Kubernetes Executor

Add an integration test for the Kubernetes
executor. Done by
spinning up different versions of kubernetes and
run a DAG
by invoking the REST API

Closes #3301 from Fokko/fix-kubernetes-executor
2018-05-04 08:58:12 +02:00
Alexander Petrovsky c3aa8e31fa [AIRFLOW-1313] Add vertica_to_mysql operator
Closes #2370 from juise/master
2018-04-29 00:05:23 -07:00
Fokko Driesprong e30a1f451a [AIRFLOW-2357] Add persistent volume for the logs
The logs are kept inside of the worker pod. By
attaching a persistent
disk we keep the logs and make them available for
the webserver.

- Remove the requirements.txt since we don't want
  to maintain another dependency file
- Fix some small casing stuff
- Removed some unused code
- Add missing shebang lines
- Started on some docs
- Fixed the logging

Closes #3252 from Fokko/airflow-2357-pd-for-logs
2018-04-23 18:43:24 +02:00
Daniel Imberman a15b7c5b79 [AIRFLOW-1314] Cleanup the config
Closes #2414 from bloomberg:airflow-kubernetes-executor
2018-04-22 10:24:18 +02:00
Fokko Driesprong d807830fe9 [AIRFLOW-1314] Polish some of the Kubernetes docs/config 2018-04-22 10:23:06 +02:00
Jordan Zucker 317b6c7bd5 [AIRFLOW-1314] Improve error handling
Handle too old resource versions and throw exceptions on errors

- K8s API errors will now throw Airflow exceptions
- Add scheduler uuid to worker pod labels to match the two
2018-04-22 10:23:06 +02:00
Daniel Imberman b9a87a07e3 [AIRFLOW-1314] Rebasing against master 2018-04-22 10:23:06 +02:00
Grant Nicholas c0920efc01 [AIRFLOW-1314] Add executor_config and tests
* Added in executor_config to the task_instance table and the base_operator table

* Fix test; bump up number of examples

* Fix up comments from PR

* Exclude the kubernetes example dag from a test

* Fix dict -> KubernetesExecutorConfig

* fixed up executor_config comment and type hint
2018-04-22 10:23:06 +02:00
fenglu-g ad4e67ce1b [AIRFLOW-1314] Improve k8s support
Add kubernetes config section in airflow.cfg and Inject GCP secrets upon executor start. (#17)
Update Airflow to Pass configuration to k8s containers, add some Py3 … (#9)

* Update Airflow to Pass configuration to k8s containers, add some Py3 compat., create git-sync pod

* Undo changes to display-source config setter for to_dict

* WIP Secrets and Configmaps

* Improve secrets support for multiple secrets. Add support for registry secrets. Add support for RBAC service accounts.

* Swap order of variables, overlooked very basic issue

* Secret env var names must be upper

* Update logging

* Revert spothero test code in setup.py

* WIP Fix tests

* Worker should be using local executor

* Consolidate worker setup and address code review comments

* reconfigure airflow script to use new secrets method
2018-04-22 10:23:06 +02:00
grantnicholas a9d90dc9a5 [AIRFLOW-1314] Use VolumeClaim for transporting DAGs
- fix issue where watcher process randomly dies
- fixed alembic head, was pointing to two tips
2018-04-22 10:22:44 +02:00
dimberman 5821320880 [AIRFLOW=1314] Basic Kubernetes Mode 2018-04-22 10:17:39 +02:00
Bolke de Bruin c7a472ed6b [AIRFLOW-2287] Fix incorrect ASF headers
Closes #3219 from bolkedebruin/fix_header
2018-04-14 09:13:23 +02:00
Bolke de Bruin a30f009aeb [AIRFLOW-2287] Update license notices
Closes #3195 from bolkedebruin/AIRFLOW-2287
2018-04-09 00:32:09 -07:00
Joy Gao 05e1861e24 [AIRFLOW-1433][AIRFLOW-85] New Airflow Webserver UI with RBAC support
Closes #3015 from jgao54/rbac
2018-03-23 09:18:48 +01:00
Fokko Driesprong 4af71fd58e [AIRFLOW-2197] Silence hostname_callable config error message
Since the hostname_callable key is not defined in the config, we end
up with a lot of warnings. Add the key to the config and simplify the
code.
2018-03-08 13:05:41 +01:00
Fokko Driesprong 976fd1245a [AIRFLOW-2123] Install CI dependencies from setup.py
Install the dependencies from setup.py so we keep all the dependencies
in one single place

Closes #3054 from Fokko/fd-fix-ci-2
2018-03-05 22:46:45 +00:00
Moe Nadal 667a26ce49 [AIRFLOW-1551] Add operator to trigger Jenkins job
Closes #2553 from moe-nadal-ck/AIRFLOW-1551/AddJenkinsOperator
2018-02-27 11:51:49 +01:00
Fokko Driesprong 283b8d1b97 [AIRFLOW-2116] Set CI Cloudant version to <2.0
The python-cloudant release 2.8 is broken and
causes our CI to fail.
In the setup.py we install cloudant version <2.0
and in our CI pipeline
we install the latest version.

Closes #3051 from Fokko/fd-fix-cloudant
2018-02-16 10:45:38 +01:00
Bolke de Bruin a1d5551777 [AIRFLOW-1895] Fix primary key integrity for mysql
sla_miss and task_instances cannot have NULL execution_dates. The timezone
migration scripts forgot to set this properly. In addition, to make sure
MySQL does not set "ON UPDATE CURRENT_TIMESTAMP" or MariaDB "DEFAULT
0000-00-00 00:00:00", we now check if explicit_defaults_for_timestamp is turned
on and otherwise fail a database upgrade.

Closes #2969, #2857

Closes #2979 from bolkedebruin/AIRFLOW-1895
2018-01-27 09:01:10 +01:00
Bolke de Bruin 1abe7f6d54 Merge pull request #2853 from dimberman/Airflow_1517_kubenetes_operator 2018-01-12 19:02:52 +01:00
GRANT NICHOLAS 7fb5906e68 [AIRFLOW-1517] Kubernetes operator PR fixes
Fix python flake8 linting issues and AIRFLOW license issues
2018-01-11 15:29:34 -08:00
Daniel Imberman d5b13a3dad [AIRFLOW-1517] addressed PR comments 2018-01-11 15:29:27 -08:00
GRANT NICHOLAS 965439bef0 [AIRFLOW-1517] Add minikube for kubernetes integration tests
Add better support for minikube integration tests; By default minikube integration tests will run with kubernetes 1.7 and kubernetes 1.8
2018-01-11 15:29:16 -08:00
Daniel Imberman a42dbb4f4d Revert "[AIRFLOW-1517] Add minikube for kubernetes integration tests"
This reverts commit 0197931609685a98181387014f7c8f3b5cd5f9a8.
2018-01-11 15:29:16 -08:00
GRANT NICHOLAS cde3a5fecd [AIRFLOW-1517] Add minikube for kubernetes integration tests
Add better support for minikube integration tests; By default minikube integration tests will run with kubernetes 1.7 and kubernetes 1.8
2018-01-11 15:28:32 -08:00
Edgar Rodriguez b3489b99e9 [AIRFLOW-1963] Add config for HiveOperator mapred_queue
Adding configuration setting for specifying a
default mapred_queue for
hive jobs using the HiveOperator.

Closes #2915 from edgarRd/erod-hive-mapred-queue-config
2018-01-03 14:23:14 -08:00
Bolke de Bruin 5e4d7d8d7d [AIRFLOW-XXX] Pin sqlalchemy dependency 2017-12-29 12:28:51 +01:00
Daniel Imberman 78ff2fc180 [AIRFLOW-1517] Kubernetes Operator 2017-12-26 08:45:31 -08:00
Fokko Driesprong 815270bb56 [AIRFLOW-1911] Rename celeryd_concurrency
There are still celeryd_concurrency occurrences
left in the code
this needs to be renamed to worker_concurrency to
make the config
with Celery consistent

Closes #2870 from Fokko/AIRFLOW-1911-update-airflow-config
2017-12-12 13:47:55 +01:00
Bolke de Bruin 22453d037e [AIRFLOW-1908] Fix celery broker options config load
Options were set to visibility timeout instead of
broker_options
directly. Furthermore, options should be int,
float, bool or string
not all string.

Closes #2867 from bolkedebruin/AIRFLOW-1908
2017-12-12 12:44:06 +01:00
Fokko Driesprong 30076f1e45 [AIRFLOW-1840] Make celery configuration congruent with Celery 4
Explicitly set the celery backend from the config
and align the config
with the celery config as this might be confusing.

Closes #2806 from Fokko/AIRFLOW-1840-Fix-celery-config
2017-12-11 18:56:29 +01:00
Bolke de Bruin b9c82c0400 [AIRFLOW-1870] Enable flake8 tests
Flake8 tests now run for diffs

Closes #2829 from bolkedebruin/use_flake8
2017-11-30 15:57:17 +01:00
Bolke de Bruin 518a41acf3 [AIRFLOW-1826] Update views to use timezone aware objects 2017-11-27 15:54:27 +01:00
Stefanie Grunwald a61d9444cd
[AIRFLOW-1669] Fix Docker and pin Moto to 1.1.19
https://github.com/spulec/moto/pull/1048 introduced `docker` as a
dependency in Moto, causing a conflict as Airflow uses `docker-py`. As
both packages don't work together, Moto is pinned to the version
prior to that change.
2017-11-02 14:23:32 +01:00
Maxime Beauchemin b464d23a6d [AIRFLOW-1698] Remove SCHEDULER_RUNS env var in systemd
In the very early days, the Airflow scheduler
needed to be restarted
every so often to take new DAG_FOLDERS mutations
into account properly. This is no longer
required.

Closes #2677 from mistercrunch/scheduler_runs
2017-10-18 21:55:57 +02:00
fenglu-g 7cb818bbac [AIRFLOW-1723] Support sendgrid in email backend
Closes #2695 from fenglu-g/master
2017-10-18 12:27:14 -07:00
Dan Davydov 21e94c7d15 [AIRFLOW-1697] Mode to disable charts endpoint 2017-10-10 11:33:50 -07:00
Bolke de Bruin 65f3b468a2 [AIRFLOW-1527] Refactor celery config
The celery config is currently part of the celery executor definition.
This is really inflexible for users wanting to change it. In addition
Celery 4 is moving to lowercase.

Closes #2542 from bolkedebruin/upgrade_celery
2017-09-25 11:19:16 -07:00
Bolke de Bruin fa1dc1eb20 Revert "[AIRFLOW-1368] Automatically remove Docker container on exit"
This reverts commit 46c86a5cd2.
2017-09-24 19:35:28 +02:00
Nathaniel Varona 46c86a5cd2 [AIRFLOW-1368] Automatically remove Docker container on exit
Closes #2411 from nathanielvarona/docker-operator
2017-09-22 10:15:23 -07:00
Fokko Driesprong eb2f589099 [AIRFLOW-1604] Rename logger to log
In all the popular languages the variable name log
is the de facto
standard for the logging. Rename LoggingMixin.py
to logging_mixin.py
to comply with the Python standard.

When using the .logger a deprecation warning will
be emitted.

Closes #2604 from Fokko/AIRFLOW-1604-logger-to-log
2017-09-19 10:17:14 +02:00
Fokko Driesprong de99aa20f4 [AIRFLOW-1324] Generalize Druid operator and hook
Make the druid operator and hook more specific. This allows us to have
a more flexible configuration, for example ingest parquet. Also get rid
of the PyDruid extension since it is more focused on querying druid,
rather than ingesting data. Just requests is sufficient to submit an
indexing job. Add a test to the hive_to_druid operator to make sure it
behaves as we expect. Furthermore, cleaned up the docstring a bit.

Closes #2378 from Fokko/AIRFLOW-1324-make-more-general-druid-hook-and-operator
2017-08-18 21:34:03 +02:00
Jay fe0edeaab5 [AIRFLOW-756][AIRFLOW-751] Replace ssh hook, operator & sftp operator with paramiko based
Closes #1999 from jhsenjaliya/AIRFLOW-756
2017-07-20 22:07:45 +02:00
Bolke de Bruin fb21bcbcc1 Re-enable caching for hadoop components 2017-06-16 08:41:54 -04:00
Bolke de Bruin 38b2747c5b Pin Hive and Hadoop to a specific version and create writable warehouse dir 2017-06-15 19:22:09 -04:00
Kengo Seki 0f55477ccb [AIRFLOW-1172] Support nth weekday of the month cron expression
Closes #2321 from sekikn/AIRFLOW-1172
2017-06-14 17:59:02 -07:00
Sumit Maheshwari 6be02475f8 [AIRFLOW-1192] Some enhancements to qubole_operator
1. Upgrade qds_sdk version to latest
2. Add support to run Zeppelin Notebooks
3. Move out initialization of QuboleHook from init()

Closes #2322 from msumit/AIRFLOW-1192
2017-06-07 09:09:50 +02:00
Stanislav Kudriashev d2d3e49ca0 [AIRFLOW-1201] Update deprecated 'nose-parameterized'
The 'parameterized' package should be used now.

Closes #2298 from skudriashev/airflow-1201
2017-05-16 11:34:52 +02:00
Chris Riccomini 3e9c666e8e [AIRFLOW-1203] Pin Google API client version to fix OAuth issue
Closes #2296 from criccomini/AIRFLOW-1203
2017-05-15 14:42:09 -07:00
Niels Zeilemaker ac9ccb1518 [AIRFLOW-1179] Fix Pandas 0.2x breaking Google BigQuery change
Closes #2279 from NielsZeilemaker/AIRFLOW-1179
2017-05-09 09:42:32 -07:00
Chris Riccomini 94f9822ffd [AIRFLOW-1138] Add missing licenses to files in scripts directory
Closes #2253 from criccomini/AIRFLOW-1138
2017-04-21 13:16:54 -07:00
Henk Griffioen 219c506414 [AIRFLOW-1094] Run unit tests under contrib in Travis
Rename all unit tests under tests/contrib to start with test_* and fix
broken unit tests so that they run for the Python 2 and 3 builds.

Closes #2234 from hgrif/AIRFLOW-1094
2017-04-17 10:04:36 +02:00
Henk Griffioen f1bc5f38ac [AIRFLOW-1065] Add functionality for Azure Blob Storage over wasb://
This PR implements a hook to interface with Azure storage over wasb://
via azure-storage; adds sensors to check for blobs or prefixes; and
adds an operator to transfer a local file to the Blob Storage.

Design is similar to that of the S3Hook in airflow.operators.S3_hook.

Closes #2216 from hgrif/AIRFLOW-1065
2017-04-05 09:56:23 +02:00
Xiangrui Meng 70f1bf10a5 [AIRFLOW-1067] use example.com in examples
We use airflow@airflow.com in examples. However, https://airflow.com
is owned by a company named Airflow (selling fans, etc). We should use
airflow@example.com instead. That domain is created for this purpose.

Closes #2217 from mengxr/AIRFLOW-1067
2017-04-04 09:22:37 -07:00
Bolke de Bruin 15fd4d98d1 Merge branch 'AIRFLOW-719' into AIRFLOW-719-3 2017-04-04 11:55:20 +02:00
Bolke de Bruin eb705fd55c [AIRFLOW-719] Fix race condition in ShortCircuit, Branch and LatestOnly
The ShortCircuitOperator, Branchoperator and LatestOnlyOperator
were arbitrarily changing the states of TaskInstances without locking
them in the database. As the scheduler checks the state of dag runs
asynchronously, the dag run state could be set to failed while the
operators are updating the downstream tasks.

A better fix would be to use the dag run itself in the context of the
Operator.
2017-04-03 10:38:12 +02:00
Alexander Bij 6393366a78 [AIRFLOW-840] Make ticket renewer python3 compatible
The output returned by the subprocess is in bytes when universal
newlines is set to False (the default). This fails in Py3 but works
fine in Py2. Includes a working unit test.

Closes #2158 from abij/AIRFLOW-840
2017-03-28 16:50:10 -07:00
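A minimal sketch of the bytes-versus-text issue described in this commit, not the actual ticket renewer code. The helper name `run_and_decode` and the UTF-8 default are assumptions for illustration.

```python
import subprocess
import sys


def run_and_decode(cmd, encoding="utf-8"):
    """Run cmd and return its stdout as text on both Python 2 and 3."""
    # check_output returns bytes on Python 3 unless text mode is requested.
    output = subprocess.check_output(cmd)
    return output.decode(encoding)


if __name__ == "__main__":
    # Use the current interpreter as a portable command to run.
    print(run_and_decode([sys.executable, "-c", "print('ok')"]))
```

Decoding explicitly keeps string operations such as `split` or `in` checks from failing on a bytes object under Python 3.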
Alex Guziel fe9ebe3ccf [AIRFLOW-1047] Sanitize strings passed to Markup
We add the Apache-licensed bleach library and use it to sanitize html
passed to Markup (which is supposed to be already escaped). This avoids
some XSS issues with unsanitized user input being displayed.

Closes #2193 from saguziel/aguziel-xss
2017-03-28 16:40:32 -07:00
Bolke de Bruin 4f52db317f [AIRFLOW-911] Add coloring and timing to tests
Closes #2106 from bolkedebruin/profile_tests
2017-02-25 22:10:14 +01:00
Jeremiah Lowin 6e22102782 [AIRFLOW-862] Add DaskExecutor
Adds a DaskExecutor for running Airflow tasks
in Dask clusters.

Closes #2067 from jlowin/dask-executor
2017-02-12 16:06:31 -05:00
Jeremiah Lowin bbfd43df46 [AIRFLOW-863] Example DAGs should have recent start dates
Avoid unnecessary backfills by having start dates of just a few days
ago. Adds a utility function airflow.utils.dates.days_ago().

Closes #2068 from jlowin/example-start-date
2017-02-12 15:37:56 -05:00
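A rough sketch of what a "days ago" helper for start dates could look like. The function name `days_ago_sketch` and its `now` parameter are hypothetical, added to make the example deterministic; the signature of the real airflow.utils.dates.days_ago() may differ.

```python
from datetime import datetime, timedelta


def days_ago_sketch(n, now=None):
    """Return midnight n days before `now` (defaults to the current time)."""
    reference = now if now is not None else datetime.now()
    # Snap to midnight so the start date is stable within a day.
    midnight = reference.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight - timedelta(days=n)
```

An example DAG would then use something like `start_date=days_ago_sketch(2)`, which leaves only a couple of schedule intervals to backfill.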
Dan Davydov b56cb5cc97 [AIRFLOW-219][AIRFLOW-398] Cgroups + impersonation
Submitting on behalf of plypaul

Please accept this PR that addresses the following issues:
- https://issues.apache.org/jira/browse/AIRFLOW-219
- https://issues.apache.org/jira/browse/AIRFLOW-398

Testing Done:
- Running on Airbnb prod (though on a different mergebase) for many months

Credits:
Impersonation Work: georgeke did most of the work but plypaul did quite a bit of work too.
Cgroups: plypaul did most of the work, I just did some touch up/bug fixes (see commit history, cgroups + impersonation commit is actually plypaul's not mine)

Closes #1934 from aoen/ddavydov/cgroups_and_impersonation_after_rebase
2017-01-18 18:11:06 -08:00
Bolke de Bruin 3ac2fba888 Merge branch 'AIRFLOW-760' 2017-01-16 22:23:36 +01:00
Jay 44798e0d4d [AIRFLOW-683] Add jira hook, operator and sensor
Closes #1950 from jhsenjaliya/AIRFLOW-683
2017-01-16 17:46:21 +01:00
Bolke de Bruin f3e18fbe02 [AIRFLOW-760] Update systemd config 2017-01-14 21:32:27 +01:00
Bolke de Bruin 19ed9001b9 [AIRFLOW-740] Pin jinja2 to < 2.9.0
Jinja2 2.9.1 seems to have a conflict with flask-admin.
2017-01-07 19:53:01 +01:00
Vijay Bhat 7fa86f72c6 [AIRFLOW-673] Add operational metrics test for SchedulerJob
Extend SchedulerJob to instrument the execution performance of task
instances contained in each DAG. We want to know if any DAG is starved
of resources, and this will be reflected in the stats printed out at
the end of the test run.

This test is for instrumenting the operational impact of
https://github.com/apache/incubator-airflow/pull/1906

Closes #1919 from vijaysbhat/scheduler_perf_tool
2017-01-03 08:13:06 -05:00
Bolke de Bruin d5ac6bd9d0 [AIRFLOW-489] Add API Framework
This implements a framework for API calls to Airflow. Currently
all access is done by cli or web ui. Especially in the context
of the cli this raises security concerns which can be alleviated
with a secured API call over the wire.

Secondly integration with other systems is a bit harder if you have
to call a cli. For public facing endpoints JSON is used.

As an example, the trigger_dag functionality is now made into an
API call.

Backwards compat is retained by switching to a LocalClient.
2016-11-27 19:44:31 +01:00
Li Xuanji dedc54eeaf [AIRFLOW-640] Install and enable nose-ignore-docstring
Closes #1896 from zodiac/nose-ignore-docstring
2016-11-20 17:38:24 -08:00
Li Xuanji ca6dbc6485 [AIRFLOW-639] Alphasort package names
Closes #1895 from zodiac/alphasort_requirements
2016-11-20 17:06:47 -08:00
Bolke de Bruin 910c0ddd78 [AIRFLOW-504] Store fractional seconds in MySQL tables
Both utcnow() and now() return fractional seconds. These
are sometimes used in primary_keys (eg. in task_instance).
If MySQL is not configured to store these fractional seconds
a primary key might fail (eg. at session.merge) resulting in
a duplicate entry being added or worse.

Postgres does store fractional seconds if left unconfigured,
sqlite needs to be examined.
2016-11-13 22:43:17 +01:00
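A small illustration of why dropped fractional seconds can break keys built from timestamps. Truncation is used here as a simplifying assumption about how a column without fractional-second precision stores the value; the helper name is hypothetical.

```python
from datetime import datetime


def truncate_to_seconds(ts):
    """Mimic a DATETIME column that stores no fractional seconds (assumption)."""
    return ts.replace(microsecond=0)


first = datetime(2016, 11, 13, 22, 43, 17, 100000)
second = datetime(2016, 11, 13, 22, 43, 17, 900000)

# Distinct in Python, but they collapse to the same stored value,
# so a key built from them would no longer be unique.
distinct_in_python = first != second
collide_when_truncated = truncate_to_seconds(first) == truncate_to_seconds(second)
```

A lookup by the in-memory value then fails to match the stored row, which is the kind of mismatch the commit message warns about.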
David Gingrich ff45d8f221 [AIRFLOW-512] Fix 'bellow' typo in docs & comments
Dear Airflow Maintainers,

Please accept this PR that addresses the following issues:
- https://issues.apache.org/jira/browse/AIRFLOW-512

Testing Done:
- N/A, but ran core tests: `./run_unit_tests.sh tests.core:CoreTest -s`

Closes #1800 from dgingrich/master
2016-09-16 09:45:12 -07:00
Bolke de Bruin 2c3d0fdbe9 Merge remote-tracking branch 'apache/master' 2016-08-09 15:09:51 +02:00
Bolke de Bruin 1d67d6293e [AIRFLOW-404] Retry download if unpacking fails for hive
The Travis cache can contain faulty files. This results in builds
that fail as they are dependent on certain components being
available, i.e. hive. This addresses the issue for hive by
redownloading if unpacking fails.
2016-08-09 15:00:25 +02:00
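The retry-on-failed-unpack idea can be sketched as follows. This is an illustration under assumptions, not the CI script from the commit: `fetch` is a hypothetical callable that returns the archive bytes and accepts a `use_cache` flag.

```python
import io
import tarfile


def unpack_with_retry(fetch, dest, retries=1):
    """Unpack the archive returned by fetch(); re-fetch without cache on failure.

    `fetch` is a hypothetical callable taking use_cache (bool) and returning
    the archive as bytes. Returns the number of retries that were needed.
    """
    for attempt in range(retries + 1):
        # Only the first attempt may use the (possibly corrupt) cached copy.
        data = fetch(use_cache=(attempt == 0))
        try:
            with tarfile.open(fileobj=io.BytesIO(data)) as archive:
                archive.extractall(dest)
            return attempt
        except tarfile.TarError:
            if attempt == retries:
                raise
```

Bypassing the cache on the second attempt matters because re-reading the same corrupt cached file would fail in the same way.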
Li Xuanji 9d254a317d [AIRFLOW-276] Gunicorn rolling restart
- Tell gunicorn to prepend `[ready]` to worker process name once worker is ready (to serve requests) - in particular this happens after DAGs folder is parsed
- Airflow cli runs gunicorn as a child process instead of `execvp`-ing over itself
- Airflow cli monitors gunicorn worker processes and restarts them by sending TTIN/TTOU signals to the gunicorn master process
- Fix bug where `conf.get('webserver', 'workers')` and `conf.get('webserver', 'webserver_worker_timeout')` were ignored

- Alternatively, https://github.com/apache/incubator-airflow/pull/1684/files does the same thing but the worker-restart script is provided separately for the user to run

- Start airflow, observe that workers are restarted
- Add new dags to dags folder and check that they show up
- Run `siege` against airflow while server is restarting and confirm that all requests succeed
- Run with configuration set to `batch_size = 0`, `batch_size = 1` and `batch_size = 4`

Closes #1685 from zodiac/xuanji_gunicorn_rolling_restart_2
2016-08-08 11:26:38 -07:00
Paul Yang fdb7e94914 [AIRFLOW-160] Parse DAG files through child processes
Instead of parsing the DAG definition files in the same process as the
scheduler, this change parses the files in a child process. This helps
to isolate the scheduler from bad user code.

Closes #1636 from plypaul/plypaul_schedule_by_file_rebase_master
2016-07-31 12:49:39 -07:00
Li Xuanji edce741234 [AIRFLOW-359] Pin flask-login to 0.2.11
Closes #1679 from zodiac/xuanji_pin_ci_flask_login
2016-07-25 13:59:49 -07:00
Damien Lejeune d6d3f53673 [AIRFLOW-40] Add LDAP group filtering feature.
It is now possible to filter over LDAP group (in the web
interface) when using the LDAP authentication backend.
Note that this feature requires the "memberOf"
overlay to be configured on the LDAP server.

Closes #1479 from dsjl/AIRFLOW-40
2016-06-28 21:12:44 +02:00
Ajay Yadav 3ffa656d97 [AIRFLOW-248] Add Apache license header to all files
- Added Apache license header for files with extension (.service, .in, .mako, .properties, .ini, .sh, .ldif, .coveragerc, .cfg, .yml, .conf, .sql, .css, .js, .html, .xml).
- Added/Replaced shebang on all .sh files with portable version - #!/usr/bin/env bash.
- Skipped third party css and js files. Skipped all minified js files as well.

Closes #1598 from ajayyadava/248
2016-06-21 08:15:42 -07:00
Kengo Seki b7def7f1f9 [AIRFLOW-142] setup_env.sh doesn't download hive tarball if hdp is specified as distro
Closes #1518 from sekikn/AIRFLOW-142
2016-06-09 14:24:54 -07:00
Bolke de Bruin afcd4fcf01 AIRFLOW-181 Fix failing unpacking of hadoop by redownloading
curl compares timestamps, but if the file is corrupt this can
result in hadoop tars that are never updated. This adds a retry
without using the cache.
2016-05-26 21:54:49 +02:00
Kengo Seki 4b78e1a0f1 [AIRFLOW-143] setup_env.sh doesn't leverage cache for downloading minicluster 2016-05-19 15:02:00 +00:00
Matt Pelland 11c34c4353 Implement a Cloudant hook 2016-04-19 16:11:54 -04:00
Bolke de Bruin a36861a5f7 Add multiprocessing support to the scheduler
As the number of dags grows and the ability to create dags programmatically
is more often used, more and more time is spent in the scheduler, which lowers
throughput.

This patch adds the ability to use multiple threads to the scheduler. The
number of threads can be specified by "max_threads" in the scheduler section
of the configuration. The number of threads will, however, not exceed the
number of cores.

In case of using sqlite the max_threads will be set to 1 as sqlite does not
support multiple db connections.
2016-04-16 13:03:50 +02:00
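The thread-count rule described in this commit can be sketched as a small function. The function name, the `using_sqlite` flag and the `cpu_count` override are assumptions for illustration and do not come from the scheduler code.

```python
import multiprocessing


def effective_threads(max_threads, using_sqlite, cpu_count=None):
    """Apply the limits from the commit message to a configured thread count."""
    if using_sqlite:
        # sqlite is limited to a single thread, per the commit message.
        return 1
    cores = cpu_count if cpu_count is not None else multiprocessing.cpu_count()
    # Never exceed the number of cores, and always keep at least one thread.
    return max(1, min(max_threads, cores))
```

For example, `effective_threads(8, False, cpu_count=4)` gives 4, while any sqlite setup gives 1.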
bolkedebruin 4865ee66ba Merge pull request #855 from bolkedebruin/ISSUE-852
Use proper signal handling and cascade signals to children (Fix #852)
2016-04-06 20:41:24 +02:00
Bolke de Bruin e8c1144bb8 Add consistent and thorough signal handling and logging
Airflow spawns children in the form of a webserver, scheduler, and executors.
If the parent gets terminated (SIGTERM) it needs to properly propagate the
signals to the children, otherwise these will get orphaned and end up as
zombie processes. This patch resolves that issue.

In addition, Airflow does not store the PID of its services so they can be
managed by traditional unix systems services like rc.d / upstart / systemd
and the like. This patch adds the "--pid" flag. By default it stores the
PID in ~/airflow/airflow-<service>.pid

Lastly, the patch adds support for different log file locations: log,
stdout, and stderr (respectively: --log-file, --stdout, --stderr). By
default these are stored in ~/airflow/airflow-<service>.log/out/err.

* Resolves ISSUE-852
2016-04-06 20:40:43 +02:00
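A minimal sketch of cascading SIGTERM from a parent to its child processes, as described in this commit. The function names are hypothetical and this is not the patch itself; it assumes the children were started with `subprocess.Popen`.

```python
import signal
import subprocess
import sys


def terminate_children(children, timeout=5):
    """Send SIGTERM to every running child and wait for each one to exit."""
    for child in children:
        if child.poll() is None:
            child.send_signal(signal.SIGTERM)
    # Waiting reaps the children so none is left behind as a zombie.
    return [child.wait(timeout=timeout) for child in children]


def install_sigterm_handler(children):
    """Register a handler that forwards SIGTERM to children, then exits."""
    def handler(signum, frame):
        terminate_children(children)
        sys.exit(0)
    # Must be called from the main thread of the parent process.
    signal.signal(signal.SIGTERM, handler)
```

The parent would call `install_sigterm_handler` once after spawning its services, so that stopping the parent also stops everything it started.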
Paul Rhodes 81ff5cccb7 Allow Operators to specify SKIPPED status internally
* Added ability to skip DAG elements based on raised Exception

* Added nose-parameterized to test dependencies

* Fix for broken mysql test - provided by jlowin
2016-04-06 13:23:49 -04:00
jlowin 4763720530 Set DAG_FOLDER for unit tests
When tests are running, the default DAG_FOLDER becomes
`airflow/tests/dags`. This makes it much easier to execute DAGs in unit
tests in a standardized manner.

Also exports DAGS_FOLDER as an env var for Travis
2016-04-05 11:04:42 -04:00
bolkedebruin f347ee709f Merge pull request #1128 from bolkedebruin/hivemeta_sasl
Add GSSAPI SASL to HiveMetaStoreHook.
2016-03-30 09:17:50 +02:00
Bolke de Bruin d66bf57ec5 Use LocalExecutor on Travis if possible
Travis was only using the SequentialExecutor, which is suboptimal as the
SequentialExecutor is not geared for production. This change enables
the LocalExecutor if possible, i.e. when not using sqlite.
2016-03-29 12:58:07 +02:00
Bolke de Bruin 657aebbd0e Merge branch 'master' into hivemeta_sasl 2016-03-28 12:07:47 +02:00
Xavier P a2c389b936 Set KillMode to 'control-group' for webservice.service
The gunicorn workers were not properly killed on `systemctl stop`
command. By removing the KillMode setting, the default parameter
"control-group" is used and all children are properly killed.
2016-03-21 16:22:25 +01:00
Xavier P afcd824154 Set KillMode to 'control-group' for worker.service
Systemd KillMode was set to "process" so only the main process was
killed and not its children. By removing this line, the systemd default value
"control-group" is used and all children are correctly stopped.
2016-03-21 16:21:34 +01:00
Bolke de Bruin 1b75315cd5 Merge remote-tracking branch 'upstream/master' into minicluster 2016-03-20 10:20:43 +01:00
Maxime Beauchemin a4024c2670 Pointing to a reqs file 2016-03-19 15:22:35 -07:00
Bolke de Bruin cd27c939b8 Add license and ignore for sql and csv 2016-03-19 19:00:09 +01:00
Bolke de Bruin f4596263d3 Provide data for ci tests 2016-03-19 18:37:07 +01:00
Bolke de Bruin ba15d3ab76 This patch allows for testing of hive operators and hooks. Sasl is used (NoSasl in connection string is not possible). Tests have been adjusted. 2016-03-18 14:03:38 +01:00
Bolke de Bruin d2b310462d Make sure only to update counts when building master 2016-03-09 08:48:46 +01:00
Dmitriy Lee 4de0e70d6a Add upstart scripts 2016-03-07 11:24:08 -05:00
Bolke de Bruin 94142ea177 This patch adds license checking for Airflow. For now it will store a number
in Travis' cache to make sure current builds do not fail but newly added
files should have a license header included.
2016-03-04 19:11:15 +01:00
Maxime Beauchemin f112885538 Running unit tests with local executor 2016-02-21 21:16:02 -08:00
Moira Tagle 3e99d58d25 Add tests to:
1. make sure that if no superuser or dataprofiler filter is set, all logged in users get superuser or dataprofiler privileges.
2. make sure that, if filters are set, users get access when the filters allow it.

change ldif loading so that it always loads in a reasonable order.
2016-02-09 15:42:00 -08:00
Maxime Beauchemin 705499e97d Merge pull request #674 from bolkedebruin/systemd
Add systemd unit files
2015-12-04 14:37:32 -08:00
Svend Vanderveken 4563df404c add Fernet key to test config
- configuration.py now generates the test config and real airflow config with the same method
- fix warning logs in local UT execution, related to missing FERNET-KEY in unittest.cfg
- fix warning logs in Travis UT execution, related to missing FERNET-KEY in airflow_travis.cfg
- added one test to validate config generation
2015-12-04 11:06:37 +01:00
Bolke de Bruin 06b2728c52 Make scheduler runs configurable add example environment file 2015-12-01 12:13:38 +01:00
Bolke de Bruin 429cecc4fa Always restart scheduler if it stops after 5 runs 2015-11-25 04:35:09 +01:00
Bolke de Bruin 327984db53 Add contents and description 2015-11-23 19:32:54 +01:00
Bolke de Bruin c83d78f31b Add systemd unit files 2015-11-20 19:43:09 +01:00
Svend Vanderveken 49364e13a3 now running travis on all DB backends
- added Travis config to also run integration tests on sqlite
- added Travis config to also run integration tests on postgres, and commented them out as they seem to fail
- added a bit more logs in bash scripts executed by Travis
2015-11-16 11:21:21 +01:00
Bolke de Bruin 1a66df4119 Add ldap travis tests 2015-11-05 22:48:01 +01:00
Maxime Beauchemin 8295ed4edd Trying to see whether the sequential executor works better 2015-10-30 14:11:45 -07:00
Maxime Beauchemin 714beb2594 Lowering travis parallelism to 2 2015-10-30 12:23:22 -07:00
Maxime Beauchemin 53258d1c61 Merge pull request #579 from bolkedebruin/master
Fix travis caching
2015-10-30 11:50:36 -07:00
Maxime Beauchemin 92b81ddc7b Lowering parallelism on Travis cfg 2015-10-30 09:52:10 -07:00
Bolke de Bruin 2856b3a4c8 Fix caching by using TRAVIS_CACHE and moving it from .cache to .travis_cache 2015-10-30 11:23:11 +01:00
Maxime Beauchemin 34ac16ea83 Merge pull request #566 from bolkedebruin/master
Speed up requirements downloading for tests
2015-10-28 10:32:05 -07:00
Bolke de Bruin efb834ee1f Make cp recursive 2015-10-28 12:32:09 +01:00
Bolke de Bruin 91c3c9c33b Set cache for ivy 2015-10-28 12:28:54 +01:00
Bolke de Bruin 5b90d7b28a Cache minikdc requirements 2015-10-28 12:19:54 +01:00
Bolke de Bruin a936848e35 Typo fix 2015-10-28 12:13:31 +01:00
Bolke de Bruin c6d6b800c2 Speed up downloads of hadoop distros 2015-10-28 12:04:51 +01:00
Maxime Beauchemin 6c89f0f92e Tons of tests improvements 2015-10-25 23:52:23 -07:00
Bolke de Bruin 7ab372e32e Fix path (again) 2015-10-21 10:45:27 -07:00
Bolke de Bruin 475853659d Fix path 2015-10-21 10:45:27 -07:00
Bolke de Bruin 039a1e650d Make sure the cfg can be copied in any travis environment 2015-10-21 10:45:27 -07:00
Bolke de Bruin 76779fb671 Move to mysqlclient instead of pymysql for python3 2015-10-21 10:45:27 -07:00
Maxime Beauchemin 4401938f76 Moving from mysql-python to pymysql 2015-10-21 10:45:27 -07:00
Maxime Beauchemin 470e2a9be8 Adding a custom airflow.cfg from the repo 2015-10-21 10:42:01 -07:00
Maxime Beauchemin 1ca1776b3d Trying different strategies to get initdb to run 2015-10-21 10:42:01 -07:00
Bolke de Bruin f1776f2d5d Removed unneeded script 2015-10-21 10:42:00 -07:00
Bolke de Bruin 1bef4a3812 different way of matching 2015-10-21 10:42:00 -07:00
Bolke de Bruin a75c6baca0 fix 2015-10-21 10:42:00 -07:00
Bolke de Bruin d50b4aa26c make case insensitive 2015-10-21 10:42:00 -07:00
Bolke de Bruin 7ecfe7d20f use correct separator 2015-10-21 10:42:00 -07:00
Bolke de Bruin ef63350007 testing 2015-10-21 10:42:00 -07:00
Bolke de Bruin 685f54547f Update script for missing wheels (does not check versions) 2015-10-21 10:42:00 -07:00
Bolke de Bruin db1ad2f3a9 Make setup_kdc executable 2015-10-21 10:41:58 -07:00
Bolke de Bruin 8cf4b9d360 Add kdc download and some test 2015-10-21 10:41:58 -07:00
Bolke de Bruin aefda8e31a Add minikdc to be downloaded and made available 2015-10-21 10:41:58 -07:00
Bolke de Bruin 4a65e94a5d correct paths 2015-10-21 10:41:58 -07:00
Bolke de Bruin 903bb81b36 allow execute rights 2015-10-21 10:41:57 -07:00
Bolke de Bruin e78ec1f1dc Port ci tests from snakebite in order to test against hadoop 2015-10-21 10:41:57 -07:00