When using offsets potentially larger than JavaScript can handle, they can get parsed incorrectly on the client, resulting in the offset query getting stuck on a certain number. This patch ensures that we return the offset to the client as a string so it is not parsed as a number. When we run the query, we ensure the offset is cast back to an integer.
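A minimal sketch of the idea (the function names are illustrative, not the actual handler code):

```python
import json

JS_MAX_SAFE_INTEGER = 2 ** 53 - 1  # beyond this a JS Number silently loses precision


def offset_for_client(offset):
    # Send the offset as a string so the browser never coerces it to a float.
    return json.dumps({"offset": str(offset)})


def offset_for_query(raw_offset):
    # Cast back to int right before it is used in the LIMIT/OFFSET clause.
    return int(raw_offset)


print(offset_for_client(JS_MAX_SAFE_INTEGER + 2))   # {"offset": "9007199254740993"}
print(offset_for_query("9007199254740993"))         # 9007199254740993
```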
Add unnecessary prefix_ in config for the Elasticsearch section
* Move k8s executor from contrib folder
Considering that the k8s executor is now fully supported by core
committers, we should move it from contrib to the primary executor
directory.
There were a few ways of getting the AIRFLOW_HOME directory used
throughout the code base, giving possibly conflicting answers if they
weren't kept in sync:
- the AIRFLOW_HOME environment variable
- core/airflow_home from the config
- settings.AIRFLOW_HOME
- configuration.AIRFLOW_HOME
Since the home directory is used to compute the default path of the
config file to load, specifying the home directory again in the config
file didn't make any sense to me, so I have deprecated that.
This commit makes everything in the code base use
`settings.AIRFLOW_HOME` as the source of truth, and deprecates the
core/airflow_home config option.
There was an import cycle from settings -> logging_config ->
module_loading -> settings that needed to be broken on Python 2, so I
have moved all adjusting of sys.path into the settings module.
(This issue caused me a problem where the RBAC UI wouldn't work as it
didn't find the right webserver_config.py)
Something about the tests or how we run them changed and we ended up
with a lot more lines appearing in the output, taking us over Travis' "will
display in the UI" limit, making it harder to debug failures. This isn't a long
term fix, but improves things while we fix the tests for the better.
Newer versions of Kubernetes return "failed" events for the sidecar container
when the ^C causes the Python process to exit with 1.
Kube 1.13 runs a different number of kube-dns pods (2 by default, 1.9
and 1.10 ran only 1) so the setup scripts needed changing a little bit.
To get a Kube 1.13 cluster I had to upgrade minikube, and it no longer
works on a dist without systemd installed (#systemdsucks) so I had to
update the travis dist to xenial which is no bad thing!
This version of minikube doesn't need the localkube bootstrapper set
anymore, it handles driver=none much more gracefully, and some of the
permissions set up for context files/keys needed to be updated.
* Support setting global k8s affinity and toleration configuration in the airflow config file.
* Copy annotations as dict, not list
* Update airflow/contrib/kubernetes/pod.py
Co-Authored-By: kppullin <kevin.pullin@gmail.com>
To help move away from Minikube, we need to remove the dependency on
a local docker registry and move towards a solution that can be used
in any Kubernetes cluster. Custom image names allow users to use
systems like Docker, Artifactory and GCR.
When running integration tests on a k8s cluster vs. Minikube
I discovered that we were actually using an invalid permission
structure for our persistent volume. This commit fixes that.
* Refactor Kubernetes operator with git-sync
Currently the implementation of git-sync is broken because:
- git-sync clones the repository in /tmp and not in the airflow-dags volume
- git-sync adds a link pointing to the required revision, but it is not
taken into account in AIRFLOW__CORE__DAGS_FOLDER
A dags/logs hostPath volume has been added (needed if Airflow runs in
Kubernetes in a local environment)
To avoid false positives in CI, `load_examples` is set to `False`,
otherwise DAGs from `airflow/example_dags` are always loaded. This way
it is possible to test `import`s in DAGs
Remove `worker_dags_folder` config:
`worker_dags_folder` is redundant and can lead to confusion.
In WorkerConfiguration `self.kube_config.dags_folder` defines the path of
the dags and can be set in the worker using airflow_configmap
Refactor worker_configuration.py
Use a docker container to run setup.py
Compile web assets
Fix codecov application path
* Fix kube_config.dags_in_image
* Read `dags_in_image` config value as a boolean
This PR is a minor fix for #3683
The dags_in_image config value is read as a string. However, the existing code expects this to be a boolean.
For example, in worker_configuration.py there is the statement: if not self.kube_config.dags_in_image:
Since the value is a non-empty string ('False') and not a boolean, this evaluates to true (since non-empty strings are truthy)
and skips the logic to add the dags_volume_claim volume mount.
This results in the CI tests failing because the dag volume is missing in the k8s pod definition.
This PR reads the dags_in_image using the conf.getboolean to fix this error.
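A minimal illustration of the failure mode and the fix, using the stock ConfigParser API that AirflowConfigParser builds on:

```python
from configparser import ConfigParser

conf = ConfigParser()
conf.read_string("""
[kubernetes]
dags_in_image = False
""")

# Reading as a string: 'False' is a non-empty string, so the check is truthy.
dags_in_image = conf.get("kubernetes", "dags_in_image")
print(bool(dags_in_image))      # True  -> the dags_volume_claim mount is wrongly skipped

# Reading as a boolean: the value is parsed properly.
dags_in_image = conf.getboolean("kubernetes", "dags_in_image")
print(dags_in_image)            # False -> the dags_volume_claim mount is added
```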
Rebased on 457ad83e4e, before the previous
dags_in_image commit was reverted.
* Revert "Revert [AIRFLOW-2770] [AIRFLOW-3505] (#4318)"
This reverts commit 77c368fd22.
* Revert "[AIRFLOW-3505] replace 'dags_in_docker' with 'dags_in_image' (#4311)"
This reverts commit 457ad83e4e.
* Revert "[AIRFLOW-2770] kubernetes: add support for dag folder in the docker image (#3683)"
This reverts commit e9a09d408e.
The password stays None (rather than the string 'None') when no password is set through the web admin interface.
This fixes connections to Redis servers that do not expect authorisation from clients.
The current `airflow flower` doesn't come with any authentication.
This may expose essential information in an untrusted environment.
This commit adds support for HTTP basic authentication for Airflow Flower.
Ref:
https://flower.readthedocs.io/en/latest/auth.html
This adds ASF license headers to all the .rst and .md files with the
exception of the Pull Request template (as that is included verbatim
when opening a Pull Request on GitHub, which would be messy)
* [AIRFLOW-3178] Don't mask defaults() function from ConfigParser
ConfigParser (the base class for AirflowConfigParser) expects defaults()
to be a function - so when we re-assign it to be a property some of the
methods from ConfigParser no longer work.
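A minimal sketch of the general failure mode (illustrative, not the actual Airflow code): once defaults() is shadowed by something that isn't callable, any code that still calls it breaks.

```python
from configparser import ConfigParser

parser = ConfigParser(defaults={"answer": "42"})
print(parser.defaults())             # the base class exposes defaults() as a method

parser.defaults = {"answer": "42"}   # re-assigning it to a plain mapping...
try:
    parser.defaults()                # ...breaks any code that still calls it
except TypeError as err:
    print("masked defaults() is no longer callable:", err)
```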
* [AIRFLOW-3178] Correctly escape percent signs when creating temp config
Otherwise we have a problem when we come to use those values.
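A minimal illustration of why the escaping matters (the connection string is made up):

```python
from configparser import ConfigParser, InterpolationSyntaxError

password = "p%ssw0rd"

good = ConfigParser()
good.read_string(
    "[core]\n"
    "sql_alchemy_conn = mysql://user:{}@host/db\n".format(password.replace("%", "%%"))
)
print(good.get("core", "sql_alchemy_conn"))   # the percent sign comes back intact

bad = ConfigParser()
bad.read_string(
    "[core]\nsql_alchemy_conn = mysql://user:{}@host/db\n".format(password)
)
try:
    bad.get("core", "sql_alchemy_conn")
except InterpolationSyntaxError as err:
    print("unescaped % breaks interpolation:", err)
```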
* [AIRFLOW-3178] Use os.chmod instead of shelling out
There's no need to run another process for a built-in Python function.
This also removes a possible race condition that could make the temporary
config file readable by more than the airflow or run-as user.
The exact behaviour would depend on the umask we run under and the
primary group of our user; most likely this would mean the file was readable
by members of the airflow group (which in most cases would be just the
airflow user). To remove any such possibility we chmod the file
before we write to it.
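A minimal sketch of the pattern (simplified relative to the real code): tighten permissions with os.chmod before any sensitive content is written, instead of shelling out to chmod afterwards.

```python
import os
import stat
import tempfile

fd, path = tempfile.mkstemp(suffix=".cfg")
try:
    # Owner read/write only, applied *before* any sensitive content is written,
    # so there is no window in which other users could read the file.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    with os.fdopen(fd, "w") as handle:
        handle.write("[core]\nsql_alchemy_conn = ...\n")
finally:
    os.unlink(path)
```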
- Update outdated cli command to create user
- Remove `airflow/example_dags_kubernetes` as the dag already exists in `contrib/example_dags/`
- Update the path to copy K8s dags
The recent update to the CI image changed the default
python from python2 to python3. The PythonVirtualenvOperator
tests expected python2 as default and fail due to
serialisation errors.
One of the requirements for tests is that they are self-contained. This means that
they should not depend on anything external, such as loading data.
This PR uses setUp and tearDown to load the data into MySQL
and remove it afterwards. This removes the actual bash mysql commands
and will make it easier to dockerize the whole test suite in the future.
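A minimal sketch of the setUp/tearDown pattern (the fixture used here is illustrative, not the actual test data):

```python
import unittest

from airflow import settings
from airflow.models import Connection


class TestWithSelfContainedFixtures(unittest.TestCase):
    def setUp(self):
        # Load the fixtures the test needs instead of relying on external
        # `mysql < dump.sql` style shell commands.
        self.session = settings.Session()
        self.conn = Connection(conn_id="test_mysql_fixture", conn_type="mysql")
        self.session.add(self.conn)
        self.session.commit()

    def tearDown(self):
        # Remove everything the test created so each run starts clean.
        self.session.delete(self.conn)
        self.session.commit()
        self.session.close()
```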
The current dockerised CI pipeline doesn't run minikube and the
Kubernetes integration tests. This starts a Kubernetes cluster
using minikube and runs k8s integration tests using docker-compose.
- Add missing variables and use codecov instead of coveralls.
The reason it wasn't working was missing environment variables.
The codecov library heavily depends on the environment variables in
the CI to determine how to push the reports to codecov.
- Remove the explicit passing of the variables in the `tox.ini`
since it is already done in the `docker-compose.yml`,
having to maintain this in two places makes it brittle.
- Removed the empty Codecov yml since codecov was complaining that
it was unable to parse it
Airflow tests depend on many external services and other custom setup,
which makes it hard for contributors to work on this codebase. CI
builds have also been unreliable, and it is hard to reproduce the
causes. Having contributors emulate the build environment by hand
every time makes it easy to end up in an "it works on my machine" sort
of situation.
This implements a dockerised version of the current build pipeline.
This setup has a few advantages:
* TravisCI tests are reproducible locally
* The same build setup can be used to create a local development environment
- Dictionary creation should be written with a dictionary literal
- Python's default arguments are evaluated once when the function is defined, not each time the function is called (like it is in, say, Ruby). This means that if you use a mutable default argument and mutate it, you will have mutated that object for all future calls to the function as well (see the sketch after this list).
- Calls constructing sets that can be replaced by a set literal now use a set literal
- Replace list literals
- Some methods that should be static hadn't been marked as static
- Remove redundant parentheses
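A minimal illustration of the mutable-default-argument pitfall mentioned above:

```python
def append_bad(item, bucket=[]):
    # The same list object is reused across calls, so items accumulate.
    bucket.append(item)
    return bucket


def append_good(item, bucket=None):
    # Use None as the sentinel and create a fresh list per call instead.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


print(append_bad("a"), append_bad("b"))    # ['a', 'b'] ['a', 'b']
print(append_good("a"), append_good("b"))  # ['a'] ['b']
```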
Fix scripts/ci/kubernetes/minikube/start_minikube.sh
as follows:
- Make minikube version configurable via
environment variable
- Remove unused variables for readability
- Reorder some lines to remove warnings
- Replace ineffective `return` with `exit`
- Add -E to `sudo minikube` so that non-root
users can use this script locally
By default one of Apache Airflow's dependencies pulls in a GPL
library. Airflow should not install (and upgrade) without an explicit choice.
This is part of the Apache requirements as we cannot depend on Category X
software.
* Updates the GCP hooks to use the google-auth library and removes dependencies on the deprecated oauth2client package.
* Removes inconsistent handling of the scope parameter for different auth methods.
Note: using google-auth for credentials requires a newer version of the google-api-python-client package, so this commit also updates the minimum version for that.
To avoid some annoying warnings about the discovery cache not being supported, disable the discovery cache explicitly as recommended here:
https://stackoverflow.com/a/44518587/101923
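A minimal sketch of the workaround (the service name and version are illustrative):

```python
import google.auth
from googleapiclient.discovery import build

# google-auth replaces the deprecated oauth2client credentials.
credentials, project = google.auth.default()

# cache_discovery=False silences the "file_cache is unavailable" warning
# emitted when the discovery cache cannot be used with the newer auth library.
service = build("bigquery", "v2", credentials=credentials, cache_discovery=False)
```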
Tested by running:
nosetests tests/contrib/operators/test_dataflow_operator.py \
  tests/contrib/operators/test_gcs*.py \
  tests/contrib/operators/test_mlengine_*.py \
  tests/contrib/operators/test_pubsub_operator.py \
  tests/contrib/hooks/test_gcp*.py \
  tests/contrib/hooks/test_gcs_hook.py \
  tests/contrib/hooks/test_bigquery_hook.py
and also tested by running some GCP-related DAGs locally, such as the Dataproc DAG example at
https://cloud.google.com/composer/docs/quickstart
Closes #3488 from tswast/google-auth
Add lineage support by having inlets and outlets that are made available to dependent upstream or downstream tasks.
If configured to do so, it can send lineage data to a backend. Apache Atlas is supported out of the box.
Closes #3321 from bolkedebruin/lineage_exp
[AIRFLOW-2424] Add dagrun status endpoint and increase k8s test coverage
[AIRFLOW-2424] Added minikube fixes by @kimoonkim
[AIRFLOW-2424] modify endpoint to remove 'status'
Closes #3320 from dimberman/add-kubernetes-test
[AIRFLOW-1899] Add full deployment
- Made home directory configurable
- Documentation fix
- Add licenses
[AIRFLOW-1899] Tests for the Kubernetes Executor
Add an integration test for the Kubernetes executor. Done by spinning up different versions of Kubernetes and running a DAG by invoking the REST API.
Closes #3301 from Fokko/fix-kubernetes-executor
The logs are kept inside of the worker pod. By attaching a persistent disk we keep the logs and make them available for the webserver.
- Remove the requirements.txt since we don't want to maintain another dependency file
- Fix some small casing stuff
- Removed some unused code
- Add missing shebang lines
- Started on some docs
- Fixed the logging
Closes #3252 from Fokko/airflow-2357-pd-for-logs
Handle too old resource versions and throw exceptions on errors
- K8s API errors will now throw Airflow exceptions
- Add scheduler uuid to worker pod labels to match the two
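A minimal sketch of the error-translation idea (simplified; the real watcher code also handles cases such as expired resource versions):

```python
from airflow.exceptions import AirflowException
from kubernetes import client, watch
from kubernetes.client.rest import ApiException


def watch_pods(namespace, resource_version=None):
    v1 = client.CoreV1Api()
    try:
        for event in watch.Watch().stream(
            v1.list_namespaced_pod, namespace, resource_version=resource_version
        ):
            yield event
    except ApiException as err:
        # Surface Kubernetes API failures as Airflow exceptions instead of
        # letting the scheduler silently swallow them.
        raise AirflowException("Kubernetes watch failed: {}".format(err))
```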
* Added in executor_config to the task_instance table and the base_operator table
* Fix test; bump up number of examples
* Fix up comments from PR
* Exclude the kubernetes example dag from a test
* Fix dict -> KubernetesExecutorConfig
* Fixed up executor_config comment and type hint
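A hedged example of attaching per-task executor configuration via the new executor_config argument (the keys inside the dict are illustrative and depend on the executor):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python_operator import PythonOperator

dag = DAG("executor_config_example", start_date=datetime(2019, 1, 1), schedule_interval=None)


def noop():
    pass


run_on_big_worker = PythonOperator(
    task_id="run_on_big_worker",
    python_callable=noop,
    dag=dag,
    # Serialized into the task_instance row and read back by the executor;
    # the keys inside the dict are illustrative, not a fixed schema.
    executor_config={"KubernetesExecutor": {"request_memory": "1Gi"}},
)
```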
Add kubernetes config section in airflow.cfg and Inject GCP secrets upon executor start. (#17)
Update Airflow to Pass configuration to k8s containers, add some Py3 … (#9)
* Update Airflow to Pass configuration to k8s containers, add some Py3 compat., create git-sync pod
* Undo changes to display-source config setter for to_dict
* WIP Secrets and Configmaps
* Improve secrets support for multiple secrets. Add support for registry secrets. Add support for RBAC service accounts.
* Swap order of variables, overlooked very basic issue
* Secret env var names must be uppercase
* Update logging
* Revert spothero test code in setup.py
* WIP Fix tests
* Worker should be using local executor
* Consolidate worker setup and address code review comments
* reconfigure airflow script to use new secrets method
The python-cloudant release 2.8 is broken and causes our CI to fail. In setup.py we install cloudant version <2.0 and in our CI pipeline we install the latest version.
Closes #3051 from Fokko/fd-fix-cloudant
sla_miss and task_instances cannot have NULL execution_dates. The timezone migration scripts forgot to set this properly. In addition, to make sure MySQL does not set "ON UPDATE CURRENT_TIMESTAMP" or MariaDB "DEFAULT 0000-00-00 00:00:00", we now check if explicit_defaults_for_timestamp is turned on and otherwise fail a database upgrade.
Closes #2969, #2857
Closes #2979 from bolkedebruin/AIRFLOW-1895
Adding configuration setting for specifying a default mapred_queue for hive jobs using the HiveOperator.
Closes #2915 from edgarRd/erod-hive-mapred-queue-config
There are still celeryd_concurrency occurrences left in the code; this needs to be renamed to worker_concurrency to make the config consistent with Celery.
Closes #2870 from Fokko/AIRFLOW-1911-update-airflow-config
Options were set to visibility timeout instead of broker_options directly. Furthermore, options should be int, float, bool or string, not all string.
Closes #2867 from bolkedebruin/AIRFLOW-1908
Explicitly set the celery backend from the config and align the config with the celery config as this might be confusing.
Closes #2806 from Fokko/AIRFLOW-1840-Fix-celery-config
https://github.com/spulec/moto/pull/1048 introduced `docker` as a
dependency in Moto, causing a conflict as Airflow uses `docker-py`. As
both packages don't work together, Moto is pinned to the version
prior to that change.
In the very early days, the Airflow scheduler needed to be restarted every so often to take new DAG_FOLDERS mutations into account properly. This is no longer required.
Closes #2677 from mistercrunch/scheduler_runs
The celery config is currently part of the celery executor definition.
This is really inflexible for users wanting to change it. In addition
Celery 4 is moving to lowercase.
Closes #2542 from bolkedebruin/upgrade_celery
In all the popular languages the variable name log is the de facto standard for logging. Rename LoggingMixin.py to logging_mixin.py to comply with the Python standard.
When using the .logger a deprecation warning will be emitted.
Closes #2604 from Fokko/AIRFLOW-1604-logger-to-log
Make the druid operator and hook more specific. This allows us to have a more flexible configuration, for example ingesting parquet.
Also get rid of the PyDruid extension since it is more focused on querying druid rather than ingesting data. Just requests is sufficient to submit an indexing job. Add a test to the hive_to_druid operator to make sure it behaves as we expect.
Furthermore, cleaned up the docstring a bit.
Closes #2378 from Fokko/AIRFLOW-1324-make-more-general-druid-hook-and-operator
1. Upgrade qds_sdk version to latest
2. Add support to run Zeppelin Notebooks
3. Move out initialization of QuboleHook from init()
Closes #2322 from msumit/AIRFLOW-1192
Rename all unit tests under tests/contrib to start with test_* and fix broken unit tests so that they run for the Python 2 and 3 builds.
Closes #2234 from hgrif/AIRFLOW-1094
This PR implements a hook to interface with Azure storage over wasb:// via azure-storage; adds sensors to check for blobs or prefixes; and adds an operator to transfer a local file to the Blob Storage.
Design is similar to that of the S3Hook in airflow.operators.S3_hook.
Closes #2216 from hgrif/AIRFLOW-1065
The ShortCircuitOperator, BranchOperator and LatestOnlyOperator were arbitrarily changing the states of TaskInstances without locking them in the database. As the scheduler checks the state of dag runs asynchronously, the dag run state could be set to failed while the operators are updating the downstream tasks.
A better fix would be to use the dag run itself in the context of the Operator.
The return from the subprocess is in bytes when universal_newlines is set to False (the default). This will fail in Py3 and works fine in Py2. This comes with a working unit test.
Closes #2158 from abij/AIRFLOW-840
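A minimal illustration of the bytes-vs-text difference and two ways to handle it (the command is illustrative):

```python
import subprocess

# With universal_newlines=False (the default) the output is bytes on Python 3,
# so string operations like startswith('...') behave differently than on Python 2.
raw = subprocess.check_output(["echo", "hello"])
print(type(raw))            # <class 'bytes'> on Py3, str on Py2

# Either decode explicitly...
print(raw.decode("utf-8").strip())

# ...or ask subprocess for text output directly.
text = subprocess.check_output(["echo", "hello"], universal_newlines=True)
print(text.strip())         # 'hello' as str on both Py2 and Py3
```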
We add the Apache-licensed bleach library and use it to sanitize HTML passed to Markup (which is supposed to be already escaped). This avoids some XSS issues with unsanitized user input being displayed.
Closes #2193 from saguziel/aguziel-xss
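A minimal sketch of the sanitisation pattern (the whitelist of tags is illustrative):

```python
import bleach
from markupsafe import Markup

user_supplied = '<b>hi</b><script>alert("xss")</script>'

# bleach.clean strips or escapes anything outside the whitelist before the
# string is wrapped in Markup (which otherwise trusts its input as-is).
safe = Markup(bleach.clean(user_supplied, tags=["b", "i", "a"], attributes={"a": ["href"]}))
print(safe)  # <b>hi</b>&lt;script&gt;alert("xss")&lt;/script&gt;
```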
Avoid unnecessary backfills by having start dates of just a few days ago. Adds a utility function airflow.utils.dates.days_ago().
Closes #2068 from jlowin/example-start-date
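A short usage example of the new helper (the DAG id and schedule are illustrative):

```python
from airflow import DAG
from airflow.utils.dates import days_ago

# A start_date just a couple of days back keeps example DAGs from triggering
# long catch-up backfills when they are first enabled.
dag = DAG(
    dag_id="example_recent_start_date",
    start_date=days_ago(2),
    schedule_interval="@daily",
)
```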
Submitting on behalf of plypaul
Please accept this PR that addresses the following issues:
- https://issues.apache.org/jira/browse/AIRFLOW-219
- https://issues.apache.org/jira/browse/AIRFLOW-398
Testing Done:
- Running on Airbnb prod (though on a different mergebase) for many months
Credits:
Impersonation Work: georgeke did most of the work but plypaul did quite a bit of work too.
Cgroups: plypaul did most of the work, I just did some touch up/bug fixes (see commit history, the cgroups + impersonation commit is actually plypaul's, not mine)
Closes #1934 from aoen/ddavydov/cgroups_and_impersonation_after_rebase
Extend SchedulerJob to instrument the execution performance of task instances contained in each DAG. We want to know if any DAG is starved of resources, and this will be reflected in the stats printed out at the end of the test run.
This test is for instrumenting the operational impact of https://github.com/apache/incubator-airflow/pull/1906
Closes #1919 from vijaysbhat/scheduler_perf_tool
This implements a framework for API calls to Airflow. Currently
all access is done by cli or web ui. Especially in the context
of the cli this raises security concerns which can be alleviated
with a secured API call over the wire.
Secondly integration with other systems is a bit harder if you have
to call a cli. For public facing endpoints JSON is used.
As an example the trigger_dag functionality is now made into an API call.
Backwards compat is retained by switching to a LocalClient.
Both utcnow() and now() return fractional seconds. These
are sometimes used in primary_keys (eg. in task_instance).
If MySQL is not configured to store these fractional seconds
a primary key might fail (eg. at session.merge) resulting in
a duplicate entry being added or worse.
Postgres does store fractional seconds if left unconfigured,
sqlite needs to be examined.
Dear Airflow Maintainers,
Please accept this PR that addresses the following issues:
- https://issues.apache.org/jira/browse/AIRFLOW-512
Testing Done:
- N/A, but ran core tests: `./run_unit_tests.sh tests.core:CoreTest -s`
Closes #1800 from dgingrich/master
The Travis cache can contain faulty files. This results in builds
that fail as they are dependent on certain components being
available, such as hive. This addresses the issue for hive by
re-downloading if unpacking fails.
- Tell gunicorn to prepend `[ready]` to worker process name once worker is ready (to serve requests) - in particular this happens after DAGs folder is parsed
- Airflow cli runs gunicorn as a child process instead of `execvp`-ing over itself
- Airflow cli monitors gunicorn worker processes and restarts them by sending TTIN/TTOU signals to the gunicorn master process
- Fix bug where `conf.get('webserver', 'workers')` and `conf.get('webserver', 'webserver_worker_timeout')` were ignored
- Alternatively, https://github.com/apache/incubator-airflow/pull/1684/files does the same thing but the worker-restart script is provided separately for the user to run
- Start airflow, observe that workers are restarted
- Add new dags to dags folder and check that they show up
- Run `siege` against airflow while server is restarting and confirm that all requests succeed
- Run with configuration set to `batch_size = 0`, `batch_size = 1` and `batch_size = 4`
Closes #1685 from zodiac/xuanji_gunicorn_rolling_restart_2
Instead of parsing the DAG definition files in the same process as the
scheduler, this change parses the files in a child process. This helps
to isolate the scheduler from bad user code.
Closes #1636 from plypaul/plypaul_schedule_by_file_rebase_master
It is now possible to filter over LDAP group (in the web
interface) when using the LDAP authentication backend.
Note that this feature requires the "memberOf"
overlay to be configured on the LDAP server.
Closes #1479 from dsjl/AIRFLOW-40
- Added Apache license header for files with extensions (.service, .in, .mako, .properties, .ini, .sh, .ldif, .coveragerc, .cfg, .yml, .conf, .sql, .css, .js, .html, .xml).
- Added/Replaced shebang on all .sh files with portable version - #!/usr/bin/env bash.
- Skipped third party css and js files. Skipped all minified js files as well.
Closes #1598 from ajayyadava/248
As the number of dags grows and the ability to create dags programmatically
is used more often, more and more time is spent in the scheduler, which lowers
throughput.
This patch adds the ability to use multiple threads in the scheduler. The
number of threads can be specified by "max_threads" in the scheduler section
of the configuration. The number of threads will, however, not exceed the
number of cores.
In case sqlite is used, max_threads will be set to 1 as sqlite does not
support multiple db connections.
Airflow spawns children in the form of a webserver, scheduler, and executors.
If the parent gets terminated (SIGTERM) it needs to properly propagate the
signals to the children, otherwise these will get orphaned and end up as
zombie processes. This patch resolves that issue.
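A minimal sketch of the propagation pattern (simplified; not the actual Airflow code):

```python
import os
import signal
import subprocess
import sys

# A stand-in for the webserver/scheduler/executor children Airflow spawns.
children = [subprocess.Popen([sys.executable, "-c", "import time; time.sleep(3600)"])]


def handle_sigterm(signum, frame):
    # Forward termination to every child so none are orphaned as zombies,
    # then wait for them before exiting ourselves.
    for child in children:
        child.terminate()
    for child in children:
        child.wait()
    sys.exit(0)


signal.signal(signal.SIGTERM, handle_sigterm)

# Simulate a service manager sending SIGTERM to this process.
os.kill(os.getpid(), signal.SIGTERM)
```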
In addition, Airflow does not store the PID of its services, which is needed so
they can be managed by traditional unix service managers like rc.d / upstart /
systemd and the likes. This patch adds the "--pid" flag. By default it stores
the PID in ~/airflow/airflow-<service>.pid
Lastly, the patch adds support for different log file locations: log,
stdout, and stderr (respectively: --log-file, --stdout, --stderr). By
default these are stored in ~/airflow/airflow-<service>.log/out/err.
* Resolves ISSUE-852
* Added ability to skip DAG elements based on raised Exception
* Added nose-parameterized to test dependencies
* Fix for broken mysql test - provided by jlowin
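A hedged example of the skip-on-exception behaviour (the DAG and callable are illustrative):

```python
from datetime import datetime

from airflow import DAG
from airflow.exceptions import AirflowSkipException
from airflow.operators.python_operator import PythonOperator


def maybe_skip(**context):
    # Raising AirflowSkipException marks this task instance as skipped
    # instead of failed.
    if context["execution_date"].weekday() >= 5:
        raise AirflowSkipException("Nothing to do on weekends")


dag = DAG("skip_example", start_date=datetime(2019, 1, 1), schedule_interval="@daily")

skip_task = PythonOperator(
    task_id="maybe_skip",
    python_callable=maybe_skip,
    provide_context=True,
    dag=dag,
)
```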
When tests are running, the default DAG_FOLDER becomes
`airflow/tests/dags`. This makes it much easier to execute DAGs in unit
tests in a standardized manner.
Also exports DAGS_FOLDER as an env var for Travis
Travis was only using the SequentialExecutor, which is suboptimal as the
SequentialExecutor is not geared for production. This change enables
the LocalExecutor if possible, i.e. when not using sqlite.
The gunicorn workers were not properly killed on the `systemctl stop`
command. By removing the KillMode setting the default value
"control-group" is used and all children are properly killed.
Systemd KillMode was set to "process" so only the main process was
killed and not its children. By removing this line, the systemd default value
"control-group" is used and all children are correctly stopped.
1. Make sure that if no superuser or dataprofiler filter is set, all logged in users get superuser or dataprofiler privileges.
2. Make sure that, if filters are set, users get access when the filters allow it.
Change ldif loading so that it always loads in a reasonable order.
- configuration.py now generates the test config and the real airflow config with the same method
- Fix warning logs in local UT execution, related to missing FERNET_KEY in unittest.cfg
- Fix warning logs in Travis UT execution, related to missing FERNET_KEY in airflow_travis.cfg
- Added a test to validate config generation
- Added Travis config to also run integration tests on sqlite
- Added Travis config to also run integration tests on Postgres, and commented them out as they seem to fail
- Added a bit more logging in bash scripts executed by Travis