So far, the production images of Airflow were built from sources
when they were built on CI. This PR changes that: it builds the
airflow + providers packages first and installs them, rather than
using sources as the installation mechanism.
Part of #12261
For Kubernetes tests, all tests can be executed with the same Python
version - the default one - no matter which PYTHON_MAJOR_MINOR is
used. This is because we are testing Airflow deployed via the
production image. Thanks to that, we can fix the Python version to
the default and avoid any Python version problems (this is
especially important for cherry-picking to 1.10, where we have
Python 2.7 and 3.5).
K9s is a fantastic tool that helps to debug a running k8s
instance. It is a terminal-based, windowed CLI that makes you
several times more productive compared to using plain kubectl
commands. We've integrated k9s (it is run as a docker container
and downloaded on demand). We've also separated out the KUBECONFIG
of the integrated kind cluster so that it does not mess with any
kubernetes configuration you might already have.
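A minimal sketch of what the integration amounts to (the kubeconfig
path and cluster name are illustrative, not the exact values the
scripts use):

```bash
# Illustrative only - run k9s as a docker container against the kind
# cluster, with a KUBECONFIG dedicated to the integrated cluster.
export KUBECONFIG="${HOME}/.kube/kind-airflow-config"   # hypothetical path
kind export kubeconfig --name airflow --kubeconfig "${KUBECONFIG}"
docker run --rm -it --network host \
    -v "${KUBECONFIG}:/root/.kube/config" \
    quay.io/derailed/k9s
```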
Also, together with that, the "surroundings" of the kubernetes
tests were simplified and improved so that the k9s integration
can be utilized well. Instead of kubectl port forwarding (which
caused a multitude of problems) we now utilize kind's
portMapping feature plus a custom NodePort resource that maps
port 8080 to NodePort 30007, which in turn maps it to port 8080
of the Webserver. This way we do not have to establish an
external kubectl port-forward, which is error-prone and needs to
be managed - everything is brought up when Airflow gets deployed
to the kind cluster and shut down when the kind cluster is stopped.
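Schematically, the two pieces look like this (cluster name, node
layout and the selector label are illustrative):

```bash
# Illustrative only - kind cluster config mapping host port 8080 to
# NodePort 30007 on the kind node.
kind create cluster --name airflow --config - <<'EOF'
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
  - role: worker
    extraPortMappings:
      - containerPort: 30007   # the NodePort opened on the node
        hostPort: 8080         # the port you hit on the host
EOF

# Illustrative only - NodePort Service forwarding 30007 to the
# webserver's port 8080.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: airflow-webserver-node-port
spec:
  type: NodePort
  selector:
    component: webserver       # hypothetical label
  ports:
    - port: 8080
      targetPort: 8080
      nodePort: 30007
EOF
```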
Yet another problem fixed was the killing of Postgres by one of the
kubernetes tests ('test_integration_run_dag_with_scheduler_failure').
Instead of just killing the scheduler, it killed all pods - including
the Postgres one (it was named 'airflow-postgres.*'). That caused
various problems, as the database could be left in a strange state.
I changed the test to do what it claimed to be doing - killing only
the scheduler during the test. This seemed to improve the stability
of the tests immensely in my local setup.
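The gist of the fix, as a hedged sketch (namespace and label are
hypothetical):

```bash
# Illustrative only - delete just the scheduler pod...
kubectl delete pod -n airflow -l component=scheduler

# ...instead of a broad name-based match that also caught Postgres:
# kubectl get pods -n airflow | grep airflow | awk '{print $1}' \
#     | xargs kubectl delete pod -n airflow
```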
In preparation for adding provider packages to the 2.0 line, we
are renaming backport packages to provider packages.
We want to implement this in stages - first rename the
packages, then split out the backport/2.0 providers as part of
the #11421 issue.
We seem to have a problem with running all tests at once - most
likely due to some resource problems in our CI - therefore it makes
sense to split the tests into more batches. This is not yet the full
implementation of selective tests, but it goes in that direction
by splitting into Core/Providers/API/CLI tests. The full selective
tests approach will be implemented as part of the #10507 issue.

This split is possible thanks to #10422, which moved building the
image to a separate workflow - this way each image is only built once
and uploaded to a shared registry, from which it is quickly
downloaded rather than being built by all the jobs separately. This
way we can have many more jobs, as there is very little per-job
overhead before the tests start running.
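Conceptually, the batches map to pytest invocations along these
lines (the exact CI wiring differs):

```bash
# Illustrative only - each batch runs as a separate CI job.
pytest tests/api        # API batch
pytest tests/cli        # CLI batch
pytest tests/providers  # Providers batch
pytest tests --ignore=tests/api --ignore=tests/cli \
    --ignore=tests/providers   # Core batch (everything else)
```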
We've observed the tests for the last couple of weeks and it seems
most of the tests marked with the "quarantined" marker are succeeding
in a stable way (https://github.com/apache/airflow/issues/10118).
The tests removed from quarantine have a success ratio of > 95%
(20 runs without problems), and this was verified a week ago as well,
so they seem rather stable.
There are literally only a few that are either failing or causing
the Quarantined builds to hang. I manually reviewed the
master tests that failed over the last few weeks and added the
tests that were causing the builds to hang.
It seems that stability has improved - which might be caused
by some temporary problems present when we marked the quarantined
builds, by a too "generous" way of marking tests as quarantined, or
maybe the improvement comes from #10368, as the docker engine
and machines used to run the builds in GitHub experience far
less load (image builds are executed in separate workflows), so
resource usage may be decreased. Another reason
might be GitHub Actions stability improvements.
Or those tests are simply more stable when run in isolation.
We might still add failing tests back as soon as we see them behaving
in a flaky way.
The remaining quarantined tests that need to be fixed:
* test_local_run (often hangs the build)
* test_retry_handling_job
* test_clear_multiple_external_task_marker
* test_should_force_kill_process
* test_change_state_for_tis_without_dagrun
* test_cli_webserver_background
We also move some of those tests to the "heisentests" category.
Those tests run fine in isolation but fail
the builds when run with all other tests:
* TestImpersonation tests
We might find that those heisentests can be fixed, but for
now we are going to run them in isolation.
Also, since those quarantined tests are failing more often,
the "num runs" to track for them has been decreased to 10,
to keep track of only the last 10 runs.
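For reference, the selection is marker-based; roughly (the
heisentests marker name mirrors the category above and is
illustrative):

```bash
# Illustrative only - regular jobs exclude the quarantined/heisen
# tests, which then run in their own isolated jobs.
pytest -m "not quarantined and not heisentests" tests/
pytest -m quarantined tests/    # separate Quarantined job
pytest -m heisentests tests/    # run in isolation
```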
The Kubernetes tests are now run using the Helm chart
rather than the custom templates we used to have.
The Helm Chart uses the locally built production image,
so the tests are testing not only Airflow but also the
Helm Chart and the production image - all at the
same time. Later on we will add more tests
covering more functionality of both the Helm Chart
and the production image. This is the first step to
getting all of those bundled together and
testable.
This change also introduces a 'shell' sub-command
for Breeze's kind-cluster command and an
EMBEDDED_DAGS build arg for the production image -
both of them useful for running the Kubernetes tests
more easily: without building two images,
and with an easy shell command to iterate
over the tests, which works without any
other development environment.
Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
Co-authored-by: Daniel Imberman <daniel@astronomer.io>
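Usage of the two additions looks roughly like this (the DAGs path
and image tag are examples):

```bash
# Illustrative only - drop into a shell wired up to the kind cluster
./breeze kind-cluster shell

# Illustrative only - bake DAGs into the production image at build time
docker build . --build-arg EMBEDDED_DAGS="airflow/example_dags" \
    --tag airflow-prod:k8s-test
```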
For a long time, the way entrypoints worked in the CI scripts
was wrong. It was convoluted and little short of black
magic; it did not allow passing multiple test targets and
required separate "execute command" scripts in Breeze.
This is all now straightened out: both the production and
CI images always use the right entrypoint by default,
and we can simply pass parameters to the image as usual, without
escaping strings.
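Schematically, this means arguments now pass straight through (the
image tag is an example):

```bash
# Illustrative only - multiple test targets passed directly to the
# image, no escaping and no dedicated "execute command" scripts.
docker run -it apache/airflow:master-python3.6-ci \
    pytest tests/core tests/models
```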
This also allowed us to remove some Breeze commands and
change the names of several Breeze flags to make them more
meaningful.
Both the CI and PROD images now have embedded scripts for log
cleaning.
The history of image releases is added for the 1.10.10-*
alpha-quality images.
Tests requiring a Kubernetes cluster are now moved out of
the regular CI tests and into the "kubernetes_tests" folder,
so that they can be run entirely on the host without having
the CI image built at all. They use the production image
to run the tests on a KinD cluster, and we add tooling
to start/stop/deploy the application to the KinD cluster
automatically - for both CI testing and local development.
This is a prerequisite for converting the tests to use the
official Helm Chart and Docker images of Apache Airflow.
Closes #8782
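A sketch of the workflow this enables (the kind-cluster sub-command
names are indicative of the tooling, not exact):

```bash
# Illustrative only - manage the KinD cluster and run the kubernetes
# tests directly on the host, no CI image needed.
./breeze kind-cluster start    # bring the cluster up
./breeze kind-cluster deploy   # deploy airflow (production image) to it
pytest kubernetes_tests/       # run the tests from the host
./breeze kind-cluster stop     # tear it down
```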
We now have a mechanism to keep release notes updated for the
backport operators in an automated way.
It really nicely generates all the necessary information:
* summary of requirements for each backport package
* list of dependencies (including extras to install them) when a
package depends on other provider packages
* table of new hooks/operators/sensors/protocols/secrets
* table of moved hooks/operators/sensors/protocols/secrets with
information about where they were moved from
* changelog of all the changes to the provider package (this will be
automatically updated with an incremental changelog whenever we decide
to release separate packages)
The system is fully automated - we will be able to produce release notes
automatically (per package) whenever we decide to release a new version
of a package in the future.
Not everything in the CRON job was working after the emergency
migration to GitHub Actions.
This change brings back the following improvements:
* rebuilding images from scratch in the CRON job
* automatically upgrading all requirements to test against their newest versions
* pushing production images to GitHub Packages as cache
* pushing the nightly tag to GitHub
Originally Breeze was used to run unit and integration tests, then more
recently system tests, and now we are making it a bit friendlier for
testing your DAGs. You can now install any older airflow version in
Breeze via the --install-airflow-version switch; the "files/dags" folder
is mounted to "/files/dags" and DAGs are read from that folder.
This change introduces sub-commands in the breeze tool.
This is much needed, as we have many commands now
and it was difficult to separate commands from flags.
The --help output was also very long and unreadable.
With this change it is much easier to discover
what breeze can do for you, as well as to navigate it.
Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
Co-authored-by: Kamil Breguła <mik-laj@users.noreply.github.com>
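Discovery now follows the usual sub-command pattern, roughly:

```bash
# Illustrative only - short top-level help plus per-sub-command help.
./breeze --help
./breeze kind-cluster --help   # example sub-command
```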
We will run system tests on the back-ported operators for the 1.10.*
series of airflow. For that we need support for running system tests
using pytest markers and for reading environment variables passed from
the HOST machine (to pass credentials).
This is the first step to automating system test execution.
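A hedged sketch of what this enables (the marker, paths and variable
names are illustrative):

```bash
# Illustrative only - credentials come from the HOST environment and
# the system tests are selected via a pytest marker.
export GCP_SERVICE_ACCOUNT_KEY_PATH=/files/keys/my-key.json
pytest -m system tests/providers/google/
```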
* Add DAG return type
* Add airflow dags test CLI command for running a complete DAG with the CLI
* Add a note about the new command to updating.md
* Add an airflow dags test example to testing.rst
* Fix an rst inline code mistake
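Example usage of the new command (DAG id and date are examples):

```bash
# Run a complete DAG, start to finish, from the CLI
airflow dags test example_bash_operator 2020-01-01
```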
You can now choose which integrations you want to start when you run
Breeze, as well as when the CI tests are run. By default, Breeze
and CI now run without integrations, but you can add them via
Breeze flags or via environment variables when CI is executed.
We now have pytest markers that mark tests to be run for the
selected integration, backend and runtime.
We also have more test jobs now - a separate test run
for all non-integration tests (with less memory used, since the
integrations are not started) and separate jobs that run only the
integration tests (more memory used for the integrations, but far
fewer tests to run).
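For example (the integration names are examples):

```bash
# Illustrative only - start Breeze with selected integrations enabled
./breeze --integration redis --integration cassandra

# Illustrative only - select only the tests needing a given integration
pytest -m integration --integration redis tests/
```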
We have far too much bash code around that is not automatically tested.
This is the first step towards changing that (simplifications and more
tests are coming soon).
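One possible shape for such tests, as a sketch (assuming a bash test
framework like BATS; the helper under test and its path are
hypothetical):

```bash
#!/usr/bin/env bats
# Illustrative only - a BATS unit test for a sourced bash helper.
# "get_airflow_version" is a hypothetical function name.

setup() {
  source "$(dirname "$BATS_TEST_FILENAME")/../helpers.sh"  # hypothetical
}

@test "helper prints a version string" {
  run get_airflow_version
  [ "$status" -eq 0 ]
  [[ "$output" =~ ^[0-9]+\.[0-9]+ ]]
}
```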
Adds a new executor that is meant to be used mainly
for debugging and DAG development purposes. This
executor executes a single task instance at a time and
is able to work with SQLite and sensors.
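A hedged example of exercising it (pairing it with 'airflow dags
test' is illustrative, not the only way to use it):

```bash
# Illustrative only - switch the executor via env var and run a DAG
# locally against a SQLite metadata database.
export AIRFLOW__CORE__EXECUTOR=DebugExecutor
airflow dags test example_bash_operator 2020-01-01
```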