to control which parts of the tests are run during the CI build. This is implemented by the
``scripts/ci/selective_ci_checks.sh`` script in our repository. This script analyses which part of the
code has changed and based on that it sets the right outputs that control which tests are executed in
the CI build, and whether we need to build the CI images necessary to run those steps. This allowed us to
heavily decrease the strain, especially for Pull Requests that do not touch code (in which case
the builds can complete in < 2 minutes), but also by limiting the number of tests executed in PRs that do
not touch the "core" of Airflow, or that only touch some standalone parts of Airflow such as
"Providers", "WWW" or "CLI". This solution is not yet perfect as there are likely some edge cases, but
it is easy to maintain and we have an escape hatch: all the tests are always executed in master pushes,
so contributors can easily spot if there is a "missed" case and fix it, both by fixing the problem and
by adding those exceptions to the code. More about it can be found in the
`Selective CI checks <#selective-ci-checks>`_ chapter.
3) Even more optimisation came from limiting the scope of tests to only "default" matrix parameters. Until
now, in Airflow we always ran all tests for all matrix combinations. The primary matrix components are:
* Python versions (currently 3.6, 3.7, 3.8)
* Backend types (currently MySQL/Postgres)
* Backend versions (currently MySQL 5.7, MySQL 8, Postgres 9.6, Postgres 13)
We've decided that instead of running all the combinations of parameters for all matrix components we will
only run the default values (Python 3.6, MySQL 5.7, Postgres 9.6) for all PRs which are not yet approved by
the committers. This has the nice effect that the full set of tests (though with limited combinations of
the matrix) is still run in the CI for every Pull Request that needs tests at all, allowing the
contributors to make sure that their PR is "good enough" to be reviewed.
Even after approval, the automated workflows we've implemented check whether the PR seems to need the
"full test matrix" and provide helpful information to both contributors and committers in the form of
explanatory comments and automatically set labels showing the status of the PR. Committers still have
control over whether they want to merge such requests automatically, ask for a rebase, or re-run the tests
and run "full tests" by applying the "full tests needed" label and re-running such a request.
The "full tests needed" label is also applied automatically after approval when the change touches
the "core" of Airflow. In addition, a separate check is added to the PR so that the "merge" button status
indicates to the committer that full tests are still needed. The committer might still decide
whether to merge such a PR without the "full matrix". The "escape hatch" we have, i.e. running the full
matrix of tests in the "merge push", will enable committers to catch and fix such problems quickly.
More about it can be found in the `Approval workflow and Matrix tests <#approval-workflow-and-matrix-tests>`_
chapter.
4) We've also applied for (and received) funds to run self-hosted runners. This is not yet implemented, due to
discussions about the security of self-hosted runners for public repositories. Running self-hosted runners by
public repositories is currently (as of end of October 2020)
`discouraged by GitHub <https://docs.github.com/en/free-pro-team@latest/actions/hosting-your-own-runners/about-self-hosted-runners#self-hosted-runner-security-with-public-repositories>`_
and we are working on solving the problem, also involving the Apache Software Foundation infrastructure team.
This document does not describe this part of the approach. Most likely we will soon add a document
describing the details of the approach taken there.
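As a rough illustration of the matrix limiting described in point 3) above, the following sketch builds either the "default" or the "full" set of combinations. The function name ``build_matrix`` and the ``full_tests_needed`` flag are hypothetical, and the sketch assumes a plain cross product of Python versions and backend versions; the actual workflows may exclude some combinations.

```python
from itertools import product

# Matrix values as listed in the text; the first entry of each list is
# treated as the "default" value (Python 3.6, MySQL 5.7, Postgres 9.6).
PYTHON_VERSIONS = ["3.6", "3.7", "3.8"]
BACKEND_VERSIONS = {
    "mysql": ["5.7", "8"],
    "postgres": ["9.6", "13"],
}


def build_matrix(full_tests_needed):
    """Return (python, backend, backend_version) tuples to run (hypothetical helper)."""
    pythons = PYTHON_VERSIONS if full_tests_needed else PYTHON_VERSIONS[:1]
    combinations = []
    for backend, versions in BACKEND_VERSIONS.items():
        selected = versions if full_tests_needed else versions[:1]
        combinations.extend(
            (python, backend, version)
            for python, version in product(pythons, selected)
        )
    return combinations


default_matrix = build_matrix(full_tests_needed=False)
full_matrix = build_matrix(full_tests_needed=True)
```

Under these assumptions, a PR that has not been approved yet runs two combinations (one per backend type), while the full matrix contains twelve.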
Selective CI Checks
-------------------
In order to optimise our CI builds, we've implemented optimisations to only run selected checks for some
kinds of changes. The logic implemented reflects the internal architecture of Airflow 2.0 packages
and it helps to keep down both the usage of jobs in GitHub Actions and the CI feedback time for
contributors in case of simpler changes.
We have the following test types (separated by packages in which they are):
* Always - those are tests that should be always executed (always folder)
* Core - for the core Airflow functionality (core folder)
* API - Tests for the Airflow API (api and api_connexion folders)
* CLI - Tests for the Airflow CLI (cli folder)
* WWW - Tests for the Airflow webserver (www folder, and www_rbac folder in 1.10)
* Providers - Tests for all Providers of Airflow (providers folder)
* Other - all other tests (all other folders that are not part of any of the above)
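The mapping between folders and test types listed above can be sketched as a simple lookup. This is only an illustration: the ``tests`` root folder, the ``FOLDER_TO_TEST_TYPE`` table and the ``classify_test_file`` helper are assumptions made for the example and are not taken from the CI scripts.

```python
from pathlib import PurePosixPath

# Folder names come from the list above; anything else falls back to "Other".
FOLDER_TO_TEST_TYPE = {
    "always": "Always",
    "core": "Core",
    "api": "API",
    "api_connexion": "API",
    "cli": "CLI",
    "www": "WWW",
    "www_rbac": "WWW",  # only present in the 1.10 line
    "providers": "Providers",
}


def classify_test_file(path, tests_root="tests"):
    """Return the test type for a test file path (hypothetical helper)."""
    parts = PurePosixPath(path).parts
    if len(parts) < 2 or parts[0] != tests_root:
        raise ValueError(f"{path!r} is not located under {tests_root!r}")
    # The first folder below the tests root decides the test type.
    return FOLDER_TO_TEST_TYPE.get(parts[1], "Other")
```

For example, ``classify_test_file("tests/providers/google/test_example.py")`` yields ``"Providers"`` in this sketch, while a file in a folder not listed in the table yields ``"Other"``.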
We also have several special kinds of tests that are not separated by packages but are marked with
pytest markers. They can be found in any of those packages and they can be selected by the appropriate
pytest custom command line options. See `TESTING.rst <TESTING.rst>`_ for details. Those are:
* Integration - tests that require external integration images running in docker-compose
* Heisentests - tests that are vulnerable to some side effects and are better to be run on their own
* Quarantined - tests that are flaky and need to be fixed
* Postgres - tests that require a Postgres database. They are only run when the backend is Postgres
* MySQL - tests that require a MySQL database. They are only run when the backend is MySQL
Even though the types are separated, in case they share the same backend version/Python version, they are
run sequentially in the same job, on the same CI machine. Each of them runs in a separate ``docker run``
command, with additional docker cleaning between the steps. This avoids the trap of exceeding resource
usage in one big test run, without increasing the number of jobs per Pull Request.
The logic implemented for the changes works as follows:
1) In case of a direct push (i.e. when a PR gets merged) or a scheduled run, we always run all tests and checks.
This is in order to make sure that the merge did not miss anything important. The remainder of the logic
is executed only in case of Pull Requests.
2) We retrieve which files have changed in the incoming Merge Commit (github.sha is a merge commit
automatically prepared by GitHub in case of Pull Request, so we can retrieve the list of changed
files from that commit directly).
3) If any of the important environment files changed (Dockerfile, ci scripts, setup.py, GitHub workflow
files), then we again run all tests and checks. Those are cases where the logic of the checks changed
or the environment for the checks changed, so we want to make sure to check everything.
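The first and third rules above can be sketched as a single decision function. This is a minimal illustration and not the contents of ``scripts/ci/selective_ci_checks.sh``: the function name ``needs_all_tests`` and the concrete file names and path prefixes are assumptions chosen to match the description, and the list of changed files (step 2) is simply passed in as an argument.

```python
from typing import Iterable

# Hypothetical stand-ins for the "important environment files" named in the text.
ENVIRONMENT_FILES = ("Dockerfile", "setup.py")
ENVIRONMENT_DIR_PREFIXES = ("scripts/ci/", ".github/workflows/")


def needs_all_tests(event_name: str, changed_files: Iterable[str]) -> bool:
    """Decide whether all tests and checks should run (hypothetical helper)."""
    # Rule 1: direct pushes (merged PRs) and scheduled runs always run everything.
    if event_name in ("push", "schedule"):
        return True
    # Rule 3: a change to the CI environment or to the check logic itself
    # also triggers the complete set of tests and checks.
    return any(
        path in ENVIRONMENT_FILES or path.startswith(ENVIRONMENT_DIR_PREFIXES)
        for path in changed_files
    )
```

In this sketch a Pull Request that only touches documentation returns ``False`` and is left to the selective logic, whereas one that modifies ``setup.py`` or anything under the assumed ``scripts/ci/`` prefix returns ``True``.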
* `Label when approved <https://github.com/TobKed/label-when-approved-action>`_
Next steps
----------
We are also planning to propose the approach to other projects from the Apache Software Foundation to
make it a common approach, so that our effort is not limited to only one project.
The discussion about it can be found in `this thread <https://lists.apache.org/thread.html/r1708881f52adbdae722afb8fea16b23325b739b254b60890e72375e1%40%3Cbuilds.apache.org%3E>`_.