Commit graph

64 commits

Author SHA1 Message Date
Jarek Potiuk 1f67edd127 Run kubernetes tests in parallel (#15222)
(cherry picked from commit ea0710edc1)
2021-04-15 14:00:32 +01:00
Jarek Potiuk dcf7f9cffb Adds 'Trino' provider (with lower memory footprint for tests) (#15187)
While checking the test status of various CI tests we came to
the conclusion that Presto integration took a lot of memory (~1GB)
and was the main source of failures during integration tests,
especially with MySQL8. The attempt to fine-tune the memory
used led to the discovery that Presto DB stopped
publishing their Docker image (prestosql/presto) - apparently
in the aftermath of splitting-off Trino from Presto.

The split-off was already discussed in #14281 and it was planned
to add support for Trino (which is the more community-driven
fork of Presto - Presto remained under Facebook governance,
whereas Trino is an effort continued by the original creators).

You can read more about it in the announcement:
https://trino.io/blog/2020/12/27/announcing-trino.html. While
Presto continues their way under The Linux Foundation, Trino
lives its own life and keeps on maintaining all artifacts and
libraries (including the image). That allowed us to update
our tests and decrease the memory footprint by around 400MB.

This commit:

* adds the new Trino provider
* removes `presto` integration and replaces it with `trino`
* the `trino` integration image is built with 400MB less memory
  requirements and published as `apache/airflow:trino-*`
* moves the integration tests from Presto to Trino

Fixes: #14281
(cherry picked from commit eae22cec9c)
2021-04-15 14:00:32 +01:00
Jarek Potiuk 0b881c566f Running tests in parallel (#14915)
This is by far the biggest improvement of the test execution time
we can get now when we are using self-hosted runners.

This change drives down the time of executing all tests on
self-hosted runners from ~ 50 minutes to ~ 13 minutes. This is due to
the heavy parallelisation we can implement for different test types
and the fact that our machines for self-hosted runners are far more
capable - they have more CPU and more memory, and
we are using tmpfs for everything.

This change will also drive the cost of our self-hosted runners
down. Since we have auto-scaling infrastructure we will simply need
the machines to run tests for far shorter time. Since the number
of test jobs we run on those self-hosted runners is substantial
(10 jobs), we are going to save ~ 6 build hours per one PR/merged
commit!

This also allows the developers to use the power of their
development machines - when you use
`./scripts/ci/testing/ci_run_airflow_testing.sh` the script
detects how many CPU cores are available and runs as
many parallel test types as you have cores.

Also in case of Integration tests - they require more memory to run
all the integrations, so in case there is less than ~ 32 GB of RAM
available to Docker, the integration tests are run sequentially
at the end. This drives stability up for machines with lower memory.
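
The scheduling rule described above can be sketched as follows. This is only an illustration, not the logic of `ci_run_airflow_testing.sh` itself: the function name `plan_test_run`, the test-type names and the exact 32 GB cut-off are assumptions made for the example.

```python
import os


def plan_test_run(test_types, cpu_cores=None, docker_memory_gb=0.0,
                  integration_min_memory_gb=32.0):
    """Split test types into a parallel batch and a sequential tail.

    Hypothetical helper: names and the threshold are illustrative only.
    """
    if cpu_cores is None:
        # Fall back to 1 if the core count cannot be determined.
        cpu_cores = os.cpu_count() or 1

    parallel = list(test_types)
    sequential = []

    # With too little memory, Integration tests are held back and run
    # on their own after the parallel batch finishes.
    if "Integration" in parallel and docker_memory_gb < integration_min_memory_gb:
        parallel.remove("Integration")
        sequential.append("Integration")

    # Run at most one test type per available core.
    max_parallel_jobs = max(1, min(cpu_cores, len(parallel)))
    return {
        "parallel": parallel,
        "sequential": sequential,
        "max_parallel_jobs": max_parallel_jobs,
    }
```

Calling `plan_test_run(["Core", "Providers", "Integration"], cpu_cores=8, docker_memory_gb=16)` would keep Core and Providers in the parallel batch and push Integration to the sequential tail.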

On one personal PC (64GB RAM, 8 CPUS/16 cores, fast SSD) the full
test suite execution went down from 30 minutes to 5 minutes.

Progress information is printed every 10 seconds while
either parallel or sequential tests are run, and the full output is
shown at the end - failed tests are marked in red groups, and successful ones are
marked in green groups. This makes it easier to see and analyse errors.

(cherry picked from commit 01a5d36e6b)
2021-04-15 14:00:31 +01:00
Jarek Potiuk 12ad529a3a Remove Backport Providers (#14886)
We are removing support for Backport Providers now.

The last release of the Backport Providers was sent yesterday,
on 17 March 2021, as planned.

As agreed before, and documented here:
https://github.com/apache/airflow/blob/master/dev/PROJECT_GUIDELINES.md#support-for-backport-providers

> Backport providers within 1.10.x, will be supported for critical fixes
for three months (March 17, 2021) from Airflow 2.0.0 release date (Dec
17, 2020).

For future reference, if anyone would like to build backport
providers by cherry-picking any fixes, the branch to start from is
`legacy-backport-cutoff-point`. The documentation and tools to build the
backports are there, but there will be no more community releases for
backports.

Good Bye Backport Providers.

(cherry picked from commit 68e4c4dcb0)
2021-04-15 14:00:31 +01:00
Jarek Potiuk f991734586 Remove Heisentest category and quarantine test_backfill_depends_on_past (#14756)
The whole Backfill class was in Heisentest but only one of those tests
is problematic now: test_backfill_depends_on_past. Therefore it makes
sense to remove the class from heisentests and move the
depends_on_past to quarantine.

It turned out that this is the last "Heisentest" and with the
isolation we now have coming in parallel tests, it turns out that
Heisentests are not really a good way of thinking about the tests - running
them in isolation does not often help, it only makes it more difficult
to flag the tests as flaky.

The quarantined test_backfill_depends_on_past has been captured in
the #14755 issue - and hopefully we will make an effort to
de-quarantine some of those tests soon.

(cherry picked from commit 4ce952e7c2)
2021-04-15 14:00:30 +01:00
Joshua Carp 25e91c85af Use plain asserts in tests. (#12951)
(cherry picked from commit 39d9057984)
2021-01-21 19:53:16 +00:00
Jarek Potiuk b0d1411e07 Introduces separate runtime provider schema (#13488)
The provider.yaml contains more information than required at
runtime (specifically about documentation building). Those
fields are not needed at runtime and their presence is optional.
Also the runtime check for provider information should be more
relaxed and allow for future compatibility (with
additional properties set to false). This way we can add new,
optional fields to provider.yaml without worrying about breaking
future-compatibility of providers with future airflow versions.

This change restores 'additionalProperties': false in the
main, development-focused provider.yaml schema and introduces a
new runtime schema that is used to verify the provider info when
providers are discovered by airflow.

This 'runtime' version should change very rarely, as a change that
adds a new required property to it breaks compatibility of
providers with already released versions of Airflow.

We also trim-down the provider.yaml file when preparing provider
packages to only contain those fields that are required in the
runtime schema.
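
The trimming step can be pictured with a short sketch. It assumes a provider description already loaded into a dictionary; the field list in `RUNTIME_FIELDS` and both helper names are hypothetical and stand in for whatever the real runtime schema requires.

```python
# Hypothetical set of fields kept in the runtime schema; the real list
# lives in Airflow's runtime provider schema and may differ.
RUNTIME_FIELDS = ("package-name", "name", "description", "versions")


def trim_to_runtime_fields(provider_info, runtime_fields=RUNTIME_FIELDS):
    """Return a copy of provider_info holding only runtime fields."""
    return {key: value for key, value in provider_info.items()
            if key in runtime_fields}


def missing_required(provider_info, required=RUNTIME_FIELDS):
    """List required runtime fields absent from provider_info."""
    return [key for key in required if key not in provider_info]
```

Documentation-only entries are dropped by `trim_to_runtime_fields`, while `missing_required` shows why adding a new required field is a breaking change: provider packages that were released before the field existed would fail the check.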

(cherry picked from commit ad2a030b9e)
2021-01-21 19:30:23 +00:00
Kamil Breguła 1a4c651267 Add integration tests for Apache Pinot (#13195)
* Add integration tests for Apache Pinot

* fixup! Add integration tests for Apache Pinot

* fixup! fixup! Add integration tests for Apache Pinot

* fixup! fixup! fixup! Add integration tests for Apache Pinot

* fixup! fixup! fixup! fixup! Add integration tests for Apache Pinot

* Update setup.cfg

(cherry picked from commit 98f097e542)
2021-01-21 18:38:23 +00:00
Kamil Breguła 2f65a1080f Fix typos and minor simplification in TESTING.rst (#13194)
(cherry picked from commit 97eee350e4)
2021-01-21 18:04:21 +00:00
Kaxil Naik 7c866a2262 Fix typos in TESTING.rst (#13169)
Since TESTING.rst is not published on Apache Site, we don't run spell check on it and hence there were some typos introduced without getting noticed.

Time to fix them

(cherry picked from commit 5cf2fbf124)
2021-01-21 17:57:11 +00:00
Jarek Potiuk ed1825c026
Production images on CI are now built from packages (#12685)
So far, the production images of Airflow were using sources
when they were built on CI. This PR changes that, to build
airflow + providers packages first and install them
rather than use sources as installation mechanism.

Part of #12261
2020-12-06 23:36:33 +01:00
Jarek Potiuk cbd6daf5e6
All kubernetes tests use the same host python version (#12374)
For Kubernetes tests all tests can be executed in the same python
version - default one - no matter which PYTHON_MAJOR_MINOR is
used. This is because we are testing Airflow which is deployed
via production image. Thanks to that we can fix the python version
to be default and avoid any python version problems (this is
especially important for cherry-picking to 1.10 where we have
python 2.7 and 3.5).
2020-11-15 14:20:22 +01:00
Jarek Potiuk 21999dd56e
Added k9s as integrated tool to help with kubernetes testing (#12163)
K9s is a fantastic tool that helps to debug a running k8s
instance. It is a terminal-based windowed CLI that makes you
several times more productive compared to using kubectl
commands. We've integrated k9s (it is run as a docker container
and downloaded on demand). We've also separated out KUBECONFIG
of the integrated kind cluster so that it does not mess with
kubernetes configuration you might already have.

Also - together with that, the "surroundings" of the kubernetes
tests were simplified and improved so that the k9s integration
can be utilized well. Instead of kubectl port forwarding (which
caused a multitude of problems) we are now utilizing kind's
portMapping feature + custom NodePort resource that maps
port 8080 to 30007 NodePort which in turn maps it to 8080
port of the Webserver. This way we do not have to establish
an external kubectl port forward which is prone to error and
management - everything is brought up when Airflow gets
deployed to the Kind Cluster and shuts down when the Kind
cluster is stopped.
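
The port chain described above (host 8080, NodePort 30007, webserver 8080) can be sketched as two small structures. This is an illustration under assumptions rather than the project's actual configuration: the kind setting is shown under the `extraPortMappings` key of a kind cluster config, and the service name and selector label are hypothetical.

```python
# Host port -> kind node port (kind cluster config, extraPortMappings).
kind_cluster_config = {
    "kind": "Cluster",
    "apiVersion": "kind.x-k8s.io/v1alpha4",
    "nodes": [{
        "role": "control-plane",
        "extraPortMappings": [{"containerPort": 30007, "hostPort": 8080}],
    }],
}

# Node port -> webserver port (Kubernetes Service of type NodePort).
webserver_service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "airflow-webserver-node-port"},  # hypothetical name
    "spec": {
        "type": "NodePort",
        "selector": {"component": "webserver"},  # hypothetical label
        "ports": [{"port": 8080, "targetPort": 8080, "nodePort": 30007}],
    },
}


def resolve_host_port(host_port):
    """Follow a host port through both mappings to the pod's port."""
    for node in kind_cluster_config["nodes"]:
        for mapping in node.get("extraPortMappings", []):
            if mapping["hostPort"] != host_port:
                continue
            for port in webserver_service["spec"]["ports"]:
                if port["nodePort"] == mapping["containerPort"]:
                    return port["targetPort"]
    return None
```

Because both mappings are declared as part of the cluster and the deployment, nothing has to be started or stopped separately, which is the point made above about avoiding an external port forward.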

Yet another problem fixed was killing of postgres by one of the
kubernetes tests ('test_integration_run_dag_with_scheduler_failure').
Instead of just killing the scheduler it killed all pods - including
the Postgres one (it was named 'airflow-postgres.*'). That caused
various problems, as the database could be left in a strange state.
I changed the test to do what it claimed to be doing - so killing only the
scheduler during the test. This seemed to improve the stability
of tests immensely in my local setup.
2020-11-11 17:15:02 +01:00
Jarek Potiuk 57b273a0b1
Fixed path of the test_core.py file in docs (#12191)
The test_core.py has been used as an example in Breeze and its
location changed to the tests/core folder. This PR fixes references
to the changed location.
2020-11-09 10:34:06 +01:00
Kaxil Naik 186a368d9a
Fix Helm Chart Testing guide (#11909) 2020-10-28 12:26:31 +00:00
Daniel Imberman 0d1ad6648e
Add Python Helm testing framework (#11693)
* Helm Python Testing

* helm change

* add back args
2020-10-27 18:29:47 -07:00
Jarek Potiuk 32f2a45819
Rename backport packages to provider packages (#11459)
In preparation for adding provider packages to 2.0 line we
are renaming backport packages to provider packages.

We want to implement this in stages - first to rename the
packages, then split-out backport/2.0 providers as part of
the #11421 issue.
2020-10-12 16:29:48 +02:00
Jarek Potiuk 5bc5994c2c
Split tests to more sub-types (#11402)
We seem to have a problem with running all tests at once - most
likely due to some resource problems in our CI, therefore it makes
sense to split the tests into more batches. This is not yet a full
implementation of selective tests but it is going in this direction
by splitting to Core/Providers/API/CLI tests. The full selective
tests approach will be implemented as part of #10507 issue.

This split is possible thanks to #10422 which moved building image
to a separate workflow - this way each image is only built once
and it is uploaded to a shared registry, where it is quickly
downloaded from rather than built by all the jobs separately - this
way we can have many more jobs as there is very little per-job
overhead before the tests start running.
2020-10-11 07:40:31 -07:00
Jarek Potiuk b746f33fc6
Removes stable tests from quarantine (#10768)
We've observed the tests for the last couple of weeks and it seems
most of the tests marked with "quarantine" marker are succeeding
in a stable way (https://github.com/apache/airflow/issues/10118)
The removed tests have a success ratio of > 95% (20 runs without
problems) and this has been verified a week ago as well,
so it seems they are rather stable.
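
A stability check of this kind could look like the sketch below. The function name `is_stable` and its parameters are assumptions for illustration; only the two figures (20 runs, 95%) come from the text above.

```python
def is_stable(run_results, min_runs=20, min_success_ratio=0.95):
    """Decide whether a quarantined test looks stable enough to release.

    run_results is a list of booleans, most recent last (True = passed).
    Thresholds mirror the figures quoted above but are assumptions here.
    """
    if len(run_results) < min_runs:
        # Not enough history to judge.
        return False
    recent = run_results[-min_runs:]
    return sum(recent) / len(recent) > min_success_ratio
```

With a window of 20 runs, a ratio strictly above 95% means all 20 runs passed, which matches the "20 runs without problems" wording.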

There are literally a few that are either failing or causing
the Quarantined builds to hang. I manually reviewed the
master tests that failed for the last few weeks and added the
tests that are causing the build to hang.

Seems that stability has improved - which might be caused
by some temporary problems when we marked the quarantined builds
or too "generous" way of marking tests as quarantined, or
maybe improvement comes from the #10368 as the docker engine
and machines used to run the builds in GitHub experience far
less load (image builds are executed in separate builds) so
it might be that resource usage is decreased. Another reason
might be GitHub Actions stability improvements.

Or simply those tests are more stable when run in isolation.

We might still add failing tests back as soon as we see them behave
in a flaky way.

The remaining quarantined tests that need to be fixed:
 * test_local_run (often hangs the build)
 * test_retry_handling_job
 * test_clear_multiple_external_task_marker
 * test_should_force_kill_process
 * test_change_state_for_tis_without_dagrun
 * test_cli_webserver_background

We also move some of those tests to the "heisentests" category.
Those tests run fine in isolation but fail
the builds when run with all other tests:
 * TestImpersonation tests

We might find that those heisentests can be fixed but for
now we are going to run them in isolation.

Also - since those quarantined tests are failing more often
the "num runs" to track for those has been decreased to 10
to keep track of 10 last runs only.
2020-09-08 07:36:12 +02:00
Kamil Breguła 2ca615cffe
Update Google Cloud branding (#10642) 2020-08-29 23:36:52 +02:00
Jarek Potiuk 3357d8dad2
Fix port number in webserver for kind setup (#10452) 2020-08-21 20:29:09 +02:00
Tomek Urbaszek 95632ce8ed
Fix dag.clear usages after change from #9824 (#9909)
#9824 introduced changes in the signature of dag.clear(...) 
but not all occurrences of invocation were adjusted.
2020-07-21 12:47:39 +02:00
Jarek Potiuk faec41ec9a
Group CI scripts in subdirectories (#9653)
Reviewed the scripts and removed some of the old unused ones.
2020-07-16 18:05:35 +02:00
Jarek Potiuk f3e1f9a313
Update Breeze documentation (#9608)
* Update Breeze documentation
2020-07-01 16:02:24 +02:00
Jarek Potiuk 8bd15ef634
Switches to Helm Chart for Kubernetes tests (#9468)
The Kubernetes tests are now run using Helm chart
rather than the custom templates we used to have.

The Helm Chart uses a locally built production image
so the tests are testing not only Airflow but also
Helm Chart and a Production image - all at the
same time. Later on we will add more tests
covering more functionalities of both Helm Chart
and Production Image. This is the first step to
get all of those bundled together and
testable.

This change also introduces 'shell' sub-command
for Breeze's kind-cluster command and
EMBEDDED_DAGS build args for production image -
both of them useful to run the Kubernetes tests
more easily - without building two images
and with an easy-to-iterate-over-tests
shell command - which works without any
other development environment.

Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
Co-authored-by: Daniel Imberman <daniel@astronomer.io>
2020-07-01 14:50:30 +02:00
Jarek Potiuk 7c12a9d4e0
Improve production image iteration speed (#9162)
For a long time the way the entrypoint worked in CI scripts
was wrong. It was convoluted and little short of black
magic. This did not allow passing multiple test targets and
required separate execute command scripts in Breeze.

This is all now straightened out and both production and
CI image are always using the right entrypoint by default
and we can simply pass parameters to the image as usual without
escaping strings.

This also allowed us to remove some breeze commands and
change names of several flags in Breeze to make them more
meaningful.

Both CI and PROD images now have embedded scripts for log
cleaning.

History of image releases is added for 1.10.10-*
alpha quality images.
2020-06-16 12:36:46 +02:00
Jarek Potiuk a39e9a3520
Replaces cloud-provider CLIs in CI image with scripts running containers (#9129)
The CLIs are replaced with scripts that will pull and run
docker images when they are needed.

Added Azure CLI as well.

Closes: #8946 #8947 #8785
2020-06-04 19:12:09 +02:00
Jarek Potiuk ff5dcccbbd
Kubernetes Cluster is started on host not in the container (#8265)
Tests requiring Kubernetes Cluster are now moved out of
the regular CI tests and moved to "kubernetes_tests" folder
so that they can be run entirely on host without having
the CI image built at all. They use production image
to run the tests on KinD cluster and we add tooling
to start/stop/deploy the application to the KinD cluster
automatically - for both CI testing and local development.

This is a pre-requisite to convert the tests to
use the official Helm Chart and Docker images of
Apache Airflow.

It closes #8782
2020-06-03 20:58:38 +02:00
Adam Dobrawy f3456b125f
Fix formatting code block in TESTING.rst (#8985) 2020-05-23 11:43:53 +02:00
Jarek Potiuk 92585ca4cb
Added automated release notes generation for backport operators (#8807)
We now have a mechanism to keep release notes updated for the
backport operators in an automated way.

It really nicely generates all the necessary information:

* summary of requirements for each backport package
* list of dependencies (including extras to install them) when package
  depends on other providers packages
* table of new hooks/operators/sensors/protocols/secrets
* table of moved hooks/operators/sensors/protocols/secrets with
  information where they were moved from
* changelog of all the changes to the provider package (this will be
  automatically updated with incremental changelog whenever we decide to
  release separate packages).

The system is fully automated - we will be able to produce release notes
automatically (per-package) whenever we decide to release new version of
the package in the future.
2020-05-15 19:00:15 +02:00
Jarek Potiuk 791d1a786f
Backport packages are renamed to include backport in their name (#8767) 2020-05-09 14:09:12 +02:00
Kamil Breguła b7566e16d6
Add SQL query tracking for pytest (#8754) 2020-05-08 06:36:27 +02:00
Felix Uellendall ff5b70149b
Add google_api_to_s3_transfer example dags and system tests (#8581)
- add amazon system helper for easier testing amazon aws systems / services
- fix TESTING docs
2020-05-07 09:32:29 +02:00
Jarek Potiuk 0de597f95f
The CRON job now is working and triggers builds on DockerHub (#8549)
The CRON job from previous runs did not have everything working
after the emergency migration to GitHub Actions.

This change brings back the following improvements:

* rebuilding images from scratch in CRON job
* automatically upgrading all requirements to test if they are new
* pushing production images to GitHub packages as cache
* pushing nightly tag to GitHub
2020-04-26 00:33:39 +02:00
Felix Uellendall 1ea9fa758a
Fix --forward-credentials flag in Breeze (#8554) 2020-04-25 15:51:25 +02:00
Jarek Potiuk ffcbb22c93
Move some tests to quarantine (#8511) 2020-04-23 08:51:56 +02:00
Jarek Potiuk de453a6710
List of integrations is now maintained in one place. (#8496) 2020-04-22 14:38:56 +02:00
Jarek Potiuk bd7f63b39f
Get rid of Travis CI from the docs (#8488) 2020-04-21 17:27:09 +02:00
Xinbin Huang 3f9f845cd9
Mount ${HOME}/.aws in breeze environment if --forward-credentials (#8183) 2020-04-08 19:09:58 +08:00
Jarek Potiuk 07fd0d71c8
Add Production Docker image support (#7832) 2020-04-02 18:52:11 +01:00
yajna pandith d33c498ef5
Update TESTING.rst (#8029)
Updating TESTING.rst with minor grammatical corrections
2020-03-31 17:43:50 +02:00
Kamil Breguła d372f230fb
[AIRFLOW-XXXX] Add guide for Travis CI and IDE setup (#7625) 2020-03-23 00:05:21 +01:00
Kamil Breguła e054bbcde6
[AIRFLOW-XXXX] Fix typo in ci_prepare_backport_packages.sh (#7778) 2020-03-20 14:47:27 +01:00
Jarek Potiuk 6b3b8a4268
[AIRFLOW-XXXX] document system tests mechanism better (#7774) 2020-03-20 11:00:10 +01:00
Kamil Breguła a6e5bcd591
[AIRFLOW-6972] Shorter frequently used commands in Breeze (#7608) 2020-03-04 01:21:11 +01:00
Jarek Potiuk d0d8732a84
[AIRFLOW-6932] Add restart-environment command to Breeze (#7557)
When you switch between versions of Airflow installed, you want to delete the
database so that the scripts for resetdb work
2020-02-27 10:44:43 +01:00
Jarek Potiuk 83b60f0946
[AIRFLOW-6919] Make Breeze DAG-test friendly (#7539)
Originally Breeze was used to run unit and integration tests, recently system
tests and finally we make it a bit more friendly to test your DAGs there. You
can now install any older airflow version in Breeze via
--install-airflow-version switch and "files/dags" folder is mounted to
"/files/dags" and this folder is used to read the dags from.
2020-02-26 11:11:53 +01:00
Jarek Potiuk 20b6b34392
[AIRFLOW-6838] Introduce real subcommands for Breeze (#7515)
This change introduces sub-commands in breeze tool.
It is much needed as we have many commands now
and it was difficult to separate commands from flags.

Also --help output was very long and unreadable.

With this change it is much easier to discover
what breeze can do for you as well as navigate with it.

Co-authored-by: Jarek Potiuk <jarek@potiuk.com>

Co-authored-by: Kamil Breguła <mik-laj@users.noreply.github.com>
2020-02-24 22:31:50 +01:00
Matt Buell b4ce8f22f6
[AIRFLOW-XXXX] correct path to deploy_airflow_to_kubernetes.sh in TESTING.rst (#7522) 2020-02-24 17:09:30 +00:00
Jarek Potiuk 848fbab5bd
[AIRFLOW-6763] Make systems tests ready for backport tests (#7389)
We will run system tests on back-ported operators for 1.10* series of airflow
and for that we need to have support for running system tests using pytest's
markers and reading environment variables passed from HOST machine (to pass
credentials).

This is the first step to automate system tests execution.
2020-02-21 18:25:32 +01:00