Commit Graph

151 Commits

Author SHA1 Message Date
Jarek Potiuk 2c0920fba5
Adds mechanism for provider package discovery. (#12383)
This is a simple mechanism that will allow us to dynamically
discover and register all provider packages in the Airflow core.

Closes: #11422
2020-11-17 18:48:57 +01:00
Kamil Breguła 2cda2f2a0a
Add missing pre-commit definition - provider-yamls (#12393) 2020-11-17 15:44:46 +01:00
Kamil Breguła 7825e8f590
Docs installation improvements (#12304)
* Improvements for installation docs
2020-11-13 09:38:54 +01:00
Jarek Potiuk 21999dd56e
Added k9s as integrated tool to help with kubernetes testing (#12163)
K9s is a fantastic tool that helps to debug a running k8s
instance. It is a terminal-based windowed CLI that makes you
several times more productive compared to using kubectl
commands. We've integrated k9s (it is run as a docker container
and downloaded on demand). We've also separated out KUBECONFIG
of the integrated kind cluster so that it does not mess with
kubernetes configuration you might already have.

Also - together with that, the "surrounding" of the kubernetes
tests was simplified and improved so that the k9s integration
can be utilized well. Instead of kubectl port forwarding (which
caused a multitude of problems) we are now utilizing kind's
portMapping feature + custom NodePort resource that maps
port 8080 to 30007 NodePort which in turn maps it to 8080
port of the Webserver. This way we do not have to establish
an external kubectl port forward, which is error-prone and
needs management - everything is brought up when Airflow gets
deployed to the Kind Cluster and shuts down when the Kind
cluster is stopped.

Yet another problem fixed was killing of postgres by one of the
kubernetes tests ('test_integration_run_dag_with_scheduler_failure').
Instead of just killing the scheduler it killed all pods - including
the Postgres one (it was named 'airflow-postgres.*'). That caused
various problems, as the database could be left in a strange state.
I changed the test to do what it claimed to be doing - killing only the
scheduler during the test. This seemed to improve the stability
of tests immensely in my local setup.
2020-11-11 17:15:02 +01:00
Jarek Potiuk 348510f86b
Providers in extras are properly configured and verified (#12265)
* Providers in extras are properly configured and verified

This fixes #12255 - where we published beta2 release with some
extras pulling non-existing providers.

The exact list of providers that had problems:

Wrongly named extras/providers:

* apache.presto: it was badly named -> renamed to 'presto'
* spark (badly pointing to spark instead of apache.spark)
* yandexcloud (the name remains there but we've also added 'yandex' extra to correspond 1-1 with 'yandex' provider)

Extras that were wrongly marked as having providers, where they had
none:

* dask
* rabbitmq
* sentry
* statsd
* tableau
* virtualenv

* Update scripts/ci/pre_commit/pre_commit_check_extras_have_providers.py

Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>

* Update scripts/ci/pre_commit/pre_commit_check_extras_have_providers.py

Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>

Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
2020-11-11 17:13:57 +01:00
Tomek Urbaszek 0cd1c846b2
Remove providers imports from core examples (#12252)
Core example DAGs should not depend on any non-core dependency
like providers packages.

closes: #12247

Co-authored-by: Xiaodong DENG <xd.deng.r@gmail.com>
2020-11-10 22:49:08 +01:00
John Bampton 7463b6bcc7
Add Markdown linting to pre-commit (#11465) 2020-11-10 03:37:45 +01:00
Jarek Potiuk b2a28d1590
Moves provider packages scripts to dev (#12082)
The change #10806 made airflow work with implicit packages
when "airflow" got imported. This is a good change, however
it has some unforeseen consequences. The 'provider_packages'
scripts copy all the providers code for backports in order
to refactor them to the empty "airflow" directory in
provider_packages folder. The #10806 change turned that
empty folder into an 'airflow' package because it was in the
same directory as the provider_packages scripts.

Moving the scripts to dev solves this problem.
2020-11-09 13:27:10 +01:00
Jarek Potiuk 57b273a0b1
Fixed path of the test_core.py file in docs (#12191)
The test_core.py has been used as an example in Breeze and its
location changed to the tests/core folder. This PR fixes references
to the changed location.
2020-11-09 10:34:06 +01:00
Kaxil Naik 8c42cf1b00
Use PyUpgrade to use Python 3.6 features (#11447)
Use features like `f-strings` instead of format across the code-base.
More details: https://github.com/asottile/pyupgrade
2020-11-03 21:53:59 +00:00
SZN 2354bd2be3
Checks if all the libraries in setup.py are listed in installation.rst file (#12023) 2020-11-02 14:17:41 +01:00
Daniel Imberman 0d1ad6648e
Add Python Helm testing framework (#11693)
* Helm Python Testing

* helm change

* add back args
2020-10-27 18:29:47 -07:00
Jarek Potiuk 8d94214575
Switch postgres from 10 to 13 (#11785)
Seems that postgres is really stable when it comes to upgrades,
so we take the assumption that if we test 9.6 and 13, and they
work, all the versions between will also work.

This PR changes Postgres 10 to 13 in tests and updates documentation
with all the versions in between.
2020-10-24 14:39:01 +02:00
John Bampton 172820db4d
Fix case of GitHub (#11398) 2020-10-21 14:32:41 +02:00
Jarek Potiuk ae06ad01a2
Fixes versioning for pre-release provider packages (#11586)
When we prepare pre-release versions, they are not intended to be
converted to final release versions, so there is no need to replace
version number for them artificially.

For release candidates on the other hand, we should internally use the
"final" version because those packages might be simply renamed to the
final "production" versions.

Fixes #11585
2020-10-19 12:32:07 +02:00
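
The versioning rule described in commit ae06ad01a2 above can be illustrated with a minimal sketch. This is not the code from the PR: the function name `internal_version` and the sample version strings are hypothetical, and the sketch assumes release candidates are marked with a trailing `rcN` suffix.

```python
import re


def internal_version(version: str) -> str:
    """Return the version to use inside the package (hypothetical helper).

    Release candidates (a trailing ``rcN``) are mapped to the final version,
    because such packages might simply be renamed to the final release.
    Other pre-releases (for example ``b2``) are left untouched.
    """
    return re.sub(r"rc\d+$", "", version)


if __name__ == "__main__":
    for sample in ("1.0.0rc1", "1.0.0b2", "1.0.0"):
        print(sample, "->", internal_version(sample))
```

Stripping only the `rcN` suffix keeps beta and alpha packages distinguishable from a final release, which matches the intent stated in the commit message.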
Kaxil Naik 76dd8d0367
Fix typo in BREEZE.rst (#11637) 2020-10-18 20:45:25 +02:00
Jarek Potiuk 66ced72fca
Name and optionally preserve data volumes in Breeze (#11628)
So far Breeze persisted data (mysql, redis, postgres) inside the
containers. This means that the data was kept only as long as the
containers were running. If you stopped Breeze via `stop` command
the data was always deleted.

This changes the behaviour - each of the Breeze containers has
a named volume where data is kept. Those volumes are also deleted
by default when Breeze is stopped, but you can choose to preserve
them by adding ``--preserve-volumes`` when you run ``stop`` or
``restart`` command.

Fixes: #11625
2020-10-18 16:39:44 +02:00
Tomek Urbaszek e74b861fd8
Expose flower and redis ports in breeze (#11624) 2020-10-18 11:46:22 +02:00
Jarek Potiuk 925f7619e1
Behaviour to install all airflow providers added (#11529)
In Airflow 2.0 we decided to split Airflow into separate providers.
This means that when you prepare core airflow package, providers
are not installed by default. This is not very convenient for
local development though and for docker images built from sources,
where you would like to install all providers by default.

A new INSTALL_ALL_AIRFLOW_PROVIDERS environment variable controls
this behaviour now. If it is set to "true", all packages including
provider packages are installed. If missing or set to false, only
the core provider package is installed.

For Breeze, the default is set to "true", as for those cases you
want to install all providers in your environment. Similarly if you
build the production image from sources. However when you build
image using github tag or pip package, you should specify
appropriate extras to install the required provider packages.

Note that if you install Airflow via 'pip install .' from sources
in local virtualenv, provider packages are not going to be
installed unless you set INSTALL_ALL_AIRFLOW_PROVIDERS to "true".

Fixes #11489
2020-10-17 11:16:28 +02:00
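
The switch described in commit 925f7619e1 above can be sketched as follows. Only the variable name INSTALL_ALL_AIRFLOW_PROVIDERS and its "true" / missing-or-false semantics come from the commit message; the function name, the case-insensitive comparison, and the optional `environ` parameter are assumptions made for illustration, and the real setup code may parse the value differently.

```python
import os


def should_install_all_providers(environ=None):
    """Return True only when INSTALL_ALL_AIRFLOW_PROVIDERS is set to "true".

    A missing variable, or any other value such as "false", means that
    provider packages are not installed. (Hypothetical helper.)
    """
    env = os.environ if environ is None else environ
    value = env.get("INSTALL_ALL_AIRFLOW_PROVIDERS", "false")
    # Assumption: surrounding whitespace and letter case are ignored.
    return value.strip().lower() == "true"


if __name__ == "__main__":
    print(should_install_all_providers())
```

Passing a plain dictionary as `environ` makes the decision easy to check without touching the real process environment.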
Felix Uellendall 5d4fbcebe7
Clarify breeze docs --install-airflow-version/-reference (#11570)
* Clarify breeze docs --install-airflow-version/-reference

* Add to automated bash scripts
2020-10-16 10:50:17 +02:00
Jarek Potiuk e7dc964619
Adds capability of installing wheel packages in CI image (#11527)
The production image had the capability of installing images from
wheels (for security teams/air-gapped systems). This capability
might also be useful when building CI image especially when
we are installing separately core and providers packages and
we do not yet have provider packages available in PyPI.

This is an intermediate step to implement #11490
2020-10-15 15:19:18 +02:00
Jarek Potiuk 32f2a45819
Rename backport packages to provider packages (#11459)
In preparation for adding provider packages to 2.0 line we
are renaming backport packages to provider packages.

We want to implement this in stages - first to rename the
packages, then split-out backport/2.0 providers as part of
the #11421 issue.
2020-10-12 16:29:48 +02:00
Jarek Potiuk 5bc5994c2c
Split tests to more sub-types (#11402)
We seem to have a problem with running all tests at once - most
likely due to some resource problems in our CI, therefore it makes
sense to split the tests into more batches. This is not yet full
implementation of selective tests but it is going in this direction
by splitting to Core/Providers/API/CLI tests. The full selective
tests approach will be implemented as part of #10507 issue.

This split is possible thanks to #10422 which moved building image
to a separate workflow - this way each image is only built once
and it is uploaded to a shared registry, where it is quickly
downloaded from rather than built by all the jobs separately - this
way we can have many more jobs as there is very little per-job
overhead before the tests start running.
2020-10-11 07:40:31 -07:00
Jarek Potiuk 04973904c3
Constraints and PIP packages can be installed from local sources (#11382)
* Constraints and PIP packages can be installed from local sources

This is the final part of implementing #11171 based on feedback
from enterprise customers we worked with. They want to have
a capability of building the image using binary wheel packages
that are locally available and the official Dockerfile. This means
that besides the official APT sources the Dockerfile build should
not need GitHub, nor any other external files pulled from outside
including PIP repository.

This change also includes documentation on how to prepare set of
such binaries ready for inspection and review by security teams
in Enterprise environment. Such sets of "known-working-binary-whl"
files can then be separately committed, tracked and scrutinized
in an artifact repository of such an Enterprise.

Fixes: #11171

* Update docs/production-deployment.rst
2020-10-10 12:58:09 +02:00
John Bampton 39fc961eec
Fix case of JavaScript. (#10957) 2020-10-10 00:50:31 +02:00
Jarek Potiuk d752575e78
Revert "Revert "Adds --install-wheels flag to breeze command line (#11317)" (#11348)" (#11356)
This reverts commit f67e6cb805.
2020-10-10 00:41:11 +02:00
Kaxil Naik ba60836456
Fix command to run tmux with breeze in BREEZE.rst (#11340)
`breeze --start-airflow` -> `breeze start-airflow`
2020-10-08 08:47:56 -07:00
Ash Berlin-Taylor f67e6cb805
Revert "Adds --install-wheels flag to breeze command line (#11317)" (#11348)
This reverts commit de07d135ae.
2020-10-08 14:35:04 +01:00
Jarek Potiuk de07d135ae
Adds --install-wheels flag to breeze command line (#11317)
If this flag is specified it will look for wheel packages placed in dist
folder and it will install the wheels from there after installing
Airflow. This is useful for testing backport packages as well as in the
future for testing provider packages for 2.0.
2020-10-08 10:06:53 +02:00
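
The behaviour described in commit de07d135ae above (look for wheel packages in the dist folder and install them after Airflow) can be sketched roughly as below. This is an illustration, not the Breeze implementation, which is a shell script: the function name is hypothetical, and the sketch only builds the `pip install` command instead of running it.

```python
from pathlib import Path


def pip_install_wheels_command(dist_dir="dist"):
    """Build a pip command installing every wheel found in dist_dir.

    Returns an empty list when the folder has no ``*.whl`` files
    (or does not exist), so the caller can skip the step.
    """
    wheels = sorted(str(path) for path in Path(dist_dir).glob("*.whl"))
    if not wheels:
        return []
    return ["pip", "install", *wheels]


if __name__ == "__main__":
    print(pip_install_wheels_command())
```

The returned list could be passed to `subprocess.run` once the main Airflow installation has finished.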
Jarek Potiuk 22c6a843d7
Adds --no-rbac-ui flag for Breeze airflow 1.10 installation (#11315)
When installing airflow 1.10 via breeze we now enable rbac
by default, but we can disable it with --no-rbac-ui flag.

This is useful to test different variants of 1.10 when testing
release candidates in connection with the 'start-airflow'
command.
2020-10-07 01:00:00 +01:00
Kaxil Naik 6dce7a6c26
Enable MySQL 8 CI jobs (#11247)
closes https://github.com/apache/airflow/issues/11164
2020-10-04 13:45:05 +02:00
Jarek Potiuk ebd7150862
More customizable build process for Docker images (#11176)
* Allows more customizations for image building.

This is the third (and not last) part of making the Production
image more corporate-environment friendly. It's been prepared
at the request of one of the big Airflow users (a company) that
has rather strict security requirements when it comes to
preparing and building images. They are committed to
synchronizing with the progress of Apache Airflow 2.0 development
and making the image customizable so that they can build it using
only sources controlled by them internally was one of the important
requirements for them.

This change adds the possibility of customizing various steps in
the build process:

* adding custom scripts to be run before installation of both
  build image and runtime image. This allows for example to
  add installing custom GPG keys, and adding custom sources.

* customizing the way NodeJS and Yarn are installed in the
  build image segment - as they might rely on their own way
  of installation.

* adding extra packages to be installed during both build and
  dev segment build steps. This is crucial to achieve the same
  size optimizations as the original image.

* defining additional environment variables (for example
  environment variables that indicate acceptance of the EULAs
  in case of installing proprietary packages that require
  EULA acceptance) - both in the build image and runtime image
  (again the goal is to keep the image optimized for size)

The image build process remains the same when no customization
options are specified, but having those options increases
flexibility of the image build process in corporate environments.

This is part of #11171.

This change also fixes some of the issues opened and raised by
other users of the Dockerfile.

Fixes: #10730
Fixes: #10555
Fixes: #10856

Input from those issues has been taken into account when this
change was designed so that the cases described in those issues
could be implemented. An example from one of the issues landed as
an example way of building a highly customized Airflow image
using those customization options.

Depends on #11174

* Update IMAGES.rst

Co-authored-by: Kamil Breguła <mik-laj@users.noreply.github.com>
2020-09-29 15:30:00 +02:00
Omair Khan 68e0eb6976
in_container bats pre-commit hook and updated bats-tests hook (#11179) 2020-09-29 11:59:06 +02:00
Jarek Potiuk c9a34d2ef9
Optionally tags image when building with Breeze (#11181)
Breeze tags the image based on the default python version,
branch, type of the image, but you might want to tag the image
in the same command especially in automated cases of building
the image via CI scripts or security teams that tag the image
based on external factors (build time, person etc.).

This is part of #11171 which makes the image easier to build in
corporate environments.
2020-09-29 11:45:37 +02:00
Jarek Potiuk 044b441257
Conditional MySQL Client installation (#11174)
This is the second step of making the Production Docker Image more
corporate-environment friendly, by making MySQL client installation
optional. Installing MySQL Client on Debian requires reaching out
to Oracle deb repositories which might not be approved by security
teams when you build the images. Also not everyone needs MySQL
client or might want to install their own MySQL client or MariaDB
client - from their own repositories.

This change makes the installation step separated out to
script (with prod/dev installation option). The prod/dev separation
is needed because MySQL needs to be installed with dev libraries
in the "Build" segment of the image (requiring build essentials
etc.) but in "Final" segment of the image only runtime libraries
are needed.

Part of #11171

Depends on #11173.
2020-09-27 18:56:58 +02:00
mucio 0db7a30782
New Breeze command start-airflow, it replaces the previous flag (#11157) 2020-09-27 18:31:50 +02:00
Jarek Potiuk f16354bc02
Optionally disables PIP cache from GitHub during the build (#11173)
This is the first step of implementing the corporate-environment
friendly way of building images, where in the corporate
environment, it might not be possible to install the packages
using the GitHub cache initially.

Part of #11171
2020-09-27 18:00:03 +02:00
Jarek Potiuk 620b0989b8
Add Helm Chart linting (#11108) 2020-09-24 13:02:11 +02:00
Kaxil Naik 7644c37082
Revert "Introducing flags to skip example dags and default connections (#11099)" (#11110)
This reverts commit 0edc3dd579.
2020-09-23 19:47:43 +01:00
mucio 0edc3dd579
Introducing flags to skip example dags and default connections (#11099) 2020-09-23 14:56:29 +02:00
Tomek Urbaszek 29d62977d3
Fix s.apache.org Slack link (#11078)
Remove ending / from s.apache.org Slack link
2020-09-22 11:33:49 +02:00
Kaxil Naik e3a590075e
Replace Airflow Slack Invite old link to short link (#11071)
Follow up to https://github.com/apache/airflow/pull/10034

https://apache-airflow-slack.herokuapp.com/ to https://s.apache.org/airflow-slack/
2020-09-22 10:46:44 +02:00
Jarek Potiuk 3db4d3b04d
All versions in CI yamls are not hard-coded any more (#10959)
GitHub Actions allows using the `fromJson` function to read arrays
or even more complex json objects into the CI workflow yaml files.

This, connected with set-output commands, allows reading the
list of allowed versions as well as default ones from the
environment variables configured in
./scripts/ci/libraries/initialization.sh

This means that we can have one place in which versions are
configured. We also need to do it in "breeze-complete" as this is
a standalone script that should not source anything, so we added
BATS tests to verify if the versions in breeze-complete
correspond with those defined in the initialization.sh

Also we do not limit tests any more in regular PRs now - we run
all combinations of available versions. Our tests run quite a
bit faster now so we should be able to run more complete
matrixes. We can still exclude individual values of the matrixes
if this is too much.

MySQL 8 is disabled from breeze for now. I plan a separate follow
up PR where we will run MySQL 8 tests (they were not run so far)
2020-09-21 20:02:04 +02:00
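
The mechanism described in commit 3db4d3b04d above (a step emits a JSON array via a set-output command and the workflow reads it with `fromJson`) can be sketched like this. The real project does this in bash from initialization.sh; the Python function and the output name `python-versions` are hypothetical, and the `::set-output name=...::value` form is the workflow command GitHub Actions supported at the time of the commit.

```python
import json


def set_output_line(name, values):
    """Format a list as a GitHub Actions set-output workflow command.

    The value is serialized as JSON so that a workflow expression can
    turn it back into an array with fromJson. (Hypothetical helper.)
    """
    return f"::set-output name={name}::{json.dumps(values)}"


if __name__ == "__main__":
    # Hypothetical list of versions; in the commit they come from
    # environment variables configured in initialization.sh.
    print(set_output_line("python-versions", ["3.6", "3.7", "3.8"]))
```

A job matrix could then reference the step output through `fromJson`, so the version list lives in one place instead of being repeated in each workflow file.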
mucio 17faea0b5c
Starting breeze will run an init script after the environment is setup (#11029)
Added the possibility to run an init script
2020-09-21 11:58:30 +01:00
Jarek Potiuk b2dc346062
Make breeze-complete Google Shell Guide compatible (#10708)
Also added unit tests for breeze-complete
Part of #10576
2020-09-14 10:21:09 +02:00
Kaxil Naik 76dc7ed027
Fix grammar in BREEZE.rst (#10904)
`Other uses the Airflow Breeze environment` -> `Other uses of the Airflow Breeze environment`
2020-09-13 16:43:01 +02:00
Kaxil Naik d8237b84cb
Fix typos in BREEZE.rst (#10905)
`lunch` -> `launch`
`disto` -> `distro`
2020-09-13 16:42:30 +02:00
Jarek Potiuk 106c0f556f
Add pre-commit to sort INTHEWILD.md file automatically (#10851) 2020-09-12 18:26:12 +02:00
mucio 47e592e3a0
Flag --start-airflow for breeze (#10837) 2020-09-11 23:26:56 +02:00
Ash Berlin-Taylor 59e8341d6e
Add new lint check to not allow relative imports (#10825)
Relative and absolute imports are functionally equivalent, the only
practical difference is that relative is shorter.

But it is also less obvious what exactly is imported, and harder to find
such imports with simple tools (such as grep).

Thus we have decided that Airflow house style is to use absolute imports
only.
2020-09-10 18:07:50 +01:00
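
The kind of lint check described in commit 59e8341d6e above can be sketched with the standard `ast` module. This is an assumption about how such a check could work, not the check added in the PR; the function name and the sample source are hypothetical.

```python
import ast
from typing import List


def find_relative_imports(source: str) -> List[int]:
    """Return the line numbers of relative imports in Python source.

    A relative import is an ``ast.ImportFrom`` node whose ``level`` is
    greater than zero (``from . import x``, ``from ..pkg import y``).
    """
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.ImportFrom) and node.level > 0
    )


SAMPLE = (
    "from airflow.models import DAG\n"  # absolute import: allowed
    "from . import helpers\n"           # relative import: flagged
    "from ..utils import dates\n"       # relative import: flagged
)

if __name__ == "__main__":
    print(find_relative_imports(SAMPLE))
```

Unlike a grep for `from .`, the AST-based check ignores matches inside strings and comments.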