Fix a permission issue in Azure DevOps when running the script install_mysql.sh, which prevented the build from succeeding:
```
/bin/bash: ./scripts/docker/install_mysql.sh: Permission denied
The command '/bin/bash -o pipefail -e -u -x -c ./scripts/docker/install_mysql.sh dev' returned a non-zero code: 126
##[error]The command '/bin/bash -o pipefail -e -u -x -c ./scripts/docker/install_mysql.sh dev' returned a non-zero code: 126
##[error]The process '/usr/bin/docker' failed with exit code 126
```
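Exit code 126 means the file is not executable. A common fix is to record the executable bit in git so every checkout, including the CI agent's, keeps it. A self-contained sketch in a throwaway repo, using the path from the error above:

```shell
# Demonstration in a temporary repo; the real fix is the git commands
# run against scripts/docker/install_mysql.sh in the Airflow checkout.
tmp=$(mktemp -d) && cd "$tmp" && git init -q .
mkdir -p scripts/docker
printf '#!/bin/bash\necho "installing mysql"\n' > scripts/docker/install_mysql.sh
git add scripts/docker/install_mysql.sh
# Record the executable bit in the index so it survives every checkout:
git update-index --chmod=+x scripts/docker/install_mysql.sh
git ls-files -s scripts/docker/install_mysql.sh   # mode column now shows 100755
```

Alternatively, a `RUN chmod +x ./scripts/docker/install_mysql.sh` line in the Dockerfile before invoking the script achieves the same result at build time.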
* Apply labels to Docker images in a single instruction
While looking at the build logs for something else I noticed this
oddity at the end of the CI logs:
```
Tue, 08 Dec 2020 21:20:19 GMT Step 125/135 : LABEL org.apache.airflow.distro="debian"
...
Tue, 08 Dec 2020 21:21:14 GMT Step 133/135 : LABEL org.apache.airflow.commitSha=${COMMIT_SHA}
Tue, 08 Dec 2020 21:21:14 GMT ---> Running in 1241a5f6cdb7
Tue, 08 Dec 2020 21:21:21 GMT Removing intermediate container 1241a5f6cdb7
```
Applying all the labels took 1m2s! Applying them in a single
layer/instruction should speed things up.
A less extreme example still took 43s:
```
Tue, 08 Dec 2020 20:44:40 GMT Step 125/135 : LABEL org.apache.airflow.distro="debian"
...
Tue, 08 Dec 2020 20:45:18 GMT Step 133/135 : LABEL org.apache.airflow.commitSha=${COMMIT_SHA}
Tue, 08 Dec 2020 20:45:18 GMT ---> Running in dc601207dbcb
Tue, 08 Dec 2020 20:45:23 GMT Removing intermediate container dc601207dbcb
Tue, 08 Dec 2020 20:45:23 GMT ---> 5aae5dd0f702
```
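A sketch of the combined instruction (only the distro and commitSha keys appear in the log above; the elided keys would be listed the same way, continued with backslashes):

```dockerfile
# One LABEL instruction creates a single layer instead of nine separate steps.
LABEL org.apache.airflow.distro="debian" \
      org.apache.airflow.commitSha=${COMMIT_SHA}
```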
* Update Dockerfile
So far, the production images of Airflow were built from
sources on CI. This PR changes that: the airflow + providers
packages are built first and then installed, rather than
using sources as the installation mechanism.
Part of #12261
We do not need to add docker-context-files in CI before we run
the first "cache" pip installation. Adding them earlier means the
cache is always invalidated if someone has a file there
before building and pushing the image.
This PR fixes the problem by adding docker-context-files later
in the Dockerfile and changing the constraints location
used in the "cache" step to always use the GitHub constraints in
this case.
Closes #12509
Rather than counting changed layers in the image (which was
enigmatic, difficult, and prone to magic numbers), we now rely
on a random file generated while building the image.
We use the Docker image caching mechanism here. The random
file is regenerated only when the previous layer (which
installs Airflow dependencies for the first time) gets
rebuilt. For us, this is the indication that building
the image will take quite some time. This layer should be
relatively static - even if setup.py changes, the CI image is
designed in such a way that the first-time installation of Airflow
dependencies is not invalidated.
This should lead to faster and less frequent rebuilds for people
using Breeze and static checks.
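A minimal sketch of the mechanism, with hypothetical file names (the real Dockerfile differs): because Docker caches layers in order, the second RUN re-executes - producing a new random value - only when the expensive dependency layer above it is invalidated.

```dockerfile
# Expensive step: first-time installation of Airflow dependencies.
RUN pip install --no-cache-dir -r /requirements.txt
# Cache marker: regenerated only when the layer above is rebuilt.
RUN head -c 16 /dev/urandom | md5sum | cut -d ' ' -f 1 > /build-cache-hash
```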
The semver module doesn't like Python version specifiers such as `0.0.2a1`.
Since the packaging module is already a dependency of setuptools, and is what
the Python ecosystem uses for version handling, it makes sense to use
it instead.
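An illustrative comparison (assuming the `packaging` distribution is installed): PEP 440 pre-release specifiers like `0.0.2a1`, which are invalid under strict semver, parse cleanly with `packaging.version`.

```shell
python3 - <<'EOF'
from packaging.version import Version

v = Version("0.0.2a1")   # a PEP 440 pre-release, rejected by strict semver
print(v.release, v.pre, v.is_prerelease)
# prints: (0, 0, 2) ('a', 1) True
EOF
```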
In Airflow 2.0 we decided to split Airflow into separate providers.
This means that when you prepare the core airflow package, providers
are not installed by default. This is not very convenient for
local development, though, nor for docker images built from sources,
where you would like to install all providers by default.
A new INSTALL_ALL_AIRFLOW_PROVIDERS environment variable now controls
this behaviour. If it is set to "true", all packages including
provider packages are installed. If missing or set to "false", only
the core airflow package is installed.
For Breeze, the default is set to "true", as in those cases you
want to install all providers in your environment. The same applies if you
build the production image from sources. However, when you build the
image using a GitHub tag or a pip package, you should specify
appropriate extras to install the required provider packages.
Note that if you install Airflow via 'pip install .' from sources
in a local virtualenv, provider packages are not going to be
installed unless you set INSTALL_ALL_AIRFLOW_PROVIDERS to "true".
Fixes #11489
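A minimal sketch of the switch described above (the variable name is from the text; the real logic lives in Airflow's installation scripts):

```shell
# Decide what to install based on INSTALL_ALL_AIRFLOW_PROVIDERS,
# defaulting to the core-only behaviour when the variable is missing.
if [ "${INSTALL_ALL_AIRFLOW_PROVIDERS:-false}" = "true" ]; then
    echo "installing core airflow + all provider packages"
else
    echo "installing core airflow package only"
fi
```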
The production image had the capability of installing Airflow from
wheels (for security teams/air-gapped systems). This capability
might also be useful when building the CI image, especially when
we install the core and providers packages separately and
do not yet have provider packages available on PyPI.
This is an intermediate step to implement #11490
* Separate changes/readmes for backport and regular providers
We now have separate release notes for backport provider
packages and regular provider packages.
They have different versioning: backport provider
packages use CalVer, regular provider packages use
SemVer.
* Added support for provider packages for Airflow 2.0
This change consists of the following changes:
* adds provider package support for 2.0
* adds generation of package readme and change notes
* versions are for now hard-coded to 0.0.1 for first release
* adds automated tests for installation of the packages
* renames backport package readmes/changes to BACKPORT_*
* adds regular package readmes/changes
* updates documentation on generating the provider packages
* adds CI tests for the packages
* maintains backport packages generation with --backports flag
Fixes #11421 Fixes #11424
Currently, upgrading dependencies in setup.py still runs with the previous versions of the package for the PR, which fails.
This changes it to upgrade only the package that is required for the PR.
The script was previously placed in scripts/ci, which caused
a bit of a problem in the v1-10-test branch, where PRs were using
scripts/ci from the v1-10-test HEAD but were missing
the ci script from the PR.
The "ci" scripts are part of the host scripts that are
always taken from master when the image is built, but
all the other stuff should be taken from the "docker"
folder - which is taken from the PR.
* Allows more customizations for image building.
This is the third (and not last) part of making the production
image more corporate-environment friendly. It was prepared
at the request of one of the big Airflow users (a company) that
has rather strict security requirements when it comes to
preparing and building images. They are committed to
synchronizing with the progress of Apache Airflow 2.0 development,
and making the image customizable so that they can build it using
only sources controlled by them internally was one of their important
requirements.
This change adds the possibility of customizing various steps in
the build process:
* adding custom scripts to be run before installation of both
build image and runtime image. This allows for example to
add installing custom GPG keys, and adding custom sources.
* customizing the way NodeJS and Yarn are installed in the
build image segment - as they might rely on their own way
of installation.
* adding extra packages to be installed during both build and
dev segment build steps. This is crucial to achieve the same
size optimizations as the original image.
* defining additional environment variables (for example,
  environment variables that indicate acceptance of EULAs
  when installing proprietary packages that require
  EULA acceptance) - both in the build image and the runtime image
  (again, the goal is to keep the image optimized for size)
The image build process remains the same when no customization
options are specified, but having those options increases
flexibility of the image build process in corporate environments.
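The customization hooks above typically surface as build args; a hypothetical sketch (the names are modeled on the options described, not necessarily the exact ones in the Dockerfile):

```dockerfile
ARG ADDITIONAL_DEV_APT_DEPS=""       # extra packages for the build segment
ARG ADDITIONAL_RUNTIME_APT_DEPS=""   # extra packages for the runtime segment
ARG ADDITIONAL_DEV_ENV_VARS=""       # e.g. EULA-acceptance variables
```

A build like `docker build . --build-arg ADDITIONAL_RUNTIME_APT_DEPS="vim"` would then inject the extra package without editing the Dockerfile.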
This is part of #11171.
This change also fixes some of the issues opened and raised by
other users of the Dockerfile.
Fixes: #10730 Fixes: #10555 Fixes: #10856
Input from those issues has been taken into account when this
change was designed so that the cases described in those issues
could be implemented. An example from one of the issues landed as
a demonstration of building a highly customized Airflow image
using those customization options.
Depends on #11174
* Update IMAGES.rst
Co-authored-by: Kamil Breguła <mik-laj@users.noreply.github.com>
This is the second step of making the production Docker image more
corporate-environment friendly, by making MySQL client installation
optional. Installing the MySQL client on Debian requires reaching out
to Oracle deb repositories, which might not be approved by security
teams when you build the images. Also, not everyone needs the MySQL
client, or they might want to install their own MySQL or MariaDB
client from their own repositories.
This change separates the installation step out into a
script (with a prod/dev installation option). The prod/dev separation
is needed because MySQL needs to be installed with dev libraries
in the "Build" segment of the image (requiring build essentials
etc.), but in the "Final" segment of the image only runtime libraries
are needed.
Part of #11171
Depends on #11173.
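A sketch of the prod/dev split described above (the script path is from the text; the package names are assumptions - Debian's actual package names may differ). The apt-get call is only echoed here so the sketch is runnable anywhere:

```shell
#!/bin/bash
# scripts/docker/install_mysql.sh (hypothetical reconstruction)
set -euo pipefail
MODE="${1:-prod}"   # "dev" in the Build segment, "prod" in the Final segment
if [ "$MODE" = "dev" ]; then
    PACKAGES="libmysqlclient-dev mysql-client"   # headers needed to compile clients
else
    PACKAGES="libmysqlclient21 mysql-client"     # runtime libraries only
fi
echo "apt-get install -y --no-install-recommends $PACKAGES"
```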
This is the first step of implementing the corporate-environment-
friendly way of building images, where it might not be possible
to install the packages using the GitHub cache initially.
Part of #11171
I noticed that when there are no setup.py changes, the constraints
are not upgraded automatically. This is because of the docker
caching strategy used - it simply does not even know that the
upgrade should happen.
I believe it is really good (from a security and incremental-updates
point of view) to attempt the upgrade at every successful merge.
Note that the upgrade will not be committed if any of the tests fail,
and this only happens on merges to master or scheduled runs.
This way we will have more frequent but smaller constraint changes.
Depends on #10828
After #10368, we've changed the way we build the images
on CI. We are overriding the ci scripts that we use
to build the image with the scripts taken from master,
so as not to give rogue PR authors the possibility to run
something with the write credentials.
However, we should not override the in_container scripts,
because they become part of the image, so we should use
those that came with the PR. That's why we have to move
the "in_container" scripts out of the "ci" folder and
only override the "ci" folder with the one from
master. We've made sure that the scripts in ci
are self-contained and do not need to reach outside of
that folder.
Also the static checks are done with local files mounted
on CI because we want to check all the files - not only
those that are embedded in the container.
* Constraint files are now maintained automatically
* No need to generate requirements when setup.py changes
* requirements are kept in separate orphan branches, not in the main repo
* merges to master verify that the latest requirements work, and
  push the tested requirements to the orphan branches
* we keep a history of requirement changes and can label them
  individually for each version (by constraint-1.10.n tag name)
* consistently changed all references to be 'constraints' not
'requirements'
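An illustrative consequence of the scheme above (the URL pattern is assumed from the branch/tag naming; the exact paths may differ):

```shell
# Constraints live on orphan branches/tags, so installs can pin to a tested set.
CONSTRAINTS_URL="https://raw.githubusercontent.com/apache/airflow/constraints-1-10/constraints-3.7.txt"
echo pip install apache-airflow --constraint "$CONSTRAINTS_URL"
```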
Instead of creating the links in the image (which did not work),
the links are now created on entry to the Breeze image.
The wrappers were not installed via the Dockerfile, and the
ownership fixing did not work on Linux.
The bug was introduced in f17a02d330
Gunicorn uses os.fchmod heavily in the /tmp directory, which can cause
excessive blocking:
https://docs.gunicorn.org/en/stable/faq.html#how-do-i-avoid-gunicorn-excessively-blocking-in-os-fchmod
We want to switch to /dev/shm (shared memory) in the prod image to make the
blocking go away and become independent of the Docker filesystem used (osxfs has
problems with os.fchmod and permissions as well).
Use case / motivation
Avoiding contention might be useful in the production image.
This can be done with:
GUNICORN_CMD_ARGS="--worker-tmp-dir /dev/shm"
For a long time, the way the entrypoint worked in the ci scripts
was wrong. It was convoluted and little short of black
magic; it did not allow passing multiple test targets and
required separate execute-command scripts in Breeze.
This is all now straightened out: both the production and
CI images always use the right entrypoint by default,
and we can simply pass parameters to the image as usual without
escaping strings.
This also allowed removing some Breeze commands and
renaming several Breeze flags to make them more
meaningful.
Both the CI and PROD images now have embedded scripts for log
cleaning.
A history of image releases is added for the 1.10.10-*
alpha-quality images.
The commit 5918efc86a broke an
optimisation of the CI image - using the Apache Airflow
master branch as a base package installation source from PyPI.
This commit restores it, including removal of the
obsolete CI_OPTIMISED arg - as we now have separate
production and CI images, and the CI image is
CI_OPTIMISED by default.
* Add generic CLI tool wrapper
* Pass working directory to container
* Share namespaces between all containers
* Fix permissions hack
* Unify code style
Co-authored-by: Felix Uellendall <feluelle@users.noreply.github.com>
* Detect standalone execution by checking symbolic link
* User friendly error message when env var is missing
* Display error to stderr
* Display errors on stderr
* Fix permission hack
* Fix condition in if
* Fix missing env-file
* TEST: Install airflow without copying sources
* Update scripts/ci/in_container/run_prepare_backport_readme.sh
Co-authored-by: Felix Uellendall <feluelle@users.noreply.github.com>
* Improved cloud tool available in the trimmed down CI container
The tools now have shebangs, which makes them available as
Python tools. Also, /opt/airflow is now mounted from the
host Airflow sources, which makes it possible for the tools to
copy files directly to/from the sources of Airflow.
It also contains one small change for Linux users: the files
created by the dockerized gcloud are owned by root, so
the directories mounted from the host are fixed up when you exit
the tool - their ownership is changed back to the host user.
Tests requiring Kubernetes Cluster are now moved out of
the regular CI tests and moved to "kubernetes_tests" folder
so that they can be run entirely on host without having
the CI image built at all. They use the production image
to run the tests on a KinD cluster, and we add tooling
to start/stop/deploy the application to the KinD cluster
automatically - for both CI testing and local development.
This is a prerequisite to converting the tests to use the
official Helm Chart and Docker images of Apache Airflow.
It closes #8782
Debian Buster only ships with JDK 11, and Hive/Hadoop fails in odd,
hard-to-debug ways (it complains about the metastore not being
initialized, possibly related to class-loader issues).
Until we rip Hive out of the CI (replacing it with Hadoop in a separate
integration, only on for some builds), we'll have to stick with JRE 8.
Our previous approach of installing openjdk-8 from Sid/unstable started
failing as Debian Sid has a new (and conflicting) version of GCC/libc.
The AdoptOpenJDK package archive is designed for Buster, so it should be
more resilient.
Installing the JDK (not even the JRE) from Sid is starting to break on
Buster as the versions of packages conflict:
> The following packages have unmet dependencies:
> libgcc-8-dev : Depends: gcc-8-base (= 8.4.0-4) but 8.3.0-6 is to be installed
> Depends: libmpx2 (>= 8.4.0-4) but 8.3.0-6 is to be installed
This changes our CI docker images to:
1. Not install something from Sid (unstable, packages change/get
updated) when we are using Buster (stable, only security fixes).
2. Install the JRE, not the JDK. We don't need to compile Java code.
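A sketch of the Buster-friendly installation (the repository URL and package name are assumptions based on AdoptOpenJDK's Debian archive at the time, not taken from this change):

```dockerfile
RUN curl -fsSL https://adoptopenjdk.jfrog.io/adoptopenjdk/api/gpg/key/public | apt-key add - \
    && echo "deb https://adoptopenjdk.jfrog.io/adoptopenjdk/deb buster main" \
         > /etc/apt/sources.list.d/adoptopenjdk.list \
    && apt-get update \
    && apt-get install -y --no-install-recommends adoptopenjdk-8-hotspot-jre \
    && rm -rf /var/lib/apt/lists/*
```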
When you build from scratch and some transient requirements
fail, the initial installation step might fail.
We now use the latest valid constraints from the DEFAULT_BRANCH
branch to avoid that.
With this change, requirements are only eagerly upgraded when
generating requirements after setup.py changes. They are also
eagerly upgraded when you run ./breeze generate-requirements
locally. The cron job will still use the eager-update mechanism
when building the docker image, which means that cron jobs will
still detect cases where an upgrade of requirements causes a failure,
either at installation time or during tests.