After #10368, we've changed the way we build the images
on CI. We override the CI scripts that we use
to build the image with the scripts taken from master,
so that rogue PR authors do not get the possibility to run
something with the write credentials.
We should not override the in_container scripts, however,
because they become part of the image, so we use
those that came with the PR. That's why we had to move
the "in_container" scripts out of the "ci" folder and
only override the "ci" folder with the one from
master. We've made sure that the scripts in ci
are self-contained and do not need to reach outside of
that folder.
Also, the static checks are run with local files mounted
on CI, because we want to check all the files, not only
those that are embedded in the container.
The EMBEDDED dags were only really useful for testing,
but they required customising the built production image
(running with an extra --build-arg flag). This is not needed,
as it is better to extend the image with FROM
and add the dags afterwards. This way you do not have
to rebuild the image while iterating on it.
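The extend-with-FROM approach can be sketched as a two-line Dockerfile; the base image tag and the dags path below are illustrative assumptions, not the exact layout:

```dockerfile
# Hypothetical extension image: start from a released production image
# and layer your dags on top, with no --build-arg customisation needed.
FROM apache/airflow:1.10.10
COPY dags/ /opt/airflow/dags/
```

Only the final COPY layer is rebuilt when the dags change, which is what makes iteration fast.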
* Constraint files are now maintained automatically
* No need to regenerate requirements when setup.py changes
* Requirements are kept in separate orphan branches, not in the main repo
* Merges to master verify that the latest requirements work and
push the tested requirements to the orphan branches
* We keep the history of requirement changes and can label them
individually for each version (by the constraint-1.10.n tag name)
* All references were consistently changed to 'constraints' rather
than 'requirements'
The Kubernetes tests are now run using the Helm chart
rather than the custom templates we used to have.
The Helm chart uses the locally built production image,
so the tests exercise not only Airflow but also
the Helm chart and the production image, all at the
same time. Later on we will add more tests
covering more functionality of both the Helm chart
and the production image. This is the first step to
getting all of those bundled together and made
testable.
This change also introduces a 'shell' sub-command
for Breeze's kind-cluster command and
an EMBEDDED_DAGS build arg for the production image.
Both of them make it easier to run the Kubernetes tests:
without building two images,
and with an easy-to-iterate shell command
that works without any other development environment.
Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
Co-authored-by: Daniel Imberman <daniel@astronomer.io>
OpenShift (and other Kubernetes platforms) often use the approach
of starting containers with a random user and the root group. This is
described in https://docs.openshift.com/container-platform/3.7/creating_images/guidelines.html
All the files created by the "airflow" user now belong to
the 'root' group, and the root group has the same access to those
files as the airflow user.
Additionally, the random user automatically gets an
/etc/passwd entry named 'default'. The name of the user
can be set by setting the USER_NAME variable when starting the
container.
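The arbitrary-UID handling described above can be sketched as a small entrypoint helper; the function name, home directory, and shell path below are illustrative, not the actual entrypoint code:

```shell
#!/usr/bin/env bash
# Sketch: when the container runs under an arbitrary UID (as OpenShift
# does), append an /etc/passwd entry so that user lookups keep working.
# USER_NAME defaults to "default".
add_passwd_entry() {
  local passwd_file="$1"
  local user_name="${USER_NAME:-default}"
  # The entry uses the current (random) UID and the root group (gid 0).
  echo "${user_name}:x:$(id -u):0:${user_name}:/home/airflow:/sbin/nologin" \
    >> "${passwd_file}"
}
```

In the real entrypoint, logic like this would only run when the current UID has no passwd entry (i.e. when `whoami` fails).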
Closes #9248
Closes #8706
For a long time, the way the entrypoint worked in the CI scripts
was wrong. It was convoluted and little short of black
magic: it did not allow passing multiple test targets and
required separate execute-command scripts in Breeze.
This is all now straightened out: both the production and
CI images always use the right entrypoint by default,
and we can simply pass parameters to the image as usual without
escaping strings.
This also allowed us to remove some Breeze commands and
rename several Breeze flags to make them more
meaningful.
Both the CI and PROD images now have embedded scripts for log
cleaning.
A history of image releases is added for the 1.10.10-*
alpha-quality images.
* Add migration waiting script and log cleaner
This PR creates a "migration spinner" that allows the webserver to wait for all database migrations to complete before starting up. It is a necessary component before we can merge the Helm chart.
* Update airflow/cli/cli_parser.py
Co-Authored-By: Tomek Urbaszek <turbaszek@gmail.com>
Co-authored-by: Tomek Urbaszek <tomasz.urbaszek@polidea.com>
Co-authored-by: Tomek Urbaszek <turbaszek@gmail.com>
It also installs properly on Mac, and it auto-detects
whether yarn prod is needed, based on the presence of the proper
package.json in either www or www_rbac, which makes it simpler
for remote installations.
Switch to MySQL 5.7 in tests.
Fixes the utf8mb4 encoding issue where utf8mb4 encoding
produces keys that are too long for MySQL to handle in the XCom table.
You can optionally specify a separate option to set the
encoding differently for the columns that are part of the
index: dag_id, task_id and key.
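As a sketch, such an option could look like the airflow.cfg fragment below; the option name and collation value here are illustrative assumptions based on the description above, not the exact setting:

```ini
[core]
# Hypothetical: use a narrower collation only for the indexed id columns
# (dag_id, task_id, key) so the composite index stays within MySQL's
# key-length limit while the rest of the data keeps utf8mb4.
sql_engine_collation_for_ids = utf8mb3_bin
```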
Each stage of the CI tests needs to pull our `ci` image. By removing
Java from it we can save 1-2 minutes in each test stage. This is part
of that work.
* adding singularity operator and tests
Signed-off-by: Vanessa Sochat <vsochat@stanford.edu>
* removing encoding pragmas and fixing up dockerfile to pass linting
Signed-off-by: Vanessa Sochat <vsochat@stanford.edu>
* make workdir in /tmp because AIRFLOW_SOURCES not defined yet
Signed-off-by: Vanessa Sochat <vsochat@stanford.edu>
* curl needs to follow redirects with -L
Signed-off-by: Vanessa Sochat <vsochat@stanford.edu>
* moving files to where they are supposed to be, more changes to mock, no clue
Signed-off-by: vsoch <vsochat@stanford.edu>
* removing trailing whitespace, moving example_dag for singularity, adding licenses to empty init files
Signed-off-by: vsoch <vsochat@stanford.edu>
* ran isort on example dags file
Signed-off-by: vsoch <vsochat@stanford.edu>
* adding missing init in example_dags folder for singularity
Signed-off-by: vsoch <vsochat@stanford.edu>
* removing code from __init__.py files for singularity operator to fix documentation generation
Signed-off-by: vsoch <vsochat@stanford.edu>
* forgot to update link to singularity in operators and hooks ref
Signed-off-by: vsoch <vsochat@stanford.edu>
* command must have been provided on init of singularity operator instance
Signed-off-by: vsoch <vsochat@stanford.edu>
* I guess I'm required to have a task_id?
Signed-off-by: vsoch <vsochat@stanford.edu>
* try adding working_dir to singularity operator type definitions
Signed-off-by: vsoch <vsochat@stanford.edu>
* disable too many arguments for pylint of singularity operator init
Signed-off-by: vsoch <vsochat@stanford.edu>
* move pylint disable up to line 64 - doesnt catch at end of statement like other examples
Signed-off-by: vsoch <vsochat@stanford.edu>
* two spaces before inline comment
Signed-off-by: vsoch <vsochat@stanford.edu>
* I dont see task_id as a param for other providers, removing for singularity operator
Signed-off-by: vsoch <vsochat@stanford.edu>
* adding debug print
Signed-off-by: vsoch <vsochat@stanford.edu>
* allow for return of just image and/or lines
Signed-off-by: vsoch <vsochat@stanford.edu>
* dont understand how mock works, but the image should exist after its pulled....
Signed-off-by: vsoch <vsochat@stanford.edu>
* try removing shutil, the client should handle pull folder instead
Signed-off-by: vsoch <vsochat@stanford.edu>
* try changing pull-file to same uri that is expected to be pulled
Signed-off-by: vsoch <vsochat@stanford.edu>
* import of AirflowException moved to exceptions
Signed-off-by: vsoch <vsochat@stanford.edu>
* DAG module was moved to airflow.models
Signed-off-by: vsoch <vsochat@stanford.edu>
* ensure pull is called with pull_folder
Signed-off-by: vsoch <vsochat@stanford.edu>
There are two parts to this PR:
1. Only copying www/webpack.config.js and www/static/ before running the
asset pipeline
2. Making sure that _all_ files (not just the critical ones) have the
same permissions.
The goal of both of these is to make sure that the Docker build cache for the "expensive"
operations (installing NPM modules, running the asset pipeline, installing Python modules)
is not invalidated when it doesn't need to be.
* Revert "[AIRFLOW-6662] Switch to --init docker flag for signal propagation (#7278)"
This reverts commit d1bf343ffe.
* [AIRFLOW-6662] Bring back dumb-init, installed via apt
We had stability problems in tests with the --init flag, so we are
going back to dumb-init.
Also, the curl options now use the long format and include --fail
to protect against some temporary errors (5xx). The RAT download
now uses two possible download sources and falls back to the
second if the first is not available.
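The fallback behaviour can be sketched as a small helper; the function name and argument handling below are illustrative of the approach, not the actual download script:

```shell
#!/usr/bin/env bash
# Try each URL in turn; --fail makes curl return non-zero on server
# errors (e.g. 5xx), so we fall through to the next source.
download_with_fallback() {
  local target="$1"
  shift
  local url
  for url in "$@"; do
    if curl --fail --location --silent --show-error \
        --output "${target}" "${url}"; then
      return 0
    fi
  done
  return 1
}
```

A caller would list the primary mirror first and the backup second; the function only fails if every source fails.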
* Fixed a problem where the Kubernetes tests were testing latest master
rather than what came from the local sources.
* Kind (Kubernetes in Docker) runs in the same Docker as the Breeze env
* Moved the Kubernetes scripts to the 'in_container' dir where they now belong
* The Kubernetes cluster is reused until it is stopped
* The Kubernetes image is built from the image already in Docker plus the mounted sources
* The kubectl version name is corrected in the Dockerfile
* KUBERNETES_VERSION can now be used to select the Kubernetes version
* Running the Kubernetes scripts is now easy in Breeze
* We can start/recreate/stop the cluster using --<ACTION>-kind-cluster
* The instructions on how to run the Kubernetes tests are updated
* The old "bare" environment is replaced by the --no-deps switch
It is:
- quicker to install
- easier to get repeatable results with
- smaller (130MB/15k files vs 190MB/23k files)
- nicer to the user (it has better help)
With this change you should be able to simply run `pytest` to run all the tests in the main airflow directory.
This consists of two changes:
* moving pytest.ini to the main airflow directory
* skipping the collection of kubernetes tests when ENV != kubernetes
This change further improves docker image rebuild times when your
sources change (very useful when building the kubernetes image). It adds only
the directories that are needed (synchronised with .dockerignore and local
mounts), in a sequence that reflects how frequently they change. Also, pip
install is not re-run after the sources change (there is no point), so the
build is much faster when only sources or test files change.