* Modify Helm chart to use pod_template_file
Since we are deprecating most KubernetesExecutor arguments,
we should use the pod_template_file when launching Airflow
with the KubernetesExecutor
* fix tests
* one more nit
* fix dag command
* fix pylint
Relative and absolute imports are functionally equivalent; the only
practical difference is that relative imports are shorter.
But they also make it less obvious what exactly is imported, and harder
to find such imports with simple tools (such as grep).
Thus we have decided that the Airflow house style is to use absolute
imports only.
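For illustration (the module path is just an example), the two styles
look like this:

```python
# Relative form - shorter, but the source module is implicit and such
# imports are hard to find with simple tools like grep:
#     from .base_hook import BaseHook
#
# Absolute form - the Airflow house style:
from airflow.hooks.base_hook import BaseHook
```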
Until pre-commit implements export of all configured
checks, we need to keep the list updated manually.
We check both the pre-commit list in breeze-complete and
the descriptions in STATIC_CODE_CHECKS.rst
The hadolint check previously covered only the Dockerfile in the main
directory, but we have more Dockerfiles now. All of them are now checked.
The following problems are fixed:
* DL3000 Use absolute WORKDIR
* DL4000 MAINTAINER is deprecated
* DL4006 Set the SHELL option -o pipefail before RUN with a pipe in it.
* SC2046 Quote this to prevent word splitting.
The following problems are ignored:
* DL3018 Pin versions in apk add. Instead of `apk add <package>` use `apk add
<package>=<version>`
* CI images are now pre-built and stored in the registry
With this change we utilise the recently introduced
pull_request_target event type from GitHub Actions and we build
the CI image only once (per version) for the entire run.
This saves 2 to 10 minutes per job (!) depending on
how much of the Docker image needs to be rebuilt.
It works as follows: the image is built only in the
build-or-wait step. In case of direct push or
scheduled runs, the build-or-wait step builds the CI image
and pushes it to the GitHub registry. In case of
pull_request runs, the build-or-wait step waits until the
separate build-ci-image.yml workflow builds and pushes
the image, and it only moves forward once the image
is ready.
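A minimal Python sketch of that waiting logic, assuming a hypothetical
ci_image_exists() helper that asks the registry whether an image tagged
with the commit SHA has been pushed (the real step is implemented in the
CI scripts; all names here are illustrative):

```python
import time


def ci_image_exists(commit_sha: str) -> bool:
    """Hypothetical helper: ask the GitHub registry whether the CI image
    tagged with this commit SHA has been pushed yet."""
    raise NotImplementedError("illustrative only")


def wait_for_ci_image(commit_sha: str, timeout: float = 3600, interval: float = 30) -> None:
    """Poll until build-ci-image.yml has pushed the image, then return."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if ci_image_exists(commit_sha):
            return
        time.sleep(interval)
    raise TimeoutError(f"CI image for {commit_sha} not available after {timeout}s")
```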
This has numerous advantages:
1) Each job that requires the CI image is much faster because,
instead of pulling and rebuilding the image, it only pulls
the image that was built once. This saves around 2 minutes
per job in regular builds, but in case of Python patch-level
updates or new requirements it can save up to 10
minutes per job (!)
2) While the images are being rebuilt we only block one job waiting
for all the images. The tests start running in parallel
only when all images are ready, so we are not blocking
other runs from running.
3) The whole run uses THE SAME image. Previously we could have some
variations because the images were built at different times,
and dependency releases in between could make different
jobs in the same run use slightly different images.
This does not happen any more.
4) Also, when we push an image to GitHub or DockerHub, we push the
very same image that was built and tested. Previously it could
happen that the pushed image was slightly different from the
one that was used for testing (for the same reason).
5) The same applies to the production images. We now build
and push consistently the same images across the board.
6) Documentation building is split into two parallel jobs, docs
building and spell checking, which decreases the elapsed time of
the docs build.
7) Last but not least - we keep the history of all the images;
each image is tagged with the SHA of the commit. This means
that we can simply download and run an image locally to reproduce
any problem that anyone hit in their PR (!) - see the sketch after
this list. This is super useful when helping others debug their problems.
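For example, pulling and running the image a PR was tested with could
look like this (a sketch only: the registry path and tag layout below
are illustrative, the real ones are defined by the CI workflows):

```python
import subprocess

# Illustrative image coordinates - the real registry path and tag
# format are defined by the CI workflows, not by this sketch.
commit_sha = "abc1234"
image = f"docker.pkg.github.com/apache/airflow/ci:{commit_sha}"

# Pull the exact image the PR was tested with and open a shell in it.
subprocess.run(["docker", "pull", image], check=True)
subprocess.run(["docker", "run", "-it", image, "/bin/bash"], check=True)
```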
* fixup! CI Images are now pre-build and stored in registry
* fixup! fixup! CI Images are now pre-build and stored in registry
* fixup! fixup! fixup! CI Images are now pre-build and stored in registry
* fixup! fixup! fixup! CI Images are now pre-build and stored in registry
This cleans up the documentation building process and replaces it
with breeze-only builds. The original instructions with
`pip install -e .[doc]` stopped working, so there is no
point keeping them.
Extracted from #10368
* Improves stability of reported coverage and makes it nicer
With this change we only upload the coverage report when
all tests were successful and actually executed. This means
that the coverage report will not be uploaded when the job gets
cancelled. Currently a lot of the gathered coverage reports contain
far less coverage, because when a static check or the docs build fails,
some test jobs may already have submitted their coverage - resulting
in partial coverage reports.
With this change we also remove the coverage comment on the PR
and replace it with a (for now) informational status message published
to GitHub. If we see that it works, we can change it to a
PR-failing status if coverage drops for a given PR.
This way we might get our coverage monotonically increasing :).
* Pylint checks should be way faster now
Instead of running separate pylint checks for tests and main sources,
we run a single check now. This is possible thanks to a
nice hack - we have a pylint plugin that injects the right
"# pylint: disable=" comment into every test file as its
content is read by astroid (just before tokenization).
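A sketch of how such a plugin can work (hedged: the exact mechanics in
Airflow's plugin may differ; this assumes astroid's Module exposes its
raw source via file_bytes, which pylint tokenizes later):

```python
"""Pylint plugin sketch: loaded via `pylint --load-plugins=...`."""
from astroid import MANAGER, scoped_nodes

DISABLED_CHECKS_FOR_TESTS = "missing-docstring,protected-access"


def transform_test_modules(module: scoped_nodes.Module) -> None:
    """Prepend a '# pylint: disable=' comment to every test module.

    Because the rewritten source is stored back in file_bytes before
    pylint tokenizes it, the injected comment behaves as if it had
    been written in the file itself.
    """
    if module.name.startswith("tests.") or module.name.startswith("test_"):
        source = module.stream().read().decode("utf-8")
        module.file_bytes = (
            f"# pylint: disable={DISABLED_CHECKS_FOR_TESTS}\n" + source
        ).encode("utf-8")


def register(_linter):
    """Plugin entry point required by pylint."""
    MANAGER.register_transform(scoped_nodes.Module, transform_test_modules)
```

(A real plugin would also take care not to shift reported line numbers,
e.g. by inserting the comment after any shebang line.)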
Thanks to that we can also separate out the pylint checks
into a separate CI job - this way all pylint checks
run in parallel with all other checks, effectively halving
the time needed to get static check feedback and potentially
cancelling other jobs much faster.
* fixup! Pylint checks should be way faster now
It is slower to call e.g. dict() than to use the empty literal {}, because the name dict must be looked up in the global scope in case it has been rebound. The same applies to list() and tuple().
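A quick way to see the difference (timings vary by machine; these are
rough figures for CPython):

```python
import timeit

# Literals compile to a single bytecode op; the call forms must first
# look up the name (which may have been rebound) and then call it.
print(timeit.timeit("{}"))      # roughly 0.02s per million loops
print(timeit.timeit("dict()"))  # roughly 0.06s per million loops
print(timeit.timeit("[]"))
print(timeit.timeit("list()"))
```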
* Consistent naming of transfer operators
Transfer operators now have consistent names and are grouped in
'transfers' packages.
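For example, with the Google provider the new layout looks like this
(one concrete case of the general pattern):

```python
# Transfer operators now live in a 'transfers' package within each
# provider, with names following the <Source>To<Target>Operator pattern:
from airflow.providers.google.cloud.transfers.gcs_to_gcs import GCSToGCSOperator
```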
* fixup! Consistent naming of transfer operators
* Introduces 'transfers' packages.
Closes #9161 and #8620
* fixup! Introduces 'transfers' packages.
* fixup! fixup! Introduces 'transfers' packages.
* fixup! fixup! fixup! Introduces 'transfers' packages.
- use the official isort pre-commit hook
- use the official yamllint pre-commit hook
- run the isort pre-commit hook on all Python files instead of only files ending with .py
- add how-to docs for the S3ToRedshift example DAG
- add terraform, which runs terraform CLI commands in an isolated Docker container
NOTE: This system test uses terraform to provision the infrastructure needed to run this example DAG.
Most places already use the `Markup` class to correctly escape problem
areas; this commit fixes the last instances so that we can assert
that `|safe` is never used.
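For reference, the `Markup` pattern escapes interpolated values
automatically, e.g.:

```python
from markupsafe import Markup

user_input = "<script>alert('x')</script>"

# Markup.format() escapes its arguments, so the template never needs
# the `|safe` filter to render this safely.
cell = Markup("<td>{}</td>").format(user_input)
print(cell)  # <td>&lt;script&gt;alert(&#39;x&#39;)&lt;/script&gt;</td>
```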
It's fairly common to say whitelisting and blacklisting to describe
desirable and undesirable things in cyber security. But just because
it is common doesn't mean it's right.
There's an issue with the terminology: it only makes sense if
you equate white with 'good, permitted, safe' and black with 'bad,
dangerous, forbidden'. There are some obvious problems with this.
You may not see why this matters. If you're not adversely affected by
racial stereotyping yourself, then please count yourself lucky. For some
of your friends and colleagues (and potential future colleagues), this
really is a change worth making.
From now on, we will use 'allow list' and 'deny list' in place of
'whitelist' and 'blacklist' wherever possible - which, in fact, is
clearer and less ambiguous. So as well as being more inclusive,
this is a net benefit to understandability.
(Words mostly borrowed from
<https://www.ncsc.gov.uk/blog-post/terminology-its-not-black-and-white>)
Co-authored-by: Jarek Potiuk <jarek@potiuk.com>