* Gets rid of all the docker CLI tools in Breeze.
The docker CLI tools caused more trouble than benefit.
Replaced the last two (azure and aws) with installation into the
"/files" directory so that they survive a Breeze restart.
Also added a --reinstall option to the installation command for all
tools so that it is easier to reinstall them.
* Update .pre-commit-config.yaml
Previously, the UPGRADE_TO_LATEST_CONSTRAINTS variable controlled
whether the CI image uses the latest dependencies rather than
fixed constraints. This PR brings the same option to the PROD image.
The ARG is renamed to UPGRADE_TO_NEWER_DEPENDENCIES
as this corresponds better with the intention.
So far, the production images of Airflow were installed from sources
when they were built on CI. This PR changes that: it builds the
airflow + providers packages first and installs them,
rather than using sources as the installation mechanism.
Part of #12261
This change upgrades setup.py and setup.cfg to provide a non-conflicting,
`pip check`-valid set of constraints for the CI image.
Fixes #10854
Co-authored-by: Tomek Urbaszek <turbaszek@apache.org>
The change #10806 made airflow work with implicit packages
when "airflow" gets imported. This is a good change; however,
it has some unforeseen consequences. The 'provider_packages'
script copies all the providers code for backports, in order
to refactor it, into the empty "airflow" directory in the
provider_packages folder. The #10806 change turned that
empty folder into an 'airflow' package because it was in the
same directory as the provider_packages scripts.
Moving the scripts to dev solves this problem.
This test was bundled in with the existing needs-api tests, but then
performed its _own_ checks on whether it should run. This changes that
to have selective_ci_checks.sh do the check.
Additionally, CI_SOURCE_REPO was often wrong -- at least for me, as I
don't open PRs from ashb/airflow, and this led to a confusing message:
> https://github.com/ashb/airflow.git Branch my_branch does not exist
All we were using this for was to find the "parent" commit, but
there is an easier way to do that: HEAD^1, with a fetch depth of 2
passed to the checkout option.
So I've removed calculating that value and the places where it is used.
If we need to bring it back, we should use the output of the
`potiuk/get-workflow-origin` action -- that gets the correct value.
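The HEAD^1 approach can be demonstrated locally. A hedged sketch -- the throwaway repo and commit messages below are illustrative, not taken from the actual workflow:

```shell
# Simulate what a checkout with fetch depth 2 gives us: the merge commit
# plus its parent, so HEAD^1 resolves without knowing the fork URL.
set -e
workdir=$(mktemp -d)
git init -q "$workdir/demo"
cd "$workdir/demo"
git -c user.email=ci@example.com -c user.name=ci \
    commit -q --allow-empty -m "parent commit"
parent_sha=$(git rev-parse HEAD)
git -c user.email=ci@example.com -c user.name=ci \
    commit -q --allow-empty -m "merge commit"
# The "parent" commit is simply HEAD^1 -- no repo/branch lookup needed.
resolved=$(git rev-parse HEAD^1)
echo "$resolved"
```

With only two commits fetched, `HEAD^1` is always the commit the PR is based on, regardless of which fork the PR came from.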
We had a lot of problems recently with the queues in GitHub
Actions. This documentation explains the motivation and approach
we have taken for optimizing our PR workflow.
Some tests (testing the structure and importability of
examples) should always be run, even if the core part was not modified.
That's why we move them to the "always" directory.
The security scans take a long time, especially for Python code
- about 18 minutes now. This PR reduces strain on
GitHub Actions by only running each scan in pull requests
when any Python/JavaScript code changed, respectively.
* Images are not built if the change is not touching code or docs.
* In case we have no need for CI images we run stripped-down
pre-commit checks which skip the long checks and only run for
changed files
* If none of the CLI/Providers/Kubernetes/WWW files changed
the relevant tests are skipped, unless some of the core files
changed as well.
* The selective checks logic is explained and documented.
This is the second attempt at the problem, with a better
strategy to get the list of files from the incoming PR.
The strategy now works better in a number of cases:
* when the PR comes from the same repo
* when the PR comes from a forked repo
* when the PR contains more than one commit
* when the PR is based on an older master and GitHub creates a
merge commit
* Images are not built if the change is not touching code or docs.
* In case we have no need for CI images we run stripped-down
pre-commit checks which skip the long checks and only run for
changed files
* If none of the CLI/Providers/Kubernetes/WWW files changed
the relevant tests are skipped, unless some of the core files
changed as well.
* The selective checks logic is explained and documented.
* Separate changes/readmes for backport and regular providers
We have now separate release notes for backport provider
packages and regular provider packages.
They use different versioning - backport provider
packages use CalVer, regular provider packages use
SemVer.
* Added support for provider packages for Airflow 2.0
This change consists of the following changes:
* adds provider package support for 2.0
* adds generation of package readme and change notes
* versions are for now hard-coded to 0.0.1 for first release
* adds automated tests for installation of the packages
* renames backport package readmes/changes to BACKPORT_*
* adds regular package readmes/changes
* updates documentation on generating the provider packages
* adds CI tests for the packages
* maintains backport packages generation with the --backports flag
Fixes #11421, Fixes #11424
In preparation for adding provider packages to 2.0 line we
are renaming backport packages to provider packages.
We want to implement this in stages - first to rename the
packages, then split-out backport/2.0 providers as part of
the #11421 issue.
I noticed that when there are no setup.py changes, the constraints
are not upgraded automatically. This is because of the Docker
caching strategy used - it simply does not even know that the
upgrade of pip should happen.
I believe it is really good (from a security and incremental-updates
POV) to attempt an upgrade at every successful merge. Note that
the upgrade will not be committed if any of the tests fail, and this
only happens on merges to master or scheduled runs.
This way we will have more frequent but smaller constraint changes.
Depends on #10828
We introduced deletion of the old artifacts as this was
the suspected culprit of Kubernetes Job failures. It turned out
eventually that those Kubernetes Job failures were caused by
the #11017 change, but it's good to do housekeeping of the
artifacts anyway.
The delete workflow action, introduced in a hurry, had a few problems:
* it runs for every fork when they sync master. This is a bit
too invasive
* it fails continuously after 10 - 30 minutes every time,
as we have too many old artifacts to delete (GitHub has a
90-day retention policy, so we likely have tens of
thousands of artifacts to delete)
* it runs every hour, which causes occasional API rate-limit
exhaustion (because we have too many artifacts to loop through)
This PR introduces filtering by the repo and changes the frequency
of deletion to 4 times a day. A back-of-the-envelope calculation
caps the 4 daily runs at 2500 artifacts to delete each, so we have a low
risk of reaching the 5000 API calls/hr rate limit. It also adds a script
that we are running manually to delete the excessive artifacts now.
Eventually, when the number of artifacts goes down, the regular job
should delete maybe a few hundred artifacts appearing within each
6-hour window in normal circumstances, and it should stop failing then.
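The back-of-the-envelope numbers above can be checked directly. A minimal sketch, using only the figures quoted in this description:

```shell
# 4 deletion runs per day, capped at 2500 artifacts each, measured
# against the 5000 API calls/hr rate limit.
runs_per_day=4
artifacts_per_run=2500
rate_limit_per_hour=5000
daily_capacity=$((runs_per_day * artifacts_per_run))
echo "daily deletion capacity: $daily_capacity"   # 10000 artifacts/day
# Each run uses at most half of one hour's rate-limit budget:
[ "$artifacts_per_run" -le $((rate_limit_per_hour / 2)) ] && echo "under limit"
```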
* Constraint files are now maintained automatically
* No need to generate requirements when setup.py changes
* requirements are kept in separate orphan branches, not in the main repo
* merges to master verify that the latest requirements work and
push the tested requirements to the orphan branches
* we keep the history of requirement changes and can label them
individually for each version (by the constraint-1.10.n tag name)
* consistently changed all references to be 'constraints', not
'requirements'
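The orphan-branch mechanism above can be sketched locally. A hedged demonstration (the branch name and commit messages are hypothetical) showing that an orphan branch carries none of the main branch's history:

```shell
set -e
workdir=$(mktemp -d)
git init -q "$workdir/repo"
cd "$workdir/repo"
git -c user.email=ci@example.com -c user.name=ci \
    commit -q --allow-empty -m "main history"
git -c user.email=ci@example.com -c user.name=ci \
    commit -q --allow-empty -m "more main history"
# --orphan starts a branch with no parents, detached from main's history:
git checkout -q --orphan constraints-demo
git -c user.email=ci@example.com -c user.name=ci \
    commit -q --allow-empty -m "tested constraints"
commit_count=$(git rev-list --count HEAD)
echo "$commit_count"   # 1 -- the orphan branch has only its own commit
```

This keeps the constraint history out of the main repository's history while still allowing tags per version.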
* add git sync sidecars
* add a helm test
* add more tests
* allow users to provide git username and password via a k8s secret
* set default values for airflow worker repository & tag
* change ci timeout
* fix link
* add credentials_secret to airflow.cfg configmap
* set GIT_SYNC_ADD_USER on kubernetes worker pods, set uid
* add fsGroup to webserver and kubernetes workers
* move gitSync to dags.gitSync
* rename valueFields
* turn off git sync and dag persistence by default
* provide option to specify known_hosts
* add git-sync details into the chart documentation
* Update .gitignore
Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
* make git sync max failures configurable
* Apply suggestions from code review
Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
* add back requirements.lock
Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
The Kubernetes tests are now run using the Helm chart
rather than the custom templates we used to have.
The Helm chart uses the locally built production image,
so the tests exercise not only Airflow but also the
Helm chart and the production image - all at the
same time. Later on we will add more tests
covering more functionality of both the Helm chart
and the production image. This is the first step to
get all of those bundled together and become
testable.
This change also introduces a 'shell' sub-command
for Breeze's kind-cluster command and an
EMBEDDED_DAGS build arg for the production image -
both of them useful for running the Kubernetes tests
more easily - without building two images
and with an easy-to-iterate-over-tests
shell command - which works without any
other development environment.
Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
Co-authored-by: Daniel Imberman <daniel@astronomer.io>
All PRs will use the cached "latest good" version of the Python
base images from our GitHub registry. The Python versions in
the GitHub registry will only get updated after a master
build (which pulls the latest Python image from DockerHub) builds
and passes tests correctly.
This is to avoid the problems that we had recently with Python
patchlevel releases breaking our Docker builds.
The CRON job did not have everything working
after the emergency migration to GitHub Actions.
This change brings back the following improvements:
* rebuilding images from scratch in the CRON job
* automatically upgrading all requirements to test whether new ones work
* pushing production images to GitHub Packages as cache
* pushing the nightly tag to GitHub