The provider.yaml contains more information than is required at
runtime (specifically about documentation building). Those
fields are not needed at runtime and their presence is optional.
Also, the runtime check for provider information should be more
relaxed and allow for future compatibility (with additional
properties allowed). This way we can add new, optional fields to
provider.yaml without worrying about breaking compatibility of
future providers with already released Airflow versions.
This change restores 'additionalProperties': false in the
main, development-focused provider.yaml schema and introduces a
new runtime schema that is used to verify the provider info when
providers are discovered by Airflow.
This 'runtime' version should change very rarely, as adding a
new required property to it breaks compatibility of
providers with already released versions of Airflow.
We also trim down the provider.yaml file when preparing provider
packages so that it only contains the fields that are present in
the runtime schema.
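
The trimming step described above can be sketched roughly as follows.
This is a minimal illustration, not the actual Airflow code: the
schema contents, the field list and both helper names are assumptions
made for the example.

```python
from typing import Any, Dict, List

# Hypothetical, heavily simplified stand-in for the runtime schema.
# The real schema lives in the Airflow repository and may differ.
RUNTIME_SCHEMA: Dict[str, Any] = {
    "type": "object",
    "properties": {
        "package-name": {"type": "string"},
        "name": {"type": "string"},
        "description": {"type": "string"},
        "versions": {"type": "array"},
    },
    "required": ["package-name", "name", "versions"],
    # The runtime check stays relaxed so that newer providers still load.
    "additionalProperties": True,
}


def trim_provider_info(info: Dict[str, Any], schema: Dict[str, Any] = RUNTIME_SCHEMA) -> Dict[str, Any]:
    """Keep only the keys that the runtime schema declares."""
    allowed = schema["properties"].keys()
    return {key: value for key, value in info.items() if key in allowed}


def missing_required(info: Dict[str, Any], schema: Dict[str, Any] = RUNTIME_SCHEMA) -> List[str]:
    """Return required keys that are absent from the provider info."""
    return [key for key in schema["required"] if key not in info]


# Illustrative development-time provider info with a docs-only field.
full_info = {
    "package-name": "apache-airflow-providers-example",
    "name": "Example",
    "description": "An illustrative provider.",
    "versions": ["1.0.0"],
    "integrations": [{"integration-name": "Example", "tags": ["service"]}],
}
trimmed = trim_provider_info(full_info)
```

Here `trimmed` drops the documentation-only `integrations` key while
all required keys survive, so `missing_required(trimmed)` is empty.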
- `from airflow.utils.helpers import chain` to `from airflow.models.baseoperator import chain`
This commit also adds Bowler refactor for backport packages
* Rewrite handwritten argument parser in prepare_provider_packages.py
- Replaced sys.argv manipulation with argparse.
- Replaced positional argument for PACKAGE with optional argument.
Issue : 13069
To be reviewed by : Kamil, Jarek.
* Modified help text for prepare_provider_packages as suggested by Kamil.
Signed-off-by: Debodirno Chandra <debodirnochandra@gmail.com>
* Moved each CLI subcommand in prepare_provider_packages.py to a separate function for modularity and code cleanup.
Signed-off-by: Debodirno Chandra <debodirnochandra@gmail.com>
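
A rough sketch of the structure described above, with argparse
replacing sys.argv handling, one function per subcommand and the
package passed as an optional argument. The subcommand names, the
`--package` flag and the return values are hypothetical and only
illustrate the pattern, not the script's real interface.

```python
import argparse
from typing import List, Optional


def list_packages(args: argparse.Namespace) -> str:
    """Handler for the (hypothetical) list-packages subcommand."""
    return "listing all provider packages"


def build_package(args: argparse.Namespace) -> str:
    """Handler for the (hypothetical) build-package subcommand."""
    target = args.package or "all packages"
    return f"building {target}"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="prepare_provider_packages.py")
    subparsers = parser.add_subparsers(dest="command", required=True)

    list_cmd = subparsers.add_parser("list-packages", help="List provider packages")
    list_cmd.set_defaults(func=list_packages)

    build_cmd = subparsers.add_parser("build-package", help="Build provider packages")
    # PACKAGE is an optional argument rather than a positional one.
    build_cmd.add_argument("--package", default=None, help="Package to build")
    build_cmd.set_defaults(func=build_package)
    return parser


def main(argv: Optional[List[str]] = None) -> str:
    args = build_parser().parse_args(argv)
    # Each subcommand registered its own handler via set_defaults().
    return args.func(args)
```

Dispatching through `set_defaults(func=...)` keeps `main` free of
if/elif chains when more subcommands are added.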
The additional properties should be allowed in the provider schema;
otherwise future versions of providers will not be compatible with
older versions of Airflow.
By specifying 'additionalProperties' as allowed, we are opening up
to adding more properties to provider.yaml.
This change fixes this for now by removing extra fields
added since the Airflow 2.0.0 schema and verifying that the 2.0.0
schema correctly validates the modified dictionary.
In the future we might deprecate 2.0.0 and add a >=2.0.1 limitation
to the provider packages, in which case we will be able to remove
this modification of the provider_info dict.
Also added an additional test that checks whether provider packages
install on Airflow 2.0.0. This test might remain even after the
deprecation of 2.0.0 - we can just move it to 2.0.1. However, this
will give us much greater confidence that the providers will
continue to work even with older versions of Airflow 2.0.
We might have to modify that test and only include the providers
that are backwards-compatible, in case we have some providers
that depend on future Airflow versions. For now we assume
all providers should be installable from master on 2.0.0.
Curl has a sophisticated back-off mechanism when trying to connect,
which sometimes causes it to hang for a very long time
when the first few attempts to connect fail with a 'soft' error.
Similarly, when curl starts a transfer after connecting but the
other party hangs, the client curl call might hang as well.
This causes various problems. For example, waiting for
images in the CI build sometimes gets cancelled because the curl
command that checks for the image fails - example:
https://github.com/apache/airflow/pull/13413/checks?check_run_id=1635401914
This change adds appropriate timeouts to all curl commands we
use in CI/manual operations. In many cases we have implemented
retries, so those failures should stop happening, but even
without a retry, a failing curl is better than a hanging one.
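
As a minimal sketch of the idea, the helper below builds a curl
command line with explicit timeouts and wraps it in a simple retry.
`--connect-timeout` and `--max-time` are standard curl options; the
timeout values, the helper names and the retry count are assumptions
for illustration and are not taken from the CI scripts.

```python
import subprocess
from typing import Callable, List


def curl_command(url: str, connect_timeout: int = 60, max_time: int = 180) -> List[str]:
    """Build a curl invocation that cannot hang indefinitely."""
    return [
        "curl",
        "--silent",
        "--fail",
        # Give up if the connection is not established in time.
        "--connect-timeout", str(connect_timeout),
        # Give up if the whole transfer takes too long.
        "--max-time", str(max_time),
        url,
    ]


def fetch_with_retry(url: str, attempts: int = 3, runner: Callable = subprocess.run) -> bool:
    """Run curl up to `attempts` times; return True on the first success."""
    for _ in range(attempts):
        result = runner(curl_command(url), capture_output=True, check=False)
        if result.returncode == 0:
            return True
    return False
```

The `runner` parameter exists only so the retry logic can be exercised
without network access.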
This is a complete refactor of the setup.py providers/dependencies.
It reflects the current setup much better: most of the extras map
1-1 to providers, but there are also some extras that do
not have their own providers.
The pre-commits that were verifying setup versus documentation
can now be vastly simplified (there is no more need to parse the
comments, so we can import setup.py variables directly rather
than parse them via regexps). We can also better categorize the
extras: separate out (and verify) whether we correctly
described deprecated extras, and mark extras that install
additional providers as such.
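
One way to picture this layout is sketched below: provider extras,
extras without a provider and deprecated aliases are kept in separate
plain dictionaries, so a check can import them directly instead of
parsing comments. All variable names and contents here are
illustrative assumptions, not the actual contents of setup.py.

```python
from typing import Dict, List, Set

# Hypothetical data; the real setup.py defines its own names and lists.
PROVIDERS_REQUIREMENTS: Dict[str, List[str]] = {
    "amazon": ["boto3"],
    "google": ["google-cloud-storage"],
}
# Extras that do not correspond to any provider package.
ADDITIONAL_EXTRAS: Dict[str, List[str]] = {
    "kerberos": ["pykerberos"],
}
# Deprecated extra name -> extra that replaces it.
DEPRECATED_EXTRAS: Dict[str, str] = {
    "aws": "amazon",
    "gcp": "google",
}


def build_extras() -> Dict[str, List[str]]:
    """Combine all three categories into one extras mapping."""
    extras = {name: list(reqs) for name, reqs in PROVIDERS_REQUIREMENTS.items()}
    extras.update({name: list(reqs) for name, reqs in ADDITIONAL_EXTRAS.items()})
    for old_name, new_name in DEPRECATED_EXTRAS.items():
        extras[old_name] = list(extras[new_name])
    return extras


def undocumented_extras(documented: Set[str]) -> List[str]:
    """Return extras that exist in setup data but are missing from the docs."""
    return sorted(set(build_extras()) - documented)
```

A documentation check then reduces to comparing two sets of names,
for example `undocumented_extras({"amazon", "google", "kerberos"})`.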
Fixes: #13309
This change improves the way logs are displayed in CI jobs and
reduces the amount of logs produced by CI jobs in general,
thanks to much smarter cache usage.
Logs in all CI jobs are now grouped, and each group is folded
by default when no error is generated in it. A similar
solution has already been used in the docs job, where it improved
both readability and speed of loading of the logs in CI after
recent improvements in the GitHub UI (previously the speed of
loading the logs was not improved by groups).
Cache usage has also been reviewed and fixed in a number of places,
which will result in much shorter setup times for static checks
and the kubernetes virtualenv, but also far shorter logs generated
by cache setup (we are using the restore-keys feature, which
implements an incremental approach to cache building even though
cache keys in GitHub Actions are immutable).
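
The commit does not show how the groups are emitted. A minimal sketch,
assuming the standard GitHub Actions `::group::` and `::endgroup::`
workflow commands are used, could look like this (the `log_group`
helper is hypothetical):

```python
import contextlib
import sys
from typing import Iterator, Optional, TextIO


@contextlib.contextmanager
def log_group(title: str, stream: Optional[TextIO] = None) -> Iterator[None]:
    """Wrap everything printed inside the block in a foldable log group."""
    out = stream if stream is not None else sys.stdout
    print(f"::group::{title}", file=out)
    try:
        yield
    finally:
        # Always close the group, even if the wrapped step fails.
        print("::endgroup::", file=out)


if __name__ == "__main__":
    with log_group("Static checks"):
        print("running checks...")
```

Closing the group in a `finally` block keeps later output from being
swallowed into a group left open by a failed step.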
* Changes release image preparation to use PyPI packages
Since we have now released all the provider packages to PyPI in
RC version, we can change the mechanism that prepares the
production images to use released packages in case of tagged builds.
The "branch" production images are still prepared using the
CI images and .whl packages built from sources, but the
release packages are built from officially released PyPI
packages.
Also some corrections and updates were made to the release process:
* the constraint tags should contain the rcn suffix when an RC
candidate is sent.
* there was a missing step about pushing the release tag once the
release is out
* pushing the tag to GitHub should be done after the PyPI packages
are uploaded, so that automated image building in DockerHub
can use those packages.
* added a note that in case we release some provider
packages that depend on the just released airflow version,
they should be released after airflow is in PyPI but before
the tag is pushed to GitHub (also to allow the image to be
built automatically from the released packages)
Fixes: #12970
* Update dev/README_RELEASE_AIRFLOW.md
Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
* Update dev/README_RELEASE_AIRFLOW.md
Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
Setuptools has a built-in mechanism for adding a `suffix` to a version,
so we don't have to edit a file and then remember to delete it!
It's a little bit messy to get the suffix like this -- using sed to
strip out any leading digits or periods, but it's copy-pasteable this
way.
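
The sed step described above amounts to stripping leading digits and
periods from the version string. A Python equivalent is sketched
below; the function name is hypothetical, and the resulting suffix
would then be handed to setuptools (for example through the
`egg_info --tag-build` option), which is an assumption about how the
suffix is consumed.

```python
import re


def version_suffix(version: str) -> str:
    """Strip leading digits and periods, leaving only the suffix."""
    return re.sub(r"^[0-9.]*", "", version)


print(version_suffix("2.0.0rc1"))  # prints "rc1"
print(version_suffix("2.0.0"))     # prints an empty line
```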
As discussed in AIP-21
* Rename airflow.sensors.external_task_sensor to airflow.sensors.external_task
* Rename airflow.sensors.sql_sensor to airflow.sensors.sql
* Rename airflow.contrib.sensors.weekday_sensor to airflow.sensors.weekday
As discussed in AIP-21
* Rename airflow.hooks.base_hook to airflow.hooks.base
* Rename airflow.hooks.dbapi_hook to airflow.hooks.dbapi
* Rename airflow.sensors.base_sensor_operator to airflow.sensors.base
* Rename airflow.sensors.date_time_sensor to airflow.sensors.date_time
* Rename airflow.sensors.time_delta_sensor to airflow.sensors.time_delta
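
For users updating their DAGs, the renames listed above can be
expressed as a simple lookup. The mapping is taken from the lists in
these commits; the helper functions are a hypothetical illustration
and not part of Airflow.

```python
from typing import Dict

# Old module path -> new module path, as listed in the commits above.
RENAMES: Dict[str, str] = {
    "airflow.sensors.external_task_sensor": "airflow.sensors.external_task",
    "airflow.sensors.sql_sensor": "airflow.sensors.sql",
    "airflow.contrib.sensors.weekday_sensor": "airflow.sensors.weekday",
    "airflow.hooks.base_hook": "airflow.hooks.base",
    "airflow.hooks.dbapi_hook": "airflow.hooks.dbapi",
    "airflow.sensors.base_sensor_operator": "airflow.sensors.base",
    "airflow.sensors.date_time_sensor": "airflow.sensors.date_time",
    "airflow.sensors.time_delta_sensor": "airflow.sensors.time_delta",
}


def new_module_path(old_path: str) -> str:
    """Return the renamed module path, or the input if it was not renamed."""
    return RENAMES.get(old_path, old_path)


def rewrite_import_line(line: str) -> str:
    """Replace any old module path found in a single import line."""
    for old_path, new_path in RENAMES.items():
        line = line.replace(old_path, new_path)
    return line
```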
Co-authored-by: Kaxil Naik <kaxilnaik@apache.org>
So far, the production images of Airflow were using sources
when they were built on CI. This PR changes that: the
airflow + providers packages are built first and then installed,
rather than using sources as the installation mechanism.
Part of #12261
This PR implements discovering and reading provider information from
packages (using entry_points) and - if found - from local
provider yaml files for the built-in airflow providers,
when they are found in the airflow.provider packages.
The provider.yaml files - if found - take precedence over the
package-provided ones.
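
The precedence rule can be sketched as a plain dictionary merge. The
function and the sample data below are hypothetical; the real
discovery code reads entry points and yaml files rather than taking
ready-made dictionaries.

```python
from typing import Any, Dict

ProviderInfo = Dict[str, Any]


def merge_provider_info(
    from_packages: Dict[str, ProviderInfo],
    from_yaml: Dict[str, ProviderInfo],
) -> Dict[str, ProviderInfo]:
    """Merge both sources; local provider.yaml entries win on conflicts."""
    merged = dict(from_packages)
    merged.update(from_yaml)
    return merged


# Illustrative inputs keyed by package name.
packages = {
    "apache-airflow-providers-a": {"versions": ["1.0.0"]},
    "apache-airflow-providers-b": {"versions": ["1.0.0"]},
}
local_yaml = {
    "apache-airflow-providers-a": {"versions": ["1.0.1"]},
}
providers = merge_provider_info(packages, local_yaml)
```

In this example provider `a` is taken from the local yaml data, while
provider `b`, which has no local file, keeps the package-provided
information.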
Add displaying provider information in CLI
Closes: #12470
* Make the KubernetesPodOperator backwards compatible
This PR significantly reduces the pain of upgrading to Airflow 2.0
for users of the KubernetesPodOperator. Users will be allowed to
continue using the airflow.kubernetes custom classes.
* spellcheck
* spelling
* clean up unnecessary files in 1.10
The built-in functions all() and any() in Python support
short-circuiting (evaluation stops as soon as the overall return value
of the function is known), but this behavior is lost if you pass them
a list comprehension instead of a generator expression, because the
whole list is built before the function runs. This affects performance.
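
A small demonstration of the difference, counting how many times the
predicate is evaluated in each form (the helper name and sample data
are illustrative only):

```python
calls = []


def is_negative(number: int) -> bool:
    """Predicate that records every value it is asked about."""
    calls.append(number)
    return number < 0


numbers = [-1, 2, 3, 4]

# List comprehension: the full list is built before any() sees it.
calls.clear()
any([is_negative(n) for n in numbers])
list_calls = len(calls)  # 4

# Generator expression: any() stops at the first True value.
calls.clear()
any(is_negative(n) for n in numbers)
generator_calls = len(calls)  # 1
```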