Commit graph

11446 commits

Author SHA1 Message Date
Jarek Potiuk 9b0ea24ad6 Install airflow and providers together from context files (#13441)
Airflow and provider packages need to be installed together to
make sure that constraints are taken into account and that airflow
does not get reinstalled from PyPI when eager upgrade runs.

(cherry picked from commit bc6f5ea088)
2021-01-21 18:52:33 +00:00
Kamil Breguła 9a66fc0dab fixup! Fixed failing pylint errors introduced in #13403 (#13429) (#13437)
(cherry picked from commit ad0d3e17e1)
2021-01-21 18:51:54 +00:00
Kamil Breguła 4c4b475d8d fixup! Adds timeout to all curl commands (#13431) (#13435)
(cherry picked from commit abcb0874a4)
2021-01-21 18:51:49 +00:00
Jarek Potiuk 6707bbe74c Add extras when installing prod image from packages (#13432)
The latest change, #13422, altered the way production images are
prepared and removed extras from the installed airflow, which caused
the production image verification check to fail.

This change restores extras when airflow is installed from packages

(cherry picked from commit 3a731108f5)
2021-01-21 18:51:15 +00:00
Jarek Potiuk 94fb61382a Adds timeout to all curl commands (#13431)
Curl has a sophisticated back-off mechanism when trying to connect,
and this sometimes causes it to hang for a very long time
when the first few attempts to connect fail with a 'soft' error.
Similarly, when curl starts a transfer after connecting but the
other party hangs, the client curl call might hang as well.

This causes various problems. For example, waiting for
images in the CI build sometimes gets cancelled because the curl
command that checks for the image fails. Example:

https://github.com/apache/airflow/pull/13413/checks?check_run_id=1635401914

This change adds appropriate timeouts to all curl commands we
use in CI/manual operations. In many cases we have implemented
retries, so those cases should stop happening, but even in
the no-retry case a failing curl is better than a hang.

(cherry picked from commit 0909ddfd24)
2021-01-21 18:51:10 +00:00
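The timeout change in 94fb61382a above can be illustrated with a minimal Python sketch. It relies on curl's standard `--connect-timeout` and `--max-time` options; the helper name, the default values and the extra `--silent`/`--fail` flags are assumptions made for this example and are not taken from the Airflow CI scripts.

```python
from typing import List


def curl_command(url: str, connect_timeout: int = 60, max_time: int = 60) -> List[str]:
    """Build a curl argv that cannot hang indefinitely.

    The helper name and the default timeouts are illustrative assumptions.
    """
    return [
        "curl",
        "--silent",
        "--fail",
        # Give up if the connection is not established within this many seconds.
        "--connect-timeout", str(connect_timeout),
        # Give up if the whole operation takes longer than this many seconds.
        "--max-time", str(max_time),
        url,
    ]
```

The resulting list could be passed to `subprocess.run(..., check=True)`, so that a timed-out curl surfaces as an error that a retry loop can handle rather than as a hang.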
Jarek Potiuk 202b636a4c Fix selective checks for changes outside of airflow .py files (#13430)
When no airflow files change, selective tests only run basic
tests, but this is wrong, because many .py files are
outside of the airflow folder.

In this case we should enable image building, because only then
is the full set of static checks executed.

This bug caused, for example, #13403 to succeed even though it failed
static checks after merge.

(cherry picked from commit 1fe83a435d)
2021-01-21 18:51:05 +00:00
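The rule described in 202b636a4c above might be sketched roughly as follows. This is a hypothetical Python illustration only; the function name and the decision rule are assumptions, and the actual selective-check scripts may work differently.

```python
from typing import Iterable


def needs_image_build(changed_files: Iterable[str]) -> bool:
    """Return True when the full set of static checks should run.

    Hypothetical rule: any changed Python file counts, not only
    files that live under the airflow/ folder.
    """
    return any(path.endswith(".py") for path in changed_files)
```

With such a rule, a change to `docs/build_docs.py` alone would still trigger image building and therefore the full static checks.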
Kamil Breguła a4c0221dc9 Fix pylint issues - broken master (#13427)
(cherry picked from commit 10165849b2)
2021-01-21 18:50:51 +00:00
Jarek Potiuk d2b58eef8c Fixed failing pylint errors introduced in #13403 (#13429)
This fixes a failing pylint error introduced in #13403. This error
also triggers another pylint problem related to c-extension

(cherry picked from commit 6e1a6ff3c8)
2021-01-21 18:50:24 +00:00
Jarek Potiuk 9bfc783449 Removes pip download when installing from local packages (#13422)
This PR improves building production image from local packages,
in preparation for moving provider requirements out of setup.cfg.

Previously `pip download` step was executed in the CI scripts
in order to download all the packages that were needed. However
this had two problems:

1) PIP download was executed outside of Dockerfile in CI scripts
   which means that any change to requirements there could not
   be executed in 'workflow_run' event - because main branch version
   of CI scripts is used there. We want to add extra requirements
   when installing airflow so in order to be able to change
   it, those requirements should be added in Dockerfile.
   This will be done in the follow-up #13409 PR.

2) Packages downloaded with PIP download have a "file" version
   rather than regular == version when you run pip freeze/check.
   This looks weird and while you can figure out the version
   from file name, when you `pip install` them, they look
   much more normal. The airflow package and provider package
   will still get the "file" form but this is ok because we are
   building those packages from sources and they are not yet
   available in PyPI.

Example:

  adal==1.2.5
  aiohttp==3.7.3
  alembic==1.4.3
  amqp==2.6.1
  apache-airflow @ file:///docker-context-files/apache_airflow-2.1.0.dev0-py3-none-any.whl
  apache-airflow-providers-amazon @ file:///docker-context-files/apache_airflow_providers_amazon-1.0.0-py3-none-any.whl
  apache-airflow-providers-celery @ file:///docker-context-files/apache_airflow_providers_celery-1.0.0-py3-none-any.whl
  ...

With this PR, we do not `pip download` all packages, but instead
we prepare airflow + providers packages as .whl files and
install them from there (all the dependencies are installed
from PyPI)

(cherry picked from commit e436883583)
2021-01-21 18:47:14 +00:00
Kamil Breguła 754f14651e Add verbose flag to ./build_docs.py (#13403)
(cherry picked from commit c674f81cb7)
2021-01-21 18:47:09 +00:00
Jarek Potiuk b9c0ed6e1d Change timeouts and disable reverse IP lookup for integrations (#13424)
It seems that we are hitting one of Ash's favourite bugs
more often: DNS. Quote: "It's always DNS".

It looks like there is a race condition with docker compose
that causes services that started fast enough (before DNS)
to get a different reverse-DNS IP lookup (usually it is
just `<SERVICE>` but sometimes it is
`<DOCKER_COMPOSE_APP>_<SERVICE>_1_<NETWORK>`).
This produces misleading messages in the log that might
make analysis of such problems difficult. That is why
we chose to get rid of the reverse lookup and give
each service more time to check whether it is ready.

Netcat, unfortunately, performs both forward and reverse
lookup when given a name - forward lookup to find the
IP address and reverse lookup to write information to the
log about the host it connected to - and if it sees
that the original and reverse-looked-up names do not match,
it returns an error even if it manages to connect:

`DNS fwd/rev mismatch` - which is very misleading.

This change performs the following:

1) We look up the host name in python via gethostbyname
2) We set -n in netcat to disable ANY DNS use
3) We feed netcat with the IP address
4) We've standardized all waiting times to be up to 50 seconds

This way we should get rid of the DNS fwd/rev mismatch once
and for all.

(cherry picked from commit ae625b4483)
2021-01-21 18:46:55 +00:00
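Steps 1 to 3 of b9c0ed6e1d above can be sketched in Python as shown below. This is a minimal illustration, not the code from the commit: the function name is hypothetical, and the `-z` flag (probe the port without sending data) is an assumption of this sketch, since the commit message only mentions `-n`.

```python
import socket
from typing import List


def netcat_check_command(host: str, port: int) -> List[str]:
    """Resolve the host in Python, then hand netcat the IP with DNS disabled."""
    # Forward lookup only; no reverse lookup is performed here.
    ip_address = socket.gethostbyname(host)
    # -n tells netcat not to use DNS at all, so it cannot report
    # a forward/reverse mismatch. -z is an assumption of this sketch.
    return ["nc", "-n", "-z", ip_address, str(port)]
```

Because netcat only ever sees an IP address, the check succeeds or fails on connectivity alone.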
Bijan Soltani 2244dced69 Improves documentation regarding providers and custom connections 2 (#13410)
(cherry picked from commit 09a2413fe6)
2021-01-21 18:46:28 +00:00
Kamil Breguła 7f117b49cc Fix environment checking for Apache Pinot (#13419)
(cherry picked from commit 27c757d1ee)
2021-01-21 18:46:14 +00:00
Jarek Potiuk 4f16cd8598 Add last-commit example to static-check --help message. (#13411)
(cherry picked from commit 43f150b7f8)
2021-01-21 18:45:57 +00:00
Kamil Breguła 6dd2dbb1d8 Allow ./run_tmux.sh script to run standalone (#13420)
(cherry picked from commit 57143d6b79)
2021-01-21 18:45:40 +00:00
Vivek Bhojawala 5fa5d5cd14 Developers Quick Guide (#13417)
rebased and updated new tmux image as per new changes.

(cherry picked from commit 181d8b66a9)
2021-01-21 18:45:26 +00:00
Kamil Breguła 16018c68bb Enable interpretation of backslash escapes for colored message (#13418)
(cherry picked from commit f6a3c822a3)
2021-01-21 18:45:22 +00:00
Jarek Potiuk 05b805d7bc Set minimum SQLite version supported. (#13412)
* Set minimum SQLite version supported.

Some users reported that some older versions of SQLite do not
work with Airflow 2.0. This happens for example with latest
sqlite available by default on RHEL7 (sqlite version available
in fully updated system there is 7 (!) years old)

Example of such issue: #13397.

Not sure which 'minimum' version is supported but
in the Breeze environment based on debian buster we have
3.27.2 version in fully updated system. This should be our
baseline.

* Update README.md

Co-authored-by: Xiaodong DENG <xd.deng.r@gmail.com>
(cherry picked from commit 670056311a)
2021-01-21 18:44:39 +00:00
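A version check along the lines of 05b805d7bc above could look like the following Python sketch. The 3.27.2 baseline comes from the commit message; the function name and the idea of checking at runtime are assumptions made for this example.

```python
import sqlite3
from typing import Tuple

# Baseline mentioned in the commit message (Debian buster in Breeze).
MIN_SQLITE_VERSION = (3, 27, 2)


def sqlite_is_supported(
    version: Tuple[int, ...] = sqlite3.sqlite_version_info,
    minimum: Tuple[int, ...] = MIN_SQLITE_VERSION,
) -> bool:
    """Compare the linked SQLite library version against the baseline."""
    # Tuples compare element by element, so (3, 7, 17) < (3, 27, 2).
    return version >= minimum
```

Calling `sqlite_is_supported()` without arguments checks the SQLite library that the running Python interpreter is linked against.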
Kamil Breguła 26a5178475 Warns politely, do not force run a long operation (#13313)
* Warns politely, do not force run a long operation

(cherry picked from commit 7e1d28b381)
2021-01-21 18:44:33 +00:00
dstandish c2763cb474 Remove reference to scheduler run_duration param in docs (#13346)
* links were old / dead
* run_duration was removed from scheduler
* clarify related notes on backward compat in helm chart values

Co-authored-by: Daniel Standish <dstandish@techstyle.com>
(cherry picked from commit 028d8e8efb)
2021-01-21 18:43:42 +00:00
Kamil Breguła e9cf263a22 Limit old versions of pinotdb to force update on CI (#13402)
(cherry picked from commit a1f4938ec5)
2021-01-21 18:43:36 +00:00
Jarek Potiuk 1cdb04f25c Update persists-credentials (#13401)
Previous change to add persist-credentials #13389 wrongly added
persists-credentials to python-setup rather than checkout
action. Also one of the checkout actions used master rather than
v2 tag.

(cherry picked from commit 85ac03f58c)
2021-01-21 18:43:19 +00:00
I-Yang Chen 856848d137 Update celery.rst to fix broken links. (#13400)
(cherry picked from commit cc9a19d2cd)
2021-01-21 18:42:52 +00:00
dstandish 48caec3724 Simplify CeleryKubernetesExecutor tests (#13307)
* Simplify CeleryKubernetesExecutor tests

Co-authored-by: Daniel Standish <dstandish@techstyle.com>
(cherry picked from commit 10be37513c)
2021-01-21 18:42:34 +00:00
Kamil Breguła 57ebf0135f Fix malformed table in production-deployment.rst (#13395)
(cherry picked from commit dcedb813e4)
2021-01-21 18:42:05 +00:00
Bijan Soltani 1deeb796be Improves documentation regarding providers and custom connections (#13375)
Co-authored-by: Bijan <me+git@bijansoltani.com>
(cherry picked from commit b52d39f047)
2021-01-21 18:41:33 +00:00
Jarek Potiuk 97d0b0505b Disable persisting credentials in Github Action's checkout (#13389)
This PR disables persisting credentials in Github Actions checkout.

This is a result of discussion in builds@apache.org
https://lists.apache.org/thread.html/r435c45dfc28ec74e28314aa9db8a216a2b45ff7f27b15932035d3f65%40%3Cbuilds.apache.org%3E

It turns out that, contrary to the documentation, actions (specifically
the checkout action) can use GITHUB_TOKEN without it being specified as
an input in the yaml file, and the GitHub checkout action
leaves the repository with credentials stored locally that
enable any step in the same job to push to the GitHub
repository. This was initially thought to be forbidden (and the
documentation clearly says that the action must have the
GITHUB_TOKEN passed to it in the .yaml workflow in order to
use it), but apparently it behaves differently.

This leaves open an attack vector where for example
any PIP package installed in the following steps could push
any changes to GitHub Repository of Apache Airflow.

Security incidents have been reported to both GitHub and
Apache Security team, but in the meantime we add configuration
to remove credentials after checkout step.

https://docs.github.com/en/free-pro-team@latest/actions/reference/authentication-in-a-workflow#using-the-github_token-in-a-workflow

> Using the GITHUB_TOKEN in a workflow

> To use the GITHUB_TOKEN secret, you *must* reference it in your workflow
  file. Using a token might include passing the token as an input to an
  action that requires it, or making authenticated GitHub API calls.

(cherry picked from commit d079b913d2)
2021-01-21 18:41:21 +00:00
Kaxil Naik 28ed4ff28e Use 2.0.0 in Airflow docs & Breeze (#13379)
(cherry picked from commit fe45f1bab5)
2021-01-21 18:40:58 +00:00
Kaxil Naik e591158743 Fix Apache Airflow icon link in Helm Chart (#13387)
Previous link (https://www.astronomer.io/static/airflowNewA.png) is broken.

This commit uses the link from the official docs instead of the Astronomer one.

(cherry picked from commit 410ab8975a)
2021-01-21 18:40:42 +00:00
Kamil Breguła 1a4c651267 Add integration tests for Apache Pinot (#13195)
* Add integration tests for Apache Pinot

* fixup! Add integration tests for Apache Pinot

* fixup! fixup! Add integration tests for Apache Pinot

* fixup! fixup! fixup! Add integration tests for Apache Pinot

* fixup! fixup! fixup! fixup! Add integration tests for Apache Pinot

* Update setup.cfg

(cherry picked from commit 98f097e542)
2021-01-21 18:38:23 +00:00
Kaxil Naik 1a741521bb Fix broken link in PR Welcome message (#13386)
https://github.com/apache/airflow/blob/master/docs/howto/custom-operator.rst no longer exists

New link: https://github.com/apache/airflow/blob/master/docs/apache-airflow/howto/custom-operator.rst

(cherry picked from commit 57cbcf6bbd)
2021-01-21 18:37:27 +00:00
Kaxil Naik 9bd5de3202 Bugfix: Sync Access Control defined in DAGs when running sync-perm (#13377)
fixes https://github.com/apache/airflow/issues/13376

(cherry picked from commit 1b94346fbe)
2021-01-21 18:37:02 +00:00
Kaxil Naik 0f850344c7 Fix typo in Open API docs (#13374)
`releaase` -> `release`

(cherry picked from commit d5cf993f81)
2021-01-21 18:36:53 +00:00
Kaxil Naik 8364362655 Minor enhancements to Sensors docs (#13381)
- Removed redundant comma
- Used list-table so that modifications are easy
- Added syntax highlighting for config code-block

(cherry picked from commit a4a3d3f262)
2021-01-21 18:36:39 +00:00
Kaxil Naik 4b1da369bb Fix Grammar in PIP warning (#13380)
`might leads to errors` -> `might lead to errors`

(cherry picked from commit 295d66f914)
2021-01-21 18:36:26 +00:00
Kian Ghodoussi cbb69e1153 Add Fleek Fashion to the list of Airflow users (#13372)
Add Fleek Fashion to INTHEWILD.md.

(cherry picked from commit bafd258e66)
2021-01-21 18:36:23 +00:00
Jarek Potiuk 643d878bc8 Production image can also be upgraded to newer dependencies (#13345)
Previously UPGRADE_TO_LATEST_CONSTRAINTS variable controlled
whether the CI image uses latest dependencies rather than
fixed constraints. This PR brings it also to PROD image.

The name of the ARG is changed to UPGRADE_TO_NEWER_DEPENDENCIES
as this corresponds better with the intention.

(cherry picked from commit 82fa048c12)
2021-01-21 18:36:19 +00:00
Jarek Potiuk 803a064479 Refactored setup.py to better reflect changes in providers (#13314)
This is a complete refactor of the setup.py providers/dependencies.

It much better reflects the current setup where we have most of
the extras 1-1 reflecting providers but also some extras that do
not have their own providers.

The pre-commits that were verifying setup versus documentation
can now be vastly simplified (there is no more need to parse the
comments, so we can import setup.py variables directly rather
than parse them via regexps). Also we can better categorize the
extras - separate out (and verify) whether we correctly
described deprecated extras and mark extras that install
additional providers as such.

Fixes: #13309
(cherry picked from commit 0d214575a1)
2021-01-21 18:35:37 +00:00
Jarek Potiuk 047a19790a Adds missing LDAP "extra" dependencies to ldap provider. (#13308)
It seems that for quite some time (since 1.10.4) the "ldap" extra
has been missing the python-ldap dependency.

https://issues.apache.org/jira/browse/AIRFLOW-5261

Also LDAP seems to be popular enough to be added as default
extra in the production image.

Fixes #13306

(cherry picked from commit d23ac9b235)
2021-01-21 18:34:29 +00:00
Jarek Potiuk f9868b25d1 Print better error message when tests fail (#13339)
The recently added log grouping hides error messages in case
there is an error in tests. You need to manually unfold the last test
step, which is somewhat hidden - it is followed by several
'dump-container' logs.

This change adds clear error message showing the exact log
group that you need to unfold in case you want to look for
a problem.

(cherry picked from commit f7d354df1c)
2021-01-21 18:34:04 +00:00
Jarek Potiuk 38b9a410e2 Re-enables verification of production image (#13329)
The PROD image is now verified by several checks:

* whether all expected providers are installed
* whether pip-check shows no conflicts
* whether imports are working for expected features

Part of #13315

(cherry picked from commit 3b4290d055)
2021-01-21 18:34:00 +00:00
Kamil Breguła ffe1bea89c json-merge-patch becomes optional library and has looser restrictions (#13175)
(cherry picked from commit e35bdb94b2)
2021-01-21 18:32:56 +00:00
Jarek Potiuk ec68308065 Add missing sqlite provider for production image (#13332)
The production image was missing sqlite provider (this fails
pip check)

(cherry picked from commit 09c6549200)
2021-01-21 18:31:56 +00:00
André Amaral 1c439dae85 Fix the behavior for deactivating the authentication option and document the process to do it (#13191)
(cherry picked from commit 4be27af04d)
2021-01-21 18:31:00 +00:00
dstandish 67a3b03317 Prefer newer CLI syntax over legacy in helm chart (#13330)
* saves approx 1 second and an error message when using >= 1.10.14

Co-authored-by: Daniel Standish <dstandish@techstyle.com>
(cherry picked from commit 641f63c2c4)
2021-01-21 18:29:58 +00:00
Jarek Potiuk 42ec1e262c Vastly improves usability of CI logs (#13323)
This change introduces improvements in the way logs are displayed
in CI jobs and in amount of logs produced in general for CI jobs
due to much smarter cache usage.

Logs in all CI jobs are now grouped in groups which are folded
by default when there is no error generated in such group. Similar
solution has been already used in docs job and it improved
both readability and speed of loading of the logs in CI after
recent improvements in Github UI (previously the speed of loading
the logs was not improved by groups).

Also cache usage has been reviewed and fixed in a number of places,
which will result in much shorter setup times for static checks
and kubernetes virtualenv but also far shorter logs generated by
cache setup (we are using the restore-keys feature that implements
an incremental approach to cache building even though cache keys in
GitHub Actions are immutable).

(cherry picked from commit d41c6a46b1)
2021-01-21 18:29:52 +00:00
Jarek Potiuk d352f52fd1 Also add codecov action to apache airflow repo (#13328)
Follow up after #13327

(cherry picked from commit 98896e4e32)
2021-01-21 18:29:48 +00:00
Jarek Potiuk 3a7370f530 Switch to Apache-owned GitHub actions (#13327)
There was a change in Policy of ASF that only "Made by GitHub"
actions and actions residing in Apache-owned repositories
are allowed to be used for ASF projects. This was in
response to a security incident.

More details:

Policy:

* https://infra.apache.org/github-actions-secrets.html

Discussion builds@apache.org:

* https://lists.apache.org/thread.html/r435c45dfc28ec74e28314aa9db8a216a2b45ff7f27b15932035d3f65%40%3Cbuilds.apache.org%3E

Discussion users@infra.apache.org:

* https://lists.apache.org/thread.html/r900f8f9a874006ed8121bdc901a0d1acccbb340882c1f94dad61a5e9%40%3Cusers.infra.apache.org%3E

(cherry picked from commit c6d66cd15f)
2021-01-21 18:29:44 +00:00
Jarek Potiuk dc843192e6 Rename PIP_VERSION to AIRFLOW_PIP_VERSION (#13320)
Some older versions of PIP (including the one in dockerhub!) treat
all env variables starting with PIP_ as a way to pass
options. Setting PIP_VERSION to 20.2.4 and exporting it causes
error "ValueError: invalid truth value '20.2.4'" because it
does not have --version option and it treats it as --verbose

¯\_(ツ)_/¯

You can read more about it here:

https://github.com/pypa/pip/issues/4528

This PR renames the variable to avoid this side effect.

(cherry picked from commit 8fed541192)
2021-01-21 18:29:21 +00:00
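The side effect behind dc843192e6 above follows from pip's documented convention that a `PIP_<UPPER_LONG_NAME>` environment variable sets the corresponding `--<long-name>` option. The Python sketch below is a simplified illustration of that naming rule only; the function is hypothetical and is not pip's actual implementation.

```python
from typing import Dict, Mapping


def pip_options_from_env(environ: Mapping[str, str]) -> Dict[str, str]:
    """Map PIP_<NAME> variables to the --<name> options they correspond to."""
    options = {}
    for name, value in environ.items():
        # Only variables that start with the PIP_ prefix are picked up.
        if name.startswith("PIP_"):
            option = "--" + name[len("PIP_"):].lower().replace("_", "-")
            options[option] = value
    return options
```

Under this rule `PIP_VERSION=20.2.4` is read as a value for a `--version` option, while `AIRFLOW_PIP_VERSION` does not start with `PIP_` and is ignored.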
dstandish 025793c50b Add pre-commit hook limiting hook name length (#13319)
* When hook names are too long, pre-commit display becomes very ugly with many blank lines

Co-authored-by: Daniel Standish <dstandish@techstyle.com>
(cherry picked from commit 91acdbea05)
2021-01-21 18:29:13 +00:00