Commit Graph

11094 Commits

Author SHA1 Message Date
Tomek Urbaszek 67acdbdf92
Remove store_serialized_dags from config (#12754)
The store_serialized_dags is no longer used in our code base as
both webserver and scheduler require serialization.
2020-12-02 18:18:48 +01:00
Kaxil Naik 0400ee32d4
Allow using _CMD / _SECRET to set `[webserver] secret_key` config (#12742)
`[webserver] secret_key` is also a secret like Fernet key. Allowing
it to be set via _CMD or _SECRET allows users to use the external secret store for it.
2020-12-02 14:39:26 +00:00
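As an illustration of the `_CMD` / `_SECRET` idea in the commit above, here is a minimal sketch of how such a lookup could work. It is not Airflow's implementation: `resolve_option` and `secret_lookup` are hypothetical names, and the precedence order (plain value, then `_CMD`, then `_SECRET`) is an assumption made for the example.

```python
import os
import subprocess
from typing import Callable, Mapping, Optional


def resolve_option(
    section: str,
    key: str,
    environ: Mapping[str, str] = os.environ,
    secret_lookup: Optional[Callable[[str], Optional[str]]] = None,
) -> Optional[str]:
    """Resolve a config value from a plain, _CMD, or _SECRET environment variable.

    Hypothetical helper; the precedence order below is an assumption.
    """
    base = f"AIRFLOW__{section.upper()}__{key.upper()}"

    # 1. A plain value is used directly if it is set.
    if base in environ:
        return environ[base]

    # 2. <NAME>_CMD holds a shell command whose stdout becomes the value.
    if base + "_CMD" in environ:
        result = subprocess.run(
            environ[base + "_CMD"],
            shell=True,
            check=True,
            capture_output=True,
            text=True,
        )
        return result.stdout.strip()

    # 3. <NAME>_SECRET holds a key that is handed to an external secret store.
    if base + "_SECRET" in environ and secret_lookup is not None:
        return secret_lookup(environ[base + "_SECRET"])

    return None


env = {"AIRFLOW__WEBSERVER__SECRET_KEY_CMD": "echo s3cr3t"}
print(resolve_option("webserver", "secret_key", environ=env))
```

Passing the environment as a parameter keeps the sketch testable without touching the real process environment.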
Darwin Yip 2947e09999
SlackWebhookHook use password instead of extra (#12674)
closes: #12214
2020-12-02 14:09:28 +00:00
Danny 03fa6edc7a
Order broken DAG messages in UI (#12749) 2020-12-02 13:37:59 +00:00
Kaxil Naik 101da213c5
Optimize subclasses of DummyOperator for Scheduling (#12745)
Custom operators inheriting from DummyOperator will now be set
straight to success instead of going to a scheduled state,
if they don't have callbacks set.

 closes https://github.com/apache/airflow/issues/11393
2020-12-02 12:49:17 +00:00
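The rule described in the commit above can be sketched roughly as follows. The classes here are simplified stand-ins, not Airflow's real operator classes, and `next_state` is a hypothetical helper; the exact set of callbacks the scheduler checks is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class DummyOperator:
    """Stand-in for an operator that does no work (not the real Airflow class)."""

    task_id: str
    on_execute_callback: Optional[Callable] = None
    on_success_callback: Optional[Callable] = None


class MyMarker(DummyOperator):
    """A custom subclass; it should get the same shortcut as its parent."""


def next_state(task: object) -> str:
    """Return the state a task would move to next (hypothetical helper)."""
    has_callbacks = (
        getattr(task, "on_execute_callback", None) is not None
        or getattr(task, "on_success_callback", None) is not None
    )
    # isinstance() matches subclasses too, which is the point of the change.
    if isinstance(task, DummyOperator) and not has_callbacks:
        return "success"
    return "scheduled"


print(next_state(MyMarker("marker")))
print(next_state(MyMarker("marker", on_success_callback=print)))
```

A task with a callback still has to go through the normal path, since skipping execution would also skip the callback.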
Ash Berlin-Taylor dab783fcdc
Don't let webserver run with dangerous config (#12747) 2020-12-02 10:55:22 +00:00
Tomek Urbaszek cba8d62553
Refactor list rendering in commands (#12704)
This commit unifies the mechanism of rendering output of tabular
data. This gives users a possibility to either display a tabular
representation of data or render it as valid json or yaml payload.

Closes: #12699

Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
2020-12-02 10:20:16 +01:00
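A unified renderer of the kind the commit above describes might look like this minimal sketch. `render` is a hypothetical function, not the code from the commit; YAML output is left out because it would need a third-party library.

```python
import json
from typing import Dict, List


def render(rows: List[Dict[str, str]], output: str = "table") -> str:
    """Render the same rows either as an aligned text table or as JSON."""
    if output == "json":
        return json.dumps(rows, indent=2)
    if output != "table":
        raise ValueError(f"unsupported output format: {output}")
    if not rows:
        return ""

    headers = list(rows[0])
    # Column width is the longest cell (or header) in that column.
    widths = [max([len(h)] + [len(str(r[h])) for r in rows]) for h in headers]

    def line(values):
        cells = (str(v).ljust(w) for v, w in zip(values, widths))
        return " | ".join(cells).rstrip()

    out = [line(headers), "-+-".join("-" * w for w in widths)]
    out.extend(line([r[h] for h in headers]) for r in rows)
    return "\n".join(out)


rows = [
    {"dag_id": "example_dag", "paused": "False"},
    {"dag_id": "etl", "paused": "True"},
]
print(render(rows))
print(render(rows, output="json"))
```

Because every command hands over plain rows, adding another output format only touches this one function.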
Xiaodong DENG ae0e8f4732
Move config item 'worker_precheck' from section [core] to [celery] (#12746)
* Move config item 'worker_precheck' from section [core] to [celery]

This configuration is ONLY applicable for Celery Worker.
So it should be in section [celery], rather than [core]

* Add to deprecation/migration automatic list
2020-12-02 07:47:02 +01:00
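Moving an option between sections without breaking existing config files is usually done with a fallback table. The sketch below assumes a plain `configparser` setup; `DEPRECATED_OPTIONS` and `get_option` are hypothetical names and do not reproduce Airflow's own configuration class.

```python
import configparser
import warnings

# Maps the new (section, key) to the old location it replaces.
DEPRECATED_OPTIONS = {
    ("celery", "worker_precheck"): ("core", "worker_precheck"),
}


def get_option(parser, section, key, fallback=None):
    """Read an option, falling back to its deprecated location with a warning."""
    if parser.has_option(section, key):
        return parser.get(section, key)

    old = DEPRECATED_OPTIONS.get((section, key))
    if old is not None and parser.has_option(*old):
        warnings.warn(
            f"[{old[0]}] {old[1]} has moved to [{section}] {key}; "
            "please update your config file.",
            DeprecationWarning,
            stacklevel=2,
        )
        return parser.get(*old)

    return fallback


parser = configparser.ConfigParser()
parser.read_string("[core]\nworker_precheck = True\n")
print(get_option(parser, "celery", "worker_precheck"))
```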
Shivansh Saini da2a7d6b33
Added Headout to INTHEWILD (#12734)
Signed-off-by: Shivansh Saini <shivanshs9@gmail.com>
2020-12-01 19:21:48 +00:00
Florent Chehab a697c588c4
Fix chart jobs delete policy for improved idempotency (#12646)
The chart has two jobs (migrate-database & create-user).
These jobs are run post-install and post-upgrade and only deleted on success.

So if for some reason (quick reinstall / upgrade), the job fails or is stuck then helm
will fail because the job already exists.

This commit sets the `helm.sh/hook-delete-policy` to `before-hook-creation,hook-succeeded`
so helm will always delete the jobs before creating them again.
2020-12-01 19:24:42 +01:00
Jarek Potiuk a02e0f746f
User-friendly output of Breeze and CI scripts (#12735) 2020-12-01 17:44:05 +01:00
Jarek Potiuk 0451d84ea2
Pins PIP to 20.2.4 in our Dockerfiles (#12738)
Until we make sure that the new resolver in PIP 20.3 works
we should pin PIP to 20.2.4.

This is hopefully a temporary measure.

Part of #12737
2020-12-01 17:39:55 +01:00
Kaxil Naik ac3a8bfb0c
Allow switching xcom_pickling to JSON/Pickle (#12724)
Without this commit, the Webserver throws an error when
enabling xcom_pickling in the airflow_config by setting `enable_xcom_pickling = True`
(the default is `False`).

Example error:

```
>           return pickle.loads(result.value)
E           _pickle.UnpicklingError: invalid load key, '{'.

airflow/models/xcom.py:250: UnpicklingError
--------------------------------------------------
```
2020-12-01 11:35:34 +00:00
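The error above is what `pickle.loads` raises when it is handed a JSON-encoded dict. One way to tolerate values written under the other setting is to fall back to the other decoder, as in this sketch. `deserialize` is a hypothetical function; whether the commit uses exactly this fallback is an assumption.

```python
import json
import pickle


def deserialize(value: bytes, enable_pickling: bool):
    """Decode a stored value, tolerating data written under the other setting.

    Note: unpickling is only safe for data from a trusted source.
    """
    if enable_pickling:
        try:
            return pickle.loads(value)
        except pickle.UnpicklingError:
            # The value was stored as JSON before pickling was enabled.
            return json.loads(value.decode("utf-8"))
    try:
        return json.loads(value.decode("utf-8"))
    except ValueError:
        # Covers both invalid UTF-8 and invalid JSON: the value was pickled.
        return pickle.loads(value)


stored_as_json = json.dumps({"a": 1}).encode("utf-8")
stored_as_pickle = pickle.dumps({"a": 1})

print(deserialize(stored_as_json, enable_pickling=True))
print(deserialize(stored_as_pickle, enable_pickling=False))
```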
Kamil Breguła 027fd743d6
Fix static checks - #12715 (#12729) 2020-12-01 11:10:54 +00:00
Ry Walker 91c22032b9
Stronger language re: SQLite (#12727) 2020-12-01 10:01:07 +01:00
Jarek Potiuk ebc8fcf199
Improve verification of images with PIP check (#12718)
Verification of images with PIP is done in separate jobs and
they provide useful information to committers and contributors
when the pip check fails.
2020-12-01 09:51:24 +01:00
Jarek Potiuk e4cb0ef192
Output of installing remaining packages is shown also on success (#12723)
Previously the output of installing remaining packages when testing
provider imports was only shown on error. However it is useful
to know what's going on even if it clutters the log.

Note that this installation is only needed until we include
apache-beam in the installed packages on CI.

Related to #12703

This PR always shows the output.
2020-12-01 09:51:05 +01:00
Kamil Breguła 42f0a3d628
Move apache-airflow docs to subdirectory (#12715) 2020-12-01 02:14:20 +01:00
Mutlu Polatcan dee304b222
Add Getir to in the wild! (#12719)
* Add Getir to in the wild!

* hotfix - Add Getir to in the wild
2020-11-30 18:29:53 +01:00
HasanJ 4ac66cf8c4
Deprecate BaseHook.get_connections method (#10135) (#10192)
Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
2020-11-30 16:29:30 +00:00
Ash Berlin-Taylor 5e13c37286
Remove deprecated dagbag metrics (#12695)
These were deprecated in 1.10.6 via #6157, so we should remove them
before 2.0 rolls around.
2020-11-30 09:14:15 +00:00
Kamil Breguła bd90136aaf
Move operator guides to provider documentation packages (#12681) 2020-11-30 08:48:24 +01:00
Xiaodong DENG bb00f164da
Refine the DB query logics in www.views.task_stats() (#12707)
* Refine the DB query logics in www.views.task_stats()

- given filter_dag_ids is either allowed_dag_ids, or intersection of allowed_dag_ids and selected_dag_ids,
  no matter if selected_dag_ids is None or not, filter_dag_ids should ALWAYS be considered into the SQL query.

  Currently, if selected_dag_ids is None, the query is actually getting the full result (then 'filter' at the end).
  This means more (unnecessary) data travel between Airflow and DB.

- When we join table A and B with A.id == B.id (default is INNER join), if we always confirm ALL A.id is in a specific list,
  implicitly ALL ids in the result table are already guaranteed in this specific list as well.
  This is why the two redundant .filter() chunks are removed.

Minor performance improvement should be expected.
Meanwhile, this change makes the code cleaner.
2020-11-29 22:22:38 +01:00
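The second point in the commit above (a filter on one side of an inner join already constrains the other side) can be checked with a small in-memory SQLite example. The table and column names are simplified stand-ins, not the schema the commit touches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dag_run (dag_id TEXT, state TEXT);
    CREATE TABLE task_instance (dag_id TEXT, state TEXT);
    INSERT INTO dag_run VALUES ('a', 'running'), ('b', 'running'), ('c', 'running');
    INSERT INTO task_instance VALUES ('a', 'success'), ('b', 'failed'), ('c', 'success');
    """
)

allowed = ["a", "b"]
marks = ",".join("?" for _ in allowed)

base_query = (
    "SELECT ti.dag_id, ti.state FROM task_instance ti "
    "JOIN dag_run dr ON ti.dag_id = dr.dag_id "
    f"WHERE dr.dag_id IN ({marks})"
)

# Filter applied on one side of the join only.
one_filter = conn.execute(base_query + " ORDER BY ti.dag_id", allowed).fetchall()

# Same filter repeated on the other side; the join condition makes it redundant.
two_filters = conn.execute(
    base_query + f" AND ti.dag_id IN ({marks}) ORDER BY ti.dag_id",
    allowed + allowed,
).fetchall()

print(one_filter == two_filters)
```

Applying the filter inside the query, rather than filtering the full result afterwards, is also what avoids the extra data transfer mentioned in the first point.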
raphaelauv e57de8c758
Remove now-incorrect warning about pools and multiple schedulers (#12709)
Reverts #7643, as it is no longer true now that Scheduler HA has landed
2020-11-29 20:51:21 +00:00
Jarek Potiuk 7e426d3f47
Improve wording of selective checks comments (#12701) 2020-11-29 18:27:09 +01:00
Jarek Potiuk 2037303eef
Adds support for Connection/Hook discovery from providers (#12466)
* Adds support for Hook discovery from providers

This PR extends providers discovery with the mechanism
of retrieving mapping of connections from type to hook.

Fixes #12456

* fixup! Adds support for Hook discovery from providers

* fixup! fixup! Adds support for Hook discovery from providers
2020-11-29 15:31:49 +01:00
Tomek Urbaszek c9d1ea5cf8
Refactor airflow plugins command (#12697)
This commit refactors plugins command to make it more
user-friendly, structured and easier to read.
2020-11-29 12:16:59 +01:00
Ash Berlin-Taylor 02d94349be
Don't use time.time() or timezone.utcnow() for duration calculations (#12353)
`time.time() - start`, or `timezone.utcnow() - start_dttm` will work
fine in 99% of cases, but it has one fatal flaw:

They both operate on system time, and that can go backwards.

While this might be surprising, it can happen -- usually due to clocks
being adjusted.

And while it might seem rare, for long running processes it is more
common than we might expect. Most of these durations are harmless to get
wrong (they are only logged), but it is better to be safe than sorry.

Also the `utcnow()` style I have replaced will be much lighter weight -
creating a date time object is a comparatively expensive operation, and
computing a diff between two even more so, _especially_ when compared to
just subtracting two floats.

To make the "common" case easier of wanting to compute a duration for a
block, I have made `Stats.timer()` return an object that has a
`duration` field.
2020-11-29 10:12:30 +00:00
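A timer built on a monotonic clock, exposing a `duration` field as the commit above describes, could be sketched like this. The `Timer` class is a hypothetical illustration, not the `Stats.timer()` implementation, and the choice of `time.perf_counter()` is an assumption.

```python
import time


class Timer:
    """Context manager that measures elapsed time with a monotonic clock."""

    def __init__(self):
        self.duration = None

    def __enter__(self):
        # perf_counter() never goes backwards, unlike time.time().
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb):
        # Subtracting two floats is cheap compared with datetime arithmetic.
        self.duration = time.perf_counter() - self._start
        return False  # do not swallow exceptions


with Timer() as timer:
    sum(range(100_000))

print(f"took {timer.duration:.6f}s")
```

The value of a monotonic clock has no meaning on its own; only the difference between two readings does, which is exactly what a duration needs.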
Ash Berlin-Taylor 8291fabaf9
Ensure that tasks set to up_for_retry have an end date (#12675)
If a task is "manually" set to up_for_retry (via the UI for instance) it
might not have an end date, and much of the logic about computing
retries assumes that it does.

Without this, manually setting a running task to up_for_retry will make
it impossible to view the TaskInstance details page (as it
tries to print the is_premature property), and also the NotInRetryPeriod
TIDep fails - both with an exception:

> File "airflow/models/taskinstance.py", line 882, in next_retry_datetime
>   return self.end_date + delay
> TypeError: unsupported operand type(s) for +: 'NoneType' and 'datetime.timedelta'
2020-11-29 10:11:50 +00:00
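The failure in the traceback above, and the kind of guard that avoids it, can be reproduced in a few lines. Both functions are simplified stand-ins rather than the real `TaskInstance` methods, and filling in the current UTC time is an assumption about how the end date gets set.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def next_retry_datetime(end_date: Optional[datetime], delay: timedelta) -> datetime:
    """Simplified version of the failing computation from the traceback."""
    return end_date + delay  # raises TypeError when end_date is None


def ensure_end_date(end_date: Optional[datetime]) -> datetime:
    """Give a task moved to up_for_retry an end date if it has none."""
    return end_date if end_date is not None else datetime.now(timezone.utc)


delay = timedelta(minutes=5)

try:
    next_retry_datetime(None, delay)
except TypeError as err:
    print(f"without an end date: {err}")

print(next_retry_datetime(ensure_end_date(None), delay))
```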
Ash Berlin-Taylor 7ef9aa7d54
Replace pkg_resources with importlib.metadata to avoid VersionConflict errors (#12694)
Using `pkg_resources.iter_entry_points` validates the version
constraints, and if any fail it will throw an Exception for that
entrypoint.

This sounds nice, but is a huge mis-feature.

So instead of that, switch to using importlib.metadata (well, its
backport importlib_metadata) that just gives us the entrypoints - no
other verification of requirements is performed.

This has two advantages:

1. providers and plugins load much more reliably.
2. it's faster too

Closes #12692
2020-11-29 07:19:47 +01:00
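Listing entry points through `importlib.metadata` without any requirement checks can be sketched as below. `list_entry_points` is a hypothetical helper; it uses the standard-library module (Python 3.8+) rather than the `importlib_metadata` backport named in the commit, and handles both return shapes of `entry_points()` across Python versions.

```python
from importlib import metadata


def list_entry_points(group: str) -> list:
    """Return the entry points registered under a group, without version checks."""
    eps = metadata.entry_points()
    if hasattr(eps, "select"):
        # Newer Python versions return a selectable collection.
        return list(eps.select(group=group))
    # Older versions return a plain dict keyed by group name.
    return list(eps.get(group, []))


# "console_scripts" is a commonly used group; the result depends on what is installed.
for entry_point in list_entry_points("console_scripts")[:5]:
    print(entry_point.name, "->", entry_point.value)
```

Nothing is imported or validated until `entry_point.load()` is called, so one package with conflicting requirements does not prevent the others from being discovered.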
Tomek Urbaszek 850b74befe
Use rich to render info and cheat-sheet command (#12689) 2020-11-29 00:29:58 +01:00
Kamil Breguła 0a1b434d28
Move production deployments tips to docs/production-deployment.rst (#12686) 2020-11-28 20:05:06 +01:00
Jarek Potiuk 64f14759e4
Fix typos and added missing descriptions in provider.yaml schema (#12690) 2020-11-28 19:27:08 +01:00
Jarek Potiuk b858683abf
Adds providers information to `airflow info` command (#12687) 2020-11-28 13:42:19 +01:00
Jarek Potiuk 66b552513c
The Pyarrow limitation in install_requires is not needed. (#12683)
It was added to make snowflake happy, but it is not needed as
package requirement in fact and google provider complains when
the version of pyarrow is too low.

Also when PyArrow limitation is removed, we have to limit
the importlib_resources library back.
2020-11-28 12:40:17 +01:00
Jarek Potiuk e4ab453a37
Setup.cfg change triggers full build (#12684)
Since we moved part of the setup.py specification to
setup.cfg, we should trigger full build when only that file
changes.
2020-11-28 12:39:46 +01:00
Kamil Breguła 08bc62b64d
Validate JSON schema files with JSON Schema (#12682) 2020-11-28 12:12:54 +01:00
Jarek Potiuk 1c500ee62c
Temporarily disable PROD image check until Azure Blob is fixed (#12679)
This PR temporarily disables the PIP check for the production
image until the fix that switches Azure Blob to v12 lands.
2020-11-28 10:45:14 +01:00
Nathan Hadfield 76bcd08dca
Added `@apply_defaults` decorator. (#12620) 2020-11-28 09:57:29 +01:00
Kamil Breguła de3b1e687b
Move connection guides to provider documentation packages (#12653) 2020-11-28 08:09:53 +01:00
Ephraim Anierobi 543d88b3a1
Add example dag and system tests for azure wasb and fileshare (#12673) 2020-11-28 06:27:00 +01:00
Jarek Potiuk 3b138d2d60
Remove "@" references from constraints generation (#12671)
Likely fixes: #12665
2020-11-28 06:04:45 +01:00
Kamil Breguła 944bd4c658
Fix packages errors summary for docs build (#12658) 2020-11-28 00:17:14 +01:00
Kaxil Naik 704e724cc1
Make migrations using kube_resource_version idempotent (#12670)
closes https://github.com/apache/airflow/issues/12666
2020-11-27 22:49:31 +00:00
Jarek Potiuk fa8af2d165
Enable PIP check for both CI and PROD image (#12664)
This PR enables PIP check after constraints have been updated
to be stable and 'pip check' compliant in #12636
2020-11-27 21:33:50 +01:00
Jarek Potiuk 6b3c6add9e
Update setup.py to get non-conflicting set of dependencies (#12636)
This change upgrades setup.py and setup.cfg to provide non-conflicting
`pip check` valid set of constraints for CI image.

Fixes #10854

Co-authored-by: Tomek Urbaszek <turbaszek@apache.org>
2020-11-27 20:06:44 +01:00
Jarek Potiuk 41a699a7bd
Implement reading provider information from packages/sources (#12512)
This PR implements discovering and reading provider information from
packages (using entry_points) and - if found - from local
provider yaml files for the built-in airflow providers,
when they are found in the airflow.provider packages.
The provider.yaml files - if found - take precedence over the
package-provided ones.

Add displaying provider information in CLI

Closes: #12470
2020-11-27 18:42:32 +01:00
Kaxil Naik 531e00660a
Typo Fix: Deprecated config force_log_out_after was not used (#12661)
`force_logout_after` should be `force_log_out_after` in the code
section https://github.com/apache/airflow/blob/master/airflow/settings.py#L372-L381.

As `force_log_out_after` is actually used and written in
c5700a56bb/UPDATING.md (unify-user-session-lifetime-configuration).
2020-11-27 18:36:10 +01:00
Tomek Urbaszek 456a1c5dc9
Restructure the extras in setup.py and describe them (#12548)
Closes: #12544

Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
2020-11-27 15:34:47 +01:00
Kaxil Naik 9a74ee5fff
Add 1.10.13 to CI, Breeze and Docs (#12652) 2020-11-27 13:35:28 +00:00