Commit graph

3214 commits

Author SHA1 Message Date
dstandish 62a5b2dfa4
Fix return type in prev-date context variables (#12910) 2020-12-09 15:10:03 +00:00
Anderson Reyes 1d91ca70f2
Infer multiple outputs from dict annotations in TaskFlow API (#10349) 2020-12-09 14:45:15 +00:00
Ash Berlin-Taylor 0bf386fdf2
Rename airflow.operators.dagrun_operator to airflow.operators.trigger_dagrun (#12933)
Part of AIP-21

Co-authored-by: Kishore Vancheeshwaran <24776049+kishvanchee@users.noreply.github.com>
2020-12-09 14:00:51 +00:00
Kaxil Naik a075b6df99
Rename remaining Sensors to match AIP-21 (#12927)
As discussed in AIP-21

* Rename airflow.sensors.external_task_sensor to airflow.sensors.external_task
* Rename airflow.sensors.sql_sensor to airflow.sensors.sql
* Rename airflow.contrib.sensors.weekday_sensor to airflow.sensors.weekday
2020-12-09 00:09:08 +00:00
Kamil Breguła abce78c53b
Disable experimental REST API by default (#12337) 2020-12-08 19:27:27 +00:00
Kishore Vancheeshwaran d5589673a9
Move dummy_operator.py to dummy.py (#11178) (#11293)
Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
2020-12-08 19:19:46 +00:00
Ash Berlin-Taylor b40dffa085
Rename remaining modules to match AIP-21 (#12917)
As discussed in AIP-21

* Rename airflow.hooks.base_hook to airflow.hooks.base
* Rename airflow.hooks.dbapi_hook to airflow.hooks.dbapi
* Rename airflow.sensors.base_sensor_operator to airflow.sensors.base
* Rename airflow.sensors.date_time_sensor to airflow.sensors.date_time
* Rename airflow.sensors.time_delta_sensor to airflow.sensors.time_delta

Co-authored-by: Kaxil Naik <kaxilnaik@apache.org>
2020-12-08 18:01:58 +00:00
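
The module renames listed in #12917 above can be summarized as an old-to-new lookup. This is a minimal sketch for illustration only: the `AIP21_RENAMES` dict and the `new_module_path` helper are hypothetical names, not part of Airflow, and the table covers only the five renames named in that commit message.

```python
# Old -> new module paths, copied from the commit message of #12917.
AIP21_RENAMES = {
    "airflow.hooks.base_hook": "airflow.hooks.base",
    "airflow.hooks.dbapi_hook": "airflow.hooks.dbapi",
    "airflow.sensors.base_sensor_operator": "airflow.sensors.base",
    "airflow.sensors.date_time_sensor": "airflow.sensors.date_time",
    "airflow.sensors.time_delta_sensor": "airflow.sensors.time_delta",
}


def new_module_path(old_path: str) -> str:
    """Return the renamed module path, or the input unchanged if it was not renamed."""
    return AIP21_RENAMES.get(old_path, old_path)


print(new_module_path("airflow.hooks.base_hook"))
```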
Kishore Vancheeshwaran bfbd4bbb70
Moved subdag_operator.py to subdag.py (#11307)
Part of #11178
2020-12-08 16:26:39 +00:00
Jarek Potiuk 9b39f24780
Add support for dynamic connection form fields per provider (#12558)
Connection form behaviour depends on the connection type. Since we've
separated providers into separate packages, the connection form should
be extendable by each provider. This PR implements both:

  * extra fields added by provider
  * configurable behaviour per provider

This PR will be followed by separate documentation on how to write your
provider.

Also this change triggers (in tests only) the snowflake annoyance
described in #12881 so we had to xfail presto test where monkeypatching
of snowflake causes the test to fail.

Part of #11429
2020-12-08 16:00:37 +01:00
Kaxil Naik ff25bd6ffe
Make xcom_pull results order deterministic (#12905)
closes https://github.com/apache/airflow/issues/11858
2020-12-08 13:00:06 +00:00
Kaxil Naik ef523b4c2b
Move branch_operator.py to branch.py (#12900)
Part of #11178 for the branch_operator
2020-12-08 08:41:16 +00:00
Kaxil Naik 5e0a2d772b
Fix failing test in TestSchedulerJob (#12906)
This change was missed in https://github.com/apache/airflow/pull/12899
2020-12-08 02:49:50 +00:00
Rik Heijdens 29d78489e7
Fix plugin macros not being exposed through airflow.macros (#12788)
In order to allow a plugin-provided macro to be used at templating time,
it needs to be exposed through the airflow.macros module.

* Add cleanup logic to test_registering_plugin_macros

This test-case has side-effects in the sense that the symbol table of
the airflow.macros module is altered when integrate_macros_plugins() is
invoked. This commit adds a finalizer to the test case that ensures that
that module is being reloaded completely in order to prevent impact on
other tests.


* Integrate plugin-provided macros in subprocesses

When Airflow is available in a virtual environment, and when this
environment runs at least Python 3, then plugin-provided macros should
be made available to the Python callable that is being executed in this
environment.

* Document macros limitation

Plugin-provided macros can not be used on Python 2 when using
PythonVirtualenvOperator any longer.
2020-12-07 22:34:14 +00:00
Daniel Imberman 190066cf20
Kubernetes worker pod doesn't use docker container entrypoint (#12766)
* Kubernetes worker pod doesn't use docker container entrypoint

Fixes issue on openshift caused by KubernetesExecutor pods not running
via the entrypoint script

* fix

* Update UPGRADING_TO_2.0.md

Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>

* fix UPGRADING

* @ashb comments

Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
2020-12-07 13:54:45 -08:00
Kaxil Naik 312a2813c5
Bugfix: Entrypoint Import Exception masked by attribute error (#12862)
`entry_point.module_name` -- Entrypoint does not have a `module_name`
attribute.

This commit also makes importlib_metadata conditional as it is not
needed for Py 3.9
2020-12-07 17:21:48 +00:00
Alok Shenoy 01707d71d9
Improve support for special characters in DbApiHook.get_uri (#12775) 2020-12-07 17:51:09 +01:00
Ash Berlin-Taylor 5d328a2f7e
Show DAG serialization errors in the UI. (#12866)
The previous behaviour led to "bad" data being written in the DB -- for
example:

```json
    "dag": {
        "tasks": [
            "serialization_failed"
        ],
```

(`tasks` should be a list of dictionaries. It clearly isn't.)

Instead of doing this, we throw an error that is captured and shown
using the existing import_error mechanism for DAGs. This almost
certainly happens because a user has done "something interesting".
2020-12-07 12:28:12 +00:00
Kamil Breguła a878959b2c
Remove old option - git_password from sensitive_config_values (#12821) 2020-12-07 09:52:14 +01:00
Jarek Potiuk ed1825c026
Production images on CI are now built from packages (#12685)
So far, the production images of Airflow were using sources
when they were built on CI. This PR changes that, to build
airflow + providers packages first and install them
rather than use sources as installation mechanism.

Part of #12261
2020-12-06 23:36:33 +01:00
Ash Berlin-Taylor c045ff335e
Store per-task TIDeps in serialized blob (#12858)
Without this change sensors in "reschedule" mode were being instantly
rescheduled because they didn't have the extra dep that
BaseSensorOperator added.

To fix that we need to include deps in the serialization format (but to
save space only when they are not the default list). As of this PR right
now, only built-in deps are allowed -- a custom dep will result in a DAG
parse error.

We can fix that for 2.0.x, as I think it is a _very_ uncommon thing to
do.

Fixes #12783
2020-12-06 21:55:53 +00:00
Jarek Potiuk 490a01bcab
Quarantine test TestSchedulerJob.test_scheduler_task_start_date (#12860) 2020-12-06 21:21:32 +01:00
Ash Berlin-Taylor 4a02e0a287
Don't emit first_task_scheduling_delay metric for only-once dags (#12835)
Dags with a schedule interval of None, or `@once` don't have a following
schedule, so we can't realistically calculate this metric.

Additionally, this changes the emitted metric from seconds to
milliseconds -- all timers to statsd should be in milliseconds -- this
is what Statsd and apps that consume data from there expect. See #10629
for more details.

This will be a "breaking" change from 1.10.14, where the metric was
back-ported to, but was (incorrectly) emitting seconds.
2020-12-05 21:56:51 +00:00
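
The two rules described in #12835 above (skip the metric when there is no following schedule, and report timers in milliseconds) might be sketched as follows. Both function names are hypothetical and are not Airflow's own; the sketch only assumes that the schedule interval is either `None`, the string `@once`, or some other value.

```python
from datetime import timedelta


def has_following_schedule(schedule_interval) -> bool:
    """Hypothetical check: DAGs scheduled as None or '@once' never run again."""
    return schedule_interval is not None and schedule_interval != "@once"


def to_statsd_millis(delay: timedelta) -> float:
    """Convert a delay to milliseconds, the unit the commit says StatsD timers use."""
    return delay.total_seconds() * 1000.0


if has_following_schedule("@daily"):
    print(to_statsd_millis(timedelta(seconds=1.5)))
```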
Jarek Potiuk 1dcd3e13fd
Add support for extra links coming from the providers (#12472)
Closes: #11431
2020-12-05 16:24:38 +01:00
Ash Berlin-Taylor 3ff5a35494
Add paused column to `dags list` sub-command (#12830)
This can still show "None" if the dag is not yet in the metadata DB --
showing either True or False there would give a false impression
(especially False -- as if it doesn't exist in the DB it can't be
unpaused yet!)
2020-12-05 14:34:36 +00:00
yuqian90 12ce5be77f
Fix for empty Graph View when task does not have a DAG during relationship setting (#12829)
Closes #12757
2020-12-05 11:52:55 +00:00
Shekhar Singh cd66450b4e
Add Telegram hook and operator (#11850)
closes: #11845

Adds:

Telegram Hook
Telegram Operator
2020-12-05 11:21:11 +00:00
Ash Berlin-Taylor 252b04718e
Configuration.getsection copes with sections that only exist in user config (#12816)
If you try to run `airflow config list` with an old config you upgraded
from 1.8, it would fail for any sections that have been removed from the
default config -- `ldap` for instance.

This would also be a problem if the user makes a typo in a config
section, or is using the airflow config for storing their own
information.

While I was changing this code, I also removed the use of private
methods/variable access in favour of public API
2020-12-05 07:16:47 +01:00
Daniel Imberman e82cf0d01d
Dagrun object doesn't exist in the TriggerDagRunOperator (#12819)
* Dagrun object doesn't exist in the TriggerDagRunOperator

fixes  https://github.com/apache/airflow/issues/12587

Fixes issue where dag_run object is not populated if the dag_run already
exists and is reset

* change to get_last_dag_run

* Update airflow/operators/dagrun_operator.py

Co-authored-by: Tomek Urbaszek <turbaszek@gmail.com>

Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
Co-authored-by: Tomek Urbaszek <turbaszek@gmail.com>
2020-12-04 20:10:45 -08:00
Siddartha Ravichandran 88aa174047
Add SMTP timeout and retry limit for SMTP email backend. (#12801) 2020-12-04 16:20:39 +01:00
Tomek Urbaszek 1bd98cd54c
Improve error handling in cli and introduce consistency (#12764)
This PR is a follow-up to #12375 and #12704. It improves handling
of some errors in CLI commands to avoid showing users too much traceback,
and uses SystemExit consistently.
2020-12-04 10:41:41 +01:00
宋财礼 d9d6dafb18
Fix the exception that the port is empty when using db shell (#12740)
* Fix the exception that the port is empty when using db shell
2020-12-03 19:19:26 +01:00
Ephraim Anierobi b62abfbfae
Handle ParserError when dag is triggered with invalid execution_date (#12618) 2020-12-03 17:04:14 +01:00
Darwin Yip 2947e09999
SlackWebhookHook use password instead of extra (#12674)
closes: #12214
2020-12-02 14:09:28 +00:00
Kaxil Naik 101da213c5
Optimize subclasses of DummyOperator for Scheduling (#12745)
Custom operators inheriting from DummyOperator will now be set
straight to success, instead of going to a scheduled state,
if they don't have callbacks set.

 closes https://github.com/apache/airflow/issues/11393
2020-12-02 12:49:17 +00:00
Tomek Urbaszek cba8d62553
Refactor list rendering in commands (#12704)
This commit unifies the mechanism of rendering output of tabular
data. This gives users a possibility to either display a tabular
representation of data or render it as valid json or yaml payload.

Closes: #12699

Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
2020-12-02 10:20:16 +01:00
Xiaodong DENG ae0e8f4732
Move config item 'worker_precheck' from section [core] to [celery] (#12746)
* Move config item 'worker_precheck' from section [core] to [celery]

This configuration is ONLY applicable for Celery Worker.
So it should be in section [celery], rather than [core]

* Add to deprecation/migration automatic list
2020-12-02 07:47:02 +01:00
Jarek Potiuk a02e0f746f
User-friendly output of Breeze and CI scripts (#12735) 2020-12-01 17:44:05 +01:00
Kaxil Naik ac3a8bfb0c
Allow switching xcom_pickling to JSON/Pickle (#12724)
Without this commit, the Webserver throws an error when
enabling xcom_pickling in the airflow_config by setting `enable_xcom_pickling = True`
(the default is `False`).

Example error:

```
>           return pickle.loads(result.value)
E           _pickle.UnpicklingError: invalid load key, '{'.

airflow/models/xcom.py:250: UnpicklingError
--------------------------------------------------
```
2020-12-01 11:35:34 +00:00
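
The error quoted in #12724 above is what `pickle.loads` raises when it is handed JSON bytes. One way to tolerate values written in the other format is to fall back to JSON when unpickling fails. This is a sketch under that assumption, not necessarily how the commit implements the fix; `deserialize_value` here is a standalone illustrative function, and the pickle fallback is deliberately omitted from the JSON branch.

```python
import json
import pickle


def deserialize_value(raw: bytes, use_pickle: bool):
    """Deserialize a stored value, tolerating JSON data when pickling is enabled."""
    if use_pickle:
        try:
            return pickle.loads(raw)
        except pickle.UnpicklingError:
            # The value was probably stored as JSON before pickling was enabled.
            return json.loads(raw.decode("utf-8"))
    # Pickling disabled: only accept JSON.
    return json.loads(raw.decode("utf-8"))


stored_as_json = json.dumps({"a": 1}).encode("utf-8")
print(deserialize_value(stored_as_json, use_pickle=True))
```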
HasanJ 4ac66cf8c4
Deprecate BaseHook.get_connections method (#10135) (#10192)
Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
2020-11-30 16:29:30 +00:00
Jarek Potiuk 2037303eef
Adds support for Connection/Hook discovery from providers (#12466)
* Adds support for Hook discovery from providers

This PR extends providers discovery with the mechanism
of retrieving mapping of connections from type to hook.

Fixes #12456

* fixup! Adds support for Hook discovery from providers

* fixup! fixup! Adds support for Hook discovery from providers
2020-11-29 15:31:49 +01:00
Tomek Urbaszek c9d1ea5cf8
Refactor airflow plugins command (#12697)
This commit refactors plugins command to make it more
user-friendly, structured and easier to read.
2020-11-29 12:16:59 +01:00
Ash Berlin-Taylor 02d94349be
Don't use time.time() or timezone.utcnow() for duration calculations (#12353)
`time.time() - start`, or `timezone.utcnow() - start_dttm` will work
fine in 99% of cases, but it has one fatal flaw:

They both operate on system time, and that can go backwards.

While this might be surprising, it can happen -- usually due to clocks
being adjusted.

And while it might seem rare, for long-running processes it is more
common than we might expect. Most of these durations are harmless to get
wrong (they are only logged), but it is better to be safe than sorry.

Also the `utcnow()` style I have replaced will be much lighter weight -
creating a date time object is a comparatively expensive operation, and
computing a diff between two even more so, _especially_ when compared to
just subtracting two floats.

To make the "common" case of computing a duration for a block easier,
I have made `Stats.timer()` return an object that has a
`duration` field.
2020-11-29 10:12:30 +00:00
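
The idea behind #12353 above can be illustrated with the standard library's monotonic clock, which never goes backwards when the system time is adjusted. The `Timer` class below is a hypothetical stand-in written for this example; it is not Airflow's `Stats.timer()`, although it exposes a `duration` attribute in the same spirit.

```python
import time


class Timer:
    """Context manager that measures elapsed seconds with a monotonic clock."""

    def __init__(self):
        self.duration = None
        self._start = None

    def __enter__(self):
        # time.monotonic() is unaffected by system clock adjustments.
        self._start = time.monotonic()
        return self

    def __exit__(self, exc_type, exc, tb):
        # Subtracting two floats; no datetime objects are created.
        self.duration = time.monotonic() - self._start
        return False  # do not swallow exceptions


with Timer() as timer:
    sum(range(1000))
print(f"took {timer.duration:.6f}s")
```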
Ash Berlin-Taylor 8291fabaf9
Ensure that tasks set to up_for_retry have an end date (#12675)
If a task is "manually" set to up_for_retry (via the UI for instance) it
might not have an end date, and much of the logic about computing
retries assumes that it does.

Without this, manually setting a running task to up_for_retry will
make it impossible to view the TaskInstance details page (as it
tries to print the is_premature property), and also the NotInRetryPeriod
TIDep fails - both with an exception:

> File "airflow/models/taskinstance.py", line 882, in next_retry_datetime
>   return self.end_date + delay
> TypeError: unsupported operand type(s) for +: 'NoneType' and 'datetime.timedelta'
2020-11-29 10:11:50 +00:00
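
The traceback in #12675 above comes from adding a `timedelta` to `None`. The commit addresses this by making sure the end date is set; the sketch below only reproduces the failure and shows a read-time guard as an alternative illustration. The standalone `next_retry_datetime` function and its fallback to the current time are assumptions made for this example, not Airflow's implementation.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def next_retry_datetime(end_date: Optional[datetime], delay: timedelta) -> datetime:
    """Hypothetical guard: fall back to the current time if end_date was never set."""
    if end_date is None:
        end_date = datetime.now(timezone.utc)
    return end_date + delay


# Reproduce the original failure mode.
try:
    None + timedelta(minutes=5)
except TypeError as err:
    print(f"TypeError: {err}")

print(next_retry_datetime(None, timedelta(minutes=5)))
```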
Ash Berlin-Taylor 7ef9aa7d54
Replace pkg_resources with importlib.metadata to avoid VersionConflict errors (#12694)
Using `pkg_resources.iter_entry_points` validates the version
constraints, and if any fail it will throw an Exception for that
entrypoint.

This sounds nice, but is a huge mis-feature.

So instead of that, switch to using importlib.metadata (well, its
backport importlib_metadata) that just gives us the entrypoints - no
other verification of requirements is performed.

This has two advantages:

1. providers and plugins load much more reliably.
2. it's faster too

Closes #12692
2020-11-29 07:19:47 +01:00
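
Listing entry points through `importlib.metadata`, as #12694 above describes, might look like the sketch below. It uses the standard-library module rather than the `importlib_metadata` backport, and handles both the older dict-style return value and the newer `select()` API. The `iter_group` helper and the `console_scripts` group used in the example are illustrative choices, not taken from the commit.

```python
from importlib.metadata import entry_points


def iter_group(group: str) -> list:
    """Return the entry points registered under a group, without loading them."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Newer Python versions expose a select() method.
        return list(eps.select(group=group))
    # Older versions return a plain dict keyed by group name.
    return list(eps.get(group, []))


for ep in iter_group("console_scripts"):
    # Nothing is imported until ep.load() is called explicitly.
    print(ep.name, "->", ep.value)
```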
Tomek Urbaszek 850b74befe
Use rich to render info and cheat-sheet command (#12689) 2020-11-29 00:29:58 +01:00
Ephraim Anierobi 543d88b3a1
Add example dag and system tests for azure wasb and fileshare (#12673) 2020-11-28 06:27:00 +01:00
Jarek Potiuk 6b3c6add9e
Update setup.py to get non-conflicting set of dependencies (#12636)
This change upgrades setup.py and setup.cfg to provide non-conflicting
`pip check` valid set of constraints for CI image.

Fixes #10854

Co-authored-by: Tomek Urbaszek <turbaszek@apache.org>
2020-11-27 20:06:44 +01:00
Jarek Potiuk 41a699a7bd
Implement reading provider information from packages/sources (#12512)
This PR implements discovering and reading provider information from
packages (using entry_points) and - if found - from local
provider yaml files for the built-in airflow providers,
when they are found in the airflow.provider packages.
The provider.yaml files - if found - take precedence over the
package-provided ones.

Add displaying provider information in CLI

Closes: #12470
2020-11-27 18:42:32 +01:00
Kaxil Naik 5fafd982ca
Replace foreign key constraints with foreign annotation (#12603)
closes https://github.com/apache/airflow/issues/12448
2020-11-27 12:59:23 +00:00
Tobiasz Kędzierski e1ebfa68b1
Add DataflowJobMessagesSensor and DataflowAutoscalingEventsSensor (#12249) 2020-11-27 13:02:13 +01:00