Commit graph

7071 commits

Author SHA1 Message Date
Ash Berlin-Taylor c045ff335e
Store per-task TIDeps in serialized blob (#12858)
Without this change sensors in "reschedule" mode were being instantly
rescheduled because they didn't have the extra dep that
BaseSensorOperator added.

To fix that we need to include deps in the serialization format (but to
save space only when they are not the default list). As of this PR right
now, only built-in deps are allowed -- a custom dep will result in a DAG
parse error.

We can fix that for 2.0.x, as I think it is a _very_ uncommon thing to
do.

Fixes #12783
2020-12-06 21:55:53 +00:00
Ash Berlin-Taylor 75d8ff96b4
Mark required fields in Forms as required (#12856)
We have a number of custom forms that have required fields that weren't
explicitly marked as required.

This allowed you to submit the Connection form (for example) with
nothing as the Conn Id, leading to an empty string being used as the
connection id. This marks that and all the other required fields as
required.

We also replace DataRequired with InputRequired. The former tested
the truthiness of the value, rather than just that a value was
submitted.
2020-12-06 21:54:13 +00:00
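To illustrate the validator change described in #12856 above, here is a minimal plain-Python sketch. It does not use WTForms itself: `data_required` and `input_required` are hypothetical stand-ins that only mimic the two kinds of check, truthiness of the parsed value versus presence of submitted input.

```python
from typing import Any, Optional, Sequence


def data_required(parsed_value: Any) -> bool:
    """Truthiness-style check: passes only if the parsed value is truthy."""
    return bool(parsed_value)


def input_required(raw_input: Optional[Sequence[str]]) -> bool:
    """Presence-style check: passes if a non-empty raw value was submitted."""
    return bool(raw_input) and bool(raw_input[0])


# A numeric field submitted as "0": a value was sent, but it parses to a falsy 0.
raw = ["0"]
parsed = int(raw[0])

print(data_required(parsed))  # False: 0 is falsy, so a valid submission is rejected
print(input_required(raw))    # True: something was submitted

# An empty submission fails both checks.
print(data_required(""), input_required([""]))
```

The point of the sketch is only that a presence check accepts legitimate falsy values, while an empty submission is still rejected.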
Jarek Potiuk 537aba738c
Add conditional version retrieval from setup. (#12853)
When airflow is not installed as a package (for example for local
development from sources) there is no package metadata.

Many of our unit tests use the version field, and they fail if they
are run within a virtual environment where airflow is not installed
as a package (for example, this is the default setting in IntelliJ).

This PR adds a fall-back to read the airflow version from setup in
case it cannot be read from package metadata.
2020-12-06 19:58:17 +01:00
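The fall-back described in #12853 above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the PR: `get_version` and `version_from_sources` are hypothetical names, and the source-based version is a hard-coded placeholder rather than something read from a real setup file.

```python
from importlib import metadata
from typing import Callable


def get_version(package: str, fallback: Callable[[], str]) -> str:
    """Return the installed version of ``package``, or ``fallback()`` if it has no metadata."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # Not installed as a package (e.g. running straight from a source checkout).
        return fallback()


def version_from_sources() -> str:
    # Hypothetical placeholder: a real project would read this from its setup file.
    return "0.0.0.dev0"


print(get_version("surely-not-an-installed-package", version_from_sources))
```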
Ash Berlin-Taylor 4a02e0a287
Don't emit first_task_scheduling_delay metric for only-once dags (#12835)
Dags with a schedule interval of None, or `@once` don't have a following
schedule, so we can't realistically calculate this metric.

Additionally, this changes the emitted metric from seconds to
milliseconds -- all timers to statsd should be in milliseconds -- this
is what Statsd and apps that consume data from there expect. See #10629
for more details.

This will be a "breaking" change from 1.10.14, where the metric was
back-ported to, but was (incorrectly) emitting seconds.
2020-12-05 21:56:51 +00:00
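A small sketch of the two rules stated in #12835 above: skip the metric when there is no following schedule, and report timers in milliseconds. The function name and signature are hypothetical and are not taken from the Airflow code base.

```python
from datetime import datetime, timedelta
from typing import Optional, Union

ScheduleInterval = Union[None, str, timedelta]


def first_task_delay_ms(
    schedule_interval: ScheduleInterval,
    expected_start: datetime,
    actual_start: datetime,
) -> Optional[float]:
    """Return the scheduling delay in milliseconds, or None if it cannot be computed."""
    if schedule_interval is None or schedule_interval == "@once":
        # No following schedule, so there is nothing meaningful to measure.
        return None
    return (actual_start - expected_start).total_seconds() * 1000.0


expected = datetime(2020, 12, 5, 0, 0, 0)
actual = expected + timedelta(seconds=1, milliseconds=500)
print(first_task_delay_ms(timedelta(days=1), expected, actual))  # 1500.0
print(first_task_delay_ms("@once", expected, actual))            # None
```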
Ash Berlin-Taylor 37b2679112
Make `airflow --help` run five times quicker (#12836)
Importing anything from airflow.models pulls in _a lot_ of airflow, so
delay imports until the functions are called, or make use of the
`TYPE_CHECKING` to not actually import at runtime.

**Before**: mean 2.58s (with a lot of variance)
```
airflow ❯ for i in 1 2 3; do time airflow --help >/dev/null; done
airflow --help > /dev/null  2.00s user 1.39s system 176% cpu 1.928 total
airflow --help > /dev/null  2.84s user 1.43s system 151% cpu 2.817 total
airflow --help > /dev/null  3.00s user 1.37s system 145% cpu 3.009 total

```

**After**: 0.526s

```
airflow --help > /dev/null  0.39s user 0.04s system 99% cpu 0.435 total
airflow --help > /dev/null  0.40s user 0.05s system 99% cpu 0.446 total
airflow --help > /dev/null  0.64s user 0.05s system 99% cpu 0.698 total
```

This also has an advantage in development where a syntax error doesn't
fail with a slightly confusing error message about "unable to configure
logger 'task'".
2020-12-05 21:56:25 +00:00
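The two techniques named in #12836 above, deferring imports into the function body and guarding type-only imports with `TYPE_CHECKING`, look roughly like this. The standard-library `decimal` module is used here purely as a stand-in for an expensive import such as `airflow.models`; `parse_amount` is a hypothetical function.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only evaluated by static type checkers; never imported at runtime.
    from decimal import Decimal


def parse_amount(text: str) -> "Decimal":
    # Deferred import: the module is loaded on first call, not at CLI start-up.
    from decimal import Decimal

    return Decimal(text)


print(parse_amount("1.50"))
```

With this layout, a command such as `--help` that never calls the function does not pay the import cost.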
Jarek Potiuk e9b2ff57b8
Add notes about PIP 20.3 breaking Airflow installation (#12840)
Part of #12838
2020-12-05 19:53:09 +01:00
Xiaodong DENG 1f4152b551
Fix docstring for models.Variable.get() (#12828) 2020-12-05 17:08:28 +01:00
Jarek Potiuk 1dcd3e13fd
Add support for extra links coming from the providers (#12472)
Closes: #11431
2020-12-05 16:24:38 +01:00
Ash Berlin-Taylor 6150e265a0
Add `-o` as short form option for `--output` in CLI (#12831) 2020-12-05 15:16:14 +00:00
Ash Berlin-Taylor 3ff5a35494
Add paused column to `dags list` sub-command (#12830)
This can still show "None" if the dag is not yet in the metadata DB --
showing either True or False there would give a false impression
(especially False -- as if it doesn't exist in the DB it can't be
unpaused yet!)
2020-12-05 14:34:36 +00:00
yuqian90 12ce5be77f
Fix for empty Graph View when task does not have a DAG during relationship setting (#12829)
Closes #12757
2020-12-05 11:52:55 +00:00
Shekhar Singh cd66450b4e
Add Telegram hook and operator (#11850)
closes: #11845

Adds:

Telegram Hook
Telegram Operator
2020-12-05 11:21:11 +00:00
Ash Berlin-Taylor 252b04718e
Configuration.getsection copes with sections that only exist in user config (#12816)
If you try to run `airflow config list` with an old config you upgraded
from 1.8, it would fail for any sections that have been removed from the
default config -- `ldap` for instance.

This would also be a problem if the user makes a typo in a config
section, or is using the airflow config for storing their own
information.

While I was changing this code, I also removed the use of private
methods/variable access in favour of the public API.
2020-12-05 07:16:47 +01:00
Xiaodong DENG fbb8a4a151
Cleanup & improvements around scheduling (#12815)
* Cleanup & improvement around scheduling

- Remove unneeded code line
- Remove stale docstring
- Fix wrong docstring
- Fix stale doc image link in docstring
- avoid unnecessary loop in DagRun.schedule_tis()
- Minor improvement on DAG.deactivate_stale_dags()
  which is invoked inside SchedulerJob

* Revert one change, because we plan to have a dedicated project-wise PR for this issue

* One more fix: dagbag.read_dags_from_db = True in DagFileProcess.process_file() is not needed anymore
2020-12-05 07:12:59 +01:00
Sam Wheating c85f49454d
Updating documentation to specify sensitive config options (#12820) 2020-12-05 07:00:18 +01:00
Daniel Imberman e82cf0d01d
Dagrun object doesn't exist in the TriggerDagRunOperator (#12819)
* Dagrun object doesn't exist in the TriggerDagRunOperator

fixes  https://github.com/apache/airflow/issues/12587

Fixes issue where dag_run object is not populated if the dag_run already
exists and is reset

* change to get_last_dag_run

* Update airflow/operators/dagrun_operator.py

Co-authored-by: Tomek Urbaszek <turbaszek@gmail.com>

Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
2020-12-04 20:10:45 -08:00
zclimes c1cd50465c
Add 'headers' to template_fields in HttpSensor (#12809)
Co-authored-by: Zachary Climes <zclimes@zclimes-DCH2.com>

closes: #12796
2020-12-05 00:59:52 +00:00
Sam Wheating 37afe55775
Fix paths to images in README.md (#12756) 2020-12-04 22:07:53 +01:00
Ash Berlin-Taylor 2936c13a44
Get airflow version from importlib.metadata rather than hard-coding (#12786)
One less thing to change, and one less pre-commit step needed :)
2020-12-04 16:42:25 +00:00
Siddartha Ravichandran 88aa174047
Add SMTP timeout and retry limit for SMTP email backend. (#12801) 2020-12-04 16:20:39 +01:00
Tomek Urbaszek 1bd98cd54c
Improve error handling in cli and introduce consistency (#12764)
This PR is a follow-up to #12375 and #12704. It improves handling
of some errors in cli commands to avoid showing users too much traceback
and uses SystemExit consistently.
2020-12-04 10:41:41 +01:00
James Timmins abf5104354
Convert state arguments to ExternalTaskSensor to list (#12794)
Without this if you pass a tuple or a set etc things would break.
2020-12-04 09:30:03 +00:00
Xiaodong DENG 4da94b5a19
Clean noqa labels wrongly handled by black linter (#12791) 2020-12-03 22:42:15 +01:00
Xiaodong DENG 6b339c70c4
Avoid log spam & have more meaningful log when pull image in DockerOperator (#12763)
Fixing issue reported in https://github.com/apache/airflow/issues/12576

This change actually also makes the log more meaningful

Co-authored-by: Tomek Urbaszek <turbaszek@gmail.com>
2020-12-03 22:36:25 +01:00
Ryan Hamilton 28e83c30eb
Prevent unused scrollbars from appearing in FF on Linux (#12795) 2020-12-03 15:53:04 -05:00
宋财礼 d9d6dafb18
Fix the exception that the port is empty when using db shell (#12740)
* Fix the exception that the port is empty when using db shell
2020-12-03 19:19:26 +01:00
Kaxil Naik be7d867459
BugFix: Editing a DAG run or Task Instance on UI causes an Error (#12770)
closes https://github.com/apache/airflow/issues/12489
2020-12-03 17:53:59 +00:00
Ephraim Anierobi b62abfbfae
Handle ParserError when dag is triggered with invalid execution_date (#12618) 2020-12-03 17:04:14 +01:00
Tomek Urbaszek 56f82ba225
Change DEBUG color to green in coloured logger (#12784)
This makes it easier to see the difference between debug and error.
2020-12-03 16:07:43 +01:00
Kaxil Naik 8f48f12128
Fix typo in airflow/serialization/serialized_objects.py (#12767)
"Cnn't load plugins" -> "Can not load plugins"
2020-12-03 06:54:59 +01:00
James Timmins 386f6b2ecb
Refactor and speed up "DAG:" prefix permissions migration (#12720) 2020-12-02 20:57:00 +00:00
Tomek Urbaszek 67acdbdf92
Remove store_serialized_dags from config (#12754)
The store_serialized_dags is no longer used in our code base as
both webserver and scheduler require serialization.
2020-12-02 18:18:48 +01:00
Kaxil Naik 0400ee32d4
Allow using _CMD / _SECRET to set `[webserver] secret_key` config (#12742)
`[webserver] secret_key` is also a secret like Fernet key. Allowing
it to be set via _CMD or _SECRET allows users to use the external secret store for it.
2020-12-02 14:39:26 +00:00
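As a rough illustration of the `_CMD` idea in #12742 above (the value comes from the output of a command rather than being stored in the config), here is a generic sketch. `resolve_option` is a hypothetical helper, not Airflow's configuration API, and the command shown simply prints a fixed string using POSIX-style quoting.

```python
import shlex
import subprocess
import sys
from typing import Mapping, Optional


def resolve_option(options: Mapping[str, str], key: str) -> Optional[str]:
    """Return options[key], else the stripped stdout of the command in options[key + '_cmd']."""
    if key in options:
        return options[key]
    command = options.get(f"{key}_cmd")
    if command is None:
        return None
    result = subprocess.run(
        shlex.split(command), check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


# Hypothetical command standing in for a call to an external secret store.
command = f"{shlex.quote(sys.executable)} -c \"print('from-command')\""
print(resolve_option({"secret_key_cmd": command}, "secret_key"))
```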
Darwin Yip 2947e09999
SlackWebhookHook use password instead of extra (#12674)
closes: #12214
2020-12-02 14:09:28 +00:00
Danny 03fa6edc7a
Order broken DAG messages in UI (#12749) 2020-12-02 13:37:59 +00:00
Kaxil Naik 101da213c5
Optimize subclasses of DummyOperator for Scheduling (#12745)
Custom operators inheriting from DummyOperator will now be set
straight to success, instead of going to a scheduled state,
if they don't have callbacks set.

 closes https://github.com/apache/airflow/issues/11393
2020-12-02 12:49:17 +00:00
Ash Berlin-Taylor dab783fcdc
Don't let webserver run with dangerous config (#12747) 2020-12-02 10:55:22 +00:00
Tomek Urbaszek cba8d62553
Refactor list rendering in commands (#12704)
This commit unifies the mechanism of rendering output of tabular
data. This lets users either display a tabular
representation of data or render it as a valid json or yaml payload.

Closes: #12699

Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
2020-12-02 10:20:16 +01:00
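The idea in #12704 above, one rendering function with a selectable output format, can be sketched as follows. This is a simplified illustration with a hypothetical `render` function; it covers only the table and json formats (yaml is left out because it needs a third-party library) and does not reflect the actual implementation in the PR.

```python
import json
from typing import Dict, List


def render(rows: List[Dict[str, str]], output: str = "table") -> str:
    """Render rows either as a plain-text table or as a JSON payload."""
    if output == "json":
        return json.dumps(rows)
    if output != "table":
        raise ValueError(f"Unsupported output format: {output}")
    if not rows:
        return ""
    headers = list(rows[0])
    # Each column is as wide as its longest cell (or its header).
    widths = [max(len(h), *(len(str(r[h])) for r in rows)) for h in headers]
    lines = [" | ".join(h.ljust(w) for h, w in zip(headers, widths))]
    lines.append("-+-".join("-" * w for w in widths))
    for row in rows:
        lines.append(" | ".join(str(row[h]).ljust(w) for h, w in zip(headers, widths)))
    return "\n".join(lines)


rows = [{"dag_id": "example", "paused": "None"}]
print(render(rows))
print(render(rows, output="json"))
```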
Xiaodong DENG ae0e8f4732
Move config item 'worker_precheck' from section [core] to [celery] (#12746)
* Move config item 'worker_precheck' from section [core] to [celery]

This configuration is ONLY applicable for Celery Worker.
So it should be in section [celery], rather than [core]

* Add to deprecation/migration automatic list
2020-12-02 07:47:02 +01:00
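The "deprecation/migration automatic list" mentioned in #12746 above suggests a lookup that falls back to the old location of a moved option. The sketch below shows that general pattern with the standard `configparser`; `DEPRECATED_OPTIONS` and `get_option` are illustrative names, and the mapping is an assumption rather than Airflow's actual table.

```python
import configparser
import warnings

# (new_section, key) -> (old_section, key); an illustrative mapping only.
DEPRECATED_OPTIONS = {("celery", "worker_precheck"): ("core", "worker_precheck")}


def get_option(parser: configparser.ConfigParser, section: str, key: str) -> str:
    """Read an option, falling back to its deprecated location with a warning."""
    if parser.has_option(section, key):
        return parser.get(section, key)
    old_location = DEPRECATED_OPTIONS.get((section, key))
    if old_location and parser.has_option(*old_location):
        old_section, old_key = old_location
        warnings.warn(
            f"[{old_section}] {old_key} has moved to [{section}] {key}",
            DeprecationWarning,
            stacklevel=2,
        )
        return parser.get(old_section, old_key)
    raise KeyError(f"[{section}] {key}")


# An old-style config that still has the option under [core].
parser = configparser.ConfigParser()
parser.read_string("[core]\nworker_precheck = True\n")

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = get_option(parser, "celery", "worker_precheck")

print(value, len(caught))
```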
Kaxil Naik ac3a8bfb0c
Allow switching xcom_pickling to JSON/Pickle (#12724)
Without this commit, the Webserver throws an error when
enabling xcom_pickling in the airflow_config by setting `enable_xcom_pickling = True`
(the default is `False`).

Example error:

```
>           return pickle.loads(result.value)
E           _pickle.UnpicklingError: invalid load key, '{'.

airflow/models/xcom.py:250: UnpicklingError
--------------------------------------------------
```
2020-12-01 11:35:34 +00:00
HasanJ 4ac66cf8c4
Deprecate BaseHook.get_connections method (#10135) (#10192)
Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
2020-11-30 16:29:30 +00:00
Ash Berlin-Taylor 5e13c37286
Remove deprecated dagbag metrics (#12695)
These were deprecated in 1.10.6 via #6157, so we should remove them
before 2.0 rolls around.
2020-11-30 09:14:15 +00:00
Kamil Breguła bd90136aaf
Move operator guides to provider documentation packages (#12681) 2020-11-30 08:48:24 +01:00
Xiaodong DENG bb00f164da
Refine the DB query logics in www.views.task_stats() (#12707)
* Refine the DB query logics in www.views.task_stats()

- Given that filter_dag_ids is either allowed_dag_ids or the intersection of allowed_dag_ids and selected_dag_ids,
  filter_dag_ids should ALWAYS be applied in the SQL query, whether or not selected_dag_ids is None.

  Currently, if selected_dag_ids is None, the query actually fetches the full result (and then filters at the end).
  This means more (unnecessary) data travels between Airflow and the DB.

- When we join tables A and B on A.id == B.id (the default is an INNER join) and we always ensure that every A.id is in a specific list,
  all ids in the result table are implicitly guaranteed to be in that list as well.
  This is why the two redundant .filter() chunks are removed.

Minor performance improvement should be expected.
Meanwhile, this change makes the code cleaner.
2020-11-29 22:22:38 +01:00
Jarek Potiuk 2037303eef
Adds support for Connection/Hook discovery from providers (#12466)
* Adds support for Hook discovery from providers

This PR extends providers discovery with the mechanism
of retrieving mapping of connections from type to hook.

Fixes #12456

* fixup! Adds support for Hook discovery from providers

* fixup! fixup! Adds support for Hook discovery from providers
2020-11-29 15:31:49 +01:00
Tomek Urbaszek c9d1ea5cf8
Refactor airflow plugins command (#12697)
This commit refactors plugins command to make it more
user-friendly, structured and easier to read.
2020-11-29 12:16:59 +01:00
Ash Berlin-Taylor 02d94349be
Don't use time.time() or timezone.utcnow() for duration calculations (#12353)
`time.time() - start`, or `timezone.utcnow() - start_dttm` will work
fine in 99% of cases, but it has one fatal flaw:

They both operate on system time, and that can go backwards.

While this might be surprising, it can happen -- usually due to clocks
being adjusted.

And while it might seem rare, for long running processes it is more
common than we might expect. Most of these durations are harmless to get
wrong (they are only logged), but it is better to be safe than sorry.

Also, the replacement for the `utcnow()` style is much lighter weight -
creating a datetime object is a comparatively expensive operation, and
computing a diff between two even more so, _especially_ when compared to
just subtracting two floats.

To make the "common" case of wanting to compute a duration for a
block easier, I have made `Stats.timer()` return an object that has a
`duration` field.
2020-11-29 10:12:30 +00:00
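A minimal sketch of the approach argued for in #12353 above: measure durations with a monotonic clock and expose the result as a `duration` attribute. The `Timer` class here is a hypothetical stand-alone example and is not the implementation of `Stats.timer()`.

```python
import time
from typing import Optional


class Timer:
    """Context manager that measures elapsed time with a monotonic clock."""

    def __init__(self) -> None:
        self.duration: Optional[float] = None
        self._start: Optional[float] = None

    def __enter__(self) -> "Timer":
        # perf_counter() never goes backwards, unlike wall-clock time.
        self._start = time.perf_counter()
        return self

    def __exit__(self, *exc_info) -> None:
        self.duration = time.perf_counter() - self._start


with Timer() as timer:
    sum(range(100_000))

print(f"took {timer.duration:.6f}s")
```

Because the start and end values are plain floats from the same monotonic source, the subtraction is cheap and cannot produce a negative duration when the system clock is adjusted.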
Ash Berlin-Taylor 8291fabaf9
Ensure that tasks set to up_for_retry have an end date (#12675)
If a task is "manually" set to up_for_retry (via the UI for instance) it
might not have an end date, and much of the logic about computing
retries assumes that it does.

Without this, manually setting a running task to up_for_retry will make
it impossible to view the TaskInstance details page (as it
tries to print the is_premature property), and the NotInRetryPeriod
TIDep also fails - both with an exception:

> File "airflow/models/taskinstance.py", line 882, in next_retry_datetime
>   return self.end_date + delay
> TypeError: unsupported operand type(s) for +: 'NoneType' and 'datetime.timedelta'
2020-11-29 10:11:50 +00:00
Ash Berlin-Taylor 7ef9aa7d54
Replace pkg_resources with importlib.metadata to avoid VersionConflict errors (#12694)
Using `pkg_resources.iter_entry_points` validates the version
constraints, and if any fail it will throw an Exception for that
entrypoint.

This sounds nice, but is a huge mis-feature.

So instead of that, switch to using importlib.metadata (well, its
backport importlib_metadata) that just gives us the entrypoints - no
other verification of requirements is performed.

This has two advantages:

1. providers and plugins load much more reliably.
2. it's faster too

Closes #12692
2020-11-29 07:19:47 +01:00
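For #12694 above, listing entry points through `importlib.metadata` without any version-constraint validation could look like the sketch below. It uses the standard-library module rather than the `importlib_metadata` backport, and branches on the return type because the `entry_points()` API differs between Python versions. `load_entry_points` is a hypothetical helper, and the group name is used only as an example.

```python
from importlib import metadata
from typing import List


def load_entry_points(group: str) -> List[metadata.EntryPoint]:
    """Return the entry points registered under ``group`` without checking requirements."""
    entry_points = metadata.entry_points()
    if hasattr(entry_points, "select"):
        # Newer Python versions: selectable collection of entry points.
        return list(entry_points.select(group=group))
    # Older Python versions: a plain dict keyed by group name.
    return list(entry_points.get(group, []))


for entry_point in load_entry_points("airflow.plugins"):
    print(entry_point.name, entry_point.value)
```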
Tomek Urbaszek 850b74befe
Use rich to render info and cheat-sheet command (#12689) 2020-11-29 00:29:58 +01:00
Jarek Potiuk 64f14759e4
Fix typos and added missing descriptions in provider.yaml schema (#12690) 2020-11-28 19:27:08 +01:00
Jarek Potiuk b858683abf
Adds providers information to `airflow info` command (#12687) 2020-11-28 13:42:19 +01:00
Kamil Breguła 08bc62b64d
Validate JSON schema files with JSON Schema (#12682) 2020-11-28 12:12:54 +01:00
Nathan Hadfield 76bcd08dca
Added `@apply_defaults` decorator. (#12620) 2020-11-28 09:57:29 +01:00
Kamil Breguła de3b1e687b
Move connection guides to provider documentation packages (#12653) 2020-11-28 08:09:53 +01:00
Ephraim Anierobi 543d88b3a1
Add example dag and system tests for azure wasb and fileshare (#12673) 2020-11-28 06:27:00 +01:00
Kaxil Naik 704e724cc1
Make migrations using kube_resource_version idempotent (#12670)
closes https://github.com/apache/airflow/issues/12666
2020-11-27 22:49:31 +00:00
Jarek Potiuk 6b3c6add9e
Update setup.py to get non-conflicting set of dependencies (#12636)
This change upgrades setup.py and setup.cfg to provide non-conflicting
`pip check` valid set of constraints for CI image.

Fixes #10854

Co-authored-by: Tomek Urbaszek <turbaszek@apache.org>
2020-11-27 20:06:44 +01:00
Jarek Potiuk 41a699a7bd
Implement reading provider information from packages/sources (#12512)
This PR implements discovering and reading provider information from
packages (using entry_points) and - if found - from local
provider yaml files for the built-in airflow providers,
when they are found in the airflow.provider packages.
The provider.yaml files - if found - take precedence over the
package-provided ones.

Add displaying provider information in CLI

Closes: #12470
2020-11-27 18:42:32 +01:00
Kaxil Naik 531e00660a
Typo Fix: Deprecated config force_log_out_after was not used (#12661)
`force_logout_after` should be `force_log_out_after` in the code
section https://github.com/apache/airflow/blob/master/airflow/settings.py#L372-L381.

`force_log_out_after` is the name that is actually used and documented in
c5700a56bb/UPDATING.md (unify-user-session-lifetime-configuration).
2020-11-27 18:36:10 +01:00
Kaxil Naik 5fafd982ca
Replace foreign key constraints with foreign annotation (#12603)
closes https://github.com/apache/airflow/issues/12448
2020-11-27 12:59:23 +00:00
Tobiasz Kędzierski e1ebfa68b1
Add DataflowJobMessagesSensor and DataflowAutoscalingEventsSensor (#12249) 2020-11-27 13:02:13 +01:00
Xiaodong DENG 6e9c110e8b
Housekeeping: Remove 'dirty_ids' in www/views.py (#12645)
This is a clean-up/housekeeping change.

Usage of 'dirty_ids' is no longer applicable since PR #4378, back in Dec 2018
https://github.com/apache/airflow/pull/4378
2020-11-27 05:50:26 +01:00
Xiaodong DENG f16fa095e8
Clean-up airflow/kubernetes/kube_config.py (#12627)
- Remove the stale internal method `_get_security_context_val`.
  It was added in PR #5429 but, as far as I can see, it's not needed anymore.
- Avoid hard-coding when we can (we already have `core_section` specified, so we can avoid using `'core'`)
- Narrow down what we import from `airflow.settings`
2020-11-26 13:28:10 +01:00
Ash Berlin-Taylor eacf40d85e
Ensure that the `prohibit_commit` guard only applies to _one_ session. (#12575)
By listening on the engine's `commit` we were picking up _all_ session
commit calls, even from sessions other than the one passed to
`prohibit_commit`, which was not intended.

This changes it to listen to before_commit, which is session specific
rather than engine "global", and also adds tests, which were lacking
beforehand.
2020-11-25 22:44:17 +00:00
QP Hou 8f29c6d5b1
fix db migration downgrade actions (#12608) 2020-11-25 13:29:39 -08:00
Kaxil Naik 4f4714fa33
Fix session_lifetime_minutes config docs (#12628)
- Update `version_added` to 1.10.13
- Better format it using two back-ticks for code-block instead of italics
2020-11-25 19:49:08 +01:00
Kaxil Naik 324bc6f72d
Make AzureKeyVaultBackend backwards-compatible (#12626)
This module was released in 1.10.13. This commit just makes
it backwards-compatible
2020-11-25 18:28:02 +00:00
Jarek Potiuk cdaaff12c7
Fix Connection.description migration for MySQL8 (#12596)
Because MySQL8 tests were not being executed (fixed in #12591), the
description column added to the connection table was not compatible with
MySQL8 with the utf8mb4 character set.

This change adds migration and fixes the previous migration
to make it compatible.
2020-11-25 14:30:31 +01:00
Kaxil Naik 486134426b
Rename `[scheduler] max_threads` to `[scheduler] parsing_processes` (#12605)
From Airflow 2.0, `max_threads` config under `[scheduler]` section has been renamed to `parsing_processes`.

This is to align the name with the actual code, where the Scheduler launches the number of processes defined by
`[scheduler] parsing_processes` to parse DAG files, calculate the next DagRun date for each DAG,
serialize them, and store them in the DB.
2020-11-25 09:33:19 +00:00
Ehsan Poursaeed 08251c145d
Remove foreign key constraint on SerializedDagModel's dag_runs field (#12586)
Issue: https://github.com/apache/airflow/issues/12448
2020-11-25 02:07:10 +00:00
Xiaodong DENG c6467ba12d
Update logging & doc for LocalFilesystem Secrets Backend (#12597)
- Support for YAML was added in PR https://github.com/apache/airflow/pull/9477
  Most docs were updated for this, but a few docstrings and exception log messages were missed
2020-11-25 00:24:12 +00:00
Bjorn Olsen 663259d4b5
Fix AWS DataSync tests failing (#11020)
closes #10985
2020-11-25 00:09:05 +00:00
Daniel Imberman 6caf2607e0
Don't set child tasks to schedulable in test runs (#12595)
Fixes a bug where Airflow will attempt to set child tasks to schedulable
for test tasks when users run `airflow task test`. This causes an error
as Airflow runs a DB seek for a task that has not been recorded.
2020-11-24 13:43:22 -08:00
Tomek Urbaszek b57b932113
Improve code quality of ExternalTaskSensor (#12574) 2020-11-24 09:47:57 +01:00
Xiaodong DENG 74ed92b3ff
Drop random.choice() in BaseHook.get_connection() (#12573)
https://github.com/apache/airflow/pull/9067 made conn_id unique,
and this is effective from 2.0.*.
Due to this change, BaseHook.get_connections() will return a List of length 1, or raise Exception.

In such a case, we should simply always take the only element
from the result of BaseHook.get_connections(),
and drop random.choice() in BaseHook.get_connection(), which was only applicable for the earlier
setting (where multiple connections were allowed for a single conn_id)
2020-11-24 06:34:08 +01:00
Tobiasz Kędzierski 3fa51f94d7
Add check for duplicates in provider.yaml files (#12578) 2020-11-24 05:25:29 +01:00
Joshua Carp 6d0dcd2f30
Use html urls instead of onclick for dags view links. (#12539)
The dags view uses onclick events for dagrun and taskinstance links.
This breaks url previews, copying urls, opening links in a new tab, etc.
This patch uses svg anchors with href attributes instead of onclick
events so that these links behave like normal links.
2020-11-23 19:38:50 -05:00
QP Hou 01ff088dfb
Fix Dag Serialization crash caused by preset DagContext (#12530) 2020-11-23 20:42:50 +00:00
Xiaodong DENG f6ba8b5757
Doc Fix around Secret/Connection/Variable (#12571)
Documentation fixes/improvements:

- For Variables set by Environment Variable,
   it was highlighted that it may not appear in the web UI.
   But this was not highlighted for Connections set by Environment Variable.
   This PR adds this note (in docs/howto/connection/index.rst).

- Fix wrong docstring of airflow.secrets.base_secrets.BaseSecretsBackend.get_variable().

- The Secret Backends don't properly mention Variables in the docstrings
   (all the focus was put on Connections only). This PR addresses this.

- A few other minor changes.
2020-11-23 21:26:44 +01:00
Xiaodong DENG 753f53f77c
Housekeeping for www/security.py (#12516)
## Housekeeping for www/security.py

- correct type hint for dag_id (str rather than int)
- Use DAG name without prefix "DAG:" in logging (line 644)
- avoid unnecessary duplicated operation (line 653)

## Clean-up the logic in update_admin_perm_view()

Because RESOURCE_DAG_PREFIX is "DAG:" and RESOURCE_DAG is "DAGs",
if we have view_menu_id.in_(pv_ids),
we can be sure that view_menu_id != all_dag_view.id.

By making this change, we have cleaner logic, and can avoid some round trips to the DB.
2020-11-23 18:07:57 +01:00
Beni Ben zikry c02a3f59e4
Spark-on-k8s sensor logs - properly pass defined namespace to pod log call (#11199)
This is a follow up to #10023 - it seems that passing the defined namespace to the log call was missed in one of the rebases done during the PR.
Without this fix, logging will fail when the k8s connection uses a different namespace than the one SparkApplication(s) are actually submitted to.
2020-11-23 14:02:01 +00:00
Dmitriy Synkov ed09915a02
[AIRFLOW-5115] Bugfix for S3KeySensor failing to accept template_fields (#12389) 2020-11-23 09:18:58 +01:00
Aman Ranjan Thakur ff990f245e
Add capability to specify gunicorn access log format (#10261) 2020-11-23 04:29:10 +01:00
Kamil Breguła de15aa30d4
Deprecate Read the Docs (#12541) 2020-11-22 19:45:12 +01:00
Kamil Breguła ef4af21351
Move providers docs to separate package + Spell-check in a common job with docs-build (#12527) 2020-11-22 09:29:51 +01:00
Joshua Carp 397d9128b6
Return nonzero exit codes on pool import errors. (#12095)
The pool import command returns an exit code of zero in a few different
error cases. This causes problems for scripts that invoke the command,
since commands that actually failed will appear to have worked. This
patch returns a nonzero code if the pool file doesn't exist, if the file
isn't valid json, or if any of the pools in the file is invalid.
2020-11-21 19:33:06 -05:00
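The three failure cases listed in #12095 above (missing file, invalid json, invalid pool) can each be turned into a nonzero exit by raising `SystemExit` with a message. The sketch below is an assumption-laden illustration: `import_pools` is a hypothetical function, and the file layout (a JSON object mapping pool names to objects with a `slots` key) is assumed for the example rather than taken from the source.

```python
import json
import os


def import_pools(path: str) -> int:
    """Validate a pools file and return the number of pools; exit nonzero on any error."""
    if not os.path.exists(path):
        raise SystemExit(f"Missing pools file: {path}")
    with open(path) as handle:
        try:
            pools = json.load(handle)
        except json.JSONDecodeError as err:
            raise SystemExit(f"Invalid json file: {err}")
    if not isinstance(pools, dict):
        raise SystemExit("Pools file must contain a JSON object")
    # Assumed layout: {"pool_name": {"slots": <int>, ...}, ...}
    invalid = [
        name
        for name, spec in pools.items()
        if not isinstance(spec, dict) or "slots" not in spec
    ]
    if invalid:
        raise SystemExit(f"Invalid pools: {invalid}")
    return len(pools)


try:
    import_pools("/no/such/pools.json")
except SystemExit as exit_request:
    # Uncaught, a SystemExit carrying a message makes the process exit with status 1.
    print(f"would exit with: {exit_request.code}")
```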
John Bampton 370e7d07d1
Fix Python Docstring parameters (#12513) 2020-11-21 12:11:31 +01:00
Kengo Seki 234d689387
Fix S3ToSnowflakeOperator docstring (#12504)
There's a parameter called s3_bucket in its docstring,
but it doesn't actually exist. The stage parameter exists instead.
2020-11-21 08:27:53 +01:00
Kaxil Naik 36a9b0f48b
Fix the default value for VaultBackend's config_path (#12518)
It is `config` not `configs`
2020-11-20 21:52:28 +00:00
Ephraim Anierobi 20843ff89d
Add missing file_token field to get dag details API endpoint (#12463) 2020-11-20 16:28:55 +00:00
Kamil Breguła c34ef853c8
Separate out documentation building per provider (#12444)
* POC

* fixup! POC
2020-11-20 15:35:56 +01:00
Kaxil Naik 7d55d45498
Reorder Migrations to make it 1.10.13 compatible (#12496)
This commit makes Airflow 2.0 migrations compatible with 1.10.13 so users can
easily upgrade from 1.10.13 to 2.0
2020-11-20 12:34:10 +00:00
Tim van de Keer 4428235071
Fixes taskInstances API endpoint when start_date, end_date or state are None(null) (#12453)
Fixes a bug when calling `/api/v1/dags/~/dagRuns/~/taskInstances/list` with dag_ids as parameter.

The schema had defined `start_date`, `end_date` and `state` as non-nullable, but they are optional.
2020-11-20 11:36:46 +00:00
James Timmins e9cfa393ab
Turn off foreign keys before altering table to prevent sqlite issue. (#12487)
Closes #12488
2020-11-20 08:43:46 +00:00
Kaxil Naik 502e883bbd
Make kubernetes requirement optional for Example DAGs (#12494) 2020-11-19 22:19:33 +00:00
Ryan Hamilton de9d2fa3a2
ensure Moment date is valid before attempting to render it (#12492) 2020-11-19 15:48:21 -05:00
Ash Berlin-Taylor 9e089ab895
Fix Kube tests (#12479)
This is the same fix as in #12461, but we didn't notice it as the tests
failed after 50 failures.

It also turns out that the k8s API doesn't take a V1NodeSelector and instead
just takes a dict.

Co-authored-by: Daniel Imberman <daniel.imberman@gmail.com>
2020-11-19 20:45:33 +00:00
Ryan Hamilton 93d64e557b
Update tag color to be neutral (and match DAGs index view) (#12493) 2020-11-19 15:22:32 -05:00
Ryan Hamilton fedf633227
Remove unused/uncompiled JS file (#12490) 2020-11-19 15:04:14 -05:00
Xiaodong DENG 8b95e51370
Improve www.security.get_accessible_dags() and webserver performance (#12458)
* Improve www.security.get_accessible_dags() and webserver performance

- the performance of get_accessible_dags() is improved by returning as early as possible
- the changes made in www.views.py are based on the fact that the check
  on permissions.RESOURCE_DAG is already done in get_accessible_dags(),
  which is invoked by get_accessible_dag_ids() then.

* Fix-up. Incorporate the changes suggested by jhtimmins with minor change

Co-Authored-By: jhtimmins <jameshtimmins@gmail.com>
2020-11-19 20:59:10 +01:00
Faisal 9e3b2c554d
GCP Secrets Optional Lookup (#12360) 2020-11-19 19:27:04 +00:00
Ryan Hamilton bc01907eea
Improve UI file naming/patterns (#12486)
* Use friendlier terms for file naming

* Correlate asset names to template names
2020-11-19 13:12:42 -05:00
John Bampton 13128f44ec
Fix Python docstring parameter (#12483) 2020-11-19 17:29:57 +01:00
Kaxil Naik a3dfd04ce4
Webserver: Further Sanitize values passed to origin param (#12459)
Follow-up of https://github.com/apache/airflow/pull/10334
2020-11-18 21:46:36 +00:00
Ash Berlin-Taylor 94ba200d42
Bump version to 2.0.0b3 (#12462) 2020-11-18 21:25:54 +00:00
James Timmins f30c0a638f
Fix typo in migrations: RESOURCE_DAGS to RESOURCE_DAG. (#12460)
This was missed in CI because it only became a problem when run with existing DAG data in the DB, which CI doesn't have.
2020-11-18 20:45:31 +00:00
Ash Berlin-Taylor d32fe78c0d
Update readmes for cncf.kube provider fixes (#12457) 2020-11-18 18:30:17 +00:00
Ryan Hamilton 411c686800
Improve the layout of TI modal when browser at narrower widths (#12456) 2020-11-18 13:23:27 -05:00
Daniel Imberman d84a52dc8f
Fix broken example_kubernetes DAG (#12455) 2020-11-18 17:13:18 +00:00
Daniel Imberman 7c8b71d201
Fix backwards compatibility further (#12451)
* Fix backwards compatibility further

This PR ensures that node_selector, affinity, and tolerations are all
converted into k8s API objects before they are sent to the
pod_mutation_hook. This fixes an inconsistency that would force airflow
engineers to consider both cases when writing their pod_mutation_hook.

* nit
2020-11-18 08:22:23 -08:00
Ash Berlin-Taylor 0080354502
Update provider READMEs for 1.0.0b2 batch release (#12449) 2020-11-18 07:42:12 -08:00
Ryan Hamilton b584adbe11
Fix bug in server timezone indicator (#12447) 2020-11-18 14:58:06 +00:00
Jarek Potiuk 7ca0b6f121
Enable Markdownlint rule MD003/heading-style/header-style (#12427) (#12438)
Co-authored-by: John Bampton <jbampton@users.noreply.github.com>
2020-11-18 15:37:51 +01:00
Tomek Urbaszek 8d09506464
Fix download method in GCSToBigQueryOperator (#12442)
closes: #12439
2020-11-18 12:25:06 +01:00
Abhilash Kishore bf6da166a9
Add description field to connection (#10873)
closes https://github.com/apache/airflow/issues/10840
2020-11-18 11:00:30 +00:00
Dr. Dennis Akpenyi fa36f3314e
PR to add 'files' to template-fields in EmailOperator class (#12428) 2020-11-18 07:30:28 +01:00
Kamil Breguła c9f9d2cea8
Optimize json schema validation in providers_manager (#12420) 2020-11-18 07:07:53 +01:00
Akim Akimov 966ee7d994
JSON Response is returned for invalid API requests (#12305) 2020-11-18 05:35:22 +01:00
Kaxil Naik 763b40d223
Raise correct Warning in kubernetes/backcompat/volume_mount.py (#12432)
It was raising a warning with a message to use `V1Volume` instead of `V1VolumeMount`
2020-11-18 03:51:41 +00:00
Kaxil Naik bc4bb30588
Fix docstrings for Kubernetes Backcompat module (#12422)
These were missed in https://github.com/apache/airflow/pull/12384
2020-11-18 01:01:19 +00:00
Kaxil Naik 506ee1f06c
Fix issues with Gantt View (#12419)
closes https://github.com/apache/airflow/issues/9813
closes https://github.com/apache/airflow/issues/9633

and does some cleanup
2020-11-17 21:50:39 +00:00
Daniel Imberman cab86d80d4
Make K8sPodOperator backwards compatible (#12384)
* Make the KubernetesPodOperator backwards compatible

This PR significantly reduces the pain of upgrading to Airflow 2.0
for users of the KubernetesPodOperator. Users will be allowed to
continue using the airflow.kubernetes custom classes

* spellcheck

* spelling

* clean up unnecessary files in 1.10

* clean up unnecessary files in 1.10

* clean up unnecessary files in 1.10
2020-11-17 13:47:18 -08:00
Ryan Hamilton a80a320ab9
Don't display when None (#12415) 2020-11-17 15:51:15 -05:00
Kaxil Naik bf3ead13cc
Change log level for User's session to DEBUG (#12414)
This line was logged too often -- too chatty
2020-11-17 21:48:24 +01:00
Tomek Urbaszek a4aa32b875
Simplify using XComArg in jinja template string (#12405)
This changes XComArg string representation from 'task_instance.pull(...)'
to '{{ task_instance.xcom_pull(...) }}' so users can use XComArgs with
f-string (and other) in simpler way. Instead of doing
f'echo {{{{ {op.output} }}}}' they can simply do f'echo {op.output}'.
2020-11-17 19:35:12 +01:00
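The brace-escaping problem that #12405 describes can be reproduced without Airflow. The sketch below uses a hypothetical stand-in class (`FakeOutput` is not an Airflow name, and the argument list in the template is simplified) whose string form is already wrapped in Jinja delimiters, so a plain f-string is enough:

```python
class FakeOutput:
    """Hypothetical stand-in for an operator's ``.output`` (an XComArg)."""

    def __init__(self, task_id: str, key: str = "return_value") -> None:
        self.task_id = task_id
        self.key = key

    def __str__(self) -> str:
        # New-style representation: the Jinja delimiters are part of the string,
        # so callers no longer have to write the quadruple braces themselves.
        return (
            "{{ task_instance.xcom_pull("
            f"task_ids='{self.task_id}', key='{self.key}'"
            ") }}"
        )


output = FakeOutput("extract")

# With the old representation this had to be f"echo {{{{ {output} }}}}".
command = f"echo {output}"
```

The resulting `command` is a template string that Jinja can render later. Nothing is pulled from XCom when the f-string is evaluated.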
Jarek Potiuk 2c0920fba5
Adds mechanism for provider package discovery. (#12383)
This is a simple mechanism that will allow us to dynamically
discover and register all provider packages in the Airflow core.

Closes: #11422
2020-11-17 18:48:57 +01:00
Kamil Breguła 2cda2f2a0a
Add missing pre-commit definition - provider-yamls (#12393) 2020-11-17 15:44:46 +01:00
Tobiasz Kędzierski 80a957f142
Add Dataflow sensors - job metrics (#12039) 2020-11-17 11:43:13 +01:00
Jarek Potiuk ae7cb4a1e2
Update wrong commit hash in backport provider changes (#12390)
The commit was rebased so hash changed. This restores the right one.
2020-11-17 10:29:14 +01:00
Xiaodong DENG 35b5614817
Remove inapplicable configuration section [ldap] (since 2.0.0) (#12386)
[ldap] section in airflow.cfg is not applicable anymore in 2.0 and master,
because the LDAP authentication (for webserver and API) is handled by FAB,
and the configuration for this is handled by webserver_config.py file.
2020-11-16 21:34:20 +01:00
James Timmins d4e1ff290f
Handle outdated webserver session timeout gracefully. (#12332) 2020-11-16 19:38:28 +00:00
Tomek Urbaszek 1623df8721
Use different deserialization method in XCom init_on_load (#12327)
The init_on_load method used deserialize_value method which
in case of custom XCom backends may perform requests to external
services (for example downloading file from buckets).

This is problematic because wherever we query XCom the request would be
sent (for example when listing XCom in webui). This PR proposes implementing
orm_deserialize_value which allows overriding this behavior. By default
we use BaseXCom.deserialize_value.

closes: #12315
2020-11-16 13:32:36 +01:00
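A rough illustration of the split that #12327 describes, using simplified stand-in classes rather than Airflow's real XCom model (the class names and the `downloads` counter are assumptions made for this sketch): the full deserializer may contact an external service, while the ORM-load variant can be overridden to return something cheap.

```python
import json


class BaseXComSketch:
    """Simplified stand-in for an XCom row; not the real Airflow class."""

    def __init__(self, value: bytes) -> None:
        self.value = value

    @staticmethod
    def deserialize_value(result: "BaseXComSketch"):
        # Full deserialization, used when a task actually needs the value.
        return json.loads(result.value.decode("utf-8"))

    def orm_deserialize_value(self):
        # Default: loading a row from the database behaves like deserialize_value.
        return BaseXComSketch.deserialize_value(self)


class BucketXComSketch(BaseXComSketch):
    """Hypothetical custom backend that stores only a bucket path in the database."""

    downloads = 0  # counts simulated calls to the external service

    @staticmethod
    def deserialize_value(result: "BaseXComSketch"):
        BucketXComSketch.downloads += 1  # pretend to download the object
        return {"downloaded_from": json.loads(result.value.decode("utf-8"))}

    def orm_deserialize_value(self):
        # Cheap representation for listings: no external request is made.
        return f"<stored at {json.loads(self.value.decode('utf-8'))}>"


row = BucketXComSketch(b'"gs://bucket/key"')
preview = row.orm_deserialize_value()
```

Listing rows through `orm_deserialize_value` leaves `BucketXComSketch.downloads` at zero. Only an explicit `deserialize_value` call triggers the simulated download.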
Tomek Urbaszek 917e6c4424
Add provide_file_and_upload to GCSHook (#12310)
This commit adds provide_file_and_upload context manager
which works similar to provide_file. Users using it can
avoid boilerplate code of creating temporary file and then
uploading its content to GCS.
2020-11-16 10:46:34 +01:00
John Bampton 6f0cf3f724
Remove unneeded parentheses after Black formatting (#12380) 2020-11-16 09:24:54 +00:00
Ash Berlin-Taylor 6d05108fed
Add info log message about duration taken to load plugins (#12308)
Loading plugins, particularly from setuptools entry points can be slow,
and since by default this happens per-task, it can slow down task
execution unexpectedly.

By having this log message users can know the source of the delay

(The change to test_standard_task_runner was to remove logging-config
side effects from that test)
2020-11-16 09:03:43 +00:00
Xiaodong DENG 561e459491
Proper exit status for failed CLI requests (#12375)
Some CLI commands simply print messages when the requests fail.
The issue is that the exit code for these commands is 0 while it should be non-zero.

Pursuing very detailed status code may not make sense here.
But we can at least ensure we give non-zero status by using raise SystemExit().

More proper exit status ensures people can better make use of the CLI.

(A few minor string expression issues are fixed here as well).
2020-11-15 21:39:14 +01:00
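The idea in #12375 can be shown with a small, hypothetical command function (`delete_pool` and its `pools` argument are invented for this sketch and are not Airflow's CLI code). Raising `SystemExit` with a message makes the interpreter print the message to stderr and exit with status 1, instead of printing and returning 0:

```python
def delete_pool(name: str, pools: dict) -> None:
    """Hypothetical CLI action: fail with a non-zero exit status when the pool is missing."""
    if name not in pools:
        # A string argument is written to stderr and the process exits with status 1.
        raise SystemExit(f"Pool {name} does not exist")
    del pools[name]


pools = {"default_pool": 128}
delete_pool("default_pool", pools)  # succeeds silently
```

Shell scripts and CI jobs can then rely on the exit status rather than parsing the printed message.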
Tobiasz Kędzierski cfa4ecfeb0
Add DataflowJobStatusSensor and support non-blocking execution of jobs (#11726) 2020-11-15 20:54:05 +01:00
Tomek Urbaszek 39ea8722c0
Check for TaskGroup in _PythonDecoratedOperator (#12312)
Crucial feature of functions decorated by @task is to be able
to invoke them multiple times in single DAG. To do this we are
generating custom task_id for each invocation. However, this didn't
work with TaskGroup as the task_id is already altered by adding group_id
prefix. This PR fixes it.

closes: #12309

Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
2020-11-15 12:28:04 +01:00
Xiaodong DENG 823b3aace2
Reject 'connections add' CLI request if URI provided is invalid (#12370)
The validity is decided by availability of both 'scheme' and 'netloc' in the parse result
2020-11-15 11:47:57 +01:00
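The validity rule stated in #12370 (both 'scheme' and 'netloc' must be present in the parse result) maps directly onto the standard library. The helper name below is an assumption for illustration, not the function used in Airflow:

```python
from urllib.parse import urlparse


def is_valid_conn_uri(uri: str) -> bool:
    """Return True only if the URI has both a scheme and a network location."""
    parts = urlparse(uri)
    return bool(parts.scheme and parts.netloc)


ok = is_valid_conn_uri("postgres://user:secret@db.example.com:5432/airflow")
no_scheme = is_valid_conn_uri("not-a-uri")
no_netloc = is_valid_conn_uri("postgres:///airflow")
```

This is a deliberately loose check: it rejects obviously malformed input without trying to validate the scheme against known connection types.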
Kamil Breguła 6889a333cf
Improvements for operators and hooks ref docs (#12366) 2020-11-15 00:50:30 +01:00
Daniel Imberman 221f809c1b
Fix full_pod_spec for k8spodoperator (#12354)
* Fix full_pod_spec for k8spodoperator

Fixes a bug where the `full_pod_spec` argument is never factored
into the kubernetespodoperator. The new order of operations is as
follows:

1. Check to see if there is a pod_template_file and if so create the initial pod, else start with empty pod
2. if there is a full_pod_spec , reconcile the pod_template_file pod and the full_pod_spec pod
3.  reconcile with any of the argument overrides

* add tests
2020-11-14 11:32:34 -08:00
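The order of operations listed in #12354 can be sketched with plain dictionaries. This is only a model of the precedence; the real operator works with Kubernetes pod objects, and `reconcile` and `build_pod` are hypothetical names chosen for this example:

```python
def reconcile(base: dict, override: dict) -> dict:
    """Return a copy of ``base`` where non-None values from ``override`` win."""
    merged = dict(base)
    merged.update({key: value for key, value in override.items() if value is not None})
    return merged


def build_pod(template_pod=None, full_pod_spec=None, argument_overrides=None) -> dict:
    # 1. Start from the pod_template_file pod if there is one, else an empty pod.
    pod = dict(template_pod) if template_pod else {}
    # 2. Reconcile with full_pod_spec, if given.
    if full_pod_spec:
        pod = reconcile(pod, full_pod_spec)
    # 3. Reconcile with any argument overrides.
    return reconcile(pod, argument_overrides or {})


pod = build_pod(
    template_pod={"image": "base:1", "namespace": "default"},
    full_pod_spec={"image": "custom:2"},
    argument_overrides={"name": "task-1"},
)
```

Each later source overrides the earlier one only for the fields it actually sets, which is the property the three steps are meant to guarantee.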
Tomek Urbaszek bcb2437343
Remove redundant method in KubernetesExecutor (#12317)
The _inject_secrets method was invoked but it performed no action so
it seems that we can remove it.
2020-11-14 16:39:34 +01:00
Kaxil Naik 9e7b7efb69
Reorder Database Migrations (#12362)
Because `2c6edca13270` (Resource based permissions) & `849da589634d` (Prefix DAG permissions)
were run before `92c57b58940d_add_fab_tables.py` and `03afc6b6f902_increase_length_of_fab_ab_view_menu_.py`,
the FAB tables were already created because those migrations imported `from airflow.www.app import create_app`
which calls the following lines that creates tables:

0e7f62418b/flask_appbuilder/security/sqla/manager.py (L86-L97)

Previously:

```
INFO  [alembic.runtime.migration] Running upgrade bef4f3d11e8b -> 98271e7606e2, Add scheduling_decision to DagRun and DAG
INFO  [alembic.runtime.migration] Running upgrade 98271e7606e2 -> 52d53670a240, fix_mssql_exec_date_rendered_task_instance_fields_for_MSSQL
INFO  [alembic.runtime.migration] Running upgrade 52d53670a240 -> 849da589634d, Prefix DAG permissions.
[2020-11-14 02:35:43,055] {manager.py:727} WARNING - No user yet created, use flask fab command to do it.
[2020-11-14 02:35:46,790] {migration.py:515} INFO - Running upgrade 849da589634d -> 364159666cbd, Add creating_job_id to DagRun table
[2020-11-14 02:35:46,794] {migration.py:515} INFO - Running upgrade 364159666cbd -> 2c6edca13270, Resource based permissions.
[2020-11-14 02:35:46,795] {app.py:87} INFO - User session lifetime is set to 43200 minutes.
[2020-11-14 02:35:46,806] {manager.py:727} WARNING - No user yet created, use flask fab command to do it.
[2020-11-14 02:35:48,221] {migration.py:515} INFO - Running upgrade 2c6edca13270 -> 45ba3f1493b9, add-k8s-yaml-to-rendered-templates
[2020-11-14 02:35:48,226] {migration.py:515} INFO - Running upgrade 45ba3f1493b9 -> 92c57b58940d, Create FAB Tables
[2020-11-14 02:35:48,227] {migration.py:515} INFO - Running upgrade 92c57b58940d -> 03afc6b6f902, Increase length of FAB ab_view_menu.name column
```

Now:

```
INFO  [alembic.runtime.migration] Running upgrade bef4f3d11e8b -> 98271e7606e2, Add scheduling_decision to DagRun and DAG
INFO  [alembic.runtime.migration] Running upgrade 98271e7606e2 -> 52d53670a240, fix_mssql_exec_date_rendered_task_instance_fields_for_MSSQL
INFO  [alembic.runtime.migration] Running upgrade 52d53670a240 -> 364159666cbd, Add creating_job_id to DagRun table
INFO  [alembic.runtime.migration] Running upgrade 364159666cbd -> 45ba3f1493b9, add-k8s-yaml-to-rendered-templates
INFO  [alembic.runtime.migration] Running upgrade 45ba3f1493b9 -> 92c57b58940d, Create FAB Tables
INFO  [alembic.runtime.migration] Running upgrade 92c57b58940d -> 03afc6b6f902, Increase length of FAB ab_view_menu.name column
INFO  [alembic.runtime.migration] Running upgrade 03afc6b6f902 -> 849da589634d, Prefix DAG permissions.
[2020-11-14 02:57:18,886] {manager.py:727} WARNING - No user yet created, use flask fab command to do it.
[2020-11-14 02:57:22,380] {migration.py:515} INFO - Running upgrade 849da589634d -> 2c6edca13270, Resource based permissions.
```
2020-11-14 09:13:33 +00:00
Ace Haidrey f32497395a
Add success/failed sets to State class (#12359)
Co-authored-by: Ace Haidrey <ahaidrey@pinterest.com>
2020-11-14 09:22:32 +01:00
Martijn Pieters 4c25e76360
Refactor root logger handling in task run (#12342)
- Use a context manager to encapsulate task logging setup and teardown
- Create a copy, not a reference, of the handlers list
- Remove logging.shutdown(), it simply should not be called

Closes #12090
2020-11-14 04:10:08 +00:00
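The first two points of #12342 can be illustrated with the standard `logging` module alone. The context manager below is a sketch under assumed names (`redirected_root_handlers` is not Airflow's function): it swaps the root logger's handlers for those of a task logger and restores a saved copy afterwards.

```python
import contextlib
import logging


@contextlib.contextmanager
def redirected_root_handlers(task_logger: logging.Logger):
    """Temporarily route root-logger output through ``task_logger``'s handlers."""
    root = logging.getLogger()
    # Copy the list: keeping a reference would be changed by the assignment below.
    saved_handlers = root.handlers[:]
    saved_level = root.level
    root.handlers[:] = task_logger.handlers
    root.setLevel(task_logger.level)
    try:
        yield root
    finally:
        # Teardown always runs, even if the task body raises.
        root.handlers[:] = saved_handlers
        root.setLevel(saved_level)
```

Note that the teardown does not call `logging.shutdown()`; it only puts the previous handlers and level back.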
Kaxil Naik 3a72fc8247
Fix Description of Provider Docs (#12361)
Apache Druid had description for Cassandra. Dingding had it for Datadog. And typo in Vertica
2020-11-14 03:55:43 +00:00
Kaxil Naik 02ef8e1cb3
Manage Flask AppBuilder Tables using Alembic Migrations (#12352)
closes https://github.com/apache/airflow/issues/9155

The Migration is idempotent and allows both upgrade and downgrade.
It also takes care of https://github.com/dpgaspar/Flask-AppBuilder/pull/1368
i.e. increasing the length of ab_view_menu.name column from 100 to 250
2020-11-14 03:25:36 +00:00
Ryan Hamilton ba76eb4961
Make nav fully accessible by keyboard, fix mobile nav menus (#12351) 2020-11-13 18:50:42 -05:00
Ace Haidrey aac3877ec3
Add metric for scheduling delay between first run task & expected start time (#9544)
Co-authored-by: Ace Haidrey <ahaidrey@pinterest.com>
2020-11-13 23:03:42 +01:00
Daniel Imberman 4e362c1347
K8s yaml templates not rendered by k8sexecutor (#12303)
* K8s yaml templates not rendered by k8sexecutor

There is a bug in the yaml template rendering caused by the logic that
yaml templates are only generated when the current executor is the
k8sexecutor. This is a problem as the templates are generated by the
task pod, which is itself running a LocalExecutor. Also generates a
"base" template if this taskInstance has not run yet.

* fix tests

* fix taskinstance test

* fix taskinstance

* fix pod generator tests

* fix podgen

* Update tests/kubernetes/test_pod_generator.py

Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>

* @ashb comment

Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
2020-11-13 12:06:29 -08:00
Martijn Pieters d54f087b66
Use the backend-configured model (#12336)
Rather than import the backend Task model directly, use the class that the backend actually uses. This could have been customised, and there is no reason not to use this reference.
2020-11-13 19:58:45 +01:00
Ephraim Anierobi 3e4aa06617
Bugfix: REST API Variables update endpoint returns 204 No Content (#12321) 2020-11-13 17:58:09 +00:00
Ryan Hamilton e5e47dac47
Fix/Enhancement: Disable forms and communicate to user when no DAG Runs (#12320)
* Disable forms and communicate to user when no DAG runs yet

* Refactor method name to not use negation in name

* lint fix
2020-11-13 12:46:59 -05:00
Ryan Hamilton 450bd32082
Improve presentation of DAG Docs (#12330)
* Improve presentation of DAG docs

* syntax fix
2020-11-13 12:35:24 -05:00
Nicolas Lecoy 309b325c17
Update deprecated Apache Pinot Broker API (#12333)
* Update deprecated Apache Pinot Broker API

* Fix typo on Pinot Hook
2020-11-13 16:48:06 +01:00
Tomek Urbaszek 1222ebd4e1
Create DAG-level cluster policy (#12184)
This commit adds new concept of dag_policy which is checked
once for every DAG when creating DagBag. It also improves
documentation around cluster policies.

closes: #12179

Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
2020-11-13 14:32:49 +01:00
João Ponte c94b1241a1
Add extra error handling to S3 remote logging (#9908)
If you have configured S3 logs, but there is a problem then this is never
surfaced to the UI (nor the webserver logs) making this very hard to
debug.

This PR exposes some of these errors to the user.

Co-authored-by: Joao Ponte <jpe@plista.com>
Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
2020-11-13 13:04:59 +00:00
Kamil Breguła 7825e8f590
Docs installation improvements (#12304)
* Improvements for installation docs
2020-11-13 09:38:54 +01:00
Trinity Xia b027223132
Add install/uninstall api to databricks hook (#12316)
- adding install Databricks API to databricks hook(api/2.0/libraries/install)

- adding uninstall Databricks API to databricks hook (2.0/libraries/uninstall)
2020-11-13 07:41:31 +01:00
Kaxil Naik 75f25bd8b9
Fix and Unquarantine test_change_state_for_tis_without_dagrun (#12323)
The test was simply wrong and failed since the new logic was added in
c9a97baa86
2020-11-13 05:18:44 +00:00
Nathan Hadfield 32b59f8350
Fixes the sending of an empty list to BigQuery `list_rows` (#12307)
* Fixes an issue that was causing an empty list to be sent to the BigQuery client `list_rows` method, resulting in no schema being returned.

* Added a test to check that providing an empty list for `selected_fields` results in `list_rows` being called with `None`.
2020-11-12 23:47:00 +01:00
Ryan Hamilton 7f828b03cc
Get all "tags" parameters not just one (#12324) 2020-11-12 16:45:25 -05:00
Kaxil Naik 5d5c119187
Remove deprecated Elasticsearch Configs (#12296)
Since Airflow 1.10.4 we have removed `elasticsearch_` prefix from all
config items under `[elasticsearch]` section. It is time we remove them
from 2.0.

https://github.com/apache/airflow/blob/1.10.4/UPDATING.md#changes-in-writing-logs-to-elasticsearch
2020-11-12 12:47:16 +00:00
Kaxil Naik ae93fdbabc
Remove deprecated BashTaskRunner (#12295)
This commit:

- Remove support for BashTaskRunner, this task_runner was deprecated from
Airflow 1.10.3 (https://github.com/apache/airflow/blob/1.10.3/UPDATING.md#rename-of-bashtaskrunner-to-standardtaskrunner)

- Support deprecated `hostname_callable` & `email_backend` until 2.1 since it has not been deprecated in any released Airflow versions
2020-11-12 12:46:50 +00:00
kukigai af2f2e8c29
Wait option for dagrun operator (#12126)
* Add wait_for_completion option to dag run operator.

* Add wait_for_completion option to dag run operator.

* Change code format to pass sanity check.

* Simplify the logic to check dag run state.

* Move sleep in the beginning of loop and update pydoc.

* Change elif to if on checking allowed_states

Co-authored-by: Kaz Ukigai <kukigai@apple.com>
2020-11-12 12:21:29 +01:00
Peter Kosztolanyi 9276607b58
Add session_parameters option to snowflake_hook (#12071) 2020-11-12 00:30:14 +00:00
Tomek Urbaszek 289c9b5a99
Use default view in TriggerDagRunLink (#11778) 2020-11-11 23:11:53 +01:00
Ephraim Anierobi 7478e18ee5
Handle naive datetimes in REST API (#12248) 2020-11-11 20:10:44 +01:00
Ephraim Anierobi 0d37c59669
Make dag_id, task_id, and execution_date nullable in event log schema (#12287) 2020-11-11 20:10:13 +01:00
Ash Berlin-Taylor 0d51a12e26
Don't wrap warning messages when stderr is not a TTY (#12285)
If stderr is not a TTY, rich was hard-wrapping warning messages at 80
characters:

```
/home/ash/code/airflow/airflow/airflow/configuration.py:328 DeprecationWarning:
The remote_logging option in [core] has been moved to the remote_logging option
in [logging] - the old setting has been used, but please update your config.
```

After

```
/home/ash/code/airflow/airflow/airflow/configuration.py:328 DeprecationWarning: The remote_logging option in [core] has been moved to the remote_logging option in [logging] - the old setting has been used, but please update your config.
```

`rich.print()` doesn't take a `soft_wrap` option, so I had to create a
`rich.console.Console` object -- and it seems best to cache those.
2020-11-11 17:03:24 +00:00
Ash Berlin-Taylor cbf49848af
Don't treat warning message as rich formatting codes. (#12283)
Before this commit:

```
  ...airflow/configuration.py:328 DeprecationWarning: The remote_logging option in  has been moved to the remote_logging option in  - the old setting has been used, but please update your config.
```

After this commit:

```
  ...airflow/configuration.py:328 DeprecationWarning: The remote_logging option in [core] has been moved to the remote_logging option in [logging] - the old setting has been used, but please update your config.
```

As this file is _always_ imported by anything in airflow, but warnings are quite rare I have
also delayed the import.
2020-11-11 15:34:12 +00:00
Michał Misiewicz e03a3f456f
Unify user session lifetime configuration (#11970)
* Unify user session lifetime configuration

* align with new linting rules

* exit app when removed args are provided in conf

* add one more test

* extract stopping gunicorn to method

* add docstring to stop_webserver method

* use lazy formatting

* exit webserver when removed options are provided

* align with markdown lint

* Move unify user session lifetime configuration section to master

* add new line

* remove quotes
2020-11-11 13:28:58 +01:00
Ryan Hamilton 7d5d334857
Fix pause/unpause toggle to display failed state when unsuccessful (#12267) 2020-11-10 22:31:01 +00:00
Ryan Hamilton 938c512c6d
Fix: Conditionally update button URL only when it is present (#12268)
Resolves #12254

A bug introduced in #11815. The function that updates the button URLs was failing when trying to update the "K8s Pod Spec" which is conditionally displayed (if k8s_or_k8scelery_executor). This fix adds a check to confirm the button exists before attempting.
2020-11-10 22:18:15 +00:00
Tomek Urbaszek 0cd1c846b2
Remove providers imports from core examples (#12252)
Core example DAGs should not depend on any non-core dependency
like providers packages.

closes: #12247

Co-authored-by: Xiaodong DENG <xd.deng.r@gmail.com>
2020-11-10 22:49:08 +01:00
Ash Berlin-Taylor 1521965bef
Release 2.0.0b2 (#12243) 2020-11-10 12:48:54 +00:00
Ash Berlin-Taylor c5806efb54
Added missing sendgrid readme (#12245)
Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
2020-11-10 12:39:38 +00:00
Kaxil Naik f8ae6e5cb6
Remove Unnecessary comprehension (#12221)
The inbuilt functions all() and any() in python also support
short-circuiting (evaluation stops as soon as the overall return value
of the function is known), but this behavior is lost if you use
comprehension. This affects performance.
2020-11-10 12:01:24 +00:00
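The short-circuiting difference that #12221 refers to is easy to observe by counting how many times the predicate runs. The `is_failed` helper and the state names are made up for this sketch:

```python
calls = []


def is_failed(state: str) -> bool:
    """Predicate that records every invocation so the evaluations can be counted."""
    calls.append(state)
    return state == "failed"


states = ["failed", "success", "running"]

# List comprehension: the whole list is built before any() sees the first item.
calls.clear()
any([is_failed(state) for state in states])
list_calls = len(calls)

# Generator expression: any() stops at the first truthy result.
calls.clear()
any(is_failed(state) for state in states)
generator_calls = len(calls)
```

Here `list_calls` is 3 and `generator_calls` is 1, because the first element already satisfies the predicate.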
Ash Berlin-Taylor 249d1741e4
Add back missing api_connexion/__init__.py file (#12240)
A bad rebase in #12082 deleted this file by mistake.

This missing file was also the cause of needing the documentation
to exclude these files

Fixes #12239
2020-11-10 11:26:34 +00:00
John Bampton 250436d962
Fix spelling in Python files (#12230) 2020-11-10 10:32:45 +00:00
John Bampton 502ba309ea
Enable Markdownlint rule - MD022/blanks-around-headings (#12225)
https://github.com/DavidAnson/markdownlint/blob/main/doc/Rules.md#md022---headings-should-be-surrounded-by-blank-lines
2020-11-10 10:36:45 +01:00
Xiaodong DENG dd2095f4a8
Simplify string expressions & Use f-string (#12216)
* Simplify string expressions & Use f-string

This is a follow-up clean-up work for the minor issues caused in the process of introducing Black

* Fixup
2020-11-10 08:48:27 +01:00
Ephraim Anierobi f37c6e6fce
Add Compute Engine SSH hook (#9879) 2020-11-10 02:20:38 +01:00
Kaxil Naik cd82fc3ada
Fix typo in docstrings (#12220)
`meatadata` -> `metadata`
2020-11-10 00:14:32 +00:00
Ash Berlin-Taylor 71d3eaf47e
Release 2.0.0beta1 (#12215) 2020-11-09 22:35:41 +00:00
Kaxil Naik a7272f47f6
Remove redundant parenthesis (#12213) 2020-11-09 22:00:55 +00:00
Ash Berlin-Taylor 85a18e13d9
Point at pypi project pages for cross-dependency of provider packages (#12212)
We mistakenly said "backport" which was clearly wrong, and also pointed
at the source code for them, but the pypi project page is more
appropriate.
2020-11-09 22:00:44 +00:00
Tomek Urbaszek cc12db79e2
Make warnings more visible (#12204)
This PR proposes to use custom showwarning function that
provides users with better information about warnings using
rich library to highlight the warning.
2020-11-09 21:14:46 +00:00
yuqian90 badd890675
Extend the same keyword args callable support in PythonOperator to some other sensors/operators (#11922)
This PR Standardises the callable signatures in PythonOperator, PythonSensor, ExternalTaskSensor, SimpleHttpOperator and HttpSensor.

The callable facilities in PythonOperator have been refactored into airflow.utils.helper.make_kwargs_callable. And it's used in those other places to make them work the same way.


Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
2020-11-09 21:05:07 +00:00
Daniel Imberman 90a147813a
Render k8s yaml for tasks via the Airflow UI (#11815)
This function allows users of the k8s executor to get previews
of their tasks via the Airflow UI before they launch

Co-authored-by: Ryan Hamilton <ryan@ryanahamilton.com>
Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
2020-11-09 19:55:48 +00:00
Ash Berlin-Taylor 59eb5de78c
Update provider READMEs for up-coming 1.0.0beta1 releases (#12206) 2020-11-09 19:40:30 +00:00
Ash Berlin-Taylor 55c401dbf9
Remove BaseDag and BaseDagBag classes (#12195)
Since #7694 these haven't really been needed, but we hadn't removed them
yet.

No UPDATING.md note for this as I think it's extremely unlikely anyone
was using this directly -- it's very much an implementation detail
relating to DAG/SimpleDag.
2020-11-09 15:34:27 +00:00
sangarshanan 8f423c7a43
Filter dags by owner (#11121)
* Filter dags by owner

* Separate links for multiple owners

* Minor style change

Co-authored-by: Ryan Hamilton <ryan@ryanahamilton.com>

Co-authored-by: Ryan Hamilton <ryan@ryanahamilton.com>
2020-11-09 10:08:40 -05:00
Jarek Potiuk 61feb6ec45
Provider's readmes generated for elasticsearch and google packages (#12194) 2020-11-09 14:22:19 +01:00
Mariusz Strzelecki 3f59e75cdf
KubernetesPodOperator: use randomized name to get the failure status (#12171)
Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
2020-11-09 13:13:49 +00:00
Jarek Potiuk b2a28d1590
Moves provider packages scripts to dev (#12082)
The change #10806 made airflow work with implicit packages
when "airflow" got imported. This is a good change, however
it has some unforeseen consequences. The 'provider_packages'
script copies all the providers code for backports in order
to refactor them to the empty "airflow" directory in
provider_packages folder. The #10806 change turned that
empty folder into an 'airflow' package because it was in the
same directory as the provider_packages scripts.

Moving the scripts to dev solves this problem.
2020-11-09 13:27:10 +01:00
Ash Berlin-Taylor 92e405e729
Call scheduler "book-keeping" operations less frequently. (#12139)
This change makes it so that certain operations in the scheduler are
called on a regular interval, instead of only once at start up, or every
time around the loop:

- adopt_or_reset_orphaned_tasks (detecting SchedulerJobs that died) was
  previously only called on start up.
- _clean_tis_without_dagrun was previously called every time around the
  scheduling loop, but it doesn't need to be done every time as
  this is a relatively rare cleanup operation
- _emit_pool_metrics doesn't need to be called _every_ time around the
  loop, once every 5 seconds is enough.

This uses the built in ["sched" module][sched] to handle the "timers".

[sched]: https://docs.python.org/3/library/sched.html
2020-11-09 12:14:24 +00:00
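A minimal sketch of repeating timers built on the standard `sched` module, as mentioned in #12139. The `call_regularly` helper, the fake clock, and the fixed number of repetitions are assumptions made so the example is deterministic; a long-running loop would use `time.monotonic` and `time.sleep` instead.

```python
import sched


class FakeClock:
    """Deterministic clock so the example finishes instantly."""

    def __init__(self) -> None:
        self.now = 0.0

    def time(self) -> float:
        return self.now

    def sleep(self, seconds: float) -> None:
        self.now += seconds


clock = FakeClock()
timers = sched.scheduler(clock.time, clock.sleep)
runs = []


def call_regularly(interval: float, action, remaining: int) -> None:
    """Run ``action`` now and re-arm the timer until ``remaining`` runs are used up."""
    action()
    if remaining > 1:
        timers.enter(interval, 1, call_regularly, (interval, action, remaining - 1))


# Emit "pool metrics" every 5 seconds, three times in total.
timers.enter(5.0, 1, call_regularly, (5.0, lambda: runs.append(clock.now), 3))
timers.run()
```

Each operation gets its own interval by scheduling it with a different `interval` value, so rare cleanups and frequent metrics can share one scheduler object.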
Sharad M 7e0d08e1f0
Add how-to Guide for Databricks operators (#12175) 2020-11-09 12:26:35 +01:00
Kamil Breguła 833ba453de
Move metrics configuration to new section - metrics (#12165)
* Move metrics configuration to new section

* fixup! Move metrics configuration to new section

* fixup! fixup! Move metrics configuration to new section

* Apply suggestions from code review

Co-authored-by: Xiaodong DENG <xd.deng.r@gmail.com>

* fixup! Apply suggestions from code review

Co-authored-by: Xiaodong DENG <xd.deng.r@gmail.com>
2020-11-09 07:34:38 +01:00
Xiaodong DENG 6ce95fb268
Fix broken 'Blocked Highlight' feature in UI (#12183)
* Fix broken 'Blocked Highlight' feature in UI main page

* based on the latest UI design, changed how the highlighting is done.
2020-11-08 20:51:40 +01:00
Kamil Breguła fcb6b00efe
Add authentication to AWS with Google credentials (#12079)
* Add authentication to AWS with Google credentials

* fixup! Add authentication to AWS with Google credentials

* fixup! fixup! Add authentication to AWS with Google credentials

* fixup! fixup! fixup! Add authentication to AWS with Google credentials
2020-11-08 19:06:24 +01:00
Faisal 3ff7e0743a
azure key vault optional lookup (#12174) 2020-11-08 10:50:51 +01:00
Davide Consonni 2ef3b7ef8c
Fix ERROR - Object of type 'bytes' is not JSON serializable when using store_to_xcom_key parameter (#12172) 2020-11-08 09:17:07 +01:00
Xiaodong DENG 8d5ad6969f
Proper title for XCom List View page (#12169) 2020-11-07 23:04:52 +01:00
Xiaodong DENG bedaf5353d
Allow Connection Edit View to handle entries with NULL 'extra' (#12149) 2020-11-07 22:29:20 +01:00
Kamil Breguła fbbb199058
Move docs for max_db_retries option to core (#12167) 2020-11-07 21:02:57 +01:00
Kaxil Naik ed133267f5
Sync FAB Permissions for all base views (#12162)
If a user has set `[webserver] update_fab_perms = False` and runs `airflow sync-perm` command to sync all permissions, they will receive the following error:

```
webserver_1  | [2020-11-07 15:13:07,431] {decorators.py:113} WARNING - Access is Denied for: can_index on: Airflow
```

If the user was created earlier and only some permissions were synced, that user won't be able to find the Security menu and the Configurations view.
2020-11-07 17:32:53 +00:00
Ephraim Anierobi 0cd7a0aa6e
Make doc_md field nullable and raise json for non-existing dag in dag detail endpoint (#12142) 2020-11-07 16:57:11 +01:00
Ryan Hamilton b7b401acdb
fix spacing between table and pagination (#12160) 2020-11-07 15:59:34 +01:00
Faisal fb6bddba0c
In AWS Secrets backend, a lookup is optional (#12143) 2020-11-07 08:08:48 +01:00
Kaxil Naik 070362510f
Retry Publishing Task to Celery Broker (#12140)
If AirflowTaskTimeout is raised for some reason (network blip, Redis is down) when publishing a Task to the broker (the timeout is controlled by `[celery] operation_timeout`), Airflow will by default retry publishing the message at least 3 times, controlled by `[celery] task_publish_max_retries`.
2020-11-06 22:26:22 +00:00
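The retry behaviour described in #12140 follows a common pattern, sketched here with hypothetical names (`publish_with_retries` and the use of the built-in `TimeoutError` are assumptions; the real code retries on AirflowTaskTimeout and reads the limit from configuration):

```python
def publish_with_retries(publish, max_retries: int = 3, retry_on=(TimeoutError,)):
    """Call ``publish`` up to ``max_retries`` times, re-raising the last timeout."""
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    for attempt in range(1, max_retries + 1):
        try:
            return publish()
        except retry_on:
            if attempt == max_retries:
                raise  # out of attempts: surface the original error


attempts = []


def flaky_publish() -> str:
    """Simulated broker call that times out twice before succeeding."""
    attempts.append(1)
    if len(attempts) < 3:
        raise TimeoutError("broker unreachable")
    return "sent"


result = publish_with_retries(flaky_publish)
```

Only the timeout exception is retried. Any other error propagates on the first attempt, which keeps genuine bugs from being hidden by the retry loop.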
Tobiasz Kędzierski 0caec9fd32
Dataflow - add waiting for successful job cancel (#11501)
Co-authored-by: Kamil Breguła <kamil.bregula@polidea.com>
2020-11-06 15:14:14 +01:00
Ash Berlin-Taylor bdcb6f8d2a
Remove the ability to add hooks to airflow.hooks namespace (#12108)
Hooks do not need to live under "airflow.hooks" namespace for them to
work -- so remove the ability to create them under there in plugins.

Using them as normal python imports is good enough!

We still allow them to be "registered" to support dynamically populating
the connections list in the UI (which won't be done for 2.0)

Closes #9507
2020-11-06 13:24:10 +00:00
Kaxil Naik cf9437d79f
Simplify string expressions (#12123)
Black has trouble formatting strings that are too long and produces unusual string expressions.
2020-11-06 13:49:48 +01:00
Mariusz Strzelecki 24a8370664
airflow info fixed for python 3.8+ (#12132) 2020-11-06 12:17:23 +01:00
Kaxil Naik a83be66840
Replace conditional with builtin max (#12122)
It is unnecessary to use an if statement to check the maximum of two values and then assign the value to a name. Just using the max built-in is straightforward and more readable.
2020-11-06 07:36:43 +01:00
Kaxil Naik f68225ed55
Remove commented line (#12125)
This line does not add any meaning and I think was left over in the PR
2020-11-06 02:07:21 +00:00
Mariusz Strzelecki 7825be50d8
Randomize pod name (#12117) 2020-11-05 23:48:46 +01:00
Daniel Imberman 68ba54bbd5
Add ability to specify pod_template_file in executor_config (#11784)
* Add pod_template_override to executor_config

Users will be able to override the base pod_template_file on a per-task
basis.

* change docstring

* fix doc

* fix static checks

* add description
2020-11-05 14:48:05 -08:00
Kaxil Naik 60cf315d1b
Remove redundant parenthesis (#12118) 2020-11-05 22:22:11 +00:00
Ash Berlin-Taylor 5d9703718a
Add SIGUSR2 handler to Scheduler to dump executor state (#12107)
This provides a means to get a snapshot of the in-memory state of
a running scheduler, without having to turn on debug logging
2020-11-05 20:08:33 +00:00
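A rough sketch of the mechanism behind #12107, using only the standard `signal` module. The state dictionary and the function names are invented for illustration; `SIGUSR2` is available on Unix-like systems only, and handlers can only be installed from the main thread.

```python
import os
import signal

# Hypothetical in-memory state; a real scheduler would report its executor's queues.
executor_state = {"queued": 2, "running": 1}
snapshots = []


def dump_state(signum, frame):
    """Record a copy of the current state when the signal arrives."""
    snapshots.append(dict(executor_state))


def install_handler() -> None:
    # Register the handler; the process keeps running after the signal is handled.
    signal.signal(signal.SIGUSR2, dump_state)


if __name__ == "__main__":
    install_handler()
    # Same effect as running `kill -USR2 <pid>` from a shell.
    os.kill(os.getpid(), signal.SIGUSR2)
    print(snapshots)
```

Because the snapshot is taken on demand, the log level of the running process does not have to change to inspect it.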
Vikram Koka 31dc6cf827
Changed tutorial file to reflect name change to TaskFlow API (#12099)
Changed the tutorial for decorated flows in the example dags directory to reflect the name change to TaskFlow API
2020-11-05 09:35:37 +00:00
Matt Traynham fcfc7f1242
Improve reading SSL credentials file in GRPC Hook (#12094) 2020-11-04 22:05:49 +01:00
Kamil Breguła 91a64db505
Format all files (without exceptions) by black (#12091) 2020-11-04 20:33:07 +01:00
Mário Hunka fd3db778e7
Add server side cursor support for postgres to GCS operator (#11793) 2020-11-04 20:32:42 +01:00
Fai cadae496b3
Correct failure message in sql_sensor.py. (#12057)
Co-authored-by: Fai <faihegberg@gmail.com>
2020-11-04 20:23:52 +01:00
José Francisco Molano-Pulido 75f229601e
Adding MySql howto-documentation and example DAG (#12077)
closes https://github.com/apache/airflow/issues/11918
2020-11-04 17:43:58 +00:00
Kamil Breguła f1f1940261
Add DataflowStartSQLQuery operator (#8553) 2020-11-04 18:35:19 +01:00
Kamil Breguła 41bf172c1d
Simplify string expressions (#12093) 2020-11-04 18:31:08 +01:00
Xiaodong DENG 7597f3a6c1
Remove explicit casting to List when sorted() is applied (#12085)
sorted() returns a sorted list.
sorted(list(A)) is equivalent to sorted(A) whether A is a Tuple, List, or Set

Ref: https://docs.python.org/3/library/functions.html#sorted
2020-11-04 18:07:36 +01:00
Xiaodong DENG 2ac53ee160
Avoid unnecessary IF checks when generate Duration & Landing Time views (#12075)
The original code loops over a space that could be smaller, and the IF checks are not necessary.

This change aims for:
- bring MINOR performance improvement
- cleaner code
2020-11-04 10:56:25 +00:00
Tomek Urbaszek 5f5244b74d
Add template fields renderers to BigQuery and Dataproc operators (#12067) 2020-11-04 11:30:03 +01:00
Ash Berlin-Taylor 5e8b537b85
Remove the ability to import operators and sensors from plugins (#12072)
We have deprecated this in #12069 (for inclusion in 1.10.13) and the
docs http://airflow.apache.org/docs/stable/howto/custom-operator.html
already show how to do this without a plugin.

Closes #9498
2020-11-04 01:17:18 +00:00
Kaxil Naik 4e8f9cc8d0
Enable Black - Python Auto Formatter (#9550) 2020-11-03 23:51:54 +00:00
Jarek Potiuk 1dc7099315
Fixes import of BaseOperator in dingding (#12063)
The import was wrongly importing BaseOperator from bash_operator.

Now it correctly imports it from models.
2020-11-03 23:45:05 +01:00
Kaxil Naik 8c42cf1b00
Use PyUpgrade to use Python 3.6 features (#11447)
Use features like `f-strings` instead of format across the code-base.
More details: https://github.com/asottile/pyupgrade
2020-11-03 21:53:59 +00:00
Kaxil Naik 980c7252c0
Add Kubernetes cleanup-pods CLI command for Helm Chart (#11802)
closes: https://github.com/apache/airflow/issues/11146
2020-11-03 15:28:51 +00:00
Kaxil Naik 2ebe623312
Replace deprecated PythonOperator module with the new one (#12064)
Without this change, users will get a warning when using example dags too
2020-11-03 15:28:14 +00:00
Kamil Breguła bb598d5565
Delete an environment-dependent value from CLI documentation (#12055) 2020-11-03 11:50:59 +01:00
Joshua Carp 45ae145c25
Log BigQuery job id in insert method of BigQueryHook (#12056) 2020-11-03 10:34:40 +01:00
James Timmins eea6c4f273
Perform "mini scheduling run" after task has finished (#11589)
In order to further reduce intra-dag task scheduling lag we add an
optimization: when a task has just finished executing (success or
failure) we can look at the downstream tasks of just that task, and then
make scheduling decisions for those tasks there -- we've already got the
dag loaded, and we know they are likely actionable as we just finished.

We should set tasks to scheduled if we can (but no further, i.e. not to
queued, as the scheduler has to make that decision with info about the
Pool usage etc.).

Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
2020-11-03 06:47:05 +00:00
Joshua Carp e324b37a67
Add job name and progress logs to Cloud Storage Transfer Hook (#12014) 2020-11-03 01:00:34 +01:00
Ryan Hamilton 088b98e71f
Remove unused JavaScript function (#12052) 2020-11-02 20:53:08 +00:00
Faisal dd2442b1e6
Vault with optional Variables or Connections (#11736) 2020-11-02 20:51:04 +00:00
Xiaodong DENG 5e77a61543
Docstring fix for S3DeleteBucketOperator (#12049) 2020-11-02 21:05:44 +01:00
Kaxil Naik 2192010ee3
Retry Dagbag.sync_to_db to avoid Deadlocks (#12046)
Previously we added Retry in DagFileProcessor.process_file to
retry dagbag.sync_to_db. However, this meant that anyone who calls
dagbag.sync_to_db separately also needs to manage retrying it
themselves. This caused failures in CI for MySQL.

resolves https://github.com/apache/airflow/issues/11543
2020-11-02 18:15:17 +00:00
Ryan Hamilton a1a1fc9f32
Override FAB table views where table width extends beyond parent containers (#12048) 2020-11-02 11:55:55 -05:00
Sergey Serebryakov 5204ff6f72
Fix incorrect .airflowignore behavior with multiple nested directories (#11994)
* Add failing test

* Fix failing test
2020-11-02 17:39:16 +01:00
Sergey Serebryakov 644791989e
Ignore the basepath when ignoring files via .airflowignore (#11993) 2020-11-02 15:35:14 +01:00