Commit graph

8858 commits

Author SHA1 Message Date
Jarek Potiuk 0b0e4f7a4c
Preparing for RC3 release of backports (#9026) 2020-05-26 21:35:06 +02:00
Jarek Potiuk 00642a46d0
Fixed name of 20 remaining wrongly named operators. (#8994) 2020-05-26 19:12:21 +02:00
Jarek Potiuk 9764c90b92
Better content of backport packages CHANGELOG and INSTALL files (#9013)
* Better content of backport packages CHANGELOG and INSTALL files

The content of Backport Packages CHANGELOG.txt and INSTALL files
has been updated to reflect that those are not full Airflow
releases.

1) Source package:
- INSTALL contains only references to preparing backport packages
- CHANGELOG.txt contains combined change log of all the packages

2) Binary packages:
- No INSTALL
- CHANGELOG.txt contains changelog for this package only

3) Whl packages

No change

* Update backport_packages/INSTALL
2020-05-26 13:54:58 +02:00
Jarek Potiuk 7883885e6c
Move setup order check back to pre-commit (#9010)
* Move setup order check back to pre-commit

The order check used to run from pre-commit, but then it
was moved to be a regular test case. That was a mistake.

The test is super-fast, and making it use assertEquals
was not very useful: you found out about failures very late.

I changed it to be a normal Python script, which made it work again
(it did not work when it was a test because pre-commit does not
run tests - it runs Python scripts).

The messages printed now are much more informative as well.
2020-05-26 12:58:24 +02:00
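An order check of the kind described in #9010 can be written as a plain script that reports every out-of-order entry and returns a non-zero exit code, which is what a pre-commit hook needs. The following is a minimal sketch only; the function names and the sample list are hypothetical and are not taken from the Airflow repository.

```python
import sys
from typing import List


def find_order_errors(names: List[str]) -> List[str]:
    """Return one message per adjacent pair that is not in alphabetical order."""
    errors = []
    for previous, current in zip(names, names[1:]):
        if current < previous:
            errors.append(f"'{current}' should come before '{previous}'")
    return errors


def main(names: List[str]) -> int:
    """Print every problem found and return an exit code usable by pre-commit."""
    errors = find_order_errors(names)
    for error in errors:
        print(error, file=sys.stderr)
    return 1 if errors else 0


# Hypothetical input: a list of setup extras with one entry out of place.
exit_code = main(["aws", "celery", "azure"])
```

Unlike a single equality assertion, this lists each misplaced entry by name, so the message says what to move.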
Jarek Potiuk 2a889553ef
Finding cross-provider dependencies fails when encoding wrong (#9012)
This forces encoding of read python files to utf-8
2020-05-26 10:04:19 +02:00
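The fix described in #9012 amounts to passing an explicit encoding when reading source files instead of relying on the platform default. A minimal sketch, assuming a hypothetical helper named `read_python_source` (not taken from the Airflow code base):

```python
import os
import tempfile


def read_python_source(path: str) -> str:
    """Read a Python file as UTF-8 regardless of the locale's default encoding."""
    # Without encoding="utf-8", open() falls back to the locale encoding,
    # which can fail on non-ASCII characters (for example under a C/POSIX locale).
    with open(path, encoding="utf-8") as source_file:
        return source_file.read()


# Usage: write a file containing a non-ASCII character and read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "sample.py")
    with open(sample_path, "w", encoding="utf-8") as sample_file:
        sample_file.write("# author: Kamil Breguła\n")
    content = read_python_source(sample_path)
```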
Tomek Urbaszek 3994030ea6
Refactor BigQuery operators (#8858)
* Refactor BigQueryCreateEmptyTableOperator

* Refactor BigQueryCreateExternalTableOperator

* Refactor BigQueryDeleteDatasetOperator

* Refactor BigQueryCreateEmptyDatasetOperator

* Refactor BigQueryGetDataOperator

* BigQueryGetDatasetTablesOperator

* Refactor BigQueryPatchDatasetOperator

* Refactor BigQueryUpdateDatasetOperator

* Refactor BigQueryDeleteTableOperator

* Refactor BigQueryUpsertTableOperator

* Apply cr suggestions

* fixup! Apply cr suggestions
2020-05-26 10:03:23 +02:00
Jarek Potiuk cdb3f25456
All classes in backport providers are now importable in Airflow 1.10 (#8991)
* All classes in backport providers are now importable in Airflow 1.10

* fixup! All classes in backport providers are now importable in Airflow 1.10

* fixup! fixup! All classes in backport providers are now importable in Airflow 1.10
2020-05-26 02:43:16 +02:00
Jacob Shao 14fb58530c
[AIRFLOW-8902] Fix Dag Run UI execution date with timezone cannot be saved issue (#8902)
Closes #8842
2020-05-25 14:09:02 +01:00
Kamil Breguła 030261a372
Assign area:webserver label to webserver_command.py (#8998) 2020-05-24 22:55:24 +02:00
Diego Lopes 7d525f3c6e
added Paranabanco to official company list (#8990) 2020-05-24 10:48:32 +02:00
Kaxil Naik 427257c2e2
Remove defunct code from setup.py (#8982)
* Remove defunct code from setup.py
2020-05-24 03:52:01 +02:00
larluo 971cae3520
Fix migration message (#8988) 2020-05-23 20:21:08 +01:00
Kamil Breguła 1d36b0303b
Fix references in docs (#8984) 2020-05-23 17:43:04 +02:00
Jarek Potiuk f946f96da4
Old json boto compat removed from dynamodb_to_s3 operator (#8987) 2020-05-23 16:12:00 +02:00
Adam Dobrawy f3456b125f
Fix formatting code block in TESTING.rst (#8985) 2020-05-23 11:43:53 +02:00
Kamil Breguła bdb83699f1
Add secrets to test_deprecated_packages (#8979) 2020-05-23 11:11:25 +02:00
Joppe Vos cf5cf45e1c
Support YAML input for CloudBuildCreateOperator (#8808) 2020-05-23 08:42:23 +02:00
Mike Clarke db70da2ab1
Flush pending Sentry exceptions before exiting (#7232)
Fixes AIRFLOW-6569 by explicitly flushing pending exceptions prior
to calling `os._exit` within the forked task runner.
2020-05-23 08:34:34 +02:00
Kaxil Naik 4d67704624
Remove duplicate line from CONTRIBUTING.rst (#8981) 2020-05-23 08:28:18 +02:00
Kamil Breguła e742ef7c70
Fix typo in test_project_structure (#8978) 2020-05-23 08:24:33 +02:00
Siddartha Ravichandran f1073381ed
Add support for spark python and submit tasks in Databricks operator (#8846) 2020-05-22 19:43:32 +02:00
Ace Haidrey b055151520
Add context to execution_date_fn in ExternalTaskSensor (#8702)
Co-authored-by: Ace Haidrey <ahaidrey@pinterest.com>
2020-05-22 07:02:34 -07:00
curiousjazz77 9a4a2d1ad7
[AIRFLOW-5262] Update timeout exception to include dag (#8466)
* [AIRFLOW-5262] Update timeout exception to include dag

* PR comment: extract dag id in log to variable
2020-05-22 12:12:05 +02:00
Kaxil Naik 94a7673e8b
Pin google-cloud-datacatalog to <0.8 (#8957)
`field_path` was renamed to `tag_template_field_path` in >=0.8 and there might be other unknown errors
2020-05-21 22:31:10 +01:00
Kaxil Naik dd7204066f
Pin Version of Azure Cosmos to <4 (#8956)
Old Repo: https://github.com/Azure/azure-cosmos-python
New Repo: https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/cosmos/azure-cosmos

azure-cosmos==4.0.0 was released on 20 May 2020 that breaks Airflow
2020-05-21 21:58:57 +01:00
Daniel Imberman 90a07d8184
Cache 1 10 ci images (#8955)
* Push CI images to Docker package cache for v1-10 branches

This is done as a commit to master so that we can keep the two branches
in sync

Co-Authored-By: Ash Berlin-Taylor <ash_github@firemirror.com>

* Run Github Actions against v1-10-stable too

Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
2020-05-21 13:20:29 -07:00
Ash Berlin-Taylor 113982b25d
Make scheduler_dag_execution_timing grok dynamic start date of elastic dag (#8952)
The scheduler_dag_execution_timing script wants to run _n_ dag runs to
completion. However since the start date of those dags is Dynamic (`now
- delta`) we can't pre-compute the execution_dates like we were before.
(This is because the execution_date of the very first dag run would be
`now()` of the parser process, but if we try to pre-compute that in
the benchmark process it would see a different value of now().)

This PR changes it to instead watch for the first _n_ dag runs to be
completed. This should make it work with more dags with less changes to
them.
2020-05-21 15:56:54 +01:00
Ash Berlin-Taylor b26b3ca978
Don't hard-code constants in scheduler_dag_execution_timing (#8950)
Slight "improvement" on #8949
2020-05-21 13:43:00 +01:00
Jarek Potiuk 41481bb402
Python base images are stored in cache (#8943)
All PRs will use the cached "latest good" version of the Python
base images from our GitHub registry. The Python versions in
the GitHub Registry will only get updated after a master
build (which pulls the latest Python image from DockerHub) builds
and passes tests correctly.

This is to avoid problems that we had recently with Python
patchlevel releases breaking our Docker builds.
2020-05-21 13:39:07 +02:00
Kaxil Naik 97b6cc7619
Add note in Updating.md about the removal of DagRun.ID_PREFIX (#8949) 2020-05-21 12:17:48 +01:00
Ash Berlin-Taylor 16206cd626
Update example webserver_config.py to show correct CSRF config (#8944)
CSRF_ENABLED does nothing.

Thankfully, due to sensible defaults in flask-wtf, CSRF is on by
default, but we should set this correctly.

Fixes #8915
2020-05-21 12:12:29 +01:00
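For reference, Flask-WTF reads the `WTF_CSRF_ENABLED` key, so the corrected setting from #8944 in an example `webserver_config.py` would look roughly like this. It is a sketch of that single setting, not the full file.

```python
# Flask-WTF turns CSRF protection on by default; setting the flag explicitly
# documents the intent. Per the commit message above, the plain CSRF_ENABLED
# key has no effect.
WTF_CSRF_ENABLED = True
```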
Ash Berlin-Taylor 47413d98f0
Remove singularity from CI images (#8945)
The singularity operator tests _have always_ used mocking, so we were
adding 700MB to our docker image for nothing.

Fixes #8774
2020-05-21 12:12:03 +01:00
Kaxil Naik 8d3acd768e
Fix docstring in DagFileProcessor._schedule_task_instances (#8948) 2020-05-21 12:11:44 +01:00
Joe Harris f3f74c7320
Add TaskInstance state to TI Tooltip to be colour-blind friendlier (#8910)
Currently there is no way to determine the state of a TaskInstance in the graph view or tree view for people with colour blindness

Approximately 4.5% of people experience some form of colour vision deficiency
2020-05-21 10:53:54 +01:00
Kamil Breguła a9dfd7d1cf
Remove side-effect of session in FAB (#8940) 2020-05-21 11:53:06 +02:00
Kaxil Naik f17b4bbb89
Fix DagRun Prefix for Performance script (#8934) 2020-05-21 10:51:31 +01:00
Ash Berlin-Taylor 8476c1e387
Hive/Hadoop minicluster needs JDK8 and JAVA_HOME to work (#8938)
Debian Buster only ships with a JDK11, and Hive/Hadoop fails in odd,
hard to debug ways (complains about metastore not being initialized,
possibly related to the class loader issues.)

Until we rip Hive out from the CI (replacing it with Hadoop in a separate
integration, only on for some builds) we'll have to stick with JRE8

Our previous approach of installing openjdk-8 from Sid/Unstable started
failing as Debian Sid has a new (and conflicting) version of GCC/libc.
The adoptopenjdk package archive is designed for Buster so should be
more resilient
2020-05-21 07:19:49 +02:00
Saran 12c22e0fe0
Added Greytip to Airflow Users list (#8887) 2020-05-21 01:35:04 +01:00
Kaxil Naik c6224e24d7
Remove unused self.max_threads argument in SchedulerJob (#8935) 2020-05-21 01:31:28 +01:00
Ash Berlin-Taylor 51d955787b
Re-run all tests when Dockerfile or Github workflow change (#8924)
Fixes #8921
2020-05-20 18:49:28 +01:00
Kaxil Naik 5360045539
Fix incorrect Env Var to stop Scheduler from creating DagRuns (#8920) 2020-05-20 14:40:28 +01:00
Ash Berlin-Taylor fef00e5a06
Use Debian's provided JRE from Buster (#8919)
Installing the JDK (not even the JRE) from Sid is starting to break on
Buster as the versions of packages conflict:

> The following packages have unmet dependencies:
> libgcc-8-dev : Depends: gcc-8-base (= 8.4.0-4) but 8.3.0-6 is to be installed
>                Depends: libmpx2 (>= 8.4.0-4) but 8.3.0-6 is to be installed

This changes our CI docker images to:

1. Not install something from Sid (unstable, packages change/get
   updated) when we are using Buster (stable, only security fixes).
2. Install the JRE, not the JDK. We don't need to compile Java code.
2020-05-20 14:18:59 +01:00
Ryan Hamilton ce7fdeae3a
UX Fix: Prevent undesired text selection with DAG title selection in Chrome (#8912)
Negate user-select in Firefox where behavior is already as desired
2020-05-20 00:23:04 +01:00
Jacob Ferriero 499493c5c5
[AIRFLOW-6586] Improvements to gcs sensor (#7197)
* [AIRFLOW-6586] Improvements to gcs sensor

refactors GoogleCloudStorageUploadSessionCompleteSensor to use set instead of number of objects

add poke mode only decorator

assert that poke_mode_only applied to child of BaseSensorOperator

refactor tests

remove assert

[AIRFLOW-6586] Improvements to gcs sensor

refactors GoogleCloudStorageUploadSessionCompleteSensor to use set instead of number of objects

add poke mode only decorator

assert that poke_mode_only applied to child of BaseSensorOperator

remove assert

fix static checks

add back inadvertently remove requirements

pre-commit

fix typo

* fix gcs sensor unit test

* move poke_mode_only to base_sensor_operator module

* add sensor / poke_mode_only docs

* fix ci check add sensor how-to docs

* Update airflow/providers/google/cloud/sensors/gcs.py

Co-authored-by: Tomek Urbaszek <turbaszek@gmail.com>

* Update airflow/sensors/base_sensor_operator.py

Co-authored-by: Tomek Urbaszek <turbaszek@gmail.com>

* Update airflow/sensors/base_sensor_operator.py

Co-authored-by: Kamil Breguła <mik-laj@users.noreply.github.com>

* simplify class decorator

* remove type hint

* add note to UPDATING.md

* remove unnecessary declaration of class member

* Fix to kwargs in UPDATING.md

Co-authored-by: Tomek Urbaszek <turbaszek@gmail.com>
Co-authored-by: Kamil Breguła <mik-laj@users.noreply.github.com>
2020-05-19 23:14:28 +02:00
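The "poke mode only" class decorator mentioned in #7197 can be sketched as follows. This is an illustration under stated assumptions, not the code merged into Airflow: the `BaseSensorOperator` below is a stand-in that models only a `mode` attribute, and `UploadSessionSensor` is a hypothetical subclass. The presumed motivation is that a sensor tracking state between pokes (such as the set of objects seen so far) would lose that state in any other mode.

```python
class BaseSensorOperator:
    """Stand-in for Airflow's base sensor; only the `mode` attribute is modelled."""

    def __init__(self, mode: str = "poke"):
        self.mode = mode


def poke_mode_only(cls):
    """Class decorator that rejects any sensor mode other than 'poke'."""
    if not issubclass(cls, BaseSensorOperator):
        raise ValueError(f"{cls.__name__} is not a BaseSensorOperator subclass")

    def _get_mode(self):
        return "poke"

    def _set_mode(self, value):
        if value != "poke":
            raise ValueError("this sensor only supports mode='poke'")

    # Replace the plain attribute with a property so assignments are validated.
    cls.mode = property(_get_mode, _set_mode)
    return cls


@poke_mode_only
class UploadSessionSensor(BaseSensorOperator):
    """Hypothetical sensor that keeps state between pokes."""


sensor = UploadSessionSensor()

try:
    UploadSessionSensor(mode="reschedule")
    rejected = False
except ValueError:
    rejected = True
```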
Ash Berlin-Taylor bae5cc2f5c
Fix race in Celery tests by pre-creating result tables (#8909)
We noticed our Celery tests failing sometimes with

> (psycopg2.errors.UniqueViolation) duplicate key value violates unique
> constraint "pg_type_typname_nsp_index"
> DETAIL:  Key (typname, typnamespace)=(celery_tasksetmeta, 2200) already exists

It appears this is a race condition in SQLAlchemy's "create_all()"
function, where it first checks which tables exist, builds up a list of
`CREATE TABLE` statements, then issues them. Thus if two celery worker
processes start at the same time, they will find the table doesn't
yet exist, and both try to create it.

This is _probably_ a bug in SQLA, but this should be an easy enough fix
here, to just ensure that the table exists before launching any Celery tasks.
2020-05-19 13:21:44 +01:00
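The check-then-create race from #8909 and the pre-creation fix can be sketched with the standard library's `sqlite3` standing in for the Celery result backend. The schema is simplified and the helper names are hypothetical. The interleaving of two workers is simulated sequentially on one connection rather than with real processes.

```python
import sqlite3

TABLE = "celery_tasksetmeta"
CREATE_SQL = f"CREATE TABLE {TABLE} (id INTEGER PRIMARY KEY, result TEXT)"


def table_exists(conn: sqlite3.Connection, name: str) -> bool:
    """Return True if a table with this name is already present."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (name,)
    ).fetchone()
    return row is not None


# The race: both "workers" check before either one creates the table.
race_conn = sqlite3.connect(":memory:")
worker_a_sees_table = table_exists(race_conn, TABLE)  # False
worker_b_sees_table = table_exists(race_conn, TABLE)  # False
race_conn.execute(CREATE_SQL)  # worker A creates the table
try:
    race_conn.execute(CREATE_SQL)  # worker B acts on its stale check
    race_failed = False
except sqlite3.OperationalError:
    race_failed = True

# The fix: create the table once, up front, before any worker starts.
fixed_conn = sqlite3.connect(":memory:")
fixed_conn.execute(CREATE_SQL)
# Every worker's check now finds the table, so none of them issues CREATE TABLE.
workers_need_create = [not table_exists(fixed_conn, TABLE) for _ in range(2)]
```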
Jarek Potiuk 375d1ca229
Release candidate 2 for backport packages 2020.05.20 (#8898)
Release candidate 2 for backport packages 2020.05.20
2020-05-19 14:17:22 +02:00
QP Hou dd57ec9e26
Fix task and dag stats on home page (#8865)
d.dag_id is not a valid attribute. In order to use the dag_id variable
in a closure callback, it needs to be passed in through a function so the
right value can be captured on each loop iteration.
2020-05-19 10:25:39 +01:00
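The closure pitfall described in #8865 also exists in Python, which makes it easy to illustrate regardless of the language the original fix was written in. In this sketch the `dag_ids` values and the `make_callback` helper are hypothetical.

```python
dag_ids = ["dag_a", "dag_b", "dag_c"]

# Pitfall: every callback closes over the same loop variable, so once the
# loop has finished they all see its final value.
late_bound_callbacks = []
for dag_id in dag_ids:
    late_bound_callbacks.append(lambda: dag_id)


def make_callback(captured_dag_id):
    """Create a callback with its own copy of the value for this iteration."""
    return lambda: captured_dag_id


# Fix: pass the value through a function call so each callback captures
# the value that was current when it was created.
bound_callbacks = [make_callback(current_id) for current_id in dag_ids]

late_bound_results = [callback() for callback in late_bound_callbacks]
bound_results = [callback() for callback in bound_callbacks]
```

Here `late_bound_results` contains the last DAG id three times, while `bound_results` matches `dag_ids`.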
crazy-2020 841d816647
Allow setting the pooling time in DLPHook (#8824)
Co-authored-by: Kamil Breguła <mik-laj@users.noreply.github.com>
Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
2020-05-19 04:55:41 +02:00
Jarek Potiuk 2121f494c3
Avoid failure on transient requirements in CI image (#8892)
When you build from the scratch and some transient requirements
fail, the initial step of installation might fail.

We are now using latest valid constraints from the DEFAULT_BRANCH
branch to avoid it.
2020-05-17 22:41:48 +02:00
Jarek Potiuk 12c5e5d8ae
Prepare release candidate for backport packages (#8891)
After preparing the 2020.5.19 release candidate and
reviewing the packages, some changes turned out to be necessary.

Therefore the date was changed to 2020.5.20 with the following
fixes:

* cncf.kubernetes.example_dags were hard-coded and added for all
  packages; they were removed
* Version suffix is only used to rename the binary packages not for
  the version itself
* Release process description is updated with the release process
* Package version is consistent - leading 0s are skipped in month
  and day
2020-05-17 20:38:46 +02:00
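The "leading 0s are skipped" rule from #8891 can be sketched as a small date-to-version helper. The name `backport_version` is hypothetical. Formatting the integers directly keeps the string identical to the form that PEP 440 normalization produces, since numeric segments such as `05` normalize to `5`.

```python
from datetime import date


def backport_version(release_date: date) -> str:
    """Build a calendar version without zero-padding the month or day."""
    return f"{release_date.year}.{release_date.month}.{release_date.day}"


version = backport_version(date(2020, 5, 20))
```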