incubator-airflow

Commit Graph

Author	SHA1	Message	Date
John Bampton	75071831ba	Remove redundant parentheses from Python files (#10967 )	2020-10-10 15:08:38 +02:00
Jarek Potiuk	401a579dd1	Push and schedule duplicates are not cancelled. (#11397 ) The push and schedule builds should not be cancelled even if they are duplicates. By seeing which of the master merges failed, we have better visibility on which merge caused a problem and we can trace its origin faster even if the builds will take longer overall. Scheduled builds also serve their purpose and they should always be run to completion.	2020-10-10 13:51:58 +02:00
Jarek Potiuk	04973904c3	Constraints and PIP packages can be installed from local sources (#11382 ) * Constraints and PIP packages can be installed from local sources This is the final part of implementing #11171 based on feedback from enterprise customers we worked with. They want to have a capability of building the image using binary wheel packages that are locally available and the official Dockerfile. This means that besides the official APT sources the Dockerfile build should not need GitHub, nor any other external files pulled from outside including the PIP repository. This change also includes documentation on how to prepare a set of such binaries ready for inspection and review by security teams in an Enterprise environment. Such sets of "known-working-binary-whl" files can then be separately committed, tracked and scrutinized in an artifact repository of such an Enterprise. Fixes: #11171 * Update docs/production-deployment.rst	2020-10-10 12:58:09 +02:00
Daniel Imberman	8640fb6c10	fix tests (#11368 )	2020-10-09 16:56:56 -07:00
Michał Misiewicz	b7404b079a	KubernetesPodOperator should retry log tailing in case of interruption (#11325 ) * KubernetesPodOperator can retry log tailing in case of interruption * fix failing test * change read_pod_logs method formatting * KubernetesPodOperator retry log tailing based on last read log timestamp * fix test_parse_log_line test formatting * add docstring to parse_log_line method * fix kubernetes integration test	2020-10-09 15:59:47 -07:00
Jarek Potiuk	6fe020e105	Add tests for Custom cluster policy (#11381 ) The custom ClusterPolicyViolation has been added in #10282. This one adds a more comprehensive test for it. Co-authored-by: Jacob Ferriero <jferriero@google.com>	2020-10-10 00:57:10 +02:00
John Bampton	39fc961eec	Fix case of JavaScript. (#10957 )	2020-10-10 00:50:31 +02:00
Daniel Imberman	3164025a7a	Fix airflow_local_settings.py showing up as directory (#10999 ) Fixes a bug where the airflow_local_settings.py mounts as a volume if there is no value (this causes k8sExecutor pods to fail)	2020-10-10 00:49:45 +02:00
Đặng Minh Dũng	298052fcee	[airflow/providers/cncf/kubernetes] correct hook methods name (#11008 )	2020-10-10 00:48:47 +02:00
mucio	7b0a2f5d8e	Replaced basestring with str in the Exasol hook (#11360 )	2020-10-10 00:44:59 +02:00
Jarek Potiuk	d752575e78	Revert "Revert "Adds --install-wheels flag to breeze command line (#11317 )" (#11348 )" (#11356 ) This reverts commit `f67e6cb805`.	2020-10-10 00:41:11 +02:00
Ash Berlin-Taylor	73b9163a8f	Fully support running more than one scheduler concurrently (#10956 ) * Fully support running more than one scheduler concurrently. This PR implements scheduler HA as proposed in AIP-15. The high level design is as follows: - Move all scheduling decisions into SchedulerJob (requiring DAG serialization in the scheduler) - Use row-level locks to ensure schedulers don't stomp on each other (`SELECT ... FOR UPDATE`) - Use `SKIP LOCKED` for better performance when multiple schedulers are running. (MySQL < 8 and MariaDB don't support this) - Scheduling decisions are not tied to the parsing speed, but can operate just on the database. DagFileProcessorProcess: Previously this component was responsible for more than just parsing the DAG files as its name might imply. It also was responsible for creating DagRuns, and also making scheduling decisions of TIs, sending them from "None" to "scheduled" state. This commit changes it so that the DagFileProcessorProcess now will update the SerializedDAG row for this DAG, and make no scheduling decisions itself. To make the scheduler's job easier (so that it can make as many decisions as possible without having to load the possibly-large SerializedDAG row) we store/update some columns on the DagModel table: - `next_dagrun`: The execution_date of the next dag run that should be created (or None) - `next_dagrun_create_after`: The earliest point at which the next dag run can be created. Pre-computing these values (and updating them every time the DAG is parsed) reduces the overall load on the DB as many decisions can be taken by selecting just these two columns/the small DagModel row. In case of max_active_runs, or `@once` these columns will be set to null, meaning "don't create any dag runs". SchedulerJob: The SchedulerJob used to only queue/send tasks to the executor after they were parsed, and returned from the DagFileProcessorProcess. This PR breaks the link between parsing and enqueuing of tasks: instead of looking at DAGs as they are parsed, we now: - store a new datetime column, `last_scheduling_decision` on DagRun table, signifying when a scheduler last examined a DagRun - Each time around the loop the scheduler will get (and lock) the next _n_ DagRuns via `DagRun.next_dagruns_to_examine`, prioritising DagRuns which haven't been touched by a scheduler in the longest period - SimpleTaskInstance etc have been almost entirely removed now, as we use the serialized versions * Move callbacks execution from Scheduler loop to DagProcessorProcess * Don’t run verify_integrity if the Serialized DAG hasn’t changed dag_run.verify_integrity is slow, and we don't want to call it every time, just when the dag structure changes (which we can know now thanks to DAG Serialization) * Add escape hatch to disable newly added "SELECT ... FOR UPDATE" queries We are worried that these extra uses of row-level locking will cause problems on MySQL 5.x (most likely deadlocks) so we are providing users an "escape hatch" to be able to make these queries non-locking -- this means that only a single scheduler should be run, but being able to run one is better than having the scheduler crash. Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>	2020-10-09 22:44:27 +01:00
Jarek Potiuk	e198077f3e	Add pypirc initialization (#11386 ) This PR needs to be merged first in order to handle #11385, which requires .pypirc to be created before the Dockerfile gets built. This means that the script change needs to be merged to master first in this PR.	2020-10-09 22:55:03 +02:00
Jarek Potiuk	29a145cd69	Add capability of adding service account annotations to Helm Chart (#11387 ) We can now add annotations to the service accounts in a generic way. This allows, for example, adding Workload Identity in a GKE environment, but it is not limited to it. Co-authored-by: Kamil Breguła <kamil.bregula@polidea.com> Co-authored-by: Jacob Ferriero <jferriero@google.com> Co-authored-by: Kamil Breguła <kamil.bregula@polidea.com>	2020-10-09 22:54:21 +02:00
Daniel Imberman	49aad025b5	Users can specify sub-secrets and paths k8spodop (#11369 ) Allows users to specify items for specific key path projections when using the airflow.kubernetes.secret.Secret class	2020-10-09 09:00:09 -07:00
Tomek Urbaszek	eb5fea7b64	Replace nuke with useful information on error page (#11346 ) This PR replaces nuke asciiart with text about reporting a bug. As we are no longer using asciiarts this PR removes it.	2020-10-09 16:27:39 +02:00
Kaxil Naik	ff1a2aaff8	Set start_date, end_date & duration for tasks failing without DagRun (#11358 )	2020-10-09 15:21:39 +01:00
Ash Berlin-Taylor	fe0bf6e1f0	Reduce "start-up" time for tasks in CeleryExecutor (#11372 ) This is similar to #11327, but for Celery this time. The impact is not quite as pronounced here (for simple dags at least) but takes the average queued to start delay from 1.5s to 0.4s	2020-10-09 13:18:32 +01:00
Satyasheel	d2754ef769	Strict type check for Microsoft (#11359 )	2020-10-09 10:31:53 +01:00
Tobiasz Kędzierski	8baf657fc2	Fix regression in DataflowTemplatedJobStartOperator (#11167 )	2020-10-09 10:21:16 +02:00
Vijayant	422b61a9dd	Adding ElastiCache Hook for creating, describing and deleting replication groups (#8701 )	2020-10-09 09:19:26 +01:00
Sumit Maheshwari	5605d1063b	Fix DagBag bug when a dag has invalid schedule_interval (#11344 )	2020-10-09 13:29:41 +05:30
Kaxil Naik	7f674c685d	Use only-if-needed upgrade strategy for PRs (#11363 ) Currently, upgrading dependencies in setup.py still runs with previous versions of the package for the PR which fails. This will change to upgrade only the package that is required for the PRs	2020-10-09 09:57:51 +02:00
Kamil Breguła	7541c88eaf	Always use Airflow db in FAB (#11364 )	2020-10-09 09:55:31 +02:00
Kaxil Naik	27e637fbe3	Bugfix: Error in SSHOperator when command is None (#11361 ) closes https://github.com/apache/airflow/issues/10656	2020-10-09 08:35:39 +01:00
venkatesh selvaraj	11eb649d4a	Fix to make y-axis of Tries chart visible (#10071 ) Co-authored-by: Venkatesh Selvaraj <venkateshselvaraj@pinterest.com>	2020-10-08 20:17:50 +01:00
Jarek Potiuk	f5b7bbcb92	Better diagnostics when there are problems with Kerberos (#11353 )	2020-10-08 21:08:11 +02:00
Jarek Potiuk	666e81ab4a	Bump cache version for kubernetes tests (#11355 ) Seems that the k8s cache for virtualenv got broken during the recent problems. This commit bumps the cache version to start afresh.	2020-10-08 19:10:46 +02:00
Ash Berlin-Taylor	4839a5bc6e	Reduce "start-up" time for tasks in LocalExecutor (#11327 ) Spawning a whole new python process and then re-loading all of Airflow is expensive. Although this time fades to insignificance for long running tasks, this delay gives a "bad" experience for new users when they are just trying out Airflow for the first time. For the LocalExecutor this cuts the "queued time" down from 1.5s to 0.1s on average.	2020-10-08 17:37:51 +01:00
Kaxil Naik	a1f888507f	Improve instructions to install Airflow Version (#11339 ) The instructions can be replaced by `./breeze start-airflow` command	2020-10-08 17:19:31 +01:00
Kaxil Naik	ba60836456	Fix command to run tmux with breeze in BREEZE.rst (#11340 ) `breeze --start-airflow` -> `breeze start-airflow`	2020-10-08 08:47:56 -07:00
Ash Berlin-Taylor	f67e6cb805	Revert "Adds --install-wheels flag to breeze command line (#11317 )" (#11348 ) This reverts commit `de07d135ae`.	2020-10-08 14:35:04 +01:00
Jarek Potiuk	9dc32a3d8a	Better message when Building Image fails or gets cancelled. (#11333 )	2020-10-08 13:09:34 +02:00
Michał Słowikowski	832a7850f1	Add Azure Blob Storage to GCS transfer operator (#11321 )	2020-10-08 12:16:50 +02:00
Kaxil Naik	625afa2af2	Improve Committer's guide docs (#11338 )	2020-10-08 10:24:07 +01:00
Tomek Urbaszek	4d95d9c71b	Improve code quality of SLA mechanism in SchedulerJob (#11257 )	2020-10-08 10:44:46 +02:00
Jarek Potiuk	de07d135ae	Adds --install-wheels flag to breeze command line (#11317 ) If this flag is specified it will look for wheel packages placed in dist folder and it will install the wheels from there after installing Airflow. This is useful for testing backport packages as well as in the future for testing provider packages for 2.0.	2020-10-08 10:06:53 +02:00
Satyasheel	5d007fd2ff	Strict type check for azure hooks (#11342 )	2020-10-08 09:36:35 +02:00
Kaxil Naik	2bac4810a4	Update link for Announcement Page (#11337 )	2020-10-07 22:40:20 +01:00
Fai	b4baa2b04b	Add environment variables documentation to cli-ref.rst. (#10970 ) Co-authored-by: Fai Hegberg <faihegberg@Fais-MacBook-Pro.local>	2020-10-07 21:43:48 +01:00
Jarek Potiuk	d404cb06dd	Moves Committer's guide to CONTRIBUTING.rst (#11314 ) I decided to move it to CONTRIBUTING.rst as it is important documentation on what policies we have agreed to as a community and also it is a great resource for contributors to learn what the committer's responsibilities are. Fixes: #10179	2020-10-07 21:14:55 +02:00
Jarek Potiuk	fe59f26223	Pin versions of "untrusted" 3rd-party GitHub Actions (#11319 ) According to https://docs.github.com/en/free-pro-team@latest/actions/learn-github-actions/security-hardening-for-github-actions#using-third-party-actionsa it's best practice not to use tags in case of untrusted 3rd-party actions in order to avoid potential attacks.	2020-10-07 13:23:41 +02:00
Ash Berlin-Taylor	d86cf37a35	Automatically upgrade old default navbar color (#11322 ) As part of #11195 we re-styled the UI, changing a lot of the default colours to make them look more modern. However for anyone upgrading and keeping their airflow.cfg from 1.10 to 2.0 they would end up with things looking a bit ugly, as the old navbar color would be kept. This uses the existing config value upgrade feature to automatically change the old default colour into the new default colour.	2020-10-07 11:21:14 +01:00
FHoffmannCode	b0fcf67559	Add AzureFileShareToGCSOperator (#10991 )	2020-10-07 11:08:58 +02:00
Jarek Potiuk	e2655f60b3	Prints nicer message in case of git push errors (#11320 ) We started to get "unknown blob" kind of errors more often when pushing the images to GitHub Registry. While this is clearly a GitHub issue, its frequency of occurrence and unclear message make it a good candidate for an additional message with instructions to the users, especially now that they have an easy way to get to that information via status checks and links leading to the log file, when this problem happens during the image building process. This way users will know that they should simply rebase or amend/force-push their change to fix it.	2020-10-07 10:30:16 +02:00
Tomek Urbaszek	47b05a87f0	Improve handling of job_id in BigQuery operators (#11287 ) Make autogenerated job_id more unique by using microseconds and hash of configuration. Replace dots in job_id. Closes: #11280	2020-10-07 10:08:08 +02:00
Jarek Potiuk	18dcac8a01	Add remaining community guidelines to CONTRIBUTING.rst (#11312 ) We are cleaning up the docs from CWiki and this is what's left of community guidelines that were maintained there. Fixes #10181	2020-10-07 05:33:47 +02:00
Jarek Potiuk	22c6a843d7	Adds --no-rbac-ui flag for Breeze airflow 1.10 installation (#11315 ) When installing airflow 1.10 via breeze we now enable rbac by default, but we can disable it with --no-rbac-ui flag. This is useful to test different variants of 1.10 when testing release candidates in connection with the 'start-airflow' command.	2020-10-07 01:00:00 +01:00
Kishore Vancheeshwaran	bbc3cea057	Move latest_only_operator.py to latest_only.py (#11178 ) (#11304 )	2020-10-07 00:15:28 +01:00
Kaxil Naik	4af7804549	Bump tenacity to 6.2 (#11313 )	2020-10-06 21:52:35 +01:00


10400 commits (all branches)