* Spend less time waiting for LocalTaskJob's subprocess to finish
This is about a 20% speed-up for short-running tasks!
This change doesn't affect the "duration" reported in the TI table, but
it does affect the time before the slot is freed up in the executor -
which does affect overall task/DAG throughput.
(All these tests are with the same BashOperator tasks, just running `echo 1`.)
**Before**
```
Task airflow.executors.celery_executor.execute_command[5e0bb50c-de6b-4c78-980d-f8d535bbd2aa] succeeded in 6.597011625010055s: None
Task airflow.executors.celery_executor.execute_command[0a39ec21-2b69-414c-a11b-05466204bcb3] succeeded in 6.604327297012787s: None
```
**After**
```
Task airflow.executors.celery_executor.execute_command[57077539-e7ea-452c-af03-6393278a2c34] succeeded in 1.7728257849812508s: None
Task airflow.executors.celery_executor.execute_command[9aa4a0c5-e310-49ba-a1aa-b0760adfce08] succeeded in 1.7124666879535653s: None
```
**After, including change from #11372**
```
Task airflow.executors.celery_executor.execute_command[35822fc6-932d-4a8a-b1d5-43a8b35c52a5] succeeded in 0.5421732050017454s: None
Task airflow.executors.celery_executor.execute_command[2ba46c47-c868-4c3a-80f8-40adaf03b720] succeeded in 0.5469810889917426s: None
```
We seem to have a problem with running all tests at once - most
likely due to some resource problems in our CI - therefore it makes
sense to split the tests into more batches. This is not yet a full
implementation of selective tests, but it is going in this direction
by splitting into Core/Providers/API/CLI tests. The full selective
tests approach will be implemented as part of the #10507 issue.
This split is possible thanks to #10422, which moved building the image
to a separate workflow - this way each image is only built once
and is uploaded to a shared registry, from which it is quickly
downloaded rather than being built by each job separately - this
way we can have many more jobs, as there is very little per-job
overhead before the tests start running.
* KubernetesPodOperator can retry log tailing in case of interruption
* fix failing test
* change read_pod_logs method formatting
* KubernetesPodOperator retry log tailing based on last read log timestamp (see the sketch below)
* fix test_parse_log_line test formatting
* add docstring to parse_log_line method
* fix kubernetes integration test
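As a rough illustration of the idea (a sketch only - the exact parse_log_line signature and the kubernetes client arguments used by the operator may differ), each log line carries a timestamp that can be remembered so tailing can resume from the last read line after an interruption:

```python
from typing import Optional, Tuple


def parse_log_line(line: str) -> Tuple[Optional[str], str]:
    """Split a Kubernetes log line of the form '<timestamp> <message>'."""
    split_at = line.find(' ')
    if split_at == -1:
        # No timestamp found - return the raw line so nothing is lost.
        return None, line
    return line[:split_at], line[split_at + 1:].rstrip()


# On interruption, the operator can remember the last parsed timestamp and
# re-request logs from that point (e.g. via a since_time/since_seconds style
# parameter of the Kubernetes logs API - the exact parameter is an assumption here).
```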
The custom ClusterPolicyViolation was added in #10282.
This one adds a more comprehensive test for it.
Co-authored-by: Jacob Ferriero <jferriero@google.com>
* Fully support running more than one scheduler concurrently.
This PR implements scheduler HA as proposed in AIP-15. The high level
design is as follows:
- Move all scheduling decisions into SchedulerJob (requiring DAG
serialization in the scheduler)
- Use row-level locks to ensure schedulers don't stomp on each other
(`SELECT ... FOR UPDATE`)
- Use `SKIP LOCKED` for better performance when multiple schedulers are
running. (MySQL < 8 and MariaDB don't support this; see the sketch below.)
- Scheduling decisions are not tied to the parsing speed, but can
operate just on the database
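As a hedged sketch of the row-level locking approach (not the scheduler's actual query), SQLAlchemy exposes `SELECT ... FOR UPDATE SKIP LOCKED` via `with_for_update`:

```python
from airflow.models.dagrun import DagRun
from sqlalchemy.orm import Session


def lock_dagruns_for_scheduling(session: Session, limit: int = 10):
    # Each scheduler locks a small batch of DagRun rows; skip_locked=True makes
    # other schedulers skip rows that are already locked instead of blocking
    # on them (not available on MySQL < 8 or MariaDB).
    return (
        session.query(DagRun)
        .filter(DagRun.state == "running")
        .limit(limit)
        .with_for_update(skip_locked=True)
        .all()
    )
```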
*DagFileProcessorProcess*:
Previously this component was responsible for more than just parsing the
DAG files (as its name might imply). It was also responsible for creating
DagRuns and for making scheduling decisions for TIs, moving them from
the "None" state to "scheduled".
This commit changes it so that the DagFileProcessorProcess now will
update the SerializedDAG row for this DAG, and make no scheduling
decisions itself.
To make the scheduler's job easier (so that it can make as many
decisions as possible without having to load the possibly-large
SerializedDAG row) we store/update some columns on the DagModel table:
- `next_dagrun`: The execution_date of the next dag run that should be created (or
None)
- `next_dagrun_create_after`: The earliest point at which the next dag
run can be created
Pre-computing these values (and updating them every time the DAG is
parsed) reduces the overall load on the DB, as many decisions can be taken
by selecting just these two columns/the small DagModel row.
In the case of max_active_runs or `@once`, these columns will be set to
null, meaning "don't create any dag runs".
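A hedged sketch of the kind of cheap query this enables (not the scheduler's actual code): deciding which DAGs need a new DagRun only touches the small DagModel row, never the serialized DAG blob:

```python
from airflow.models.dag import DagModel
from airflow.utils import timezone


def dags_needing_dagruns(session, limit: int = 10):
    # Only the pre-computed columns are consulted; NULL values (max_active_runs
    # reached, or @once already run) drop out of the comparison automatically.
    now = timezone.utcnow()
    return (
        session.query(DagModel)
        .filter(
            DagModel.is_paused.is_(False),
            DagModel.next_dagrun_create_after <= now,
        )
        .limit(limit)
        .all()
    )
```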
*SchedulerJob*
The SchedulerJob used to only queue/send tasks to the executor after
they were parsed, and returned from the DagFileProcessorProcess.
This PR breaks the link between parsing and enqueuing of tasks. Instead
of looking at DAGs as they are parsed, we now:
- store a new datetime column, `last_scheduling_decision` on DagRun
table, signifying when a scheduler last examined a DagRun
- Each time around the loop the scheduler will get (and lock) the next
_n_ DagRuns via `DagRun.next_dagruns_to_examine`, prioritising DagRuns
which haven't been touched by a scheduler in the longest period
- SimpleTaskInstance etc have been almost entirely removed now, as we
use the serialized versions
* Move callbacks execution from Scheduler loop to DagProcessorProcess
* Don’t run verify_integrity if the Serialized DAG hasn’t changed
dag_run.verify_integrity is slow, and we don't want to call it every time - only when the dag structure changes (which we can now detect thanks to DAG Serialization).
* Add escape hatch to disable newly added "SELECT ... FOR UPDATE" queries
We are worried that these extra uses of row-level locking will cause
problems on MySQL 5.x (most likely deadlocks), so we are providing users
an "escape hatch" to be able to make these queries non-locking -- this
means that only a single scheduler should be run, but being able to run
one is better than having the scheduler crash.
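A hedged sketch of how such a switch can be read (the option name used here is an assumption; check the released configuration reference for the final spelling):

```python
from airflow.configuration import conf

# If row-level locking is switched off, the SELECT ... FOR UPDATE queries are
# skipped - which in turn means only a single scheduler should be running.
use_row_level_locking = conf.getboolean(
    "scheduler", "use_row_level_locking", fallback=True
)
```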
Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
This is similar to #11327, but for Celery this time.
The impact is not quite as pronounced here (for simple dags at least),
but it takes the average queued-to-start delay from 1.5s to 0.4s.
Spawning a whole new python process and then re-loading all of Airflow
is expensive. Although this overhead fades to insignificance for long
running tasks, the delay gives a "bad" experience for new users when
they are just trying out Airflow for the first time.
For the LocalExecutor this cuts the "queued time" down from 1.5s to 0.1s
on average.
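A minimal sketch of the idea (names hypothetical, POSIX only): fork the worker process, which already has Airflow imported, instead of spawning a fresh interpreter per task:

```python
import os


def run_task_by_forking(run_task) -> int:
    """Run a task callable in a forked child and return its exit code.

    Sketch only: the real executor also re-initialises settings, logging and
    signal handlers in the child before running the task.
    """
    pid = os.fork()
    if pid == 0:
        # Child: run the task, then exit without unwinding parent state.
        ret = 1
        try:
            ret = int(run_task() or 0)
        finally:
            os._exit(ret)
    # Parent: wait for the child and report its exit status.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)
```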
* Add type annotations to ZendeskHook
__What__
* Add correct type annotations to ZendeskHook and each method
* Update one unit test to pass an empty dictionary rather than
None, since the argument should be a dictionary
__Why__
* Building out type annotations is good for the code base
* The query parameter is accessed with an index at one point, which
means that it cannot be None, but should rather be defaulted to
an empty dictionary if not provided
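A hedged sketch of that pattern (the function and parameter names here are illustrative, not the exact ZendeskHook API):

```python
from typing import Any, Dict, Optional


def call_api(path: str, query: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    # Default to an empty dict so indexed access below can never hit None.
    query = query or {}
    sort_order = query.get("sort_order", "desc")
    return {"path": path, "query": query, "sort_order": sort_order}
```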
* Remove useless return
We don't currently create TIs from serialized DAGs, but we are about to
start -- at which point some of these cases would have just shown
"SerializedBaseOperator", rather than the _real_ class name.
The other changes are just for "consistency" -- we should always get the
task type from this property, not via `__class__.__name__`.
I haven't set up a pre-commit rule for this, as this dunder accessor is
used elsewhere on things that are not BaseOperator instances, and
detecting that is hard to do in a pre-commit rule.
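A short hedged sketch of the difference: `task_type` is a property that still reports the original operator class for tasks re-created from a serialized DAG, while `__class__.__name__` would report the wrapper class:

```python
def describe(task) -> str:
    # For a deserialized task, task.__class__.__name__ would be
    # "SerializedBaseOperator"; task.task_type still returns e.g. "BashOperator".
    return f"{task.task_id} ({task.task_type})"
```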
Resolves #10953.
A refreshed UI for the 2.0 release. The existing "theming" is a bit long in the tooth, and this PR attempts to give it a modern look and some freshness to complement all of the new features under the hood.
The majority of the changes to UI have been done through updates to the Bootstrap theme contained in bootstrap-theme.css. These are simply overrides to the default stylings that are packaged with Bootstrap.
This PR allows for partial import error tracebacks to be exposed on the UI, if enabled. This extra context can be very helpful for users without access to the parsing logs to determine why their DAGs are failing to import properly.
* Fixes an issue where cycle detection uses recursion
and overflows the stack after about 1000 tasks (see the iterative sketch below)
(cherry picked from commit 63f1a180a17729aa937af642cfbf4ddfeccd1b9f)
* reduce test length
* slightly more efficient
* Update airflow/utils/dag_cycle_tester.py
Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
* slightly more efficient
* actually works this time
Co-authored-by: Daniel Imberman <daniel@astronomer.io>
Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
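A self-contained sketch of the approach (not the exact code in dag_cycle_tester.py): replace recursive DFS with an explicit stack so deep DAGs cannot exceed Python's recursion limit:

```python
def has_cycle(adjacency: dict) -> bool:
    """Iterative DFS cycle check over a {task_id: [downstream_task_ids]} mapping."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {node: WHITE for node in adjacency}
    for children in adjacency.values():
        for child in children:
            color.setdefault(child, WHITE)

    for start in adjacency:
        if color[start] != WHITE:
            continue
        stack = [start]
        while stack:
            node = stack[-1]
            if color[node] == WHITE:
                color[node] = GRAY
            for child in adjacency.get(node, []):
                if color[child] == GRAY:
                    return True      # back-edge: a cycle exists
                if color[child] == WHITE:
                    stack.append(child)
                    break
            else:
                color[node] = BLACK  # all children done
                stack.pop()
    return False


assert has_cycle({"a": ["b"], "b": ["c"], "c": ["a"]})
assert not has_cycle({"a": ["b", "c"], "b": [], "c": []})
```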
Example output (I forced one of the existing tests to fail)
```
E AssertionError: The expected number of db queries is 3. The current number is 2.
E
E Recorded query locations:
E scheduler_job.py:_run_scheduler_loop>scheduler_job.py:_emit_pool_metrics>pool.py:slots_stats:94: 1
E scheduler_job.py:_run_scheduler_loop>scheduler_job.py:_emit_pool_metrics>pool.py:slots_stats:101: 1
```
This makes it a bit easier to see what the queries are, without having
to re-run with full query tracing and then analyze the logs.
This can have *extremely* bad consequences. After this change, a jinja2
template like the one below will cause the task instance to fail, if the
DAG being executed is not a sub-DAG. This may also display an error on
the Rendered tab of the Task Instance page.
task_instance.xcom_pull('z', key='return_value', dag_id=dag.parent_dag.dag_id)
Prior to the change in this commit, the above template would pull the
latest value for task_id 'z', for the given execution_date, from *any DAG*.
If your task_ids between DAGs are all unique, or if DAGs using the same
task_id always have different execution_date values, this will appear to
act like dag_id=None.
Our current theory is SQLAlchemy/Python doesn't behave as expected when
comparing `jinja2.Undefined` to `None`.
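A tiny hedged sketch of that theory (illustrative only): the rendered value is a jinja2 Undefined sentinel, which neither `is None` nor `== None` treats as "no dag_id given":

```python
import jinja2

undefined = jinja2.Undefined(name="parent_dag")

print(undefined is None)   # False
print(undefined == None)   # also False - Undefined defines its own __eq__
```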
* Added support for encrypted private keys in SSHHook (see the sketch below)
* Fixed Styling issues and added unit testing
* fixed last pylint styling issue by adding newline to the end of the file
* re-fixed newline issue for pylint checks
* fixed pep8 styling issues and black formatted files to pass static checks
* added comma as per suggestion to fix static check
Co-authored-by: Nadim Younes <nyounes@kobo.com>
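A hedged sketch of the underlying paramiko call (not the exact SSHHook code): an encrypted private key needs its passphrase supplied when the key material is loaded:

```python
import io

import paramiko


def load_private_key(pem_text: str, passphrase: str) -> paramiko.PKey:
    # Without a password, loading an encrypted key raises PasswordRequiredException.
    return paramiko.RSAKey.from_private_key(io.StringIO(pem_text), password=passphrase)
```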
This PR adds the possibility to define template_fields_renderers for an
operator. In this way users will be able to specify which lexer should
be used for rendering a particular field. This is super useful for
custom operators and gives more flexibility than the predefined
keywords (see the example below).
Co-authored-by: Kamil Olszewski <34898234+olchas@users.noreply.github.com>
Co-authored-by: Felix Uellendall <feluelle@users.noreply.github.com>
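A minimal hedged example of what this enables on a custom operator (the operator and field names are illustrative):

```python
from airflow.models import BaseOperator


class MyTransformOperator(BaseOperator):
    template_fields = ("sql", "config")
    # Tell the UI which lexer to use when rendering each templated field.
    template_fields_renderers = {"sql": "sql", "config": "json"}

    def __init__(self, *, sql: str, config: dict, **kwargs):
        super().__init__(**kwargs)
        self.sql = sql
        self.config = config

    def execute(self, context):
        self.log.info("Running %s with config %s", self.sql, self.config)
```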
closes: #10725
Make sure SkipMixin.skip_all_except() handles empty branches like this properly. When "task1" is followed, "join" must not be skipped even though it is considered to be immediately downstream of "branch".
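A hedged sketch of that DAG shape (imports as in Airflow 2.0; task names illustrative):

```python
from airflow import DAG
from airflow.operators.dummy import DummyOperator
from airflow.operators.python import BranchPythonOperator
from airflow.utils.dates import days_ago

with DAG("branch_empty_path", start_date=days_ago(1), schedule_interval=None) as dag:
    branch = BranchPythonOperator(task_id="branch", python_callable=lambda: "task1")
    task1 = DummyOperator(task_id="task1")
    join = DummyOperator(task_id="join", trigger_rule="none_failed_or_skipped")

    branch >> task1 >> join
    branch >> join  # the "empty" branch: join is directly downstream of branch
```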
In Pandas version 1.1.2 the experimental NaN value started to be
returned instead of None in a number of places. That broke our tests.
Fixing the tests also requires Pandas to be updated to >= 1.1.2.
GitHub Actions allows using the `fromJson` method to read arrays
or even more complex JSON objects into the CI workflow yaml files.
This, combined with set-output commands, allows us to read the
list of allowed versions as well as the default ones from the
environment variables configured in
./scripts/ci/libraries/initialization.sh
This means that we can have one place in which versions are
configured. We also need to do it in "breeze-complete", as this is
a standalone script that should not source anything. We added
BATS tests to verify that the versions in breeze-complete
correspond with those defined in initialization.sh.
Also we no longer limit tests in regular PRs - we run
all combinations of available versions. Our tests run quite a
bit faster now, so we should be able to run more complete
matrices. We can still exclude individual values of the matrices
if this is too much.
MySQL 8 is disabled from Breeze for now. I plan a separate follow-up
PR in which we will run MySQL 8 tests (they were not run so far).
This commit introduces TaskGroup, which is a simple UI task grouping concept.
- TaskGroups can be collapsed/expanded in Graph View when clicked
- TaskGroups can be nested
- TaskGroups can be put upstream/downstream of tasks or other TaskGroups with >> and << operators (see the example below)
- Search box, hovering, focusing in Graph View treats TaskGroup properly. E.g. searching for tasks also highlights TaskGroup that contains matching task_id. When TaskGroup is expanded/collapsed, the affected TaskGroup is put in focus and moved to the centre of the graph.
What this commit does not do:
- This commit does not change or remove SubDagOperator. Although TaskGroup is intended as an alternative for SubDagOperator, deprecating SubDagOperator will need to be discussed/implemented in the future.
- This PR only implements TaskGroup handling in the Graph View. In places such as Tree View, it will look as if
TaskGroup does not exist and all tasks are in the same flat DAG.
GitHub Issue: #8078
AIP: https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-34+TaskGroup%3A+A+UI+task+grouping+concept+as+an+alternative+to+SubDagOperator
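A short hedged example of the API described above (import path as introduced by this commit):

```python
from airflow import DAG
from airflow.operators.dummy import DummyOperator
from airflow.utils.dates import days_ago
from airflow.utils.task_group import TaskGroup

with DAG("task_group_example", start_date=days_ago(1), schedule_interval=None) as dag:
    start = DummyOperator(task_id="start")
    end = DummyOperator(task_id="end")

    # The group is shown as a single collapsible node in Graph View.
    with TaskGroup(group_id="extract") as extract:
        DummyOperator(task_id="fetch") >> DummyOperator(task_id="clean")

    start >> extract >> end  # groups compose with >> / << just like tasks
```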
multiprocessing.Process is set up in a very unfortunate manner
that pretty much makes it impossible to test a class that inherits from
Process or to use any of its internal functions. For this reason we decided
to separate the actual process-based functionality into a class member (see the sketch below).
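A hedged sketch of that refactor pattern (names hypothetical): keep the logic in a plain class that tests can call directly, and hand only a bound method to multiprocessing.Process:

```python
import multiprocessing


class DagParsingRunner:
    """Plain class - its methods can be unit-tested without any process machinery."""

    def __init__(self, file_path: str):
        self.file_path = file_path

    def run(self) -> None:
        print(f"parsing {self.file_path}")  # the actual work would go here


def start_in_subprocess(runner: DagParsingRunner) -> multiprocessing.Process:
    # Composition instead of inheriting from Process keeps the logic testable.
    process = multiprocessing.Process(target=runner.run, name="dag-parser")
    process.start()
    return process
```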
* Fetching databricks host from connection if not supplied in extras.
* Fixing formatting issue in databricks test
Co-authored-by: joshi95 <shubham@playsimple.in>
* Simplify Airflow on Kubernetes Story
Removes thousands of lines of code that essentially amount to us
re-creating the Kubernetes API. Will offer a faster, simpler
KubernetesExecutor for 2.0
* Fix podgen tests
* fix documentation
* simplify validate function
* @mik-laj comments
* spellcheck
* spellcheck
* Update airflow/executors/kubernetes_executor.py
Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
It seems that the test_find_not_should_ignore_path test has some
dependency on side-effects from other tests.
See #10988 - we are moving this test to heisentests until we
solve the issue.
Changed `Is` to `Passed`
Before:
```
ERROR: Allowed backend: [ sqlite mysql postgres ]. Is: 'dpostgres'.
Switch to supported value with --backend flag.
```
After:
```
ERROR: Allowed backend: [ sqlite mysql postgres ]. Passed: 'dpostgres'.
Switch to supported value with --backend flag.
```
This can happen when a task is enqueued by one executor, and then that
scheduler dies/exits.
The default fallback behaviour is unchanged -- queued tasks are
cleared and then later rescheduled.
But for Celery, we can do better -- if we record the Celery-generated
task_id, we can then re-create the AsyncResult objects for orphaned
tasks at a later date.
However, since Celery just reports all AsyncResult as "PENDING", even if
they aren't tasks currently in the broker queue, we need to apply a
timeout to "unblock" these tasks in case they never actually made it to
the Celery broker.
This all means that we can adopt tasks that have been enqueued by another
CeleryExecutor if it dies, without having to clear the task and slow
down. This is especially useful as the task may have already started
running, and while clearing it would stop it, it's better if we don't
have to reset it!
Co-authored-by: Kaxil Naik <kaxilnaik@apache.org>
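A hedged sketch of the adoption idea (not the executor's actual code): with the Celery task_id persisted, an AsyncResult can be rebuilt and polled later, with a timeout guarding against results that stay "PENDING" because they never reached the broker:

```python
import time

from celery.result import AsyncResult


def check_adopted_task(celery_task_id: str, adopted_at: float, pending_timeout: float = 600.0) -> str:
    result = AsyncResult(celery_task_id)  # re-created from the stored id
    state = result.state
    if state == "PENDING" and time.monotonic() - adopted_at > pending_timeout:
        # Celery reports unknown ids as PENDING too, so after the timeout we
        # give up and let the task instance be cleared and rescheduled.
        return "stalled"
    return state
```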
* Ensure that K8sPodOperator can pull namespace from pod_template_file
Fixes a bug where users running K8sPodOperator could not run it because
the operator was expecting a namespace parameter
* add test
* self.pod
* Update airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py
Co-authored-by: Kamil Breguła <mik-laj@users.noreply.github.com>
* don't create pod until run
* spellcheck
Co-authored-by: Kamil Breguła <mik-laj@users.noreply.github.com>
TestApiKerberos::test_trigger_dag previously depended on the `example_bash_operator` DAG existing in the database.
If one of the other tests didn't write it to the DB, or if one of the other tests cleared it from the DB, this test failed.
It consists of CeleryExecutor and KubernetesExecutor, which allows users
to route their tasks to either Kubernetes or Celery based on the queue
defined on a task
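A hedged example of the routing behaviour (the queue name that selects Kubernetes is configurable; "kubernetes" is used here purely as an illustration):

```python
from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.utils.dates import days_ago

with DAG("celery_k8s_routing", start_date=days_ago(1), schedule_interval=None) as dag:
    # Default queue -> handled by the Celery side of the executor.
    quick = BashOperator(task_id="quick", bash_command="echo 1")

    # Matches the configured Kubernetes queue -> launched as a pod instead.
    isolated = BashOperator(task_id="isolated", bash_command="echo heavy", queue="kubernetes")

    quick >> isolated
```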
- Instead of supporting only an Admin user in the base test class, you can also use a normal User or Viewer
- Only add users when they are being used so we can do a little less in the setup phase (minor speedup in TestDagACLView)
__lshift__ and __rshift__ methods should return `other`, not `self`.
This PR fixes the XComArg implementation to support chains like this one:
BaseOperator >> XComArg >> BaseOperator
Related to: #10153
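The gist of the fix, as a hedged stand-alone sketch (not the actual XComArg code):

```python
class ShiftableArg:
    """Minimal stand-in showing why __rshift__/__lshift__ must return ``other``."""

    def set_downstream(self, other):
        print(f"{self} -> {other}")

    def set_upstream(self, other):
        print(f"{other} -> {self}")

    def __rshift__(self, other):
        self.set_downstream(other)
        return other  # returning self here would break `a >> b >> c` chains

    def __lshift__(self, other):
        self.set_upstream(other)
        return other
```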
If a task failed hard on Celery, _before_ being able to execute the
Airflow code, the task would end up stuck in the queued state. This change
makes it get retried.
This was discovered in load testing the HA work (but unrelated to HA
changes), where I swamped the kube-dns pod, meaning the worker was
sometimes unable to resolve the db name via DNS, so the state in the DB
was never updated
"airflow.providers.amazon.aws.secrets.secrets_manager." "SecretsManagerBackend.get_conn_uri"
to
"airflow.providers.amazon.aws.secrets.secrets_manager.SecretsManagerBackend.get_conn_uri"
* Modify helm chart to use pod_template_file
Since we are deprecating most k8sexecutor arguments
we should use the pod_template_file when launching airflow
using the KubernetesExecutor
* fix tests
* one more nit
* fix dag command
* fix pylint
The SmartSensor PR introduced slightly different behaviour in
list_py_files when given a file path directly.
Prior to that PR, if given a file path it would not include examples.
After that PR was merged, it would return that path and the example DAGs
(assuming they were enabled).
Once HA mode for the scheduler lands, we can no longer reset orphaned
tasks by looking at the tasks in (the memory of) the current executor.
This changes it to keep track of which (Scheduler)Job queued/scheduled a
TaskInstance (the new "queued_by_job_id" column stored against
TaskInstance table), and then we can use the existing heartbeat
mechanism for jobs to notice when a TI should be reset.
As part of this, the existing implementation of
`reset_state_for_orphaned_tasks` has been moved out of BaseJob into
BackfillJob -- as only this and SchedulerJob had these methods, and the
SchedulerJob version now operates differently.
* Add podOverride setting for KubernetesExecutor
Users of the KubernetesExecutor will now have a "podOverride"
option in the executor_config. This option will allow users to
modify the pod launched by the KubernetesExecutor using a
`kubernetes.client.models.V1Pod` class. This is the first step
in deprecating the traditional executor_config.
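A hedged example of the new option (this PR's description calls it "podOverride"; the exact executor_config key spelling used below is an assumption, and the task would normally live inside a DAG definition):

```python
from kubernetes.client import models as k8s

from airflow.operators.python import PythonOperator


def do_work():
    print("running with an overridden pod spec")


heavy_task = PythonOperator(
    task_id="heavy_task",
    python_callable=do_work,
    executor_config={
        "pod_override": k8s.V1Pod(
            spec=k8s.V1PodSpec(
                containers=[
                    k8s.V1Container(
                        name="base",
                        resources=k8s.V1ResourceRequirements(requests={"memory": "2Gi"}),
                    )
                ]
            )
        )
    },
)
```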
* Fix k8s tests
* fix docs