This PR adds the possibility to define template_fields_renderers for an
operator. In this way users will be able to specify which lexer should
be used for rendering a particular field. This is especially useful for
custom operators and gives more flexibility than the predefined
keywords.
Co-authored-by: Kamil Olszewski <34898234+olchas@users.noreply.github.com>
Co-authored-by: Felix Uellendall <feluelle@users.noreply.github.com>
closes: #10725
Make sure SkipMixin.skip_all_except() handles empty branches like this properly. When "task1" is followed, "join" must not be skipped even though it is considered to be immediately downstream of "branch".
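A minimal, hypothetical DAG shape illustrating such an empty branch (all names are made up; Airflow 2.0-style imports assumed):
```python
from airflow import DAG
from airflow.operators.dummy import DummyOperator
from airflow.operators.python import BranchPythonOperator
from airflow.utils.dates import days_ago

with DAG("empty_branch_example", start_date=days_ago(1), schedule_interval=None) as dag:
    # The callable returns the task_id to follow; here it always picks "task1".
    branch = BranchPythonOperator(task_id="branch", python_callable=lambda: "task1")
    task1 = DummyOperator(task_id="task1")
    join = DummyOperator(task_id="join")

    # "join" is directly downstream of "branch" (the empty branch) and also
    # downstream of "task1", so it must not be skipped when "task1" is followed.
    branch >> [task1, join]
    task1 >> join
```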
In Pandas version 1.1.2 the experimental NaN value started to be
returned instead of None in a number of places. That broke our tests.
Fixing the tests also requires Pandas to be updated to >=1.1.2.
GitHub Actions allow using the `fromJson` method to read arrays
or even more complex JSON objects into the CI workflow yaml files.
This, combined with `::set-output` commands, allows us to read the
list of allowed versions as well as the default ones from the
environment variables configured in
./scripts/ci/libraries/initialization.sh
This means that we can have one place in which versions are
configured. We also need to duplicate them in "breeze-complete", as this
is a standalone script that should not source anything. We added BATS
tests to verify that the versions in breeze-complete
correspond with those defined in initialization.sh.
Also, we do not limit tests in regular PRs any more - we run
all combinations of available versions. Our tests run quite a
bit faster now, so we should be able to run more complete
matrices. We can still exclude individual values of the matrices
if this is too much.
MySQL 8 is disabled from Breeze for now. I plan a separate follow-up
PR where we will run MySQL 8 tests (they were not run so far).
This commit introduces TaskGroup, which is a simple UI task grouping concept.
- TaskGroups can be collapsed/expanded in Graph View when clicked
- TaskGroups can be nested
- TaskGroups can be put upstream/downstream of tasks or other TaskGroups with >> and << operators
- Search box, hovering, and focusing in Graph View treat TaskGroups properly. E.g. searching for tasks also highlights the TaskGroup that contains a matching task_id. When a TaskGroup is expanded/collapsed, the affected TaskGroup is put in focus and moved to the centre of the graph.
What this commit does not do:
- This commit does not change or remove SubDagOperator. Although TaskGroup is intended as an alternative for SubDagOperator, deprecating SubDagOperator will need to be discussed/implemented in the future.
- This PR only implements TaskGroup handling in the Graph View. In places such as Tree View, it will look as if TaskGroups do not exist and all tasks are in the same flat DAG.
GitHub Issue: #8078
AIP: https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-34+TaskGroup%3A+A+UI+task+grouping+concept+as+an+alternative+to+SubDagOperator
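A small sketch of the grouping this enables (task and group names are illustrative):
```python
from airflow import DAG
from airflow.operators.dummy import DummyOperator
from airflow.utils.dates import days_ago
from airflow.utils.task_group import TaskGroup

with DAG("taskgroup_example", start_date=days_ago(1), schedule_interval=None) as dag:
    start = DummyOperator(task_id="start")
    end = DummyOperator(task_id="end")

    # Tasks created inside the context manager belong to the group, which is
    # rendered as a single collapsible node in the Graph View.
    with TaskGroup("section_1") as section_1:
        task_1 = DummyOperator(task_id="task_1")
        task_2 = DummyOperator(task_id="task_2")
        task_1 >> task_2

    # TaskGroups can be wired upstream/downstream just like tasks.
    start >> section_1 >> end
```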
multiprocessing.Process is set up in a very unfortunate manner
that pretty much makes it impossible to test a class that inherits from
Process or use any of its internal functions. For this reason we decided
to separate the actual process-based functionality into a class member.
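A simplified sketch of the composition-over-inheritance approach described above (class names are hypothetical, not the actual Airflow code):
```python
import multiprocessing


class ProcessorLogic:
    """Holds the real work so it can be unit-tested without spawning a process."""

    def run(self, path: str) -> None:
        print(f"processing {path}")  # placeholder for the actual work


class ProcessorProcess:
    """Owns a multiprocessing.Process as a member instead of subclassing it."""

    def __init__(self, path: str) -> None:
        self._logic = ProcessorLogic()
        self._process = multiprocessing.Process(target=self._logic.run, args=(path,))

    def start(self) -> None:
        self._process.start()

    def join(self) -> None:
        self._process.join()
```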
* Fetching the Databricks host from the connection if not supplied in extras.
* Fixing formatting issue in Databricks test
Co-authored-by: joshi95 <shubham@playsimple.in>
* Simplify Airflow on Kubernetes Story
Removes thousands of lines of code that essentially amount to us
re-creating the Kubernetes API. Will offer a faster, simpler
KubernetesExecutor for 2.0.
* Fix podgen tests
* fix documentation
* simplify validate function
* @mik-laj comments
* spellcheck
* spellcheck
* Update airflow/executors/kubernetes_executor.py
Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
It seems that the test_find_not_should_ignore_path test has some
dependency on side-effects from other tests.
See #10988 - we are moving this test to heisentests until we
solve the issue.
Changed `Is` to `Passed`
Before:
```
ERROR: Allowed backend: [ sqlite mysql postgres ]. Is: 'dpostgres'.
Switch to supported value with --backend flag.
```
After:
```
ERROR: Allowed backend: [ sqlite mysql postgres ]. Passed: 'dpostgres'.
Switch to supported value with --backend flag.
```
This can happen when a task is enqueued by one executor, and then that
scheduler dies/exits.
The default fallback behaviour is unchanged -- queued tasks are
cleared and then later rescheduled.
But for Celery, we can do better -- if we record the Celery-generated
task_id, we can then re-create the AsyncResult objects for orphaned
tasks at a later date.
However, since Celery just reports all AsyncResult as "PENDING", even if
they aren't tasks currently in the broker queue, we need to apply a
timeout to "unblock" these tasks in case they never actually made it to
the Celery broker.
This all means that we can adopt tasks that have been enqueued by another
CeleryExecutor if it dies, without having to clear the task and slow
down. This is especially useful as the task may have already started
running, and while clearing it would stop it, it's better if we don't
have to reset it!
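A rough sketch of the adoption idea described above (function name and data shapes are simplified assumptions, not the actual CeleryExecutor code):
```python
from celery.result import AsyncResult


def adopt_orphaned_tasks(stored_celery_ids):
    """Re-create result handles for task instances enqueued by a dead executor.

    ``stored_celery_ids`` maps a TaskInstance key to the Celery-generated
    task_id recorded when the task was enqueued.
    """
    adopted = {}
    for ti_key, celery_task_id in stored_celery_ids.items():
        # Celery reports unknown ids as PENDING too, so a separate timeout
        # decides when to give up on results that never reached the broker.
        adopted[ti_key] = AsyncResult(celery_task_id)
    return adopted
```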
Co-authored-by: Kaxil Naik <kaxilnaik@apache.org>
* Ensure that K8sPodOperator can pull namespace from pod_template_file
Fixes a bug where users who run K8sPodOperator could not run it because
the operator was expecting a namespace parameter.
* add test
* self.pod
* Update airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py
Co-authored-by: Kamil Breguła <mik-laj@users.noreply.github.com>
* don't create pod until run
* spellcheck
Co-authored-by: Kamil Breguła <mik-laj@users.noreply.github.com>
TestApiKerberos::test_trigger_dag previously depended on the `example_bash_operator` DAG existing in the database.
If one of the other tests didn't write it to the DB, or if one of the other tests cleared it from the DB, this test failed.
It consists of CeleryExecutor and KubernetesExecutor, which allows users
to route their tasks to either Kubernetes or Celery based on the queue
defined on a task.
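A sketch of the routing from a DAG author's point of view (the queue name that triggers Kubernetes routing is configurable; "kubernetes" below is an assumption):
```python
from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.utils.dates import days_ago

with DAG("queue_routing_example", start_date=days_ago(1), schedule_interval=None) as dag:
    # Runs on a Celery worker (the default queue).
    on_celery = BashOperator(task_id="on_celery", bash_command="echo celery")

    # Routed to a Kubernetes pod because of its queue.
    on_kubernetes = BashOperator(
        task_id="on_kubernetes",
        bash_command="echo kubernetes",
        queue="kubernetes",
    )
```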
- Instead of supporting only an Admin user in the base test class, you can also use a normal User or Viewer
- Only add users when they are being used so we can do a little less in the setup phase (minor speedup in TestDagACLView)
The __lshift__ and __rshift__ methods should return `other`, not `self`.
This PR fixes the XComArg implementation to support chains like this one:
BaseOperator >> XComArg >> BaseOperator
Related to: #10153
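A simplified sketch of why returning `other` matters for chaining (not the actual XComArg implementation):
```python
class Chainable:
    def set_downstream(self, other):
        ...  # wire self -> other

    def __rshift__(self, other):
        self.set_downstream(other)
        # Returning `other` (not `self`) makes a >> b >> c wire a->b and b->c;
        # returning `self` would incorrectly wire a->b and a->c.
        return other
```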
If a task failed hard on Celery, _before_ being able to execute the
Airflow code, the task would end up stuck in the queued state. This change
makes it get retried.
This was discovered in load testing the HA work (but unrelated to HA
changes), where I swamped the kube-dns pod, meaning the worker was
sometimes unable to resolve the db name via DNS, so the state in the DB
was never updated
"airflow.providers.amazon.aws.secrets.secrets_manager." "SecretsManagerBackend.get_conn_uri"
to
"airflow.providers.amazon.aws.secrets.secrets_manager.SecretsManagerBackend.get_conn_uri"
* Modify helm chart to use pod_template_file
Since we are deprecating most KubernetesExecutor arguments,
we should use the pod_template_file when launching Airflow
using the KubernetesExecutor.
* fix tests
* one more nit
* fix dag command
* fix pylint
The SmartSensor PR introduced slightly different behaviour in
list_py_files when given a file path directly.
Prior to that PR, if given a file path it would not include examples.
After that PR was merged, it would return that path and the example DAGs
(assuming they were enabled).
Once HA mode for the scheduler lands, we can no longer reset orphaned
tasks by looking at the tasks in (the memory of) the current executor.
This changes it to keep track of which (Scheduler)Job queued/scheduled a
TaskInstance (the new "queued_by_job_id" column stored against
TaskInstance table), and then we can use the existing heartbeat
mechanism for jobs to notice when a TI should be reset.
As part of this, the existing implementation of
`reset_state_for_orphaned_tasks` has been moved out of BaseJob into
BackfillJob -- as only this and SchedulerJob had these methods, and the
SchedulerJob version now operates differently.
* Add podOverride setting for KubernetesExecutor
Users of the KubernetesExecutor will now have a "podOverride"
option in the executor_config. This option will allow users to
modify the pod launched by the KubernetesExecutor using a
`kubernetes.client.models.V1Pod` class. This is the first step
in deprecating the traditional executor_config.
* Fix k8s tests
* fix docs
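Roughly, the executor_config described above could look like this (the key name "podOverride" follows this PR's wording; values are illustrative):
```python
from kubernetes.client import models as k8s

# Illustrative executor_config to pass to a task; overrides resources of the
# "base" container in the pod launched by the KubernetesExecutor.
executor_config = {
    "podOverride": k8s.V1Pod(
        spec=k8s.V1PodSpec(
            containers=[
                k8s.V1Container(
                    name="base",
                    resources=k8s.V1ResourceRequirements(
                        requests={"cpu": "1", "memory": "2Gi"}
                    ),
                )
            ]
        )
    )
}
```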
The region parameter is required for some of the Google Dataproc operators
and it should be provided by users to avoid creating data-intensive
tasks in a default location.
We've observed the tests for the last couple of weeks and it seems
most of the tests marked with the "quarantine" marker are succeeding
in a stable way (https://github.com/apache/airflow/issues/10118).
The removed tests have a success ratio of > 95% (20 runs without
problems) and this was verified a week ago as well,
so it seems they are rather stable.
There are literally a few that are either failing or causing
the Quarantined builds to hang. I manually reviewed the
master tests that failed over the last few weeks and added the
tests that are causing the builds to hang.
It seems that stability has improved - which might be caused
by some temporary problems present when we marked the quarantined builds,
or a too "generous" way of marking tests as quarantined, or
maybe the improvement comes from #10368, as the docker engine
and machines used to run the builds in GitHub experience far
less load (image builds are executed in separate builds), so
it might be that resource usage has decreased. Another reason
might be GitHub Actions stability improvements.
Or simply those tests are more stable when run in isolation.
We might still add failing tests back as soon as we see them behave
in a flaky way.
The remaining quarantined tests that need to be fixed:
* test_local_run (often hangs the build)
* test_retry_handling_job
* test_clear_multiple_external_task_marker
* test_should_force_kill_process
* test_change_state_for_tis_without_dagrun
* test_cli_webserver_background
We also move some of those tests to the "heisentests" category.
Those tests run fine in isolation but fail
the builds when run with all other tests:
* TestImpersonation tests
We might find that those heisentests can be fixed, but for
now we are going to run them in isolation.
Also - since those quarantined tests are failing more often,
the "num runs" to track for those has been decreased to 10
to keep track of the last 10 runs only.
DataprocCreateCluster now requires:
- cluster config
- cluster name
- project id
In this way users don't have to pass project_id twice
(in the cluster definition and as a parameter). The cluster object
is built in the create_cluster hook method.
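A sketch of the resulting call shape (parameter names follow the list above; project, cluster, and region values are illustrative):
```python
from airflow.providers.google.cloud.operators.dataproc import DataprocCreateClusterOperator

create_cluster = DataprocCreateClusterOperator(
    task_id="create_cluster",
    project_id="my-project",        # passed once, not repeated in the config
    cluster_name="example-cluster",
    cluster_config={
        "master_config": {"num_instances": 1, "machine_type_uri": "n1-standard-4"},
        "worker_config": {"num_instances": 2, "machine_type_uri": "n1-standard-4"},
    },
    region="europe-west1",          # region must be provided explicitly
)
```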
BATS has additional libraries of asserts that are much more
straightforward and make it nicer to write tests for bash scripts.
There is no Dockerfile from BATS that contains those, so we
had to build our own (but it follows the same structure
as #9652 - where we keep our dev docker image
sources inside our repository and the generated docker images
in the "apache/airflow:<tool>-CALVER-TOOLVER" format).
We have more BATS unit tests to add - following #10576 -
and this change will be of great help.
If we ran this test
(TestTriggerRuleDep::test_get_states_count_upstream_ti specifically)
more than once without clearing the DB in between, it would fail due to a
unique constraint violation.
The `@provide_session` wrapper will already commit the transaction when
the function returns, unless an explicit session is passed in -- removing this
parameter changes the behaviour to be:
- If session explicitly passed in: don't commit (caller's
responsibility)
- If no session passed in, `@provide_session` will commit for us already.
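In usage terms, the behaviour looks like this (function name is illustrative):
```python
from airflow.utils.session import provide_session


@provide_session
def set_task_states(task_ids, state, session=None):
    """Illustrative function; body omitted.

    set_task_states(ids, "failed")
        -> @provide_session creates a session and commits on return.
    set_task_states(ids, "failed", session=my_session)
        -> no commit here; committing is the caller's responsibility.
    """
```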
Add jupytercmd in the Qubole Operator, which fires a JupyterNotebookCommand to the Jupyter notebooks running on the user's QDS account. Along with this, we have fixed a minor bug that caused tasks to fail when --notify is set in the Qubole Operator.
Co-authored-by: Aaditya Sharma <asharma@qubole.com>
Inspired by the Google Shell Guide, where they mention
separating package names with ::, I realized that this was
one of the missing pieces in our bash scripts.
While we already had packages (in the libraries folders),
it's been difficult to realise which function is where.
By introducing packages - equal to the library file name -
we are *almost* at the level of a structured language, and
it's easier to find the functions when you are looking for them.
Way easier in fact.
Part of #10576
(cherry picked from commit cc551ba793)
(cherry picked from commit 2bba276f0f06a5981bdd7e4f0e7e5ca2fe84f063)
* Implement Google Shell Conventions for breeze script … (#10651)
Part of #10576
First (and the biggest) of the series of commits to introduce
Google Shell Conventions in our bash scripts.
This covers the biggest and most complex script - breeze -
so it is rather huge, but it is difficult to split it into
smaller pieces.
The rules implemented (from the conventions):
* constants and exported variables are CAPITALIZED, while
local/temporary variables are lowercase
* following the shell guide, once all the variables are set to their
final values (either from exported variables, calculation, or --switches)
I have a single function that makes all the variables read-only. That
helped to clean up a lot of places where the same functions were called
several times, or where variables were defined in a few places. Now the
behavior should be rather consistent and we should easily catch some
duplications
* function headers (following the guide) explaining the arguments,
variables expected, and variables modified in the functions
* setting the variables as read-only also helped to clean up the "ifs"
where we often had ":=}" in variables and != "" or == "" tests. Those are
replaced with `=}`, and the tests are replaced with `-n` and `-z` - also
following the shell guide (readonly helped to detect and clean all
such cases). This should also be much more robust in the future.
* reorganized initialization of those constants and variables - simplified
a few places where initialization was overlapping. It should be much more
straightforward and clean now
* a number of internal breeze function variables are "local" - this
helps prevent accidental variable overwriting and keeps things localized
* trap_add function is separated out to help in cases where we had
several traps handling the same signals.
(cherry picked from commit 46c8d6714c)
(cherry picked from commit c822fd7b4bf2a9c5a9bb3c6e783cbea9dac37246)
* fixup! Implement Google Shell Conventions for breeze script … (#10651)
* Revert "Add packages to function names in bash (#10670)"
This reverts commit cc551ba793.
* Revert "Implement Google Shell Conventions for breeze script … (#10651)"
This reverts commit 46c8d6714c.
Inspired by the Google Shell Guide, where they mention
separating package names with ::, I realized that this was
one of the missing pieces in our bash scripts.
While we already had packages (in the libraries folders),
it's been difficult to realise which function is where.
By introducing packages - equal to the library file name -
we are *almost* at the level of a structured language, and
it's easier to find the functions when you are looking for them.
Way easier in fact.
Part of #10576
Part of #10576
First (and the biggest) of the series of commits to introduce
Google Shell Conventions in our bash scripts.
This covers the biggest and most complex script - breeze -
so it is rather huge, but it is difficult to split it into
smaller pieces.
The rules implemented (from the conventions):
* constants and exported variables are CAPITALIZED, while
local/temporary variables are lowercase
* following the shell guide, once all the variables are set to their
final values (either from exported variables, calculation, or --switches)
I have a single function that makes all the variables read-only. That
helped to clean up a lot of places where the same functions were called
several times, or where variables were defined in a few places. Now the
behavior should be rather consistent and we should easily catch some
duplications
* function headers (following the guide) explaining the arguments,
variables expected, and variables modified in the functions
* setting the variables as read-only also helped to clean up the "ifs"
where we often had ":=}" in variables and != "" or == "" tests. Those are
replaced with `=}`, and the tests are replaced with `-n` and `-z` - also
following the shell guide (readonly helped to detect and clean all
such cases). This should also be much more robust in the future.
* reorganized initialization of those constants and variables - simplified
a few places where initialization was overlapping. It should be much more
straightforward and clean now
* a number of internal breeze function variables are "local" - this
helps prevent accidental variable overwriting and keeps things localized
* trap_add function is separated out to help in cases where we had
several traps handling the same signals.
* Updated REST API call so GET requests pass payload in query string instead of request body
* Updated comparisons to use `in` to follow better standards
* Added whitespace for pylint failure
* Update Databricks hooks tests to reflect new payload
* Fixed trailing whitespace in unit test
Co-authored-by: Steven Yu <steven@databricks.com>
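With the `requests` library, the difference is roughly this (illustrative, not the hook's exact code):
```python
import requests

url = "https://example.cloud.databricks.com/api/2.0/jobs/runs/get"
payload = {"run_id": 42}

# Before: payload sent in the request body, which is not standard for GET.
# response = requests.get(url, json=payload)

# After: payload passed as query-string parameters.
response = requests.get(url, params=payload)
```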
We have already fixed a lot of problems that were marked
with those; also, IntelliJ has gotten a bit smarter at not
detecting false positives as well as at understanding more
pylint annotations. Wherever the problem remained
we replaced it with # noqa comments - as these are
also well understood by IntelliJ.
Perf_kit was a separate folder and it was a problem when we tried to
build it from Docker-embedded sources, because there was a hidden,
implicit dependency between tests (conftest) and perf.
Perf_kit is now moved to tests to be available in the CI image
also when we run tests without the sources mounted.
This is changing back in #10441 and we need to move perf_kit
for it to work.
* fix: 🐛 Wrong S3 URI on COPY query
The S3 URI on COPY query was appending the target Redshift table to the
S3 object key.
* test: 💍 Fixed typo on test query
The COPY query that the operator used is the same query the test uses.
- refactor/change azure_container_instance to use AzureBaseHook
- add info to operators-and-hooks-ref.rst
- add howto docs for connecting to azure
- add auth mechanism via json config
- add azure conn type
* Add Amazon SES hook
* Add SES Hook to operators-and-hooks documentation.
* Fix arguments for parent class constructor call (PR feedback)
* Fix indentation in operators-and-hooks documentation
* Fix mypy error for argument on call to parent class constructor
* Simplify logic on constructor (PR feedback)
* Add custom headers and other relevant options to hook
* Change pylint exception rule to apply it only to function instead of module (PR feedback)
* Fix spellcheck error
* Vendorize airflow.utils.email
* fixup! Vendorize airflow.utils.email
Co-authored-by: Kamil Breguła <kamil.bregula@polidea.com>
* Make Kubernetes tests pass locally
Currently the Kubernetes tests all pass only within Breeze.
This PR makes them read the local path so they can pass on any
system.
* static tests
This allows for all the kinds of verbosity we want, including
writing outputs to output files, and it also works out-of-the-box
in non-interactive git-commit shell scripts. Also, as a side effect,
we have mocked tools in bats tests, which will allow us to write
more comprehensive unit tests for our bash scripts
(this is a long overdue task).
Part of #10368
Using the parameterized library, add unit test coverage
for JenkinsJobTriggerOperator parameters, covering parameters
as strings or as a list of strings.
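A hedged sketch of what that coverage can look like (the import path and constructor arguments are best-effort assumptions, not copied from the PR):
```python
import unittest

from parameterized import parameterized

from airflow.providers.jenkins.operators.jenkins_job_trigger import JenkinsJobTriggerOperator


class TestJenkinsJobTriggerOperatorParameters(unittest.TestCase):
    @parameterized.expand([
        ("string_param", "--verbose"),
        ("list_param", ["--verbose", "--fast"]),
    ])
    def test_parameters_are_preserved(self, _, parameters):
        # Build the operator with parameters given as a string or list of strings
        # and assert the value is preserved as-is.
        op = JenkinsJobTriggerOperator(
            task_id="trigger",
            job_name="a_job",
            jenkins_connection_id="jenkins_default",
            parameters=parameters,
        )
        self.assertEqual(op.parameters, parameters)
```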
* Extract get_job_state and fix poke of AwsGlueJobSensor
* Save hook and reuse in GlueJobSensor
* Add descriptions for some functions
* Fix tests according to changed function definition
* Fix too long line
* Add type hints and apply review
* Fix type error
Co-authored-by: JB Lee <jb.lee@sendbird.com>
We run this on webserver startup, and when DAG Serialization is enabled we expect that no files are required, but because of this bug the files were still looked for.
This change will allow users to throw exceptions other than `DagCycleException` (namely `AirflowClusterPolicyViolation`) as part of Cluster Policies.
This can be helpful for running checks on tasks / DAGs (e.g. asserting a task has a non-airflow owner) and failing to run tasks that aren't compliant with these checks.
This is meant more as a tool for Airflow admins to prevent user mistakes (especially in shared Airflow infrastructure with newbies) than as a strong technical control for security/compliance posture.
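A hedged sketch of such a policy in a user's airflow_local_settings.py (the owner check is illustrative):
```python
# airflow_local_settings.py (illustrative)
from airflow.exceptions import AirflowClusterPolicyViolation


def task_policy(task):
    """Reject tasks that keep the default 'airflow' owner."""
    if task.owner == "airflow":
        raise AirflowClusterPolicyViolation(
            f"Task {task.task_id} in DAG {task.dag_id} must have a non-airflow owner."
        )
```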