[AIRFLOW-XXX] CHANGELOG and UPDATING for 1.10.3

Ash Berlin-Taylor 2019-04-06 10:04:23 +01:00
Parent: ec631954b8
Commit: e8cd3e23e0
2 changed files: 533 additions and 193 deletions


@@ -1,3 +1,327 @@
Airflow 1.10.3, 2019-04-09
--------------------------
New Features
""""""""""""
- [AIRFLOW-4232] Add ``none_skipped`` trigger rule (#5032)
- [AIRFLOW-3971] Add Google Cloud Natural Language operators (#4980)
- [AIRFLOW-4069] Add Opsgenie Alert Hook and Operator (#4903)
- [AIRFLOW-3552] Fix encoding issue in ImapAttachmentToS3Operator (#5040)
- [AIRFLOW-3552] Add ImapToS3TransferOperator (#4476)
- [AIRFLOW-1526] Add dingding hook and operator (#4895)
- [AIRFLOW-3490] Add BigQueryHook's Ability to Patch Table/View (#4299)
- [AIRFLOW-3918] Add ssh private-key support to git-sync for KubernetesExecutor (#4777)
- [AIRFLOW-3659] Create Google Cloud Transfer Service Operators (#4792)
- [AIRFLOW-3939] Add Google Cloud Translate operator (#4755)
- [AIRFLOW-3541] Add Avro logical type conversion to bigquery hook (#4553)
- [AIRFLOW-4106] instrument starving tasks in pool (#4927)
- [AIRFLOW-2568] Azure Container Instances operator (#4121)
- [AIRFLOW-4107] instrument executor (#4928)
- [AIRFLOW-4033] record stats of task duration (#4858)
- [AIRFLOW-3892] Create Redis pub sub sensor (#4712)
- [AIRFLOW-4124] add get_table and get_table_location in aws_glue_hook and tests (#4942)
- [AIRFLOW-1262] Adds missing docs for email configuration (#4557)
- [AIRFLOW-3701] Add Google Cloud Vision Product Search operators (#4665)
- [AIRFLOW-3766] Add support for kubernetes annotations (#4589)
- [AIRFLOW-3741] Add extra config to Oracle hook (#4584)
- [AIRFLOW-1262] Allow configuration of email alert subject and body (#2338)
- [AIRFLOW-2985] Operators for S3 object copying/deleting (#3823)
- [AIRFLOW-2993] s3_to_sftp and sftp_to_s3 operators (#3828)
- [AIRFLOW-3799] Add compose method to GoogleCloudStorageHook (#4641)
- [AIRFLOW-3218] add support for poking a whole DAG (#4058)
- [AIRFLOW-3315] Add ImapAttachmentSensor (#4161)
- [AIRFLOW-2780] Add IMAP Hook to retrieve email attachments (#4119)
- [AIRFLOW-3556] Add cross join set dependency function (#4356)
Improvements
""""""""""""
- [AIRFLOW-3823] Exclude branch's downstream tasks from the tasks to skip (#4666)
- [AIRFLOW-3274] Add run_as_user and fs_group options for Kubernetes (#4648)
- [AIRFLOW-4247] Template Region on the DataprocOperators (#5046)
- [AIRFLOW-4008] add envFrom for Kubernetes Executor (#4952)
- [AIRFLOW-3947] Flash msg for no DAG-level access error (#4767)
- [AIRFLOW-3287] Moving database clean-up code into the CoreTest.tearDown() (#4122)
- [AIRFLOW-4058] Name models test file to get automatically picked up (#4901)
- [AIRFLOW-3830] Remove DagBag from /dag_details (#4831)
- [AIRFLOW-3596] Clean up undefined template variables. (#4401)
- [AIRFLOW-3573] Remove DagStat table (#4378)
- [AIRFLOW-3623] Fix bugs in Download task logs (#5005)
- [AIRFLOW-4173] Improve SchedulerJob.process_file() (#4993)
- [AIRFLOW-3540] Warn if old airflow.cfg file is found (#5006)
- [AIRFLOW-4000] Return response when no file (#4822)
- [AIRFLOW-3383] Rotate fernet keys. (#4225)
- [AIRFLOW-3003] Pull the krb5 image instead of building (#3844)
- [AIRFLOW-3862] Check types with mypy. (#4685)
- [AIRFLOW-251] Add option SQL_ALCHEMY_SCHEMA parameter to specify schema for metadata (#4199)
- [AIRFLOW-1814] Templatize PythonOperator {op_args,op_kwargs} fields (#4691)
- [AIRFLOW-3730] Standardize use of logging mechanisms (#4556)
- [AIRFLOW-3770] Validation of documentation on CI (#4593)
- [AIRFLOW-3866] Run docker-compose pull silently in CI (#4688)
- [AIRFLOW-3685] Move licence header check (#4497)
- [AIRFLOW-3670] Add stages to Travis build (#4477)
- [AIRFLOW-3937] KubernetesPodOperator support for envFrom configMapRef and secretRef (#4772)
- [AIRFLOW-3408] Remove outdated info from Systemd Instructions (#4269)
- [AIRFLOW-3202] add missing documentation for AWS hooks/operator (#4048)
- [AIRFLOW-3908] Add more Google Cloud Vision operators (#4791)
- [AIRFLOW-2915] Add example DAG for GoogleCloudStorageToBigQueryOperator (#3763)
- [AIRFLOW-3062] Add Qubole in integration docs (#3946)
- [AIRFLOW-3288] Add SNS integration (#4123)
- [AIRFLOW-3148] Remove unnecessary arg "parameters" in RedshiftToS3Transfer (#3995)
- [AIRFLOW-3049] Add extra operations for Mongo hook (#3890)
- [AIRFLOW-3559] Add missing options to DatadogHook. (#4362)
- [AIRFLOW-1191] Simplify override of spark submit command. (#4360)
- [AIRFLOW-3155] Add ability to filter by a last modified time in GCS Operator (#4008)
- [AIRFLOW-2864] Fix docstrings for SubDagOperator (#3712)
- [AIRFLOW-4062] Improve docs on install extra package commands (#4966)
- [AIRFLOW-3743] Unify different methods of working out AIRFLOW_HOME (#4705)
- [AIRFLOW-4002] Option to open debugger on errors in `airflow test`. (#4828)
- [AIRFLOW-3997] Extend Variable.get so it can return None when var not found (#4819)
- [AIRFLOW-4009] Fix docstring issue in GCSToBQOperator (#4836)
- [AIRFLOW-3980] Unify logger (#4804)
- [AIRFLOW-4076] Correct port type of beeline_default in init_db (#4908)
- [AIRFLOW-4046] Add validations for poke_interval & timeout for Sensor (#4878)
- [AIRFLOW-3744] Abandon the use of obsolete aliases of methods (#4568)
- [AIRFLOW-3865] Add API endpoint to get python code of dag by id (#4687)
- [AIRFLOW-3516] Support to create k8 worker pods in batches (#4434)
- [AIRFLOW-2843] Add flag in ExternalTaskSensor to check if external DAG/task exists (#4547)
- [AIRFLOW-2224] Add support for CSV files in MySqlToGoogleCloudStorageOperator (#4738)
- [AIRFLOW-3895] GoogleCloudStorageHook/Op create_bucket takes optional resource params (#4717)
- [AIRFLOW-3950] Improve AirflowSecurityManager.update_admin_perm_view (#4774)
- [AIRFLOW-4006] Make better use of Set in AirflowSecurityManager (#4833)
- [AIRFLOW-3917] Specify alternate kube config file/context when running out of cluster (#4859)
- [AIRFLOW-3911] Change Harvesting DAG parsing results to DEBUG log level (#4729)
- [AIRFLOW-3584] Use ORM DAGs for index view. (#4390)
- [AIRFLOW-2821] Refine Doc "Plugins" (#3664)
- [AIRFLOW-3561] Improve queries (#4368)
- [AIRFLOW-3600] Remove dagbag from trigger (#4407)
- [AIRFLOW-3713] Updated documentation for GCP optional project_id (#4541)
- [AIRFLOW-2767] Upgrade gunicorn to 19.5.0 to avoid moderate-severity CVE (#4795)
- [AIRFLOW-3795] provide_context param is now used (#4735)
- [AIRFLOW-4012] Upgrade tabulate to 0.8.3 (#4838)
- [AIRFLOW-3623] Support download logs by attempts from UI (#4425)
- [AIRFLOW-2715] Use region setting when launching Dataflow templates (#4139)
- [AIRFLOW-3932] Update unit tests and documentation for safe mode flag. (#4760)
- [AIRFLOW-3932] Optionally skip dag discovery heuristic. (#4746)
- [AIRFLOW-3258] K8S executor environment variables section. (#4627)
- [AIRFLOW-3931] set network, subnetwork when launching dataflow template (#4744)
- [AIRFLOW-4095] Add template_fields for S3CopyObjectOperator & S3DeleteObjectsOperator (#4920)
- [AIRFLOW-2798] Remove needless code from models.py
- [AIRFLOW-3731] Constrain mysqlclient to <1.4 (#4558)
- [AIRFLOW-3139] include parameters into log.info in SQL operators, if any (#3986)
- [AIRFLOW-3174] Refine Docstring for SQL Operators & Hooks (#4043)
- [AIRFLOW-3933] Fix various typos (#4747)
- [AIRFLOW-3905] Allow using "parameters" in SqlSensor (#4723)
- [AIRFLOW-2761] Parallelize enqueue in celery executor (#4234)
- [AIRFLOW-3540] Respect environment config when looking up config file. (#4340)
- [AIRFLOW-2156] Parallelize Celery Executor task state fetching (#3830)
- [AIRFLOW-3702] Add backfill option to run backwards (#4676)
- [AIRFLOW-3821] Add replicas logic to GCP SQL example DAG (#4662)
- [AIRFLOW-3547] Fixed Jinja templating in SparkSubmitOperator (#4347)
- [AIRFLOW-3647] Add archives config option to SparkSubmitOperator (#4467)
- [AIRFLOW-3802] Updated documentation for HiveServer2Hook (#4647)
- [AIRFLOW-3817] Corrected task ids returned by BranchPythonOperator to match the dummy operator ids (#4659)
- [AIRFLOW-3782] Clarify docs around celery worker_autoscale in default_airflow.cfg (#4609)
- [AIRFLOW-1945] Add Autoscale config for Celery workers (#3989)
- [AIRFLOW-3590] Change log message of executor exit status (#4616)
- [AIRFLOW-3591] Fix start date, end date, duration for rescheduled tasks (#4502)
- [AIRFLOW-3709] Validate `allowed_states` for ExternalTaskSensor (#4536)
- [AIRFLOW-3522] Add support for sending Slack attachments (#4332)
- [AIRFLOW-3569] Add "Trigger DAG" button in DAG page (/www only) (#4373)
- [AIRFLOW-3569] Add "Trigger DAG" button in DAG page (/www_rbac only) (#4373)
- [AIRFLOW-3044] Dataflow operators accept templated job_name param (#3887)
- [AIRFLOW-3023] Fix docstring datatypes
- [AIRFLOW-2928] Use uuid4 instead of uuid1 (#3779)
- [AIRFLOW-2988] Run specifically python2 for dataflow (#3826)
- [AIRFLOW-3697] Vendorize nvd3 and slugify (#4513)
- [AIRFLOW-3692] Remove ENV variables to avoid GPL (#4506)
- [AIRFLOW-3907] Upgrade flask and set cookie security flags. (#4725)
- [AIRFLOW-3698] Add documentation for AWS Connection (#4514)
- [AIRFLOW-3616][AIRFLOW-1215] Add aliases for schema with underscore (#4523)
- [AIRFLOW-3375] Support returning multiple tasks with BranchPythonOperator (#4215)
- [AIRFLOW-3742] Fix handling of "fallback" for AirflowConfigParser.getint/boolean (#4674)
- [AIRFLOW-3742] Respect the `fallback` arg in airflow.configuration.get (#4567)
- [AIRFLOW-3789] Fix flake8 3.7 errors. (#4617)
- [AIRFLOW-3602] Improve ImapHook handling of retrieving no attachments (#4475)
- [AIRFLOW-3631] Update flake8 and fix lint. (#4436)
Bug fixes
"""""""""
- [AIRFLOW-4248] Fix 'FileExistsError' makedirs race in file_processor_handler (#5047)
- [AIRFLOW-4240] State-changing actions should be POST requests (#5039)
- [AIRFLOW-4246] Flask-Oauthlib needs downstream dependencies pinning due to breaking changes (#5045)
- [AIRFLOW-3887] Downgrade dagre-d3 to 0.4.18 (#4713)
- [AIRFLOW-3419] Fix S3Hook.select_key on Python3 (#4970)
- [AIRFLOW-4127] Correct AzureContainerInstanceHook._get_instance_view's return (#4945)
- [AIRFLOW-4172] Fix changes for driver class path option in Spark Submit (#4992)
- [AIRFLOW-3615] Preserve case of UNIX socket paths in Connections (#4591)
- [AIRFLOW-3417] ECSOperator: pass platformVersion only for FARGATE launch type (#4256)
- [AIRFLOW-3884] Fixing doc checker, no warnings allowed anymore and fixed the current… (#4702)
- [AIRFLOW-2652] implement / enhance baseOperator deepcopy
- [AIRFLOW-4001] Update docs about how to run tests (#4826)
- [AIRFLOW-3699] Speed up Flake8 (#4515)
- [AIRFLOW-4160] Fix redirecting of 'Trigger Dag' Button in DAG Page (#4982)
- [AIRFLOW-3650] Skip running on mysql for the flaky test (#4457)
- [AIRFLOW-3423] Fix mongo hook to work with anonymous access (#4258)
- [AIRFLOW-3982] Fix race condition in CI test (#4968)
- [AIRFLOW-3982] Update DagRun state based on its own tasks (#4808)
- [AIRFLOW-3737] Kubernetes executor cannot handle long dag/task names (#4636)
- [AIRFLOW-3945] Stop inserting row when permission views unchanged (#4764)
- [AIRFLOW-4123] Add Exception handling for _change_state method in K8 Executor (#4941)
- [AIRFLOW-3771] Minor refactor securityManager (#4594)
- [AIRFLOW-987] pass kerberos cli args keytab and principal to kerberos.run() (#4238)
- [AIRFLOW-3736] Allow int value in SqoopOperator.extra_import_options (#4906)
- [AIRFLOW-4063] Fix exception string in BigQueryHook [2/2] (#4902)
- [AIRFLOW-4063] Fix exception string in BigQueryHook (#4899)
- [AIRFLOW-4037] Log response in SimpleHttpOperator even if the response check fails
- [AIRFLOW-4044] The documentation of `query_params` in `BigQueryOperator` is wrong. (#4876)
- [AIRFLOW-4015] Make missing API endpoints available in classic mode
- [AIRFLOW-3153] Send DAG processing stats to statsd (#4748)
- [AIRFLOW-2966] Catch ApiException in the Kubernetes Executor (#4209)
- [AIRFLOW-4129] Escape HTML in generated tooltips (#4950)
- [AIRFLOW-4070] AirflowException -> log.warning for duplicate task dependencies (#4904)
- [AIRFLOW-4054] Fix assertEqualIgnoreMultipleSpaces util & add tests (#4886)
- [AIRFLOW-3239] Fix test recovery further (#4074)
- [AIRFLOW-4053] Fix KubePodOperator Xcom on Kube 1.13.0 (#4883)
- [AIRFLOW-2961] Refactor tests.BackfillJobTest.test_backfill_examples test (#3811)
- [AIRFLOW-3606] Fix Flake8 test & fix the Flake8 errors introduced since Flake8 test was broken (#4415)
- [AIRFLOW-3543] Fix deletion of DAG with rescheduled tasks (#4646)
- [AIRFLOW-2548] Output plugin import errors to web UI (#3930)
- [AIRFLOW-4019] Fix AWS Athena Sensor object has no attribute 'mode' (#4844)
- [AIRFLOW-3758] Fix circular import in WasbTaskHandler (#4601)
- [AIRFLOW-3706] Fix tooltip max-width by correcting ordering of CSS files (#4947)
- [AIRFLOW-4100] Correctly JSON escape data for tree/graph views (#4921)
- [AIRFLOW-3636] Fix a test introduced in #4425 (#4446)
- [AIRFLOW-3977] Add examples of trigger rules in doc (#4805)
- [AIRFLOW-2511] Fix improper failed session commit handling causing deadlocks (#4769)
- [AIRFLOW-3962] Added graceful handling for creation of dag_run of a dag which doesn't have any task (#4781)
- [AIRFLOW-3881] Correct to_csv row number (#4699)
- [AIRFLOW-3875] Simplify SlackWebhookHook code and change docstring (#4696)
- [AIRFLOW-3733] Don't raise NameError in HQL hook to_csv when no rows returned (#4560)
- [AIRFLOW-3734] Fix hql not run when partition is None (#4561)
- [AIRFLOW-3767] Correct bulk insert function (#4773)
- [AIRFLOW-4087] remove sudo in basetaskrunner on_finish (#4916)
- [AIRFLOW-3768] Escape search parameter in pagination controls (#4911)
- [AIRFLOW-4045] Fix hard-coded URLs in FAB-based UI (#4914)
- [AIRFLOW-3123] Use a stack for DAG context management (#3956)
- [AIRFLOW-3060] DAG context manager fails to exit properly in certain circumstances
- [AIRFLOW-3924] Fix try number in alert emails (#4741)
- [AIRFLOW-4083] Add tests for link generation utils (#4912)
- [AIRFLOW-2190] Send correct HTTP status for base_url not found (#4910)
- [AIRFLOW-4015] Add get_dag_runs GET endpoint to "classic" API (#4884)
- [AIRFLOW-3239] Enable existing CI tests (#4131)
- [AIRFLOW-1390] Update Alembic to 0.9 (#3935)
- [AIRFLOW-3885] Fix race condition in scheduler test (#4737)
- [AIRFLOW-3885] ~10x speed-up of SchedulerJobTest suite (#4730)
- [AIRFLOW-3780] Fix some incorrect URLs when base_url is used (#4643)
- [AIRFLOW-3807] Fix Graph View Highlighting of Tasks (#4653)
- [AIRFLOW-3009] Import Hashable from collection.abc to fix Python 3.7 deprecation warning (#3849)
- [AIRFLOW-2231] Fix relativedelta DAG schedule_interval (#3174)
- [AIRFLOW-2641] Fix MySqlToHiveTransfer to handle MySQL DECIMAL correctly
- [AIRFLOW-3751] Option to allow malformed schemas for LDAP authentication (#4574)
- [AIRFLOW-2888] Add deprecation path for task_runner config change (#4851)
- [AIRFLOW-2930] Fix celery executor scheduler crash (#3784)
- [AIRFLOW-2888] Remove shell=True and bash from task launch (#3740)
- [AIRFLOW-3885] ~2.5x speed-up for backfill tests (#4731)
- [AIRFLOW-3885] ~20x speed-up of slowest unit test (#4726)
- [AIRFLOW-2508] Handle non string types in Operators templatized fields (#4292)
- [AIRFLOW-3792] Fix validation in BQ for useLegacySQL & queryParameters (#4626)
- [AIRFLOW-3749] Fix Edit Dag Run page when using RBAC (#4613)
- [AIRFLOW-3801] Fix DagBag collect dags invocation to prevent examples to be loaded (#4677)
- [AIRFLOW-3774] Register blueprints with RBAC web app (#4598)
- [AIRFLOW-3719] Handle StopIteration in CloudWatch logs retrieval (#4516)
- [AIRFLOW-3108] Define get_autocommit method for MsSqlHook (#4525)
- [AIRFLOW-3074] Add relevant ECS options to ECS operator. (#3908)
- [AIRFLOW-3353] Upgrade Redis client (#4834)
- [AIRFLOW-3250] Fix for Redis Hook for not authorised connection calls (#4090)
- [AIRFLOW-2009] Fix dataflow hook connection-id (#4563)
- [AIRFLOW-2190] Fix TypeError when returning 404 (#4596)
- [AIRFLOW-2876] Update Tenacity to 4.12 (#3723)
- [AIRFLOW-3923] Update flask-admin dependency to 1.5.3 to resolve security vulnerabilities from safety (#4739)
- [AIRFLOW-3683] Fix formatting of error message for invalid TriggerRule (#4490)
- [AIRFLOW-2787] Allow is_backfill to handle NULL DagRun.run_id (#3629)
- [AIRFLOW-3639] Fix request creation in Jenkins Operator (#4450)
- [AIRFLOW-3779] Don't install enum34 backport when not needed (#4620)
- [AIRFLOW-3079] Improve migration scripts to support MSSQL Server (#3964)
- [AIRFLOW-2735] Use equality, not identity, check for detecting AWS Batch failures
- [AIRFLOW-2706] AWS Batch Operator should use top-level job state to determine status
- [AIRFLOW-XXX] Fix typo in http_operator.py
- [AIRFLOW-XXX] Solve lodash security warning (#4820)
- [AIRFLOW-XXX] Pin version of tornado pulled in by Celery. (#4815)
- [AIRFLOW-XXX] Upgrade FAB to 1.12.3 (#4694)
- [AIRFLOW-XXX] Pin pinotdb dependency (#4704)
- [AIRFLOW-XXX] Pin version of Pip in tests to work around pypa/pip#6163 (#4576)
- [AIRFLOW-XXX] Fix spark submit hook KeyError (#4578)
- [AIRFLOW-XXX] Pin psycopg2 due to breaking change (#5036)
- [AIRFLOW-XXX] Pin Sendgrid dep. (#5031)
- [AIRFLOW-XXX] Fix flaky test - test_execution_unlimited_parallelism (#4988)
Misc/Internal
"""""""""""""
- [AIRFLOW-4144] add description of is_delete_operator_pod (#4943)
- [AIRFLOW-3476,3477] Move Kube classes out of models.py (#4443)
- [AIRFLOW-3464] Move SkipMixin out of models.py (#4386)
- [AIRFLOW-3463] Move Log out of models.py (#4639)
- [AIRFLOW-3458] Move connection tests (#4680)
- [AIRFLOW-3461] Move TaskFail out of models.py (#4630)
- [AIRFLOW-3462] Move TaskReschedule out of models.py (#4618)
- [AIRFLOW-3474] Move SlaMiss out of models.py (#4608)
- [AIRFLOW-3475] Move ImportError out of models.py (#4383)
- [AIRFLOW-3459] Move DagPickle to separate file (#4374)
- [AIRFLOW-3925] Don't pull docker-images on pretest (#4740)
- [AIRFLOW-4154] Correct string formatting in jobs.py (#4972)
- [AIRFLOW-3458] Deprecation path for moving models.Connection
- [AIRFLOW-3458] Move models.Connection into separate file (#4335)
- [AIRFLOW-XXX] Remove old/non-test files that nose ignores (#4930)
Doc-only changes
""""""""""""""""
- [AIRFLOW-3996] Add view source link to included fragments
- [AIRFLOW-3811] automatic generation of API Reference in docs (#4788)
- [AIRFLOW-3810] Remove duplicate autoclass directive (#4656)
- [AIRFLOW-XXX] Mention that statsd must be installed to gather metrics (#5038)
- [AIRFLOW-XXX] Add contents to cli (#4825)
- [AIRFLOW-XXX] fix check docs failure on CI (#4998)
- [AIRFLOW-XXX] Fix syntax docs errors (#4789)
- [AIRFLOW-XXX] Docs rendering improvement (#4684)
- [AIRFLOW-XXX] Automatically link Jira/GH on doc's changelog page (#4587)
- [AIRFLOW-XXX] Mention Oracle in the Extra Packages documentation (#4987)
- [AIRFLOW-XXX] Drop deprecated sudo option; use default docker compose on Travis. (#4732)
- [AIRFLOW-XXX] Update kubernetes.rst docs (#3875)
- [AIRFLOW-XXX] Improvements to formatted content in documentation (#4835)
- [AIRFLOW-XXX] Add Daniel to committer list (#4961)
- [AIRFLOW-XXX] Add Xiaodong Deng to committers list
- [AIRFLOW-XXX] Add history of becoming an ASF top-level project (#4757)
- [AIRFLOW-XXX] Move out the examples from integration.rst (#4672)
- [AIRFLOW-XXX] Extract reverse proxy info to a separate file (#4657)
- [AIRFLOW-XXX] Reduction of the number of warnings in the documentation (#4585)
- [AIRFLOW-XXX] Fix GCS Operator docstrings (#4054)
- [AIRFLOW-XXX] Fix Docstrings in Hooks, Sensors & Operators (#4137)
- [AIRFLOW-XXX] Split guide for operators to multiple files (#4814)
- [AIRFLOW-XXX] Split connection guide to multiple files (#4824)
- [AIRFLOW-XXX] Remove almost all warnings from building docs (#4588)
- [AIRFLOW-XXX] Add backreference in docs between operator and integration (#4671)
- [AIRFLOW-XXX] Improve linking to classes (#4655)
- [AIRFLOW-XXX] Mock optional modules when building docs (#4586)
- [AIRFLOW-XXX] Update plugin macros documentation (#4971)
- [AIRFLOW-XXX] Add missing docstring for 'autodetect' in GCS to BQ Operator (#4979)
- [AIRFLOW-XXX] Add missing GCP operators to Docs (#4260)
- [AIRFLOW-XXX] Fixing the issue in Documentation (#3756)
- [AIRFLOW-XXX] Add Hint at user defined macros (#4885)
- [AIRFLOW-XXX] Correct schedule_interval in Scheduler docs (#4157)
- [AIRFLOW-XXX] Improve airflow-jira script to make RelManager's life easier (#4857)
- [AIRFLOW-XXX] Add missing class references to docs (#4644)
- [AIRFLOW-XXX] Fix typo (#4564)
- [AIRFLOW-XXX] Add a doc about fab security (#4595)
- [AIRFLOW-XXX] Speed up DagBagTest cases (#3974)
Airflow 1.10.2, 2019-01-19
--------------------------


@@ -56,18 +56,216 @@ deprecated GCP conn_id, you need to explicitly pass their conn_id into
operators/hooks. Otherwise, ``google_cloud_default`` will be used as GCP's conn_id
by default.
### Viewer won't have edit permissions on DAG view.
### Removed deprecated import mechanism
The deprecated import mechanism has been removed so the import of modules becomes more consistent and explicit.
For example: `from airflow.operators import BashOperator`
becomes `from airflow.operators.bash_operator import BashOperator`
### Changes to sensor imports
Sensors are now accessible via `airflow.sensors` and no longer via `airflow.operators.sensors`.
For example: `from airflow.operators.sensors import BaseSensorOperator`
becomes `from airflow.sensors.base_sensor_operator import BaseSensorOperator`
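A minimal before/after sketch covering both this and the operator-import change above:

```python
# Old style, removed:
#   from airflow.operators import BashOperator
#   from airflow.operators.sensors import BaseSensorOperator

# New style, explicit module paths:
from airflow.operators.bash_operator import BashOperator
from airflow.sensors.base_sensor_operator import BaseSensorOperator
```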
### Renamed "extra" requirements for cloud providers
Subpackages for specific services have been combined into one variant for
each cloud provider. The name of the subpackage for the Google Cloud Platform
has changed from `gcp_api` to `gcp` to follow the naming style of the other providers.
If you want to install integration for Microsoft Azure, then instead of
```
pip install 'apache-airflow[azure_blob_storage,azure_data_lake,azure_cosmos,azure_container_instances]'
```
you should execute `pip install 'apache-airflow[azure]'`
If you want to install integration for Amazon Web Services, then instead of
`pip install 'apache-airflow[s3,emr]'`, you should execute `pip install 'apache-airflow[aws]'`
If you want to install integration for Google Cloud Platform, then instead of
`pip install 'apache-airflow[gcp_api]'`, you should execute `pip install 'apache-airflow[gcp]'`.
The old way will work until the release of Airflow 2.1.
### Deprecate legacy UI in favor of FAB RBAC UI
Previously we maintained two versions of the UI, which was hard to do as we needed to implement and update the same feature
in both versions. With this change we've removed the older UI in favor of the Flask App Builder RBAC UI, and there is no need to set the
RBAC UI explicitly in the configuration now, as it is the only UI.
Please note that custom auth backends will need re-writing to target the new FAB-based UI.
As part of this change, a few configuration items in the `[webserver]` section are removed and no longer applicable,
including `authenticate`, `filter_by_owner`, `owner_mode`, and `rbac`.
### Remove run_duration
The `run_duration` option should no longer be used. It existed to restart the scheduler from time to time, but the scheduler is now more stable, so using this setting is discouraged and might cause an inconsistent state.
### New `dag_processor_manager_log_location` config option
The DAG parsing manager log is now written to a file by default; its location is
controlled by the new `dag_processor_manager_log_location` config option in the core section.
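As a sketch, the option can be set in `airflow.cfg` or through Airflow's standard `AIRFLOW__<SECTION>__<KEY>` environment-variable convention; the path below is only an example:

```bash
export AIRFLOW__CORE__DAG_PROCESSOR_MANAGER_LOG_LOCATION=/var/log/airflow/dag_processor_manager/dag_processor_manager.log
```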
### min_file_parsing_loop_time config option temporarily disabled
The scheduler.min_file_parsing_loop_time config option has been temporarily removed due to
some bugs.
### CLI Changes
The ability to manipulate users from the command line has been changed. `airflow create_user`, `airflow delete_user` and `airflow list_users` have been grouped into a single command `airflow users` with optional flags `--create`, `--list` and `--delete`.
Example Usage:
To create a new user:
```bash
airflow users --create --username jondoe --lastname doe --firstname jon --email jdoe@apache.org --role Viewer --password test
```
To list users:
```bash
airflow users --list
```
To delete a user:
```bash
airflow users --delete --username jondoe
```
To add a user to a role:
```bash
airflow users --add-role --username jondoe --role Public
```
To remove a user from a role:
```bash
airflow users --remove-role --username jondoe --role Public
```
### Unification of `do_xcom_push` flag
The `do_xcom_push` flag (a switch to push the result of an operator to XCom or not) appeared in different incarnations in different operators. Its function has been unified under a common name (`do_xcom_push`) on `BaseOperator`. This way it is also easy to globally disable pushing results to XCom.
See [AIRFLOW-3249](https://jira.apache.org/jira/browse/AIRFLOW-3249) to check if your operator was affected.
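A minimal sketch of the unified flag; the DAG id, dates and command are placeholder values:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

dag = DAG("xcom_flag_example", start_date=datetime(2019, 1, 1), schedule_interval=None)

task = BashOperator(
    task_id="no_xcom_example",
    bash_command="echo hello",
    do_xcom_push=False,  # the unified BaseOperator flag: don't push the result to XCom
    dag=dag,
)
```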
## Airflow 1.10.3
### RedisPy dependency updated to v3 series
If you are using the Redis Sensor or Hook you may have to update your code. See
[redis-py porting instructions] to check if your code might be affected (MSET,
MSETNX, ZADD, and ZINCRBY all were, but read the full doc).
[redis-py porting instructions]: https://github.com/andymccurdy/redis-py/tree/3.2.0#upgrading-from-redis-py-2x-to-30
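For illustration, one of the affected commands: redis-py 2.x `zadd` took score and member arguments, while the 3.x series takes a mapping of member to score (see the porting guide above for the full list of changes):

```python
import redis

r = redis.Redis()
# redis-py 2.x style, no longer accepted in 3.x:
#   r.zadd("scores", 1.0, "task-a")
# redis-py 3.x style, a mapping of member -> score:
r.zadd("scores", {"task-a": 1.0})
```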
### SLUGIFY_USES_TEXT_UNIDECODE or AIRFLOW_GPL_UNIDECODE no longer required
It is no longer required to set one of the environment variables to avoid
a GPL dependency. Airflow will now always use text-unidecode if unidecode
was not installed before.
### New `sync_parallelism` config option in celery section
The new `sync_parallelism` config option controls how many processes CeleryExecutor will use to
fetch celery task state in parallel. The default value is max(1, number of cores - 1).
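A minimal sketch of setting it, assuming the standard section/key environment-variable mapping:

```bash
# equivalent to sync_parallelism = 4 in the [celery] section of airflow.cfg
export AIRFLOW__CELERY__SYNC_PARALLELISM=4
```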
### Rename of BashTaskRunner to StandardTaskRunner
BashTaskRunner has been renamed to StandardTaskRunner. It is the default task runner
so you might need to update your config.
`task_runner = StandardTaskRunner`
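For example, assuming the option lives in the `[core]` section as in the default config:

```bash
# equivalent to task_runner = StandardTaskRunner in the [core] section of airflow.cfg
export AIRFLOW__CORE__TASK_RUNNER=StandardTaskRunner
```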
### Modification to config file discovery
If the `AIRFLOW_CONFIG` environment variable was not set and the
`~/airflow/airflow.cfg` file existed, airflow previously used
`~/airflow/airflow.cfg` instead of `$AIRFLOW_HOME/airflow.cfg`. Now airflow
will discover its config file using the `$AIRFLOW_CONFIG` and `$AIRFLOW_HOME`
environment variables rather than checking for the presence of a file.
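A sketch of making the lookup explicit so both variables point at the same place; the paths are placeholders:

```bash
export AIRFLOW_HOME=/opt/airflow
export AIRFLOW_CONFIG=/opt/airflow/airflow.cfg
```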
### New `dag_discovery_safe_mode` config option
If `dag_discovery_safe_mode` is enabled, only check files for DAGs if
they contain the strings "airflow" and "DAG". For backwards
compatibility, this option is enabled by default.
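As an illustration, disabling safe mode (assuming the option sits in the `[core]` section):

```bash
# equivalent to dag_discovery_safe_mode = False in airflow.cfg
export AIRFLOW__CORE__DAG_DISCOVERY_SAFE_MODE=False
```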
### Changes in Google Cloud Platform related operators
Most GCP-related operators now have an optional `project_id` parameter. If you do not specify it,
the project id configured in the
[GCP Connection](https://airflow.apache.org/howto/manage-connections.html#connection-type-gcp) is used.
An `AirflowException` is thrown if the `project_id` parameter is not specified and the
connection used has no project id defined. This change should be backwards compatible, as earlier
versions of the operators required `project_id` to be set.
Operators involved:
* GCP Compute Operators
    * GceInstanceStartOperator
    * GceInstanceStopOperator
    * GceSetMachineTypeOperator
* GCP Function Operators
    * GcfFunctionDeployOperator
* GCP Cloud SQL Operators
    * CloudSqlInstanceCreateOperator
    * CloudSqlInstancePatchOperator
    * CloudSqlInstanceDeleteOperator
    * CloudSqlInstanceDatabaseCreateOperator
    * CloudSqlInstanceDatabasePatchOperator
    * CloudSqlInstanceDatabaseDeleteOperator
Other GCP operators are unaffected.
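A hedged sketch with one of the listed operators; the zone and resource values are placeholders, and exact argument names should be checked against the operator you use:

```python
from airflow.contrib.operators.gcp_compute_operator import GceInstanceStartOperator

start_instance = GceInstanceStartOperator(
    task_id="gce_start",
    zone="europe-west1-b",      # placeholder
    resource_id="my-instance",  # placeholder
    # project_id omitted: falls back to the project id on the GCP connection
)
```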
### Changes in Google Cloud Platform related hooks
The change in GCP operators implies that the GCP Hooks for those operators now require keyword
parameters rather than positional ones in all methods where `project_id` is used. The methods
throw an explanatory exception if they are called using positional parameters.
Hooks involved:
* GceHook
* GcfHook
* CloudSqlHook
Other GCP hooks are unaffected.
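A sketch of the calling convention, using the compute hook; treat the exact method signature as an assumption:

```python
from airflow.contrib.hooks.gcp_compute_hook import GceHook

hook = GceHook()
# Keyword arguments are required wherever project_id participates:
hook.start_instance(zone="europe-west1-b", resource_id="my-instance",
                    project_id="my-project")
# A positional call such as hook.start_instance("europe-west1-b", ...) now
# raises an explanatory exception instead.
```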
### Changed behaviour of using default value when accessing variables
It's now possible to use `None` as a default value with the `default_var` parameter when getting a variable, e.g.
```python
foo = Variable.get("foo", default_var=None)
if foo is None:
    handle_missing_foo()
```
(Note: there is already `Variable.setdefault()` which may be helpful in some cases.)
This changes the behaviour if you previously explicitly provided `None` as a default value. If your code expects a `KeyError` to be thrown, then don't pass the `default_var` argument.
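A complementary sketch of the two behaviours:

```python
from airflow.models import Variable

value = Variable.get("missing_key", default_var=None)  # returns None if unset
try:
    Variable.get("missing_key")  # without default_var a missing key raises KeyError
except KeyError:
    pass  # handle the missing variable
```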
### Removal of `airflow_home` config setting
There were previously two ways of specifying the Airflow "home" directory
(`~/airflow` by default): the `AIRFLOW_HOME` environment variable, and the
`airflow_home` config setting in the `[core]` section.
If they had two different values, different parts of the code base would end up
with different values. The config setting has been deprecated; you should
remove the value from the config file and set the `AIRFLOW_HOME` environment
variable if you need to use a non-default value for this.
(Since this setting is used to calculate what config file to load, it is not
possible to keep just the config option)
### Change of two method signatures in `GCPTransferServiceHook`
The signature of the `create_transfer_job` method in `GCPTransferServiceHook`
@@ -125,204 +323,22 @@ The default value of `expected_statuses` is SUCCESS so that change is backwards
The class `GoogleCloudStorageToGoogleCloudStorageTransferOperator` has been moved from
`airflow.contrib.operators.gcs_to_gcs_transfer_operator` to `airflow.contrib.operators.gcp_transfer_operator`
and the class `S3ToGoogleCloudStorageTransferOperator` has been moved from
`airflow.contrib.operators.s3_to_gcs_transfer_operator` to `airflow.contrib.operators.gcp_transfer_operator`.
The change was made to keep all the operators related to GCS Transfer Services in one file.
The previous imports will continue to work until Airflow 2.0.
### Fixed typo in --driver-class-path in SparkSubmitHook
The `driver_classapth` argument to the SparkSubmit Hook and Operator was
generating `--driver-classpath` on the spark command line, but this isn't a
valid option to spark.
The argument has been renamed to `driver_class_path` and the option it
generates has been fixed.
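A brief sketch of the corrected argument on the operator side; the application path and jar are placeholders:

```python
from airflow.contrib.operators.spark_submit_operator import SparkSubmitOperator

submit = SparkSubmitOperator(
    task_id="spark_job",
    application="/path/to/app.py",            # placeholder
    driver_class_path="/opt/jars/extra.jar",  # renamed from the misspelled driver_classapth
)
```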
## Airflow 1.10.2