Commit Graph

5895 Commits

Author SHA1 Message Date
T. Tanay 4230404abc [AIRFLOW-3643] Add shebang to docs/start_doc_server.sh (#4650)
Since this script uses bash syntax, shebang needs to be added.
2019-02-07 11:32:48 +00:00
mans2singh ba66fa7e77 [AIRFLOW-3802] Updated documentation for HiveServer2Hook (#4647) 2019-02-06 21:44:36 -08:00
mans2singh a1d5b01b10 [AIRFLOW-3817] - Corrected task ids returned by BranchPythonOperator to match the dummy operator ids (#4659) 2019-02-06 21:43:13 -08:00
Andrew Stahlman 2dadee7a24 [AIRFLOW-3813] Add CLI commands to manage roles (#4658)
* [AIRFLOW-3813] Add CLI commands to manage roles

Here is the help text of the new command `airflow roles`:

    usage: airflow roles [-h] [-c] [-l] [role [role ...]]

    positional arguments:
      role          The name of a role

    optional arguments:
      -h, --help    show this help message and exit
      -c, --create  Create a new role
      -l, --list    List roles

Create is reentrant, i.e., it only adds a new role if it does not exist.
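
The "only adds a new role if it does not exist" behaviour can be sketched as follows. This is a minimal illustration using a plain in-memory set; `create_roles` and the role names are hypothetical and are not Airflow's actual API.

```python
def create_roles(existing_roles, requested):
    """Add each requested role only if it is missing.

    `existing_roles` is a hypothetical in-memory set standing in for the
    real role store. Returns the names that were actually created.
    """
    created = []
    for name in requested:
        if name not in existing_roles:
            existing_roles.add(name)
            created.append(name)
    return created


roles = {"Admin", "Viewer"}
first = create_roles(roles, ["Ops", "Viewer"])   # "Ops" is new, "Viewer" already exists
second = create_roles(roles, ["Ops", "Viewer"])  # running it again creates nothing
```

Because existing roles are skipped rather than rejected, the same call can safely be repeated, for example from a provisioning script.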

* Update docs on role creation
2019-02-06 20:34:50 -08:00
Andrew Stahlman 5275a8ff0e [AIRFLOW-2694] Declare permissions in DAG definition (#4642)
* [AIRFLOW-2694] Declare permissions in DAG definition

This PR adds support for declaratively assigning DAG-level permissions
to a role via the `DAG.__init__()` method.

When the DAG definition is evaluated and the `access_control` argument
is supplied, we update the permissions on the ViewMenu associated with
this DAG according to the following rules:

- If the role does not exist, we raise an exception.
- If the role exists, we ensure that it has the specified set of
  permissions on the DAG
- If any other permissions exist for the DAG that are not specified in
  `access_control`, we revoke them
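
The three rules above could be sketched roughly like this. It is an illustration over plain dictionaries, not the code from the PR; `plan_permission_sync` and the permission names are assumptions made for the example.

```python
def plan_permission_sync(known_roles, current, access_control):
    """Work out which DAG-level permissions to grant and which to revoke.

    known_roles:    set of role names that exist
    current:        dict of role -> set of permissions currently held on the DAG
    access_control: dict of role -> set of permissions declared for the DAG
    All three structures are hypothetical stand-ins for the real models.
    """
    # Rule 1: a role that does not exist is an error.
    unknown = sorted(set(access_control) - set(known_roles))
    if unknown:
        raise ValueError("unknown roles in access_control: %s" % ", ".join(unknown))

    grants, revokes = {}, {}
    for role in set(current) | set(access_control):
        have = set(current.get(role, ()))
        want = set(access_control.get(role, ()))
        # Rule 2: ensure the role has every declared permission.
        if want - have:
            grants[role] = want - have
        # Rule 3: revoke anything on the DAG that was not declared.
        if have - want:
            revokes[role] = have - want
    return grants, revokes


grants, revokes = plan_permission_sync(
    known_roles={"Ops", "Viewer"},
    current={"Viewer": {"can_dag_read", "can_dag_edit"}},
    access_control={"Ops": {"can_dag_read", "can_dag_edit"}, "Viewer": {"can_dag_read"}},
)
```

Separating the plan from its application keeps the rules easy to test without a database.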

* Move RBAC constants to break circular dependency

* Add license header

* Sync DAG permissions via CLI and /refresh* endpoints

Move the DAG-level permission syncing logic into
`AirflowSecurityManager.sync_perm_for_dag`, and trigger this method from
the CLI's `sync_perm` command and from the `/refresh` and `/refresh_all`
web endpoints.

* Default access_control to None
2019-02-06 11:49:19 -08:00
Kamil Breguła 6f122f4fc5 [AIRFLOW-XXX] Extract reverse proxy info to a separate file (#4657) 2019-02-06 18:45:46 +00:00
Kamil Breguła 2e19e1842a [AIRFLOW-3810] Remove duplicate autoclass directive (#4656) 2019-02-06 14:35:12 +00:00
BasPH e1d3df1999 [AIRFLOW-3476,3477] Move Kube classes out of models.py (#4443) 2019-02-06 09:55:16 +01:00
Ryo Okubo 6cfadcde7a [AIRFLOW-3814] Add exception details to warning log (#4651)
* Add exception details to warning log

* Fix log format
2019-02-05 21:54:43 -08:00
Kamil Breguła 40c14e5f77 [AIRFLOW-XXX] Add missing class references to docs (#4644) 2019-02-05 20:48:25 +00:00
gseva 7c6ce873f2 [AIRFLOW-XXX] Fixed note in plugins.rst (#4649)
Changing it to rst notation, so it stands out in read the docs.
2019-02-05 10:30:04 -08:00
andyh1203 0f02e45b7e [AIRFLOW-3463] Move Log out of models.py (#4639) 2019-02-01 22:09:41 -08:00
Fokko Driesprong f07f3a8831 [AIRFLOW-XXX] The execution_date is Pendulum 2019-02-01 12:44:02 +01:00
Stefan Seelmann ee5b8c2683 [AIRFLOW-3461] Move TaskFail out of models.py (#4630) 2019-01-31 12:01:52 -08:00
Xiaodong 11f5032527 [AIRFLOW-3782] Clarify docs around celery worker_autoscale in default_airflow.cfg (#4609)
Celery supports `autoscale` by accepting values in format "max_concurrency,min_concurrency".
But the default value in default_airflow.cfg is wrong, and the comment can be clearer.
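
A small sketch of how a value in the "max_concurrency,min_concurrency" format might be parsed and sanity-checked. `parse_worker_autoscale` is a hypothetical helper written for this illustration, not part of Airflow or Celery.

```python
def parse_worker_autoscale(value):
    """Split a "max_concurrency,min_concurrency" string into two integers.

    Hypothetical helper: the maximum comes first, the minimum second.
    """
    parts = [part.strip() for part in value.split(",")]
    if len(parts) != 2:
        raise ValueError("expected 'max_concurrency,min_concurrency', got %r" % value)
    max_concurrency, min_concurrency = int(parts[0]), int(parts[1])
    if min_concurrency > max_concurrency:
        raise ValueError("min_concurrency must not exceed max_concurrency: %r" % value)
    return max_concurrency, min_concurrency


limits = parse_worker_autoscale("16,12")
```

Writing the two numbers in the opposite order is the kind of mistake such a check would catch.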
2019-01-31 14:12:57 +00:00
Kamil Breguła d126d9ef21 [AIRFLOW-3730] Standarization use of logs mechanisms (#4556) 2019-01-31 11:32:23 +00:00
andyh1203 e1c0433127 [AIRFLOW-3471] Move XCom out of models.py (#4629) 2019-01-30 23:37:35 -08:00
Fokko Driesprong 05d43516fb [AIRFLOW-2876] Update Tenacity to 4.12 (#3723)
Tenacity 4.8 is not python 3.7 compatible because it contains
reserved keywords in the code
2019-01-30 22:59:49 -08:00
Tao Feng 0fef65a10f [AIRFLOW-XXX] Add a doc about fab security (#4595) 2019-01-30 22:50:13 -08:00
Stefan Seelmann 8e6bca1546 [AIRFLOW-3462] Move TaskReschedule out of models.py (#4618) 2019-01-30 14:31:37 -08:00
Andrew Stahlman cd4c61a2de [AIRFLOW-3787] Import/export users from JSON file (#4624)
* [AIRFLOW-3787] Import/export users from JSON file

Provide a CLI command to import or export users from a JSON file. The
CLI arguments are modeled after the import/export commands for Variables
and Pools.

Example Usage:

    airflow users -i users.json
    airflow users -e /tmp/exported-users.json

The import command will create any users that do not yet exist and
update any users that already exist. It never deletes users.

The format of the file produced by an export is compatible with the
import command, i.e., `import(export())` should succeed but have no
side-effects.
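
The create-or-update semantics and the `import(export())` round trip can be illustrated with a sketch like the one below. The record fields and both function names are assumptions for the example; the real file format is defined by the PR.

```python
import copy
import json


def export_users(users):
    """Serialise the user store (dict of username -> record) to a JSON list."""
    return json.dumps(sorted(users.values(), key=lambda u: u["username"]), indent=2)


def import_users(users, payload):
    """Create missing users and update changed ones; never delete anything."""
    created, updated = [], []
    for record in json.loads(payload):
        name = record["username"]
        if name not in users:
            users[name] = dict(record)
            created.append(name)
        elif users[name] != record:
            users[name] = dict(record)
            updated.append(name)
    return created, updated


users = {
    "alice": {"username": "alice", "email": "alice@example.com", "roles": ["Viewer"]},
}
snapshot = copy.deepcopy(users)

# Importing what was just exported should change nothing.
round_trip = import_users(users, export_users(users))
```

Importing a file that lacks an existing user leaves that user in place, which matches the "never deletes users" guarantee described above.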

* Add input file format to help text
2019-01-30 14:24:19 -08:00
Ash Berlin-Taylor 7ebecd677d [AIRFLOW-3779] Don't install enum34 backport when not needed (#4620)
https://setuptools.readthedocs.io/en/latest/setuptools.html#declaring-platform-specific-dependencies

Installing this in more recent versions causes an "AttributeError: module
'enum' has no attribute 'IntFlag'" in re.py
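
For illustration, a platform-specific dependency of the kind the linked setuptools documentation describes might look like the sketch below. The version pin and the helper function are assumptions for this example, not necessarily what the commit uses.

```python
import sys

# Environment marker: installers only pull in the backport on interpreters
# older than 3.4, where the standard-library `enum` module does not exist.
# The version pin is a placeholder.
INSTALL_REQUIRES = [
    'enum34~=1.1.6; python_version < "3.4"',
]


def needs_enum_backport(version_info=sys.version_info):
    """Mirror the marker's condition in plain Python (hypothetical helper)."""
    return tuple(version_info[:2]) < (3, 4)
```

On newer interpreters the marker evaluates to false, so the backport is never installed and cannot shadow the standard-library module.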
2019-01-30 11:19:41 -08:00
Drew J. Sonne 82c60a2040 [AIRFLOW-3774] Register blueprints with app (#4598) 2019-01-30 11:10:25 -08:00
Joshua Carp 26d775aa20 [AIRFLOW-3789] Fix flake8 3.7 errors. (#4617) 2019-01-30 16:52:19 +00:00
Stefan Seelmann 2f688f69cb AIRFLOW-3590: Change log message of executor exit status (#4616)
Try to make the log message clearer in the presence of rescheduled tasks,
i.e., that the task exited with 0/1, not the status of the task, without each
executor having to know about reschedule or other states we might introduce.
2019-01-29 21:44:10 +00:00
Ash Berlin-Taylor 0d64fd8aac [AIRFLOW-3742] Respect the `fallback` arg in airflow.configuration.get (#4567)
This argument is part of the API from our parent class, but we didn't
support it because of the various steps we perform in `get()` - this
makes it behave more like the parent class, and can simplify a few
instances in our code (I've only included one that I found here)
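
A rough sketch of what respecting `fallback` can look like in a `ConfigParser` subclass that adds its own lookup step. The class name and the `DEMO__` environment prefix are hypothetical; this is not Airflow's implementation.

```python
import configparser
import os


class EnvAwareConfig(configparser.ConfigParser):
    """ConfigParser whose get() checks an environment override first."""

    def get(self, section, option, **kwargs):
        # Extra lookup step (assumed for this sketch): DEMO__SECTION__OPTION.
        env_key = "DEMO__{}__{}".format(section.upper(), option.upper())
        if env_key in os.environ:
            return os.environ[env_key]
        # Pass keyword arguments such as `fallback` through to the parent,
        # so a missing section or option yields the fallback instead of raising.
        return super().get(section, option, **kwargs)


conf = EnvAwareConfig()
conf.read_string("[core]\nparallelism = 32\n")

from_file = conf.get("core", "parallelism")
from_fallback = conf.get("core", "missing_option", fallback="default")
```

Call sites that previously wrapped `get()` in a `try`/`except` for missing options can then pass `fallback=` instead.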
2019-01-29 11:56:08 -08:00
Tao Feng fc22c6efad [AIRFLOW-XXX] Update timezone doc (#4592) 2019-01-29 11:55:13 -08:00
Felix cd9d543b45 [AIRFLOW-3552] Add ImapToS3TransferOperator (#4476)
NOTE: This operator only transfers the latest attachment by name.
2019-01-29 15:05:24 +00:00
zhongjiajie fa09df5707 [AIRFLOW-3734] Fix hql not run when partition is None (#4561) 2019-01-29 12:49:06 +01:00
OmerJog 9f6e5463f8 [AIRFLOW-865] Configure FTP connection mode (#4535) 2019-01-29 11:23:39 +01:00
andyh1203 43e0010271 [AIRFLOW-3474] Move SlaMiss out of models.py (#4608) 2019-01-29 11:19:52 +01:00
Ryan Yuan 03ec4181f3 [AIRFLOW-3762] Add list_jobs to CLI (#4579)
* [AIRFLOW-3762] Add list_jobs to CLI

Add list_jobs to CLI

* [AIRFLOW-3762] Add list_jobs to CLI

Improve test_cli_list_jobs_with_args

*  [AIRFLOW-3762] Add list_jobs to CLI

Directly parse args.limit to list_jobs query

* [AIRFLOW-3762] Add list_jobs to CLI

Format list_jobs code
2019-01-28 13:38:10 -08:00
Anoop Kunjuraman dfffd7a0e9 [AIRFLOW-XXX] Add Capital One to the companies list (#4606) 2019-01-27 21:02:49 -08:00
Sid Anand 8f3982f815 [AIRFLOW-XXX] Add Tinder to the companies list (#4604) 2019-01-27 14:03:09 -08:00
Andrew Stahlman fa21d68d3c [AIRFLOW-3773] Fix /refresh_all endpoint (#4597)
* [AIRFLOW-3773] Fix /refresh_all endpoint

Call `sync_perm_for_dag` for each DAG in the DagBag (`dag_id` is a
required argument).

I looked for a test suite for the web UI, but it seems the existing
tests have all been disabled since the switch to FAB. I've created a new
class for FAB tests and added a test to exercise this `/refresh_all`
endpoint.

* Move tests to www/test_views.py

I didn't realize that we already had test scaffolding in place for
testing the FAB-based UI.
2019-01-27 09:22:06 -08:00
Kamil Breguła 61fb776c3b [AIRFLOW-XXX] Remove profiling link (#4602) 2019-01-27 09:20:59 -08:00
Kamil Breguła 1ab659f4cb [AIRFLOW-XXX] Remove almost all warnings from building docs (#4588) 2019-01-27 11:32:10 +00:00
Xiaodong 59cf865d84 [AIRFLOW-3761] Decommission User & Chart models & Update doc accordingly (#4577)
In master branch, we have already decommissioned the Flask-Admin UI.

In model definitions, User and Chart are only applicable for the
"old" UI based on Flask-Admin.
Hence we should decommission these two models as well.

Related docs are updated in this commit as well.
2019-01-27 00:44:19 -08:00
Tao Feng 5506ef05bc [AIRFLOW-XXX] Remove images related profiling doc (#4599) 2019-01-26 23:10:41 -08:00
Tao Feng 2f70347bdc [AIRFLOW-3771] Minor refactor securityManager (#4594) 2019-01-26 22:49:58 -08:00
Andrew Stahlman 40f4370324 [AIRFLOW-2190] Fix TypeError when returning 404 (#4596)
When processing HTTP response headers, gunicorn checks that the name of each
header is a string. Here's the relevant gunicorn code:

From gunicorn/http/wsgi.py, line 257

    def process_headers(self, headers):
        for name, value in headers:
            if not isinstance(name, string_types):
                raise TypeError('%r is not a string' % name)

In Python3, `string_types` is set to the built-in `str`. For Python 2,
it's set to `basestring`. Again, the relevant gunicorn code:

From gunicorn/six.py, line 38:

    if PY3:
        string_types = str,
        ...
    else:
        string_types = basestring,

On Python2 the `b''` syntax returns a `str`, but in Python3 it returns `bytes`.
`bytes` != `str`, so we get the following error when returning a 404 on
Python3:

    File "/usr/local/lib/python3.6/site-packages/airflow/www/app.py", line 166, in root_app
    resp(b'404 Not Found', [(b'Content-Type', b'text/plain')])
    File "/usr/local/lib/python3.6/site-packages/gunicorn/http/wsgi.py", line 261, in start_response
    self.process_headers(headers)
    File "/usr/local/lib/python3.6/site-packages/gunicorn/http/wsgi.py", line 268, in process_headers
    raise TypeError('%r is not a string' % name)
    TypeError: b'Content-Type' is not a string

Dropping the `b` prefix in favor of the single-quote string syntax should work
for both Python2 and 3, as demonstrated below:

    Python 3.7.2 (default, Jan 13 2019, 12:50:15)
    [Clang 10.0.0 (clang-1000.11.45.5)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> isinstance('foo', str)
    True

    Python 2.7.15 (default, Jan 12 2019, 21:43:48)
    [GCC 4.2.1 Compatible Apple LLVM 10.0.0 (clang-1000.11.45.5)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> isinstance('foo', basestring)
    True
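
As a Python 3 illustration of the fix, here is a minimal WSGI app that passes native strings for the status and headers, called through a stand-in `start_response` that applies the same kind of check as the gunicorn code quoted above. `not_found_app` and `call_app` are names made up for this sketch, not the code in `airflow/www/app.py`.

```python
def not_found_app(environ, start_response):
    """Minimal WSGI app that answers every request with a 404."""
    # Status and header names/values are native str, not bytes.
    start_response('404 Not Found', [('Content-Type', 'text/plain')])
    # The response body, by contrast, is still bytes under WSGI.
    return [b'Not found\n']


def call_app(app):
    """Call a WSGI app with a start_response that rejects non-str header names."""
    captured = {}

    def start_response(status, headers):
        for name, _value in headers:
            if not isinstance(name, str):
                raise TypeError('%r is not a string' % (name,))
        captured['status'] = status
        captured['headers'] = headers

    body = b''.join(app({}, start_response))
    return captured['status'], captured['headers'], body


status, headers, body = call_app(not_found_app)
```

Passing `b'Content-Type'` through the same helper raises the `TypeError` shown in the traceback.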
2019-01-26 16:13:43 -08:00
Felix aed71a7944 [AIRFLOW-3602] Improve ImapHook handling of retrieving no attachments (#4475) 2019-01-25 21:23:43 +00:00
Jongyoul Lee f27b5252fe [AIRFLOW-3216] HiveServer2Hook need a password with LDAP authentication (#4057) 2019-01-25 21:21:18 +00:00
Ryan Yuan db06d4fd71 [AIRFLOW-3490] Add BigQueryHook's Ability to Patch Table/View (#4299) 2019-01-25 21:18:01 +00:00
yangaws 31e1878ecd [AIRFLOW-3719] Handle StopIteration in CloudWatch logs retrieval (#4516) 2019-01-25 21:15:22 +00:00
Rémy Léone 140101815f [AIRFLOW-3764] Simplify chained comparisons in IF block (#4580) 2019-01-25 21:04:05 +00:00
Kaxil Naik b6f207ff77 [AIRFLOW-XXX] Removes Data Profiling docs as it is not supported in RBAC UI 2019-01-25 21:00:09 +00:00
Ash Berlin-Taylor 34e3485b7f [AIRFLOW-XXX] Mock optional modules when building docs (#4586) 2019-01-25 20:56:57 +00:00
Tao Feng c2f48ed6c8 [AIRFLOW-3745] Fix viewer not able to view dag details (#4569) 2019-01-25 09:41:26 -08:00
Anatoli Babenia 2c7bb17435 [AIRFLOW-XXX] Automatically link Jira/GH on doc's changelog page (#4587) 2019-01-25 13:50:41 +00:00