Commit Graph

5147 Commits

Author SHA1 Message Date
Tao feng b81bd08a33 [AIRFLOW-2538] Update faq doc on how to reduce airflow scheduler latency
Make sure you have checked _all_ steps below.

### JIRA
- [x] My PR addresses the following [Airflow JIRA](https://issues.apache.org/jira/browse/AIRFLOW/)
issues and references them in the PR title. For
example, "\[AIRFLOW-XXX\] My Airflow PR"
    - https://issues.apache.org/jira/browse/AIRFLOW-2538
    - In case you are fixing a typo in the
documentation you can prepend your commit with
\[AIRFLOW-XXX\], code changes always need a JIRA
issue.

### Description
- [x] Here are some details about my PR, including
screenshots of any UI changes:
Update the faq doc on how to reduce airflow
scheduler latency. This comes from our internal
production setting which also aligns with Maxime's
email (https://lists.apache.org/thread.html/%3CCAHEEp7WFAivyMJZ0N+0Zd1T3nvfyCJRudL3XSRLM4utSigR3dQmail.gmail.com%3E).

### Tests
- [ ] My PR adds the following unit tests __OR__
does not need testing for this extremely good
reason:

### Commits
- [ ] My commits all reference JIRA issues in
their subject lines, and I have squashed multiple
commits if they address the same issue. In
addition, my commits follow the guidelines from
"[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not
"adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

### Documentation
- [ ] In case of new functionality, my PR adds
documentation that describes how to use it.
    - When adding new operators/hooks/sensors, the
autoclass documentation generation needs to be
added.

### Code Quality
- [ ] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`

Closes #3434 from feng-tao/update_faq
2018-05-31 22:01:59 -07:00
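Note on the commit guidelines quoted in the PR template above: the mechanical rules (1, 2, 3 and 5) can be checked automatically. The following is a minimal sketch, not part of Airflow; the function name `check_commit_message` is hypothetical, and rules 4 and 6 are left out because they need human judgment.

```python
def check_commit_message(message):
    """Return a list of guideline violations found in a commit message."""
    problems = []
    lines = message.splitlines()
    subject = lines[0] if lines else ""
    body = lines[1:]

    # Rule 1: subject is separated from body by a blank line
    if body and body[0].strip():
        problems.append("subject is not followed by a blank line")
    # Rule 2: subject is limited to 50 characters
    if len(subject) > 50:
        problems.append("subject is longer than 50 characters")
    # Rule 3: subject does not end with a period
    if subject.endswith("."):
        problems.append("subject ends with a period")
    # Rule 5: body wraps at 72 characters
    if any(len(line) > 72 for line in body):
        problems.append("body has a line longer than 72 characters")
    return problems
```

An empty list means none of the four mechanical rules was violated.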
r39132 c6681681d4 closes apache/incubator-airflow#3310 *Fixed in another PR.* 2018-05-31 14:33:28 -07:00
Stefan Seelmann 5f79465bd8 [AIRFLOW-2529] Improve graph view performance and usability
Limit number of dag runs shown in drop down. Add base date
and number of runs widgets known from other views which
allows kind of paging through all dag runs.
2018-05-30 23:11:53 +02:00
Chao-Han Tsai 3ed25a9459 [AIRFLOW-2517] backfill support passing key values through CLI
### JIRA
- [x] My PR addresses the following [Airflow JIRA](https://issues.apache.org/jira/browse/AIRFLOW/)
issues and references them in the PR title. For
example, "\[AIRFLOW-XXX\] My Airflow PR"
    - https://issues.apache.org/jira/browse/AIRFLOW-2517
    - In case you are fixing a typo in the
documentation you can prepend your commit with
\[AIRFLOW-XXX\], code changes always need a JIRA
issue.

### Description
- [x] Here are some details about my PR, including
screenshots of any UI changes:
With backfill, we can now pass key-value pairs
through the CLI, and those pairs can be accessed
through macros. This is just like the way
`trigger_dag -c` works [1].

Let's walk through an example.

In the airflow CLI we specify a key-value pair.
```
airflow backfill hello_world -s 2018-02-01 -e 2018-02-08 -c '{"text": "some text"}'
```

In the DAG file, I have a `BashOperator` with a
templated command, and I want
`{{ dag_run.conf.text }}` to resolve to the text I
passed on the CLI.
```python
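# Import path as of Airflow 1.x; `dag` is assumed to be defined earlier in the DAG file.
from airflow.operators.bash_operator import BashOperator

# Jinja-templated Bash script; Airflow renders the {{ ... }} expressions at run time.
templated_command = """
    echo "ds = {{ ds }}"
    echo "prev_ds = {{ macros.datetime.strftime(prev_execution_date, "%Y-%m-%d") }}"
    echo "next_ds = {{ macros.datetime.strftime(next_execution_date, "%Y-%m-%d") }}"
    echo "text_through_conf = {{ dag_run.conf.text }}"
"""

bash_operator = BashOperator(
    task_id='bash_task',
    bash_command=templated_command,
    dag=dag,
)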
```
Rendered Bash command in Airflow UI.
<img width="1246" alt="screen shot 2018-05-22 at 4 33 59 pm" src="https://user-images.githubusercontent.com/6065051/40395666-04c41574-5dde-11e8-9ec2-c0312b7203e6.png">

[1] https://airflow.apache.org/cli.html#trigger_dag

### Tests
- [x] My PR adds the following unit tests __OR__
does not need testing for this extremely good
reason:

### Commits
- [x] My commits all reference JIRA issues in
their subject lines, and I have squashed multiple
commits if they address the same issue. In
addition, my commits follow the guidelines from
"[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not
"adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

### Documentation
- [x] In case of new functionality, my PR adds
documentation that describes how to use it.
    - When adding new operators/hooks/sensors, the
autoclass documentation generation needs to be
added.

### Code Quality
- [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`

Closes #3406 from milton0825/backfill-support-conf
2018-05-30 10:50:06 -07:00
roc 7945854cc4 [AIRFLOW-2532] Support logs_volume_subpath for KubernetesExecutor
The kubernetes section in the configuration file
supports logs_volume_subpath for KubernetesExecutor.

Closes #3430 from imroc/AIRFLOW-2532
2018-05-30 11:14:13 +02:00
George Leslie-Waksman 0e892ccd70 [AIRFLOW-2466] consider task_id in _change_state_for_tis_without_dagrun
Closes #3360 from gwax/AF2466
2018-05-30 10:59:01 +02:00
Craig Rodrigues 7c1d7db3db [AIRFLOW-2519] Fix CeleryExecutor with SQLAlchemy
When using a CeleryExecutor with SQLAlchemy
specified in broker_url, such as:

broker_url = sqla+mysql://airflow:airflow@localhost:3306/airflow

do not pass invalid options to the sqlalchemy
backend.

 - In default_airflow.cfg, comment out visibility_timeout from
   [celery_broker_transport_options]. The user can specify the
   correct values in this section for the celery broker transport
   that they choose. visibility_timeout is only valid
   for Redis and SQS celery brokers.

 - Move ssl options from [celery_broker_transport_options], where
   they were wrongly placed, into the [celery] section, where they
   belong.

Closes #3417 from rodrigc/AIRFLOW-2519
2018-05-30 10:34:41 +02:00
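Note on AIRFLOW-2519: the commit itself only changes the default configuration, but the rule it states (visibility_timeout is valid only for Redis and SQS brokers) can be illustrated with a small sketch. The helper `broker_transport_options` and the constant below are hypothetical and are not part of Airflow or Celery.

```python
# Assumption taken from the commit message: only Redis and SQS brokers
# understand the visibility_timeout transport option.
VISIBILITY_TIMEOUT_SCHEMES = ("redis", "sqs")


def broker_transport_options(broker_url, options):
    """Return a copy of `options` that is safe to pass for `broker_url`."""
    # The scheme is everything before "://", e.g. "sqla+mysql" or "redis"
    scheme = broker_url.split("://", 1)[0].lower()
    filtered = dict(options)
    if not scheme.startswith(VISIBILITY_TIMEOUT_SCHEMES):
        # Drop the option instead of handing it to a backend that rejects it
        filtered.pop("visibility_timeout", None)
    return filtered
```

With the SQLAlchemy broker_url from the commit message, the option is dropped; with a `redis://` URL it is kept.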
Kevin Yang 7c34354427 [AIRFLOW-2402] Fix RBAC task log
Closes #3319 from yrqls21/kevin_yang_fix_rbac_view
2018-05-29 20:46:01 +01:00
Marcelo Santino 9b661fa613 [AIRFLOW-XXX] Add M4U to user list
Closes #3426 from msantino/AIRFLOW-2437-add_m4u_to_users_list
2018-05-29 20:38:11 +01:00
Chao-Han Tsai d5d97dc971 [AIRFLOW-2536] docs about how to deal with airflow initdb failure
Add docs to faq.rst on how to deal with the error
"Exception: Global variable
explicit_defaults_for_timestamp needs to be on (1)
for mysql".

Closes #3429 from milton0825/fix-docs
2018-05-29 20:29:27 +01:00
Shintaro Murakami 11e670ddbc [AIRFLOW-2530] KubernetesOperator supports multiple clusters
Closes #3425 from mrkm4ntr/airflow-2530
2018-05-28 16:51:22 +02:00
Tatsiana Klionskaya f77a93191d [AIRFLOW-1499] Eliminate duplicate and unneeded code
Closes #2509 from bananarepublic/airflow-1499
2018-05-28 16:27:02 +02:00
Maxime Beauchemin 32d15a3480 [AIRFLOW-2521] backfill - make variable name and logging messages more accurate
[AIRFLOW-2521] backfill - make variable name and
logging messages more accurate

The term kicked_off in logging and the variable
started are used to
refer to `running` task instances. Let's clarify
the variable names and
messages here.

Fixing unit tests

Closes #3416 from mistercrunch/kicked_off_running
2018-05-28 16:25:22 +02:00
Tao feng 45c0c54792 [AIRFLOW-2429] Fix hook, macros folder flake8 error
Closes #3420 from feng-tao/flake8_p4
2018-05-28 16:23:44 +02:00
davideberdin 2f930b6718 [Airflow-XXX] add Prime to company list
Closes #3424 from davideberdin/master
2018-05-28 16:12:01 +02:00
Joy Gao 432ac718b1 Merge pull request #3421 from sekikn/AIRFLOW-2525 2018-05-25 10:18:13 -07:00
Kengo Seki dabf1b962d [AIRFLOW-2525] Fix PostgresHook.copy_expert to work with "COPY FROM"
PostgresHook.copy_expert currently supports
"COPY TO" but not "COPY FROM", because it
opens the file in write mode and doesn't
commit the operation. This PR fixes it by
opening the file in read-and-write mode
and committing at the end.
2018-05-25 10:52:14 -04:00
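Note on AIRFLOW-2525: the fix described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual hook code: `conn` is assumed to be a psycopg2-style connection whose cursor provides `copy_expert(sql, file)`, and the standalone function signature is hypothetical.

```python
from contextlib import closing


def copy_expert(conn, sql, filename):
    """Run a COPY statement against `filename` in either direction."""
    # 'r+' opens an existing file for both reading (COPY ... FROM STDIN)
    # and writing (COPY ... TO STDOUT)
    with open(filename, "r+") as f:
        with closing(conn.cursor()) as cur:
            cur.copy_expert(sql, f)
        # Commit so that rows loaded by COPY FROM are actually persisted
        conn.commit()
```

Note that `r+` requires the file to exist and does not truncate it, so a COPY TO into a longer existing file would leave old trailing content; handling that is left out of this sketch.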
Ash Berlin-Taylor ba84b6f4a9 Merge pull request #2701 from mrkm4ntr/airflow-1730 2018-05-25 09:53:51 +01:00
Kengo Seki c97ad43634 [AIRFLOW-2515] Add dependency on thrift_sasl to hive extra
This PR adds a dependency on thrift_sasl to hive
extra
so that HiveServer2Hook.get_conn() works.

Closes #3408 from sekikn/AIRFLOW-2515
2018-05-25 10:45:23 +02:00
Tim Swast 4c0d67f0d0 [AIRFLOW-2523] Add how-to for managing GCP connections
I'd like to have how-to guides for all connection
types, or at least the
different categories of connection types. I found
it difficult to figure
out how to manage a GCP connection, so this commit
adds a how-to guide for
it.

Also, since creating and editing connections
really aren't all that
different, the PR renames the "creating
connections" how-to to "managing
connections".

Closes #3419 from tswast/howto
2018-05-25 09:37:29 +01:00
Chao-Han Tsai 66f00bbf7b [AIRFLOW-2510] Introduce new macros: prev_ds and next_ds
Closes #3418 from milton0825/introduce-next_ds-prev_ds
2018-05-25 10:13:49 +02:00
Shintaro Murakami c6deeb2ff4 [AIRFLOW-1730] Unpickle value of XCom queried from DB 2018-05-25 15:22:09 +09:00
Kengo Seki e4e7b55ad7 [AIRFLOW-2518] Fix broken ToC links in integration.rst
Closes #3412 from sekikn/AIRFLOW-2518
2018-05-24 21:55:19 +01:00
Roberth Kulbin 2d50ba4336 [AIRFLOW-1472] Fix SLA misses triggering on skipped tasks.
Closes #3370 from milliburn/airflow-1472-master
2018-05-24 21:14:32 +01:00
Maxime Beauchemin 972086aeba [AIRFLOW-2520] CLI - make backfill less verbose
Used backfill recently and it would log a shit ton
of logging messages
telling me all the tasks that were not ready to
run at every tick.

These messages are not useful and should be muted
by default.

I understand that these messages may be helpful
for `airflow run`
when dependencies aren't met, so I
decided to control them with
a flag instead of simply demoting them to
`logging.debug`.

Closes #3414 from mistercrunch/backfill_less_verbose
2018-05-24 21:08:35 +01:00
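Note on AIRFLOW-2520: the "flag instead of `logging.debug`" approach described above could look roughly like this. It is a minimal sketch; the function name, its parameters, and the message text are hypothetical and do not come from the Airflow code base.

```python
import logging

log = logging.getLogger(__name__)


def report_unmet_dependencies(log, task_id, reasons, verbose=False):
    """Log why a task is not ready to run, but only when asked to."""
    if not verbose:
        # Muted by default, e.g. for backfill, which checks every tick
        return
    for reason in reasons:
        # Still logged at INFO level when the caller opts in
        log.info("Dependencies not met for %s: %s", task_id, reason)
```

A caller such as a single-task run could pass `verbose=True`, while a backfill loop would leave the default and stay quiet.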
ben.marengo 5747f58499 [AIRFLOW-2107] add time_partitioning to run_query on BigQueryBaseCursor
Closes #3043 from marengaz/query_time_part
2018-05-24 21:04:33 +01:00
Willem van Asperen 49556917d9 [AIRFLOW-1057][AIRFLOW-1380][AIRFLOW-2362][2362] AIRFLOW Update DockerOperator to new API
update import to docker's new API version >=2.0.0
changed dependency for docker package; now docker
rather than docker-py
updated test cases to align to new docker class

Closes #3407 from Noremac201/fixer
2018-05-24 10:25:22 +02:00
Max Payton ce9c7bbdfc [AIRFLOW-2415] Make airflow DAG templating render numbers
Currently, if you have an operator with a templated
field that is a dictionary, e.g.:
template_fields = ([dict_args])

and you populate that dictionary with a field that
is an integer in a DAG, e.g.:
...
dict_args = {'ds': '{{ ds }}', 'num_times': 5}
...

then Airflow will give you the following error:
{base_task_runner.py:95} INFO - Subtask: airflow.exceptions.AirflowException: Type '<type 'int'>' used for parameter 'dict_args[num_times]' is not supported for templating

This fix aims to resolve that issue by
returning numbers immediately without attempting
to template them.

Closes #3410 from ArgentFalcon/support_numeric_template_fields
2018-05-24 10:03:31 +02:00
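Note on AIRFLOW-2415: the behaviour described above, where numbers are returned as-is instead of being templated, can be sketched as a recursive renderer. This is an illustration only; `render_template` here is a standalone function, and `render_str` is a hypothetical stand-in for the real Jinja rendering step.

```python
import numbers


def render_template(content, render_str):
    """Recursively render templated strings, passing numbers through."""
    if isinstance(content, str):
        return render_str(content)
    if isinstance(content, numbers.Number):
        # Numbers have nothing to template, so return them untouched
        return content
    if isinstance(content, (list, tuple)):
        return [render_template(item, render_str) for item in content]
    if isinstance(content, dict):
        return {
            key: render_template(value, render_str)
            for key, value in content.items()
        }
    raise TypeError(
        "Type {!r} is not supported for templating".format(type(content))
    )
```

For the dictionary from the commit message, the `'{{ ds }}'` string is rendered while `5` stays an integer.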
Kengo Seki 357e46d27c [AIRFLOW-2473] Fix wrong skip condition for TransferTests
This PR fixes the wrong @skipUnlessImported
condition that decorates
TransferTests and does some minor refactoring.

Closes #3411 from sekikn/AIRFLOW-2473
2018-05-24 09:22:34 +02:00
r39132 cedcdb1f86 closes apache/incubator-airflow#3403 *Obsolete PR.* 2018-05-23 14:02:37 -07:00
Kengo Seki 3bdb34e777 [AIRFLOW-2472] Implement MySqlHook.bulk_dump
Implement MySqlHook.bulk_dump since the opposite
operation bulk_load is already implemented.
This PR also addresses some flake8 warnings.

Closes #3385 from sekikn/AIRFLOW-2472
2018-05-23 13:58:04 -07:00
milanvdm f1ac67bdc6 [AIRFLOW-2419] Use default view for subdag operator
Closes #3314 from milanvdm/milanvdm/subdag_view
2018-05-23 11:46:13 -07:00
Kaxil Naik 62b95f8bce [AIRFLOW-2498] Fix Unexpected argument in SFTP Sensor
- The SFTP sensor uses the SFTP hook and passes
`sftp_conn_id` as a keyword argument named
`sftp_conn_id`, which doesn't exist. The fix is to
drop the keyword and pass the value positionally,
so that it binds to the first parameter, which in
this case is `ftp_conn_id`.

Closes #3392 from kaxil/AIRFLOW-2498
2018-05-23 13:09:34 +01:00
Tim Swast 084bc91367 [AIRFLOW-2509] Separate config docs into how-to guides
Also moves how-to style instructions for logging
from "integration" page
to a "Writing Logs" how-to.

Closes #3400 from tswast/howto
2018-05-23 10:08:53 +01:00
Tao feng d52e9e6642 [AIRFLOW-2429] Add BaseExecutor back
Closes #3401 from feng-tao/quick_fix_airflow_2429
2018-05-23 00:14:44 +02:00
Tao feng 272952a9dc [AIRFLOW-2429] Fix dag, example_dags, executors flake8 error
Closes #3398 from feng-tao/flake8_p3
2018-05-22 15:31:29 +01:00
Kaxil Naik 1f0a717b65 [AIRFLOW-2502] Change Single triple quotes to double for docstrings
- Changed single triple quotes to double quote
characters to be consistent with the docstring
convention in PEP 257

Closes #3396 from kaxil/AIRFLOW-2502
2018-05-21 23:22:35 +02:00
Kaxil Naik 5d3242cbcf [AIRFLOW-2503] Fix broken links in CONTRIBUTING.md
- Fix broken links in `CONTRIBUTING.md`

Closes #3397 from kaxil/AIRFLOW-2503
2018-05-21 23:20:51 +02:00
Tim Swast 702e62411b [AIRFLOW-2501] Refer to devel instructions in docs contrib guide
Without the devel extra, the docs do not build.
The build fails because
the mock package is missing.

Closes #3395 from tswast/airflow-2501-docs-contributing
2018-05-21 20:58:00 +01:00
Kaxil Naik 06b62c42b0 [AIRFLOW-2429] Fix contrib folder's flake8 errors
Fix contrib/ folder's flake8 errors

Closes #3394 from kaxil/AIRFLOW-2429
2018-05-21 19:40:07 +01:00
Kengo Seki 1db3073374 [AIRFLOW-2471] Fix HiveCliHook.load_df to use unused parameters
This PR fixes HiveCliHook.load_df to pass
load_file the parameters called create and
recreate, which are currently ignored, as
part of kwargs.

Closes #3390 from sekikn/AIRFLOW-2471
2018-05-21 19:15:51 +01:00
Harish 8bad4c943d Added user in readme
Closes #3387 from harishbisht/master
2018-05-21 19:08:36 +01:00
Craig Rodrigues 4fc61a3518 [AIRFLOW-2495] Update celery to 4.1.1
This also updates the kombu dependency to >=
4.2.0.
This allows broker_url with sqlalchemy urls to
work.

Closes #3388 from rodrigc/AIRFLOW-2495
2018-05-21 16:20:11 +01:00
Tao feng da1f64cf80 [AIRFLOW-2429] Fix api, bin, config_templates folders flake8 error
Closes #3391 from feng-tao/flake8_p2
2018-05-21 16:13:23 +01:00
roc dc78b91967 [AIRFLOW-2493] Mark template_fields of all Operators in the API document as "templated"
Mark all the "template_fields" (Jinja templates)
of all Operators as "templated" in the API
documentation.

Closes #3386 from imroc/AIRFLOW-2493
2018-05-20 14:03:58 +01:00
Fokko Driesprong b755d35479 [AIRFLOW-2489] Update FlaskAppBuilder to 1.11.1
This will bring Airflow back up to date and will
allow us to run
Flask 0.12.4

Closes #3382 from Fokko/AIRFLOW-2489-update-dependencies
2018-05-20 12:00:52 +01:00
Kengo Seki 67b351183b [AIRFLOW-2448] Enhance HiveCliHook.load_df to work with datetime
HiveCliHook.load_df cannot currently handle a
DataFrame that contains datetime values.
This PR enhances it to work with datetime,
fixes a bug introduced by AIRFLOW-2441,
and addresses some flake8 issues.

Closes #3364 from sekikn/AIRFLOW-2448
2018-05-19 17:00:47 +01:00
Tao feng e48b8e36af [AIRFLOW-2487] Enhance druid ingestion hook
Closes #3380 from feng-tao/aiflow-2487
2018-05-19 14:42:13 +01:00
roc fff87b5cfd [AIRFLOW-2397] Support affinity policies for Kubernetes executor/operator
KubernetesPodOperator now accepts a dict type
parameter called "affinity", which represents a
group of affinity scheduling rules (nodeAffinity,
podAffinity, podAntiAffinity).

API reference: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.10/#affinity-v1-core

Closes #3369 from imroc/AIRFLOW-2397
2018-05-19 00:47:53 +02:00
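Note on AIRFLOW-2397: an "affinity" dict of the kind described above might look like the following. The structure follows the Kubernetes Affinity object from the linked API reference; the label keys and values (`disktype`, `ssd`, `app`, `airflow-worker`) are made up for illustration.

```python
# Hypothetical example values; only the key names come from the
# Kubernetes Affinity API object.
affinity = {
    "nodeAffinity": {
        # Hard requirement: only schedule on nodes labelled disktype=ssd
        "requiredDuringSchedulingIgnoredDuringExecution": {
            "nodeSelectorTerms": [
                {
                    "matchExpressions": [
                        {"key": "disktype", "operator": "In", "values": ["ssd"]}
                    ]
                }
            ]
        }
    },
    "podAntiAffinity": {
        # Hard requirement: avoid nodes already running a matching pod
        "requiredDuringSchedulingIgnoredDuringExecution": [
            {
                "labelSelector": {
                    "matchExpressions": [
                        {"key": "app", "operator": "In", "values": ["airflow-worker"]}
                    ]
                },
                "topologyKey": "kubernetes.io/hostname",
            }
        ]
    },
}
```

Per the commit message, such a dict would be passed to KubernetesPodOperator through its `affinity` parameter; the other constructor arguments are omitted here.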
Kaxil Naik 8482b208b5 [AIRFLOW-2482] Add test for rewrite method in GCS Hook
- Added a mock-based test for the rewrite method
of the GCS hook

Closes #3374 from kaxil/gcs-hook-test-rewrite
2018-05-19 00:44:10 +02:00