Cast the num_executors and executor_cores parameters to strings to
prevent issues when these parameters are passed as integers (as the
comments specify). Also fix a minor typo that breaks the use of the
num-executors param.
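A minimal sketch of the cast, assuming the hook assembles the
spark-sql command line from these parameters (the helper name and
argument handling are illustrative, not the exact diff):

```python
def build_spark_sql_args(num_executors, executor_cores):
    # cast to str so integer-valued parameters don't break when the
    # argument list is joined into a spark-sql invocation
    return [
        "--num-executors", str(num_executors),
        "--executor-cores", str(executor_cores),
    ]
```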
Closes #1694 from danielvdende/spark_sql_operator_bugfixes
Add a Pause/Resume toggle button to the DAG details page, so one does
not need to go back and forth between pages to view the details and
perform the action.
Closes #1818 from msumit/AIRFLOW-544
* Distinguish between module and non-module plugin components
* Fix handling of non-module plugin components
  * admin views, flask blueprints, and menu links must not be wrapped
    in modules
* Fix improper use of zope.deprecation.deprecated
  * zope.deprecation.deprecated does NOT support a class as its first
    parameter
  * deprecating a class must be handled by calling the deprecate
    function on the class name (see the sketch below this list)
* Added tests for plugin loading
* Updated plugin documentation to match the test plugin
* Updated executors to always load plugins
* More logging
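A hedged sketch of the corrected deprecation pattern, assuming the
zope.deprecation API described above (the class names are
hypothetical):

```python
from zope.deprecation import deprecated

class NewHook(object):  # hypothetical replacement class
    pass

OldHook = NewHook
# zope.deprecation.deprecated does not accept the class object itself;
# pass the *name* of the module-level attribute to deprecate instead:
deprecated('OldHook', "OldHook is deprecated; use NewHook instead.")
```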
Closes #1738 from gwax/plugin_module_fixes
Dear Airflow Maintainers,
Please accept this PR that addresses the following
issues:
https://issues.apache.org/jira/browse/AIRFLOW-530
Right now, the documentation does not clearly state that connection
names are converted to upper case when they are searched for in the
environment (https://github.com/apache/incubator-airflow/blob/master/airflow/hooks/base_hook.py#L60-L60).
This is confusing, as the best practice in Airflow seems to be to
define connections in lower case.
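A minimal sketch of the lookup, mirroring the linked base_hook.py
logic (the connection id is hypothetical):

```python
import os

CONN_ENV_PREFIX = 'AIRFLOW_CONN_'
conn_id = 'my_postgres'  # defined in lower case, per Airflow best practice
# the environment is searched with the id upper-cased, i.e. this
# looks for the variable AIRFLOW_CONN_MY_POSTGRES
uri = os.environ.get(CONN_ENV_PREFIX + conn_id.upper())
```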
Closes #1811 from danielzohar/connection_env_var
There were a couple more fields in the Qubole Operator that require
Jinja templating support, so these missing fields were added to
template_fields as well. Also added a missing doc (about notify) and
an example of using macros.
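A hedged illustration of the change; the operator subclass and field
names below are examples, not the PR's exact list:

```python
from airflow.models import BaseOperator

class MyQuboleLikeOperator(BaseOperator):
    # every field named here is rendered with Jinja before execution;
    # the fix adds the previously missed fields to this tuple
    template_fields = ('query', 'script_location', 'notify', 'tags')
```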
Closes #1808 from msumit/AIRFLOW-525
Dear Airflow Maintainers,
Please accept this PR that addresses the following
issues:
- https://issues.apache.org/jira/browse/AIRFLOW-198
Testing Done:
- Local testing of DAG operation with the LatestOnlyOperator
- Unit test added
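A hedged usage sketch, assuming an existing dag and downstream task
(import path assumed from this change):

```python
from airflow.operators.latest_only_operator import LatestOnlyOperator

# downstream tasks are skipped for every run except the most recent one
latest_only = LatestOnlyOperator(task_id='latest_only', dag=dag)
latest_only.set_downstream(task_for_latest_run_only)
```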
Closes #1752 from gwax/latest_only
Dear Airflow Maintainers,
Please accept this PR that addresses the following
issues:
- [AIRFLOW-523](https://issues.apache.org/jira/browse/AIRFLOW-523)
Testing Done:
- Tests passing in Travis CI
Closes #1807 from PedroMDuarte/add-altx-airflow-user
SSL can now be enabled by providing a certificate and key in the
usual ways (config file or CLI options). Providing the cert and key
will automatically enable SSL. The web server port will not
automatically change.
The Security page in the docs now includes an SSL
section with basic
setup information.
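A hedged airflow.cfg sketch; the option names are assumptions based
on this description, not confirmed settings:

```ini
[webserver]
web_server_ssl_cert = /path/to/my/cert.pem
web_server_ssl_key = /path/to/my/key.pem
```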
Closes #1760 from caseyching/master
This issue happens because the job falls asleep during the heartbeat
without closing its session, which holds a connection. This leaves
the database connection in the IDLE state but doesn't release it for
other clients, so when the connection pool gets exhausted, clients
get blocked for roughly the heartbeat timeframe, causing global
performance degradation.
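A hedged sketch of the fix pattern (not the exact diff):

```python
import time

from airflow import settings

def heartbeat_sleep(heartrate):
    session = settings.Session()
    try:
        pass  # ... heartbeat bookkeeping happens here ...
    finally:
        # close the session before sleeping so its connection is
        # returned to the pool instead of sitting IDLE for ~heartrate
        session.close()
    time.sleep(heartrate)
```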
Closes #1790 from kxepal/AIRFLOW-191-postgresql-connection-leak
Dear Airflow Maintainers,
Please accept this PR that addresses the following
issues:
- https://issues.apache.org/jira/browse/AIRFLOW-512
Testing Done:
- N/A, but ran core tests: `./run_unit_tests.sh tests.core:CoreTest -s`
Closes #1800 from dgingrich/master
We have a use case to delete BigQuery tables and views. This patch
adds a delete operator that allows us to do so.
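A hedged usage sketch; the import path, operator name, and parameter
names are assumptions based on the PR title, and dag is assumed to
exist:

```python
from airflow.contrib.operators.bigquery_table_delete_operator import (
    BigQueryTableDeleteOperator,
)

delete_table = BigQueryTableDeleteOperator(
    task_id='delete_my_table',
    deletion_dataset_table='my_dataset.my_table',  # hypothetical target
    ignore_if_missing=True,  # don't fail if the table is already gone
    dag=dag,
)
```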
Closes #1798 from illop/BigQueryDeleteOperator
Adds metrics for the success/failure rates of each operator, so that
when we e.g. do a new release we will have some signal if there is a
regression in an operator. It will also be useful if e.g. a user
wants to upgrade their infrastructure and make sure that all of the
operators still work as expected.
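A hedged sketch of the kind of counters this adds; the metric names
and helper are assumptions, not the exact diff:

```python
from airflow.settings import Stats

def record_operator_outcome(task_instance, succeeded):
    # one counter per operator class, e.g. operator_successes_BashOperator
    op_name = task_instance.task.__class__.__name__
    if succeeded:
        Stats.incr('operator_successes_{}'.format(op_name))
    else:
        Stats.incr('operator_failures_{}'.format(op_name))
```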
Testing Done:
- Staged locally and verified that several operators'
successes/failures were accurately reflected
Closes #1785 from aoen/ddavydov/add_per_operator_success_fail_metrics
Introduced a top-level table splitter that also allows a table name
to include a project_id, separated by the legacy colon or the new
dot. If the project is not defined, the default project_id will be
used.
As this is common to all the BigQuery classes (Hook, Cursor, ...),
the same splitter is used across all of them.
The documentation is adapted to allow defining the project_id with
the legacy colon as well as with the new SQL dotted notation.
The change is 100% backwards compatible. Unit tests are added for all
scenarios, including negative and compatibility tests.
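A toy splitter sketching the behavior described above; this is not
the hook's actual implementation:

```python
def split_tablename(table_input, default_project_id):
    # accepts 'project:dataset.table' (legacy colon),
    # 'project.dataset.table' (new dotted notation),
    # or 'dataset.table' (falls back to the default project)
    if ':' in table_input:
        project, rest = table_input.split(':', 1)
    elif table_input.count('.') == 2:
        project, rest = table_input.split('.', 1)
    else:
        project, rest = default_project_id, table_input
    dataset, table = rest.split('.', 1)
    return project, dataset, table
```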
Closes #1781 from alexvanboxel/feature/bq_tablename
Dear Airflow Maintainers,
Please accept this PR that addresses the following
issues:
https://issues.apache.org/jira/browse/AIRFLOW-483
Testing Done:
This fix prevents stdout from being spammed with the file content.
Closes #1780 from skogsbaeck/fix/gcs_download_operator
The Druid hook currently has a hardcoded `segmentGranularity` of
"DAY"; we need it to be configurable for different use cases.
mistercrunch aoen plypaul
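A hedged sketch of what becomes configurable, shaped like the
granularitySpec of a Druid ingestion spec (the parameter name is an
assumption):

```python
def granularity_spec(segment_granularity='DAY', intervals=None):
    # segmentGranularity was previously pinned to "DAY"; it is now a
    # caller-supplied parameter
    return {
        "type": "uniform",
        "segmentGranularity": segment_granularity,  # e.g. "HOUR"
        "intervals": intervals or [],
    }
```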
Closes #1771 from hongbozeng/hongbo/segment_granularity
Dear Airflow Maintainers,
Could you please add liligo to the Airflow users list in the README?
Thanks in advance!
Please accept this PR that addresses the following
issues:
- https://issues.apache.org/jira/browse/AIRFLOW-472
Closes #1769 from tromika/liligo
Dear Airflow Maintainers,
Please accept this PR that addresses the following
issues:
- *https://issues.apache.org/jira/browse/AIRFLOW-463*
As of now, the Airflow image icon in the top left doesn't lead users
anywhere. It should take users to the initial landing page, as
generally happens on most other sites.
Closes #1764 from msumit/AIRFLOW-463
Here is the original PR with Max's LGTM:
https://github.com/aoen/incubator-airflow/pull/1
Since then I have made some fixes but this PR is essentially the same.
It could definitely use more eyes as there are likely still issues.
**Goals**
- Simplify, consolidate, and make consistent the logic of whether or not
a task should be run
- Provide a view/better logging that gives insight into why a task
instance is not currently running (no more viewing the scheduler logs
to find out why a task instance isn't running for the majority of
cases):
![image](https://cloud.githubusercontent.com/assets/1592778/17637621/aa669f5e-6099-11e6-81c2-d988d2073aac.png)
**Notable Functional Changes**
- Webserver view + task_failing_deps CLI command to explain why a
given task instance isn't being run by the scheduler (see the hedged
sketch after this list)
- Running a backfill in the command line and running a task in the UI
will now display detailed error messages based on which dependencies
were not met for a task instead of appearing to succeed but actually
failing silently
- Maximum task concurrency and pools are now respected by backfills
- Backfill now has the equivalent of the old force flag to run even for
successful tasks
This will break one use case:
Using pools to restrict some resource on airflow executors themselves
(rather than an external resource like a DB), e.g. some task uses 60%
of cpu on a worker so we restrict that task's pool size to 1 to
prevent two of the tasks from running on the same host. When
backfilling a task of this type, the backfill will now wait on the
pool to have slots open up before running the task even though we
don't need to do this if backfilling on a different host outside of
the pool. I think breaking this use case is OK since the use case is a
hack due to not having a proper resource isolation solution (e.g.
mesos should be used in this case instead).
- To make things less confusing for users, there is now an "ignore
all dependencies" option for running tasks, "ignore dependencies" has
been renamed to "ignore task dependencies", and "force" has been
renamed to "ignore task instance state". The new "ignore all
dependencies" flag will ignore the following:
- task instance's pool being full
- execution date for a task instance being in the future
- a task instance being in the retry waiting period
- the task instance's task ending prior to the task instance's
execution date
- the task instance already being queued
- the task instance having already completed
- the task instance being in the shutdown state
- WILL NOT IGNORE a task instance that is already running
- SLA miss emails will now include all tasks that did not finish for
a particular DAG run, even if a task didn't run because its
depends_on_past dependency was not met
- Tasks with pools won't get queued automatically the first time they
reach a worker; if they are ready to run they will be run immediately
- Running a task via the UI or via the command line (backfill/run
commands) will now log why a task could not be run if one of its
dependencies isn't met. For tasks kicked off via the web UI this
means that tasks no longer silently fail to get queued despite a
successful message in the UI.
- Queuing a task into a pool that doesn't exist will now get stopped in
the scheduler instead of a worker
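A hedged sketch of inspecting failed dependencies programmatically;
the method and attribute names are assumptions based on this
description, not confirmed API:

```python
from airflow.models import DagBag, TaskInstance

dag = DagBag().get_dag('example_dag_id')   # hypothetical DAG id
task = dag.get_task('example_task_id')     # hypothetical task id
ti = TaskInstance(task, execution_date)    # assumes a known execution_date
for dep_status in ti.get_failed_dep_statuses():  # method name assumed
    print(dep_status.dep_name, dep_status.reason)
```

The new CLI command presumably surfaces the same information, e.g.
`airflow task_failing_deps <dag_id> <task_id> <execution_date>`
(argument order assumed).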
**Follow Up Items**
- Update the docs to reference the new explainer views/CLI command
Closes #1729 from aoen/ddavydov/blockedTIExplainerRebasedMaster