[AIRFLOW-1652] Push DatabricksRunSubmitOperator metadata into XCOM
Push run_id and run_page_url into xcom so callbacks and other tasks can reference this information
Closes #2641 from andrewmchen/databricks-xcom
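For example, a downstream task could read that metadata via XCom roughly like this (a minimal sketch; the XCom keys follow the description above, while the upstream task id and DAG are illustrative assumptions):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python_operator import PythonOperator


def report_run(**context):
    ti = context["ti"]
    # XCom keys match the metadata described above; the upstream task id
    # "submit_run" is an illustrative assumption.
    run_id = ti.xcom_pull(task_ids="submit_run", key="run_id")
    run_page_url = ti.xcom_pull(task_ids="submit_run", key="run_page_url")
    print("Databricks run %s: %s" % (run_id, run_page_url))


with DAG("databricks_xcom_example", start_date=datetime(2018, 1, 1),
         schedule_interval=None) as dag:
    report = PythonOperator(
        task_id="report_run",
        python_callable=report_run,
        provide_context=True,  # Airflow 1.x: pass the template context in
    )
```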
PrestoHook.insert_rows() raises NotImplementedError for now. But Presto 0.126+ allows specifying column names in INSERT queries, so we can leverage DbApiHook.insert_rows() almost as is. This PR enables this function.
Closes #3146 from sekikn/AIRFLOW-2234
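A minimal usage sketch (connection id, table and column names are assumptions, not part of this change):

```python
from airflow.hooks.presto_hook import PrestoHook

# Sketch only: the connection id, table and column names are illustrative.
hook = PrestoHook(presto_conn_id="presto_default")
rows = [(1, "alice"), (2, "bob")]
# Relies on DbApiHook.insert_rows generating
# INSERT INTO example.users (id, name) VALUES (...)
hook.insert_rows(table="example.users", rows=rows,
                 target_fields=["id", "name"])
```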
HiveOperator can only replace variables via Jinja, and the replacements are global to the DAG through the context and user_defined_macros. It would be much more flexible to open up hive_conf at the HiveOperator level so Hive scripts can be recycled at the task level, leveraging HiveHook's existing hive_conf param and _prepare_hiveconf function.
Closes #3136 from wolfier/AIRFLOW-1153
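A hedged usage sketch of the task-level variables (the operator-level parameter name `hiveconfs`, the connection id and the script path are assumptions):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.hive_operator import HiveOperator

with DAG("hiveconf_example", start_date=datetime(2018, 1, 1),
         schedule_interval="@daily") as dag:
    load_partition = HiveOperator(
        task_id="load_partition",
        hql="hql/load_partition.hql",  # script can reference ${ds_partition}
        # Assumed operator-level parameter name; values are handed to the
        # hook as -hiveconf key=value pairs.
        hiveconfs={"ds_partition": "{{ ds }}"},
        hive_cli_conn_id="hive_cli_default",
    )
```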
The logs are kept inside the worker pod. By attaching a persistent disk we keep the logs and make them available for the webserver.
- Remove the requirements.txt since we don't want to maintain another dependency file
- Fix some small casing issues
- Remove some unused code
- Add missing shebang lines
- Start on some docs
- Fix the logging
Closes #3252 from Fokko/airflow-2357-pd-for-logs
A bug existed when default_args contained start_date but it was set to None, which caused DAG instantiation to fail.
Closes #3256 from bolkedebruin/AIRFLOW-2351
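The failing pattern looked roughly like this (a minimal sketch of the scenario, with illustrative names); the explicit start_date on the DAG should win even though default_args carries start_date=None:

```python
from datetime import datetime

from airflow import DAG

# default_args carries start_date explicitly set to None ...
default_args = {"owner": "airflow", "start_date": None}

# ... while the DAG itself supplies a real start_date; instantiation
# previously failed in this situation.
dag = DAG(
    dag_id="start_date_none_example",
    default_args=default_args,
    start_date=datetime(2018, 5, 1),
    schedule_interval="@daily",
)
```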
Fix issue with backfill jobs of dags, where tasks in the removed state are not run but still considered to be pending, causing an indefinite loop.
Closes #3176 from ji-han/AIRFLOW-2270_dag_backfill_removed_tasks
`airflow connections -l` uses the 'tabulate' package with the fancy_grid format, which outputs box-drawing characters. This can raise a UnicodeEncodeError when the output is piped or redirected, since the default encoding for Python 2.x is ASCII.
This PR fixes it and contains some flake8-related fixes.
Closes #3244 from sekikn/AIRFLOW-2344
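The failure mode can be reproduced with tabulate directly (a minimal sketch; the rows are illustrative and the encoding issue applies to Python 2):

```python
# -*- coding: utf-8 -*-
from tabulate import tabulate

rows = [(u"my_postgres", u"postgres", u"localhost", 5432)]
headers = [u"Conn Id", u"Conn Type", u"Host", u"Port"]

# fancy_grid emits box-drawing characters such as ╒, ╤ and ╞. Under
# Python 2, printing them to an ASCII-encoded pipe or redirect
# (e.g. `airflow connections -l | cat`) raises UnicodeEncodeError.
print(tabulate(rows, headers=headers, tablefmt="fancy_grid"))
```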
To improve efficiency and usability, this PR adds S3 Select functionality to S3ToHiveTransfer. It also contains some minor fixes to documentation and comments.
Closes #3243 from sekikn/AIRFLOW-2300
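A hedged sketch of what the transfer could look like with S3 Select (the parameter name `select_expression`, the bucket, columns and table are assumptions):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.s3_to_hive_operator import S3ToHiveTransfer

with DAG("s3_select_to_hive_example", start_date=datetime(2018, 1, 1),
         schedule_interval=None) as dag:
    s3_to_hive = S3ToHiveTransfer(
        task_id="s3_to_hive_selected",
        s3_key="s3://example-bucket/raw/events.csv",
        field_dict={"event_id": "STRING", "amount": "DOUBLE"},
        hive_table="events",
        # Assumed parameter name; only rows matching the expression are
        # fetched from S3 before loading into Hive.
        select_expression="SELECT s.event_id, s.amount FROM s3object s "
                          "WHERE s.amount > 0",
    )
```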
Handle too old resource versions and throw exceptions on errors
- K8s API errors will now throw Airflow exceptions
- Add scheduler uuid to worker pod labels to match the two
* Add executor_config to the task_instance table and the base_operator table
* Fix test; bump up number of examples
* Fix up comments from PR
* Exclude the kubernetes example dag from a test
* Fix dict -> KubernetesExecutorConfig
* Fix up executor_config comment and type hint
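With executor_config stored alongside the task instance, a task could request Kubernetes-specific settings roughly like this (a sketch; the "KubernetesExecutor" key shape, its fields and the image name are assumptions about what KubernetesExecutorConfig accepts):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python_operator import PythonOperator


def heavy_work():
    print("running in a custom worker pod")


with DAG("executor_config_example", start_date=datetime(2018, 1, 1),
         schedule_interval=None) as dag:
    heavy_task = PythonOperator(
        task_id="heavy_task",
        python_callable=heavy_work,
        # Assumed shape: the "KubernetesExecutor" key and its fields mirror
        # what KubernetesExecutorConfig is expected to accept.
        executor_config={
            "KubernetesExecutor": {
                "image": "example/airflow-worker:latest",
                "request_memory": "512Mi",
            }
        },
    )
```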
Add kubernetes config section in airflow.cfg and inject GCP secrets upon executor start. (#17)
Update Airflow to Pass configuration to k8s containers, add some Py3 … (#9)
* Update Airflow to Pass configuration to k8s containers, add some Py3 compat., create git-sync pod
* Undo changes to display-source config setter for to_dict
* WIP Secrets and Configmaps
* Improve secrets support for multiple secrets. Add support for registry secrets. Add support for RBAC service accounts.
* Swap order of variables, overlooked very basic issue
* Secret env var names must be uppercase
* Update logging
* Revert spothero test code in setup.py
* WIP Fix tests
* Worker should be using local executor
* Consolidate worker setup and address code review comments
* Reconfigure airflow script to use new secrets method
Make sure you have checked _all_ steps below.
### JIRA
- [x] My PR addresses the following [Airflow JIRA](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
  - https://issues.apache.org/jira/browse/AIRFLOW-2347
  - In case you are fixing a typo in the documentation you can prepend your commit with \[AIRFLOW-XXX\], code changes always need a JIRA issue.
### Description
- [x] Here are some details about my PR, including screenshots of any UI changes:
  Add a company to the README
### Tests
- [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: N/A -- documentation update only
### Commits
- [x] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
  1. Subject is separated from body by a blank line
  2. Subject is limited to 50 characters
  3. Subject does not end with a period
  4. Subject uses the imperative mood ("add", not "adding")
  5. Body wraps at 72 characters
  6. Body explains "what" and "why", not "how"
### Documentation
- [x] In case of new functionality, my PR adds documentation that describes how to use it.
  - When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added.
### Code Quality
- [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
Closes #3242 from r39132/Add_banco_Formaturas_to_readme
As of 0.17.0, dask distributed has support for TLS/SSL.
[dask] Add TLS/SSL support for the dask-distributed scheduler and add a test for TLS under dask distributed.
Closes #2683 from mariusvniekerk/dask-ssl
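The executor-side connection then boils down to handing dask's Security object to the Client (a sketch using distributed's public API; the certificate paths and scheduler address are assumptions):

```python
from distributed import Client
from distributed.security import Security

# Certificate paths are illustrative; in practice they would come from the
# dask-related section of airflow.cfg.
security = Security(
    tls_ca_file="/etc/dask/ca.pem",
    tls_client_cert="/etc/dask/client-cert.pem",
    tls_client_key="/etc/dask/client-key.pem",
    require_encryption=True,
)

# Connect to a TLS-secured scheduler; note the tls:// address scheme.
client = Client("tls://dask-scheduler.example.com:8786", security=security)
```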
Make sure you have checked _all_ steps below.
- [x] My PR addresses the following [Airflow JIRA](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
  - https://issues.apache.org/jira/browse/AIRFLOW-2335
  - In case you are fixing a typo in the documentation you can prepend your commit with \[AIRFLOW-XXX\], code changes always need a JIRA issue.
- [x] Here are some details about my PR, including screenshots of any UI changes:
  There is an issue with travis pulling jdk8 that is preventing CI jobs from running. This blocks further development of the project.
  Reference: https://github.com/travis-ci/travis-ci/issues/9512#issuecomment-382235301
- [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:
  This PR can't be unit tested since it is just configuration. However, the fact that unit tests run successfully should show that it works.
- [ ] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
  1. Subject is separated from body by a blank line
  2. Subject is limited to 50 characters
  3. Subject does not end with a period
  4. Subject uses the imperative mood ("add", not "adding")
  5. Body wraps at 72 characters
  6. Body explains "what" and "why", not "how"
- [ ] In case of new functionality, my PR adds documentation that describes how to use it.
  - When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added.
- [ ] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
Closes #3236 from dimberman/AIRFLOW-2335_travis_issue
Currently, S3FileTransformOperator downloads the whole file from S3 before transforming and uploading it. Adding an extraction feature using S3 Select to this operator improves its efficiency and usability.
Closes #3227 from sekikn/AIRFLOW-2299
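A hedged usage sketch (the parameter name `select_expression`, the buckets and the script path are assumptions):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.s3_file_transform_operator import S3FileTransformOperator

with DAG("s3_select_transform_example", start_date=datetime(2018, 1, 1),
         schedule_interval=None) as dag:
    transform = S3FileTransformOperator(
        task_id="transform_selected_rows",
        source_s3_key="s3://example-source/raw/events.csv",
        dest_s3_key="s3://example-dest/clean/events.csv",
        transform_script="/usr/local/bin/clean_events.py",
        # Assumed parameter name; S3 Select filters rows server-side so only
        # matching data is downloaded before the transform script runs.
        select_expression="SELECT * FROM s3object s WHERE s._1 IS NOT NULL",
        replace=True,
    )
```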
Currently, data is ordered by the first column in descending order, and the header row comes first only if the first column is an integer. This fix puts the header as the first row regardless of the first column's data type.
Closes #3180 from sathyaprakashg/AIRFLOW-2254
The command versions of config parameters were overridden by the default config. E.g. sql_alchemy_conn got the default value even when sql_alchemy_conn_cmd was specified.
Closes #3029 from cjgu/airflow-610
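The intended precedence can be illustrated with a small sketch (not Airflow's actual implementation; the dict-based lookup and example values are assumptions):

```python
import subprocess


def resolve(user_cfg, defaults, option):
    """Return the effective value for ``option``.

    A user-supplied "<option>_cmd" entry wins over both the plain option and
    the packaged default; this mirrors the precedence the fix restores.
    """
    cmd_key = option + "_cmd"
    if cmd_key in user_cfg:
        # e.g. sql_alchemy_conn_cmd = "cat /run/secrets/sql_alchemy_conn"
        out = subprocess.check_output(user_cfg[cmd_key], shell=True)
        return out.decode("utf-8").strip()
    if option in user_cfg:
        return user_cfg[option]
    return defaults.get(option)


defaults = {"sql_alchemy_conn": "sqlite:////tmp/airflow.db"}
user_cfg = {"sql_alchemy_conn_cmd": "echo postgresql://airflow@db/airflow"}
print(resolve(user_cfg, defaults, "sql_alchemy_conn"))
# -> postgresql://airflow@db/airflow, not the sqlite default
```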