Instead of parsing the DAG definition files in the same process as the
scheduler, this change parses the files in a child process. This helps
to isolate the scheduler from bad user code.
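For illustration only, a rough sketch of the isolation idea, assuming a multiprocessing-based child process and the DagBag loader; this is not the scheduler's actual implementation:
```
# Rough sketch: parse a DAG file in a child process so that errors or hangs
# in user code cannot take down the scheduler itself. Timeout handling and
# return values here are assumptions.
import multiprocessing


def _parse_file(file_path, result_queue):
    # Runs in the child process; a crash here does not affect the parent.
    from airflow.models import DagBag
    dagbag = DagBag(file_path)
    result_queue.put(list(dagbag.dags.keys()))


def parse_in_child_process(file_path, timeout=60):
    queue = multiprocessing.Queue()
    proc = multiprocessing.Process(target=_parse_file, args=(file_path, queue))
    proc.start()
    proc.join(timeout)
    if proc.is_alive():
        # A hung parse (e.g. an infinite loop in user code) is killed
        # rather than blocking the scheduler.
        proc.terminate()
        return None
    return queue.get() if not queue.empty() else None
```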
Closes #1636 from plypaul/plypaul_schedule_by_file_rebase_master
Add an export option under the 'With selected' menu to export selected variables to JSON.
Add an upload file option at the top of the page to import variables from a JSON file.
The decision was made to avoid Flask-Admin's default export functionality because it
does not handle variable values that are themselves serialized JSON well.
The import variables field should be made to look nicer.
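A rough sketch of the export behaviour described above, assuming variables arrive as (key, value) string pairs; names are illustrative:
```
# Illustrative sketch: values that are already serialized JSON are exported
# as nested JSON structures instead of double-encoded strings, which is the
# case Flask-Admin's default export does not handle well.
import json


def export_variables(variables):
    """variables: iterable of (key, value) pairs, value being a string."""
    var_dict = {}
    for key, value in variables:
        try:
            var_dict[key] = json.loads(value)  # keep serialized JSON as JSON
        except ValueError:
            var_dict[key] = value              # plain string value
    return json.dumps(var_dict, indent=4)
```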
Login is broken in GHE for new users:
```
[2016-07-26 22:11:43,077] {github_enterprise_auth.py:199} ERROR -
Traceback (most recent call last):
File "/opt/virtualenvs/airflow/lib/python3.5/site-packages/airflow/contrib/auth/backends/github_enterprise_auth.py", line 188, in oauth_callback
'Null response from GHE, denying access.'
airflow.contrib.auth.backends.github_enterprise_auth.AuthenticationError: Null response from GHE, denying access.
[2016-07-26 22:12:12,313] {app.py:1423} ERROR - Exception on / [GET]
Traceback (most recent call last):
File "/opt/virtualenvs/airflow/lib/python3.5/site-packages/flask/app.py", line 1817, in wsgi_app
response = self.full_dispatch_request()
File "/opt/virtualenvs/airflow/lib/python3.5/site-packages/flask/app.py", line 1477, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/opt/virtualenvs/airflow/lib/python3.5/site-packages/flask/app.py", line 1381, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/opt/virtualenvs/airflow/lib/python3.5/site-packages/flask/_compat.py", line 33, in reraise
raise value
File "/opt/virtualenvs/airflow/lib/python3.5/site-packages/flask/app.py", line 1475, in full_dispatch_request
rv = self.dispatch_request()
File "/opt/virtualenvs/airflow/lib/python3.5/site-packages/flask/app.py", line 1461, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/opt/virtualenvs/airflow/lib/python3.5/site-packages/airflow/contrib/auth/backends/github_enterprise_auth.py", line 215, in oauth_callback
login_user(GHEUser(user))
NameError: name 'login_user' is not defined```
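A hedged reading of the traceback: login_user is referenced in the GHE auth backend without being imported. A minimal sketch of the kind of fix, assuming flask_login is where login_user comes from:
```
# Illustrative only: adding the missing import in github_enterprise_auth.py
# would resolve the NameError raised in oauth_callback.
from flask_login import login_user
```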
This PR adds optional resource requirements for tasks for use with
resource managers such as Yarn and Mesos.
Considerations:
- I chose to force users to encapsulate resources in a Resources object,
e.g. Resources(cpu=1), instead of just cpu=1 in their DAG attributes.
This creates the pain of having to import Resources for almost every
DAG, but I think it is important for the scoping/namespacing we should
start doing. See the usage sketch after this list.
- Once resources are used by executors, we will need to add documentation
(and examples) for these new resources.
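The sketch below shows how a task might declare resources under this API; the Resources import path, keyword name, and the BashOperator boilerplate are assumptions for illustration, not the confirmed interface.
```
# Illustrative usage sketch of attaching resource requirements to a task,
# based on the Resources(cpu=1) example above. Import paths and keyword
# names are assumptions.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator
from airflow.utils.operator_resources import Resources  # assumed location

dag = DAG('resource_example', start_date=datetime(2016, 1, 1))

task = BashOperator(
    task_id='echo',
    bash_command='echo hello',
    resources=Resources(cpu=1),  # keyword taken from the example above
    dag=dag,
)
```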
Testing Done:
- New/existing unit tests
CC: plypaul artwr mistercrunch jlowin bolkedebruin criccomini
Closes #1669 from aoen/ddavydov/ddavydov/augment_tasks_with_resources
The VariableJsonAccessor and VariableAccessor were missing the __repr__
method, which led to a VariableError when printing out the context being
passed to, for example, a PythonOperator.
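A minimal sketch of the described fix; the accessor's internal structure (the self.var attribute and the Variable.get lookup) is an assumption for illustration.
```
# Sketch only: adding __repr__ so that printing the template context no
# longer fails when it contains a variable accessor.
from airflow.models import Variable


class VariableAccessor(object):
    def __init__(self):
        self.var = None

    def __getattr__(self, item):
        self.var = Variable.get(item)
        return self.var

    def __repr__(self):
        # Previously missing; without it, printing the context raised an error.
        return str(self.var)
```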
This patch adds a hook and operator that allow execution of Spark SQL
queries. The hook is a wrapper around the spark-sql binary.
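A hedged usage sketch of the new operator; the import path and argument names (sql, conn_id) are assumptions for illustration.
```
# Illustrative only; exact operator arguments may differ.
from datetime import datetime

from airflow import DAG
from airflow.contrib.operators.spark_sql_operator import SparkSqlOperator  # assumed path

dag = DAG('spark_sql_example', start_date=datetime(2016, 1, 1))

count_rows = SparkSqlOperator(
    task_id='count_rows',
    sql='SELECT COUNT(*) FROM some_table',  # hypothetical query
    conn_id='spark_sql_default',            # assumed default connection id
    dag=dag,
)
```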
Closes #1644 from danielvdende/spark_sql_operator
tests/dags/README.md has a code example such as
"dag = dagbag.get(dag_id)", but DagBag doesn't have a method called get.
That should be fixed to get_dag, as in the corrected snippet below.
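The corrected snippet, using the get_dag method that DagBag actually exposes (the dag_id here is just an example):
```
from airflow.models import DagBag

dagbag = DagBag()
dag_id = 'example_bash_operator'  # hypothetical DAG id
dag = dagbag.get_dag(dag_id)      # was: dagbag.get(dag_id)
```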
Closes #1654 from aoen/ddavydov/dont_skip_db_state_check_for_subdag
Always check the DB state, and not just the local state, when determining
which task instances of a backfill job have not yet completed execution.
This avoids potential race conditions with, e.g., two backfill jobs
running the same task instance.
Dear Airflow Maintainers,
Please accept this PR that addresses the following issues:
- https://issues.apache.org/jira/browse/AIRFLOW-264
CC: Original PR by Jparks2532
https://github.com/apache/incubator-airflow/pull/1384
Add workload management to the Hive hook and operator.
Edited operator_helper to avoid a KeyError when retrieving conf values.
Refactored hive_cli command preparation into a separate private
method.
Added a small helper to flatten one level of an iterator into a list.
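For illustration, a one-level flatten helper like the one described could look like this; the name and placement are assumptions.
```
# Sketch of a helper that flattens a single level of nesting,
# e.g. [[1, 2], [3]] -> [1, 2, 3].
import itertools


def flatten_one_level(iterable):
    return list(itertools.chain.from_iterable(iterable))
```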
Closes #1614 from artwr/artwr_fixing_hive_queue_PR
Closes #1652 from mtagle/fix_bq_table_upsert
By default, BigQuery will only return 50 tables when you ask for a list
of all the tables in a dataset. If you are trying to upsert a table
that exists, but you have more than 50 tables, the run_table_upsert
method may conclude that the table doesn't exist, try to insert it,
and BigQuery will return an error saying that the table already exists.
This fix checks whether the response has pagination data and looks at all
the pages, rather than just the first one, to see if the table exists.
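A hedged sketch of the pagination approach described, using the BigQuery v2 tables().list() endpoint from google-api-python-client; the service object and identifiers passed in are assumptions.
```
# Illustrative only: walk every page of the dataset's table list rather than
# only the first 50 results before deciding whether the table exists.
def table_exists(service, project_id, dataset_id, table_id):
    request = service.tables().list(projectId=project_id, datasetId=dataset_id)
    while request is not None:
        response = request.execute()
        for table in response.get('tables', []):
            if table['tableReference']['tableId'] == table_id:
                return True
        # list_next returns None once there are no more pages.
        request = service.tables().list_next(request, response)
    return False
```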