2015-10-22 02:36:16 +03:00
|
|
|
API Reference
|
|
|
|
=============
|
2014-10-13 09:05:34 +04:00
|
|
|
|
2015-02-02 22:12:04 +03:00
|
|
|
Operators
|
|
|
|
---------
|
2015-06-29 01:57:13 +03:00
|
|
|
Operators allow for generation of certain types of tasks that become nodes in
|
2015-02-02 22:12:04 +03:00
|
|
|
the DAG when instantiated. All operators derive from BaseOperator and
|
2015-06-29 01:57:13 +03:00
|
|
|
inherit many attributes and methods that way. Refer to the BaseOperator
|
|
|
|
documentation for more details.
|
2015-02-02 22:12:04 +03:00
|
|
|
|
2015-04-05 09:27:00 +03:00
|
|
|
There are 3 main types of operators:
|
|
|
|
|
2015-06-29 01:57:13 +03:00
|
|
|
- Operators that performs an **action**, or tell another system to
|
2015-04-05 09:27:00 +03:00
|
|
|
perform an action
|
2015-06-29 01:57:13 +03:00
|
|
|
- **Transfer** operators move data from one system to another
|
|
|
|
- **Sensors** are a certain type of operator that will keep running until a
|
|
|
|
certain criterion is met. Examples include a specific file landing in HDFS or
|
2015-04-21 17:41:17 +03:00
|
|
|
S3, a partition appearing in Hive, or a specific time of the day. Sensors
|
2015-04-05 09:27:00 +03:00
|
|
|
are derived from ``BaseSensorOperator`` and run a poke
|
|
|
|
method at a specified ``poke_interval`` until it returns ``True``.
|
|
|
|
|
2015-07-16 21:01:16 +03:00
|
|
|
BaseOperator
|
|
|
|
''''''''''''
|
|
|
|
All operators are derived from ``BaseOperator`` and acquire much
|
|
|
|
functionality through inheritance. Since this is the core of the engine,
|
|
|
|
it's worth taking the time to understand the parameters of ``BaseOperator``
|
|
|
|
to understand the primitive features that can be leveraged in your
|
|
|
|
DAGs.
|
|
|
|
|
2015-08-06 22:07:03 +03:00
|
|
|
|
|
|
|
.. autoclass:: airflow.models.BaseOperator
|
2015-07-16 21:01:16 +03:00
|
|
|
|
|
|
|
|
2015-10-06 02:30:22 +03:00
|
|
|
BaseSensorOperator
|
|
|
|
'''''''''''''''''''
|
|
|
|
All sensors are derived from ``BaseSensorOperator``. All sensors inherit
|
|
|
|
the ``timeout`` and ``poke_interval`` on top of the ``BaseOperator``
|
|
|
|
attributes.
|
|
|
|
|
|
|
|
.. autoclass:: airflow.operators.sensors.BaseSensorOperator
|
|
|
|
|
|
|
|
|
2015-07-16 21:01:16 +03:00
|
|
|
Operator API
|
|
|
|
''''''''''''
|
|
|
|
|
2015-02-02 22:12:04 +03:00
|
|
|
.. automodule:: airflow.operators
|
|
|
|
:show-inheritance:
|
2015-04-21 17:41:17 +03:00
|
|
|
:members:
|
2015-03-08 01:24:11 +03:00
|
|
|
BashOperator,
|
2015-06-26 03:48:03 +03:00
|
|
|
BranchPythonOperator,
|
2015-11-14 04:08:51 +03:00
|
|
|
TriggerDagRunOperator,
|
2015-03-08 01:24:11 +03:00
|
|
|
DummyOperator,
|
|
|
|
EmailOperator,
|
|
|
|
ExternalTaskSensor,
|
2015-09-01 00:22:40 +03:00
|
|
|
GenericTransfer,
|
2015-03-08 01:24:11 +03:00
|
|
|
HdfsSensor,
|
|
|
|
Hive2SambaOperator,
|
|
|
|
HiveOperator,
|
|
|
|
HivePartitionSensor,
|
2015-08-03 00:21:32 +03:00
|
|
|
HiveToDruidTransfer,
|
2015-06-03 00:14:11 +03:00
|
|
|
HiveToMySqlTransfer,
|
2015-07-08 16:15:30 +03:00
|
|
|
SimpleHttpOperator,
|
|
|
|
HttpSensor,
|
2015-11-16 21:43:13 +03:00
|
|
|
MetastorePartitionSensor,
|
2015-07-28 05:17:39 +03:00
|
|
|
MsSqlOperator,
|
|
|
|
MsSqlToHiveTransfer,
|
2015-03-12 02:48:15 +03:00
|
|
|
MySqlOperator,
|
2015-04-05 09:27:00 +03:00
|
|
|
MySqlToHiveTransfer,
|
2016-08-10 01:38:57 +03:00
|
|
|
NamedHivePartitionSensor,
|
2015-04-21 17:41:17 +03:00
|
|
|
PostgresOperator,
|
2015-02-03 09:09:49 +03:00
|
|
|
PrestoCheckOperator,
|
|
|
|
PrestoIntervalCheckOperator,
|
|
|
|
PrestoValueCheckOperator,
|
2015-03-08 01:24:11 +03:00
|
|
|
PythonOperator,
|
2015-06-03 00:14:11 +03:00
|
|
|
S3KeySensor,
|
2015-04-21 17:41:17 +03:00
|
|
|
S3ToHiveTransfer,
|
2015-10-24 23:46:18 +03:00
|
|
|
ShortCircuitOperator,
|
2015-08-02 04:49:40 +03:00
|
|
|
SlackAPIOperator,
|
|
|
|
SlackAPIPostOperator,
|
2015-03-08 01:24:11 +03:00
|
|
|
SqlSensor,
|
|
|
|
SubDagOperator,
|
2015-11-06 21:06:20 +03:00
|
|
|
TimeSensor,
|
|
|
|
WebHdfsSensor
|
2015-02-02 22:12:04 +03:00
|
|
|
|
2016-05-04 08:13:35 +03:00
|
|
|
.. autoclass:: airflow.operators.docker_operator.DockerOperator
|
|
|
|
|
2015-09-19 17:53:45 +03:00
|
|
|
|
2015-10-15 00:59:14 +03:00
|
|
|
Community-contributed Operators
|
2015-09-19 17:53:45 +03:00
|
|
|
'''''''''''''''''''''''''''''''
|
|
|
|
|
|
|
|
.. automodule:: airflow.contrib.operators
|
|
|
|
:show-inheritance:
|
|
|
|
:members:
|
2016-02-01 09:11:36 +03:00
|
|
|
SSHExecuteOperator,
|
2015-09-19 18:26:14 +03:00
|
|
|
VerticaOperator,
|
|
|
|
VerticaToHiveTransfer
|
2015-09-19 17:53:45 +03:00
|
|
|
|
2016-06-29 23:39:15 +03:00
|
|
|
.. autoclass:: airflow.contrib.operators.bigquery_operator.BigQueryOperator
|
|
|
|
.. autoclass:: airflow.contrib.operators.bigquery_to_gcs.BigQueryToCloudStorageOperator
|
2016-11-23 21:49:57 +03:00
|
|
|
.. autoclass:: airflow.contrib.operators.ecs_operator.ECSOperator
|
2017-04-05 10:56:23 +03:00
|
|
|
.. autoclass:: airflow.contrib.operators.file_to_wasb.FileToWasbOperator
|
2016-06-29 23:39:15 +03:00
|
|
|
.. autoclass:: airflow.contrib.operators.gcs_download_operator.GoogleCloudStorageDownloadOperator
|
2016-06-02 00:57:33 +03:00
|
|
|
.. autoclass:: airflow.contrib.operators.QuboleOperator
|
2016-05-04 08:13:35 +03:00
|
|
|
.. autoclass:: airflow.contrib.operators.hipchat_operator.HipChatAPIOperator
|
|
|
|
.. autoclass:: airflow.contrib.operators.hipchat_operator.HipChatAPISendRoomNotificationOperator
|
|
|
|
|
2015-09-17 02:20:04 +03:00
|
|
|
.. _macros:
|
|
|
|
|
2015-02-02 22:12:04 +03:00
|
|
|
Macros
|
|
|
|
---------
|
2015-02-03 09:09:49 +03:00
|
|
|
Here's a list of variables and macros that can be used in templates
|
|
|
|
|
|
|
|
|
|
|
|
Default Variables
|
|
|
|
'''''''''''''''''
|
|
|
|
The Airflow engine passes a few variables by default that are accessible
|
|
|
|
in all templates
|
|
|
|
|
|
|
|
================================= ====================================
|
|
|
|
Variable Description
|
|
|
|
================================= ====================================
|
|
|
|
``{{ ds }}`` the execution date as ``YYYY-MM-DD``
|
2016-02-11 17:31:01 +03:00
|
|
|
``{{ ds_nodash }}`` the execution date as ``YYYYMMDD``
|
|
|
|
``{{ yesterday_ds }}`` yesterday's date as ``YYYY-MM-DD``
|
|
|
|
``{{ yesterday_ds_nodash }}`` yesterday's date as ``YYYYMMDD``
|
|
|
|
``{{ tomorrow_ds }}`` tomorrow's date as ``YYYY-MM-DD``
|
|
|
|
``{{ tomorrow_ds_nodash }}`` tomorrow's date as ``YYYYMMDD``
|
2015-12-01 19:30:02 +03:00
|
|
|
``{{ ts }}`` same as ``execution_date.isoformat()``
|
|
|
|
``{{ ts_nodash }}`` same as ``ts`` without ``-`` and ``:``
|
2015-10-15 00:59:14 +03:00
|
|
|
``{{ execution_date }}`` the execution_date, (datetime.datetime)
|
2017-01-29 14:41:36 +03:00
|
|
|
``{{ prev_execution_date }}`` the previous execution date (if available) (datetime.datetime)
|
|
|
|
``{{ next_execution_date }}`` the next execution date (datetime.datetime)
|
2015-02-03 09:09:49 +03:00
|
|
|
``{{ dag }}`` the DAG object
|
|
|
|
``{{ task }}`` the Task object
|
2015-10-15 00:59:14 +03:00
|
|
|
``{{ macros }}`` a reference to the macros package, described below
|
2015-02-03 09:09:49 +03:00
|
|
|
``{{ task_instance }}`` the task_instance object
|
|
|
|
``{{ end_date }}`` same as ``{{ ds }}``
|
2015-10-15 00:59:14 +03:00
|
|
|
``{{ latest_date }}`` same as ``{{ ds }}``
|
2015-02-03 09:09:49 +03:00
|
|
|
``{{ ti }}`` same as ``{{ task_instance }}``
|
2015-10-15 00:59:14 +03:00
|
|
|
``{{ params }}`` a reference to the user-defined params dictionary
|
2016-06-21 20:29:19 +03:00
|
|
|
``{{ var.value.my_var }}`` global defined variables represented as a dictionary
|
|
|
|
``{{ var.json.my_var.path }}`` global defined variables represented as a dictionary
|
|
|
|
with deserialized JSON object, append the path to the
|
|
|
|
key within the JSON object
|
2015-10-15 00:59:14 +03:00
|
|
|
``{{ task_instance_key_str }}`` a unique, human-readable key to the task instance
|
2015-02-03 09:09:49 +03:00
|
|
|
formatted ``{dag_id}_{task_id}_{ds}``
|
2015-07-23 06:45:29 +03:00
|
|
|
``conf`` the full configuration object located at
|
|
|
|
``airflow.configuration.conf`` which
|
|
|
|
represents the content of your
|
|
|
|
``airflow.cfg``
|
2015-10-22 02:36:16 +03:00
|
|
|
``run_id`` the ``run_id`` of the current DAG run
|
2016-02-21 20:49:44 +03:00
|
|
|
``dag_run`` a reference to the DagRun object
|
2015-11-21 20:39:23 +03:00
|
|
|
``test_mode`` whether the task instance was called using
|
|
|
|
the CLI's test subcommand
|
2015-02-03 09:09:49 +03:00
|
|
|
================================= ====================================
|
|
|
|
|
2015-06-29 01:57:13 +03:00
|
|
|
Note that you can access the object's attributes and methods with simple
|
2015-04-21 17:41:17 +03:00
|
|
|
dot notation. Here are some examples of what is possible:
|
2015-02-03 09:09:49 +03:00
|
|
|
``{{ task.owner }}``, ``{{ task.task_id }}``, ``{{ ti.hostname }}``, ...
|
2015-06-29 01:57:13 +03:00
|
|
|
Refer to the models documentation for more information on the objects'
|
2015-02-03 09:09:49 +03:00
|
|
|
attributes and methods.
|
|
|
|
|
2016-06-21 20:29:19 +03:00
|
|
|
The ``var`` template variable allows you to access variables defined in Airflow's
|
|
|
|
UI. You can access them as either plain-text or JSON. If you use JSON, you are
|
|
|
|
also able to walk nested structures, such as dictionaries like:
|
|
|
|
``{{ var.json.my_dict_var.key1 }}``
|
|
|
|
|
2015-02-03 09:09:49 +03:00
|
|
|
Macros
|
|
|
|
''''''
|
2016-04-03 16:54:01 +03:00
|
|
|
Macros are a way to expose objects to your templates and live under the
|
2015-11-19 19:40:16 +03:00
|
|
|
``macros`` namespace in your templates.
|
|
|
|
|
|
|
|
A few commonly used libraries and methods are made available.
|
|
|
|
|
2015-11-30 19:25:55 +03:00
|
|
|
|
2015-11-19 19:40:16 +03:00
|
|
|
================================= ====================================
|
|
|
|
Variable Description
|
|
|
|
================================= ====================================
|
2015-11-30 19:25:55 +03:00
|
|
|
``macros.datetime`` The standard lib's ``datetime.datetime``
|
|
|
|
``macros.timedelta`` The standard lib's ``datetime.timedelta``
|
|
|
|
``macros.dateutil`` A reference to the ``dateutil`` package
|
2015-11-19 19:40:16 +03:00
|
|
|
``macros.time`` The standard lib's ``time``
|
|
|
|
``macros.uuid`` The standard lib's ``uuid``
|
|
|
|
``macros.random`` The standard lib's ``random``
|
|
|
|
================================= ====================================
|
|
|
|
|
2015-11-30 19:25:55 +03:00
|
|
|
|
2015-11-19 19:40:16 +03:00
|
|
|
Some airflow specific macros are also defined:
|
2015-02-03 09:09:49 +03:00
|
|
|
|
|
|
|
.. automodule:: airflow.macros
|
2015-04-21 17:41:17 +03:00
|
|
|
:show-inheritance:
|
2015-02-03 09:09:49 +03:00
|
|
|
:members:
|
2015-02-02 22:12:04 +03:00
|
|
|
|
|
|
|
.. automodule:: airflow.macros.hive
|
2015-04-21 17:41:17 +03:00
|
|
|
:show-inheritance:
|
2015-02-02 22:12:04 +03:00
|
|
|
:members:
|
|
|
|
|
|
|
|
.. _models_ref:
|
|
|
|
|
2014-10-13 09:05:34 +04:00
|
|
|
Models
|
|
|
|
------
|
2015-02-02 22:12:04 +03:00
|
|
|
|
2015-10-15 00:59:14 +03:00
|
|
|
Models are built on top of the SQLAlchemy ORM Base class, and instances are
|
2014-10-13 09:05:34 +04:00
|
|
|
persisted in the database.
|
|
|
|
|
2015-02-02 22:12:04 +03:00
|
|
|
|
2015-01-17 02:55:11 +03:00
|
|
|
.. automodule:: airflow.models
|
2014-10-20 20:40:43 +04:00
|
|
|
:show-inheritance:
|
2015-01-17 19:26:11 +03:00
|
|
|
:members: DAG, BaseOperator, TaskInstance, DagBag, Connection
|
|
|
|
|
|
|
|
Hooks
|
|
|
|
-----
|
|
|
|
.. automodule:: airflow.hooks
|
|
|
|
:show-inheritance:
|
2015-08-03 00:21:32 +03:00
|
|
|
:members:
|
2015-10-20 20:10:52 +03:00
|
|
|
DbApiHook,
|
2015-08-03 00:21:32 +03:00
|
|
|
HiveCliHook,
|
|
|
|
HiveMetastoreHook,
|
|
|
|
HiveServer2Hook,
|
|
|
|
HttpHook,
|
|
|
|
DruidHook,
|
|
|
|
MsSqlHook,
|
|
|
|
MySqlHook,
|
|
|
|
PostgresHook,
|
|
|
|
PrestoHook,
|
|
|
|
S3Hook,
|
2015-11-06 21:06:20 +03:00
|
|
|
SqliteHook,
|
|
|
|
WebHDFSHook
|
2014-10-13 09:05:34 +04:00
|
|
|
|
2015-10-06 02:30:22 +03:00
|
|
|
Community contributed hooks
|
2015-09-19 17:53:45 +03:00
|
|
|
'''''''''''''''''''''''''''
|
|
|
|
|
|
|
|
.. automodule:: airflow.contrib.hooks
|
|
|
|
:show-inheritance:
|
|
|
|
:members:
|
2015-12-13 05:30:15 +03:00
|
|
|
BigQueryHook,
|
2016-01-13 02:16:20 +03:00
|
|
|
GoogleCloudStorageHook,
|
2015-09-19 17:53:45 +03:00
|
|
|
VerticaHook,
|
2015-11-30 14:51:05 +03:00
|
|
|
FTPHook,
|
2015-10-29 22:01:56 +03:00
|
|
|
SSHHook,
|
|
|
|
CloudantHook
|
2015-09-19 17:53:45 +03:00
|
|
|
|
2016-06-29 23:39:15 +03:00
|
|
|
.. autoclass:: airflow.contrib.hooks.gcs_hook.GoogleCloudStorageHook
|
|
|
|
|
2014-10-13 09:05:34 +04:00
|
|
|
Executors
|
|
|
|
---------
|
2015-04-21 17:41:17 +03:00
|
|
|
Executors are the mechanism by which task instances get run.
|
2015-01-18 02:27:01 +03:00
|
|
|
|
2015-01-17 02:55:11 +03:00
|
|
|
.. automodule:: airflow.executors
|
2015-04-21 17:41:17 +03:00
|
|
|
:show-inheritance:
|
2015-10-06 02:30:22 +03:00
|
|
|
:members: LocalExecutor, CeleryExecutor, SequentialExecutor
|
|
|
|
|
2015-10-15 00:59:14 +03:00
|
|
|
Community-contributed executors
|
2015-10-06 02:30:22 +03:00
|
|
|
'''''''''''''''''''''''''''''''
|
|
|
|
|
2016-06-29 23:39:15 +03:00
|
|
|
.. autoclass:: airflow.contrib.executors.mesos_executor.MesosExecutor
|