Code / API
==========

Operators
---------

Operators allow for generation of certain types of tasks that become nodes in
the DAG when instantiated. All operators derive from ``BaseOperator`` and
inherit many attributes and methods that way. Refer to the ``BaseOperator``
documentation for more details.

There are 3 main types of operators:

- Operators that perform an **action**, or tell another system to
  perform an action
- **Transfer** operators move data from one system to another
- **Sensors** are a certain type of operator that will keep running until a
  certain criterion is met. Examples include a specific file landing in HDFS or
  S3, a partition appearing in Hive, or a specific time of the day. Sensors
  are derived from ``BaseSensorOperator`` and run a ``poke``
  method at a specified ``poke_interval`` until it returns ``True``.

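
As a rough illustration of the sensor behaviour described above, the following
plain-Python sketch calls ``poke`` every ``poke_interval`` seconds until it
returns ``True``. It does not import Airflow: ``ToySensor``,
``CountdownSensor`` and the ``timeout`` argument are hypothetical names made up
for this example, not the actual ``BaseSensorOperator`` implementation.

.. code-block:: python

    import time


    class ToySensor:
        """Hypothetical stand-in for a sensor; not Airflow's own class."""

        def __init__(self, poke_interval=0.01, timeout=1.0):
            self.poke_interval = poke_interval  # seconds to wait between pokes
            self.timeout = timeout              # assumed safety limit, in seconds

        def poke(self):
            """Return True once the criterion is met; subclasses override this."""
            raise NotImplementedError

        def execute(self):
            started = time.monotonic()
            # Keep poking until the criterion is met or the time limit passes.
            while not self.poke():
                if time.monotonic() - started > self.timeout:
                    raise RuntimeError("criterion not met before timeout")
                time.sleep(self.poke_interval)


    class CountdownSensor(ToySensor):
        """Succeeds on the third poke, standing in for 'a file has landed'."""

        def __init__(self, **kwargs):
            super().__init__(**kwargs)
            self.pokes = 0

        def poke(self):
            self.pokes += 1
            return self.pokes >= 3


    sensor = CountdownSensor(poke_interval=0.01)
    sensor.execute()
    print(sensor.pokes)  # prints 3

A real sensor would replace the counter in ``poke`` with a check against an
external system, such as testing whether a file or partition exists.
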
BaseOperator
''''''''''''

All operators are derived from ``BaseOperator`` and acquire much
functionality through inheritance. Since this is the core of the engine,
it's worth taking the time to learn the parameters of ``BaseOperator``
and the primitive features they make available to your DAGs.

.. autoclass:: airflow.models.BaseOperator

Operator API
''''''''''''

.. automodule:: airflow.operators
    :show-inheritance:
    :members:
        BashOperator,
        BranchPythonOperator,
        DummyOperator,
        EmailOperator,
        ExternalTaskSensor,
        GenericTransfer,
        HdfsSensor,
        Hive2SambaOperator,
        HiveOperator,
        HivePartitionSensor,
        HiveToDruidTransfer,
        HiveToMySqlTransfer,
        SimpleHttpOperator,
        HttpSensor,
        MsSqlOperator,
        MsSqlToHiveTransfer,
        MySqlOperator,
        MySqlToHiveTransfer,
        PostgresOperator,
        PrestoCheckOperator,
        PrestoIntervalCheckOperator,
        PrestoValueCheckOperator,
        PythonOperator,
        S3KeySensor,
        S3ToHiveTransfer,
        SlackAPIOperator,
        SlackAPIPostOperator,
        SqlSensor,
        SubDagOperator,
        TimeSensor

Community Contributed Operators
'''''''''''''''''''''''''''''''

.. automodule:: airflow.contrib.operators
    :show-inheritance:
    :members:
        VerticaOperator,
        VerticaToHiveTransfer

.. _macros:

Macros
------

Here's a list of variables and macros that can be used in templates.

Default Variables
'''''''''''''''''

The Airflow engine passes a few variables by default that are accessible
in all templates:

================================= ====================================================
Variable                          Description
================================= ====================================================
``{{ ds }}``                      the execution date as ``YYYY-MM-DD``
``{{ yesterday_ds }}``            yesterday's date as ``YYYY-MM-DD``
``{{ tomorrow_ds }}``             tomorrow's date as ``YYYY-MM-DD``
``{{ execution_date }}``          the execution date (a ``datetime.datetime`` object)
``{{ dag }}``                     the DAG object
``{{ task }}``                    the Task object
``{{ macros }}``                  a reference to the macros package, described below
``{{ task_instance }}``           the task_instance object
``{{ ds_nodash }}``               the execution date as ``YYYYMMDD``
``{{ end_date }}``                same as ``{{ ds }}``
``{{ latest_date }}``             same as ``{{ ds }}``
``{{ ti }}``                      same as ``{{ task_instance }}``
``{{ params }}``                  a reference to the user-defined params dictionary
``{{ task_instance_key_str }}``   a unique, human-readable key to the task instance,
                                  formatted ``{dag_id}_{task_id}_{ds}``
``{{ conf }}``                    the full configuration object located at
                                  ``airflow.configuration.conf``, which represents
                                  the content of your ``airflow.cfg``
================================= ====================================================

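
To make the date formats in the table concrete, here is a small standard-library
sketch that derives the date strings from an example execution date. It does
not use Airflow; the sample date is arbitrary, and it assumes that "yesterday"
and "tomorrow" are counted relative to the execution date.

.. code-block:: python

    from datetime import datetime, timedelta

    # Arbitrary example value standing in for {{ execution_date }}.
    execution_date = datetime(2015, 6, 1)

    ds = execution_date.strftime("%Y-%m-%d")       # YYYY-MM-DD
    ds_nodash = execution_date.strftime("%Y%m%d")  # YYYYMMDD

    # Assumption: both are offsets of one day from the execution date.
    yesterday_ds = (execution_date - timedelta(days=1)).strftime("%Y-%m-%d")
    tomorrow_ds = (execution_date + timedelta(days=1)).strftime("%Y-%m-%d")

    print(ds)            # 2015-06-01
    print(ds_nodash)     # 20150601
    print(yesterday_ds)  # 2015-05-31
    print(tomorrow_ds)   # 2015-06-02
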
Note that you can access these objects' attributes and methods with simple
dot notation. Here are some examples of what is possible:
``{{ task.owner }}``, ``{{ task.task_id }}``, ``{{ ti.hostname }}``, ...
Refer to the models documentation for more information on the objects'
attributes and methods.

Macros
''''''

These macros live under the ``macros`` namespace in your templates.

.. automodule:: airflow.macros
    :show-inheritance:
    :members:

.. automodule:: airflow.macros.hive
    :show-inheritance:
    :members:

.. _models_ref:

Models
------

Models are built on top of the SQLAlchemy ORM Base class, and instances are
persisted in the database.

.. automodule:: airflow.models
    :show-inheritance:
    :members: DAG, BaseOperator, TaskInstance, DagBag, Connection

Hooks
-----

.. automodule:: airflow.hooks
    :show-inheritance:
    :members:
        HiveCliHook,
        HiveMetastoreHook,
        HiveServer2Hook,
        HttpHook,
        DruidHook,
        MsSqlHook,
        MySqlHook,
        PostgresHook,
        PrestoHook,
        S3Hook,
        SqliteHook

Community Contributed Hooks
'''''''''''''''''''''''''''

.. automodule:: airflow.contrib.hooks
    :show-inheritance:
    :members:
        VerticaHook,
        FTPHook

Executors
---------

Executors are the mechanism by which task instances get run.

.. automodule:: airflow.executors
    :show-inheritance:
    :members: LocalExecutor, CeleryExecutor, SequentialExecutor, MesosExecutor

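
To give a feel for what "running task instances" means, here is a toy sketch
of an executor that runs queued work one item at a time and records the
outcome. It is an illustration only: ``ToySequentialExecutor``, its methods and
the state strings are hypothetical, and none of this reflects how the
executors listed above are actually implemented.

.. code-block:: python

    from collections import OrderedDict


    class ToySequentialExecutor:
        """Toy illustration only; not Airflow's SequentialExecutor."""

        def __init__(self):
            self.queued = OrderedDict()  # key -> callable, in submission order
            self.states = {}             # key -> "success" or "failed"

        def queue(self, key, fn):
            """Register a unit of work under a unique key."""
            self.queued[key] = fn

        def run_all(self):
            """Run queued work oldest-first and record each outcome."""
            while self.queued:
                key, fn = self.queued.popitem(last=False)
                try:
                    fn()
                    self.states[key] = "success"
                except Exception:
                    self.states[key] = "failed"


    def fail():
        raise ValueError("boom")


    executor = ToySequentialExecutor()
    executor.queue("extract", lambda: None)
    executor.queue("load", fail)
    executor.run_all()
    print(executor.states)

Running work in a separate process or on remote workers, rather than inline as
here, is the kind of difference one would expect between executor types.
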