# INSTALL / BUILD instructions for Apache Airflow

This is a generic installation method that requires a number of dependencies to be installed.

Depending on your system you might need different prerequisites, but the following
systems/prerequisites are known to work:

Linux (Debian Buster and Linux Mint Tricia):

sudo apt install build-essential python3.6-dev python3.7-dev python-dev openssl \
sqlite sqlite-dev default-libmysqlclient-dev libmysqld-dev postgresql

MacOS (Mojave/Catalina):

brew install sqlite mysql postgresql

# [required] fetch the tarball, untar the source, and move into the directory that was untarred.

# [optional] run Apache RAT (release audit tool) to validate license headers
# RAT docs here: https://creadur.apache.org/rat/. Requires Java and Apache RAT
java -jar apache-rat.jar -E ./.rat-excludes -d .

# [optional] Airflow pulls in quite a lot of dependencies in order
# to connect to other services. You might want to test or run Airflow
# from a virtual env to make sure those dependencies are separated
# from your system-wide versions

python3 -m venv PATH_TO_YOUR_VENV
source PATH_TO_YOUR_VENV/bin/activate

NOTE!!

In November 2020, a new version of pip (20.3) was released with a new resolver (the "2020 resolver"). This resolver
does not yet work with Apache Airflow and might lead to installation errors, depending on your choice
of extras. To install Airflow, you need to either downgrade pip to version 20.2.4 with
``pip install --upgrade pip==20.2.4`` or, if you use pip 20.3, add the option
``--use-deprecated legacy-resolver`` to your pip install command.
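
The workaround in the note above could be scripted roughly as follows. This is only a
sketch, assuming a POSIX shell: the helper name legacy_resolver_flag is hypothetical (it is
not part of pip or Airflow), and it prints the option from the note only for pip 20.3.x,
since that is the only version the note covers.

```shell
# Hypothetical helper (not part of pip or Airflow): print the legacy-resolver
# option for pip 20.3.x, and nothing for any other version.
legacy_resolver_flag() {
    case "$1" in
        20.3|20.3.*) echo "--use-deprecated legacy-resolver" ;;
        *) ;;
    esac
}

# Example usage (commented out so the sketch has no side effects):
# pip_version="$(pip --version | awk '{print $2}')"
# pip install . $(legacy_resolver_flag "$pip_version")
```

The substitution in the usage line is left unquoted on purpose so that the option and its
value are passed to pip as two separate words.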

# [required] building and installing by pip (preferred)
pip install .

# or directly
python setup.py install

# You can also install the recommended versions of the dependencies by using
# constraints-<PYTHON_MAJOR_MINOR_VERSION>.txt files as the constraint file. This is needed in case
# you have problems with installing the current requirements from PyPI.
# There are different constraint files for different Python versions. For example:

pip install . \
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-master/constraints-3.6.txt"
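
Instead of hard-coding the Python version in the constraint URL, it can be derived. The
sketch below assumes a POSIX shell and follows the constraints-<version>.txt pattern of the
example above. The helper name constraints_url is hypothetical (it is not an Airflow or pip
command), and whether a constraint file exists for a given Python version is something you
should verify yourself.

```shell
# Hypothetical helper: build the constraint file URL for a Python version
# given as MAJOR.MINOR (for example 3.6).
constraints_url() {
    echo "https://raw.githubusercontent.com/apache/airflow/constraints-master/constraints-$1.txt"
}

# Example usage (commented out so the sketch has no side effects):
# py_version="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')"
# pip install . --constraint "$(constraints_url "$py_version")"
```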

By default, `pip install` in Airflow 2.0 installs only the provider packages that are needed by the extras and
installs them as packages from PyPI rather than from local sources:

pip install .[google,amazon] \
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-master/constraints-3.6.txt"

You can upgrade just Airflow, without paying attention to the providers' dependencies, by using the 'no-providers'
constraint files. This allows you to keep the installed provider packages.

pip install . --upgrade \
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-master/constraints-no-providers-3.6.txt"

You can also install Airflow in "editable mode" (with the -e flag). Provider packages are then
available directly from the sources, and any provider packages installed from PyPI are UNINSTALLED in
order to avoid having providers in two places. The `provider.yaml` files are then used to discover the capabilities
of the providers that are part of the Airflow source code.

You can read more about `provider.yaml` and community-managed providers in
https://airflow.apache.org/docs/apache-airflow-providers/index.html for developing custom providers
and in ``CONTRIBUTING.rst`` for developing community maintained providers.

Editable mode is useful if you want to develop providers:

pip install -e . \
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-master/constraints-3.6.txt"

You can also skip installing provider packages from PyPI by setting INSTALL_PROVIDERS_FROM_SOURCES to "true".
In this case Airflow will be installed in non-editable mode with all providers installed from the sources.
Additionally, `provider.yaml` files will be copied to the providers folders, which makes the providers
discoverable by Airflow even though they are not installed from packages.

INSTALL_PROVIDERS_FROM_SOURCES="true" pip install . \
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-master/constraints-3.6.txt"

Airflow can be installed with extras to install some additional features (for example 'async' or 'doc'), or
to automatically install providers and all dependencies needed by those providers:

pip install .[async,google,amazon] \
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-master/constraints-3.6.txt"
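
Some shells (zsh, for example) treat square brackets as glob characters, so quoting the
extras specification is a safe habit. The sketch below assumes a POSIX shell. The helper
name extras_spec is hypothetical (it is not part of pip or Airflow); it joins extra names
into the .[a,b,c] form used in the command above.

```shell
# Hypothetical helper: join extra names with commas into ".[a,b,c]".
# The function body runs in a subshell, so the IFS change does not leak.
extras_spec() (
    IFS=,
    echo ".[$*]"
)

# Example usage (commented out so the sketch has no side effects):
# pip install "$(extras_spec async google amazon)"
```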

The list of available extras:

# START EXTRAS HERE

airbyte, all, all_dbs, amazon, apache.atlas, apache.beam, apache.cassandra, apache.druid,
apache.hdfs, apache.hive, apache.kylin, apache.livy, apache.pig, apache.pinot, apache.spark,
apache.sqoop, apache.webhdfs, async, atlas, aws, azure, cassandra, celery, cgroups, cloudant,
cncf.kubernetes, crypto, dask, databricks, datadog, devel, devel_all, devel_ci, devel_hadoop,
dingding, discord, doc, docker, druid, elasticsearch, exasol, facebook, ftp, gcp, gcp_api,
github_enterprise, google, google_auth, grpc, hashicorp, hdfs, hive, http, imap, jdbc, jenkins,
jira, kerberos, kubernetes, ldap, microsoft.azure, microsoft.mssql, microsoft.winrm, mongo, mssql,
mysql, neo4j, odbc, openfaas, opsgenie, oracle, pagerduty, papermill, password, pinot, plexus,
postgres, presto, qds, qubole, rabbitmq, redis, s3, salesforce, samba, segment, sendgrid, sentry,
sftp, singularity, slack, snowflake, spark, sqlite, ssh, statsd, tableau, telegram, trino, vertica,
virtualenv, webhdfs, winrm, yandex, zendesk

# END EXTRAS HERE

# For installing Airflow in development environments - see CONTRIBUTING.rst