update requirements, introduce constraints file and clean up for 2.3.3

Harold Woo 2022-11-21 15:57:11 -08:00 committed by Mikaël Ducharme
Parent d0e27aeef1
Commit d2505e9ef3
11 changed files with 1984 additions and 640 deletions

View file

@@ -24,7 +24,7 @@ jobs:
test:
docker:
- image: python:3.7
- image: python:3.8.12
working_directory: ~/mozilla/telemetry-airflow
steps:
- checkout
@@ -35,7 +35,7 @@ jobs:
verify-requirements:
docker:
- image: python:3.7
- image: python:3.8.12
steps:
- checkout
- run:

View file

@@ -2,7 +2,7 @@
# linux kernel than provided by CircleCI, per
# https://forums.docker.com/t/multiple-projects-stopped-building-on-docker-hub-operation-not-permitted/92570/6
# and https://forums.docker.com/t/multiple-projects-stopped-building-on-docker-hub-operation-not-permitted/92570/11
FROM python:3.7-slim-buster
FROM python:3.8.12-slim-buster
MAINTAINER Harold Woo <hwoo@mozilla.com>
# Due to AIRFLOW-6854, Python 3.7 is chosen as the base python version.

View file

@@ -4,18 +4,18 @@
help:
@echo "Welcome to the Telemetry Airflow\n"
@echo "The list of commands for local development:\n"
@echo " build Builds the docker images for the docker-compose setup"
@echo " build-linux Builds the docker images using the current user ID and group ID"
@echo " clean Stops and removes all docker containers"
@echo " migrate Runs the Django database migrations"
@echo " redis-cli Opens a Redis CLI"
@echo " run Run a airflow command"
@echo " secret Create a secret to be used for a config variable"
@echo " shell Opens a Bash shell"
@echo " up Runs the whole stack, served under http://localhost:8000/"
@echo " gke Create a sandbox gke cluster for testing"
@echo " clean-gke Delete the sandbox gke cluster"
@echo " stop Stops the docker containers"
@echo " build Builds the docker images for the docker-compose setup"
@echo " build-linux Builds the docker images using the current user ID and group ID"
@echo " clean Stops and removes all docker containers"
@echo " pip-compile Compile dependencies from 'requirements.in' into 'requirements.txt'"
@echo " pip-install-local Install pip project requirements to your local environment"
@echo " migrate Runs the Django database migrations"
@echo " redis-cli Opens a Redis CLI"
@echo " shell Opens a Bash shell"
@echo " up Runs the whole stack, served under http://localhost:8000/"
@echo " gke Create a sandbox gke cluster for testing"
@echo " clean-gke Delete the sandbox gke cluster"
@echo " stop Stops the docker containers"
build:
docker-compose build
@@ -23,24 +23,23 @@ build:
build-linux:
docker-compose build --build-arg AIRFLOW_UID="$$(id -u)" --build-arg AIRFLOW_GID="$$(id -g)"
pip-compile:
pip-compile
clean: stop
docker-compose rm -f
rm -rf logs/*
if [ -f airflow-worker.pid ]; then rm airflow-worker.pid; fi
pip-install-local: pip-compile
pip install -r requirements.txt
shell:
docker-compose run web bash
redis-cli:
docker-compose run redis redis-cli -h redis
run:
docker-compose run -e DEV_USERNAME -e AWS_SECRET_ACCESS_KEY -e AWS_ACCESS_KEY_ID web airflow $(COMMAND)
secret:
@docker-compose run web python -c \
"from cryptography.fernet import Fernet; print Fernet.generate_key()"
stop:
docker-compose down
docker-compose stop
@@ -56,6 +55,3 @@ clean-gke:
test:
python -m pytest tests/
compile-requirements:
venv/bin/pip-compile requirements.in --output-file requirements.txt

View file

@@ -24,23 +24,30 @@ Some links relevant to users and developers of WTMO:
This app is built and deployed with
[docker](https://docs.docker.com/) and
[docker-compose](https://docs.docker.com/compose/).
Dependencies are managed with
[pip-tools](https://pypi.org/project/pip-tools/) `pip-compile`.
You'll also need to install MySQL to build the database container.
### Updating Python dependencies
### Installing dependencies locally
**_⚠ Make sure you use the right Python version. Refer to Dockerfile for current supported Python Version ⚠_**
Add new Python dependencies into `requirements.in`. Run the following commands with the same Python
version specified by the Dockerfile.
You can install the project dependencies locally to run tests with Pytest. We use the
[official Airflow constraints](https://airflow.apache.org/docs/apache-airflow/stable/installation/installing-from-pypi.html#constraints-files) file to simplify
Airflow dependency management. Install dependencies locally using the following command:
```bash
# As of time of writing, python3.7
pip install pip-tools
pip-compile
make pip-install-local
```
### Build Container
### Updating Python dependencies
An Airflow container can be built with
Add new Python dependencies into `requirements.in` and execute `make pip-install-local`
### Build Container
**_⚠ See [Local Deployment](#local-deployment) section below for Linux and macOS specific instructions ⚠_**
Build Airflow image with
```bash
make build
@@ -52,16 +59,6 @@ Airflow database migration is no longer a separate step for dev but is run by th
## Testing
A single task, e.g. `spark`, of an Airflow dag, e.g. `example`, can be run with an execution date, e.g. `2018-01-01`, in the `dev` environment with:
```bash
make run COMMAND="tasks test example spark 20180101"
```
```bash
docker logs -f telemetryairflow_scheduler_1
```
### Adding dummy credentials
Tasks often require credentials to access external credentials. For example, one may choose to store

617
constraints.txt Normal file
View file

@@ -0,0 +1,617 @@
#
# This constraints file was automatically generated on 2022-07-05T21:12:10Z
# via "eager-upgrade" mechanism of PIP. For the "v2-3-test" branch of Airflow.
# This variant of constraints install uses the HEAD of the branch version for 'apache-airflow' but installs
# the providers from PIP-released packages at the moment of the constraint generation.
#
# Those constraints are actually those that regular users use to install released version of Airflow.
# We also use those constraints after "apache-airflow" is released and the constraints are tagged with
# "constraints-X.Y.Z" tag to build the production image for that version.
#
# Editable install with no version control (apache-airflow==2.3.3)
APScheduler==3.6.3
Authlib==0.15.5
Babel==2.10.3
Deprecated==1.2.13
Flask-AppBuilder==4.1.2
Flask-Babel==2.0.0
Flask-Bcrypt==1.0.1
Flask-Caching==2.0.0
Flask-JWT-Extended==4.4.2
Flask-Login==0.6.1
Flask-SQLAlchemy==2.5.1
Flask-Session==0.4.0
Flask-WTF==0.15.1
Flask==2.1.2
GitPython==3.1.27
HeapDict==1.0.1
JPype1==1.4.0
JayDeBeApi==1.2.3
Jinja2==3.1.2
Mako==1.2.1
Markdown==3.3.7
MarkupSafe==2.1.1
PyGithub==1.55
PyHive==0.6.5
PyJWT==2.4.0
PyNaCl==1.5.0
PyYAML==6.0
Pygments==2.12.0
Rx==3.2.0
SQLAlchemy-JSONField==1.0.0
SQLAlchemy-Utils==0.38.2
SQLAlchemy==1.4.27
SecretStorage==3.3.2
Sphinx==5.0.2
Unidecode==1.3.4
WTForms==2.3.3
Werkzeug==2.1.2
adal==1.2.7
aiohttp==3.8.1
aiosignal==1.2.0
alabaster==0.7.12
alembic==1.8.0
aliyun-python-sdk-core==2.13.36
aliyun-python-sdk-kms==2.15.0
amqp==5.1.1
analytics-python==1.4.0
ansiwrap==0.8.4
anyio==3.6.1
apache-airflow-providers-airbyte==3.0.0
apache-airflow-providers-alibaba==2.0.0
apache-airflow-providers-amazon==4.0.0
apache-airflow-providers-apache-beam==4.0.0
apache-airflow-providers-apache-cassandra==3.0.0
apache-airflow-providers-apache-drill==2.0.0
apache-airflow-providers-apache-druid==3.0.0
apache-airflow-providers-apache-hdfs==3.0.0
apache-airflow-providers-apache-hive==3.0.0
apache-airflow-providers-apache-kylin==3.0.0
apache-airflow-providers-apache-livy==3.0.0
apache-airflow-providers-apache-pig==3.0.0
apache-airflow-providers-apache-pinot==3.0.0
apache-airflow-providers-apache-spark==3.0.0
apache-airflow-providers-apache-sqoop==3.0.0
apache-airflow-providers-arangodb==2.0.0
apache-airflow-providers-asana==2.0.0
apache-airflow-providers-celery==3.0.0
apache-airflow-providers-cloudant==3.0.0
apache-airflow-providers-cncf-kubernetes==4.1.0
apache-airflow-providers-databricks==3.0.0
apache-airflow-providers-datadog==3.0.0
apache-airflow-providers-dbt-cloud==2.0.0
apache-airflow-providers-dingding==3.0.0
apache-airflow-providers-discord==3.0.0
apache-airflow-providers-docker==3.0.0
apache-airflow-providers-elasticsearch==4.0.0
apache-airflow-providers-exasol==3.0.0
apache-airflow-providers-facebook==3.0.0
apache-airflow-providers-ftp==3.0.0
apache-airflow-providers-github==2.0.0
apache-airflow-providers-google==8.1.0
apache-airflow-providers-grpc==3.0.0
apache-airflow-providers-hashicorp==3.0.0
apache-airflow-providers-http==3.0.0
apache-airflow-providers-imap==3.0.0
apache-airflow-providers-influxdb==2.0.0
apache-airflow-providers-jdbc==3.0.0
apache-airflow-providers-jenkins==3.0.0
apache-airflow-providers-jira==3.0.0
apache-airflow-providers-microsoft-azure==4.0.0
apache-airflow-providers-microsoft-mssql==3.0.0
apache-airflow-providers-microsoft-psrp==2.0.0
apache-airflow-providers-microsoft-winrm==3.0.0
apache-airflow-providers-mongo==3.0.0
apache-airflow-providers-mysql==3.0.0
apache-airflow-providers-neo4j==3.0.0
apache-airflow-providers-odbc==3.0.0
apache-airflow-providers-openfaas==3.0.0
apache-airflow-providers-opsgenie==4.0.0
apache-airflow-providers-oracle==3.1.0
apache-airflow-providers-pagerduty==3.0.0
apache-airflow-providers-papermill==3.0.0
apache-airflow-providers-plexus==3.0.0
apache-airflow-providers-postgres==5.0.0
apache-airflow-providers-presto==3.0.0
apache-airflow-providers-qubole==3.0.0
apache-airflow-providers-redis==3.0.0
apache-airflow-providers-salesforce==4.0.0
apache-airflow-providers-samba==4.0.0
apache-airflow-providers-segment==3.0.0
apache-airflow-providers-sendgrid==3.0.0
apache-airflow-providers-sftp==3.0.0
apache-airflow-providers-singularity==3.0.0
apache-airflow-providers-slack==5.0.0
apache-airflow-providers-snowflake==3.0.0
apache-airflow-providers-sqlite==3.0.0
apache-airflow-providers-ssh==3.0.0
apache-airflow-providers-tableau==3.0.0
apache-airflow-providers-telegram==3.0.0
apache-airflow-providers-trino==3.0.0
apache-airflow-providers-vertica==3.0.0
apache-airflow-providers-yandex==3.0.0
apache-airflow-providers-zendesk==4.0.0
apache-beam==2.40.0
apispec==3.3.2
appdirs==1.4.4
argcomplete==2.0.0
arrow==1.2.2
asana==1.0.0
asn1crypto==1.5.1
astroid==2.11.6
asttokens==2.0.5
async-timeout==4.0.2
asynctest==0.13.0
atlasclient==1.0.0
attrs==20.3.0
aws-sam-translator==1.46.0
aws-xray-sdk==2.10.0
azure-batch==12.0.0
azure-common==1.1.28
azure-core==1.24.2
azure-cosmos==4.3.0
azure-datalake-store==0.0.52
azure-identity==1.10.0
azure-keyvault-secrets==4.4.0
azure-kusto-data==0.0.45
azure-mgmt-containerinstance==1.5.0
azure-mgmt-core==1.3.1
azure-mgmt-datafactory==1.1.0
azure-mgmt-datalake-nspkg==3.0.1
azure-mgmt-datalake-store==0.5.0
azure-mgmt-nspkg==3.0.2
azure-mgmt-resource==21.1.0
azure-nspkg==3.0.2
azure-servicebus==7.7.0
azure-storage-blob==12.8.1
azure-storage-common==2.1.0
azure-storage-file==2.1.0
backcall==0.2.0
backoff==1.10.0
backports.zoneinfo==0.2.1
bcrypt==3.2.2
beautifulsoup4==4.11.1
billiard==3.6.4.0
black==22.6.0
bleach==5.0.1
blinker==1.4
boto3==1.24.23
boto==2.49.0
botocore==1.27.23
bowler==0.9.0
cached-property==1.5.2
cachelib==0.9.0
cachetools==4.2.2
cassandra-driver==3.25.0
cattrs==1.10.0
celery==5.2.7
certifi==2020.12.5
cffi==1.15.1
cfgv==3.3.1
cfn-lint==0.61.1
cgroupspy==0.2.2
charset-normalizer==2.0.12
click-default-group==1.2.2
click-didyoumean==0.3.0
click-plugins==1.1.1
click-repl==0.2.0
click==8.1.3
clickclick==20.10.2
cloudant==2.15.0
cloudpickle==2.1.0
colorama==0.4.5
colorlog==4.8.0
commonmark==0.9.1
connexion==2.14.0
coverage==6.4.1
crcmod==1.7
cron-descriptor==1.2.30
croniter==1.3.5
cryptography==36.0.2
curlify==2.2.1
cx-Oracle==8.3.0
dask==2022.6.1
databricks-sql-connector==2.0.2
datadog==0.44.0
db-dtypes==1.0.2
decorator==5.1.1
defusedxml==0.7.1
dill==0.3.1.1
distlib==0.3.4
distributed==2022.6.1
dnspython==2.2.1
docker==5.0.3
docopt==0.6.2
docutils==0.19
ecdsa==0.17.0
elasticsearch-dbapi==0.2.9
elasticsearch-dsl==7.4.0
elasticsearch==7.13.4
email-validator==1.2.1
entrypoints==0.4
eventlet==0.33.1
execnet==1.9.0
executing==0.8.3
facebook-business==14.0.0
fastavro==1.5.2
fastjsonschema==2.15.3
filelock==3.7.1
fissix==21.11.13
flake8-colors==0.1.9
flake8==4.0.1
flake8_implicit_str_concat==0.3.0
flaky==3.7.0
flower==1.1.0
freezegun==1.2.1
frozenlist==1.3.0
fsspec==2022.5.0
future==0.18.2
gcsfs==2022.5.0
geomet==0.2.1.post1
gevent==21.12.0
gitdb==4.0.9
google-ads==17.0.0
google-api-core==2.8.2
google-api-python-client==1.12.11
google-auth-httplib2==0.1.0
google-auth-oauthlib==0.5.2
google-auth==2.9.0
google-cloud-aiplatform==1.15.0
google-cloud-appengine-logging==1.1.2
google-cloud-audit-log==0.2.2
google-cloud-automl==2.7.3
google-cloud-bigquery-datatransfer==3.6.2
google-cloud-bigquery-storage==2.13.2
google-cloud-bigquery==2.34.4
google-cloud-bigtable==1.7.2
google-cloud-build==3.8.3
google-cloud-container==2.10.8
google-cloud-core==2.3.1
google-cloud-datacatalog==3.8.1
google-cloud-dataplex==1.0.1
google-cloud-dataproc-metastore==1.5.1
google-cloud-dataproc==4.0.3
google-cloud-dlp==1.0.2
google-cloud-kms==2.11.2
google-cloud-language==1.3.2
google-cloud-logging==3.1.2
google-cloud-memcache==1.3.2
google-cloud-monitoring==2.9.2
google-cloud-orchestration-airflow==1.3.2
google-cloud-os-login==2.6.2
google-cloud-pubsub==2.13.0
google-cloud-redis==2.8.1
google-cloud-resource-manager==1.5.1
google-cloud-secret-manager==1.0.2
google-cloud-spanner==1.19.3
google-cloud-speech==1.3.4
google-cloud-storage==1.44.0
google-cloud-tasks==2.9.1
google-cloud-texttospeech==1.0.3
google-cloud-translate==1.7.2
google-cloud-videointelligence==1.16.3
google-cloud-vision==1.0.2
google-cloud-workflows==1.6.3
google-crc32c==1.3.0
google-resumable-media==2.3.3
googleapis-common-protos==1.56.3
graphql-core==3.2.1
graphviz==0.20
greenlet==1.1.2
grpc-google-iam-v1==0.12.4
grpcio-gcp==0.2.2
grpcio-status==1.47.0
grpcio==1.47.0
gssapi==1.7.3
gunicorn==20.1.0
h11==0.12.0
hdfs==2.7.0
hmsclient==0.1.1
httpcore==0.15.0
httplib2==0.20.4
httpx==0.23.0
humanize==4.2.3
hvac==0.11.2
identify==2.5.1
idna==3.3
ijson==3.1.4
imagesize==1.4.1
importlib-metadata==4.12.0
importlib-resources==5.8.0
incremental==21.3.0
inflection==0.5.1
influxdb-client==1.30.0
iniconfig==1.1.1
ipdb==0.13.9
ipython==8.4.0
isodate==0.6.1
itsdangerous==2.1.2
jedi==0.18.1
jeepney==0.8.0
jira==3.2.0
jmespath==0.10.0
jschema-to-python==1.2.3
json-merge-patch==0.2
jsondiff==2.0.0
jsonpatch==1.32
jsonpath-ng==1.5.3
jsonpickle==2.2.0
jsonpointer==2.3
jsonschema==4.6.1
junit-xml==1.9
jupyter-client==7.3.4
jupyter-core==4.10.0
keyring==23.6.0
kombu==5.2.4
krb5==0.3.0
kubernetes==23.6.0
kylinpy==2.8.4
lazy-object-proxy==1.7.1
ldap3==2.9.1
linkify-it-py==2.0.0
locket==1.0.0
lockfile==0.12.2
looker-sdk==22.4.0
lxml==4.9.1
markdown-it-py==2.1.0
marshmallow-enum==1.5.1
marshmallow-oneofschema==3.0.1
marshmallow-sqlalchemy==0.26.1
marshmallow==3.17.0
matplotlib-inline==0.1.3
mccabe==0.6.1
mdit-py-plugins==0.3.0
mdurl==0.1.1
mongomock==4.0.0
monotonic==1.6
more-itertools==8.13.0
moreorless==0.4.0
moto==3.1.16
msal-extensions==1.0.0
msal==1.18.0
msgpack==1.0.4
msrest==0.7.1
msrestazure==0.6.4
multi-key-dict==2.0.3
multidict==6.0.2
mypy-boto3-rds==1.24.23
mypy-boto3-redshift-data==1.24.11.post3
mypy-extensions==0.4.3
mypy==0.910
mysql-connector-python==8.0.29
mysqlclient==2.1.1
nbclient==0.6.6
nbformat==5.4.0
neo4j==4.4.4
nest-asyncio==1.5.5
networkx==2.8.4
nodeenv==1.7.0
ntlm-auth==1.5.0
numpy==1.22.4
oauthlib==3.2.0
openapi-schema-validator==0.2.3
openapi-spec-validator==0.4.0
opsgenie-sdk==2.1.5
oracledb==1.0.1
orjson==3.7.6
oscrypto==1.3.0
oss2==2.15.0
packaging==21.3
pandas-gbq==0.17.6
pandas==1.4.3
papermill==2.3.4
parameterized==0.8.1
paramiko==2.11.0
parso==0.8.3
partd==1.2.0
pathspec==0.9.0
pbr==5.9.0
pdpyras==4.5.0
pendulum==2.1.2
pexpect==4.8.0
pickleshare==0.7.5
pinotdb==0.3.12
pipdeptree==2.2.1
pipx==1.1.0
pkginfo==1.8.3
platformdirs==2.5.2
pluggy==1.0.0
ply==3.11
plyvel==1.4.0
portalocker==2.4.0
pre-commit==2.19.0
presto-python-client==0.8.2
prison==0.2.1
prometheus-client==0.14.1
prompt-toolkit==3.0.30
proto-plus==1.19.6
protobuf==3.20.0
psutil==5.9.1
psycopg2-binary==2.9.3
ptyprocess==0.7.0
pure-eval==0.2.2
pure-sasl==0.6.2
py4j==0.10.9.5
py==1.11.0
pyOpenSSL==22.0.0
pyarrow==6.0.1
pyasn1-modules==0.2.8
pyasn1==0.4.8
pycodestyle==2.8.0
pycountry==22.3.5
pycparser==2.21
pycryptodome==3.15.0
pycryptodomex==3.15.0
pydata-google-auth==1.4.0
pydot==1.4.2
pydruid==0.6.3
pyenchant==3.2.2
pyexasol==0.24.0
pyflakes==2.4.0
pykerberos==1.2.4
pymongo==3.12.3
pymssql==2.2.5
pyodbc==4.0.32
pyparsing==3.0.9
pypsrp==0.8.1
pyrsistent==0.18.1
pysftp==0.2.9
pyspark==3.3.0
pyspnego==0.5.2
pytest-asyncio==0.18.3
pytest-cov==3.0.0
pytest-forked==1.4.0
pytest-httpx==0.21.0
pytest-instafail==0.4.2
pytest-rerunfailures==9.1.1
pytest-timeouts==1.2.1
pytest-xdist==2.5.0
pytest==6.2.5
python-arango==7.4.1
python-daemon==2.3.0
python-dateutil==2.8.2
python-http-client==3.3.7
python-jenkins==1.7.0
python-jose==3.3.0
python-ldap==3.4.0
python-nvd3==0.15.0
python-slugify==6.1.2
python-telegram-bot==13.13
pytz-deprecation-shim==0.1.0.post0
pytz==2022.1
pytzdata==2020.1
pywinrm==0.4.3
pyzmq==23.2.0
qds-sdk==1.16.1
readme-renderer==35.0
redis==3.5.3
redshift-connector==2.0.908
requests-file==1.5.1
requests-kerberos==0.14.0
requests-mock==1.9.3
requests-ntlm==1.1.0
requests-oauthlib==1.3.1
requests-toolbelt==0.9.1
requests==2.28.0
responses==0.21.0
rfc3986==1.5.0
rich-click==1.5.1
rich==12.4.4
rsa==4.8
s3transfer==0.6.0
sarif-om==1.0.4
sasl==0.3.1
scramp==1.4.1
scrapbook==0.5.0
semver==2.13.0
sendgrid==6.9.7
sentinels==1.0.0
sentry-sdk==1.6.0
setproctitle==1.2.3
simple-salesforce==1.11.6
six==1.16.0
slack-sdk==3.17.2
smbprotocol==1.9.0
smmap==5.0.0
snakebite-py3==3.0.5
sniffio==1.2.0
snowballstemmer==2.2.0
snowflake-connector-python==2.7.9
snowflake-sqlalchemy==1.3.4
sortedcontainers==2.4.0
soupsieve==2.3.2.post1
sphinx-airflow-theme==0.0.9
sphinx-argparse==0.3.1
sphinx-autoapi==1.8.4
sphinx-copybutton==0.5.0
sphinx-jinja==2.0.2
sphinx-rtd-theme==1.0.0
sphinxcontrib-applehelp==1.0.2
sphinxcontrib-devhelp==1.0.2
sphinxcontrib-htmlhelp==2.0.0
sphinxcontrib-httpdomain==1.8.0
sphinxcontrib-jsmath==1.0.1
sphinxcontrib-qthelp==1.0.3
sphinxcontrib-redoc==1.6.0
sphinxcontrib-serializinghtml==1.1.5
sphinxcontrib-spelling==7.6.0
spython==0.2.1
sqlalchemy-bigquery==1.4.4
sqlalchemy-drill==1.1.2
sqlalchemy-redshift==0.8.9
sqlparse==0.4.2
sshpubkeys==3.3.1
sshtunnel==0.4.0
stack-data==0.3.0
starkbank-ecdsa==2.0.3
statsd==3.3.0
swagger-ui-bundle==0.0.9
tableauserverclient==0.19.0
tabulate==0.8.10
tblib==1.7.0
tenacity==8.0.1
termcolor==1.1.0
text-unidecode==1.3
textwrap3==0.9.2
thrift-sasl==0.4.3
thrift==0.16.0
toml==0.10.2
tomli==2.0.1
toolz==0.11.2
tornado==6.2
towncrier==21.9.0
tqdm==4.64.0
traitlets==5.3.0
trino==0.314.0
twine==4.0.1
types-Deprecated==1.2.9
types-Markdown==3.3.29
types-PyMySQL==1.0.19
types-PyYAML==6.0.9
types-boto==2.49.16
types-certifi==2021.10.8.3
types-croniter==1.3.0
types-cryptography==3.3.21
types-docutils==0.18.3
types-freezegun==1.1.10
types-paramiko==2.11.2
types-protobuf==3.19.22
types-python-dateutil==2.8.18
types-python-slugify==6.1.0
types-pytz==2022.1.1
types-redis==4.3.3
types-requests==2.28.0
types-setuptools==57.4.18
types-six==1.16.17
types-tabulate==0.8.11
types-termcolor==1.1.5
types-toml==0.10.7
types-urllib3==1.26.15
typing_extensions==4.3.0
tzdata==2022.1
tzlocal==4.2
uamqp==1.5.3
uc-micro-py==1.0.1
unicodecsv==0.14.1
uritemplate==3.0.1
urllib3==1.26.9
userpath==1.8.0
vertica-python==1.1.0
vine==5.0.0
virtualenv==20.15.1
volatile==2.1.0
watchtower==2.0.1
wcwidth==0.2.5
webencodings==0.5.1
websocket-client==1.3.3
wrapt==1.14.1
xmltodict==0.13.0
yamllint==1.26.3
yandexcloud==0.171.0
yarl==1.7.2
zeep==4.1.0
zenpy==2.0.24
zict==2.2.0
zipp==3.8.0
zope.event==4.5.0
zope.interface==5.4.0
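
Review note: the constraints file above is a flat list of `name==version` pins, so it is straightforward to check which version a given package resolves to. For example, the old `requirements.in` in this commit pinned `xmltodict==0.12.0`, while the constraints file pins `0.13.0`. The snippet below is a minimal sketch of such a lookup, not part of the commit; `parse_constraints` is a hypothetical helper name, and the sample text is a short excerpt of pins copied from the file above.

```python
from typing import Dict


def parse_constraints(text: str) -> Dict[str, str]:
    """Map each pinned package name to its version, skipping comments and blanks."""
    pins: Dict[str, str] = {}
    for raw_line in text.splitlines():
        # Drop full-line and trailing comments before looking for a pin.
        line = raw_line.split("#", 1)[0].strip()
        if not line or "==" not in line:
            continue
        name, version = line.split("==", 1)
        # Lowercase the name for lookups; this sketch does not fold "-", "_"
        # and "." the way pip's full name normalisation does.
        pins[name.strip().lower()] = version.strip()
    return pins


# Excerpt of pins taken from the constraints file in this commit.
sample = """\
# Editable install with no version control (apache-airflow==2.3.3)
APScheduler==3.6.3
Authlib==0.15.5
xmltodict==0.13.0
"""

pins = parse_constraints(sample)
print(pins["xmltodict"])
```

Reading the whole `constraints.txt` from disk and passing its contents to the same function should work the same way, assuming the file keeps the one-pin-per-line layout shown above.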

View file

@@ -1,55 +0,0 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"from os import environ"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"date parameter is missing!\n"
]
}
],
"source": [
"print environ.get('date', 'date parameter is missing!')"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.11"
}
},
"nbformat": 4,
"nbformat_minor": 0
}

View file

@@ -1,2 +0,0 @@
#!/bin/bash
echo $date

View file

@@ -1,48 +0,0 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [
{
"ename": "NameError",
"evalue": "name 'foobar' is not defined",
"output_type": "error",
"traceback": [
"\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[1;31mNameError\u001b[0m Traceback (most recent call last)",
"\u001b[1;32m<ipython-input-1-14758f1afd44>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m()\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[0mfoobar\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[1;31mNameError\u001b[0m: name 'foobar' is not defined"
]
}
],
"source": [
"foobar"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.11"
}
},
"nbformat": 4,
"nbformat_minor": 0
}

View file

@@ -1,212 +0,0 @@
# ---------------------------------------------------------------------------
#
# This file configures the New Relic Python Agent.
#
# The path to the configuration file should be supplied to the function
# newrelic.agent.initialize() when the agent is being initialized.
#
# The configuration file follows a structure similar to what you would
# find for Microsoft Windows INI files. For further information on the
# configuration file format see the Python ConfigParser documentation at:
#
# http://docs.python.org/library/configparser.html
#
# For further discussion on the behaviour of the Python agent that can
# be configured via this configuration file see:
#
# http://newrelic.com/docs/python/python-agent-configuration
#
# ---------------------------------------------------------------------------
# Here are the settings that are common to all environments.
[newrelic]
# You must specify the license key associated with your New
# Relic account. This key binds the Python Agent's data to your
# account in the New Relic service.
# license_key = use NEW_RELIC_LICENSE_KEY environment variable
# The application name. Set this to be the name of your
# application as you would like it to show up in New Relic UI.
# The UI will then auto-map instances of your application into a
# entry on your home dashboard page.
# app_name = use NEW_RELIC_APP_NAME environment variable
# When "true", the agent collects performance data about your
# application and reports this data to the New Relic UI at
# newrelic.com. This global switch is normally overridden for
# each environment below.
monitor_mode = true
# Sets the name of a file to log agent messages to. Useful for
# debugging any issues with the agent. This is not set by
# default as it is not known in advance what user your web
# application processes will run as and where they have
# permission to write to. Whatever you set this to you must
# ensure that the permissions for the containing directory and
# the file itself are correct, and that the user that your web
# application runs as can write to the file. If not able to
# write out a log file, it is also possible to say "stderr" and
# output to standard error output. This would normally result in
# output appearing in your web server log.
#log_file = /tmp/newrelic-python-agent.log
# Sets the level of detail of messages sent to the log file, if
# a log file location has been provided. Possible values, in
# increasing order of detail, are: "critical", "error", "warning",
# "info" and "debug". When reporting any agent issues to New
# Relic technical support, the most useful setting for the
# support engineers is "debug". However, this can generate a lot
# of information very quickly, so it is best not to keep the
# agent at this level for longer than it takes to reproduce the
# problem you are experiencing.
log_level = info
# The Python Agent communicates with the New Relic service using
# SSL by default. Note that this does result in an increase in
# CPU overhead, over and above what would occur for a non SSL
# connection, to perform the encryption involved in the SSL
# communication. This work is though done in a distinct thread
# to those handling your web requests, so it should not impact
# response times. You can if you wish revert to using a non SSL
# connection, but this will result in information being sent
# over a plain socket connection and will not be as secure.
ssl = true
# High Security Mode enforces certain security settings, and
# prevents them from being overridden, so that no sensitive data
# is sent to New Relic. Enabling High Security Mode means that
# SSL is turned on, request parameters are not collected, and SQL
# can not be sent to New Relic in its raw form. To activate High
# Security Mode, it must be set to 'true' in this local .ini
# configuration file AND be set to 'true' in the server-side
# configuration in the New Relic user interface. For details, see
# https://docs.newrelic.com/docs/subscriptions/high-security
# high_security = true
high_security = false
# The Python Agent will attempt to connect directly to the New
# Relic service. If there is an intermediate firewall between
# your host and the New Relic service that requires you to use a
# HTTP proxy, then you should set both the "proxy_host" and
# "proxy_port" settings to the required values for the HTTP
# proxy. The "proxy_user" and "proxy_pass" settings should
# additionally be set if proxy authentication is implemented by
# the HTTP proxy. The "proxy_scheme" setting dictates what
# protocol scheme is used in talking to the HTTP proxy. This
# would normally always be set as "http" which will result in the
# agent then using a SSL tunnel through the HTTP proxy for end to
# end encryption.
# proxy_scheme = http
# proxy_host = hostname
# proxy_port = 8080
# proxy_user =
# proxy_pass =
# Capturing request parameters is off by default. To enable the
# capturing of request parameters, first ensure that the setting
# "attributes.enabled" is set to "true" (the default value), and
# then add "request.parameters.*" to the "attributes.include"
# setting. For details about attributes configuration, please
# consult the documentation.
# attributes.include = request.parameters.*
# The transaction tracer captures deep information about slow
# transactions and sends this to the UI on a periodic basis. The
# transaction tracer is enabled by default. Set this to "false"
# to turn it off.
transaction_tracer.enabled = true
# Threshold in seconds for when to collect a transaction trace.
# When the response time of a controller action exceeds this
# threshold, a transaction trace will be recorded and sent to
# the UI. Valid values are any positive float value, or (default)
# "apdex_f", which will use the threshold for a dissatisfying
# Apdex controller action - four times the Apdex T value.
transaction_tracer.transaction_threshold = apdex_f
# When the transaction tracer is on, SQL statements can
# optionally be recorded. The recorder has three modes, "off"
# which sends no SQL, "raw" which sends the SQL statement in its
# original form, and "obfuscated", which strips out numeric and
# string literals.
transaction_tracer.record_sql = obfuscated
# Threshold in seconds for when to collect stack trace for a SQL
# call. In other words, when SQL statements exceed this
# threshold, then capture and send to the UI the current stack
# trace. This is helpful for pinpointing where long SQL calls
# originate from in an application.
transaction_tracer.stack_trace_threshold = 0.5
# Determines whether the agent will capture query plans for slow
# SQL queries. Only supported in MySQL and PostgreSQL. Set this
# to "false" to turn it off.
transaction_tracer.explain_enabled = true
# Threshold for query execution time below which query plans
# will not not be captured. Relevant only when "explain_enabled"
# is true.
transaction_tracer.explain_threshold = 0.5
# Space separated list of function or method names in form
# 'module:function' or 'module:class.function' for which
# additional function timing instrumentation will be added.
transaction_tracer.function_trace =
# The error collector captures information about uncaught
# exceptions or logged exceptions and sends them to UI for
# viewing. The error collector is enabled by default. Set this
# to "false" to turn it off.
error_collector.enabled = true
# To stop specific errors from reporting to the UI, set this to
# a space separated list of the Python exception type names to
# ignore. The exception name should be of the form 'module:class'.
error_collector.ignore_errors = celery.exceptions:Retry celery.exceptions:RetryTaskError
# Browser monitoring is the Real User Monitoring feature of the UI.
# For those Python web frameworks that are supported, this
# setting enables the auto-insertion of the browser monitoring
# JavaScript fragments.
browser_monitoring.auto_instrument = false
# A thread profiling session can be scheduled via the UI when
# this option is enabled. The thread profiler will periodically
# capture a snapshot of the call stack for each active thread in
# the application to construct a statistically representative
# call tree.
thread_profiler.enabled = true
# This timeout is how long the agent should wait for registration
# before proceeding when performing lazy registration of the agent.
# This represents the maximum time it will wait. If registration takes
# less time than specified by the timeout it will continue immediately.
startup_timeout = 10.0
# ---------------------------------------------------------------------------
#
# The application environments. These are specific settings which
# override the common environment settings. The settings related to a
# specific environment will be used when the environment argument to the
# newrelic.agent.initialize() function has been defined to be either
# "development", "test", "staging" or "production".
#
[newrelic:development]
monitor_mode = false
[newrelic:test]
monitor_mode = false
[newrelic:staging]
monitor_mode = true
[newrelic:production]
monitor_mode = true
# ---------------------------------------------------------------------------

View file

@@ -1,45 +1,22 @@
boto3==1.15.18
botocore<1.19.0,>=1.18.0
kombu==4.6.10 # CeleryExecutor issues with 1.10.2 supposedly fixed in 1.10.5 airflow, but still observed issues on 1.10.7
importlib-metadata>=1.7
argcomplete==1.12.2
pandas-gbq==0.14.1
# removed hdfs
apache-airflow[amazon,celery,postgres,apache.hive,jdbc,async,password,crypto,github_enterprise,datadog,statsd,mysql,google_auth,cncf.kubernetes]==2.1.4
cryptography>=3.2
mozlogging
retrying
newrelic
redis
hiredis
requests
jsonschema
flask-admin
Flask-OAuthlib
Authlib~=0.15.3
Flask-AppBuilder>=3.3.0
pytz
pytest
werkzeug>=1.0.1,~=1.0
# The next requirements are for kubernetes-client/python
urllib3>=1.24.2 # MIT
ipaddress>=1.0.17;python_version=="2.7" # PSF
websocket-client>=0.32.0,!=0.40.0,!=0.41.*,!=0.42.* # LGPLv2+
# Pin to older version, newer version has issues
JPype1==0.7.1
shelljob==0.5.6
# Fix no inspection available issue
# https://github.com/apache/airflow/issues/8211
SQLAlchemy>=1.3.18
# Airflow 2 no longer installs http provider by default, until chardet becomes an optional dependency of requests
# Official Airflow constraints file
# Doc: https://airflow.apache.org/docs/apache-airflow/stable/installation/installing-from-pypi.html#constraints-files
# File: https://raw.githubusercontent.com/apache/airflow/constraints-2.3.3/constraints-3.8.txt
--constraint ./constraints.txt
# Airflow dependencies
apache-airflow[amazon,apache.hive,async,celery,cncf.kubernetes,crypto,github_enterprise,google_auth,jdbc,mysql,password,postgres,redis,statsd]==2.3.3
apache-airflow-providers-google
apache-airflow-providers-http
airflow-provider-fivetran
apache-airflow-providers-slack
# Upgrade google dataproc provider to fix beta client clusterConfig and mismatch issues
apache-airflow-providers-google==5.0.0
# 2.4.0 is broken for dataproc cluster create/delete
# 2.6.0 and 3.0.0 are newer but not compatible with apache-airflow-providers-google
# yet until maybe v7.0.0 bc 'google.cloud.dataproc_v1beta2' is deprecated
google-cloud-dataproc==2.5.0
xmltodict==0.12.0
google-cloud-pubsub==2.11.0
airflow-provider-fivetran==1.1.2
# Misc
mozlogging
pytest
# Required for backfill UI
flask-admin
shelljob==0.5.6
# Required for /app/dags/fivetran_acoustic.py, /app/dags/utils/acoustic/acoustic_client.py
xmltodict

File diff suppressed because it is too large