We have fairly complex python version detection in our CI scripts.
They have to handle several cases:
1) Running builds on DockerHub (we cannot pass different environment
variables there, so we detect python version based on the name of
the image being built: airflow:master-python3.7 -> PYTHON_VERSION=3.7).
2) Running builds on Travis CI. We use python version determined
from default python3 version available on the path. This way we
do not have to specify PYTHON_VERSION separately in each job,
we just specify which host python version is used for that job.
This makes a nice UI experience where you see python version in
Travis UI.
3) Running builds locally via scripts where we can pass PYTHON_VERSION
as environment variable.
4) Running builds locally for the first time with Breeze. By default
we determine the version based on default python3 version we have
in the host system (3.5, 3.6 or 3.7) and we use this one.
5) Selecting python version with Breeze's --python switch. This will
override python version but it will also store the last used version
of python in .build directory so that it is automatically used next
time.
This change adds the necessary explanations to the code that works for
all the cases and fixes some of the edge cases we had. It also
extracts the code to a common directory.
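A minimal sketch of how the five cases above could be resolved in one place.
This is not the code from the CI scripts: the function names, the parameters
and above all the precedence order are assumptions made for illustration, and
the interpreter running the sketch stands in for "default python3 on the path".

```python
import re
import sys
from typing import Mapping, Optional


def version_from_image_name(image_name: str) -> Optional[str]:
    """Return '3.7' for a name such as 'airflow:master-python3.7', else None."""
    match = re.search(r"python(\d+\.\d+)$", image_name)
    return match.group(1) if match else None


def resolve_python_version(
    env: Mapping[str, str],
    image_name: Optional[str] = None,
    cli_version: Optional[str] = None,
    stored_version: Optional[str] = None,
    host_version: Optional[str] = None,
) -> str:
    """Pick the Python version; the precedence used here is an assumption."""
    # Case 5: an explicit --python switch wins over everything else.
    if cli_version:
        return cli_version
    # Case 3: PYTHON_VERSION passed as an environment variable.
    if env.get("PYTHON_VERSION"):
        return env["PYTHON_VERSION"]
    # Case 1: DockerHub build, version encoded in the image name.
    if image_name:
        from_image = version_from_image_name(image_name)
        if from_image:
            return from_image
    # Case 5 on later runs: version remembered from the previous run.
    if stored_version:
        return stored_version
    # Cases 2 and 4: fall back to the host interpreter's major.minor version.
    return host_version or "{}.{}".format(*sys.version_info[:2])
```

With these assumptions, resolve_python_version({}, image_name="airflow:master-python3.7")
returns "3.7", while a call with no hints at all falls back to the host version.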
When you use Breeze and you specify the "--force-pull" flag, the latest image from
DockerHub is pulled before you attempt to rebuild the image. In Breeze we use
an optimised version of the "fix-permission" script that only fixes
permissions of several files in the context (the workaround for the different
group write umask setting in DockerHub). However, package.json and
package-lock.json were not on the list, so those files were always seen as
"changed" and npm packages were reinstalled the first time (and only the first
time) the image was force pulled.
After fixing this, the first run of --force-pull breeze commands will be much
faster, skipping the whole npm package reinstallation. Depending on your network
speed, this might save anywhere from 20 seconds to several minutes, as the
npm ci command wipes out everything and downloads a lot of packages.
StreamLogWriter logs other logs that could contain escape codes, which makes the
log hard to read in the web UI. To fix it I added a method to remove those codes.
This commit also improves handling of a dictionary as an argument for a formatted string.
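A minimal sketch of stripping escape codes from a log line before it is
written. The function name and the exact pattern are assumptions, not
necessarily what this commit uses; the regex covers 7-bit ANSI sequences
(two-character escapes and CSI sequences such as colour codes).

```python
import re

# ESC followed by either a single control character or a CSI sequence
# ("[" + parameter bytes + intermediate bytes + final byte).
ANSI_ESCAPE = re.compile(r"\x1B(?:[@-Z\\-_]|\[[0-?]*[ -/]*[@-~])")


def remove_escape_codes(text: str) -> str:
    """Return text with ANSI escape sequences removed (hypothetical helper)."""
    return ANSI_ESCAPE.sub("", text)
```

For example, remove_escape_codes("\x1b[31mfailed\x1b[0m") returns "failed",
and text without escape codes is returned unchanged.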
* [AIRFLOW-5049] Add validation for src_fmt_configs in bigquery hook
Adds validation for the src_fmt_configs arguments in the bigquery hook. Otherwise wrong src_fmt_configs would be silently ignored, which is undesirable.
* [AIRFLOW-5049] Update - Add validation for src_fmt_configs in bigquery hook
Adds a common method for validating the src_fmt_configs.
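A minimal sketch of what such a common validation method might look like.
The function name, its signature and the choice of ValueError are assumptions
for illustration, not the hook's actual API; the list of valid keys is passed
in by the caller.

```python
from typing import Any, Dict, Iterable


def validate_src_fmt_configs(
    source_format: str,
    src_fmt_configs: Dict[str, Any],
    valid_configs: Iterable[str],
) -> Dict[str, Any]:
    """Reject unknown src_fmt_configs keys instead of silently ignoring them."""
    allowed = set(valid_configs)
    unknown = sorted(key for key in src_fmt_configs if key not in allowed)
    if unknown:
        raise ValueError(
            "{} is not a valid src_fmt_configs key for format {}".format(
                ", ".join(unknown), source_format
            )
        )
    # Return a copy so the caller's dictionary is never mutated later on.
    return dict(src_fmt_configs)
```

Calling it with {"skipLeadingRow": 1} (a typo of skipLeadingRows) and a valid
list containing only "skipLeadingRows" raises ValueError rather than dropping
the option.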