Switch to MySQL 5.7 in tests.
Fixes the issue where utf8mb4 encoding produces index keys
that are too long for MySQL to handle in the XCom table.
A separate, optional setting lets you choose a different
encoding for the columns that are part of the
index: dag_id, task_id and key.
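The key-length problem comes down to simple arithmetic, roughly sketched below. The column lengths (250, 250 and 512 characters) and the 3072-byte key limit are assumptions for illustration only, not values taken from this change; check the real schema and the row format of your MySQL server. The sketch also ignores any per-column overhead.

```python
# Worst-case size of a composite index over VARCHAR columns.
# All numbers below are illustrative assumptions, not values from this change.
MAX_BYTES_PER_CHAR = {"utf8": 3, "utf8mb4": 4}  # MySQL's "utf8" is 3-byte utf8mb3
ASSUMED_KEY_LIMIT_BYTES = 3072  # assumed InnoDB limit (DYNAMIC row format)
ASSUMED_COLUMN_LENGTHS = {"dag_id": 250, "task_id": 250, "key": 512}


def index_size_bytes(column_lengths, charset):
    """Return the worst-case byte size of an index over the given columns."""
    return sum(column_lengths.values()) * MAX_BYTES_PER_CHAR[charset]


for charset in ("utf8", "utf8mb4"):
    size = index_size_bytes(ASSUMED_COLUMN_LENGTHS, charset)
    verdict = "fits" if size <= ASSUMED_KEY_LIMIT_BYTES else "too long"
    print(f"{charset}: {size} bytes ({verdict})")
```

Under these assumptions the same columns fit within the limit with a 3-byte encoding but exceed it with utf8mb4, which is why a separate encoding for the indexed columns can help.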
Each stage of the CI tests needs to pull our `ci` image. By removing
Java from it we can save 1-2 minutes in each test stage. This is part
of that work.
This PR removes initdb from system tests setup as it seems to be an unnecessary operation.
Also, some automatic code changes have been added before building the providers package.
fixup! [AIRFLOW-6980] Improve system tests and building providers package
fixup! fixup! [AIRFLOW-6980] Improve system tests and building providers package
This script will run the SchedulerJob for the specified dags "to completion".
That is, it creates a fixed number of DAG runs for the specified DAGs (from
the configured dag path/example dags etc), disables the scheduler from
creating more, and then monitors them for completion. When the final task of
the final dag run is completed the scheduler will be terminated.
The aim of this script is to have a benchmark for real-world scheduler
performance -- i.e. total time taken to run N dag runs to completion.
It is recommended to repeat the test at least 3 times so that you can get
somewhat-accurate variance on the reported timing numbers.
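As a rough illustration of why repeated runs matter, the sketch below summarizes a handful of timings with the standard library. The function name and the example numbers are hypothetical; they are not output from this script.

```python
import statistics


def summarize_timings(timings_seconds):
    """Return the mean and sample standard deviation of repeated run timings."""
    if len(timings_seconds) < 2:
        # A single run gives no information about run-to-run spread.
        raise ValueError("need at least two runs to estimate variance")
    return {
        "mean": statistics.mean(timings_seconds),
        "stdev": statistics.stdev(timings_seconds),
    }


# Hypothetical total times (in seconds) from three benchmark runs.
example_runs = [41.0, 43.0, 42.0]
summary = summarize_timings(example_runs)
print(f"mean={summary['mean']:.1f}s stdev={summary['stdev']:.1f}s")
```

With only three samples the spread is still a coarse estimate, so more runs are better when the numbers are close.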
Originally Breeze was used to run unit and integration tests, and more
recently system tests. This change makes it a bit more friendly for testing
your own DAGs there as well. You can now install any older Airflow version in
Breeze via the --install-airflow-version switch. The "files/dags" folder is
mounted to "/files/dags", and DAGs are read from that folder.
This change introduces sub-commands in the breeze tool.
It is much needed as we have many commands now
and it was difficult to separate commands from flags.
Also --help output was very long and unreadable.
With this change it is much easier to discover
what breeze can do for you as well as to navigate it.
Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
Co-authored-by: Kamil Breguła <mik-laj@users.noreply.github.com>
* adding singularity operator and tests
Signed-off-by: Vanessa Sochat <vsochat@stanford.edu>
* removing encoding pragmas and fixing up dockerfile to pass linting
Signed-off-by: Vanessa Sochat <vsochat@stanford.edu>
* make workdir in /tmp because AIRFLOW_SOURCES not defined yet
Signed-off-by: Vanessa Sochat <vsochat@stanford.edu>
* curl needs to follow redirects with -L
Signed-off-by: Vanessa Sochat <vsochat@stanford.edu>
* moving files to where they are supposed to be, more changes to mock, no clue
Signed-off-by: vsoch <vsochat@stanford.edu>
* removing trailing whitespace, moving example_dag for singularity, adding licenses to empty init files
Signed-off-by: vsoch <vsochat@stanford.edu>
* ran isort on example dags file
Signed-off-by: vsoch <vsochat@stanford.edu>
* adding missing init in example_dags folder for singularity
Signed-off-by: vsoch <vsochat@stanford.edu>
* removing code from __init__.py files for singularity operator to fix documentation generation
Signed-off-by: vsoch <vsochat@stanford.edu>
* forgot to update link to singularity in operators and hooks ref
Signed-off-by: vsoch <vsochat@stanford.edu>
* command must have been provided on init of singularity operator instance
Signed-off-by: vsoch <vsochat@stanford.edu>
* I guess I'm required to have a task_id?
Signed-off-by: vsoch <vsochat@stanford.edu>
* try adding working_dir to singularity operator type definitions
Signed-off-by: vsoch <vsochat@stanford.edu>
* disable too many arguments for pylint of singularity operator init
Signed-off-by: vsoch <vsochat@stanford.edu>
* move pylint disable up to line 64 - doesnt catch at end of statement like other examples
Signed-off-by: vsoch <vsochat@stanford.edu>
* two spaces before inline comment
Signed-off-by: vsoch <vsochat@stanford.edu>
* I dont see task_id as a param for other providers, removing for singularity operator
Signed-off-by: vsoch <vsochat@stanford.edu>
* adding debug print
Signed-off-by: vsoch <vsochat@stanford.edu>
* allow for return of just image and/or lines
Signed-off-by: vsoch <vsochat@stanford.edu>
* dont understand how mock works, but the image should exist after its pulled....
Signed-off-by: vsoch <vsochat@stanford.edu>
* try removing shutil, the client should handle pull folder instead
Signed-off-by: vsoch <vsochat@stanford.edu>
* try changing pull-file to same uri that is expected to be pulled
Signed-off-by: vsoch <vsochat@stanford.edu>
* import of AirflowException moved to exceptions
Signed-off-by: vsoch <vsochat@stanford.edu>
* DAG module was moved to airflow.models
Signed-off-by: vsoch <vsochat@stanford.edu>
* ensure pull is called with pull_folder
Signed-off-by: vsoch <vsochat@stanford.edu>
We will run system tests on back-ported operators for the 1.10* series of Airflow,
and for that we need support for running system tests using pytest's
markers and for reading environment variables passed from the HOST machine (to pass
credentials).
This is the first step to automate system tests execution.
This PR does two things:
1. It enables the mypy cache (default folder name .mypy_cache) so that
repeated runs locally are quicker
2. It _disables_ passing only the changed files in.
Point 2 seems counter-intuitive, but in my testing running with all
files (airflow docs tests) was about twice as fast as without. My
hypothesis for why this happens is that when mypy is checking file x, it
has to check dependencies/imports for it too, and when we have
pass_filenames set, pre-commit runs multiple processes in parallel, and each of them
has to do this work!
Timings before and after:
- Before:
For all files
```
❯ time pre-commit run mypy -a
Run mypy.................................................................Passed
pre-commit run mypy -a 0.31s user 0.07s system 2% cpu 17.140 total
```
With only a single file
```
❯ time pre-commit run mypy --files airflow/configuration.py
Run mypy.................................................................Passed
pre-commit run mypy --files airflow/configuration.py 0.30s user 0.07s system 5% cpu 6.724 total
```
- After:
With a clean cache (`rm -rf .mypy_cache`):
```
$ time pre-commit run mypy
Run mypy.................................................................Passed
pre-commit run mypy -a 0.26s user 0.10s system 2% cpu 17.226 total
```
Clean cache with single file:
```
$ time pre-commit run mypy --file airflow/version.py
Run mypy.................................................................Passed
pre-commit run mypy --file airflow/version.py 0.23s user 0.07s system 4% cpu 7.091 total
```
Repeated run (cache folder exists):
```
$ time pre-commit run mypy --file airflow/version.py
Run mypy.................................................................Passed
pre-commit run mypy --file airflow/version.py 0.23s user 0.05s system 6% cpu 4.178 total
```
and for all files
```
airflow ❯ time pre-commit run mypy -a
Run mypy.................................................................Passed
pre-commit run mypy -a 0.25s user 0.09s system 6% cpu 4.833 total
```
* [AIRFLOW-6590] Use batch db operations in jobs
The PR changes numerous single selects / updates in base,
scheduler, and backfill jobs to bulk operations.
* fixup! [AIRFLOW-6590] Use batch db operations in jobs
* fixup! fixup! [AIRFLOW-6590] Use batch db operations in jobs
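To illustrate the difference between per-row and batch operations, here is a minimal sketch using the standard library `sqlite3` module rather than the ORM the jobs actually use. The table, column and function names are hypothetical and chosen only for the example.

```python
import sqlite3

# In-memory database with a hypothetical table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task_instance (id INTEGER PRIMARY KEY, state TEXT)")
conn.executemany(
    "INSERT INTO task_instance (id, state) VALUES (?, ?)",
    [(i, "queued") for i in range(1, 6)],
)


def set_state_one_by_one(connection, ids, state):
    """Issue one UPDATE statement per row (the pattern being replaced)."""
    for row_id in ids:
        connection.execute(
            "UPDATE task_instance SET state = ? WHERE id = ?", (state, row_id)
        )


def set_state_bulk(connection, ids, state):
    """Update all rows in a single statement; return the number of rows changed."""
    if not ids:
        return 0
    placeholders = ", ".join("?" for _ in ids)
    cursor = connection.execute(
        f"UPDATE task_instance SET state = ? WHERE id IN ({placeholders})",
        [state, *ids],
    )
    return cursor.rowcount


updated = set_state_bulk(conn, [1, 2, 3], "failed")
print(f"rows updated in one statement: {updated}")
```

Both functions leave the table in the same state; the bulk variant simply does it with one round trip to the database instead of one per row.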
Don't try to find changed files unless we are building a pull request.
This only caused a problem on tag builds, but we were also doing it
for master/branch builds, where it always reported that no files had
changed.
By checking this early we can make the other conditions in this function
simpler.
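The early check can be pictured with the sketch below. It is written in Python purely for illustration and is not the code from this change; the function name and the `list_changed_files` callback are hypothetical, and it assumes the build runs on Travis CI, which sets `TRAVIS_PULL_REQUEST` to the pull request number or to "false" for other builds.

```python
def find_changed_files(env, list_changed_files):
    """Return the changed files, or None when the build is not a pull request.

    `env` is a mapping of environment variables and `list_changed_files` is a
    hypothetical callback that performs the expensive diff against the base.
    """
    # Check early: tag and master/branch builds never need the diff.
    if env.get("TRAVIS_PULL_REQUEST", "false") == "false":
        return None
    # From here on we know this is a pull request build.
    return list_changed_files()


print(find_changed_files({"TRAVIS_PULL_REQUEST": "false"}, lambda: ["a.py"]))
print(find_changed_files({"TRAVIS_PULL_REQUEST": "123"}, lambda: ["a.py"]))
```

Because the non-PR case returns immediately, the remaining logic only has to handle pull request builds.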
* [AIRFLOW-6770] Run particular test using breeze CLI bug fix
* [AIRFLOW-6770] Fix typo in travis config
* [AIRFLOW-6770] Fix variable name and remove unnecessary travis command
* [AIRFLOW-6766] Fix "cannot import ensure_text" error for pre-commit
As of today Travis bundles six version 1.11.0 with its Python
3.6 image, and that version lacks the ensure_text method. Bumping to 1.14+
solves the problem.