Commit graph

11616 commits

Author SHA1 Message Date
Jarek Potiuk edbf49c645 Prepare ad-hoc release of the four previously excluded providers (#14655)
Documentation update for the four previously excluded providers that
got extra fixes/bumping to the latest version of the libraries.

* apache.beam
* apache.druid
* microsoft.azure
* snowflake

(cherry picked from commit b753c7fa60)
2021-04-15 14:00:31 +01:00
Jed Cunningham 41e0e19918 Fix support for long dag_id and task_id in KubernetesExecutor (#14703)
The key used to remove a task from executor.running is reconstituted
from pod annotations, so make sure the full dag_id and task_id are in
the annotations.
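
A minimal Python sketch of the idea, not the executor's actual code: the `TaskKey` tuple, its fields and the annotation names are assumptions made for illustration. It shows why the key can only be matched against `running` if the annotations carry the full, untruncated ids.

```python
from typing import Dict, NamedTuple, Set


class TaskKey(NamedTuple):
    """Hypothetical key type; the real executor key may hold other fields."""

    dag_id: str
    task_id: str
    try_number: int


def key_from_annotations(annotations: Dict[str, str]) -> TaskKey:
    # Rebuild the key from pod annotations (annotation names are assumed).
    return TaskKey(
        annotations["dag_id"],
        annotations["task_id"],
        int(annotations["try_number"]),
    )


long_dag_id = "d" * 300
running: Set[TaskKey] = {TaskKey(long_dag_id, "task", 1)}

# The annotations carry the full ids, so the rebuilt key matches exactly.
pod_annotations = {"dag_id": long_dag_id, "task_id": "task", "try_number": "1"}
running.discard(key_from_annotations(pod_annotations))
```

If the annotation held a shortened `dag_id`, the rebuilt key would not equal the stored one and the task would stay in `running`.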

(cherry picked from commit b5e7ada345)
2021-04-15 14:00:31 +01:00
Jarek Potiuk e25a8b09c5 Synchronize Provider templates with master 2021-04-15 14:00:31 +01:00
Jarek Potiuk d41db1d060 Fixes problem with two different files md5summed with the same name (#14998)
* Fixes problem with two different files md5summed with the same name

When we check whether we should rebuild image, we check if the
md5sum of some important files changed - which would trigger
question whether to rebuild the image or not (because of
changed dependencies which need to be installed). This
happens for example when package.json or yarn.lock changes.

Previously, all the important files had distinct names, so
we stored the md5 hashes of those files with just filenames +.md5sum
but they were flattened to a single directory. Unfortunately,
as of #14927 (merged with failing build) we had two package.json
and two yarn.locks and it caused overwriting of md5hash of one
by the other. This triggered unnecessary rebuilding of the image
in CI part which resulted in failure (because of Apache Beam
dependency problem).

This PR fixes it by adding parent directory to the name of
the md5sum file (so we have www-package.json and ui-package.json)
now. Those important files change very rarely so this incident
should not happen again but we added some comments preventing
it.
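
A small Python sketch of the naming scheme described above. The real logic lives in shell scripts; the helper names and the example paths here are illustrative assumptions.

```python
from pathlib import PurePosixPath


def flat_md5sum_file_name(path: str) -> str:
    # Old scheme: only the file name is used, so same-named files collide.
    return f"{PurePosixPath(path).name}.md5sum"


def md5sum_file_name(path: str) -> str:
    # New scheme: prefix the file name with its parent directory name.
    file_path = PurePosixPath(path)
    return f"{file_path.parent.name}-{file_path.name}.md5sum"


# Hypothetical locations of the two package.json files.
www = "airflow/www/package.json"
ui = "airflow/ui/package.json"

collides = flat_md5sum_file_name(www) == flat_md5sum_file_name(ui)
distinct = md5sum_file_name(www) != md5sum_file_name(ui)
```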

* Update scripts/ci/libraries/_initialization.sh

Co-authored-by: Felix Uellendall <feluelle@users.noreply.github.com>
(cherry picked from commit 775ee51d0e)
2021-04-15 14:00:31 +01:00
Jarek Potiuk 53962045c9 Fixes broken asset compilation in Docker images (#14995)
The change #14911 had a bug - when PYTHON_MINOR_MAJOR_VERSION
was removed from the image args, the replacement `python -m site`
expression missed the `/airflow/` suffix. Unfortunately it was not
flagged as an error because the recompiling script silently
skipped the recompilation step in such a case.

This change:

* fixes the error
* removes the silent-skipping if check (the recompilation will
  fail in case it is wrongly set)
* adds check at image verification whether dist/manifest.json is
  present.

Fixes: #14991
2021-04-15 14:00:31 +01:00
Kaxil Naik 584a45a7f9 Fix failing doc build (#14986)
- For some reason pymongo's stable inventory fetch is redirecting. I can reproduce it locally too if I try to access https://pymongo.readthedocs.io/en/stable/objects.inv from my browser.

(cherry picked from commit f6a1774555)
2021-04-15 14:00:31 +01:00
Jarek Potiuk 6984dbc8b8 Skips provider package builds and provider tests for non-master
This PR skips building Provider packages for branches other
than master. Provider packages are always released from master
and never from any other branch, so there is no point in running
the package building and tests there.

(cherry picked from commit df368f17df361af699dc868af9481ddc3abf0416)
2021-04-15 14:00:31 +01:00
Jarek Potiuk 9d9f68e562 Much easier to use and better documented Docker image (#14911)
Previously you had to specify AIRFLOW_VERSION_REFERENCE and
AIRFLOW_CONSTRAINTS_REFERENCE to point to the right version
of Airflow. Now those values are auto-detected if not specified
(but you can still override them)

This change made it possible to simplify and restructure the Dockerfile
documentation - following the recent change in separating out
the docker-stack, production image building documentation has
been improved to reflect those simplifications. It should be
much easier to grasp by the novice users now - very clear
distinction and separation is made between the two types of
building your own images - customizing or extending - and it
is now much easier to follow examples and find out how to
build your own image. The criteria on which approach to
choose were put first and forefront.

Examples have been reviewed, fixed and put in a logical
sequence. From the most basic ones to the most advanced,
with clear indication where the basic approach ends and where
the "power-user" one starts. The examples were also separated
out to separate files and included from there - also the
example Docker images and build commands are executable
and tested automatically in CI, so they are guaranteed
to work.

Finally, the build arguments were split into sections - from most
basic to most advanced and each section links to appropriate
example section, showing how to use those parameters.

Fixes: #14848
Fixes: #14255
2021-04-15 14:00:31 +01:00
Jarek Potiuk 89a2eb0412 Fixes default group of Airflow user. (#14944)
The production image did not have root group set as default for
the airflow user. This was not a big problem unless you extended
the image - in which case you had to change the group manually
when copying the images in order to keep the image OpenShift
compatible (i.e. runnable with any user and root group).

This PR fixes it by changing default group of airflow user
to root, which also works when you extend the image.

```
Connected.
airflow@53f70b1e3675:/opt/airflow$ ls
dags  logs
airflow@53f70b1e3675:/opt/airflow$ cd dags/
airflow@53f70b1e3675:/opt/airflow/dags$ ls -l
total 4
-rw-r--r-- 1 airflow root 1648 Mar 22 23:16 test_dag.py
airflow@53f70b1e3675:/opt/airflow/dags$
```
2021-04-15 14:00:31 +01:00
Kamil Breguła 945bbb19d2 Create a documentation package for Docker image (#14846)
(cherry picked from commit a18cbc4e91)
2021-04-15 14:00:31 +01:00
Jarek Potiuk 54737a4b7c Quarantine test_clit_tasks - they have a lot of errors 2021-04-15 14:00:31 +01:00
Jarek Potiuk 822d883c75 Optimizes image verification steps. (#14780)
So far we had a matrix of builds that verified images - each
image was verified by a separate matrix-based job and those
verifications were run after all images were already available.

This change optimizes it. The verification steps are run in the same job
as "waiting for image", and they also run in parallel, which will
make them a bit faster.

This verification is fast and it can be run on any machine
in parallel without any problems.

(cherry picked from commit c59ab1ddcd)
2021-04-15 14:00:31 +01:00
Jarek Potiuk 0b881c566f Running tests in parallel (#14915)
This is by far the biggest improvement in test execution time
we can get now that we are using self-hosted runners.

This change drives down the time of executing all tests on
self-hosted runners from ~ 50 minutes to ~ 13 minutes due to the heavy
parallelisation we can implement for different test types and the
fact that our machines for self-hosted runners are far more
capable - they have more CPU and more memory, and we are
using tmpfs for everything.

This change will also drive the cost of our self-hosted runners
down. Since we have auto-scaling infrastructure we will simply need
the machines to run tests for far shorter time. Since the number
of test jobs we run on those self hosted runners is substantial
(10 jobs), we are going to save ~ 6 build hours per one PR/merged
commit!

This also allows the developers to use the power of their
development machines - when you use
`./scripts/ci/testing/ci_run_airflow_testing.sh` the script
detects how many CPU cores are available and runs as
many test types in parallel as you have cores.

Also in case of Integration tests - they require more memory to run
all the integrations, so in case there is less than ~ 32 GB of RAM
available to Docker, the integration tests are run sequentially
at the end. This drives stability up for machines with lower memory.
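
A simplified Python sketch of that scheduling decision, for illustration only. The actual script is written in shell; the function name, the test type names and the batching strategy are assumptions, and only the "one test type per core" and "~ 32 GB" rules come from the description above.

```python
import os
from typing import List, Optional, Tuple


def plan_test_runs(
    test_types: List[str],
    cpu_count: Optional[int] = None,
    docker_memory_gb: float = 64.0,
    integration_memory_gb: float = 32.0,
) -> Tuple[List[List[str]], List[str]]:
    """Return (parallel batches, test types to run sequentially at the end)."""
    cores = cpu_count or os.cpu_count() or 1
    parallel = list(test_types)
    sequential: List[str] = []
    # With too little memory, integration tests are moved to the end.
    if docker_memory_gb < integration_memory_gb and "Integration" in parallel:
        parallel.remove("Integration")
        sequential.append("Integration")
    # Run at most one test type per available core at a time.
    batches = [parallel[i:i + cores] for i in range(0, len(parallel), cores)]
    return batches, sequential


low_memory_plan = plan_test_runs(
    ["Core", "WWW", "Integration"], cpu_count=2, docker_memory_gb=16
)
```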

On one personal PC (64GB RAM, 8 CPUS/16 cores, fast SSD) the full
test suite execution went down from 30 minutes to 5 minutes.

Continuous progress information is printed every 10 seconds when
either parallel or sequential tests are run, and the full output is
shown at the end - failed tests are marked in red groups, and successful ones are
marked in green groups. This makes it easier to see and analyse errors.

(cherry picked from commit 01a5d36e6b)
2021-04-15 14:00:31 +01:00
Jarek Potiuk e43f799a5b Adds resource check when running Breeze (#14908)
When Breeze is run, it requires some resources in the docker
engine, otherwise it will produce strange errors.

This PR adds resource check when running breeze - it will print
human-friendly size of CPU/Memory/Disk available for docker
engine and red error (still allowing Docker to run) when the
resources are not enough.
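
A rough Python sketch of such a check. Breeze itself is a shell tool; the function names and the minimum thresholds below are hypothetical and chosen only for illustration.

```python
from typing import List


def human_size(num_bytes: float) -> str:
    """Format a byte count with binary units, e.g. 4294967296 -> '4.0 GiB'."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB"):
        if size < 1024:
            return f"{size:.1f} {unit}"
        size /= 1024
    return f"{size:.1f} TiB"


def check_resources(
    cpus: int,
    memory_bytes: int,
    disk_bytes: int,
    min_cpus: int = 2,                   # assumed threshold
    min_memory_bytes: int = 4 * 1024**3,  # assumed threshold
    min_disk_bytes: int = 20 * 1024**3,   # assumed threshold
) -> List[str]:
    """Return warnings; an empty list means the resources look sufficient."""
    warnings = []
    if cpus < min_cpus:
        warnings.append(f"CPUs: {cpus} (need at least {min_cpus})")
    if memory_bytes < min_memory_bytes:
        warnings.append(f"Memory: {human_size(memory_bytes)} (need at least {human_size(min_memory_bytes)})")
    if disk_bytes < min_disk_bytes:
        warnings.append(f"Disk: {human_size(disk_bytes)} (need at least {human_size(min_disk_bytes)})")
    return warnings


problems = check_resources(cpus=1, memory_bytes=2 * 1024**3, disk_bytes=40 * 1024**3)
```

The warnings are returned rather than raised, mirroring the behaviour described above where Docker is still allowed to run.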

Fixes: #14899
(cherry picked from commit 3dd42a5a3f)
2021-04-15 14:00:31 +01:00
Jarek Potiuk 12ad529a3a Remove Backport Providers (#14886)
We are removing support for Backport Providers now.

The last release of the Backport Providers was sent yesterday,
on 17 March 2021, as planned.

As agreed before, and documented here:
https://github.com/apache/airflow/blob/master/dev/PROJECT_GUIDELINES.md#support-for-backport-providers

> Backport providers within 1.10.x, will be supported for critical fixes
for three months (March 17, 2021) from Airflow 2.0.0 release date (Dec
17, 2020).

For future reference, if anyone would like to build backport
providers with cherry-picked fixes, the branch to start from is
`legacy-backport-cutoff-point`. The documentation and tools to build the
backports are there, but there will be no more community releases for
backports.

Good Bye Backport Providers.

(cherry picked from commit 68e4c4dcb0)
2021-04-15 14:00:31 +01:00
Jarek Potiuk 55bc60231a Running tests in parallel (#14531)
This is by far the biggest improvement in test execution time
we can get now that we are using self-hosted runners.

This change drives down the time of executing all tests on
self-hosted runners from ~ 50 minutes to ~ 13 minutes due to the heavy
parallelisation we can implement for different test types and the
fact that our machines for self-hosted runners are far more
capable - they have more CPU and more memory, and we are
using tmpfs for everything.

This change will also drive the cost of our self-hosted runners
down. Since we have auto-scaling infrastructure we will simply need
the machines to run tests for far shorter time. Since the number
of test jobs we run on those self hosted runners is substantial
(10 jobs), we are going to save ~ 6 build hours per one PR/merged
commit!

This also allows the developers to use the power of their
development machines - when you use
`./scripts/ci/testing/ci_run_airflow_testing.sh` the script
detects how many CPU cores are available and runs as
many test types in parallel as you have cores.

Also in case of Integration tests - they require more memory to run
all the integrations, so in case there is less than ~ 32 GB of RAM
available to Docker, the integration tests are run sequentially
at the end. This drives stability up for machines with lower memory.

On one personal PC (64GB RAM, 8 CPUS/16 cores, fast SSD) the full
test suite execution went down from 30 minutes to 5 minutes.

Continuous progress information is printed every 10 seconds when
either parallel or sequential tests are run, and the full output is
shown at the end - failed tests are marked in red groups, and successful ones are
marked in green groups. This makes it easier to see and analyse errors.

(cherry picked from commit 5539069ea5)
2021-04-15 14:00:31 +01:00
Jarek Potiuk 3bd021b4b3 Adds missing variable for force pull base image variable (#14901)
The variable's default value was not set, thus failing pre-commit
scripts in case image was not built recently.

(cherry picked from commit f2c3403580)
2021-04-15 14:00:31 +01:00
Kaxil Naik c3536f499a Fixes unbound variable on MacOS (#14877)
Without it I get:

```
$ ./breeze
./breeze: line 28: @: unbound variable
```

(cherry picked from commit 16f43605f3)
2021-04-15 14:00:31 +01:00
Kaxil Naik c2d602db5a Add Airflow 2.0.1 to ``breeze-complete`` and BREEZE.rst (#14876)
2.0.1 was missing from the breeze-complete list and the docs

(cherry picked from commit ebc22fec70)
2021-04-15 14:00:30 +01:00
Jarek Potiuk a81cbc28bf Fixes some of the flaky tests in test_scheduler_job (#14792)
The Parallel tests from #14531 created a good opportunity to
reproduce some of the race conditions that cause some of the
scheduler job tests to be flaky.

This change is an attempt to fix three of the flaky tests
there by removing side effects between tests. The previous
implementation did not take into account that scheduler job
processes might still be running when the test finishes and
the tests could have unintended side effects - especially
when they were run on a busy machine.

This PR adds mechanism that stops all running
schedulerJob processes in tearDown before cleaning
the database.

Fixes: #14778
Fixes: #14773
Fixes: #14772
Fixes: #14771
Fixes: #11571
Fixes: #12861
Fixes: #11676
Fixes: #11454
Fixes: #11442
Fixes: #11441
(cherry picked from commit 45cf89ce51)
2021-04-15 14:00:30 +01:00
Jarek Potiuk a4d056f043 Replaces 1.10.14 with 1.10.15 where needed (#14866)
(cherry picked from commit 3f61df11e7)
2021-04-15 14:00:30 +01:00
Jarek Potiuk 6a003ddb41 When `breeze stop` is called all integrations are enabled (#14825)
Sometimes when an integration got broken, it could be broken
permanently due to a left-over volume. This was because when
breeze stop was called the integrations were not enabled and
some volumes were not deleted.

As a result, breeze sometimes could not start with integrations, due to
a mysterious 'unhealthy' condition of one of the integrations.

This change enables all integrations automatically when
breeze stop is called so that all volumes are removed.

(cherry picked from commit 5f774fae53)
2021-04-15 14:00:30 +01:00
Ash Berlin-Taylor 72bec72ff4 Prepare to switch master branch for main. (#14688)
There are many more references to "master" (even in our own repo) than
this, but this commit is the first step in that process.

It makes CI run on the main branch (once it exists), and re-words a few
cases where we can easily avoid referring to master.

This doesn't yet re-name the `constraints-master` or `master-*` images -
that will be done in a future PR.

(We won't be able to entirely eliminate "master" from our repo as we
refer to a lot of other GitHub repos that we can't change.)

(cherry picked from commit 0dea083fcb)
2021-04-15 14:00:30 +01:00
Jarek Potiuk 0e8fb85aca Fixes recent scripting breeze fix to work also with zsh (#14787)
The BASH variable introduced in #14579 is not set when the main shell is
zsh on MacOS (which is the default). This PR changes it so that the
output of `command -v bash` is used instead.

Fixes #14754

(cherry picked from commit c613384feb)
2021-04-15 14:00:30 +01:00
Jarek Potiuk 899a75ddfa Fixes case where output log is missing for image waiting (#14784)
(cherry picked from commit 4408866b4b)
2021-04-15 14:00:30 +01:00
Jarek Potiuk 401c57ff08 Only rebuilds base python image when upgrading to newer deps (#14783)
The base python image is only updated when manually triggered and
in case of checking for upgraded dependencies in master build.

While automated upgrade to latest Python image is good for
security, it can cause a number of problems when run automatically
in the CI:

* cache invalidation - thus longer builds
* sudden test failures

This has already happened quite a number of times, so it
is time to switch to a slightly different mode. Python images will only
be automatically upgraded in these cases:

1) When Master CI build is run in scheduled nightly build - to check
   that tests still pass for latest version of the image

2) When manually refreshed with --force-pull-base-python-image

3) When DockerHub official images (from tags) are built.

The procedure to refresh the images manually in our CI has been
added to the documentation.

(cherry picked from commit 4762396b8b)
2021-04-15 14:00:30 +01:00
Jarek Potiuk ffe84968dc Better diagnostics for image waiting (#14779)
Sometimes (very rarely) some 'wait for image' pulling steps
loop forever (while other steps from parallel jobs pulling the
same image have no problems).

Example here:

Failed step here:

* https://github.com/apache/airflow/runs/2106723280?check_suite_focus=true#step:5:349

Another similar step in parallel job had no problems with retrieving the
same image earlier:

* https://github.com/apache/airflow/runs/2106723269?check_suite_focus=true#step:5:119

Both jobs pulled the same image:

docker.pkg.github.com/apache/airflow/master-python3.6-ci-v2:651461418

This change adds diagnostics information that might provide more
information in case this happens again so that we can understand
what is going on and mitigate the issue.

(cherry picked from commit 4cde47b339)
2021-04-15 14:00:30 +01:00
Jarek Potiuk cd4d8b1009 Fixes force-pulling base python images (#14736)
Sometimes the base python image patchlevel might cause test failures.
This happened for example with the test_views.py tests fixed in #14719,
where a CVE fix in all python versions caused our tests to fail.

There was an error in our scripts - when --force-pull-images
was used, the base python version was not updated to the
latest version even if there was a newer one, and it caused our
images to bounce a few times between the two latest patchlevels
when they were manually refreshed.

This change fixes it so that the base python image is always pulled
when

a) FORCE_PULL_IMAGES is true or
b) UPGRADE_TO_NEWER_DEPENDENCIES is != false

This will cause python upgrade in two cases:
 - when images are rebuilt with --force-pull-images locally
 - when images are upgraded with newer dependencies on master
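
The condition above can be sketched in Python as follows. The real check is in the shell scripts; the function name and the assumption that both variables default to "false" are illustrative.

```python
from typing import Mapping


def should_pull_base_python(env: Mapping[str, str]) -> bool:
    # a) FORCE_PULL_IMAGES is true
    force_pull = env.get("FORCE_PULL_IMAGES", "false") == "true"
    # b) UPGRADE_TO_NEWER_DEPENDENCIES is anything other than "false"
    upgrade = env.get("UPGRADE_TO_NEWER_DEPENDENCIES", "false") != "false"
    return force_pull or upgrade


pull_on_force = should_pull_base_python({"FORCE_PULL_IMAGES": "true"})
pull_by_default = should_pull_base_python({})
```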

(cherry picked from commit 61b448221d)
2021-04-15 14:00:30 +01:00
Jarek Potiuk f991734586 Remove Heisentest category and quarantine test_backfill_depends_on_past (#14756)
The whole Backfill class was in Heisentest but only one of those tests
is problematic now: test_backfill_depends_on_past. Therefore it makes
sense to remove the class from heisentests and move the
depends_on_past to quarantine.

It turned out that this is the last "Heisentest" and with the
isolation we have now coming in parallel tests, it turns out that
Heisentests are not really a good way of thinking about the tests - running
them in isolation does not often help, it only makes it more difficult
to flag the tests as flaky.

The quarantined test_backfill_depends_on_past has been captured in
the #14755 issue - and hopefully we will make an effort to
de-quarantine some of those tests soon.

(cherry picked from commit 4ce952e7c2)
2021-04-15 14:00:30 +01:00
Jarek Potiuk 11d915e910 Fixed runs-on for non-apache repository (#14737)
The change #14718 by mistake left the 'self-hosted' runs-on in case of
push or schedule. This caused failures on non-apache repositories.

(cherry picked from commit 945a5b927a)
2021-04-15 14:00:30 +01:00
Ash Berlin-Taylor 223a58f5e3 Remove un-needed/left over environment variables in ci.yml (#14732)
AIRFLOW_COMMITERS was left over from a previous PR and should have never
been included.

EVENT_NAME and TARGET_COMMIT_SHA aren't needed as GitHub already provides
us with GITHUB_EVENT_NAME and GITHUB_SHA with the same values.

(cherry picked from commit c9a7314fce)
2021-04-15 14:00:30 +01:00
Ash Berlin-Taylor 2894b4e712 Reduce duplication in pre_commit_check_order_setup.py script (#14731)
Now that we are using Py3.6+ we can rely on dictionary key order to be
fixed (it was always fixed in 3.6, just not explicitly documented as
such until 3.7) -- as a result we can load the source, rather than try to
parse it with regexes.
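
A minimal Python sketch of that approach, not the script's actual code: the helper names and the `EXTRAS` example are hypothetical. It executes a trusted source snippet and relies on dict insertion order to check that the keys are sorted.

```python
from typing import List, Optional, Tuple


def load_dict_keys(source: str, name: str) -> List[str]:
    # Execute trusted source and read the keys in their definition order.
    namespace: dict = {}
    exec(source, namespace)
    return list(namespace[name])


def first_out_of_order(keys: List[str]) -> Optional[Tuple[str, str]]:
    # Return the first adjacent pair that breaks alphabetical order, if any.
    for previous, current in zip(keys, keys[1:]):
        if current < previous:
            return previous, current
    return None


# Hypothetical setup-like source with one entry out of order.
source = 'EXTRAS = {"amazon": [], "celery": [], "azure": []}'
keys = load_dict_keys(source, "EXTRAS")
problem = first_out_of_order(keys)
```

`exec` is only reasonable here because the source being loaded is the project's own file.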

(cherry picked from commit 775f807005)
2021-04-15 14:00:30 +01:00
Ash Berlin-Taylor 6ca12b6c72 Don't use author_association for self-hosted vs public runner decision. (#14718)
Using this has two draw-backs for us.

1. MEMBER applies to _anyone in the org_, not just members/committers to
   this repo
2. The value of this setting depends upon the user's "visibility" in the
   org. I.e. if they hide their membership of the org, the
   author_association will show up as "CONTRIBUTOR" instead.

Both of these combined mean we should instead use an alternative list.

We can't use a secret as the `secrets.` context is not available in the runs-on
stanza, so we have to have a hard-coded list in the workflow file :( This is just as
secure, as the runner still checks the author against its own list.

(cherry picked from commit 4213487746)
2021-04-15 14:00:30 +01:00
John Bampton 22509d2c2a Fix grammar and remove duplicate words (#14647)
* chore: fix grammar and remove duplicate words

(cherry picked from commit 6dc24c95e3)
2021-04-15 14:00:30 +01:00
Ash Berlin-Taylor 98edfb90ee Fix breeze redirection on linux/Ubuntu 20.04 (#14626)
The version of `script` on Ubuntu 20.04 (and likely others) doesn't
have the `--log-out` cli arg, but that is the default mode for the
"un-named" arg anyway.

(cherry picked from commit 8116bed095)
2021-04-15 14:00:30 +01:00
Kaxil Naik e1302b32ac Remove redundant step in CodeQL GitHub Actions step (#14600)
From https://github.com/apache/airflow/runs/2031451066#step:4:9

Warning: 1 issue was detected with this workflow: git checkout HEAD^2 is no longer necessary.
Please remove this step as Code Scanning recommends analyzing the merge commit for best results.

I also double-checked it with https://github.com/github/codeql-action

(cherry picked from commit 4efd5845eb)
2021-04-15 14:00:30 +01:00
Ash Berlin-Taylor f3b34cd0cd Restore correct terminal width to interactive breeze usage (#14579)
The change to redirect all breeze output to a file had the effect of
making stdin no longer a TTY, which meant that the docker container had
a fixed width of 80 columns.

The fix is to replace the use of `tee` with `script`, which does what
we want -- captures stdout+stderr, but makes the running program believe
that it is still attached to a terminal (if it ever was).

(cherry picked from commit 4c1e3c8a16)
2021-04-15 14:00:30 +01:00
Kamil Breguła e96575cfe6 Use airflow db check command in entrypoint_prod.sh (#14530)
* Use airflow db check in entrypoint_prod.sh

* fixup! Use airflow db check in entrypoint_prod.sh

Co-authored-by: Kamil Breguła <kamilbregula@apache.org>
(cherry picked from commit e273366ff8)
2021-04-15 14:00:30 +01:00
John Bampton d18813e8f1 chore: fix case of GitHub (#14525)
(cherry picked from commit 1c694318ac)
2021-04-15 14:00:30 +01:00
John Bampton 555c1fc445 Add pre-commit check to sort and remove duplicates from the spelling wordlist (#13170)
(cherry picked from commit 53b9062f9e)
2021-04-15 14:00:30 +01:00
Xiaodong DENG f06d3d6bf4 Multiple minor doc fixes (#14917)
(cherry picked from commit ed872a64f1)
2021-04-15 14:00:30 +01:00
Kaxil Naik ace855413d Sort lists, sets and tuples in Serialized DAGs (#14909)
Currently we check if the dag changed or not via dag_hash.
The problem is that since the insertion order is not guaranteed, it produces
a different hash and hence results in an unnecessary DB write.

This commit fixes it.
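
A simplified Python sketch of the idea, not the actual serialization code: the `stable_hash` helper and the sample data are assumptions. Collections are sorted before hashing, so the same content always yields the same hash regardless of ordering.

```python
import hashlib
import json
from typing import Any


def normalise(value: Any) -> Any:
    # Sort lists, sets and tuples so their order no longer affects the hash.
    if isinstance(value, dict):
        return {key: normalise(item) for key, item in value.items()}
    if isinstance(value, (list, set, frozenset, tuple)):
        return sorted((normalise(item) for item in value), key=repr)
    return value


def stable_hash(data: Any) -> str:
    payload = json.dumps(normalise(data), sort_keys=True)
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


first = stable_hash({"tags": ["b", "a"], "owners": {"x", "y"}})
second = stable_hash({"owners": {"y", "x"}, "tags": ["a", "b"]})
```

Note that sorting lists discards any meaningful ordering, so this only suits fields where order carries no information.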

(cherry picked from commit 4531168e90)
2021-04-15 14:00:30 +01:00
Kaxil Naik be9881b132 Simplify cleaning string passed to origin param (#14738) (#14905)
Looks like the "trying to be smart" approach in https://github.com/apache/airflow/pull/14738
does not work on old Python versions. The "smart" part being that if a semicolon exists in the URL,
only those specific query arguments were removed. While this solves the issue for Py 3.6.13,
it didn't fix it for 3.6.12 (although it minimized it).

Python 3.6.12:

```python
>>> parse_qsl("r=3;a=b")
[('r', '3'), ('a', 'b')]
```

Python 3.6.13:

```python
>>> parse_qsl("r=3;a=b")
[('r', '3;a=b')]
```

This commit simplifies it: if the url contains `;`, it just redirects to
`/home`.
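
The simplified rule can be sketched as below. This is only an illustration: the function name is hypothetical and the real webserver code performs additional checks on the origin URL.

```python
def sanitize_origin(url: str, default: str = "/home") -> str:
    # Any semicolon in the URL makes it fall back to the default page.
    if not url or ";" in url:
        return default
    return url


redirect_target = sanitize_origin("/tree?dag_id=example;a=b")
```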

(cherry picked from commit 178dee9a5e)
2021-04-15 14:00:30 +01:00
Ash Berlin-Taylor f5fb9c4e96 Use libyaml C library when available. (#14577)
This makes loading local providers 1/3 quicker -- down to 2s from 3s
on my local SSD.

The `airflow.utils.yaml` module can be used in place of the normal yaml
module, with the bonus that `safe_load` will use libyaml where available
instead of always using the pure python version.

This shaves 3 minutes off the "WWW" tests - down to 8 minutes from
11 minutes.

I have not used this module in tests/docs code etc, as I don't want to
force importing `airflow` (and everything it currently brings in) in to
those contexts.

(cherry picked from commit 7daebefd15)
2021-04-15 14:00:30 +01:00
Ash Berlin-Taylor d5aabd0e35 Don't create unittest.cfg when not running in unit test mode (#14420)
Right now on airflow startup it would _always_ create a unittest.cfg,
even if unit test mode was not enabled.

This PR changes it so that this file is only created when unit tests
mode is enabled (via the environment variable or in airflow.cfg).

I have also refactored the mess of top-level code that was interspersed
between functions to all be at one place -- at the end of the file.

The bulk of the config loading code now lives in the `initialize_config`
function.

Adding lazy module attributes to airflow.configuration via the Pep562
class caused some weird behaviour where AirflowConfigParser became
unpickleable because of some "leaking" closed-over scope. (Pickling this
class is used almost exclusively by PythonVirtualEnvOperator). The fix
here is to add custom get/set state methods to not store more than we
need.

(cherry picked from commit b16b9ee689)
2021-04-15 14:00:30 +01:00
Jun 337edeee55 Fix error when running tasks with Sentry integration enabled. (#13929)
Co-authored-by: Ash Berlin-Taylor <ash@apache.org>
(cherry picked from commit 0e8698d3ed)
2021-04-15 14:00:30 +01:00
Kaxil Naik ab8c55878e Webserver: Sanitize string passed to origin param (#14738)
Follow-up of #12459 & #10334

Since https://github.com/python/cpython/pull/24297/files (bpo-42967)
also removed ';' as query argument separator, we remove query arguments
with semicolons.

(cherry picked from commit 409c249121)
2021-04-15 14:00:30 +01:00
Kaxil Naik b3350610b8 Fix tests in tests/www/test_views.py (#14719)
One of the tests fixed in (#14710) had an unused variable - `expected_url`,
a copy/paste failure. This commit fixes it and also adds a condition to
only replace the url if it contains a semicolon.

(cherry picked from commit 52604a3444)
2021-04-15 14:00:30 +01:00
Kaxil Naik 194597f2e6 Fix tests for all urllib versions with only '&' as separator (#14710)
Turns out #14698 did not fix the issue as Master failed again. After
digging a bit more I found that the CVE was fixed in all
Python versions: 3.6.13, 3.7.10 & 3.8.8

The solution in this PR/commit checks the `parse_qsl` behavior with
following tests:

```
❯ docker run -it python:3.8-slim bash
root@41120dfd035e:/# python
Python 3.8.8 (default, Feb 19 2021, 18:07:06)
>>> from urllib.parse import parse_qsl
>>> parse_qsl(";a=b")
[(';a', 'b')]
>>>
```

```
❯ docker run -it python:3.8.7-slim bash
root@68e527725610:/# python
Python 3.8.7 (default, Feb  9 2021, 08:21:15)
>>> from urllib.parse import parse_qsl
>>> parse_qsl(";a=b")
[('a', 'b')]
>>>
```
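
The same probe can be expressed as a small Python helper. The function name is an assumption; the two possible results are the ones shown in the sessions above.

```python
from urllib.parse import parse_qsl


def semicolon_is_separator() -> bool:
    # Older Pythons split on ';' and return [('a', 'b')];
    # patched versions keep it in the key and return [(';a', 'b')].
    return parse_qsl(";a=b") == [("a", "b")]


legacy_behaviour = semicolon_is_separator()
```

A test can then pick its expected URL based on `legacy_behaviour` instead of hard-coding a Python version.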

(cherry picked from commit 7bd9d477dd)
2021-04-15 14:00:30 +01:00
Kaxil Naik 4033041ab9 Separate out tests to cater for changes in Python 3.8.8 (#14698)
https://github.com/python/cpython/pull/24297 change was included in
Python 3.8.8 to fix a vulnerability (bpo-42967)

Depending on which Base Python Image is run in our CI, two of the tests
can fail or succeed.

Our previous two attempts:

- 061cd236de#
- 49952e79b0

For a while we might get a different base python version depending on the changes in a PR (whether or not it includes a change to the Dockerfile).
a) when you have a PR which does not have changes in the Dockerfile, it will use the older python version as base (for example Python 3.8.7)
b) when you have a PR that touches the Dockerfile and have setup.py changes in master, it should pull Python 3.8.8 first.

(cherry picked from commit ffe3bd2957)
2021-04-15 14:00:30 +01:00