To pick up the newer kernel/security updates. Only takes effect when
people destroy/recreate their VM, so also adds a `dist-upgrade` to
upgrade existing boxes. (The older Bento box had a broken kernel
config so `dist-upgrade` can't upgrade the kernel, but it's better
than nothing.)
Also switches the Hyper-V provider to the Bento images for parity,
since Bento now create Hyper-V variants too.
The `box` name cannot be factored out of the provider blocks due to:
https://github.com/hashicorp/vagrant/issues/9452
Since the memcached server and client headers are now unused.
Whilst the `zlib1g-dev` package is no longer needed by pylibmc, it's
still required as a sub-dependency of `libmysqlclient-dev`, so has
deliberately not been added to the `apt-get purge` cleanup step.
Since:
* Redis has additional features we need (eg for bug 1409679)
* the Redis server, python client and Django backend are more
actively maintained (so have persistent connections/pooling that
actually works, which gives a big perf win)
* we can use Heroku's own Redis addon rather than relying on a
third-party's (to hopefully prevent a repeat of the certificate
expiration downtime)
This commit:
* Switches pylibmc/django-pylibmc to redis-py/django-redis, and
configures the new backend according to:
http://niwinz.github.io/django-redis/latest/
https://github.com/andymccurdy/redis-py
* Uses redis-py's native support for TLS to connect to the Heroku
Redis server's stunnel port directly, avoiding the complexity of
using a buildpack to create an stunnel between the dyno and server:
https://devcenter.heroku.com/articles/securing-heroku-redis#connecting-directly-to-stunnel
* Uses explicit `REDIS_URL` values on Travis/Vagrant rather than
relying on the django-environ `default` value (for parity with
how we configure `DATABASE_URL` and others).
* Removes the pylibmc connection-closing workaround from `wsgi.py`.
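
As a rough sketch of what the new cache configuration could look like (not
the exact code in this commit): the `django_redis.cache.RedisCache` backend
and `DefaultClient` class names come from the django-redis docs, whilst the
`stunnel_url()` and `build_caches()` helpers are hypothetical names, and the
"stunnel port is the regular port plus one" rule is an assumption taken from
the Heroku article linked above.

```python
from urllib.parse import urlsplit


def stunnel_url(redis_url):
    """Return a TLS (`rediss://`) URL pointing at the stunnel port.

    Assumption: the stunnel port is the regular Redis port plus one.
    """
    parts = urlsplit(redis_url)
    if parts.scheme != "redis" or parts.port is None:
        # Already TLS, or no explicit port to adjust, so leave unchanged.
        return redis_url
    # Keep any `user:password@host` prefix and only swap the port.
    prefix = parts.netloc.rsplit(":", 1)[0]
    netloc = "{}:{}".format(prefix, parts.port + 1)
    return parts._replace(scheme="rediss", netloc=netloc).geturl()


def build_caches(redis_url, use_tls=False):
    """Build a Django `CACHES` setting that uses django-redis."""
    location = stunnel_url(redis_url) if use_tls else redis_url
    return {
        "default": {
            "BACKEND": "django_redis.cache.RedisCache",
            "LOCATION": location,
            "OPTIONS": {
                "CLIENT_CLASS": "django_redis.client.DefaultClient",
            },
        }
    }
```

In `settings.py` this would be called with the explicit `REDIS_URL` value
from the environment, with `use_tls` only enabled when running on Heroku.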
Note: Whilst the Heroku docs suggest using `django-redis-cache`, it's
not as actively maintained, so we're using `django-redis` instead:
https://devcenter.heroku.com/articles/heroku-redis
https://github.com/niwinz/django-redis
https://github.com/sebleier/django-redis-cache
Before this is merged, Heroku Redis instances will need to be created
for stage/production (prototype done) - likely on plan `premium-3`:
https://elements.heroku.com/addons/heroku-redis
We'll also need to pick an eviction policy:
https://devcenter.heroku.com/articles/heroku-redis#maxmemory-policy
Once deployed, the `memcachier-tls-buildpack` buildpack will no longer
be required and can be removed from prototype/stage/prod, along with
the `TREEHERDER_MEMCACHED` environment variable and Memcachier addon.
Uses our custom `clear_cache` Django management command instead of
restarting memcached, since (a) for memcached it has the same effect,
and (b) it will continue to work even after we switch to Redis (which,
unlike memcached, persists cache contents across server restarts).
Removes the `threstartmemcached` command since IMO it's no simpler
than just running `./manage.py clear_cache` on its own.
On platforms that are not Windows, the Vagrantfile enables a second
virtual network adapter to allow NFS to work. However this breaks
the fix in bug 1417099, since it only expected one interface to be
returned by the grepped `ip` output. We could have adjusted that fix
to instead iterate through the results, however it's easier to just
use a wildcard with `iptables` instead, since we know all ethernet
adapters start with the letter `e`.
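
To illustrate why the wildcard is sufficient, here is a small Python model
of how `iptables` matches interface names (a trailing `+` matches any name
starting with that prefix). The function name is hypothetical and this only
mimics the matching rule, it doesn't call `iptables` itself.

```python
def iptables_interface_matches(pattern, interface):
    """Mimic iptables interface matching for `-i`/`-o` patterns.

    A pattern ending in `+` matches every interface name that begins
    with the text before the `+`; otherwise the match is exact.
    """
    if pattern.endswith("+"):
        return interface.startswith(pattern[:-1])
    return interface == pattern


# `e+` covers both adapters, whichever naming scheme is in use,
# but never the loopback adapter.
for name in ("eth0", "enp0s3", "enp0s8", "lo"):
    print(name, iptables_interface_matches("e+", name))
```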
As of yarn 1.0+, if the version of node or yarn doesn't match that
specified in `package.json`, yarn commands will fail to run and
complain about incompatible versions. Unfortunately the error message
doesn't really make it clear that the user has the wrong version
installed, and instead gives the impression something is broken with
Treeherder. In addition, much of the time the exact version isn't
important, so the user doesn't need to go to the trouble of changing
their node installation.
As such, we just disable the yarn engines check for now.
The bento image has renamed the network interface from `enp0s3` back
to `eth0`, effectively reverting the upstream Ubuntu 16.04 "predictable
network interface names" feature. We now have to calculate the name
dynamically, since we don't know which version of the bento image
will be installed locally when people use the Treeherder Vagrant
environment.
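
The lookup itself isn't shown in this message, so the following is only a
sketch of the idea in Python rather than the actual implementation. It
assumes the one-line-per-interface output of `ip -o link show`, and the
function name `first_ethernet_interface()` is made up for illustration.

```python
def first_ethernet_interface(ip_link_output):
    """Return the first interface whose name starts with `e`, or None.

    Expects the output of `ip -o link show`, where each line looks like
    `2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 ...`.
    """
    for line in ip_link_output.splitlines():
        fields = line.split(": ", 2)
        if len(fields) < 2:
            continue
        # Drop any `@parent` suffix (eg `eth0@if5`).
        name = fields[1].split("@")[0]
        if name.startswith("e"):
            return name
    return None


sample = (
    "1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN\n"
    "2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP\n"
)
print(first_ethernet_interface(sample))
```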
Since the `.profile` script is sourced in the login shell, using
strict mode can cause the shell to exit and so break `vagrant ssh`.
Instead we fix the SC2164 warning using the method here:
https://github.com/koalaman/shellcheck/wiki/SC2164
Since in most cases people will use it from the host anyway, and if
they don't it can lead to committing changes under the author
`vagrant <vagrant@vagrant.vm>`.
To prevent errors during Vagrant provisioning after a new nodejs
version is released, due to version mismatch between package.json
(used by Heroku) and the APT package used by Vagrant. The other
option is pinning the version in Vagrant too, however we might as
well wait until the Docker based development environment where
state management/invalidation is simpler.
By default MySQL's InnoDB flushes to disk after every transaction to
reduce the chance of data loss in the case of a crash. During testing
and development this is not necessary and only serves to limit I/O
throughput. By switching to mode `0` (flushing every second), test
runtimes are reduced by 20-30%.
See:
https://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html#sysvar_innodb_flush_log_at_trx_commit
Since it's mostly pointless locally, given that the DB is rarely
large enough to cause a slow query and that it's unlikely that devs
check the logs anyway.
To prevent confusing pytest `ImportMismatchError` exceptions if the
paths hardcoded in pyc files (which are git-ignored) left over from
the past don't match the paths used now.
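
The cleanup amounts to deleting every stale `.pyc` file under the source
directory. A minimal Python sketch of that idea follows; the function name
is hypothetical, and the real provisioning step may well be a shell
one-liner instead.

```python
from pathlib import Path


def remove_stale_bytecode(root):
    """Delete all `.pyc` files below `root` and return how many were removed."""
    # Collect the paths first so we aren't deleting whilst still walking.
    stale = list(Path(root).rglob("*.pyc"))
    for pyc in stale:
        pyc.unlink()
    return len(stale)
```

Python regenerates the bytecode on the next import, so the only cost is a
slightly slower first run.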
To reduce the risk that one of the commands we run is broken by things
that need cleaning up (eg pyc files). An unnecessary `|| true` has also
been removed from the moved `rm -f` command.
Now that we're using MySQL 5.7 server, there is no longer a conflict
between our desired choice of client library (which had to be 5.7 to
avoid security vulnerabilities) and the server version.
The library is still vendored on Heroku for now (in `bin/pre_compile`),
but that can stop too, once we switch to the Heroku-16 stack which is
based on Ubuntu 16.04 (bug 1365567).
The Vagrant environment is running Ubuntu 16.04, which has a native
mysql-server-5.7 package available, so doesn't need a custom APT
repository to be set.
Travis is still on Ubuntu 14.04 (since they don't offer 16.04 yet),
so has to use the upstream mysql.com repository instead.
In the future I'll be switching both Travis and local development to
use the same Docker images to avoid these kinds of differences.
After this lands, we'll want to open a PR against the Terraform configs
to switch the treeherder-dev RDS instance to MySQL 5.7, before doing
the same for stage/prod.
At some point soon we'll want to switch Heroku stage/production to
their new Heroku-16 (Ubuntu 16.04) stack, so it makes sense to trial
it locally first. Their Heroku-16 docker image will make switching
to docker simpler too.
Notable changes to the provision steps:
* We're using the Bento boxes rather than the 'official' Canonical
ones, at the recommendation of Vagrant:
https://www.vagrantup.com/docs/boxes.html#official-boxes
* openjdk-8-jre-headless is now available from the ubuntu.com
repository, so doesn't require a PPA.
* There isn't an official mysql-server-5.6 package (only 5.7+), so
we have to start using a PPA.
* Services must now be managed using systemd rather than SysV init.
* The `eth0` network interface has changed name:
https://www.freedesktop.org/wiki/Software/systemd/PredictableNetworkInterfaceNames/
* The `vendor-libmysqlclient.sh` script will still be run on Ubuntu
14.04 on Travis/Heroku for now, so has to be compatible with both.
* The previous temporary cleanup steps can be removed since the
vagrant destroy means a clean slate.
Since the third-party PPA is no longer maintained, and doesn't have
Python 2.7.13, which is the version used on Heroku. Instead we copy
the Heroku Python buildpack installation method for greater parity:
18c404f72d/bin/steps/python
Adds the virtualenv directory to `PATH` in `env.sh`, to save having
to activate the virtualenv (which is about to be removed) or set
`PATH` in each of the wrapper bin scripts.
By default webservers like Django's runserver, gunicorn or the
Webpack devserver only bind to the loopback adapter (127.0.0.1) and
so are not accessible from outside the Vagrant / virtualbox VM,
since port forwarding only forwards traffic to the non-loopback
adapters.
Previously varnish (which listened on `0.0.0.0`) was reverse
proxying traffic to runserver/gunicorn, however we need to now do so
for webpack-dev-server on another port too. Doing both with varnish
adds complexity, and we don't actually need any of varnish's other
features, so ideally want to stop using it.
Rather than having to override each webserver to bind to all
adapters (using the IP `0.0.0.0`), it's possible to forward traffic
to the loopback adapter using iptables NAT PREROUTING rules. This
is still secure so long as the Vagrantfile port forwarding uses a
`host_ip` of `127.0.0.1`. To prevent this "Martian packet" traffic
from being blocked, `route_localnet` must also be set to `1`. See:
https://unix.stackexchange.com/questions/111433/iptables-redirect-outside-requests-to-127-0-0-1
By default neither sysctl nor iptables settings are persisted across
reboots, and fixing that requires more complexity (eg installing the
iptables-persistent package and handling config changes during
provision). As such, it's just easier to re-run the commands on each
login since they take <30ms.
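
For reference, the commands involved look roughly like the ones generated
below. This is a sketch that only builds the argument lists (running them
needs root); the helper name and the example ports are assumptions, and the
rules actually added by this commit may differ in detail.

```python
def loopback_forwarding_commands(ports):
    """Return the sysctl/iptables commands that forward `ports` to 127.0.0.1."""
    commands = [
        # Allow packets from other adapters to be routed to the loopback range.
        ["sysctl", "-q", "-w", "net.ipv4.conf.all.route_localnet=1"],
    ]
    for port in ports:
        commands.append([
            "iptables", "-t", "nat", "-A", "PREROUTING",
            "-p", "tcp", "--dport", str(port),
            "-j", "DNAT", "--to-destination", "127.0.0.1:{}".format(port),
        ])
    return commands


# Hypothetical ports for the Django and webpack dev servers.
for command in loopback_forwarding_commands([8000, 5000]):
    print(" ".join(command))
```

Since `-A` appends a new rule every time, a script that is re-run on each
login would want to check for an existing rule first (eg with
`iptables -C`) to avoid piling up duplicates.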
* ES 5.x now requires JDK 8, and Ubuntu 14.04 only ships with openjdk 7,
so a third party PPA must be used in Vagrant:
https://launchpad.net/~openjdk-r/+archive/ubuntu/ppa
* ES 5.x changed the way it manages the heap size, such that:
- The variables for controlling it have changed (now set via eg
`ES_JAVA_OPTS="-Xms1g -Xmx1g"` or in the jvm.options file). See:
https://www.elastic.co/guide/en/elasticsearch/reference/5.2/heap-size.html
- The default heap size has increased from ~(min: 256MB, max: 1GB) to
(min: 2GB, max: 2GB) which causes OOM in the VM, unless either
lowered back down or the VM RAM increased.
* The Python ES clients must be updated to the latest releases:
https://elasticsearch-py.readthedocs.io/en/master/#compatibility
* The previous test failures were fixed in #2403.
* The Vagrant provision now also waits for Elasticsearch to be ready
before trying to run the Django migrations, since Elasticsearch can
take a while to start (and always has). This prevents failures when
the pip/yarn install steps are no-ops (when already up to date),
causing the Django migration to run immediately after the ES install
step.
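
The wait is a simple poll-until-ready loop. A Python sketch of the pattern
is below; the provision script itself is presumably shell, and the function
names, the default URL (Elasticsearch's usual port 9200) and the
attempt/delay values are all assumptions.

```python
import time
import urllib.error
import urllib.request


def elasticsearch_is_up(url="http://localhost:9200", timeout=2):
    """Return True if an HTTP request to `url` succeeds with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, reset or timed out: not ready yet.
        return False


def wait_until(check, attempts=30, delay=1.0, sleep=time.sleep):
    """Call `check()` up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

The migrations would then only be run once
`wait_until(elasticsearch_is_up)` returns `True`.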
The pip output is pretty noisy by default and `-q` is much too quiet.
As such the filtering with `sed` is about all we can do (and is what the
Heroku Python buildpack does too).
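
Which lines the `sed` expression drops isn't spelled out here, so as an
illustration only, the Python equivalent below filters out pip's
"Requirement already satisfied" lines (an assumed choice of noisy prefix)
whilst leaving everything else untouched.

```python
def filter_pip_output(lines, noisy_prefixes=("Requirement already satisfied",)):
    """Return `lines` without those starting with one of `noisy_prefixes`."""
    return [
        line for line in lines
        if not line.lstrip().startswith(noisy_prefixes)
    ]


output = [
    "Requirement already satisfied: Django in ./venv/lib/python2.7/site-packages",
    "Collecting redis",
    "Successfully installed redis",
]
print("\n".join(filter_pip_output(output)))
```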
In a later PR I'm going to be removing the virtualenv and global
Python/pip installs entirely (in favour of vendoring the same Python
binary that Heroku uses), which will simplify the steps being added here
and in the next few commits.