WhiteNoise 3.0 now supports serving Brotli-compressed files to browsers
whose `Accept-Encoding` includes `br`. Note: Both Firefox and Chrome
only support Brotli over HTTPS.
To take advantage of this, the Brotli package just needs to be available
when the compression tool (`python -m whitenoise.compress`) is run. See:
http://whitenoise.evans.io/en/latest/changelog.html#brotli-compression-support
http://whitenoise.evans.io/en/latest/django.html#brotli-compression
The WhiteNoise docs say to use an unofficial PyPI package (brotlipy),
however this has a dependency on libffi (via cffi) and the official repo
now has its own Python wrapper that does not. As such, this commit
instead uses the official Brotli package from GitHub, whilst we wait for
the official PyPI release (https://github.com/google/brotli/issues/72).
The Brotli install works fine on stage/prod/Heroku/Travis. The Vagrant
environment was missing g++, which is now installed during provision.
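For illustration, the `Accept-Encoding` check described above can be
sketched roughly as follows. This is not WhiteNoise's actual code; the
function name `pick_encoding` and the `available` argument are
hypothetical, and the parsing is deliberately simplified.

```python
def pick_encoding(accept_encoding, available):
    """Return the best encoding to serve, or None for the uncompressed file.

    `available` is the set of encodings for which a pre-compressed copy
    exists on disk, e.g. {"br", "gzip"} (hypothetical argument).
    """
    offered = set()
    for item in accept_encoding.split(","):
        # Split "gzip;q=0.8" into the token and its optional parameters.
        token, _, params = item.strip().partition(";")
        token = token.strip().lower()
        if not token:
            continue
        params = params.replace(" ", "").lower()
        if params.startswith("q="):
            try:
                # A quality value of zero means "not acceptable".
                if float(params[2:]) == 0:
                    continue
            except ValueError:
                # Ignore entries with a malformed quality value.
                continue
        offered.add(token)
    # Prefer Brotli over gzip when the client offers both.
    for encoding in ("br", "gzip"):
        if encoding in offered and encoding in available:
            return encoding
    return None
```

A client that does not list `br` simply falls back to gzip or to the
uncompressed file.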
As of pip 8, peep's hash-checking feature is built into pip itself.
Migrating from peep to this native feature has several advantages:
* It avoids the complexity/learning curve of using a wrapper around pip.
* It means we do not need to fork the official Heroku Python buildpack
(which handles pip installation of requirements files) in order to use
hash verification on Heroku. (Once the buildpack updates to pip 8.)
* Omitted sub-dependencies result in install-time errors rather than
the user discovering omissions at run-time.
* pip's native caching is used, and all packages are installed in one
pip invocation, so it's significantly faster.
* It has better handling of errors and corner cases.
Key facts about the native feature:
* Hash-checking mode is enabled if at least one hash is found in the
requirements files passed to pip, or can be force-enabled by passing
`--require-hashes` when running `pip install`.
* Once enabled, hash-checking mode enforces that all packages:
- are pinned to a specific version
- have hashes listed
- have all sub-dependencies specified
* Older versions of pip will error out if either `--require-hashes` or
the requirements file `--hash` syntax is used, meaning it's not
possible to accidentally lose hash-checking protection if the pip used
is older than expected.
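As a rough sketch of what a pinned, hashed requirements entry looks
like, the snippet below computes the sha256 digest of an archive and
formats it with the `--hash` syntax. The helper name `requirement_line`
and the package name in the usage note are hypothetical; in practice
`pip hash <file>` prints the digest.

```python
import hashlib


def requirement_line(name, version, archive_bytes):
    """Format a pinned requirement with its sha256 hash.

    `archive_bytes` is the raw content of the sdist or wheel that pip
    would download (hypothetical helper, for illustration only).
    """
    digest = hashlib.sha256(archive_bytes).hexdigest()
    # Pinned version plus hash, as hash-checking mode requires.
    return "{}=={} --hash=sha256:{}".format(name, version, digest)
```

For example, `requirement_line("example", "1.0", data)` yields a line
of the form `example==1.0 --hash=sha256:<64 hex characters>`.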
For more details, see:
https://pip.pypa.io/en/stable/user_guide/#hash-checking-mode
https://pip.pypa.io/en/stable/reference/pip_install/#hash-checking-mode
The pip version on Travis and in the Vagrant virtualenv has been updated
to 8.0.2 in bug 1241144, and the stage/prod virtualenv in bug 1241519.
The Heroku Python buildpack pip was updated in bug 1241909.
The requirements file hashes were ported using `peep port`, and then
comments/URLs re-added by hand.
We're already using pylibmc on Heroku, since we need its SASL
authentication support. However we were still using python-memcached on
Vagrant, Travis, stage & prod.
To reduce the number of simultaneous changes when we migrate to Heroku,
and to ensure at that point we're testing what we ship, this switches us
to pylibmc pre-emptively for all environments.
Django does have a native pylibmc backend [1], however it doesn't
support SASL authentication, so we have to use the custom django-pylibmc
backend instead [2], which we choose to use everywhere (even though only
Heroku's memcache instances require auth) for consistency.
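A minimal sketch of a Django cache setting using the django-pylibmc
backend follows. The environment variable names and the localhost
default are assumptions for illustration, not necessarily what this
commit uses; the binary protocol is enabled because SASL authentication
depends on it.

```python
import os

# Sketch only: the variable names below are assumed, not taken from
# the actual settings file.
CACHES = {
    "default": {
        "BACKEND": "django_pylibmc.memcached.PyLibMCCache",
        "LOCATION": os.environ.get("MEMCACHE_SERVERS", "localhost:11211"),
        # SASL authentication needs the binary protocol.
        "BINARY": True,
        # Left as None where the memcache instance requires no auth.
        "USERNAME": os.environ.get("MEMCACHE_USERNAME"),
        "PASSWORD": os.environ.get("MEMCACHE_PASSWORD"),
    }
}
```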
Installing pylibmc requires libmemcached-dev, which is installed by
default on Travis, has just been installed on stage/prod (bug 1243767),
and as of this change, is now installed in the Vagrant environment too.
The comment mentioning that pylibmc must be present in the root
requirements.txt file no longer applies, since the Python buildpack now
uses pip-grep (rather than regex) to determine whether it is in the
requirements files [3], and so handles included requirements files too.
Example: https://emorley.pastebin.mozilla.org/8858007
I'll be checking for any changes in performance when this is deployed,
however if anything I expect it to be faster since it's not pure Python.
[1] https://github.com/django/django/blob/1.8.7/django/core/cache/backends/memcached.py#L171
[2] https://github.com/django-pylibmc/django-pylibmc/blob/master/django_pylibmc/memcached.py
[3] https://github.com/heroku/heroku-buildpack-python/blob/v75/bin/steps/pylibmc#L22
Removes the gcc and libxml2-dev entries.
The gcc entry is a no-op, since no newer version is available for Ubuntu
14.04. It's not clear what previously required libxml2-dev, however it
was introduced in a commit that added AWS support for the now
decommissioned AWS dev environment. Either way the provision succeeds
without it now.
The individual RABBITMQ_* variables are never used on their own, so it
makes more sense to switch to a URL variable that combines all of them.
Travis/Vagrant have also had BROKER_URL explicitly set, since we're
generally moving away from having testing defaults set in settings.py.
Once stage/prod have BROKER_URL set, we can remove the fallback to the
old variable names, and also remove defaults entirely, making missing
settings fail fast.
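The interim fallback could look something like the sketch below. The
individual variable names (RABBITMQ_USER and so on) are assumed for
illustration, since only the RABBITMQ_ prefix is given above, and the
helper name `broker_url` is hypothetical.

```python
from urllib.parse import quote


def broker_url(environ):
    """Return BROKER_URL, falling back to the legacy RABBITMQ_* variables.

    The legacy variable names used here are assumptions.
    """
    url = environ.get("BROKER_URL")
    if url:
        return url
    # Percent-encode each component so characters such as "@" or "/"
    # in a password cannot break the URL structure.
    user = quote(environ["RABBITMQ_USER"], safe="")
    password = quote(environ["RABBITMQ_PASSWORD"], safe="")
    vhost = quote(environ["RABBITMQ_VHOST"], safe="")
    host = environ["RABBITMQ_HOST"]
    port = environ["RABBITMQ_PORT"]
    return "amqp://{}:{}@{}:{}/{}".format(user, password, host, port, vhost)
```

Once the fallback is removed, the function reduces to a plain lookup of
BROKER_URL that raises if the variable is missing.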
Now that we're using django-environ's env.bool() in settings.py, the
value of TREEHERDER_DEBUG set in the Vagrant environment is correctly
picked up, so the local settings file DEBUG define is unnecessary.
This means rather than just returning this to the client:
'{"detail":"Hawk authentication failed"}'
We'll output more information to the stage/prod/Vagrant console and thus
logs, e.g.:
WARNING [hawkrest:81] access denied: MacMismatch: MACs do not match;
ours: 8HqvrjA7MO0CmJ3c6Ij2VfEuZxkchEL+pOL6Cw7At2c=; theirs:
CyDv9vnmqP6yGqT+LknMA+FOwJvSLATJwvVgG0OTl8A=
Thanks to the log.warn() here:
https://github.com/kumar303/hawkrest/blob/0.0.8/hawkrest/__init__.py#L80
I've removed the log level set on the console handler, since I think
it's clearer to set them on the loggers themselves, but it's not
entirely clear what Django logging best practices are, since the Django
docs are not very specific.
Since they're not specific to the Django app 'webapp'.
Whilst we're there, the local & example settings files have been
renamed. In the future I'd like to combine settings_local.example.py
with puppet/files/treeherder/local.vagrant.py, but I'll do that in
another bug.
Ideally we'd remove local.py entirely, in favour of environment
variables. However at least for now, it's still needed for setting the
logging preferences/enabling debugging apps for the Vagrant environment.
As such, it's being kept but disabled everywhere other than for local
development.
The stage/prod local.py files have been vetted, and (a) the contents
confirmed to match that in the environment, and (b) all affected
settings in base.py checked to ensure they read the correct environment
variable. Heroku doesn't have a local.py. As such, this should be a
no-op other than locally, where a Vagrant provision will be required.
I added a create_credentials command to help set up the initial
development environment. The puppet setup now creates a new user and sets
it as the owner of the treeherder-etl credentials.
Previously if TREEHERDER_DJANGO_SECRET_KEY was not set, we'd silently
fall back to a default value for SECRET_KEY, meaning we wouldn't realise
we were using an insecure key on a live deployment instance.
With this change, TREEHERDER_DJANGO_SECRET_KEY being missing from the
environment is fatal, resulting in:
"ImproperlyConfigured: The SECRET_KEY setting must not be empty."
The MPL 2.0 terms state that as long as a LICENSE file is present, the
per-file header text is not required. See "Exhibit A" at the end of:
https://www.mozilla.org/MPL/2.0/
For bug 1124278, we're going to want to sprinkle New Relic annotations
around the codebase, so by always installing it, we save having to stub
these out in development/on Travis. It also seems wise to have prod
running as close to the same packages as in development.
Since NEW_RELIC_LICENSE_KEY isn't set locally, plus
NEW_RELIC_DEVELOPER_MODE is set to true, the New Relic agent doesn't
submit anything. See:
https://docs.newrelic.com/docs/agents/python-agent/installation-configuration/python-agent-configuration#developer_mode
Removes the default value for DATABASE_URL and DATABASE_URL_RO, so that
if either is not set, there is a clearer upfront Django error about
databases not being configured, rather than timeouts trying to connect
to localhost.
This makes it more obvious when setting up a new instance that the
variables are not present in the environment (and would have avoided
some confusion on Heroku today, when DATABASE_URL was set, but
DATABASE_URL_RO was not) - in exchange for now needing two extra lines
in the puppet config for local testing.
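The general fail-fast idea can be sketched as below. This is a
stand-in for illustration rather than what the commit does (the commit
relies on Django's own error); the helper names are hypothetical.

```python
def missing_settings(environ, required=("DATABASE_URL", "DATABASE_URL_RO")):
    """Return the names of required variables that are unset or empty."""
    return [name for name in required if not environ.get(name)]


def check_settings(environ):
    """Raise at startup, listing every missing variable at once."""
    missing = missing_settings(environ)
    if missing:
        raise RuntimeError(
            "Missing environment variables: " + ", ".join(missing)
        )
```

With only DATABASE_URL set, `check_settings()` fails immediately and
names DATABASE_URL_RO, instead of timing out against localhost.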
Since it only speeds up parsing by a few percent of total runtime, and
is therefore not worth the added complexity for deployment and local
hack-test-debug cycles when working on the log parser.
The .gitignore and update.py entries will be removed in a later commit,
once the stage/prod src directories have been cleaned up.
In order that we can serve the UI on Heroku, we wrap the Django wsgi app
with WhiteNoise, so both the UI and API requests are served by gunicorn.
In the Vagrant environment, Apache has been removed and Varnish instead
now proxies all requests to gunicorn/Django runserver directly, without
Apache as a go-between.
The UI on production will not be affected by this commit, since the
Apache config there will still intercept requests for the UI assets
rather than proxying them to gunicorn.
It's worth noting too, that we're not able to make use of WhiteNoise's
automatic Django GZip/caching support since that assumes we are using
Django templates and referring to resources using {% static "foo.css" %}.
However, we can sub-class WhiteNoise (or more specifically the
DjangoWhiteNoise class) and override the is_immutable_file() method to
add caching support at a later date:
http://whitenoise.evans.io/en/latest/base.html#caching-headers
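As a sketch of the kind of check such an override might delegate to,
the function below treats a file as immutable when its name embeds a
content hash. The naming scheme ("app.3f2a9c1d.js") and the function
name are assumptions; the UI build does not necessarily produce such
names today.

```python
import re

# Hypothetical naming scheme: <name>.<8 or more hex chars>.<extension>
HASHED_NAME = re.compile(r"^.+\.[0-9a-f]{8,}\.[A-Za-z0-9]+$")


def looks_immutable(url_path):
    """Return True if the file name appears to contain a content hash."""
    # Only the final path component is relevant to the check.
    filename = url_path.rsplit("/", 1)[-1]
    return HASHED_NAME.match(filename) is not None
```

Files that match could safely be served with far-future caching
headers, since any change to their content would change their name.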
Documentation for WhiteNoise can be found at:
http://whitenoise.evans.io/
Since we should be doing this using Django's GZip middleware, not in
Varnish, which is only used in development. At least with this removed,
we won't check a local instance and believe we have everything set up
correctly if it's not.
This removes the puppet workaround added by bug 1144805, since the
Ubuntu 14.04 vagrant image comes with puppet 3.4.3, and so includes the
fix for:
https://projects.puppetlabs.com/issues/10907
This uses the newer image from:
https://vagrantcloud.com/ubuntu/boxes/trusty32
This also updates us from Apache 2.2 to Apache 2.4, which means the
Apache config must be changed to be compatible with Apache 2.4:
http://httpd.apache.org/docs/2.4/upgrading.html
(The syntax for "Allow from all" has changed, plus the default config
now specifies a "Require all denied" which we have to override for the
UI directory)
A few puppet steps were either depending on the wrong package (i.e.
apache2-dev instead of apache2) or else did not have a 'require'
property set at all. This was causing errors during provision due to
races - particularly after updating to Ubuntu 14.04.