Now that consumers of OrangeFactor have been switched to the new
intermittent failures view UI/API, we can stop submitting failure
classifications to OrangeFactor's Elasticsearch instance.
Adds the virtualenv directory to `PATH` in `env.sh`, to save having
to activate the virtualenv (which is about to be removed) or set
`PATH` in each of the wrapper bin scripts.
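As a rough illustration of the effect (the real change lives in the shell file `env.sh`; the function name and virtualenv path below are hypothetical), putting a directory first in `PATH` means its executables are found without any activation step:

```python
import os


def prepend_to_path(environ, directory):
    """Return a copy of environ with directory placed first in PATH."""
    updated = dict(environ)
    existing = updated.get("PATH", "")
    # Avoid a trailing separator when PATH was previously empty or unset.
    updated["PATH"] = directory + os.pathsep + existing if existing else directory
    return updated


# Hypothetical virtualenv location, for illustration only.
env = prepend_to_path({"PATH": "/usr/bin"}, "/home/vagrant/venv/bin")
print(env["PATH"])
```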
Now that we're no longer using datasource, the memory leaks previously
seen have gone. As such there is no need to make the celery worker
processes restart so frequently (they will now only be restarted at
the daily Heroku dyno restart), which will improve performance.
In addition, this will reduce the number of transactions that don't get
reported to New Relic (when the maxtasksperchild threshold is
reached, the worker is forcibly killed before it can submit the
New Relic payload).
The log parser tasks have been left unchanged for now, since they
appear to still be leaking.
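A minimal sketch of the idea, assuming the limit is passed as a worker command-line option (the helper, the queue names and the value 50 are hypothetical; `--maxtasksperchild` is the Celery 3.x spelling, and newer releases use `--max-tasks-per-child`):

```python
def worker_command(queue, max_tasks_per_child=None):
    """Build a celery worker argv, adding a recycle limit only when requested."""
    argv = ["celery", "worker", "-Q", queue]
    if max_tasks_per_child is not None:
        # Only workers that still leak memory get recycled after N tasks.
        argv.append("--maxtasksperchild={}".format(max_tasks_per_child))
    return argv


# Hypothetical queue names: a general worker with no limit, and a
# log parser worker that keeps a limit because it still leaks.
print(worker_command("default"))
print(worker_command("log_parser", max_tasks_per_child=50))
```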
Outputting to the console rather than a log file:
* is more user-friendly during development
* is more consistent with Heroku
* means the Vagrant-specific Django LOGGING config is now closer to the
one in settings.py, and so more easily combined with it
Both gunicorn and celery default to outputting to stdout/stderr, so the
`logfile` options can be omitted entirely.
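For illustration, a console-only logging config might look roughly like the sketch below. This is not the actual Treeherder LOGGING dict: the formatter string and logger name are assumptions, and plain `logging.config.dictConfig` stands in for Django's own handling of the `LOGGING` setting.

```python
import logging
import logging.config

LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "standard": {
            "format": "[%(asctime)s] %(levelname)s [%(name)s:%(lineno)s] %(message)s",
        },
    },
    "handlers": {
        # StreamHandler writes to stderr by default, so no log file is needed.
        "console": {
            "class": "logging.StreamHandler",
            "formatter": "standard",
        },
    },
    "root": {"handlers": ["console"], "level": "INFO"},
}

logging.config.dictConfig(LOGGING)
logging.getLogger("example").info("written to the console, not a log file")
```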
Previously if the `bin/` directory scripts were run in the Vagrant
environment, the processes run would not have been started via the
New Relic wrapper command, since the New Relic licence key is not set.
For greater consistency between Vagrant and production, the wrapper
command is now always used.
This works since `NEW_RELIC_DEVELOPER_MODE` is defined in the Vagrant
environment (thanks to a previous commit in the same bug), which
prevents the agent from trying to submit real data:
https://docs.newrelic.com/docs/agents/python-agent/installation-configuration/python-agent-configuration#developer_mode
The MPL 2.0 terms state that as long as a LICENSE file is present, the
per-file header text is not required. See "Exhibit A" at the end of:
https://www.mozilla.org/MPL/2.0/
In order to rsync the virtualenv as part of deployment, we'll need to
use |virtualenv --relocatable|. However, this does not update the
activate scripts (making them unusable), and the changes it makes in the
virtualenv bin directory require that the virtualenv's python is first
in the path:
https://virtualenv.pypa.io/en/latest/userguide.html#making-environments-relocatable
As such, we must make the Treeherder runner scripts add the virtualenv
bin to PATH instead of using the activate pattern.
The existing name "curr_dir" isn't overly clear - does it mean the
current working directory or the directory in which this script exists?
It should also be uppercase. I think the behaviour is clearer when
reworked like this :-)
There's a lot of other cleanup that could be done (e.g. reducing
duplication), but saving that for bug 1153971.
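The distinction between the two meanings can be sketched as follows. The scripts themselves are shell, so this is only an analogy, and the `SRC_DIR` name and example path are hypothetical:

```python
import os


def source_dir(script_path):
    """Directory containing the script, independent of the current working directory."""
    return os.path.dirname(os.path.abspath(script_path))


# Hypothetical script location, for illustration only.
SRC_DIR = source_dir("/home/vagrant/treeherder/bin/example_script")

# The working directory depends on where the caller happens to be,
# whereas SRC_DIR is derived from the script's own location.
print(SRC_DIR)
print(os.getcwd())
```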
...into 'classification_mirroring' and 'publish_to_pulse'. This gives
greater visibility into the relative queue sizes of each, allows us to
move one of them to another node, plus means we can pause consumption of
the classification_mirroring task for when the Elasticsearch indices
are migrated in bug 1142538.
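A rough sketch of such a split, assuming Celery 3.x style `CELERY_ROUTES` routing (the dotted task names are hypothetical; only the two queue names come from this commit):

```python
# Hypothetical task names mapped to the two new queues.
CELERY_ROUTES = {
    "example.tasks.mirror_classification": {"queue": "classification_mirroring"},
    "example.tasks.publish_to_pulse": {"queue": "publish_to_pulse"},
}


def queue_for(task_name, default="default"):
    """Look up which queue a task is routed to, falling back to a default."""
    return CELERY_ROUTES.get(task_name, {}).get("queue", default)


print(queue_for("example.tasks.mirror_classification"))
print(queue_for("example.tasks.publish_to_pulse"))
```

With separate queues, pausing one task type amounts to not running a worker that consumes from its queue, while the other continues unaffected.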