This commit changes all references to 'builds-4h' to 'buildbot_text'. The
changed files, along with the number of occurrences changed in each, are:
1. tests/etl/test_buildapi.py: 3 occurrences
2. tests/sample_data/job_data.txt: 304 occurrences
3. treeherder/etl/buildapi.py: 1 occurrence
4. treeherder/model/sample_data/job_data.json.sample: 2 occurrences
In treeherder/webapp/api/logslice.py, a conditional was removed. The TODO
above it said to remove it once this bug was addressed.
To make sure all tests run properly, three files were renamed. Only the portion
of the filename that said 'builds-4h' was changed to say 'buildbot_text'.
We're no longer using the vendor directory & this script wasn't entirely
reliable anyway, so let's remove it. The virtualenv package can be
removed from dev.txt, since virtualenv is installed globally, and
nothing inside our virtualenv (which is where the packages in dev.txt
end up) needs a local installation of it.
All packages in the vendor directory are now installed in the virtualenv
site-packages, so there's no need to add the in-repo vendor directory to
the Python path.
This differentiation was only useful when explaining which packages
could be listed in which requirements file (since compiled packages
could not be added to checked-in.txt). Now that all packages are peep
installed, common.txt contains both pure and compiled packages.
Previously, the requests package had to be listed in dev.txt even
though it was in the vendor directory, since it was used by conftest.py
before the vendor directory was added to the Python path. Now that the
packages in checked-in.txt have been moved to common.txt, 'requests' is
listed in two requirements files that are peep installed, so we can
remove the dupe.
Now that we're using virtualenvs and peep to manage packages in
production, there's no need to use an in-repo vendor directory. As
such, all packages that were in checked-in.txt have been moved to
common.txt, so they will now be peep installed during deployment/testing
and also during the provision of the Vagrant environment.
Prior to this change, on stage/production we didn't use virtualenvs
(unlike dev/the local Vagrant project) and instead pip installed
packages globally (when puppet ran periodically), using requirements
files that are not in the repo.
Now during deployment, a virtualenv is created and then populated using
peep (which uses hashes to verify the contents of packages before pip
installing them). The virtualenv is then made relocatable (as best it
can; the feature isn't perfect), the lib64 symlinks are made relative,
and then the virtualenv is rsynced to all nodes, along with the source.
The one main remaining limitation of --relocatable is that the bash
activate script will not work on the other nodes - however the wrapper
scripts under treeherder-service/bin/ add venv/bin/ to PATH so using the
activate script is unnecessary for them. This just leaves running
manage.py commands locally on a node, for which we can use
|../venv/bin/python manage.py foo| or an alias, or else we can fix up
the activate scripts in a follow-up bug.
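
To illustrate the lib64 step described above, here is a minimal sketch of
rewriting an absolute symlink as a relative one. It is written in Python
for convenience and is not the deploy script's actual implementation,
which is not shown in this message; the helper name
make_symlink_relative is hypothetical.

```python
import os


def make_symlink_relative(link_path):
    """Rewrite an absolute symlink so that its target is relative.

    Hypothetical helper: returns the (possibly unchanged) link target.
    """
    target = os.readlink(link_path)
    if not os.path.isabs(target):
        # Already relative, so it will survive being rsynced elsewhere.
        return target
    link_dir = os.path.dirname(os.path.abspath(link_path))
    relative_target = os.path.relpath(target, link_dir)
    # Replace the link itself; the directory it points at is untouched.
    os.remove(link_path)
    os.symlink(relative_target, link_path)
    return relative_target
```

A relative link such as lib64 -> lib keeps resolving after the
virtualenv has been copied to a different path on another node.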
In order to rsync the virtualenv as part of deployment, we'll need to
use |virtualenv --relocatable|. However this does not update the
activate scripts (making them unusable), and the changes it makes in the
virtualenv bin directory require that the virtualenv's python is first
in the path:
https://virtualenv.pypa.io/en/latest/userguide.html#making-environments-relocatable
As such, we must make the Treeherder runner scripts add the virtualenv
bin to PATH instead of using the activate pattern.
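
The PATH approach above can be sketched as follows. This is an
illustration in Python rather than the runner scripts themselves; the
function name env_with_venv_first and the "bin" subdirectory layout are
assumptions for the example.

```python
import os


def env_with_venv_first(venv_dir, environ=None):
    """Return a copy of the environment with the venv's bin dir first on PATH.

    Hypothetical helper: this mimics the useful part of the activate
    script without relying on the paths hardcoded inside it.
    """
    env = dict(os.environ if environ is None else environ)
    venv_bin = os.path.join(os.path.abspath(venv_dir), "bin")
    existing = env.get("PATH", "")
    # Prepend so the virtualenv's python is found before the system one.
    env["PATH"] = (venv_bin + os.pathsep + existing) if existing else venv_bin
    return env
```

The returned dict can be passed as the env argument when launching a
child process with the subprocess module.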
The existing name "curr_dir" isn't overly clear - does it mean the
current working directory or the directory in which this script exists?
It should also be uppercase. I think the behaviour is clearer when
reworked like this :-)
There's a lot of other cleanup that could be done (eg reducing
duplication), but saving that for bug 1153971.
The 'treeherder-service' repo has been renamed to 'treeherder', ready
for when the treeherder-ui repo is imported into it. This means the
GitHub URL, Travis URL and directory name when cloned all change. The
Read The Docs URL cannot be changed, so for now we will leave it as-is,
and in the future (once the service and UI docs are combined) we will
create a new project on RTD with the name "treeherder".
This updates doc links and puppet/Vagrant configs, but leaves the
stage/prod deploy script alone, since renaming the directories on our
infra is non-trivial. The dev instance will need some TLC since unlike
stage/prod, it does use the puppet scripts in the repo.
The current values were copied from another project's deploy script, but
they are not working - so let's use what the latest New Relic docs say
we should use.
If attempting to fetch the log from the provided URL results in a 404,
mark the parse_status as failed but do not generate an exception. Whilst
missing logs are not ideal, they are not an app error, nor something
that is likely to change over time (vs say an HTTP 500 from
ftp.mozilla.org).
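
The behaviour described above can be sketched as follows. This is a
minimal illustration using Python 3's urllib, not the project's actual
code: the function name fetch_log, the injectable opener argument and
the "parsed" status string are all assumptions for the example.

```python
from urllib.error import HTTPError
from urllib.request import urlopen


def fetch_log(url, opener=urlopen):
    """Return (parse_status, body) for the log at url.

    Hypothetical sketch: a 404 is reported as a failed parse rather than
    raised, since a missing log will not appear on retry. Any other HTTP
    error (eg a 500) still propagates so that it is noticed.
    """
    try:
        response = opener(url)
    except HTTPError as error:
        if error.code == 404:
            return "failed", None
        raise
    with response:
        body = response.read()
    return "parsed", body
```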
Unlike the Pulse publishing, the code for consuming data from Pulse is
unused, being a leftover from initial attempts to ingest buildbot data
via Pulse, rather than builds-{4hr,running,pending}.js.
...into 'classification_mirroring' and 'publish_to_pulse'. This gives
greater visibility into the relative queue sizes of each, allows us to
move one of them to another node, plus means we can pause consumption of
the classification_mirroring task for when the Elasticsearch indices
are migrated in bug 1142538.
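
As a rough sketch of what such a split can look like, assuming a
Celery-style routing table in the dictionary form of the CELERY_ROUTES
setting: only the two queue names come from this message, and the task
names used as keys are placeholders rather than the project's real
task names.

```python
# Hypothetical routing table: each task is sent to its own queue, so
# the queues can be monitored, moved to another node or paused
# independently of one another.
CELERY_ROUTES = {
    # Placeholder task name for mirroring classifications.
    "example-mirror-classification": {"queue": "classification_mirroring"},
    # Placeholder task name for publishing to Pulse.
    "example-publish-to-pulse": {"queue": "publish_to_pulse"},
}
```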
We're submitting to Elasticsearch (used by OrangeFactor), not directly
to OrangeFactor, so "Elasticsearch" is more appropriate. The use of
"Bug" in the name makes it sound like we're submitting a bug, which
we're not, we're submitting a bug comment (or ES doc) which contains a
number of different fields, in response to a classification entry made
by a sheriff iff the classification included a bug number, which is
slightly different, and too nuanced to include in the name.
As such, whilst not perfect, I think this is slightly clearer:
s/OrangeFactorRequest/ElasticsearchDocRequest/
and
s/BugzillaBugRequest/BugzillaCommentRequest/
Use prefork scheduling instead of gevent scheduling, to avoid issues
we've had with gevent - both with zombie tasks & also incompatibilities
with Python 2.7.9.
Now that the job names have been made more consistent by bug 740142, we
can simplify our regex again :-)
This is a direct revert of the last three hunks in:
d7abe14635
...plus appropriate updates to the job names in the tests.
In order to find the job in question, the project name is required in
addition to the job guid. We also now specify the URL for errors, to
save having to look up the job at all in many cases.