Instead, generate the data when required. We will cache the generated result
in memcache for a day, to keep things responsive for the sheriffs
when classifying recent failures.
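A minimal sketch of that caching, assuming Django's cache framework backed by memcache and a hypothetical `generate_bug_suggestions()` helper:

```python
from django.core.cache import cache

BUG_SUGGESTION_TTL = 60 * 60 * 24  # one day, per the text above

def get_bug_suggestions(job_id):
    cache_key = 'bug-suggestions-{}'.format(job_id)
    suggestions = cache.get(cache_key)
    if suggestions is None:
        # generate_bug_suggestions() stands in for the real generation code
        suggestions = generate_bug_suggestions(job_id)
        cache.set(cache_key, suggestions, BUG_SUGGESTION_TTL)
    return suggestions
```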
This changes ingestion, the API endpoints, and the frontend to match
the new structure. For now we continue to store text_log_summary artifacts,
though they are no longer used for anything.
Bug 1280910 stopped the bug suggestions generation during log parsing from
hitting the bugscache Treeherder API, so we no longer need to mock
requests to it.
Any future uses of `fetch_json()` during tests would be better mocked
using the responses library instead.
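For reference, a hypothetical example of mocking `fetch_json()`'s underlying HTTP call with the responses library (URL, payload, and import path are illustrative):

```python
import responses

from treeherder.etl.common import fetch_json  # assumed import path

@responses.activate
def test_fetch_json():
    url = 'https://treeherder.mozilla.org/api/repository/'
    responses.add(responses.GET, url,
                  json=[{'name': 'mozilla-central'}], status=200)
    assert fetch_json(url)[0]['name'] == 'mozilla-central'
```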
Since Hawk credentials can now be created without an owner, there's no
need for this fixture to create a user to associate with the test
Hawk credentials; none of the tests using the `client_credentials`
fixture actually use the created user.
The `transactional_db` fixture is now required since it's no longer
inherited from the `test_user` fixture.
Also switch back to `.create()` rather than `.get_or_create()` now that
bug 1133273 is fixed.
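A sketch of what the slimmed-down fixture might look like (model path and field names are assumptions):

```python
import pytest

@pytest.fixture
def client_credentials(transactional_db):
    from treeherder.credentials.models import Credentials  # assumed path
    # No owner is needed any more, and with bug 1133273 fixed we can use
    # .create() instead of .get_or_create().
    return Credentials.objects.create(client_id='test-credentials',
                                      authorized=True)
```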
Add support for matching test failures where the test, subtest, status,
and expected status are all exact matches, but the message is not an
exact match. The matching uses ElasticSearch and is initially optimised
for cases where the messages differ only in numeric values, since this is
a relatively common case.
This commit also adds ElasticSearch to the travis environment.
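A rough sketch of the kind of query this implies, using the elasticsearch-py client (index and field names are assumptions):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch()

def similar_failures(test, subtest, status, expected, message):
    # Exact matches on the structured fields; fuzzy ranking on the message,
    # so failures differing only in e.g. numeric values score highly.
    return es.search(index='failure-lines', body={
        'query': {
            'bool': {
                'filter': [
                    {'term': {'test': test}},
                    {'term': {'subtest': subtest}},
                    {'term': {'status': status}},
                    {'term': {'expected': expected}},
                ],
                'must': {'match': {'message': message}},
            }
        }
    })
```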
Since they are virtually identical, and downstream consumers (e.g.
`_get_json()`) have to check for the presence of `project` to
figure out which to call.
The `timeout` parameter to requests.{get,post} only controls the timeout
waiting for a response on the socket, not the total request time. As
such, it doesn't really make sense to specify different timeouts
for different API requests, since the time to connect should be similar
even if one API request is more time-consuming to generate overall.
This commit removes the per-method timeout argument, leaving the
TreeherderClient constructor as the single place to control the timeout.
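Assuming a constructor along these lines, the timeout now lives in one place:

```python
from thclient import TreeherderClient

# a 30s socket timeout applied to every request this client makes
client = TreeherderClient(server_url='https://treeherder.mozilla.org',
                          timeout=30)
resultsets = client.get_resultsets('mozilla-central')  # no timeout= here any more
```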
Previously, if the `client_id` and `secret` were set, the Hawk auth
headers would only be sent for requests using `_post_json()`. Now they
are also sent when using `_get_json()`.
This change means:
1) For users already setting `client_id` & `secret` for GET-based client
functionality, we can more easily tell where requests are coming from
(since transactions in New Relic are annotated with the `client_id`).
2) In the future bug 1203556 can implement rate limiting for GETs too,
with higher thresholds for authenticated users.
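In client terms (constructor parameters as described above), a GET now carries Hawk auth headers whenever credentials were supplied:

```python
from thclient import TreeherderClient

client = TreeherderClient(server_url='https://treeherder.mozilla.org',
                          client_id='my-client',
                          secret='my-secret')
# Previously only _post_json()-backed calls were signed; now this GET is
# signed too, so New Relic can attribute the transaction to 'my-client'.
resultsets = client.get_resultsets('mozilla-central')
```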
Do this by storing the content of each template in the database (currently
we have just one, for Talos) and adding an API endpoint for fetching it.
For now, we will store the data that should be in each template in a
pre-loaded fixture. Bonus new feature: automatic population of a "CC" list.
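The pre-loaded fixture would be a standard Django fixture; something along these lines, with the model name and fields being assumptions:

```json
[
  {
    "model": "perf.performancebugtemplate",
    "pk": 1,
    "fields": {
      "framework": 1,
      "keywords": "talos, regression",
      "text": "Talos has detected a regression ...",
      "cc_list": "release-mgmt@example.com"
    }
  }
]
```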
When setting the bug number for a classified failure, first check for a
classified failure that already has that bug number; if one exists,
update all references to the classified failure being changed to instead
point at the existing classification.
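A sketch of that logic (the `replace_with()` helper is hypothetical):

```python
from treeherder.model.models import ClassifiedFailure  # assumed import path

def set_bug_number(classified_failure, bug_number):
    existing = ClassifiedFailure.objects.filter(
        bug_number=bug_number).exclude(id=classified_failure.id).first()
    if existing:
        # Repoint every reference at the already-existing classification
        # rather than keeping two classifications for the same bug.
        classified_failure.replace_with(existing)  # hypothetical helper
        return existing
    classified_failure.bug_number = bug_number
    classified_failure.save()
    return classified_failure
```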
This will be useful for some views where we don't know off the bat
whether a signature has subtests or not (e.g. the e10s dashboard,
the graphs view), and thus whether to show a "show subtests" button.
New resultsets will still store a value in their ``revision_hash`` field, but it will
just be the same value as their ``long_revision`` field.
This will log an exception in New Relic when a new resultset or job is posted
to the API with only a ``revision_hash`` and not a ``revision`` value.
This also switches to using the longer 40 char revisions alongside the
12 char revisions. We leverage the longer ones for most actions; the
short revisions are stored and used so that people and the UI can
locate a resultset (or set ranges) with short revisions.
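The New Relic logging mentioned above might look roughly like this (the validation helper is hypothetical; with no arguments, `record_exception()` reports the exception currently being handled):

```python
import newrelic.agent

def check_revision_fields(resultset):
    if not resultset.get('revision') and resultset.get('revision_hash'):
        try:
            raise ValueError(
                'resultset submitted with only a revision_hash, no revision')
        except ValueError:
            newrelic.agent.record_exception()
```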
It's almost entirely unnecessary (the few times we need base data
we can generate it ourselves) and can overwrite other test data
if we're not careful. Better just to remove it.
* We needed to allow the repository and last_updated fields to be null
  temporarily while we waited to run a migration, but this is no longer
  necessary
* Also remove the temporary migration script; it's no longer needed
This returns some basic info about the resultsets that were submitted via a POST to the API.
This is needed for TaskCluster as we transition away from ``revision_hash``, so that they
take it from us instead of using what they submit. This way, we can return them the
``revision_hash`` that we WANT to be moving to (PR 1322). They also need this so they can
then send that new ``revision_hash`` to the tasks (jobs) that belong to it.
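The exact payload shape isn't spelled out here, but a response along these lines (field names illustrative) would give TaskCluster what it needs:

```json
{
  "results": [
    {
      "revision": "<40 char long_revision>",
      "revision_hash": "<the revision_hash Treeherder wants used>"
    }
  ]
}
```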
In Python 2 you can get away with using relative paths for imports
without the dot syntax. However, it's best practice to use the dot
syntax when the import path is relative to the current directory rather
than the project root. This also makes the imports Python 3 compatible.
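For example (module names illustrative):

```python
# Python 2 resolved this against the current package first (an implicit
# relative import); Python 3 treats it as absolute, so it fails:
#   import common
#
# The explicit dot syntax below behaves identically on both versions:
from __future__ import absolute_import  # makes Python 2 match Python 3

from . import common             # sibling module
from .common import fetch_json   # name from a sibling module
```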
The `auth` parameter was intended to receive either a `TreeherderAuth`
instance (for OAuth credentials), or a `HawkAuth` instance. The former
no longer exists since OAuth support has been removed, and the latter is
not necessary, since the client now supports simply passing `client_id`
and `secret` to the `TreeherderClient` constructor, as a simpler way of
specifying the Hawk credentials. As such, the `auth` parameter is
superfluous and can be removed.
Since they are deprecated and all submitters have switched over to using
Hawk credentials instead.
The automatically created migrations file was edited to remove the
unused `models` import, since otherwise flake8 complains. We could
alternatively exclude the migrations directory from flake8; however, we
would then miss linter errors in any hand-written migrations files.
In addition, Django has fixed the issue in 1.9:
a7bc00e17b
Specifically, this update creates an API endpoint that allows updating a
"revised" summary id property on each alert. This is intended for the case
where a performance alert was generated against the wrong summary (usually
because of a downstream merge between branches). For now, the API requires the
user to be staff (i.e. a sheriff); otherwise it will return a 403 Forbidden.
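A rough sketch of the permission check, using Django REST Framework (view and field names are assumptions):

```python
from rest_framework import viewsets
from rest_framework.response import Response

class PerformanceAlertViewSet(viewsets.ViewSet):
    def partial_update(self, request, pk=None):
        if not request.user.is_staff:
            return Response("Must be staff to update an alert", status=403)
        # ... look up the alert and store the submitted
        # revised_summary_id here ...
        return Response({'revised_summary_id':
                         request.data.get('revised_summary_id')})
```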
At the moment tests that store jobs can get away with not having the
test project listed in the repository table. However, with later commits,
the new job_duration table will have a foreign key on the repository
table, which will break those tests unless they use the test_repository
fixture, which creates the repository record for them.
pytest-django doesn't set up a test database for every single test, but
only for those tests that actually require a db. Tests that require a db
need to either be marked with `@pytest.mark.django_db` or use a fixture
that has a dependency on `db` or `transactional_db`.
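Concretely, both of these get a test database, while an unmarked test would not:

```python
import pytest

@pytest.mark.django_db
def test_marked_directly():
    pass  # body elided; ORM access is allowed here

def test_via_fixture(transactional_db):
    pass  # the fixture dependency requests the (transactional) test db
```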
Using a non-transactional db would make test execution much faster, but
unfortunately it doesn't play well with the treeherder datasource
creation, so I used `transactional_db`.
pytest-django also allows you to specify a settings file to use for
tests in a pytest.ini file, which is nicer than monkeypatching the
original settings file in the pytest session start function 😃.
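That is, something like this in pytest.ini (the settings module path is illustrative):

```ini
[pytest]
DJANGO_SETTINGS_MODULE = tests.settings
```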
We were previously using the same database (test_treeherder) for both the
jobs and reference data models. I centralized the new db name in the test
settings file. All the tests requiring the jobs db or its repository counterpart
can now access it using the `test_project` fixture, while utility functions use
the mentioned setting directly. Where the project name is hardcoded in a static
file, I just replaced it with the new name `test_treeherder_jobs`.
This guarantees that the jobs database is dropped at the end of each
test. It also makes the jobs database life cycle easier to understand.
In general, to keep the tests as fast as possible, we shouldn't have much
code in the setup and teardown of each test.