Main changes:
- removed the full parameter on the jobs endpoint, since in both cases
(full=true and full=false) the returned data had a similar shape/size,
just a slightly different set of attributes.
- removed the exclusion_state parameter in favour of exclusion_profile.
The latter allows specifying which profile to apply to the jobs; by
default the default profile is used, and it can be disabled using
exclusion_profile=false.
- the data is now returned as a flat list instead of a triply nested
structure. As a result the jobs endpoint is much faster to return data,
and basic operations like filtering, sorting, and pagination are now
easy. It will also allow attribute selection to be implemented with
minimal effort.
- removed the debug parameter in favour of a more explicit return_type
(dict|list) that allows a consumer to specify the type of structure
expected for the results (dict by default); see the sketch after this
list.
- the resultset endpoint no longer returns jobs, only resultsets.
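
To make the new parameters concrete, here is a minimal sketch of
querying the reworked jobs endpoint with requests; the host, project
name, and count parameter are illustrative, while exclusion_profile and
return_type are the parameters described above:

    import requests

    # Hypothetical host and project; the query parameters are the ones
    # introduced by this change.
    url = 'https://treeherder.mozilla.org/api/project/mozilla-inbound/jobs/'

    resp = requests.get(url, params={
        'count': 10,                   # the flat list makes pagination trivial
        'exclusion_profile': 'false',  # disable the default exclusion profile
        'return_type': 'list',         # list-shaped results (dict is the default)
    })
    resp.raise_for_status()
    jobs = resp.json()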
* Separates out the steps required for running the tests from those for
setting up a local instance.
* The 'running ingestion tasks' step now explains what the tasks are,
that the API server must already be running, and how to ingest just a
single revision for testing.
* The log parser compile step is moved inline, so it's harder to forget.
Previously we only got the HTTPError code in the exception, without the
server response that gives more context. The latter would help explain
the 400s in bug 1114785.
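
A minimal sketch of the pattern (not the exact Treeherder code),
assuming requests is used for the HTTP call:

    import requests

    def fetch(url):
        resp = requests.get(url)
        try:
            resp.raise_for_status()
        except requests.HTTPError as e:
            # Include the response body, which explains *why* the server
            # returned e.g. a 400, not just that it did.
            raise requests.HTTPError('%s (response: %s)' % (e, resp.text))
        return resp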
The most recent push from a repository is given the tag 'tip' in
Mercurial. To save having to look up a specific revision to ingest when
testing, 'tip' can now be passed to ingest_push, and the revision that
'tip' currently maps to will be used during push and jobs ingestion.
Previously, jobs ingestion would have used the non-existent SHA of 'tip'
when trying to associate jobs with pushes, resulting in no jobs being
imported from builds-{pending,running,4hr}.
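
A sketch of what resolving 'tip' amounts to, assuming hgweb exposes a
json-rev endpoint (the repository URL and endpoint shape are
assumptions; Treeherder's actual implementation may differ):

    import requests

    def resolve_tip(repo_url):
        # Ask hgweb which 40-char SHA 'tip' currently points at.
        resp = requests.get('%s/json-rev/tip' % repo_url)
        resp.raise_for_status()
        return resp.json()['node']

    revision = resolve_tip('https://hg.mozilla.org/mozilla-central')

With the tag resolved to a real SHA, jobs can be associated with the
push correctly, e.g. when invoking the management command:

    from django.core.management import call_command

    call_command('ingest_push', 'mozilla-central', 'tip')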
Currently parse-log tasks retry after 10 minutes. With this change, the
first retry now happens after 1 minute, and the delay before each
subsequent retry lengthens by a further minute (1, 2, 3, ... minutes).
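
A minimal sketch of the new backoff using Celery's retry countdown (the
task name and body are illustrative, not the actual Treeherder task):

    from celery import shared_task

    @shared_task(bind=True, max_retries=10)
    def parse_log(self, log_url):
        try:
            do_parse(log_url)  # hypothetical parse step
        except Exception as e:
            # 1 minute for the first retry, 2 for the second, and so on.
            timeout = 60 * (self.request.retries + 1)
            raise self.retry(exc=e, countdown=timeout)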
The 'host' property is unrecognised (and ignored) by Datasource, causing
it to silently fall back to the default of 'master_host'. The property
we needed to set is in fact called 'host_type'.
As a result, we were never using the read-only host!
An issue has been filed against Datasource for making this less silent:
https://github.com/jeads/datasource/issues/20
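
A sketch of the corrected datasource definition; the 'host_type' key and
the 'master_host'/'read_host' values come from the messages above, while
the rest of the dict is illustrative:

    DATASOURCE = {
        'project': 'treeherder',
        'contenttype': 'jobs',
        # Wrong: Datasource ignores the unknown 'host' key and silently
        # falls back to 'master_host':
        #   'host': 'read_host',
        # Right:
        'host_type': 'read_host',
    }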
Later commits will fix the typos in model/sql/reference.json, resulting
in Datasource expecting us to provide a 'read_host' key for queries that
can be performed on the read-only DB. However, prior to this change, the
call to Datasource for reference data didn't actually provide this.
On initial repo setup, we were previously setting the read_only_host
field in the Datasource table to the host value for the master, not the
read-only slave. Oops.
Passes the fallback default values as their correct types, rather than
as a string.
Also adds the read-only host to base.py, otherwise later commits would
fail tests when run locally or on Travis.
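
A minimal sketch of the defaults fix, assuming settings read from
environment variables (names are illustrative, not the actual base.py
code):

    import os

    # Wrong: the fallback is quoted, so the setting is always a string.
    #   DATABASE_PORT = os.environ.get('TREEHERDER_DATABASE_PORT', '3306')

    # Right: coerce the value and pass the fallback as its native type.
    DATABASE_PORT = int(os.environ.get('TREEHERDER_DATABASE_PORT', 3306))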
Since we were missing leak failure messages that were prefixed with the
process name, e.g.:
"... | leakcheck | tab process: 42114 bytes leaked (...)"
Using .search() is quicker than using .match() with '.*' prefixed to the
regex - see https://bugzilla.mozilla.org/show_bug.cgi?id=1076770#c1
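
A minimal illustration of the difference (the real leakcheck pattern is
more involved):

    import re

    line = '... | leakcheck | tab process: 42114 bytes leaked (...)'

    # .match() anchors at the start of the string, so without a '.*'
    # prefix it misses lines where the process name precedes the message:
    assert re.match(r'\d+ bytes leaked', line) is None

    # .search() scans the whole line, avoiding the slower '.*' prefix:
    assert re.search(r'\d+ bytes leaked', line)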
Some recently added platforms were added to the ingestion regex and UI
mappings, but not PLATFORM_ORDER - and so are ordered incorrectly in the
API response, and thus in Treeherder's UI.
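
A sketch of why the omission mis-orders things; the structure and
platform names are illustrative, not Treeherder's actual list:

    PLATFORM_ORDER = {
        'linux32': 0,
        'linux64': 1,
        'osx-10-10': 2,    # newly added platforms must be listed here too,
        'windows8-64': 3,  # not just in the ingestion regex and UI mappings
    }

    platforms = ['windows8-64', 'osx-10-10', 'linux64']
    # Platforms missing from PLATFORM_ORDER all get the same fallback key,
    # so they end up grouped arbitrarily rather than where they belong:
    platforms.sort(key=lambda name: PLATFORM_ORDER.get(name, 100))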
Previously, classifying failures with a bug number would add a comment
to the bug that listed the datetime of the classification. However, this
was redundant, since it was virtually the same as the datetime of the
bug comment itself. Instead, it's more useful to list the job start
time. Only classifications for completed jobs are submitted to
Bugzilla, so we do not need to add handling for pending jobs, which do
not have a start time.
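
A sketch of the comment body change (the function and field names are
hypothetical):

    def build_bug_comment(job):
        # List the job start time, not the classification time, which
        # would merely duplicate the bug comment's own timestamp.
        return 'Job started: %s' % job.start_time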