...when referring to the datetime that a classification was made. This
avoids confusion in ElasticsearchDocRequest, since previously we had
two similarly named variables: 'job_data["submit_timestamp"]' and
'self.submit_timestamp', the former referring to the time the job was
scheduled, the latter to the time the classification was submitted.
Since it's unused, and hg.mozilla.org now has this information available
via its API.
Note: This commit depends on bug 1178719, to prevent issues during
deployment. Also, due to https://code.djangoproject.com/ticket/25036 a
migrate will need to be run interactively after deployment, to clean up
the old repositoryversion content type.
Since otherwise we may end up with interactive prompts.
Note: When using call_command() we instead have to use 'interactive'
instead of 'noinput' due to https://code.djangoproject.com/ticket/22985,
which is only fixed in Django 1.8+.
The datasource table has a 'dataset' field, to allow for multiple
datasources of the same type (for partitioning; eg the "1" in
`mozilla-central_jobs_1`). However we have never used it, so let's just
remove it.
Previously, if two celery jobs were updating the same series, one would
overwrite the other, because the locking code did not actually work (it
always unconditionally took a new lock, without checking whether anything
else was already holding it). This fixes that.
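A minimal sketch of the corrected behaviour (class name and the in-memory store are illustrative; the real implementation presumably uses a shared cache):

```python
import time
import uuid

class SeriesLock:
    """Acquire a per-series lock only if nobody else currently holds it,
    instead of unconditionally overwriting any existing lock."""

    _store = {}  # stands in for a shared cache such as memcached

    def __init__(self, key, timeout=60):
        self.key = key
        self.timeout = timeout
        self.token = uuid.uuid4().hex  # identifies this holder

    def acquire(self):
        held = self._store.get(self.key)
        if held is not None and held[1] > time.time():
            return False  # another job is updating this series; back off
        self._store[self.key] = (self.token, time.time() + self.timeout)
        return True

    def release(self):
        held = self._store.get(self.key)
        if held is not None and held[0] == self.token:
            del self._store[self.key]
```

With a real cache backend one would use an atomic add() to avoid the check-then-set race; the sketch only illustrates the check that was previously missing.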
* Using machine history is now optional (perfherder doesn't track it, and
we don't think we need it)
* Performance datums and analyze_t now take keyword arguments, to make
the API more intuitive
* Various other minor updates to make code easier to understand
* Create a proper setup.py so it can eventually be distributed on pypi
(and installed locally meanwhile)
* Make relevant tests run along with the rest of treeherder-service's tests
* Remove dashboard, old graphserver business logic
The fixtures path changed from:
webapp/test/mock/*
...to:
tests/ui/mock/*
However basePath in the Karma config was also changed from webapp/
to the repo root.
This reverts commit e71e781565.
This commit caused pending and running jobs to be put into the objectstore.
This causes their completed versions not to be ingested.
This introduces two new ways to generate ``Bug suggestions`` artifacts from
a ``text_log_summary`` artifact:
1. POST a ``text_log_summary`` to the ``/artifact`` endpoint
2. POST a ``text_log_summary`` with a job to the ``/jobs`` endpoint
Both of these cases will schedule an asynchronous task to generate the
``Bug suggestions`` artifact with ``celery``.
Artifact generation scenarios:
JobCollections
^^^^^^^^^^^^^^
Via the ``/jobs`` endpoint:
1. Submit a Log URL with no ``parse_status`` or ``parse_status`` set to "pending"
* This will generate ``text_log_summary`` and ``Bug suggestions`` artifacts
* Current *Buildbot* workflow
2. Submit a Log URL with ``parse_status`` set to "parsed" and a ``text_log_summary`` artifact
* Will generate a ``Bug suggestions`` artifact only
* Desired future state of *Task Cluster*
3. Submit a Log URL with ``parse_status`` of "parsed", with ``text_log_summary`` and ``Bug suggestions`` artifacts
* Will generate nothing
ArtifactCollections
^^^^^^^^^^^^^^^^^^^
Via the ``/artifact`` endpoint:
1. Submit a ``text_log_summary`` artifact
* Will generate a ``Bug suggestions`` artifact if it does not already exist for that job.
2. Submit ``text_log_summary`` and ``Bug suggestions`` artifacts
* Will generate nothing
* This is *Treeherder's* current internal log parser workflow
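As an illustration of ArtifactCollection scenario 1, a client submitting an already-parsed log might build its artifact payload along these lines (the exact field names and schema here are assumptions, not the definitive API contract):

```python
import json

def build_text_log_summary_artifact(job_guid, step_data):
    """Build a hypothetical ``text_log_summary`` artifact list for
    POSTing to the ``/artifact`` endpoint; submitting it without an
    accompanying ``Bug suggestions`` artifact would trigger the async
    celery task that generates the suggestions."""
    return [{
        "job_guid": job_guid,
        "type": "json",
        "name": "text_log_summary",
        "blob": json.dumps({"step_data": step_data}),
    }]
```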
As part of merging the UI repo into this one, the following directory
moves were performed:
webapp/app/ -> ui/
webapp/test/ -> tests/ui/
webapp/config/ -> tests/ui/config/
webapp/scripts/ -> tests/ui/scripts/
webapp/scripts/web-server.js -> web-server.js
* Create a generic TreeherderClient class
* Add a single method called `post_collection`, which takes care of all
the details of validation, submission and raising errors
* Also add a new update_parse_status method, for updating status (replaces
manual calls to post information on raw TreeherderRequest)
The directory is empty apart from a .gitkeep, since it only exists to
house jstd.log, which is output by watchr.rb. I was going to move the
directory around in bug 1056877, but let's just delete it and move the
log file one directory higher up.
We were previously calling them OS X 10.8 for aesthetics in TBPL,
however it can cause confusion with developers. In addition, in
Treeherder the platform is not just used in the UI, but for downstream
analysis, so using the incorrect platform has more severe consequences.
This supports ingesting job ``log_references`` that have a
``parse_status`` value. This is so that external tools can submit jobs
that don’t require our internal log parsing. They will then submit
their own log summary artifact.
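A log reference carrying the new field might look roughly like this (a hypothetical payload fragment; only ``parse_status`` is the new part):

```python
def build_log_reference(url, name="buildbot_text", parse_status="parsed"):
    """Hypothetical helper: a log reference whose parse_status of
    'parsed' tells Treeherder to skip its internal log parsing, because
    the submitter will provide its own log summary artifact."""
    return {"url": url, "name": name, "parse_status": parse_status}
```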
This commit changes all references to 'builds-4h' to 'buildbot_text'.
The changed files, along with the number of occurrences changed in each:
1. tests/etl/test_buildapi.py: 3 occurrences
2. tests/sample_data/job_data.txt: 304 occurrences
3. treeherder/etl/buildapi.py: 1 occurrence
4. treeherder/model/sample_data/job_data.json.sample: 2 occurrences
In treeherder/webapp/api/logslice.py, a conditional was removed; the
TODO above it said to remove it once this bug was addressed.
To make sure all tests run properly, three files were renamed. Only the portion
of the filename that said 'builds-4h' was changed to say 'buildbot_text'.
The 'treeherder-service' repo has been renamed to 'treeherder', ready
for when the treeherder-ui repo is imported into it. This means the
Github URL, Travis URL and directory name when cloned all change. The
Read The Docs URL cannot be changed, so for now we will leave it as-is;
in the future (once the service and UI docs are combined) we will create
a new project on RTD named "treeherder".
This updates doc links and puppet/Vagrant configs, but leaves the
stage/prod deploy script alone, since renaming the directories on our
infra is non-trivial. The dev instance will need some TLC since unlike
stage/prod, it does use the puppet scripts in the repo.
Unlike the Pulse publishing, the code for consuming data from Pulse is
unused, being a leftover from initial attempts to ingest buildbot data
via pulse, rather than builds-{4hr,running,pending}.js
We're submitting to Elasticsearch (used by OrangeFactor), not directly
to OrangeFactor, so "Elasticsearch" is more appropriate. The use of
"Bug" in the name makes it sound like we're submitting a bug, which
we're not: we're submitting a bug comment (or ES doc) containing a
number of different fields, in response to a classification entry made
by a sheriff, and only when the classification included a bug number.
That distinction is too nuanced to include in the name.
As such, whilst not perfect, I think these are slightly clearer:
s/OrangeFactorRequest/ElasticsearchDocRequest/
and
s/BugzillaBugRequest/BugzillaCommentRequest/
Now that the job names have been made more consistent by bug 740142, we
can simplify our regex again :-)
This is a direct revert of the last three hunks in:
d7abe14635
...plus appropriate updates to the job names in the tests.
For debugging & also for when filing new intermittent failure bugs, it
is useful to see which search terms were extracted from a log failure
line, and used to query the bugscache for bug suggestions. In the future
this could be used by an intermittent bug filer to verify the bug
summary contained the term extracted for failures of that type.
* Use a text column instead of varchar for storing the series property
(since the subtest signatures can be quite long). Also stop indexing it.
* Update the query for getting series signatures to not use a collation
(which also has a size limit by default)
The treeherder client is in the vendor directory, however that doesn't
get added to the sys.path until settings/base.py is loaded, so defer the
import until we need it.
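The pattern is simply to move the import into the function that needs it; sketched generically here with a stdlib module standing in for the vendored treeherder client:

```python
import sys

def get_vendored_module(name):
    """Resolve a module at call time rather than at module-load time, so
    the import only happens after settings/base.py has extended sys.path
    with the vendor directory (a stdlib module is used in the test below
    purely for illustration)."""
    __import__(name)
    return sys.modules[name]
```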
Currently if a TinderboxPrint line contains a space in the link title,
eg in 'hazard results' here:
TinderboxPrint: hazard results: https://ftp-ssl.mozilla.org/...
...then we ingest it as content_type 'raw_html' rather than 'link'.
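For illustration, a pattern that keeps a multi-word title attached to its link might look like this (the regex is a sketch, not Treeherder's actual parser):

```python
import re

# The non-greedy title group captures everything before the ': ' that
# immediately precedes the URL, so spaces in the title are preserved.
TINDERBOX_LINK = re.compile(
    r"TinderboxPrint: (?P<title>.+?): (?P<url>https?://\S+)")

def parse_tinderbox_print(line):
    """Return a 'link' artifact dict for a TinderboxPrint line, or None
    so the caller can fall back to content_type 'raw_html'."""
    m = TINDERBOX_LINK.match(line)
    if m is None:
        return None
    return {"content_type": "link",
            "title": m.group("title"),
            "url": m.group("url")}
```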
test_resultset_api.py:294:5: F841 local variable 'email' is assigned to but never used
test_resultset_api.py:305:5: F841 local variable 'resp' is assigned to but never used
Fixes:
tests/model/derived/test_jobs_model.py:343:23: E712 comparison to False should be 'if cond is False:' or 'if not cond:'
treeherder/__init__.py:11:1: E731 do not assign a lambda expression, use a def
treeherder/etl/buildapi.py:107:16: E713 test for membership should be 'not in'
treeherder/log_parser/utils.py:183:33: W503 line break before binary operator
treeherder/model/derived/base.py:73:12: E713 test for membership should be 'not in'
treeherder/model/derived/base.py:82:12: E713 test for membership should be 'not in'
treeherder/model/derived/jobs.py:1998:26: W503 line break before binary operator
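For reference, the shapes of two of the fixes (illustrative snippets, not the actual Treeherder code):

```python
# E712: 'if cond == False:' becomes 'if cond is False:' (or 'if not cond:')
def job_failed(result):
    return result is False

# E713: 'if not key in mapping:' becomes 'if key not in mapping:'
def signature_missing(signature, cached):
    return signature not in cached
```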
Generated using:
autopep8 --in-place --recursive --aggressive --aggressive
--max-line-length 999 --exclude='.git,__pycache__,.vagrant,build,vendor,
0001_initial.py,models.py,test_note_api.py,test_bug_job_map_api.py' .
autopep8's aggressive mode, unlike standard mode, makes non-whitespace
changes. It also uses lib2to3 to correct deprecated code (W690), some of
which aren't pep8 failures. Some of these changes are more dubious, but
rather than disable W690 completely, I've just excluded the files where
the unwanted changes would have been made, so we can benefit from the
rest.
Tests that are not aimed at the jobs API should not be dependent on the
order of jobs returned by get_job_list().
* test_tbpl.py does not even need to use get_job_list() since the only
accessed property is the job_id, which we are better off hard-coding.
* test_note_api.py should use the job_id found earlier in the test,
rather than hard-coding a wrong value.
* In test_bug_job_map_api.py, there is no ORDER BY clause for the stored
get_bug_job_map_list query. The current test only happens to pass
since the bug_job_map table currently uses the InnoDB engine, which
defaults to the order of the primary key. Were our test environment and
production bug_job_map tables to use different engines, the behaviour
would silently change, so it seems wrong for the test to give the
illusion of a guaranteed order. If in the future we wanted to give
such a guarantee, we should add an ORDER BY to the
get_bug_job_map_list query & update the test accordingly.
Bug 1097090 combined get_job_list and get_job_list_full, but the two
queries were actually subtly different. The former had an ORDER BY
push_timestamp, which was lost when they were combined. This means jobs
displayed in the similar jobs panel are from the past, and not the most
recent jobs of the same type.
The get_job_list query also sorted on platform, however I don't believe
this is necessary, so I've not added it back in here.
The sample config now points at the production service API, to make the
first-run experience easier for new contributors. However, the tests use
the sample config as part of the test run, so this change breaks them.
Since the sample config is now optional, we can simply not use it at all
during the tests, which fixes the failures.
To avoid logs with an excessive number of lines matching the error regex
from taking up too much space in the DB, and from making the API response
(and thus the UI) unwieldy, we cap the number of error lines at 100 per step
of the job. The 'errors_truncated' property can be used by the UI to
indicate that the error lines are only a subset of the total failures.
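The capping logic amounts to something like this (a sketch; the real parser's data structures will differ):

```python
MAX_ERRORS = 100  # per step of the job, as described above

def cap_step_errors(step):
    """Truncate a step's error lines at MAX_ERRORS and flag the step so
    the UI can show that only a subset of the failures is listed."""
    errors = step.get("errors", [])
    if len(errors) > MAX_ERRORS:
        step["errors"] = errors[:MAX_ERRORS]
        step["errors_truncated"] = True
    return step
```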
Generated using:
autopep8 --in-place --recursive .
Before:
$ pep8 | wc -l
1686
After:
$ pep8 | wc -l
57
A later autopep8 run will be performed using --aggressive, which makes
non-whitespace changes too.
Bugzilla comments can be no more than 65535 characters in length. To
both avoid hitting this limit (and the comment being rejected) and to
reduce the spam left on bugs, truncate the comment body at 40000
characters.
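A minimal sketch of the truncation (the 40000 figure is from the text above; the helper name is illustrative):

```python
COMMENT_CHAR_LIMIT = 40000  # comfortably below Bugzilla's 65535-char cap

def truncate_comment(body):
    """Cut the comment body at 40000 characters, so it is never rejected
    by Bugzilla and long failure logs don't spam the bug."""
    return body[:COMMENT_CHAR_LIMIT]
```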
This is to cover those cases where we have more than 2000 jobs, either on
a single push or in a periodic update; it also syncs the number of jobs
requested with the limit imposed by the service (again, 2000).
We did not previously pass the username of the person who made a failure
classification to BugzillaBugRequest. As a result, BugzillaBugRequest
has to fetch the bug mapping again to find it out. After the fix for our
master-read DB setup, this fetch of the bug mapping was being performed
against the read host, which occasionally did not yet have the requested
mapping (due to replication delay) causing an ObjectNotFoundException.
The 'get_bug_job_map_detail' query could have been switched to use the
master host to avoid the race, however the more efficient fix is just to
pass the username to BugzillaBugRequest directly, like we already do for
OrangeFactorBugRequest, avoiding the query entirely.
Main changes:
- removed the full parameter on the jobs endpoint, since in both cases the data returned had a similar shape/size but a slightly different set of attributes.
- removed the exclusion_state parameter in favour of exclusion_profile. The latter allows specifying which profile to apply to the jobs; by default the default profile is used, and it can be disabled with exclusion_profile=false.
- the data is now returned as a flat list instead of a triply nested structure. As a result, the jobs endpoint returns data much faster, and basic operations like filtering, sorting and pagination become easy. It will also allow implementing attribute selection with minimal effort.
- removed the debug parameter in favour of a more explicit return_type (dict|list), which lets a consumer specify the type of structure expected for the results (dict by default).
- the resultset endpoint no longer returns jobs, only resultsets.
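For example, a consumer of the reworked API might build a request like this (the endpoint path is an assumption; the parameter names come from the notes above):

```python
from urllib.parse import urlencode

def jobs_url(project, **params):
    """Build a hypothetical jobs-endpoint URL, e.g. disabling exclusion
    profiles and asking for a list-shaped response."""
    base = "/api/project/%s/jobs/" % project
    if not params:
        return base
    # sort for a deterministic query string
    return base + "?" + urlencode(sorted(params.items()))
```

e.g. `jobs_url("mozilla-central", exclusion_profile="false", return_type="list")`.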
The 'host' property is unrecognised (and ignored) by Datasource, causing
it to silently fall back to the default of 'master_host'. The property
we needed to have set is in fact called 'host_type'.
As a result, we never use the read-only host!
An issue has been filed against Datasource for making this less silent:
https://github.com/jeads/datasource/issues/20
Since we were missing leak failure messages that were prefixed with the
process name, eg:
"... | leakcheck | tab process: 42114 bytes leaked (...)"
Using .search() is quicker than using .match() with '.*' prefixed to the
regex - see https://bugzilla.mozilla.org/show_bug.cgi?id=1076770#c1
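A small demonstration of the equivalence (the leak line is illustrative):

```python
import re

line = "12:34:56 INFO - leakcheck | tab process: 42114 bytes leaked"

# .match() anchors at the start of the string, so finding a pattern
# mid-line requires prefixing '.*'; .search() scans for the pattern
# directly, skipping the wildcard work on every line.
assert re.match(r".*bytes leaked", line) is not None
assert re.search(r"bytes leaked", line) is not None

# On a non-matching line, both agree as well:
assert re.search(r"bytes leaked", "all clear") is None
```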
Previously classifying failures with a bug number would add a comment to
the bug that listed the datetime of classification. However this is
redundant, since it's virtually the same as the datetime of the bug
comment itself. Instead, it's more useful to list the job start time.
Only classifications for completed jobs are submitted to Bugzilla, so we
do not need to add handling for pending jobs, that do not have a start
time.
Mappings of bug IDs to failures are currently mirrored to ElasticSearch
for use by OrangeFactor, until OrangeFactor is rewritten to use
Treeherder's DB directly. This patch makes Treeherder submit these
mappings directly to the ElasticSearch instance, rather than doing so
via TBPL's starcomment.php - so that TBPL can be switched off. For
TBPL's implementation, see:
https://hg.mozilla.org/webtools/tbpl/file/eb654a2734c3/php/starcomment.php
Added several parameters to the cycle_data shell command: cycle-interval
(in days), chunk-size (in number of result sets) and sleep-time (in
seconds).
I made the cycle_data task a very thin wrapper around the shell command;
there is no more logic in it.
All the queries for data cycling are executed with retry logic, to
handle DB deadlocks.
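The retry wrapper might look something like this (the exception type and parameter names are placeholders, not the actual implementation):

```python
import time

class DeadlockError(Exception):
    """Stand-in for the DB driver's deadlock exception."""

def run_with_retry(query, retries=3, sleep_time=0.0):
    """Execute a data-cycling query, retrying on deadlock and re-raising
    once the retry budget is exhausted."""
    for attempt in range(retries):
        try:
            return query()
        except DeadlockError:
            if attempt == retries - 1:
                raise
            time.sleep(sleep_time)  # back off before retrying
```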
We divide bug suggestions for a search term into 'open_recent' and
'all_others'. The former is supposed to be group #1 below, and the
latter groups 2-4, with the summation of the two groups corresponding
to every bug whose summary matches the search term.
1) Open + recently modified
2) Open + not recently modified
3) Resolved + recently modified
4) Resolved + not recently modified
However prior to this patch group #3 was not being returned at all, when
it should have been included in all_others.
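The corrected grouping can be sketched as follows (the 90-day "recent" window is an assumption; Treeherder's actual threshold may differ):

```python
from datetime import datetime, timedelta

RECENT_WINDOW = timedelta(days=90)  # illustrative threshold

def split_suggestions(bugs, now):
    """Split the bugs matching a search term into 'open_recent'
    (group 1) and 'all_others' (groups 2-4); together the two lists
    cover every matching bug, including resolved-but-recent (group 3)."""
    open_recent, all_others = [], []
    for bug in bugs:
        if bug["is_open"] and now - bug["last_change"] <= RECENT_WINDOW:
            open_recent.append(bug)
        else:
            all_others.append(bug)
    return open_recent, all_others
```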
Test that we are treating the search term literally in the LIKE
statement, and so have correctly escaped any underscores, percent signs
or escape symbols.
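The escaping being tested looks roughly like this (the escape character chosen here is an assumption; Treeherder's may differ):

```python
LIKE_ESCAPE = "="  # assumed ESCAPE character for the LIKE clause

def escape_like(term):
    """Escape the LIKE wildcards '%' and '_' (and the escape character
    itself, which must be doubled first) so the search term is matched
    literally."""
    return (term.replace(LIKE_ESCAPE, LIKE_ESCAPE * 2)
                .replace("%", LIKE_ESCAPE + "%")
                .replace("_", LIKE_ESCAPE + "_"))
```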
The regex for the windows jobs was pre-emptively added by bug 1016448,
but the job names changed slightly since then. Also adds missing tests
for Mulet platforms.