Commit graph

63 commits

Author SHA1 Message Date
Ed Morley 52d6017c5b
Bug 1295997 - Skip parsing logs whose compressed size exceeds 5MB (#4700)
Occasionally failing build/test runs can fail in such a way that results
in a significant amount of log spam and therefore log files that are
hundreds of MB in size each. This can cause log parsing backlogs,
particularly when many jobs on the same push fail in such a way.

The log parser now checks the `Content-Length` of log files prior to
streaming them, and skips the download/parse if it exceeds the set
threshold. The frontend has been adjusted to display an appropriate
message explaining why the parsed log is not available.

The threshold has been set to 5MB, since:
* the 99th percentile of download size on New Relic was ~2.8MB:
  https://insights.newrelic.com/accounts/677903/dashboards/339080
* `Content-Length` is the size of the log prior to decompression, and
  the chronic logspam cases have been known to have compression ratios
  of 20-50x, which would translate to an uncompressed size limit of
  up to 250MB (which is already much larger than buildbot's former 50MB
  uncompressed size limit).
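
The size check described in this commit can be sketched roughly as follows. This is only an illustration, not Treeherder's actual implementation: the names `MAX_COMPRESSED_LOG_BYTES` and `should_skip_log` are hypothetical, the headers are passed in as a plain dict so the sketch needs nothing beyond the standard library, and treating a missing or malformed `Content-Length` as "parse anyway" is an assumption.

```python
# Hypothetical constant name; the 5MB value comes from the commit message.
MAX_COMPRESSED_LOG_BYTES = 5 * 1024 * 1024


def should_skip_log(headers, limit=MAX_COMPRESSED_LOG_BYTES):
    """Return True if the (compressed) Content-Length exceeds the limit."""
    value = None
    # HTTP header names are case-insensitive, so compare in lower case.
    for name, header_value in headers.items():
        if name.lower() == "content-length":
            value = header_value
            break
    if value is None:
        # Assumption: a log of unknown size is still downloaded and parsed.
        return False
    try:
        size = int(value)
    except (TypeError, ValueError):
        # Assumption: a malformed header is treated like a missing one.
        return False
    return size > limit


# Worst-case uncompressed size implied by a 50x compression ratio, in MB.
worst_case_uncompressed_mb = 5 * 50
```

A caller would run this check on the response headers before streaming the body, and record a "skipped" status for the frontend to display when it returns True.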
2019-02-25 19:04:38 +00:00
Ed Morley e1ebb72c1d Bug 1526743 - Clean up log parsing test utilities
Previously any exceptions raised whilst loading the expected output JSON
fixtures were suppressed, which made debugging the Python 3 test failures
harder than needs be.

The reason failures were suppressed was to allow the test to continue far
enough that the actual output could be saved to the fixture when creating
new tests. However reordering `do_test()` has the same effect without the
need for the `load_exp()` try-except handling.
2019-02-12 20:00:16 +00:00
Ed Morley 4bdf9f9101 Bug 1526743 - Explicitly .decode() streamed log lines
Since the `requests` package's `iter_lines()` returns bytes by default.

We cannot pass `decode_unicode=True` to `iter_lines()` since it splits on
Unicode newlines (which can exist in test message output), so instead we
manually `.decode()` each line before parsing it.
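
The difference between splitting before and after decoding can be illustrated with the standard library alone. This sketch does not call `iter_lines()` itself; it uses `bytes.splitlines()` and `str.splitlines()` as stand-ins, and the sample data and the `errors="replace"` choice are assumptions made for the example.

```python
# A log chunk in which one test message contains U+2028 (LINE SEPARATOR).
raw = "first\u2028still first\nsecond\n".encode("utf-8")

# Splitting the bytes only breaks on ASCII line endings, so the message
# containing U+2028 stays on a single line. Each line is decoded afterwards.
byte_lines = raw.splitlines()
decoded_lines = [line.decode("utf-8", errors="replace") for line in byte_lines]

# Decoding first and then splitting also breaks on Unicode line boundaries,
# which cuts the first message in two.
text_lines = raw.decode("utf-8").splitlines()

print(len(decoded_lines), len(text_lines))
```

Splitting on bytes and decoding per line keeps line numbers aligned with the raw log, which is the behaviour the commit message is after.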

Fixes the following exception under Python 3:
`TypeError: a bytes-like object is required, not 'str'`

The test utility `load_exp()` had to be modified to no longer use append
mode when opening the expected output JSON files, in order to fix:
`json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)`
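
One plausible way to reproduce this kind of error is sketched below; the commit message does not spell out the mechanism, so this is an assumption. Under Python 3, a file opened with mode `a+` starts with its position at the end of the file, so reading it yields an empty string and `json.load()` fails. The temporary file and variable names are illustrative only.

```python
import json
import os
import tempfile

# Create a small fixture file containing valid JSON.
fd, path = tempfile.mkstemp(suffix=".json")
with os.fdopen(fd, "w") as f:
    json.dump({"status": "ok"}, f)

# Opening in append mode positions the stream at the end of the file,
# so there is nothing left to read and JSON decoding fails.
try:
    with open(path, "a+") as f:
        json.load(f)
    append_mode_error = None
except json.JSONDecodeError as exc:
    append_mode_error = str(exc)

# Opening in the default read mode starts at the beginning and works.
with open(path) as f:
    loaded = json.load(f)

os.remove(path)
```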
2019-02-12 20:00:16 +00:00
Ed Morley ecf4c4e33a
Bug 1349182 - Remove support for submitting jobs via the REST API (#4075)
Now that autophone and AWFY have migrated to Taskcluster, there are
no more submitters of jobs to the REST API (confirmed via New Relic
Insights). As such, this deprecated data ingestion method can now be
removed, along with support for API Hawk auth, API POST throttling
and `treeherder-client` job submission capability.

After this lands we'll need to manually drop the `credentials` table.
2018-09-28 02:17:48 +01:00
Ed Morley a5023192e9
Bug 1453482 - Use d-r-f's APIClient instead of WebTest's TestApp (#3439)
To avoid unnecessary dependencies, and use a more conventional
django-rest-framework testing approach:
http://www.django-rest-framework.org/api-guide/testing/#apiclient

APIClient has a few API differences:
* `.json` -> `.json()`
* `.status_int` -> `.status_code`
* Get parameters are passed as keyword argument `data` not `params`
* The default hostname is `http://testserver` not `http://localhost`
* Additional HTTP headers are passed directly as keyword arguments,
  rather than nested under a `headers` property.
* It doesn't check the status code itself, so explicit checks are
  required, along with removing `expect_errors`.
* The `.post_json()` and `.put_json()` methods don't exist.

See also the docs for the Django test client (which APIClient wraps):
https://docs.djangoproject.com/en/1.11/topics/testing/tools/#the-test-client

Whilst making these changes, I also cleaned up the session fetching
in `test_auth.py` and `test_backends.py`, and added a `status_code`
check in `conftest.py`'s `mock_post_json()` - which makes the root
cause of test failures clearer.
2018-04-12 18:06:53 +01:00
Ed Morley 6eaf49ab43 Bug 1452212 - Remove unused variables
As reported by pylint 'unused-variable'.
2018-04-10 20:10:27 +01:00
Ed Morley b521523113
Bug 1402992 - Remove all usages of coalesced_to_guid (#3219)
Since after bug 1400069 it is no longer used by the UI.

This removes everything but the Job model field (since the table is
large enough that migrations need to be carefully coordinated, and we
can batch up that change with others).
2018-02-13 18:15:50 +00:00
William Lachance 0c81ed8e76 Bug 1328985 - Remove project_specific_id compatibility shim (#3030) 2017-12-21 00:37:07 +00:00
Cameron Dawson a250d1dcb6 Bug 1400069 - Replace term coalesced with superseded in most places
Except where it has to touch the database field, since that will be
vestigial in a later commit and removed in a later PR.
2017-09-29 17:18:42 -07:00
camd e3ed847578 Bug 1215587 - Add job_group field to jobs table (#2601)
This is to allow any job type to be able to belong to any
job group.  This will also mean that if someone accidentally
picked the wrong group for a job type, we don't need to
fix it in the DB for all new jobs.  They can fix their task
definition, and all new jobs will go to the new job group.

This includes a management command to migrate the old
data from job_type.job_group to the new field of job.job_group.

A follow-up PR will remove the old field and set the API to
read from the job.job_group field.
2017-08-23 11:36:30 -07:00
Max Chehab c6e0c26bc8 Bug 1336272 - Refactor changing wording from resultset to push (#2644)
Change of new environment variable `PULSE_PUSH_SOURCES`.

Keep old `publish-resultset-runnable-job-action` task name by creating a 
method that points to `publish_push_runnable_job_action`.
2017-08-04 09:38:57 -07:00
Ed Morley 6717d9f5fc Bug 1165356 - Log parser: Fetch logs using requests instead of urllib2
Doing so also fixes incorrect line numbers in logs that have Windows
line endings (bug 1328880), hence having to update the expected logview
output.

The tests have to be adjusted to use responses, since requests doesn't
support `file://` URLs.
2017-03-22 00:50:02 +00:00
Ed Morley f4b37c026b Bug 1348375 - Stop using a wildcard import in treeherder.client
This import only affects internal treeherder usage, people using the
PyPI package import from the `thclient` subdirectory instead.

Fixes:

treeherder/client/__init__.py:1:1: F401 '.thclient.*' imported but unused
2017-03-20 13:18:20 +00:00
William Lachance ae0be9d81c Bug 1329002 - Remove jobs model 2017-01-06 15:01:19 -05:00
William Lachance f6407b7493 Bug 1318474 - Stop writing to datasource altogether for jobs data
The last things we were using it for were cycling data and assigning
id numbers to jobs, and we can easily replace those with calls to
the ORM.
2017-01-05 16:09:51 -05:00
William Lachance 9f604fa323 Bug 1318474 - Switch APIs and some internal consumers to use Django ORM for job data
Still *ingesting* into datasource for now
2016-11-25 12:54:53 -05:00
William Lachance 1c0c32cec0 Bug 1311185 - Turn off remaining result set ingestion and use 2016-11-11 15:40:02 -05:00
William Lachance 8ccc91ee95 Bug 1311185 - Make APIs return push ids, not result set ids
For now the resultset and job API endpoints still return result_set_id,
for backwards compatibility.
2016-11-09 13:51:00 -05:00
William Lachance 129adcfab3 Bug 1301477 - Disable external facing artifact endpoints and APIs (#1965)
* Can no longer store raw artifacts (anything treeherder doesn't understand
  is ignored)
* Attempting to retrieve an artifact now returns a 405 (not allowed)
2016-11-08 09:49:21 -05:00
Ed Morley a1c84ec768 Bug 1279213 - Make Treeherder use the new client server_url parameter
And use `SITE_URL` rather than requiring the separate environment
variables just for requests to Treeherder's own API. (Thereby reducing
the number of environment variables I have to toggle back and forth for
the Heroku migration).
2016-06-23 09:47:46 +01:00
KWierso 3926d546b5 Bug 1281516 - Correct the spelling of 'unknown' (#1612) r=emorley 2016-06-22 18:08:51 +01:00
Ed Morley b6677fd5c6 Bug 1276254 - Remove unnecessary usage of .read() when json decoding
By using `json.load()` instead of `json.loads()` we can pass the file
object directly, rather than having to `.read()` it first.
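
The equivalence relied on here can be shown in a few lines. An in-memory `io.StringIO` stands in for a real file so that the sketch is self-contained; the sample JSON and the helper name `make_file` are illustrative.

```python
import io
import json


def make_file():
    """Return a fresh file-like object holding a small JSON document."""
    return io.StringIO('{"jobs": [1, 2, 3]}')


# Before: read the whole file into a string, then decode the string.
with make_file() as f:
    via_loads = json.loads(f.read())

# After: hand the file object straight to json.load().
with make_file() as f:
    via_load = json.load(f)

print(via_loads == via_load)
```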
2016-05-27 18:24:38 +01:00
William Lachance 0e9d516f8a Bug 1273231 - Store job log url/name/parse information in central database 2016-05-25 13:58:44 -04:00
William Lachance 92f02ccc25 Bug 1231361 - Remove refdata model dependency from unit tests 2016-04-04 15:10:08 -04:00
Cameron Dawson 2a9dbefa49 Bug 1199364 - Use revision instead of revision_hash for resultsets
New resultsets will still store a value in their ``revision_hash`` field, but it will
just be the same value as their ``long_revision`` field.

This will log an exception in New Relic when a new resultset or job is posted
to the API with only a ``revision_hash`` and not a ``revision`` value.

This also switches to using the longer 40 char revisions alongside the
12 char revisions, with the longer ones used for most actions.  The
short revisions are stored and used so that people and the UI can still
locate a resultset (or set ranges) with short revisions.
2016-03-17 15:48:49 -07:00
William Lachance 161b3635aa Bug 1239185 - Remove use of reference data model in job ingestion 2016-03-14 15:17:43 -04:00
Ed Morley bb56cfc5de Bug 1144721 - Remove dead Python code found using vulture
https://pypi.python.org/pypi/vulture
2016-01-22 18:20:13 +00:00
Ed Morley 17b514d32e Bug 1212937 - Remove support for passing `auth` to the Python client
The `auth` parameter was intended to receive either a `TreeherderAuth`
instance (for OAuth credentials), or a `HawkAuth` instance. The former
no longer exists since OAuth support has been removed, and the latter is
not necessary, since the client now supports simply passing `client_id`
and `secret` to the `TreeherderClient` constructor, as a simpler way of
specifying the Hawk credentials. As such, the `auth` parameter is
superfluous and can be removed.
2015-12-16 18:17:58 +00:00
Mauro Doglio be64542d11 Bug 1209555 - switch etl requests to hawk
I added a create_credentials command to help set up the initial
development environment. The puppet setup now creates a new user and sets
it as the owner of the treeherder-etl credentials.
2015-10-08 13:11:54 +01:00
Ed Morley 027abb7057 Bug 1211253 - Remove the concept of a platform's VM status
It's calculated during ingestion, but not used and in fact not actually
stored anywhere.
2015-10-05 18:49:25 +01:00
Ed Morley 571c57af71 Bug 1192957 - Mass-update python import style using isort
To fix pre-existing deviations from the style that will be enforced on
Travis in the next commit.
2015-10-02 17:55:29 +01:00
Ed Morley ba39c25fa2 Bug 1202626 - Stop storing each revision's commit_timestamp
It's unused in the UI and doesn't add any value, since we're
normally much more interested in the push_timestamp.

This can land without causing errors, since the field is DEFAULT NULL,
so can be dropped at our leisure later.
2015-09-23 10:43:10 +01:00
William Lachance 7440089508 Bug 1198786 - Stop storing performance artifacts
They were never used for anything and take up a lot of space. They
also don't fit into the new performance model we're working on. We
can always bring back the useful bits later.
2015-08-28 10:49:50 -04:00
Ed Morley 90ba77e596 Bug 1192801 - Remove per-file MPL boilerplate since it's unnecessary
The MPL 2.0 terms state that as long as a LICENSE file is present, the
per-file header text is not required. See "Exhibit A" at the end of:
https://www.mozilla.org/MPL/2.0/
2015-08-18 23:32:11 +01:00
Ed Morley 1c00ccfcc7 Bug 1192661 - Clean up Python import order
Created using |isort -p tests -rc .| and a couple of manual tweaks.

The order is:
* futures
* std library
* third party packages
* local imports
* relative local imports
...with each group ordered with "import x" before "from x import y", and
then alphabetically.
2015-08-10 18:33:49 +01:00
Cameron Dawson 00cfe6643d Bug 1140349 - Remove the objectstore code
After the previous commit, the Objectstore is effectively "dead code".
So this commit removes all the dead code after anything left over in
the Objectstore has been drained and added to the DB.
2015-07-21 14:13:21 -07:00
Mauro Doglio b9881f937c Bug 1183575 - Create a requests auth backend for 2-legged oauth 2015-07-20 16:12:33 +02:00
Vaibhav Agrawal 096560caf5 Bug 1178474 - Have a common function for calling various endpoints. 2015-06-30 10:53:04 -07:00
William Lachance dc084310f6 Bug 1163674 - Update treeherder client to be more generic
* Create a generic TreeherderClient class
* Add a single method called `post_collection` which takes care of all
  details of validation, submitting stuff and raising errors
* Also add a new update_parse_status method, for updating status (replaces
  manual calls to post information on raw TreeherderRequest)
2015-05-19 17:32:22 -04:00
William Lachance 0e6e61fbbe Bug 1159831 - Make treeherder use in-tree copy of treeherder-client 2015-05-01 13:34:29 -04:00
Ed Morley 5e0f36ae91 Bug 1133482 - Always ensure files are closed after being open()ed 2015-03-03 17:13:34 +00:00
Ed Morley f5c0b53e0c Bug 1059814 - Whitespace pep8 fixes
Generated using:
autopep8 --in-place --recursive .

Before:
$ pep8 | wc -l
1686

After:
$ pep8 | wc -l
57

A later autopep8 run will be performed using --aggressive, which makes
non-whitespace changes too.
2015-02-15 14:52:31 +00:00
Ed Morley b0d1c0c168 Bug 1059811 - pyflakes: Remove unused imports 2015-01-16 12:33:32 +00:00
Jonathan French dbb4d11e09 Bug 1090689 - Add MPL2.0 headers to the repo 2014-11-03 13:06:03 -05:00
mdoglio 8e0c47068c add log parsing status handling 2014-07-03 15:58:04 +02:00
Jonathan Eads 1c340db7ed fixed tests 2014-06-13 14:05:16 -07:00
Jonathan Eads 4058c245f8 fixed tests 2014-06-13 13:56:49 -07:00
Jonathan Eads 5ee8880971 added explicit disconnect 2014-06-02 13:48:58 -07:00
mdoglio 5cc866fbf2 separate oauth credentials from OAuthLoaderMixin 2014-03-10 19:36:01 +01:00
Jonathan Eads 35d1a21c23 fixed tests 2014-02-03 17:13:13 -08:00