* remove StepParser and switch to ErrorParser
* remove writes to TextLogStep from artifact.py
* remove buildbot ref in builders
* replace TextLogStep model in DetailsPanel, SimilarJobsTab and logviewer App
* clean up DetailsPanel
* remove old log parsing tests and update others
* add logging to error_summary.py
* add a limit on the maximum number of error lines parsed by ErrorParser
* fix in similar jobs tab for Bug 1652869
* remove StepParser and stop storing steps in TextLogStep
* remove more buildbot references
* replace TextLogStep model in DetailsPanel and logviewer App
* remove old log parsing tests and update other tests
* add logging to error_summary.py
* remove StepParser and stop storing steps
* replace TextLogStep model in the UI with text-log-errors API
* remove old references to buildbot
* update and clean up tests
Occasionally, build/test runs fail in a way that produces a significant
amount of log spam, and therefore log files that are hundreds of MB in
size each. This can cause log parsing backlogs, particularly when many
jobs on the same push fail in this way.
The log parser now checks the `Content-Length` of log files prior to
streaming them, and skips the download/parse if it exceeds the set
threshold. The frontend has been adjusted to display an appropriate
message explaining why the parsed log is not available.
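A minimal sketch of the size check described above, not the actual parser
code. The function name `should_skip_log`, the constant
`MAX_DOWNLOAD_SIZE_BYTES`, and the reading of 5MB as 5 * 1024 * 1024 bytes
are assumptions made for illustration. Headers are passed as a plain
mapping so the logic can be exercised without a network request.

```python
from typing import Mapping, Optional

# Assumed threshold: 5MB, interpreted here as 5 * 1024 * 1024 bytes.
MAX_DOWNLOAD_SIZE_BYTES = 5 * 1024 * 1024


def should_skip_log(headers: Mapping[str, str],
                    limit: int = MAX_DOWNLOAD_SIZE_BYTES) -> bool:
    """Return True if the reported (compressed) log size exceeds the limit."""
    raw_length: Optional[str] = headers.get("Content-Length")
    if raw_length is None:
        # No size reported: this sketch falls through to the normal parse.
        return False
    try:
        return int(raw_length) > limit
    except ValueError:
        # Malformed header: treat it the same as a missing one.
        return False


print(should_skip_log({"Content-Length": str(300 * 1024 * 1024)}))
print(should_skip_log({"Content-Length": "2048"}))
```

With requests, the same function could presumably be given `response.headers`
from a `requests.get(url, stream=True)` call before the body is read, since
that mapping is case-insensitive; a plain dict, as used here, is not.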
The threshold has been set to 5MB, since:
* the 99th percentile of download size on New Relic was ~2.8MB:
https://insights.newrelic.com/accounts/677903/dashboards/339080
* `Content-Length` is the size of the log prior to decompression, and
the chronic logspam cases have been known to have compression ratios
of 20-50x, which would translate to an uncompressed size limit of
up to 250MB (which is already much larger than buildbot's former 50MB
uncompressed size limit).
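The arithmetic behind the "up to 250MB" figure can be checked with a short
sketch. The helper name `uncompressed_bounds_mb` is hypothetical; the
threshold and the 20-50x compression ratios are the values quoted above.

```python
THRESHOLD_MB = 5               # compressed size limit from the text
COMPRESSION_RATIOS = (20, 50)  # ratios quoted for the chronic logspam cases


def uncompressed_bounds_mb(threshold_mb, ratios):
    """Return the uncompressed size (in MB) implied by each compression ratio."""
    return tuple(threshold_mb * ratio for ratio in ratios)


low, high = uncompressed_bounds_mb(THRESHOLD_MB, COMPRESSION_RATIOS)
print(low, high)  # 100 250
```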
Now that we're using requests, the log URL being used to access the file
is consistent across environments (since it doesn't reference the local
directory structure), so we don't need to exclude it from comparisons.
Switching to requests also fixes incorrect line numbers in logs that have
Windows line endings (bug 1328880), which is why the expected logview
output had to be updated.
The tests have to be adjusted to use responses, since requests doesn't
support `file://` URLs.
The MPL 2.0 terms state that as long as a LICENSE file is present, the
per-file header text is not required. See "Exhibit A" at the end of:
https://www.mozilla.org/MPL/2.0/
Bug 1060339 made check_errors always true, since we want to parse all
logs, not just those for failing jobs. As such, we have no use for
check_errors.
Created using |isort -p tests -rc .| and a couple of manual tweaks.
The order is:
* futures
* std library
* third party packages
* local imports
* relative local imports
...with each group ordered with "import x" before "from x import y", and
then alphabetically.
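The ordering above can be approximated with a small sort key. This is a
simplified sketch, not isort's actual algorithm: the caller supplies the
group for each line (isort works this out itself), and the group labels,
the function name `sort_imports`, and the `myproject` module are
hypothetical.

```python
# Group order as listed above; the labels themselves are illustrative.
GROUP_ORDER = ["future", "stdlib", "thirdparty", "local", "relative"]


def sort_imports(tagged_lines):
    """Sort (group, import_line) pairs by group, then "import x" before
    "from x import y", then alphabetically by module name."""
    def key(item):
        group, line = item
        is_from = line.startswith("from ")
        module = line.split()[1]  # second token in both import forms
        return (GROUP_ORDER.index(group), is_from, module.lower())

    return [line for _, line in sorted(tagged_lines, key=key)]


example = [
    ("local", "from myproject.models import Job"),
    ("stdlib", "import json"),
    ("thirdparty", "import requests"),
    ("future", "from __future__ import unicode_literals"),
    ("stdlib", "from collections import defaultdict"),
    ("relative", "from .utils import helper"),
]
for line in sort_imports(example):
    print(line)
```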
To prevent logs with an excessive number of lines that match the error
regex from taking up too much space in the DB, and from making the API
response (and thus the UI) unwieldy, we cap the number of error lines at
100 per step of the job. The 'errors_truncated' property can be used by
the UI to indicate that the error lines are only a subset of the total
failures.
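A minimal sketch of the per-step cap, assuming a caller-supplied predicate
in place of the real error regex. Only the limit of 100 and the
'errors_truncated' property come from the description above; the function
name `collect_step_errors` and the `linenumber`/`line` keys are
hypothetical.

```python
MAX_ERROR_LINES = 100  # cap per step, as stated above


def collect_step_errors(lines, is_error, limit=MAX_ERROR_LINES):
    """Collect at most `limit` matching lines and flag whether more existed."""
    errors = []
    truncated = False
    for lineno, line in enumerate(lines):
        if not is_error(line):
            continue
        if len(errors) >= limit:
            # A further match exists beyond the cap: record that and stop.
            truncated = True
            break
        errors.append({"linenumber": lineno, "line": line})
    return {"errors": errors, "errors_truncated": truncated}


spammy_step = ["ERROR: failure %d" % i for i in range(150)]
result = collect_step_errors(spammy_step, lambda l: l.startswith("ERROR"))
print(len(result["errors"]), result["errors_truncated"])  # 100 True
```

In this sketch the flag is only set when a matching line beyond the cap is
actually seen, so a step with exactly 100 error lines is not reported as
truncated.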
Generated using:
  autopep8 --in-place --recursive .
Before:
  $ pep8 | wc -l
  1686
After:
  $ pep8 | wc -l
  57
A later autopep8 run will be performed using --aggressive, which makes
non-whitespace changes too.
In an ideal world, we would never have a successful job that contained
errors which our log parser regex matched against. Unfortunately this is
not the case in practice, since:
1) There is sometimes spammy log output that isn't intended to make the
job fail, but which false-positive matches against our log regex.
2) Sometimes the test/harness/buildbot/... is broken and the log error
is real, and it should have caused the run to fail.
In the case of #1, hiding these errors in successful runs means that
when someone looks at the error summary for a failed run, they don't
realise that a proportion of the failures they can see are actually
present in _every_ job, not just the failed ones.
In the case of #2, hiding the errors means we don't realise <foo> is
broken and needs fixing.
As such, we now parse for error lines in all jobs, including successful
ones.