Previously users who had multiple accounts with the same email address
(but different login methods) would receive exceptions when trying to
create/edit performance alerts. Now the unique `username` field is used
from the Django User model, instead of the non-unique `email`.
This method was exclusively used in auto-classification, making it better
suited to live there. As an added benefit we can remove some complexity
because we know all TextLogError instances in auto-classification have
related FailureLines.
Flip the matching algorithm to apply a given list of Matchers to each
TextLogError instead of giving each matcher N errors to find matches
for. This allows us to yield scored ClassifiedFailure IDs to create
in-memory (unsaved!) TextLogErrorMatch instances and avoid passing
around tuples too often.
Note: this means we're also dropping the cut-off when scoring.
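A rough sketch of the flipped loop (function and variable names here are assumptions for illustration, not the actual Treeherder code):

```python
def find_all_matches(errors, matchers):
    """Apply every matcher to each TextLogError, yielding scored matches.

    Yields (text_log_error, classified_failure_id, score) tuples, which the
    caller can turn into in-memory (unsaved) TextLogErrorMatch instances.
    """
    for error in errors:
        for matcher in matchers:
            for classified_failure_id, score in matcher(error):
                yield (error, classified_failure_id, score)
```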
* Remove unused methods
* Use TextLogError.set_classification instead of FailureLine's version; since they're effectively the same code, it's better to have a single entry point.
* Remove FailureLine.mark_best_classification_verified in favour of TextLogError's, creating a single place to mark a classification as verified.
* Get related FailureLine in a clearer way
* Only save changed fields
* Use decorator form of transaction.atomic to avoid extra indentation
* Use .count() on QuerySets instead of len(), since len() triggers a DB query and then counts the objects in memory, while .count() triggers a query that returns the result directly.
* Pull classified failure we're comparing against out into a variable
* Get objects from the db instead of refreshing in place
* Remove use of FailureMatch object/relation in favour of TLEMatch
* Manually build the matches output for the failure lines API
* Pull expected object out of list
* Manually build the classified failures output for the failure lines API
Previously it was not possible to test features that required an
authenticated user when:
* using `yarn start` with Vagrant (bug 1363722), which meant slower
watch builds
* pointing the UI at the prod/stage API (bug 1317752), which was
extremely limiting
Now login works in all environments, since the frontend no longer uses a
URL prefix, but instead webpack-dev-server proxies non-webpack URLs to
the chosen `BACKEND_DOMAIN` - avoiding cross-domain issues. Cookies are
rewritten to remove any `secure` directive (which is set on production),
so that they can still be read from HTTP localhost. The `Referer` also
has to be changed to stop Django's CSRF checks from rejecting requests.
The slower "build into `dist` and watch" mode is therefore no longer
necessary, so `yarn start:local` instead invokes webpack-dev-server just
like `yarn start` - and the `local-watch.js` workaround has been
removed.
Support for the "publish to GitHub with hardcoded `SERVICE_DOMAIN`"
workflow has been dropped, since it was already rarely used and there is
no way to make it support login.
The API domain environment variable was renamed to `BACKEND_DOMAIN` to
avoid potential confusion given it no longer behaves the same as
`SERVICE_DOMAIN` used to.
NB: For full stack Vagrant workflows users must now connect to port
*5000* on localhost, not 8000.
This is a separate function to chunked_qs to keep both functions simple
and easy to read. Both functions contain some inherent complexity since
they are windowing functions and it's important to not hide that below
multiple conditional branches.
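A minimal sketch of this style of chunking function (simplified for illustration: the real chunked_qs pages through a Django QuerySet, rather than slicing an in-memory sequence):

```python
def chunked_qs(qs, chunk_size=1000):
    """Yield successive chunks of a sliceable sequence.

    A windowing generator: each iteration materialises one window and
    stops when the source is exhausted.
    """
    start = 0
    while True:
        chunk = list(qs[start:start + chunk_size])
        if not chunk:
            return
        yield chunk
        start += chunk_size
```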
The last remaining `ProvidePlugin` definition cannot be removed since
AngularJS and Flot both expect `window.jQuery` to be defined (and
whilst AngularJS falls back to jqLite, it's buggy). It's not possible
to set `window.jQuery` manually in the entrypoints, since ES6 imports
are hoisted, thereby giving no opportunity to modify window first.
This enables `DeprecationWarnings` for things that Python 2 knows
are not compatible with Python 3. The `error` entry in the pytest
`filterwarnings` setting ensures these will be surfaced as Exceptions
and so result in test failures.
See:
https://docs.python.org/2/using/cmdline.html#cmdoption-3
The removal of `sorted()` from `test_bug_job_map_api.py` is to fix:
`DeprecationWarning: dict inequality comparisons not supported in 3.x`
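For reference, a minimal sketch of the pytest configuration being described (the exact file and any additional entries are assumptions); the `-3` flag is passed to the Python 2 interpreter itself:

```ini
[pytest]
filterwarnings =
    error
```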
To avoid unnecessary dependencies, and use a more conventional
django-rest-framework testing approach:
http://www.django-rest-framework.org/api-guide/testing/#apiclient
APIClient has a few API differences:
* `.json` -> `.json()`
* `.status_int` -> `.status_code`
* Get parameters are passed as keyword argument `data` not `params`
* The default hostname is `http://testserver` not `http://localhost`
* Additional HTTP headers are passed directly as keyword arguments,
rather than nested under a `headers` property.
* It doesn't check the status code itself, so explicit checks are
required, along with removing `expect_errors`.
* The `.post_json()` and `.put_json()` methods don't exist.
See also the docs for the Django test client (which APIClient wraps):
https://docs.djangoproject.com/en/1.11/topics/testing/tools/#the-test-client
Whilst making these changes, I also cleaned up the session fetching
in `test_auth.py` and `test_backends.py`, and added a `status_code`
check in `conftest.py`'s `mock_post_json()` - which makes the root
cause of test failures clearer.
Fixes pylint:
```
tests/etl/test_text.py:34,0: Anomalous Unicode escape in byte string:
'\U'. String constant might be missing an r or u prefix. (W1402:
anomalous-unicode-escape-in-string)
```
In Python 3 these return iterators rather than lists, so when used
in contexts where iterators are not supported they must first be
cast to `list`.
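A generic illustration of the dict-view behaviour (not Treeherder code):

```python
d = {'a': 1, 'b': 2}

keys = d.keys()             # Python 3: a view object, not a list
# keys.sort()               # AttributeError in Python 3
sorted_keys = sorted(keys)  # fine: sorted() accepts any iterable
key_list = list(d.keys())   # explicit cast when a real list is needed
```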
These cases weren't caught by pylint `dict-keys-not-iterating` and
`dict-values-not-iterating` (since it isn't able to infer the type
of anything but straightforward `dict` usages), but instead by manual
auditing.
Instead of casting `resp.json.keys()` in `test_performance_data_api.py`
the asserts have been removed, since they were duplicating the
coverage provided by the `assert` on the next line.
In Python 3, `map()`, `filter()` and others return iterators rather
than lists, so they must be cast back to a `list` if used in contexts
where an iterator is not supported.
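A generic illustration (not Treeherder code):

```python
squares = map(lambda n: n * n, [1, 2, 3])
# In Python 2 this was a list; in Python 3 it's a lazy iterator:
# squares[0] raises TypeError, and a second pass yields nothing.
squares_list = list(squares)  # cast back when list behaviour is needed
```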
Relative imports found by running:
`futurize -w -n -f absolute_import .`
...however then hand-edited to convert to absolute, and the newly added
`from __future__ import absolute_import` removed (since we mostly
don't use relative imports, and PEP8 prefers absolute).
After looking at these, I think I want to go back to using "th" prefix and "ph" prefix.
It is a nice way to see right off the bat that these are constants that are established
by our code and not some external library. So converting ``platformMap`` back to
``thPlatformMap``. It also makes for less code/blame churn as we move to imported
constants.
* Move thOptionOrder to constants.js
* Move thFailureResults to constants.js
* Move thRepoGroupOrder to constants.js
* Move thFavIcons to constants.js
* Move phCompareDefaultOriginalRepo to constants.js
* Move phCompareDefaultNewRepo to constants.js
* Move thDefaultRepo to constants.js
* Move thTitleSuffixLimit to constants.js
* Move phTimeRanges to constants.js
* Move thDateFormat to constants.js
* Move phDefaultTimeRangeValue to constants.js
* Move phTimeRangeValues to constants.js
* Move phBlockers to constants.js
* Move phDefaultFramework to constants.js
* Move phAlertSummaryStatusMap to constants.js
* Move phAlertSummaryIssueTrackersMap to constants.js
* Move phAlertStatusMap to constants.js
* Move thJobNavSelectors to constants.js
* Move thPerformanceBranches to constants.js
* Move phDashboardValues to dashboard.js
* Move phCompareBaseLineDefaultTimeRange to constants.js
* Move thPinboardCountError to constants.js
* Remove now-unused values.js
* Move thResultStatusFilters to constants.js
* Move thEvents to constants.js
* Move thAggregateIds to aggregateIdHelper.js
* Move thReftestStatus to jobHelper.js
* Remove now-unused provider.js
The cache-based engine stores session data in our Redis cache rather
than in the `django_sessions` table in the MySQL database.
This has several advantages:
* faster access (session lookups occur for virtually all API requests)
* expired sessions are automatically cleaned up (the database engine
requires that the `clearsessions` command be run periodically)
* fewer secrets being stored in the database and its backups
See:
https://docs.djangoproject.com/en/1.11/topics/http/sessions/#configuring-sessions
https://docs.djangoproject.com/en/1.11/ref/settings/#session-engine
We're not concerned about very occasional session loss, so don't need
the persistence guarantees of the related `cached_db` engine.
The `INSTALLED_APPS` entry is being removed since its only purpose
is to load the session database model.
The tests have been adjusted to use the proper sessions API (rather
than directly accessing the session DB model) per:
https://docs.djangoproject.com/en/1.11/topics/http/sessions/#using-sessions-out-of-views
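A sketch of the settings change being described (Treeherder's actual settings layout may differ):

```python
# settings.py sketch: store sessions in the (Redis-backed) cache rather
# than the django_sessions table. 'django.contrib.sessions' can come out
# of INSTALLED_APPS, since only the database engine needs its model.
SESSION_ENGINE = 'django.contrib.sessions.backends.cache'
```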
* Don't use SERVICE_DOMAIN for references to the front-end code
* Change getRootUrl to getApiUrl to better represent what it is for
* Convert other usages of SERVICE_DOMAIN to getApiUrl
* Convert usages of SERVICE_DOMAIN to getProjectUrl
* Convert uses of SERVICE_DOMAIN to getServiceUrl
* Convert tests to use ``urlHelper``
I believe this was the main bug causing the slowdown. It was setting state
in ``filterPlatform`` which is called once per platform when a filter change is
made. But that ``setState`` was updating ALL the platforms for that push.
* Remove unnecessary clonejobs artifact
* Fix the regression with expand/collapse counts
* Very minor optimizations and cleanup
* Add some unit tests for groups
* Move JobDetailsPane to details-panel folder
* Convert to 2 space indent
* Cleanup props with deconstruction
* Subsume some filters into jobdetailspane
* Subsume logic for job signature filtering
* Subsume logic for visibleFields and visibleTimeFields
* Only need latest classification, not whole array
* Ensure all helper functions are named
* Move getBugUrl, getSlaveHealthUrl, getInspectTaskUrl, getWorkerExplorerUrl,
getLogViewerUrl, getRootUrl, getProjectUrl and getProjectJobUrl to urlHelper
* Move getJobsUrl to the only place it's used
* Replace thServiceDomain with SERVICE_DOMAIN
* remove thUrl provider
This converts virtually all of the CJS imports, apart from:
* the partials/plugin template loading in cache-loader - since that
is being handled in bug 1441614
* `.eslintrc.js` / `neutrino-custom/*` - since they are used by node
directly (which doesn't yet support ES6 modules) rather than via
webpack.
File extensions have been omitted for JS/JSX, otherwise eslint reports:
`Unexpected use of file extension "js" for "..." (import/extensions)`
An AngularJS module declares a dependency on another module like so:
```
const fooModule = angular.module('foo', ['name_of_dep_module']);
```
This is suboptimal, since:
a) the dependency is only validated at runtime not during the build
b) the module name often doesn't match the NPM package name, making
it hard to know whether imports are unused.
As such, many AngularJS packages now export a string containing the
module name, which allows for an ES6-like approach (it still uses
globals underneath, but better than nothing):
```
import nameOfDepModule from 'dep-module';
const fooModule = angular.module('foo', [nameOfDepModule]);
```
Just to complicate things, some modules don't export the module name,
but the actual module object itself. This can't be passed to
`angular.module()` directly, so for these modules we have to use the
`name` property on the module instead:
https://docs.angularjs.org/api/ng/type/angular.Module#name
eg:
```
import depModuleObject from 'awkward-dep-module';
const fooModule = angular.module('foo', [depModuleObject.name]);
```
In fact, the treeherder modules do just that - however it's not worth
fixing, since otherwise we'd have to change every file that registers
directives/filters/values/... to fetch the module first, using
`angular.module('name-of-existing-module')`.
And whilst we're there, use explicit imports instead of relying on
the webpack `ProvidePlugin` to import it for us - which lets us also
remove the eslint global exemption plus unit test workaround.
This helps prevent:
https://www.owasp.org/index.php/Reverse_Tabnabbing
We're not also using `noreferrer`, since most browsers now support
`noopener` (https://caniuse.com/#search=noopener) and the link targets
are all Mozilla properties where the referrer may be useful.
The auth.js `window.open()` has not been changed, since the login
callback makes use of `window.opener`.
This adds some new components and removes the AngularJS ng-repeat for
pushes. In the course of this work, some of the AngularJS providers were
converted to helper functions.
In a couple of cases, I had to add new code to the AngularJS areas so that it
would continue to interact well between Angular and React.
Also:
* Rename some functions and CSS classes from resultset to push
* Add unlistening for events during unmount of components
Since unfortunately the migration failed on stage due to:
```
IntegrityError: (1452, 'Cannot add or update a child row: a foreign
key constraint fails (`treeherder`.`#sql-1c26_1936a5`, CONSTRAINT
`bug_job_map_bug_id_013db14b_fk_bugscache_id` FOREIGN KEY (`bug_id`)
REFERENCES `bugscache` (`id`))')
```
I believe this is due to the fact that bug numbers are sometimes
manually entered when classifying failures (eg for infra failures),
meaning that not every bug ID in `BugJobMap` is for an intermittent
bug, and so doesn't exist in `Bugscache` (which is currently only
populated with bugs that have the keyword `intermittent-failure`).
This wasn't caught on prototype, since whilst that has some failure
classifications saved, none of them were hand-entered and so all
existed in `Bugscache`, satisfying the new constraint.
I've edited the existing migrations file since in most environments
it won't have already run (I'll fix up prototype/stage by hand). The
non-FK changes have been left as-is, since they are working fine.
change BugJobMap bug_id field to foreign key field to create a join on
Bugscache model; add whiteboard to bugzilla, add index to Push time
field for search by date range; update associated tests
Since it's not being used. This will save us having to convert it to
React for now, and also means one less thing slowing down our builds.
Should it be needed in the future, it can be resurrected using the
Git log.
Since revision details are included in the response from the `/push/`
API, there's never any need to fetch them separately.
(And if anyone wants more than the first 20 commits, they should be
using hg.mozilla.org's API instead.)
This replaces the logic that was in clonejobs.js that formerly
handled rendering the pushes, platforms and jobs and converts it to
ReactJS. The ReactJS code is hosted as a directive in AngularJS
using ngReact reactDirective.
This also removes the feature where you can hide revisions because
it was believed to not be used and added unnecessary complications
to the code.
Co-authored-by: Casey Williams cwillia5@gmail.com
Since after bug 1400069 it is no longer used by the UI.
This removes everything but the Job model field (since the table is
large enough that migrations need to be carefully coordinated, and we
can batch up that change with others).
With ES6, the `'use strict'` directives are unnecessary:
https://eslint.org/docs/rules/strict
The directives have been left in the Neutrino configs, since they
are used by node directly, which doesn't yet support ES6 modules.
* Allow Google login
Presently, when you login with google you are prompted with a screen
that tells you to login using another provider. However, if you try to
login with google using an LDAP email "@mozilla.com", then there is a
blank page saying Unrecognized identity.
* Add test cases
## Rough summary of the changes
### Front end
The auth callback is written in React and lives under the /login.html endpoint. It communicates with Treeherder using localStorage.
### Credential expiration
The Django user session is set to expire when the client access token or the id token expires (whichever expires first). These values are controlled by the IAM team. Presently, the access token expires after 1 day and the id token after a week, so the session will expire after 1 day. If we want this value changed, we simply need to send a request to the IAM team.
### Credential renewal
Renewals are set to happen every 15 minutes or so. The renewal is skewed slightly so that different open tabs don't renew at the same time. Once renewal happens, both tokens are renewed and the Django session is updated.
### Migration
If the userSession localStorage key is not set, the user will be logged out, including from the Django session. In other words, all users will be automatically logged out when this is merged to production.
Since:
* Redis has additional features we need (eg for bug 1409679)
* the Redis server, python client and Django backend are more
actively maintained (so have persistent connections/pooling that
actually works, which gives a big perf win)
* we can use Heroku's own Redis addon rather than relying on a
third-party's (to hopefully prevent a repeat of the certificate
expiration downtime)
This commit:
* Switches pylibmc/django-pylibmc to redis-py/django-redis, and
configures the new backend according to:
http://niwinz.github.io/django-redis/latest/
https://github.com/andymccurdy/redis-py
* Uses redis-py's native support for TLS to connect to the Heroku
Redis server's stunnel port directly, avoiding the complexity of
using a buildpack to create an stunnel between the dyno and server:
https://devcenter.heroku.com/articles/securing-heroku-redis#connecting-directly-to-stunnel
* Uses explicit `REDIS_URL` values on Travis/Vagrant rather than
relying on the django-environ `default` value (for parity with
how we configure `DATABASE_URL` and others).
* Removes the pylibmc connection-closing workaround from `wsgi.py`.
Note: Whilst the Heroku docs suggest using `django-redis-cache`, it's
not as actively maintained, so we're using `django-redis` instead:
https://devcenter.heroku.com/articles/heroku-redis
https://github.com/niwinz/django-redis
https://github.com/sebleier/django-redis-cache
Before this is merged, Heroku Redis instances will need to be created
for stage/production (prototype done) - likely on plan `premium-3`:
https://elements.heroku.com/addons/heroku-redis
We'll also need to pick an eviction policy:
https://devcenter.heroku.com/articles/heroku-redis#maxmemory-policy
Once deployed, the `memcachier-tls-buildpack` buildpack will no longer
be required and can be removed from prototype/stage/prod, along with
the `TREEHERDER_MEMCACHED` environment variable and Memcachier addon.
* Adjusts the test name and docstring, since there is nothing
memcached-specific about the test.
* Overrides the default timeout of never to 10 seconds, to prevent
previous test run values persisting.
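A sketch of what the resulting cache configuration might look like (the `LOCATION` value here is an assumption; in practice it would come from the `REDIS_URL` environment variable, and the `rediss://` scheme gives TLS directly to the stunnel port):

```python
# settings.py sketch for the django-redis backend
CACHES = {
    'default': {
        'BACKEND': 'django_redis.cache.RedisCache',
        'LOCATION': 'rediss://localhost:6380',  # assumed host/port
        'OPTIONS': {
            'CLIENT_CLASS': 'django_redis.client.DefaultClient',
        },
    },
}
```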
Modify the code to:
* share assets and global settings wherever possible
* update links going both directions
* other small UI tweaks for uniformity with Treeherder
* Fixed a few routing dead-ends on the react side
* Removed the dead TestDetail file we weren't using anyway
* fix production domain urls
The runnable jobs API now fetches runnable-jobs.json if available and
falls back to full-task-graph.json.
The new file is less than a tenth of the original file and contains the minimum
amount of data required for the endpoint.
Drop support for full-task-graph.json and 'job_type_description'.
The UI has already been removed. This cleans up the data ingestion
and removes the `JobDuration` model, however leaves the `running_eta`
field on the `Job` model for the next time that table is touched (since
the table is large, so altering the schema would likely require
downtime).
Now that no submissions are using revision_hash, it can be removed.
This removes everything but the model field, which will be handled
later.
I've removed revision_hash from the Pulse jobs schema without bumping
the version, which wouldn't normally be ok, but no one is still using
it, and I'd rather have explicit failures later than if we left the
schema unchanged.
Also upgrade Enzyme from 2.7.1 to 3.1.1
Most notable change here is that React.PropTypes is now
moved to a separate package and referenced just by
PropTypes. So this needed some import and linting changes.
* Make the performance data api accept signature id as input
* Rewrite most of the frontend to use that API in preference to passing
in signature hashes
* Correspondingly increase the chunks of comparison data we fetch
at once
This adds the upgrade to Bootstrap 4, and some basic changes and
some CSS tweaks we needed to keep our UI consistent.
The simpler changes are things like:
* Classes that were renamed
* Adding classes that are now needed (dropdown-item, etc)
* Change an item from a button to a span
* Changing order of items (modal header close button, etc)
* CSS class syntax changes
The other changes are lots of CSS padding, margin, font and
other spacing tweaks.
Prior to this, we stored a guid in the ``coalesced_to_guid``
field and deduced that it was coalesced by that field being
non-null. This will just set the result to "superseded" and
the state to "completed".
Note: This leaves the models and tables intact so that we can
revert without data loss in case we discover an issue. A follow-up
commit will remove those tables and models.
* Makes TaskCluster job tiers independent of exclusion profiles: we take
whatever tier we are given by TaskCluster for the job.
* Removes use of exclusion profiles for Buildbot jobs. Jobs that
should be Tier-2 and Tier-3 are hard-coded in the Treeherder code
by their job signature.
This is to allow any job type to be able to belong to any
job group. This will also mean that if someone accidentally
picked the wrong group for a job type, we don't need to
fix it in the DB for all new jobs. They can fix their task
definition, and all new jobs will go to the new job group.
This includes a management command to migrate the old
data from job_type.job_group to the new field of job.job_group.
A follow-up PR will remove the old field and set the API to
read from the job.job_group field.
pytest treats objects starting with the string "Test" as tests, so an
underscore prefix has been added to prevent warnings of form:
```
WC1 .../test_detect_intermittents.py cannot collect test class 'TestFailureDetector' because it has a __init__ constructor
```
It's having to be added as a platform rather than a new job/group
name, since otherwise comparisons can't be made in Perfherder with
the existing tests.
Switch to the new environment variable `PULSE_PUSH_SOURCES`.
Keep old `publish-resultset-runnable-job-action` task name by creating a
method that points to `publish_push_runnable_job_action`.
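A sketch of the aliasing approach (the signature and return value here are assumptions; the real task publishes a Pulse message):

```python
def publish_push_runnable_job_action(project, push_id):
    # placeholder body; the real task publishes a Pulse message
    return ('runnable_job_action', project, push_id)

def publish_resultset_runnable_job_action(*args, **kwargs):
    """Deprecated alias: keeps tasks queued under the old name working."""
    return publish_push_runnable_job_action(*args, **kwargs)
```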
All environments are now using the native MySQL 5.7 libmysqlclient
library, so the custom vendoring script is unused. The tests checking
that the library isn't vulnerable aren't needed any more, since
heroku-16 doesn't even have that package installed (it's not available
on Xenial).
The GraphQL queries for the TestGroup UI were creating too many sub-queries.
There is a bug in graphene_django that eliminates any usage of
select_related or prefetch_related. This works around that limitation and
allows the user to create a mapping of fields to the optimization that
can be used for each. It will only apply the optimization if the field
is present in the query.
We use Python's `.format()` to concatenate the commit author's name and
email address. When we hit an author name with non-ASCII characters, it
would bomb out trying to format it into a byte string, so this makes
sure to put those fields into a Unicode string to preserve them.
Since we have to upgrade from MySQL 5.6 to 5.7 and there are still
enough remnants behind that cause the test to fail. Multiple attempts
have been made to clean them up to no avail, and since this will go
away soon when we switch Travis to using Docker images, it's not worth
spending any more time on it.
Previously it was dependent on the order of results returned by the
API, when the backing DB query doesn't use an ORDER BY. This means
that changes in the optimiser (eg upgrading to a different MySQL
version) could cause the test to fail.
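One way to make such a test order-independent (the helper name is hypothetical, not the actual fix):

```python
def assert_same_rows(actual, expected, key='id'):
    """Compare API payloads without relying on row order, since the
    backing query has no ORDER BY and the optimiser may reorder rows."""
    def by_key(rows):
        return sorted(rows, key=lambda row: row[key])
    assert by_key(actual) == by_key(expected)
```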
* Bug 1371820 - Automatically select 'fixed by commit' classification type in certain circumstances
If a 12 or 40 char string consisting only of a-f,0-9 is pasted into the classification textbox, or if the pasted string contains 'hg.mozilla.org', automatically select the 'fixed by commit' classification type, to reduce the number of clicks needed to properly classify this failure.
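The heuristic can be sketched as follows (shown in Python for illustration; the real check lives in the JS frontend, and the function name is an assumption):

```python
import re

def looks_like_fixed_by_commit(text):
    """True for a bare 12- or 40-char hex changeset, or any paste
    containing 'hg.mozilla.org'."""
    text = text.strip()
    if 'hg.mozilla.org' in text:
        return True
    return bool(re.fullmatch(r'[a-f0-9]{12}|[a-f0-9]{40}', text))
```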