* rename backfill bot' s packages to Sherlock
* slight logs adjustments to Sherlock
* replace 'Perfsheriff bot' prefix with 'Sherlock'
* adjust comment referring to Sherlock
* rename SecretaryTool to Secretary
* remove StepParser and switch to ErrorParser
* remove writes to TextLogStep from artifact.py
* remove buildbot ref in builders
* replace TextLogStep model in DetailsPanel, SimilarJobsTab and logviewer App
* cleanup DetailsPanel
* remove old log parsing tests and update others
* add logging to error_summary.py
* add parse max error lines limit to ErrorParser
* fix in similar jobs tab for Bug 1652869
* remove StepParser and stop storing steps in TextLogStep
* remove more buildbot references
* replace TextLogStep model in DetailsPanel and logviewer App
* remove old log parsing tests and update other tests
* add logging to error_summary.py
* remove StepParser and stop storing steps
* replace TextLogStepModel in the UI with text-log-errors API
* remove old references to buildbot
* update and cleanup tests
Currently, Treeherder consumes Pulse messages from an intermediary service called `taskcluster-treeherder`.
Such service needs to be shut down and its functionality imported into Treeherder.
In order to do this we need to switch to the standard Taskcluster exchanges as defined in here:
https://docs.taskcluster.net/docs/reference/platform/queue/exchanges
On a first pass we are only including the code from `taskcluster-treeherder` without changing
much of Treeherder's code. The code is translated from Javascript to Python and only some minor
code changes were done to reduce the difficulty on porting the code without introducing bugs.
Internally, on this first pass, we will still have an intermediary data structure representing
what `taskcluster-treeherder` is emitting, however, we will stop consuming the messages
from it and be able to shut it down.
Instead of consuming from one single exchange we will be consuming multiple ones. Each one representing
a different kind of task (e.g. pending vs running).
In order to test this change you need to open 5 terminal windows and follow these steps:
* On the first window run `docker-compose up`
* On the next three windows `export PULSE_URL="amqp://foo:bar@pulse.mozilla.org:5671/?ssl=1"` and run the following commands:
* `docker-compose run -e PULSE_URL backend ./manage.py pulse_listener_jobs`
* `docker-compose run -e PULSE_URL backend ./manage.py pulse_listener_tasks`
* `docker-compose run -e PULSE_URL backend ./manage.py pulse_listener_pushes`
* On the last window run `docker-compose run backend celery -A treeherder worker -B --concurrency 5`
* Open on your browser `http://localhost:5000`
This is just a summary from [the docs](https://treeherder.readthedocs.io/pulseload.html).
= ETL management commands =
This change also introduces two ETL management command that can be executed like this:
== Ingest push and tasks ==
This script can ingest into Treeherder all tasks associated to a push.
It uses Python's asyncio to speed up the ingestion of tasks.
```bash
./manage.py ingest_push_and_tasks
```
== Update Pulse test fixtures ==
```bash
./manage.py update_pulse_test_fixtures
```
This command will read 100 Taskcluster Pulse messages, process them and store them as test fixtures
under these two files: `tests/sample_data/pulse_consumer/taskcluster_{jobs,metadata}.json`
Following this work would be to get rid of the intermediary job representation ([bug 1560596](https://bugzilla.mozilla.org/show_bug.cgi?id=1560596) which will
clean up some of the code and some of the old tests.
= Extra script =
Script that permits comparing pushes from two different Treeherder instances.
```
usage: Compare a push from a Treeherder instance to the production instance.
[-h] [--host HOST] --revision REVISION [--project PROJECT]
optional arguments:
-h, --help show this help message and exit
--host HOST Host to compare. It defaults to localhost
--revision REVISION Revision to compare
--project PROJECT Project to compare. It defaults to mozilla-central
```
= Other changes =
Other changes included:
* Import `taskcluster-treeherder`'s validation to ensure we're not fed garbage.
* Change `yaml.load(f)` for `yaml.load(f, Loader=yaml.FullLoader)`. Read [this](https://github.com/yaml/pyyaml/wiki/PyYAML-yaml.load(input)-Deprecation) for details
* Introduce `taskcluster` and `taskcluster-urls` as dependencies
* The test `test_retry_missing_revision_never_succeeds` makes no sense because
we make Json validation on the Pulse message
* Bug 1395254 - Consume Taskcluster Pulse messages from standard queue exchanges
Currently, Treeherder consumes Pulse messages from an intermediary service called `taskcluster-treeherder`.
Such service needs to be shut down and its functionality imported into Treeherder.
In order to do this we need to switch to the standard Taskcluster exchanges as defined in here:
https://docs.taskcluster.net/docs/reference/platform/queue/exchanges
On a first pass we are only including the code from `taskcluster-treeherder` without changing
much of Treeherder's code. The code is translated from Javascript to Python and only some minor
code changes were done to reduce the difficulty on porting the code without introducing bugs.
Internally, on this first pass, we will still have an intermediary data structure representing
what `taskcluster-treeherder` is emitting, however, we will stop consuming the messages
from it and be able to shut it down.
Instead of consuming from one single exchange we will be consuming multiple ones. Each one representing
a different kind of task (e.g. pending vs running).
In order to test this change you need to open 4 terminal windows and follow these steps:
* On the first two windows `export PULSE_URL="amqp://foo:bar@pulse.mozilla.org:5671/?ssl=1"` and run the following commands:
* `docker-compose run -e PULSE_URL backend ./manage.py pulse_listener_jobs`
* `docker-compose run -e PULSE_URL backend ./manage.py pulse_listener_pushes`
* On the third window run `docker-compose run backend celery -A treeherder worker -B --concurrency 5`
* On the last window run `docker-compose up`
* Open on your browser `http://localhost:5000`
This is just a summary from [the docs](https://treeherder.readthedocs.io/pulseload.html).
= ETL management commands =
This change also introduces two ETL management command that can be executed like this:
== Ingest push and tasks ==
This script can ingest into Treeherder all tasks associated to a push.
It uses Python's asyncio to speed up the ingestion of tasks.
```bash
./manage.py ingest_push_and_tasks
```
== Update Pulse test fixtures ==
```bash
./manage.py update_pulse_test_fixtures
```
This command will read 100 Taskcluster Pulse messages, process them and store them as test fixtures
under these two files: `tests/sample_data/pulse_consumer/taskcluster_{jobs,metadata}.json`
Following this work would be to get rid of the intermediary job representation ([bug 1560596](https://bugzilla.mozilla.org/show_bug.cgi?id=1560596) which will
clean up some of the code and some of the old tests.
= Other changes =
Other changes included:
* Import `taskcluster-treeherder`'s validation to ensure we're not fed garbage.
* Change `yaml.load(f)` for `yaml.load(f, Loader=yaml.FullLoader)`. Read [this](https://github.com/yaml/pyyaml/wiki/PyYAML-yaml.load(input)-Deprecation) for details
* Introduce `taskcluster` and `taskcluster-urls` as dependencies
* The test `test_retry_missing_revision_never_succeeds` makes no sense because
we make Json validation on the Pulse message
Since as of #3980 (bug 1470622) the frontend no longer calls the
`/retrigger/` `/cancel/` or `/cancel_all/` Treeherder APIs.
Whilst looking at the pulse related fixtures, I spotted that the
`mock_message_broker` fixture was already unused.
Since everything but standard retrigger/cancellation is now handled
client-side by tcactions, making the backend parts that provided pulse
messages to pulse_actions redundant (since it was decommissioned in
bug 1379172).
Now that no submissions are using revision_hash, it can be removed.
This removes everything but the model field, which will be handled
later.
I've removed revision_hash from the Pulse jobs schema without bumping
the version, which wouldn't normally be ok, but no one is still using
it, and I'd rather have explicit failures later than if we left the
schema unchanged.
Change of new environment variable `PULSE_PUSH_SOURCES`.
Keep old `publish-resultset-runnable-job-action` task name by creating a
method that points to `publish_push_runnable_job_action`.
This sets the field lengths to what they will be in a later PR for
the job_details model. But these are still within the constraints
of the current field lengths for that table.
Several existing jobs are already out of compliance with these patterns
and there is no existing way to tell task definition developers how to
comply with our required patterns.
created bug 1283866 in Taskcluster for that tool/workflow
This removes the pattern requirements. If we ever decide that we DO
need these patterns, we can create that tool and then fix old task
definitions to comply.
* Bug 1254325 - Adding support to download all_tasks.json
* Adding support to send task ID of gecko decision via Pulse
* Fixing a few problems
* Adding UTC timestamp in pulse message
* Fixing nits, adding restriction to block ui changes
* Updating URL for full tasks file
* Renaming buildernames to requested_jobs
Some repos are longer-lived and do not yet have the Task Cluster
code that allows them to submit tasks with a revision. They only
have the older code to submit revision_hash. This prevents the
jobs from being ingested via Pulse. This commit adds support
for revision_hash until a time when it's no longer needed.
This contains several tweaks and fixes that allow us to ingest data from
a real Task Cluster owned exchange.
One of the main fixes is the way I was binding to the exchange and
routing keys with the same Queue. Before, it was re-creating the Queue,
so would miss some of the bindings.
This will also prune the durable queue if the config has removed some
exchanges and/or routing keys.
These changes were discovered to be needed after direct testing
against Task Cluster Pulse exchanges.
-Made some JSON/YML schema changes to be more precise for several fields
-Modified to job_loader to be more resilient to optional data being
missing