* Allow autoclassification when there are multiple failures of the same type reported for the same test
* Add a test with a test failing in the same group in two different modes
* Support parsing suite and configuration from label when a task is not split in chunks
* Assert we can parse all labels to suites when checking if a group is still running on a push
* Add mochitest-browser-a11y to the list of suites
* Add a comment reminding us to stop parsing labels once the suite information is part of the task definition
* Extend the classify command to perform backfills and retriggers
* Fix code and existing tests
* Fix repeat_retrigger + Filter out groups not associated to any failing task
* Fix CI
* Fix some review related code snippets
* Fixes + Nits + Cleanup
Co-authored-by: Eva Bardou <ebardou@teklia.com>
* Filter out groups that are still running from real failures
* Fix tests
* Nits
* Fix typo + Nit
* Get suite from label instead of relying on tests-type tag as the tag isn't always available
* MOZHARNESS_TEST_PATHS is part of 'env'
* No need to create a list first to create a set
* No need to check MOZHARNESS_TEST_PATHS if the test suite is different
This way we can avoid retrieving the task definitions in many cases
* Refactor logic to acknowledge running groups
* Fix new logic
* Clearer comment
Co-authored-by: Marco Castelluccio <mcastelluccio@mozilla.com>
Co-authored-by: Eva Bardou <ebardou@teklia.com>
* Implement ability to perform backfills on tasks
* Add tests
* Retrieve TC Firefox CI credentials from mozci config secret
Co-authored-by: Eva Bardou <ebardou@teklia.com>
* Identify failure types for a Push and group them by test group
* Add a docstring + refactor typing info
Co-authored-by: Eva Bardou <ebardou@teklia.com>
* Support retrieving group durations in the errorsummary data source
The treeherder_client data source should support it too, but the data is not available yet
in Treeherder, so for now it will just return None so we don't have to split the contract in two.
Fixes #646
* Mention #662 in TODO comments for future cleanup
* Support GroupSummary durations using data from results
* Implement __make_group_summary_objects using data from group results
Fixes #359
* Update tests
* Update more tests
* Remove integration tests depending on ActiveData
Reinstating them without relying on ActiveData is tracked in #390
* Remove references to adr and ActiveData from the documentation
* Remove adr dependency
* Remove last references to ActiveData from code comments
* Install cache extras for tests as it is now required
Previously we were relying on adr depending on boto3
* Take into account group results in following pushes too for determining cross config failures
Fixes #610
* Use old style dict type definition until we stop supporting Python 3.7
A failure is intermittent when it is not cross-config *and* not config-consistent; we were instead
checking whether it was either not cross-config or not config-consistent.
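
A minimal sketch of the difference between the two conditions described above. The function names and boolean parameters are hypothetical and only illustrate the logic; they are not taken from the codebase.

```python
def is_intermittent(cross_config: bool, config_consistent: bool) -> bool:
    # Corrected rule: the failure must be neither cross-config
    # nor config-consistent.
    return not cross_config and not config_consistent


def is_intermittent_old(cross_config: bool, config_consistent: bool) -> bool:
    # Previous rule: it was enough for one of the two to be absent.
    return not cross_config or not config_consistent
```

With the old check, a failure that was cross-config but not config-consistent would still be treated as intermittent; with the corrected check it is not.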
The option can be None if we want to disable cross-config detection, or a tuple with two ints where:
- the first number specifies how many configurations a failure must occur on to be considered cross-config for intermittent detection;
- the second number specifies how many configurations a failure must occur on to be considered cross-config for real failure detection.
This way we can tune the detection. E.g. if we want to be conservative, we can set a low first number and a high second number.
Fixes #605
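
As a rough illustration of how such an option could be interpreted, the sketch below uses a hypothetical helper, `cross_config_flags`, which is not the actual implementation. It assumes that "cross-config" means the failure was seen on at least the given number of configurations, and that passing `None` means the failure is never treated as cross-config.

```python
from typing import Optional, Tuple


def cross_config_flags(
    num_failing_configs: int, counts: Optional[Tuple[int, int]]
) -> Tuple[bool, bool]:
    """Return (cross-config for intermittent detection,
    cross-config for real failure detection)."""
    if counts is None:
        # Assumption: disabling detection means nothing is cross-config.
        return (False, False)
    intermittent_count, real_count = counts
    return (
        num_failing_configs >= intermittent_count,
        num_failing_configs >= real_count,
    )
```

Under these assumptions, a conservative setting such as `(2, 5)` makes it hard to call a failure intermittent (two configurations already count as cross-config) and hard to call it real (five configurations are needed).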
To identify real failures, we check that at least three tasks on any configuration are exhibiting the failure.
To identify intermittent failures, we check that there are no configurations with two tasks exhibiting the failure.
The numbers are configurable through the parameters to the push classify function.
Fixes #607
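
The thresholds described above can be sketched as follows. The function name, the parameter names, the string return values, and the input shape (a mapping from configuration name to the number of tasks exhibiting the failure) are all assumptions made for illustration; only the default values 3 and 2 come from the description.

```python
from typing import Dict


def classify_by_consistency(
    failing_tasks_per_config: Dict[str, int],
    real_min: int = 3,
    intermittent_max: int = 2,
) -> str:
    # Assumes at least one configuration is present in the mapping.
    most = max(failing_tasks_per_config.values())
    if most >= real_min:
        # At least `real_min` tasks on some configuration show the failure.
        return "real"
    if most < intermittent_max:
        # No configuration has `intermittent_max` tasks showing the failure.
        return "intermittent"
    # In between the two thresholds there is not enough evidence either way.
    return "unknown"
```

For example, `{"linux": 3, "windows": 1}` is classified as real, `{"linux": 1, "windows": 1}` as intermittent, and `{"linux": 2}` falls between the two thresholds.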
* Fix the cross config logic to consider failures as cross config when they happen on all configurations
* Fix test
* A failure in a single task should not be considered as cross config
* Ensure a failure is considered intermittent only if the group ran on multiple tasks and failed only in some
Instead of considering it intermittent when it ran and failed on a single task
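
A small sketch of the condition described above, assuming each group run is reduced to a pass/fail boolean per task. The function name and input shape are hypothetical.

```python
from typing import List


def is_intermittent_group(passed_per_task: List[bool]) -> bool:
    # One entry per task that ran the group; True means the group passed.
    ran_on_multiple_tasks = len(passed_per_task) > 1
    failed_in_some = not all(passed_per_task)
    passed_in_some = any(passed_per_task)
    return ran_on_multiple_tasks and failed_in_some and passed_in_some
```

A group that ran and failed on a single task yields `False` here, since there is no second run to compare against.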
* Use all information available in json-pushes to populate bugs lists
* Fix conftest
* Fix tests/test_regression.py
* Always use full push info
* Use parse_bugs from hg_custom
* Update data structure in unit tests
* Fix bug on summary display for calcs
* Suggestions from Marco
* Support raw list of changeset IDs
* Cache tasks more often than only when a push is finalized
* Add tests
* Try to fix CI
* Mock cache.get instead
Co-authored-by: Eva Bardou <ebardou@teklia.com>