Boris Feld
7906380e6f
Bump version of taskboot to use latest version of img tool ( #562 )
...
It is necessary to support multi-tag Docker image building
2019-06-07 12:21:09 +02:00
Ayush Shridhar
6e39b0a5a5
Change timedelta to 21 days in the script to generate Duplicate model results ( #563 )
2019-06-07 12:05:07 +02:00
Sladyn
860bb69c10
Add a basic test for the StepsToReproduce model ( #503 )
2019-06-07 11:11:39 +02:00
Boris Feld
e0accae208
Move string formatting to f-string in spawn_data_pipeline ( #559 )
2019-06-07 11:04:33 +02:00
pyup.io bot
c590278bff
Update pyyaml from 5.1 to 5.1.1 ( #560 )
2019-06-07 10:56:26 +02:00
Marco Castelluccio
f3caa72d54
Version 0.0.44
2019-06-06 19:15:37 +02:00
Boris Feld
5a31c99ac9
Add support for specific Docker tag in spawn_data_pipeline.py ( #553 )
...
* Revert "Revert "Add support for specific Docker tag in spawn_data_pipeline.py (#489)" (#499)"
This reverts commit 249ed40eb6.
* Ignore task with a tagged docker image
* Restrict Docker tag update to bugbug related images
2019-06-06 19:14:27 +02:00
Marco
ee935d6e5b
Download previous commits DB and experiences, and only mine data for new commits landed since then ( #546 )
...
* Download previous commits DB and experiences, and only mine data for new commits landed since then
Fixes #537
* Simplify db methods
* Add an option to return mined commits
2019-06-06 18:55:17 +02:00
Boris Feld
32f56a3962
Add a script to update the hook definition with the TAG during release ( #507 )
...
Fixes #501, fixed relanding of #491.
2019-06-06 18:11:59 +02:00
Boris Feld
08e36a7d8a
Build tagged Docker images ( #554 )
2019-06-06 18:06:16 +02:00
pyup.io bot
75045dac44
Update pre-commit from 1.16.1 to 1.17.0 ( #555 )
2019-06-06 18:00:14 +02:00
Ayush Shridhar
f21b4ee9d8
Add a script to generate duplicate classifier results ( #548 )
2019-06-06 16:30:23 +02:00
Marco Castelluccio
b8f0a14f4e
Store ETag when downloading DBs and only redownload if necessary
2019-06-05 14:54:36 +02:00
Marco Castelluccio
47eb2da7ec
Don't overwrite the first_pushdate value
...
Follow up to bea28a17f6
2019-06-05 01:07:12 +02:00
Marco Castelluccio
f5951ad63a
Support retrieving some label files at runtime, and do it for the regressor labels
2019-06-05 00:37:26 +02:00
Marco
5165524b62
Iterate over bugs only once during training ( #527 )
...
Fixes #515
2019-06-04 18:45:55 +02:00
Marco Castelluccio
b0207e2448
Version 0.0.43
2019-06-04 16:34:36 +02:00
Marco Castelluccio
1e6cb79573
Avoid all types of weirdness with ', ' in 'added' or 'removed'
2019-06-04 16:34:18 +02:00
Ayush Shridhar
551af5ff1c
Remove sampler from Duplicate model ( #543 )
2019-06-04 16:22:58 +02:00
Ayush Shridhar
3f2b1d4efa
Randomly choose non-duplicate bugs for Duplicate model training ( #542 )
2019-06-04 15:51:14 +02:00
Marco Castelluccio
218e100b3e
Version 0.0.42
2019-06-04 13:49:15 +02:00
Marco Castelluccio
7790f5e3d5
Use raw CSV file, not GitHub's HTML page
2019-06-04 13:08:24 +02:00
Marco Castelluccio
d57177f1e4
Fix destination path of the regressor.csv label file
2019-06-04 13:07:59 +02:00
Marco Castelluccio
b1ddef742a
Download bugs DB when the model is a BugCoupleModel too
2019-06-04 12:56:52 +02:00
Marco Castelluccio
dfbe7a5ed4
Add functions to download and extract DB support files
...
For example, the version file.
2019-06-04 12:55:03 +02:00
Marco Castelluccio
bea28a17f6
Get first pushdate from hg log on following runs of the repository mining script
...
Otherwise we'd use the pushdate of the first new commit as the first pushdate.
2019-06-04 12:52:08 +02:00
Marco Castelluccio
36d7d4449e
Store first_commit_time dict too in the experiences file
...
Otherwise on following runs of calculate_experiences we'd have wrong seniority.
2019-06-04 12:51:29 +02:00
Marco Castelluccio
86babf8222
Transform results should be available when merge_data is False too
2019-06-04 11:33:52 +02:00
Marco Castelluccio
089f8dd7ca
Don't rollback the same bug multiple times in case of bug couples
2019-06-04 11:33:15 +02:00
Marco Castelluccio
9afa655651
Increase number of duplicate and non-duplicate bugs to consider
2019-06-04 01:36:43 +02:00
Marco Castelluccio
9357b91c16
Build set of all IDs in one go
2019-06-04 01:36:03 +02:00
Marco Castelluccio
6cfe6fe8e1
Remove duplicate duplicate IDs
2019-06-04 01:34:56 +02:00
Marco Castelluccio
8e3f2d58eb
Limit the number of duplicates to consider, without leaking duplicates into non-duplicates
...
We were stopping the iteration over bugs as soon as we reached the number of duplicates we wanted.
This meant that we were considering some duplicate bugs to be non-duplicate.
2019-06-04 01:33:23 +02:00
Marco Castelluccio
17626e16d1
No need to declare non_duplicate_ids as empty list
2019-06-04 01:01:14 +02:00
Marco Castelluccio
b441709a26
Print number of labels consistently in the Duplicate model
2019-06-04 01:00:41 +02:00
Marco Castelluccio
baf8650399
Use a set for storing all IDs, and calculate non-duplicate IDs as the difference between sets of all bugs and of duplicate bugs
2019-06-04 01:00:03 +02:00
Marco Castelluccio
0a7ce5b763
No need to limit the overall number of bug IDs to consider, as long as we limit the number of duplicate bugs to consider
2019-06-04 00:57:16 +02:00
Marco Castelluccio
967a038018
Misc cleanup for the label calculation of the Duplicate model
2019-06-04 00:56:25 +02:00
Marco Castelluccio
4f77ac82e3
Don't fail when the 'cf_has_str' field is not available for a given bug
2019-06-03 23:31:53 +02:00
pyup.io bot
78c5fd7edb
Update pytest from 4.6.1 to 4.6.2 ( #538 )
2019-06-03 22:40:30 +02:00
Marco Castelluccio
9e1f32f03f
Version 0.0.41
2019-06-03 22:29:45 +02:00
Marco Castelluccio
44e26ff0e8
Add a training task for the Regressor model
2019-06-03 22:15:18 +02:00
Marco Castelluccio
2804436357
Download regressor labels from marco-c/mozilla-central-regressors repository in the train_regressor Docker image
2019-06-03 22:14:47 +02:00
Marco Castelluccio
72ddfea2e3
Add a Docker image for the task to train the Regressor model
2019-06-03 21:46:35 +02:00
Marco Castelluccio
6b99570349
Sort models by name
2019-06-03 21:46:07 +02:00
Marco Castelluccio
ab39f26c2a
Add a model to predict patches more likely to cause regressions
2019-06-03 21:45:13 +02:00
Marco Castelluccio
4ce438a35a
Fix typo in artifact name for the commits retrieval task
2019-06-03 21:37:39 +02:00
Marco Castelluccio
f397033f77
Version 0.0.40
2019-06-03 19:43:25 +02:00
Marco Castelluccio
d993ae0d15
Add more defect/enhancement/task labels gathered from changes made by users on Bugzilla
2019-06-03 19:35:28 +02:00
Marco
d8b84ca798
Support retrieving commits in steps ( #536 )
...
* Support retrieving commits in steps
* Store component mapping ETag to actually avoid downloading it again when not needed
* Store a version file alongside the DBs
* Export the commits DB version file and the experiences values as artifacts of the commit-retriever task
2019-06-03 19:29:08 +02:00