Marco Castelluccio
65bf1b4604
Only run integration test after data retrieval and training tasks are done
2019-11-02 17:49:36 +01:00
Marco Castelluccio
4b48bccab5
Make apt-get be quiet in the integration test
2019-11-02 17:08:18 +01:00
Boris Feld
807ecaca85
Misc fixes to enable integration tests at release time ( #987 )
...
Fixes #985 and fixes #329
2019-10-24 20:09:32 +02:00
Marco Castelluccio
a8866bb562
GDBM doesn't add '.db' at the end of the path
...
I had tested locally with NDBM, which adds it
2019-10-23 12:01:46 +01:00
Marco Castelluccio
4a642f215f
Add the past failures support DB to the artifacts list of the test scheduling history retrieval task
2019-10-22 17:46:15 +01:00
Marco Castelluccio
f01badfb11
Add the version file of the test scheduling history DB to the artifacts list of the retrieval task
2019-10-22 17:46:15 +01:00
Marco Castelluccio
898d911013
Fix path in Taskcluster worker to the test scheduling history DB
2019-10-22 17:46:15 +01:00
Marco Castelluccio
0cfacecb57
Fix push_data.json.zst artifact path
2019-10-20 14:04:20 +01:00
Marco Castelluccio
940e97cdcf
Be quiet when installing bugbug package in the test scheduling history push data retrieval task
2019-10-19 21:22:42 +01:00
Marco Castelluccio
86a6d0a6b9
Fix dependency name
...
Regressed by dc3c3b83da
2019-10-18 14:20:57 +01:00
Marco Castelluccio
5713425500
Use relman-svc compute for the ADR task
...
Since the tasks were split with dc3c3b83da, the ADR task is not bounded by performance yet.
2019-10-18 13:38:13 +01:00
Marco Castelluccio
dc3c3b83da
Split test scheduling history retriever task into two
2019-10-18 13:33:53 +01:00
Marco Castelluccio
7f8e08c20d
Add a task to train the test selection model
2019-10-12 17:31:28 +01:00
Marco Castelluccio
2cfd8fc01a
Try using relman-svc-compute for the test scheduling history retrieval task
2019-10-10 18:52:08 +01:00
Marco Castelluccio
6ace4d78bc
Use relman-svc-memory for the test scheduling history retriever task
2019-10-10 11:08:45 +01:00
Marco Castelluccio
251c2712ea
Train a more interpretable regressor model
2019-09-30 15:22:02 +02:00
Marco Castelluccio
16e36cb54b
Fix similarity model path
...
It's 25, not 52...
Fixes #939
2019-09-28 18:15:28 +02:00
Marco Castelluccio
2add7ecc21
Temporarily disable integration test
...
Until #985 is fixed
2019-09-27 15:33:42 +02:00
Boris Feld
5aa036c06d
Ensure the integration tests are green before deploying a new HTTP service ( #979 )
...
Fixes #949
2019-09-26 15:20:28 +02:00
Marco Castelluccio
0412d894de
Offer dataset files from the regressor model as artifacts
2019-09-25 16:56:52 +02:00
Marco Castelluccio
2054e93a1c
Generate a DB of past test runs, with their failure history
...
Also move the artifacts to be in relative directories rather than absolute
2019-09-18 19:58:13 +02:00
Marco Castelluccio
53603d4a4b
Don't use bash l option
2019-09-12 14:17:20 +02:00
Marco Castelluccio
7e084dd91a
The script is not downloaded in scripts/
2019-09-12 09:53:23 +02:00
Marco Castelluccio
833c56bb7b
Get the raw file from GitHub, not the HTML view
2019-09-12 01:45:46 +02:00
Marco Castelluccio
2492ed58b4
Add a task to retrieve test scheduling history
2019-09-11 21:16:19 +02:00
Marco
d65ba69ff3
Add a Dockerfile for tools using bugbug nlp stuff ( #934 )
...
* Add a Dockerfile for tools using bugbug nlp stuff
* Use the bugbug-base-nlp image for the similarity training task
Fixes #933
2019-09-07 00:45:47 +02:00
Boris Feld
1b4c47407b
Bump training tasks timeout ( #932 )
...
We want to keep tasks and metrics artifacts around so we can monitor their
evolution but we don't want to keep the models for too long a period of time
to reduce storage usage.
2019-09-05 11:54:15 +02:00
Ayush Shridhar
59ed555325
Add a task to train a similarity model (the BM25 one) ( #874 )
2019-09-04 14:38:33 +02:00
Marco Castelluccio
cc21b76c51
Use relman-svc instead of releng-svc
...
Also fix typo in 'svc'
2019-08-09 15:57:17 +02:00
Boris Feld
8b4cfd2dc4
Check metrics evolution ( #836 )
...
Fixes #360 and fixes #641 .
2019-08-05 10:22:55 +02:00
Marco
8aac03002a
Use relman-* workers instead of releng-svc ( #842 )
...
Fixes #324
2019-08-03 00:40:38 +02:00
Boris Feld
afd67402e2
Fix copy-paste typo with the new indexing schema ( #801 )
2019-07-28 20:38:05 +02:00
Boris Feld
a43ad03b2a
Add a new indexing schema for training tasks ( #795 )
...
In order to efficiently solve #614 , we need a new indexing schema
so getting all metrics following a given date is easy.
2019-07-26 18:28:04 +02:00
Marco Castelluccio
a614d34735
Move download of bugs linked to commits into the bug-retriever script
...
Also, make the bug-retriever task depend on the commit-retriever one, making the
download of bugs linked to commits actually work :)
2019-07-25 01:05:25 +02:00
Marco Castelluccio
66367584cd
Revert "Enable feature importance calculation for the defect/enhancement/task model"
...
This reverts commit d9cdcdc238.
It's running out of memory on releng-svc-compute workers (c5.4xlarge), so we need to temporarily disable it.
2019-07-15 15:49:28 +02:00
Anurag Aggarwal
656d6e844b
Remove bugs_retrieval image and use the base image instead in its place ( #691 )
...
* Fixes #633
2019-07-12 14:17:41 +02:00
Marco Castelluccio
d9cdcdc238
Enable feature importance calculation for the defect/enhancement/task model
2019-07-11 20:44:07 +02:00
Marco Castelluccio
17b027c767
Enable feature importance calculation at training time for the regressor model
2019-07-10 16:25:38 +02:00
Boris Feld
e7add98563
Update task-boot to 0.1.9 ( #675 )
2019-07-05 15:36:16 +02:00
Marco Castelluccio
6ce18762de
'payload.command' should be an array
2019-07-02 13:26:46 +02:00
Marco Castelluccio
d12a25f644
Upload feature visualization image as an artifact of the training tasks
2019-07-01 13:10:39 +02:00
Boris Feld
7459f79317
Use the base image for training models ( #656 )
...
Fixes #350
2019-06-29 00:01:51 +02:00
Boris Feld
d24993d0ac
Remove dependency on rollbacktest in docker build. ( #653 )
...
Fixes #651
2019-06-28 15:32:39 +02:00
Boris Feld
54e41d1497
Use taskboot 0.1.8 ( #645 )
...
The new taskboot release solves the double build on non-tag commits and
allows the heroku deploy to be fully atomic.
2019-06-28 11:11:48 +02:00
x249wang
ab28e8ace2
Use zstandard instead of xz ( #524 )
...
Fixes #461 .
2019-06-24 13:16:44 +02:00
Boris Feld
9834053a36
Start tracking training metrics as Taskcluster artifacts ( #604 )
...
Fixes #342
2019-06-22 14:18:08 -07:00
Boris Feld
27f9104fb5
Make sure the Docker build task uses the tagged code ( #610 )
...
If not, new master code might get released and conflict with the code in the
bugbug images.
Fixes #609
2019-06-21 08:20:08 -07:00
Boris Feld
c06db28442
Bump taskboot to version 1.0.7 ( #583 )
...
Now that https://github.com/mozilla/task-boot/issues/39 is fixed, let's update
task-boot version to use it.
Also add missing tags and cache option when building Docker images in
data-pipeline.yml
2019-06-12 20:11:34 +02:00
Marco Castelluccio
89b37b96ae
Upload version file too in the bugs retrieval task
2019-06-09 00:13:20 +02:00
Marco Castelluccio
353d21d01b
Clone repository quietly
2019-06-08 11:19:01 +02:00
Marco Castelluccio
4a991ac6ef
Fix download of bugs DB in the rollback test
2019-06-08 11:17:15 +02:00
Marco Castelluccio
9de91456f6
Update to taskboot 0.1.6
2019-06-07 22:03:00 +02:00
Boris Feld
a8faa48d8a
Support classifying batches of bugs with a background worker ( #321 )
2019-06-07 21:22:14 +02:00
Marco Castelluccio
82d9c0ece0
Update to taskboot 0.1.5
2019-06-07 16:47:28 +02:00
Boris Feld
2e05e57be2
Build docker images data pipeline tag ( #566 )
...
* Build the HTTP Docker image with the right tag
* Ensure the built Docker image has the right parent image
2019-06-07 16:46:05 +02:00
Boris Feld
2988700028
Use tagged index urls for pushing artifacts ( #561 )
...
* Use tagged index urls for pushing artifacts
Also replace previous code that updated Docker image tag to use JSON-e
templating instead.
2019-06-07 12:52:29 +02:00
Boris Feld
7906380e6f
Bump version of taskboot to use latest version of img tool ( #562 )
...
It is necessary to support multi-tag Docker image building
2019-06-07 12:21:09 +02:00
Marco Castelluccio
44e26ff0e8
Add a training task for the Regressor model
2019-06-03 22:15:18 +02:00
Marco Castelluccio
4ce438a35a
Fix typo in artifact name for the commits retrieval task
2019-06-03 21:37:39 +02:00
Marco
d8b84ca798
Support retrieving commits in steps ( #536 )
...
* Support retrieving commits in steps
* Store component mapping ETag to actually avoid downloading it again when not needed
* Store a version file alongside the DBs
* Export the commits DB version file and the experiences values as artifacts of the commit-retriever task
2019-06-03 19:29:08 +02:00
Marco Castelluccio
e62dd6f37d
Make rollback-test task verbose
2019-06-03 11:06:32 +02:00
Ayush Shridhar
9d71677667
Add a training task for the Duplicate model ( #525 )
2019-05-31 17:05:58 +02:00
Marco Castelluccio
bd3e4c7900
Increase the maximum runtime for the commits retrieval task
2019-05-30 13:27:23 +02:00
Marco Castelluccio
42d2ff2db8
Add a training task for the Backout model
2019-05-30 13:27:06 +02:00
Boris Feld
6ee9fb57f0
Fix Docker build by downloading the models inside the image. Fix #504 ( #516 )
...
The data pipeline failed before because it tried downloading the model from
outside the Docker image and didn't have bugbug installed.
The clean way of solving this would be to build a base http service image on
release and build another one where we simply download the models but let's
fix it this way for now.
2019-05-29 20:43:58 +02:00
Boris Feld
1bae5834ab
Implement deployment to Heroku ( #458 )
2019-05-23 20:39:02 +02:00
Ayush Shridhar
b41170baa5
Add training task for the StepsToReproduce model ( #441 )
2019-05-22 21:43:11 +02:00
Ayush Shridhar
91bf939fb7
Add training task for the RegressionRange model ( #466 )
2019-05-22 18:58:47 +02:00
Boris Feld
d3c3bcbece
Bump version of taskboot used in taskcluster and data pipeline ( #446 )
2019-05-16 13:02:58 +02:00
Marco Castelluccio
ff9ea35ed0
Reduce deadlines to maximum of 5 days
...
Taskcluster only allows up to 5 days
2019-05-14 20:39:00 +02:00
Marco
9223954520
Remove training tasks' unneeded dependencies on commit retrieval task ( #407 )
...
Fixes #390
2019-05-14 15:22:44 +02:00
Marco
c4bd01278e
Add 'expires' to all tasks to avoid them expiring in a too long time ( #393 )
...
Fixes #391 .
2019-05-12 21:46:58 +02:00
Marco
e3230ca999
Increase deadline of data pipeline tasks ( #389 )
...
Fixes #388 .
2019-05-10 16:12:46 +02:00
Marco
6f09488573
Rename mozilla/bugbug-train-defect image to mozilla/bugbug-train-defectenhancementtask ( #375 )
...
Fixes #364 .
2019-05-09 23:36:38 +02:00
Marco Castelluccio
c3f55e682a
Rename train-defect to train-defectenhancementtask
2019-05-07 13:16:22 +02:00
Marco Castelluccio
2eaf90be20
Add a cache to the commit retrieval task
...
Fixes #347
2019-05-07 11:38:02 +02:00
Boris Feld
6937e0e5e8
Add the rollback test in the data pipeline ( #337 )
...
Add the rollback test in the data pipeline and move the bug snapshot test to a pytest test
2019-05-03 14:20:43 +02:00
Marco
9995b8c236
Make training code more generic to make it possible to train on other kinds of objects (e.g. commits) ( #335 )
...
* Move feature cleanup functions in a separate module
As they can be shared for different objectives, e.g. both training on bugs and on commits.
* Make Model more generic to make it possible to train on different objects
Introduce BugModel and CommitModel, as base classes for models training on bugs and on commits.
Update all models to use BugModel and to use the new feature_cleanup module.
Fixes #306 .
* Update ID and description of the defect/enhancement/task Taskcluster task definition
* Add a module to extract features from commit data
* Add an example model training on commits to predict commits which will be backed out
* Update defect model name, and add possibility to train backout model
2019-05-03 11:57:48 +02:00
Boris Feld
297963e4ce
Skip checking models while building the http service image, and only push it as part of the pipeline ( #331 )
...
* Add a way to skip checking models while building the http service image
* Don't push the http service on release
It isn't built with the real models on release
* Use taskboot 0.1.1
2019-05-02 23:18:51 +02:00
Boris Feld
369b44ea02
Update the index URLs in bugbug ( #328 )
...
* Update the index URLs in bugbug
* Split the http service Docker image in two
This way we can both:
- Build the first half (code + dependencies) in the usual CI.
- Build the second half at the end of the data pipeline with updated models.
Taskboot build-compose doesn't support building all services except a
specific one and it might be cumbersome to add this feature so move the second
half of the Docker image to a separate docker-compose file.
2019-05-02 17:00:32 +02:00
Boris Feld
6e7ca892cd
Introduce a new Docker image for data-pipeline spawning ( #320 )
2019-05-02 14:36:50 +02:00