Commit graph

50 commits

Author SHA1 Message Date
Boris Feld afd67402e2 Fix copy-paste typo with the new indexing schema (#801) 2019-07-28 20:38:05 +02:00
Boris Feld a43ad03b2a Add a new indexing schema for training tasks (#795)
In order to efficiently solve #614, we need a new indexing schema
so getting all metrics following a given date is easy.
2019-07-26 18:28:04 +02:00
Marco Castelluccio a614d34735 Move download of bugs linked to commits in the bug-retriever script
Also, make the bug-retriever task depend on the commit-retriever one, making the
download of bugs linked to commits actually work :)
2019-07-25 01:05:25 +02:00
Marco Castelluccio 66367584cd Revert "Enable feature importance calculation for the defect/enhancement/task model"
This reverts commit d9cdcdc238.

It's running out of memory on releng-svc-compute workers (c5.4xlarge), so we need to temporarily disable it.
2019-07-15 15:49:28 +02:00
Anurag Aggarwal 656d6e844b Remove bugs_retrieval image and use the base image instead in its place (#691)
* Fixes #633
2019-07-12 14:17:41 +02:00
Marco Castelluccio d9cdcdc238 Enable feature importance calculation for the defect/enhancement/task model 2019-07-11 20:44:07 +02:00
Marco Castelluccio 17b027c767 Enable feature importance calculation at training time for the regressor model 2019-07-10 16:25:38 +02:00
Boris Feld e7add98563 Update task-boot to 0.1.9 (#675) 2019-07-05 15:36:16 +02:00
Marco Castelluccio 6ce18762de 'payload.command' should be an array 2019-07-02 13:26:46 +02:00
Marco Castelluccio d12a25f644 Upload feature visualization image as an artifact of the training tasks 2019-07-01 13:10:39 +02:00
Boris Feld 7459f79317 Use the base image for training models (#656)
Fixes #350
2019-06-29 00:01:51 +02:00
Boris Feld d24993d0ac Remove dependency on rollbacktest in docker build. (#653)
Fixes #651
2019-06-28 15:32:39 +02:00
Boris Feld 54e41d1497 Use taskboot 0.1.8 (#645)
The new taskboot release solves the double build on non-tag commits and
allows the heroku deploy to be fully atomic.
2019-06-28 11:11:48 +02:00
x249wang ab28e8ace2 Use zstandard instead of xz (#524)
Fixes #461.
2019-06-24 13:16:44 +02:00
Boris Feld 9834053a36 Start tracking training metrics as Taskcluster artifacts (#604)
Fixes #342
2019-06-22 14:18:08 -07:00
Boris Feld 27f9104fb5 Make sure the Docker build task uses the tagged code (#610)
If not, new master code might get released and conflict with the code in the
bugbug images.

Fixes #609
2019-06-21 08:20:08 -07:00
Boris Feld c06db28442 Bump taskboot to version 1.0.7 (#583)
Now that https://github.com/mozilla/task-boot/issues/39 is fixed, let's update
task-boot version to use it.

Also add missing tags and cache option when building Docker images in
data-pipeline.yml
2019-06-12 20:11:34 +02:00
Marco Castelluccio 89b37b96ae Upload version file too in the bugs retrieval task 2019-06-09 00:13:20 +02:00
Marco Castelluccio 353d21d01b Clone repository quietly 2019-06-08 11:19:01 +02:00
Marco Castelluccio 4a991ac6ef Fix download of bugs DB in the rollback test 2019-06-08 11:17:15 +02:00
Marco Castelluccio 9de91456f6 Update to taskboot 0.1.6 2019-06-07 22:03:00 +02:00
Boris Feld a8faa48d8a Support classifying batches of bugs with a background worker (#321) 2019-06-07 21:22:14 +02:00
Marco Castelluccio 82d9c0ece0 Update to taskboot 0.1.5 2019-06-07 16:47:28 +02:00
Boris Feld 2e05e57be2 Build docker images data pipeline tag (#566)
* Build the HTTP Docker image with the right tag

* Ensure the built Docker image has the right parent image
2019-06-07 16:46:05 +02:00
Boris Feld 2988700028 Use tagged index urls for pushing artifacts (#561)
* Use tagged index urls for pushing artifacts

Also replace previous code that updated Docker image tag to use JSON-e
templating instead.
2019-06-07 12:52:29 +02:00
Boris Feld 7906380e6f Bump version of taskboot to use latest version of img tool (#562)
It is necessary to support multi-tag Docker image building
2019-06-07 12:21:09 +02:00
Marco Castelluccio 44e26ff0e8 Add a training task for the Regressor model 2019-06-03 22:15:18 +02:00
Marco Castelluccio 4ce438a35a Fix typo in artifact name for the commits retrieval task 2019-06-03 21:37:39 +02:00
Marco d8b84ca798 Support retrieving commits in steps (#536)
* Support retrieving commits in steps

* Store component mapping ETag to actually avoid downloading it again when not needed

* Store a version file alongside the DBs

* Export the commits DB version file and the experiences values as artifacts of the commit-retriever task
2019-06-03 19:29:08 +02:00
Marco Castelluccio e62dd6f37d Make rollback-test task verbose 2019-06-03 11:06:32 +02:00
Ayush Shridhar 9d71677667 Add a training task for the Duplicate model (#525) 2019-05-31 17:05:58 +02:00
Marco Castelluccio bd3e4c7900 Increase the maximum runtime for the commits retrieval task 2019-05-30 13:27:23 +02:00
Marco Castelluccio 42d2ff2db8 Add a training task for the Backout model 2019-05-30 13:27:06 +02:00
Boris Feld 6ee9fb57f0 Fix Docker build by downloading the models inside the image. Fix #504 (#516)
The data pipeline failed before because it tried downloading the model from
outside the Docker image and didn't have bugbug installed.

The clean way of solving this would be to build a base http service image on
release and build another one where we simply download the models but let's
fix it this way for now.
2019-05-29 20:43:58 +02:00
Boris Feld 1bae5834ab Implement deployment to Heroku (#458) 2019-05-23 20:39:02 +02:00
Ayush Shridhar b41170baa5 Add training task for the StepsToReproduce model (#441) 2019-05-22 21:43:11 +02:00
Ayush Shridhar 91bf939fb7 Add training task for the RegressionRange model (#466) 2019-05-22 18:58:47 +02:00
Boris Feld d3c3bcbece Bump version of taskboot used in taskcluster and data pipeline (#446) 2019-05-16 13:02:58 +02:00
Marco Castelluccio ff9ea35ed0 Reduce deadlines to maximum of 5 days
Taskcluster only allows up to 5 days
2019-05-14 20:39:00 +02:00
Marco 9223954520 Remove training tasks' unneeded dependencies on commit retrieval task (#407)
Fixes #390
2019-05-14 15:22:44 +02:00
Marco c4bd01278e Add 'expires' to all tasks to avoid them expiring in a too long time (#393)
Fixes #391.
2019-05-12 21:46:58 +02:00
Marco e3230ca999 Increase deadline of data pipeline tasks (#389)
Fixes #388.
2019-05-10 16:12:46 +02:00
Marco 6f09488573 Rename mozilla/bugbug-train-defect image to mozilla/bugbug-train-defectenhancementtask (#375)
Fixes #364.
2019-05-09 23:36:38 +02:00
Marco Castelluccio c3f55e682a Rename train-defect to train-defectenhancementtask 2019-05-07 13:16:22 +02:00
Marco Castelluccio 2eaf90be20 Add a cache to the commit retrieval task
Fixes #347
2019-05-07 11:38:02 +02:00
Boris Feld 6937e0e5e8 Add the rollback test in the data pipeline (#337)
Add the rollback test in the data pipeline and move the bug snapshot test to a pytest test
2019-05-03 14:20:43 +02:00
Marco 9995b8c236 Make training code more generic to make it possible to train on other kinds of objects (e.g. commits) (#335)
* Move feature cleanup functions in a separate module

As they can be shared for different objectives, e.g. both training on bugs and on commits.

* Make Model more generic to make it possible to train on different objects

Introduce BugModel and CommitModel, as base classes for models training on bugs and on commits.

Update all models to use BugModel and to use the new feature_cleanup module.

Fixes #306.

* Update ID and description of the defect/enhancement/task Taskcluster task definition

* Add a module to extract features from commit data

* Add an example model training on commits to predict commits which will be backed out

* Update defect model name, and add possibility to train backout model
2019-05-03 11:57:48 +02:00
Boris Feld 297963e4ce Skip checking models while building the http service image, and only push it as part of the pipeline (#331)
* Add a way to skip checking models while building the http service image

* Don't push the http service on release

It isn't built with the real models on release

* Use taskboot 0.1.1
2019-05-02 23:18:51 +02:00
Boris Feld 369b44ea02 Update the index URLs in bugbug (#328)
* Update the index URLs in bugbug

* Split the http service Docker image in two

This way we can both:
- Build the first half (code + dependencies) in the usual CI.
- Build the second half at the end of the data pipeline with updated models.

Taskboot build-compose doesn't support building all services except a
specific one and it might be cumbersome to add this feature so move the second
half of the Docker image to a separate docker-compose file.
2019-05-02 17:00:32 +02:00
Boris Feld 6e7ca892cd Introduce a new Docker image for data-pipeline spawning (#320) 2019-05-02 14:36:50 +02:00