Commit graph

64 commits

Author SHA1 Message Date
Marco c087dc9b1c Use MODELS in run.py instead of redefining list of available models (#473) 2019-05-23 18:16:17 +02:00
Ayush Shridhar 24c805e64e Add a RegressionRange model (#449) 2019-05-18 13:03:57 +02:00
Marco 8a5795417a Add a pre-commit hook using codespell (#411)
* Add a pre-commit hook using codespell

Fixes #410

* Fix some spelling mistakes
2019-05-16 17:24:18 +02:00
Boris Feld 0a5e37439d Add a central place where the models are defined (#398)
* Add a central place where the models are defined

Also add some helpers to load a model.

* Add missing tensorflow dependency in extra-nn-requirements.txt
2019-05-16 15:34:38 +02:00
Ayush Shridhar d5ba1b20df Add initial implementation of a 'Steps to Reproduce' model (#423) 2019-05-15 14:25:57 +02:00
Ayush Shridhar add9a937b3 Multilabel classifier for detecting type of bug (#395) 2019-05-14 12:17:53 +02:00
Marco 9995b8c236 Make training code more generic to make it possible to train on other kinds of objects (e.g. commits) (#335)
* Move feature cleanup functions in a separate module

As they can be shared for different objectives, e.g. both training on bugs and on commits.

* Make Model more generic to make it possible to train on different objects

Introduce BugModel and CommitModel, as base classes for models training on bugs and on commits.

Update all models to use BugModel and to use the new feature_cleanup module.

Fixes #306.

* Update ID and description of the defect/enhancement/task Taskcluster task definition

* Add a module to extract features from commit data

* Add an example model training on commits to predict commits which will be backed out

* Update defect model name, and add possibility to train backout model
2019-05-03 11:57:48 +02:00
Boris Feld 053954d70b Run pre-commit in the lint task (#297) 2019-04-19 18:01:24 +02:00
Assiya Khuzyakhmetova 0440989b18 Add option to analyze 'historical' bugs in the Bug model (#261) 2019-04-12 19:44:33 +02:00
Boris Feld bad6a50d8b Pre commit setup (#252)
* Add pre-commit configuration

Add auto-formatting configuration using the https://pre-commit.com/ project.
Having auto-formatting setup and automatically enforced helps speeding up
development and review process.

* Apply the auto-formatting on all files in the repository

* Removes flake8-quotes as it conflicts with Black formatting

* Disable some Flake8 rules

Disable Flake8 rules that are handled by Black. The list comes from
https://github.com/ambv/black/issues/429#issuecomment-472687803.
2019-04-09 15:57:29 +02:00
Marco Castelluccio d1cbaf6575 Change nomenclature (feature -> enhancement) everywhere to avoid confusion 2019-04-05 16:09:15 +02:00
Marco Castelluccio 0a3f23a64e Add a 'token' argument for when we need to download bugs from Bugzilla 2019-04-04 22:14:53 +02:00
Assiya Khuzyakhmetova 1a1bcf2c3a Add assignee model to run.py (#242) 2019-04-03 11:50:38 +02:00
Marco Castelluccio 97ec49624d Remove dashes from defect_feature_task goal name 2019-02-28 20:57:43 +01:00
Marco Castelluccio 19abd72091 Make download_bugs_between return downloaded bugs 2019-02-28 20:56:41 +01:00
John Giannelos 24cacd673c Fix naming when loading model file (#180) 2019-02-18 16:57:01 +01:00
John Giannelos 38b0778853 Add alternative component model using a neural network (#169) 2019-02-18 00:37:10 +01:00
Marco Castelluccio 3b6c16f704 Add a model to distinguish between defects, feature/enhancement request, and task 2019-02-13 23:20:12 +01:00
Yatin Maan b121cc1976 Only show "meaningful enough" features (#78) 2019-01-29 21:39:06 +01:00
Marco Castelluccio a0a59ee4c5 Add an argument to run.py to generate a sheet with evaluation on bugs from the previous week 2019-01-25 15:32:44 +01:00
Marco Castelluccio 421f3f1043 Add a dev-doc-needed model
Fixes #79
2019-01-23 16:17:12 +01:00
Subhajit Das fdb5488f42 Format with f-strings instead of .format (#85) 2019-01-20 22:30:34 +01:00
Marco Castelluccio 18c27879f0 Print most important features in the run script 2019-01-14 23:48:51 +01:00
Marco Castelluccio 37ceb89266 Add a model to classify the product/component of a bug 2019-01-03 14:50:30 +01:00
Marco Castelluccio dbf992f696 Only download the prefilled DB when training 2019-01-02 15:26:43 +01:00
Marco Castelluccio 100205c4fa Always download required data for training 2019-01-02 15:26:43 +01:00
Marco Castelluccio beefa23382 Make download_bugs a normal function to avoid footguns 2018-12-22 00:33:57 +01:00
Marco Castelluccio 7e64a77479 Consume iterators returned from bugzilla.download_bugs, so that bugs are actually downloaded 2018-12-17 23:37:51 +01:00
Marco Castelluccio b0c1b0b913 Avoid '.model' extension 2018-12-14 00:33:36 +01:00
Marco Castelluccio 9df8ed0a86 Perform classification only when the user asks to 2018-12-13 23:23:41 +01:00
Marco Castelluccio 368c7cb3d8 Download data from Bugzilla without using the search API.
At least until https://bugzilla.mozilla.org/show_bug.cgi?id=1508695 is fixed.
2018-12-13 23:03:46 +01:00
Marco Castelluccio 289ff7bf92 Add an 'uplift' model 2018-12-13 12:12:42 +01:00
Marco Castelluccio 62d3c2a44f Fix model file name 2018-12-12 18:13:19 +01:00
Marco Castelluccio 91438a3124 Refactoring to make it possible to have different extraction pipeline and classifier for each model 2018-12-12 10:47:26 +01:00
Ayush Shridhar df9f06d9f5 Add qaneeded option to run.py (#27) 2018-12-09 23:56:30 +01:00
Marco Castelluccio 0eff7b9bd9 Add function to retrieve all bug IDs from all label files and use it to add an option to download required files for training 2018-11-22 11:43:20 +01:00
Marco Castelluccio 4eca461339 Make it possible to train the classifier for different goals 2018-11-22 00:27:58 +01:00
Marco Castelluccio 30ee98533b Add a module to perform classification 2018-11-21 12:40:57 +01:00
Marco Castelluccio 1f5fa66957 Add script to run the training 2018-11-20 16:47:24 +01:00
Marco Castelluccio cdf84f685b Move run.py to a 'train' module in bugbug 2018-11-20 16:47:08 +01:00
Marco Castelluccio 64d5a0c5ec Rename get_labels to get_bugbug_labels since labels can now include multiple kinds of labels
Former-commit-id: db485d1d25c50438d47932e25d94baeb1e90323b
2018-11-20 01:20:38 +01:00
Marco Castelluccio 13eed862f6 Move Python modules to a 'bugbug' subdirectory
Former-commit-id: d1db546fb0
2018-11-19 22:02:31 +01:00
Marco Castelluccio bcc33779b9 Add commit data to bugs, but don't use it yet (doesn't improve results)
Former-commit-id: 554ae35320
2018-11-12 17:55:41 +01:00
Marco Castelluccio f741d77717 Refactor get_bugs code into multiple modules
Former-commit-id: 466aa8446b
2018-10-12 00:46:14 +02:00
Marco Castelluccio bfec0f94c6 Add another TODO
Former-commit-id: 335cd747a8
2018-10-11 22:43:21 +02:00
Marco Castelluccio 69ac23fe50 Read bugs iteratively
Former-commit-id: 581f1bc8b3
2018-10-11 22:42:34 +02:00
Marco Castelluccio 84a2b08381 No need to retrieve both keys and values to generate y
Former-commit-id: 3a20373872
2018-10-11 20:13:21 +02:00
Marco Castelluccio 887514cafa Perform augmentation directly when retrieving labels
Former-commit-id: 8d0d63403b
2018-10-11 20:12:43 +02:00
Marco Castelluccio 8492939b08 Skip labels for which we have no bug data directly in get_labels
Former-commit-id: c5075c7e84
2018-10-11 20:04:26 +02:00
Marco Castelluccio 7462bbea6b Split downloading of bugs and retrieval of bugs for training
Former-commit-id: 67c300263f
2018-10-11 19:40:53 +02:00
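
Note on commits 7e64a77479 and beefa23382: both concern a download function whose work did not happen unless its returned iterator was consumed. The sketch below illustrates that general Python pitfall. It is not the repository's code: the function names, the `store` dictionary, and the fake "fetch" are all hypothetical stand-ins chosen for illustration.

```python
def download_bugs_lazy(bug_ids, store):
    """Hypothetical generator version: the body runs only when iterated."""
    for bug_id in bug_ids:
        store[bug_id] = {"id": bug_id}  # stand-in for a real network fetch
        yield store[bug_id]


def download_bugs_eager(bug_ids, store):
    """Hypothetical plain-function version: the work happens on call."""
    for bug_id in bug_ids:
        store[bug_id] = {"id": bug_id}  # stand-in for a real network fetch
    return [store[bug_id] for bug_id in bug_ids]


# Calling the generator and discarding the result downloads nothing.
lazy_store = {}
download_bugs_lazy([1, 2, 3], lazy_store)

# Consuming the iterator forces the downloads to run.
consumed_store = {}
list(download_bugs_lazy([1, 2, 3], consumed_store))

# A normal function does the work even if the return value is ignored.
eager_store = {}
download_bugs_eager([1, 2, 3], eager_store)
```

Under these assumptions, the first call leaves `lazy_store` empty, while the other two populate their stores. This is presumably the kind of "footgun" the later commit avoids by making the function a normal one.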