Commit Graph

56 Commits

Author SHA1 Message Date
Assiya Khuzyakhmetova 0440989b18 Add option to analyze 'historical' bugs in the Bug model (#261) 2019-04-12 19:44:33 +02:00
Boris Feld bad6a50d8b Pre commit setup (#252)
* Add pre-commit configuration

Add auto-formatting configuration using the https://pre-commit.com/ project.
Having auto-formatting setup and automatically enforced helps speeding up
development and review process.

* Apply the auto-formatting on all files in the repository

* Removes flake8-quotes as it conflicts with Black formatting

* Disable some Flake8 rules

Disable Flake8 rules that are handled by Black. The list comes from
https://github.com/ambv/black/issues/429#issuecomment-472687803.
2019-04-09 15:57:29 +02:00
Marco Castelluccio d1cbaf6575 Change nomenclature (feature -> enhancement) everywhere to avoid confusion 2019-04-05 16:09:15 +02:00
Marco Castelluccio 0a3f23a64e Add a 'token' argument for when we need to download bugs from Bugzilla 2019-04-04 22:14:53 +02:00
Assiya Khuzyakhmetova 1a1bcf2c3a Add assignee model to run.py (#242) 2019-04-03 11:50:38 +02:00
Marco Castelluccio 97ec49624d Remove dashes from defect_feature_task goal name 2019-02-28 20:57:43 +01:00
Marco Castelluccio 19abd72091 Make download_bugs_between return downloaded bugs 2019-02-28 20:56:41 +01:00
John Giannelos 24cacd673c Fix naming when loading model file (#180) 2019-02-18 16:57:01 +01:00
John Giannelos 38b0778853 Add alternative component model using a neural network (#169) 2019-02-18 00:37:10 +01:00
Marco Castelluccio 3b6c16f704 Add a model to distinguish between defects, feature/enhancement request, and task 2019-02-13 23:20:12 +01:00
Yatin Maan b121cc1976 Only show "meaningful enough" features (#78) 2019-01-29 21:39:06 +01:00
Marco Castelluccio a0a59ee4c5 Add an argument to run.py to generate a sheet with evaluation on bugs from the previous week 2019-01-25 15:32:44 +01:00
Marco Castelluccio 421f3f1043 Add a dev-doc-needed model
Fixes #79
2019-01-23 16:17:12 +01:00
Subhajit Das fdb5488f42 Format with f-strings instead of .format (#85) 2019-01-20 22:30:34 +01:00
Marco Castelluccio 18c27879f0 Print most important features in the run script 2019-01-14 23:48:51 +01:00
Marco Castelluccio 37ceb89266 Add a model to classify the product/component of a bug 2019-01-03 14:50:30 +01:00
Marco Castelluccio dbf992f696 Only download the prefilled DB when training 2019-01-02 15:26:43 +01:00
Marco Castelluccio 100205c4fa Always download required data for training 2019-01-02 15:26:43 +01:00
Marco Castelluccio beefa23382 Make download_bugs a normal function to avoid footguns 2018-12-22 00:33:57 +01:00
Marco Castelluccio 7e64a77479 Consume iterators returned from bugzilla.download_bugs, so that bugs are actually downloaded 2018-12-17 23:37:51 +01:00
Marco Castelluccio b0c1b0b913 Avoid '.model' extension 2018-12-14 00:33:36 +01:00
Marco Castelluccio 9df8ed0a86 Perform classification only when the user asks to 2018-12-13 23:23:41 +01:00
Marco Castelluccio 368c7cb3d8 Download data from Bugzilla without using the search API.
At least until https://bugzilla.mozilla.org/show_bug.cgi?id=1508695 is fixed.
2018-12-13 23:03:46 +01:00
Marco Castelluccio 289ff7bf92 Add an 'uplift' model 2018-12-13 12:12:42 +01:00
Marco Castelluccio 62d3c2a44f Fix model file name 2018-12-12 18:13:19 +01:00
Marco Castelluccio 91438a3124 Refactoring to make it possible to have different extraction pipeline and classifier for each model 2018-12-12 10:47:26 +01:00
Ayush Shridhar df9f06d9f5 Add qaneeded option to run.py (#27) 2018-12-09 23:56:30 +01:00
Marco Castelluccio 0eff7b9bd9 Add function to retrieve all bug IDs from all label files and use it to add an option to download required files for training 2018-11-22 11:43:20 +01:00
Marco Castelluccio 4eca461339 Make it possible to train the classifier for different goals 2018-11-22 00:27:58 +01:00
Marco Castelluccio 30ee98533b Add a module to perform classification 2018-11-21 12:40:57 +01:00
Marco Castelluccio 1f5fa66957 Add script to run the training 2018-11-20 16:47:24 +01:00
Marco Castelluccio cdf84f685b Move run.py to a 'train' module in bugbug 2018-11-20 16:47:08 +01:00
Marco Castelluccio 64d5a0c5ec Rename get_labels to get_bugbug_labels since labels can now include multiple kinds of labels
Former-commit-id: db485d1d25c50438d47932e25d94baeb1e90323b
2018-11-20 01:20:38 +01:00
Marco Castelluccio 13eed862f6 Move Python modules to a 'bugbug' subdirectory
Former-commit-id: d1db546fb0
2018-11-19 22:02:31 +01:00
Marco Castelluccio bcc33779b9 Add commit data to bugs, but don't use it yet (doesn't improve results)
Former-commit-id: 554ae35320
2018-11-12 17:55:41 +01:00
Marco Castelluccio f741d77717 Refactor get_bugs code into multiple modules
Former-commit-id: 466aa8446b
2018-10-12 00:46:14 +02:00
Marco Castelluccio bfec0f94c6 Add another TODO
Former-commit-id: 335cd747a8
2018-10-11 22:43:21 +02:00
Marco Castelluccio 69ac23fe50 Read bugs iteratively
Former-commit-id: 581f1bc8b3
2018-10-11 22:42:34 +02:00
Marco Castelluccio 84a2b08381 No need to retrieve both keys and values to generate y
Former-commit-id: 3a20373872
2018-10-11 20:13:21 +02:00
Marco Castelluccio 887514cafa Perform augmentation directly when retrieving labels
Former-commit-id: 8d0d63403b
2018-10-11 20:12:43 +02:00
Marco Castelluccio 8492939b08 Skip labels for which we have no bug data directly in get_labels
Former-commit-id: c5075c7e84
2018-10-11 20:04:26 +02:00
Marco Castelluccio 7462bbea6b Split downloading of bugs and retrieval of bugs for training
Former-commit-id: 67c300263f
2018-10-11 19:40:53 +02:00
Marco Castelluccio 31a487cb16 Add doc2vec to the list of things to try
Former-commit-id: a1a99b8ea4
2018-10-01 02:41:34 +02:00
Marco Castelluccio 0ce2382213 Add TODO about text cleanup
Former-commit-id: 58383767fc
2018-10-01 02:41:06 +02:00
Marco Castelluccio 5f8d64aa7f Add optional lemmatization using spaCy
Former-commit-id: 5262e12663
2018-10-01 02:40:06 +02:00
Marco Castelluccio f50971f7f9 Fix flake8 issues
Former-commit-id: 3eb73e0948
2018-10-01 01:22:59 +02:00
Marco Castelluccio 7dee06e724 Refactor code in a function
Former-commit-id: bb7931dade
2018-10-01 01:20:26 +02:00
Marco Castelluccio 2df82f0a1d Set a fixed seed for the random under-sampler, so we get consistent results
Former-commit-id: da8d6fc7b5
2018-09-24 23:08:49 +01:00
Marco Castelluccio 400b04858a Small cleanup by making get_labels directly return a dict
Former-commit-id: 1995e0ccb0
2018-09-24 14:14:15 +01:00
Marco Castelluccio 540b7ebaa7 Perform under-sampling of the majority class
Former-commit-id: 8d3c7c3ba4
2018-09-24 00:11:49 +01:00