Assiya Khuzyakhmetova
0440989b18
Add option to analyze 'historical' bugs in the Bug model (#261)
2019-04-12 19:44:33 +02:00
Boris Feld
bad6a50d8b
Pre commit setup (#252)
* Add pre-commit configuration
Add auto-formatting configuration using the https://pre-commit.com/ project.
Having auto-formatting set up and automatically enforced helps speed up
the development and review process.
* Apply the auto-formatting on all files in the repository
* Remove flake8-quotes as it conflicts with Black formatting
* Disable some Flake8 rules
Disable Flake8 rules that are handled by Black. The list comes from
https://github.com/ambv/black/issues/429#issuecomment-472687803.
2019-04-09 15:57:29 +02:00
Marco Castelluccio
d1cbaf6575
Change nomenclature (feature -> enhancement) everywhere to avoid confusion
2019-04-05 16:09:15 +02:00
Marco Castelluccio
0a3f23a64e
Add a 'token' argument for when we need to download bugs from Bugzilla
2019-04-04 22:14:53 +02:00
Assiya Khuzyakhmetova
1a1bcf2c3a
Add assignee model to run.py (#242)
2019-04-03 11:50:38 +02:00
Marco Castelluccio
97ec49624d
Remove dashes from defect_feature_task goal name
2019-02-28 20:57:43 +01:00
Marco Castelluccio
19abd72091
Make download_bugs_between return downloaded bugs
2019-02-28 20:56:41 +01:00
John Giannelos
24cacd673c
Fix naming when loading model file (#180)
2019-02-18 16:57:01 +01:00
John Giannelos
38b0778853
Add alternative component model using a neural network (#169)
2019-02-18 00:37:10 +01:00
Marco Castelluccio
3b6c16f704
Add a model to distinguish between defects, feature/enhancement requests, and tasks
2019-02-13 23:20:12 +01:00
Yatin Maan
b121cc1976
Only show "meaningful enough" features (#78)
2019-01-29 21:39:06 +01:00
Marco Castelluccio
a0a59ee4c5
Add an argument to run.py to generate a sheet with evaluation on bugs from the previous week
2019-01-25 15:32:44 +01:00
Marco Castelluccio
421f3f1043
Add a dev-doc-needed model
Fixes #79
2019-01-23 16:17:12 +01:00
Subhajit Das
fdb5488f42
Format with f-strings instead of .format (#85)
2019-01-20 22:30:34 +01:00
Marco Castelluccio
18c27879f0
Print most important features in the run script
2019-01-14 23:48:51 +01:00
Marco Castelluccio
37ceb89266
Add a model to classify the product/component of a bug
2019-01-03 14:50:30 +01:00
Marco Castelluccio
dbf992f696
Only download the prefilled DB when training
2019-01-02 15:26:43 +01:00
Marco Castelluccio
100205c4fa
Always download required data for training
2019-01-02 15:26:43 +01:00
Marco Castelluccio
beefa23382
Make download_bugs a normal function to avoid footguns
2018-12-22 00:33:57 +01:00
Marco Castelluccio
7e64a77479
Consume iterators returned from bugzilla.download_bugs, so that bugs are actually downloaded
2018-12-17 23:37:51 +01:00
Marco Castelluccio
b0c1b0b913
Avoid '.model' extension
2018-12-14 00:33:36 +01:00
Marco Castelluccio
9df8ed0a86
Perform classification only when the user asks to
2018-12-13 23:23:41 +01:00
Marco Castelluccio
368c7cb3d8
Download data from Bugzilla without using the search API.
At least until https://bugzilla.mozilla.org/show_bug.cgi?id=1508695 is fixed.
2018-12-13 23:03:46 +01:00
Marco Castelluccio
289ff7bf92
Add an 'uplift' model
2018-12-13 12:12:42 +01:00
Marco Castelluccio
62d3c2a44f
Fix model file name
2018-12-12 18:13:19 +01:00
Marco Castelluccio
91438a3124
Refactoring to make it possible to have a different extraction pipeline and classifier for each model
2018-12-12 10:47:26 +01:00
Ayush Shridhar
df9f06d9f5
Add qaneeded option to run.py (#27)
2018-12-09 23:56:30 +01:00
Marco Castelluccio
0eff7b9bd9
Add function to retrieve all bug IDs from all label files and use it to add an option to download required files for training
2018-11-22 11:43:20 +01:00
Marco Castelluccio
4eca461339
Make it possible to train the classifier for different goals
2018-11-22 00:27:58 +01:00
Marco Castelluccio
30ee98533b
Add a module to perform classification
2018-11-21 12:40:57 +01:00
Marco Castelluccio
1f5fa66957
Add script to run the training
2018-11-20 16:47:24 +01:00
Marco Castelluccio
cdf84f685b
Move run.py to a 'train' module in bugbug
2018-11-20 16:47:08 +01:00
Marco Castelluccio
64d5a0c5ec
Rename get_labels to get_bugbug_labels since labels can now include multiple kinds of labels
Former-commit-id: db485d1d25c50438d47932e25d94baeb1e90323b
2018-11-20 01:20:38 +01:00
Marco Castelluccio
13eed862f6
Move Python modules to a 'bugbug' subdirectory
Former-commit-id: d1db546fb0
2018-11-19 22:02:31 +01:00
Marco Castelluccio
bcc33779b9
Add commit data to bugs, but don't use it yet (doesn't improve results)
Former-commit-id: 554ae35320
2018-11-12 17:55:41 +01:00
Marco Castelluccio
f741d77717
Refactor get_bugs code into multiple modules
Former-commit-id: 466aa8446b
2018-10-12 00:46:14 +02:00
Marco Castelluccio
bfec0f94c6
Add another TODO
Former-commit-id: 335cd747a8
2018-10-11 22:43:21 +02:00
Marco Castelluccio
69ac23fe50
Read bugs iteratively
Former-commit-id: 581f1bc8b3
2018-10-11 22:42:34 +02:00
Marco Castelluccio
84a2b08381
No need to retrieve both keys and values to generate y
Former-commit-id: 3a20373872
2018-10-11 20:13:21 +02:00
Marco Castelluccio
887514cafa
Perform augmentation directly when retrieving labels
Former-commit-id: 8d0d63403b
2018-10-11 20:12:43 +02:00
Marco Castelluccio
8492939b08
Skip labels for which we have no bug data directly in get_labels
Former-commit-id: c5075c7e84
2018-10-11 20:04:26 +02:00
Marco Castelluccio
7462bbea6b
Split downloading of bugs and retrieval of bugs for training
Former-commit-id: 67c300263f
2018-10-11 19:40:53 +02:00
Marco Castelluccio
31a487cb16
Add doc2vec to the list of things to try
Former-commit-id: a1a99b8ea4
2018-10-01 02:41:34 +02:00
Marco Castelluccio
0ce2382213
Add TODO about text cleanup
Former-commit-id: 58383767fc
2018-10-01 02:41:06 +02:00
Marco Castelluccio
5f8d64aa7f
Add optional lemmatization using spaCy
Former-commit-id: 5262e12663
2018-10-01 02:40:06 +02:00
Marco Castelluccio
f50971f7f9
Fix flake8 issues
Former-commit-id: 3eb73e0948
2018-10-01 01:22:59 +02:00
Marco Castelluccio
7dee06e724
Refactor code in a function
Former-commit-id: bb7931dade
2018-10-01 01:20:26 +02:00
Marco Castelluccio
2df82f0a1d
Set a fixed seed for the random under-sampler, so we get consistent results
Former-commit-id: da8d6fc7b5
2018-09-24 23:08:49 +01:00
Marco Castelluccio
400b04858a
Small cleanup by making get_labels directly return a dict
Former-commit-id: 1995e0ccb0
2018-09-24 14:14:15 +01:00
Marco Castelluccio
540b7ebaa7
Perform under-sampling of the majority class
Former-commit-id: 8d3c7c3ba4
2018-09-24 00:11:49 +01:00