Platform for Machine Learning projects on Software Engineering
Marco Castelluccio 67c300263f Split downloading of bugs and retrieval of bugs for training 2018-10-11 19:40:53 +02:00
data Don't use MongoDB 2018-10-11 15:43:44 +02:00
.gitignore First commit 2018-03-11 20:12:35 +00:00
.isort.cfg Enable several flake8 checkers 2018-09-21 16:45:04 +02:00
.travis.yml Add Travis CI configuration 2018-09-22 02:01:35 +02:00
LICENSE First commit 2018-03-11 20:12:35 +00:00
README.md Update number of bugs in the dataset 2018-10-11 17:29:20 +02:00
bug_features.py Add a ML-based classifier 2018-09-22 02:10:21 +02:00
bugbug.py Fix flake8 issues in bugbug.py 2018-09-22 02:00:36 +02:00
classes.csv First commit 2018-03-11 20:12:35 +00:00
classes_more.csv Add more labelled bugs 2018-10-11 14:04:41 +02:00
get_bugs.py Split downloading of bugs and retrieval of bugs for training 2018-10-11 19:40:53 +02:00
handwritten_rules_run.py Split downloading of bugs and retrieval of bugs for training 2018-10-11 19:40:53 +02:00
requirements.txt Don't use MongoDB 2018-10-11 15:43:44 +02:00
run.py Split downloading of bugs and retrieval of bugs for training 2018-10-11 19:40:53 +02:00
setup.cfg Enable several flake8 checkers 2018-09-21 16:45:04 +02:00
test-requirements.txt Enable several flake8 checkers 2018-09-21 16:45:04 +02:00
utils.py Add a ML-based classifier 2018-09-22 02:10:21 +02:00

README.md

bugbug - Classify Bugzilla bugs between actual bugs and bugs that aren't bugs

Bugs on Bugzilla aren't always bugs. Sometimes they are feature requests, refactorings, and so on. The aim of this project is to distinguish between bugs that are actually bugs and bugs that aren't.

The dataset currently contains 2110 bugs. The current classifier reaches an accuracy of ~92% (precision ~97%, recall ~93%).
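For readers unfamiliar with the three figures, here is a minimal sketch of the standard definitions, computed from confusion-matrix counts. It assumes "actual bug" is the positive class, which the README does not state. The function name and the counts are hypothetical and are not the project's results.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives. Assumption: the positive class is "actual bug".
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total   # share of all predictions that are right
    precision = tp / (tp + fp)     # share of predicted bugs that are real bugs
    recall = tp / (tp + fn)        # share of real bugs that were found
    return accuracy, precision, recall


# Made-up counts for illustration only; they are not taken from the dataset.
accuracy, precision, recall = classification_metrics(tp=90, fp=5, fn=10, tn=95)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
```

Precision and recall are reported alongside accuracy because accuracy alone can look good on an imbalanced dataset even when one class is handled poorly.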

Setup

  1. Run pip install -r requirements.txt and pip install -r test-requirements.txt
  2. Run cat data/bugs.json.xz.part* | unxz > data/bugs.json
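On systems without unxz, step 2 can be approximated in Python. This is a minimal sketch, not part of the repository: the function name `reassemble` is hypothetical, and it assumes the parts were produced from a single xz stream and that sorting their names gives the right order, as a shell glob would.

```python
import glob
import lzma

CHUNK_SIZE = 1 << 20  # read 1 MiB at a time to keep memory use low


def reassemble(prefix="data/bugs.json.xz.part", output="data/bugs.json"):
    """Concatenate the split parts in name order and decompress them.

    Hypothetical helper mirroring `cat prefix* | unxz > output`.
    Returns the list of part files that were read.
    """
    parts = sorted(glob.glob(prefix + "*"))
    if not parts:
        raise FileNotFoundError(f"no files match {prefix}*")

    # One decompressor is shared across all parts, because the parts are
    # slices of one compressed stream rather than independent archives.
    decompressor = lzma.LZMADecompressor()
    with open(output, "wb") as out:
        for part in parts:
            with open(part, "rb") as f:
                for chunk in iter(lambda: f.read(CHUNK_SIZE), b""):
                    out.write(decompressor.decompress(chunk))
    return parts
```

Calling `reassemble()` from the repository root with the default arguments would write data/bugs.json.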

If you update the bugs database, recompress it and split it into parts again by running: cat data/bugs.json | xz -v9 - | split -d -b 20MB - data/bugs.json.xz.part
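The compress-and-split command can be sketched in Python as well. This is only an illustration under stated assumptions: the function name `compress_and_split` is hypothetical, the whole file is read into memory for simplicity, and the part names imitate the two-digit numeric suffixes (00, 01, ...) that GNU split produces with -d. GNU split reads "20MB" as 20 * 1000 * 1000 bytes.

```python
import lzma

PART_SIZE = 20 * 1000 * 1000  # "20MB" in GNU split means 20 * 1000 * 1000 bytes


def compress_and_split(source="data/bugs.json",
                       prefix="data/bugs.json.xz.part",
                       part_size=PART_SIZE):
    """Compress `source` with xz preset 9 and write fixed-size parts.

    Hypothetical helper mirroring `xz -9 | split -d -b 20MB - prefix`.
    Returns the list of part files that were written.
    """
    with open(source, "rb") as f:
        # Simplification: loads the whole file; fine for a sketch, but a
        # streaming lzma.LZMACompressor would use less memory.
        compressed = lzma.compress(f.read(), preset=9)

    paths = []
    for index, start in enumerate(range(0, len(compressed), part_size)):
        path = f"{prefix}{index:02d}"  # part00, part01, ...
        with open(path, "wb") as part:
            part.write(compressed[start:start + part_size])
        paths.append(path)
    return paths
```

Concatenating the resulting parts in name order and decompressing them should give back the original file, which is what setup step 2 relies on.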