bugbug

Commit graph

Author	SHA1	Message	Date
Marco Castelluccio	16ece06f64	Retry git operations multiple times	2019-06-27 10:25:38 +02:00
Marco Castelluccio	a3933a48a4	Don't fail if there's an error while pulling from the repo	2019-06-27 10:25:38 +02:00
Marco Castelluccio	56f224b9dc	Generate microannotate repository for mozilla-central	2019-06-26 18:57:36 +02:00
Ayush Shridhar	6788b2e33a	Make similarity script more generic and add nearest neighbors similarity with tf-idf encoding (#628)	2019-06-26 13:42:23 +02:00
Marco Castelluccio	bd118c58ab	Use with statement for hg.open	2019-06-26 11:45:02 +02:00
x249wang	ab28e8ace2	Use zstandard instead of xz (#524) Fixes #461.	2019-06-24 13:16:44 +02:00
Boris Feld	9834053a36	Start tracking training metrics as Taskcluster artifacts (#604) Fixes #342	2019-06-22 14:18:08 -07:00
cklyyung	f4145b4eca	Use 'everchanged' operator instead of 'changedafter' operator with 1970 (#598)	2019-06-18 22:01:15 -07:00
AK.py	f6289a4468	Don't try to find inconsistencies in all bugs multiple times (#595)	2019-06-18 13:37:01 -07:00
Marco Castelluccio	938eb29bbf	Support getting new specific type labels from Bugzilla	2019-06-12 01:35:53 +02:00
Marco Castelluccio	735fccc4a9	In the retrieval task, download only new or changed bugs To support it, refactor bugzilla methods: - adding methods to get IDs given a query and given a time period; - renaming the internal _download method to get, since it's used externally; - changing delete to be more flexible and allowing to use a lambda to choose which bugs to delete. Fixes #440.	2019-06-09 00:32:23 +02:00
Marco Castelluccio	36f9a7c8d9	Move comment_level_labeler script in the scripts directory	2019-06-08 20:36:13 +02:00
Boris Feld	5f9be450cf	Ensure we download data from INDEX URL containing bugbug version (#564)	2019-06-07 16:14:58 +02:00
Ayush Shridhar	6e39b0a5a5	Change timedelta to 21 days in the script to generate Duplicate model results (#563)	2019-06-07 12:05:07 +02:00
Marco	ee935d6e5b	Download previous commits DB and experiences, and only mine data for new commits landed since then (#546) * Download previous commits DB and experiences, and only mine data for new commits landed since then Fixes #537 * Simplify db methods * Add an option to return mined commits	2019-06-06 18:55:17 +02:00
Ayush Shridhar	f21b4ee9d8	Add a script to generate duplicate classifier results (#548)	2019-06-06 16:30:23 +02:00
Marco Castelluccio	b1ddef742a	Download bugs DB when the model is a BugCoupleModel too	2019-06-04 12:56:52 +02:00
Marco	d8b84ca798	Support retrieving commits in steps (#536) * Support retrieving commits in steps * Store component mapping ETag to actually avoid downloading it again when not needed * Store a version file alongside the DBs * Export the commits DB version file and the experiences values as artifacts of the commit-retriever task	2019-06-03 19:29:08 +02:00
Marco Castelluccio	e465286df1	Add another missing f-string in the trainer script	2019-05-30 21:02:46 +02:00
Marco Castelluccio	f3c76ccb1a	Add missing 'f' in f-string in trainer script	2019-05-30 18:55:40 +02:00
Marco Castelluccio	0e08d04903	Fix subclass detection in trainer script	2019-05-30 18:55:22 +02:00
Marco Castelluccio	7f0a6555f2	Add support for downloading different DBs according to model requirements to trainer script	2019-05-30 13:26:48 +02:00
Marco	025e3f4da2	Fix command to train defect/enhancement/task model (#476) * Fix command to train defect/enhancement/task model Fixes #475 * Add more logging in the trainer script, and assert the model is generated	2019-05-21 14:46:57 +02:00
Boris Feld	dd00d7b9ec	Add the support for downloading the model before checking it (#452) Also put the right configuration in the check pipeline	2019-05-17 11:45:42 +02:00
Boris Feld	0a5e37439d	Add a central place where the models are defined (#398) * Add a central place where the models are defined Also add some helpers to load a model. * Add missing tensorflow dependency in extra-nn-requirements.txt	2019-05-16 15:34:38 +02:00
Marco	2d249793e2	Try regenerating the pushlog using pull and update (#444)	2019-05-16 11:33:14 +02:00
Marco	9223954520	Remove training tasks' unneeded dependencies on commit retrieval task (#407) Fixes #390	2019-05-14 15:22:44 +02:00
Boris Feld	f4b2b938be	Add basic check method and check script (#341 ) * Add basic check method and check script * Ensure the check of component will correctly use super result * Add required infra to schedule model checks * Add scheduling bits for the model checks * Remove the filtering on classification * Extract counting bugs to a new function in bugzilla.py * Also checks conflated components * Fix new hook id * Call bugzilla with the count_only param to speed up the check * Fix the new hook scope to match the hook id * Fix component model check after previous refactoring * Fix component model check method * Use a bugzilla report for even faster component model check * Clarify get_product_component_count docstring We are already filtering out full component with 0 bugs * Update conflated components mapping check A conflated component could also be part of the conflated components mapping * Distinguish between non-existing full components and empty full components * Remove the filter on resolution and unnecessary url params * Update component check method Keep checks as separate as possible for clarity, we could merge them or makes them faster later * Generate dynamically the CSV report url * Fix Docker image name the hook * Implement component check number 5 Get the meaningful components for the last 6 months * Handle reviews comments * Remove extraneous print * Removes TODO * Use a different threshold ration when checking for new meaningful components As we are only checking new bugs for 6 months, adjust the threshold ration to be less sensitive to occasional burst ob bugs for q given component. * Reduce the threshold ratio As we check on a disjoint time window, reduce the chance of false positives * Handle review nits * Fix last nits	2019-05-10 12:20:23 +02:00
Boris Feld	369b44ea02	Update the index URLs in bugbug (#328 ) * Update the index URLs in bugbug * Split the http service Docker image in two This way we can both: - Build the first half (code + dependencies) in the usual CI. - Build the second half at the end of the data pipeline with updated models. Taskboot build-compose doesn't support building all services except a specific one and it might be cumbersome to add this feature so move the second half of the Docker image to a separate docker-compose file.	2019-05-02 17:00:32 +02:00
Marco Castelluccio	3105acef95	Add script to gather defect/enhancement/task labels	2019-04-24 14:15:40 +02:00
Boris Feld	4b55b7f4f3	Add support to get secrets from taskcluster (#294)	2019-04-19 16:49:07 +02:00
Boris Feld	6af6e8b927	Import Trainer class from release-services repository (#254 ) * Import Trainer class from release-services repository This basically import the `trainer.py` file from the `release-services` repository at hash 77cdddd. I removed imports and reference to cli-common helpers that will likely need to be reimplemented, like the raven support. Also defines 4 docker images, one per model to train. * Remove unused imports	2019-04-09 17:49:56 +02:00
Boris Feld	b651744b18	Import retriever services and add Docker image definition (#251 ) * Import Retriever class from release-services repository This basically import the `retriever.py` file from the `release-services` repository at hash 77cdddd. I removed imports and reference to cli-common helpers that will likely needs to be reimplemented, like the raven support. The next commit will defines some Dockerfiles that will use the imported file. * Add docker image definition Build three Docker image, one is for bugbug itself. It is just installing bugbug and its dependencies. One is for retrieving information from the mozilla-central Mercurial repository, it depends on the first one and install the right Mercurial version. The last one is for retrieving information from the Bugzilla instance, it depends in the first one and needs a valid Bugzilla token. * Separate the two tasks into separate script files They share almost no code at all so they don't need to be in the same file * Apply Black on the scripts to makes Flake8 happy	2019-04-09 16:30:09 +02:00
Boris Feld	bad6a50d8b	Pre commit setup (#252 ) * Add pre-commit configuration Add auto-formatting configuration using the https://pre-commit.com/ project. Having auto-formatting setup and automatically enforced helps speeding up development and review process. * Apply the auto-formatting on all files in the repository * Removes flake8-quotes as it conflicts with Black formatting * Disable some Flake8 rules Disable Flake8 rules that are handled by Black. The list comes from https://github.com/ambv/black/issues/429#issuecomment-472687803.	2019-04-09 15:57:29 +02:00
Marco Castelluccio	41f1aa3b1e	Calculate important components based on their past occurrences rather than having a hardcoded list Fixes #220	2019-03-18 20:18:25 +01:00
John Giannelos	d29621b84d	Add script to compute success rate for component models (#190)	2019-02-26 15:16:39 +01:00


336 commits