bugbug/scripts/comment_level_labeler.py

# -*- coding: utf-8 -*-
# This Source Code Form is subject to the terms of the Mozilla Public
# License, v. 2.0. If a copy of the MPL was not distributed with this file,
# You can obtain one at http://mozilla.org/MPL/2.0/.

import argparse
import csv
import os
import random

from bugbug import bugzilla
from bugbug.models import load_model

parser = argparse.ArgumentParser()
parser.add_argument(
    "--goal",
    help="Goal of the labeler",
    choices=["str", "regressionrange"],
    default="str",
)
args = parser.parse_args()

if args.goal == "str":
    model = load_model("bug")
elif args.goal == "regressionrange":
    model = load_model("regression")

file_path = os.path.join("bugbug", "labels", f"{args.goal}.csv")

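# The csv module expects file objects to be opened with newline="".
with open(file_path, "r", newline="") as f:
    reader = csv.reader(f)
    next(reader)  # Skip the header row.
    labeled_comments = [(int(r[0]), int(r[1]), r[2]) for r in reader]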

already_done = {(c[0], c[1]) for c in labeled_comments}

bugs = []
for bug in bugzilla.get_bugs():
    # For the str and regressionrange problems, we don't care about test failures,
    if (
        "intermittent-failure" in bug["keywords"]
        or "stockwell" in bug["whiteboard"]
        or "permafail" in bug["summary"].lower()
    ):
        continue

    # bugs filed from Socorro,
    if (
        "this bug was filed from the socorro interface"
        in bug["comments"][0]["text"].lower()
    ):
        continue

    # and fuzzing bugs.
    if "fuzzing" in bug["comments"][0]["text"].lower():
        continue

    bugs.append(bug)

random.shuffle(bugs)

for bug in bugs:
    # Only show bugs that are really bugs/regressions for labeling.
    c = model.classify(bug)
    if c != 1:
        continue

    v = None

    for i, comment in enumerate(bug["comments"]):
        if (bug["id"], i) in already_done:
            continue

        os.system("clear")
        print(f'Bug {bug["id"]} - {bug["summary"]}')
        print(f"Comment {i}")
        print(comment["text"])

        if args.goal == "str":
            print(
                "\nY for comment containing STR, N for comment not containing STR, K to skip, E to exit"
            )
        elif args.goal == "regressionrange":
            print(
                "\nY for comment containing regression range, N for comment not containing regression range, K to skip, E to exit"
            )
        # The prompt shows uppercase keys, so accept either case.
        v = input().strip().lower()

        if v in ["e", "k"]:
            break

        if v in ["y", "n"]:
            labeled_comments.append((bug["id"], i, v))

    if v not in ["e", "k"]:
        with open(file_path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["bug_id", "comment_num", f"has_{args.goal}"])
            writer.writerows(sorted(labeled_comments))

        print("\nE to exit, anything else to continue")
        v = input().strip().lower()

    if v == "e":
        break

Add a tool to perform labeling at the comment level (for stepswanted and regressionwindow-wanted classifiers) 2019-01-15 05:01:57 +03:00
Format with f-strings instead of .format (#85) 2019-01-21 00:30:34 +03:00
Pre commit setup (#252) * Add pre-commit configuration Add auto-formatting configuration using the https://pre-commit.com/ project. Having auto-formatting setup and automatically enforced helps speeding up development and review process. * Apply the auto-formatting on all files in the repository * Removes flake8-quotes as it conflicts with Black formatting * Disable some Flake8 rules Disable Flake8 rules that are handled by Black. The list comes from https://github.com/ambv/black/issues/429#issuecomment-472687803. 2019-04-09 16:57:29 +03:00
Add a central place where the models are defined (#398) * Add a central place where the models are defined Also add some helpers to load a model. * Add missing tensorflow dependency in extra-nn-requirements.txt 2019-05-16 16:34:38 +03:00