Bug Assignment: Insights on Methods, Data and Evaluation

Ali Sajedi Badashian

doi:10.7939/R3804Z178

Communities and Collections

Graduate and Postdoctoral Studies (GPS), Faculty of / Theses and Dissertations

Author / Creator

Ali Sajedi Badashian
The bug-assignment problem is commonly defined as ranking developers by their competence to fix a given bug. Previous methods used machine-learning or information-retrieval techniques and treated the textual elements of bug reports as evidence of developer expertise, giving each developer a score and ranking the developers for the given bug report. Despite the importance of the subject and the substantial attention it has received from researchers during the last 15 years, bug assignment remains a challenging, time-consuming task in large software projects. There is still no consensus on how to validate and comparatively evaluate bug-assignment methods, and methods reported in the literature are often not reproducible. In this thesis, we make the following contributions.
1) We investigate the effect of three important experimental-design parameters in previous research: the evaluation metric(s) reported, the definition of who the real assignee is, and the community of developers considered as candidate assignees. Supported by our experiment on a comprehensive data set of bugs we collected from GitHub, we propose a systematic framework for the evaluation of bug-assignment research. Addressing those aspects supports better evaluation, enables replication of a study and promotes its use in other research or industrial applications.
2) We propose a new bug-assignment approach relying on the set of Stack Overflow tags as the thesaurus of programming keywords. Our approach, called Thesaurus and Time based Bug Assignment (TTBA), weights the relevance of a developer’s expertise based on how recently they have fixed a bug with keywords similar to the bug at hand. In spite of its simplicity, our method predicts the assignee with high accuracy, outperforming state-of-the-art methods.
3) We extend TTBA to consider a broader record of the developer’s expertise, drawing on multiple sources of evidence. We then investigate the information value of these sources, considering various technical contributions to the project and contributions to social software platforms. We show that, in addition to bug-fixing contributions, other technical and even social contributions of the developers within the version control system are useful for bug assignment. We also show that extending the sources of expertise can improve the accuracy of assignee recommendations.
4) We study the impact and usefulness of the above contributions using a comprehensive data set of bugs we collected from 13 long-term open-source projects on GitHub. In addition to the technical work by developers, this data set includes the developers' social contributions in the version control system. It is one of the biggest data sets made available online for further studies and research.
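The idea behind contribution 2, scoring a developer by how recently they fixed bugs that share thesaurus keywords with the new bug, can be illustrated with a minimal sketch. The abstract does not give the TTBA scoring formula, so everything specific here is an assumption made for illustration: the tiny `THESAURUS` set stands in for the Stack Overflow tag set, the names `keywords` and `rank_developers` are hypothetical, and the decay `1 / (1 + age_in_days)` is an arbitrary choice rather than the weighting used in the thesis.

```python
from collections import defaultdict
from datetime import date

# Hypothetical stand-in for the Stack Overflow tag set used as a thesaurus.
THESAURUS = {"java", "android", "sql", "json", "http"}


def keywords(text, thesaurus=THESAURUS):
    """Keep only the words of a bug report that appear in the thesaurus."""
    return {word for word in text.lower().split() if word in thesaurus}


def rank_developers(new_bug_text, new_bug_date, fix_history, thesaurus=THESAURUS):
    """Rank developers by recency-weighted keyword overlap with the new bug.

    fix_history is an iterable of (developer, fix_date, bug_text) tuples.
    The weight 1 / (1 + age_in_days) is an assumed decay chosen only for
    illustration; it is not the formula from the thesis.
    """
    target = keywords(new_bug_text, thesaurus)
    scores = defaultdict(float)
    for developer, fix_date, bug_text in fix_history:
        age_days = (new_bug_date - fix_date).days
        if age_days < 0:
            continue  # ignore fixes made after the new bug was reported
        overlap = len(target & keywords(bug_text, thesaurus))
        scores[developer] += overlap / (1.0 + age_days)
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda dev: (-scores[dev], dev))
```

In this sketch, a developer who fixed a single matching bug two days ago can outrank one who fixed a closer-matching bug several months earlier, which is the recency effect the abstract describes.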
Graduation date

Fall 2018
Type of Item

Thesis
Degree

Doctor of Philosophy
DOI

https://doi.org/10.7939/R3804Z178
License

Permission is hereby granted to the University of Alberta Libraries to reproduce single copies of this thesis and to lend or sell such copies for private, scholarly or scientific research purposes only. Where the thesis is converted to, or otherwise made available in digital form, the University of Alberta will advise potential users of the thesis of these terms. The author reserves all other publication and other rights in association with the copyright in the thesis and, except as herein before provided, neither the thesis nor any substantial portion thereof may be printed or otherwise reproduced in any material form whatsoever without the author's prior written permission.

Language

English
Institution

University of Alberta
Degree level

Doctoral
Department
- Department of Computing Science
Supervisor / co-supervisor and their department(s)
- Stroulia, Eleni (Computing Science)