A Contextual Approach towards More Accurate Duplicate Bug Report Detection

by Anahita Alipour, Abram Hindle, and Eleni Stroulia

Bug-tracking and issue-tracking systems tend to be populated with bugs, issues, or tickets written by a wide variety of reporters, who have different levels of training and knowledge about the system being discussed. Many reporters lack the skills, vocabulary, knowledge, or time to search the issue tracker efficiently for similar issues. As a result, issue trackers are often full of duplicate issues and bugs, and bug triaging is time-consuming and error-prone.

Many researchers have approached the bug-deduplication problem using off-the-shelf information-retrieval (IR) tools, such as BM25F, as used by Sun et al. In our work, we extend the state of the art by investigating how contextual information can be exploited to improve bug deduplication. This contextual information relies on our prior knowledge of software quality, software architecture, and system-development topics derived with latent Dirichlet allocation (LDA). We demonstrate the effectiveness of our contextual bug-deduplication method on the bug repository of the Android ecosystem. Based on this experience, we conclude that researchers should not ignore the context of software engineering when using IR tools for deduplication.
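To make the idea of contextual features concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the method from the paper: the word lists in `CONTEXT_WORDS` are invented for this example, simple term overlap stands in for an IR score such as BM25F, and the function names `context_vector` and `context_similarity` are hypothetical. Each bug report is mapped to a small vector that records how strongly it touches each context, and two reports are then compared by the cosine similarity of those vectors.

```python
import math
import re

# Hypothetical context word lists, made up for illustration only.
CONTEXT_WORDS = {
    "quality": {"crash", "slow", "freeze", "secure", "usable"},
    "architecture": {"kernel", "driver", "framework", "library", "runtime"},
    "topics": {"bluetooth", "wifi", "camera", "battery", "screen"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def context_vector(report):
    """Return, per context, the fraction of report tokens found in its word list.

    Assumption: plain term overlap replaces a real IR score such as BM25F.
    """
    tokens = tokenize(report)
    if not tokens:
        return [0.0] * len(CONTEXT_WORDS)
    return [
        sum(token in words for token in tokens) / len(tokens)
        for words in CONTEXT_WORDS.values()
    ]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def context_similarity(report_a, report_b):
    """Compare two bug reports by the similarity of their context vectors."""
    return cosine(context_vector(report_a), context_vector(report_b))
```

In a full pipeline, a score like `context_similarity` would presumably be one feature among several (alongside textual and categorical similarity) passed to a classifier that decides whether two reports are duplicates. On its own it only says that two reports concern similar contexts.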

The paper:

      author = {Anahita Alipour and Abram Hindle and Eleni Stroulia},
      title = {A Contextual Approach towards More Accurate Duplicate Bug Report Detection},
      booktitle = {Proceedings of the 10th Working Conference on Mining Software Repositories},
      year = 2013