BugMaps-Granger: a tool for visualizing and predicting bugs using Granger causality tests
© Couto et al.; licensee Springer. 2014
Received: 20 November 2013
Accepted: 27 February 2014
Published: 21 March 2014
Despite the increasing number of bug analysis tools for exploring bugs in software systems, there are no tools supporting the investigation of causality relationships between internal quality metrics and bugs. In this paper, we propose an extension of the BugMaps tool called BugMaps-Granger that allows the analysis of source code properties that are more likely to cause bugs. For this purpose, we relied on the Granger Causality Test to evaluate whether past changes to a given time series of source code metrics can be used to forecast changes in a time series of defects. Our tool extracts source code versions from version control platforms, calculates source code metrics and defects time series, computes Granger Test results, and provides interactive visualizations for causal analysis of bugs.
We provide an example of use of BugMaps-Granger involving data from the Equinox Framework and Eclipse JDT Core systems collected over three years. For these systems, the tool was able to identify the modules with more bugs, the average lifetime and complexity of the bugs, and the source code properties that are more likely to cause bugs.
With the results provided by the tool in hand, a maintainer can perform at least two main software quality assurance activities: (a) refactoring the source code properties that Granger-caused bugs and (b) improving unit test coverage in classes with more bugs.
Keywords: Bug analysis tools; Software metrics; Causality tests
A number of software analysis tools have been proposed to improve software quality (Nierstrasz et al. 2005; Hovemeyer and Pugh 2004; Wettel 2009). Such tools use different types of information about the structure and history of software systems. Basically, they are used to analyze software evolution, manage the quality of the source code, compute metrics, check coding rules, etc. In general, such tools help maintainers to understand large amounts of data coming from software repositories.
Particularly, there is a growing interest in analysis tools for exploring bugs in software systems (Hora et al. 2012; D’Ambros and Lanza 2012; Sliwerski et al. 2005; Dal Sasso and Lanza 2013). Such tools help maintainers to understand the distribution, the evolutionary behavior, the lifetime, and the stability of bugs. For example, Churrasco is a web-based tool for collaborative software evolution analysis (D’Ambros and Lanza 2012). The tool automatically extracts information from a variety of software repositories, including versioning systems and bug management systems. The goal is to provide an extensible tool that can be used to reason about software evolution from different perspectives, including the behavior of bugs. Other visualizations were also proposed for understanding the behavior of bugs, including system radiography (which provides a high-level visualization of the parts of the system most impacted by bugs) and bug watch (which relies on a watch metaphor to provide information about a particular bug) (D’Ambros et al. 2007). Hatari (Sliwerski et al. 2005) is a tool that provides views to browse through the riskiest locations and to analyze the risk history of a particular component of a system. More recently, the tool in*Bug (Dal Sasso and Lanza 2013) was proposed to allow users to navigate and inspect the information stored in bug tracking platforms, with the specific purpose of supporting the comprehension of bug reports.
Despite the increasing number of bug analysis tools, they typically do not provide mechanisms for assessing the existence of correlations between the internal quality of a software system and the occurrence of bugs. To the best of our knowledge, there are no bug analysis tools that highlight the possible causes of bugs in the source code. More specifically, there are no tools designed to infer possible causal relations between changes in the values of source code metrics and the occurrence of defects in object-oriented classes.
In this paper, we propose and describe the BugMaps-Granger tool—an extension of the BugMaps tool (Hora et al. 2012)—that supports detection of causal relations between source code metrics and bugs. The tool provides mechanisms to retrieve data from software repositories, to compute source code metrics, to generate time series of source code metrics and defects, and to infer causal relations between source code properties and defects. Moreover, BugMaps-Granger provides visualizations for identifying the modules with more bugs, the average lifetime and complexity of bugs, and the source code properties that are more likely to cause bugs. More specifically, our tool relies on the Granger Causality Test (Granger 1981) to identify causal relations between time series of source code metrics and defects. This test evaluates whether past changes to a given time series of source code metrics can be used to forecast changes in a time series of defects. The proposed tool has the following features:
The tool automatically extracts source code models of a target system from its version control platform in predefined time intervals.
The tool generates time series of eleven source code metrics and time series with the number of defects in each class of the target system.
The tool computes the Granger Test considering the metrics and defects time series to highlight possible causal relations.
The tool integrates models extracted from the source code with models representing the number of bugs.
The tool provides a set of interactive visualizations to support software maintainers in answering questions such as: (a) Which are the modules with more bugs? (b) What is the average lifetime of bugs? (c) What is the complexity of bugs? (d) What are the source code properties that Granger-cause bugs in a given module?, and (e) What are the metrics with the highest number of positive Granger tests?
The ultimate goal of BugMaps-Granger is to predict the changes in the source code that are more likely to cause defects. For example, with our tool in hand, a maintainer (before making a commit with changes to a given class) can verify whether such changes affect the values of source code metrics that, in the past, Granger-caused defects. If the changes significantly affect these metrics values, the maintainer can, for example, perform extra software quality assurance activities (e.g., she can conduct more unit testing or perform a detailed code inspection) before executing the commit.
In a previous conference paper, we described an exploratory study on using Granger to predict bugs (Couto et al. 2012). Recently, this paper was extended with a concrete approach that relies on Granger Tests to trigger alarms whenever risky changes are applied to the source code (Couto et al. 2014). A preliminary version of BugMaps, without any support for Granger Tests, is described in a short paper (Hora et al. 2012). Later, we proposed a second version of this tool, which we called BugMaps-Granger, including support for Granger Causality (Couto et al. 2013a). In the present paper, we extend this initial work on BugMaps-Granger by including a more detailed presentation of the tool and a case study with two large open-source systems (Eclipse JDT Core and Equinox Framework).
The execution of the BugMaps-Granger tool is divided into two phases: preprocessing and visualization. The preprocessing phase is responsible for extracting source code models, creating time series, and applying the Granger Test to compute possible causal relations between source code metrics and bugs. In the visualization phase, the user interacts with the tool. For example, he can retrieve the most defective classes of the system and visualize the source code properties that Granger-caused bugs in such classes.
In the following subsections, we describe the modules of this architecture:
2.1 Model extraction
This module receives as input the URL associated with the version control platform of the target system (SVN or Git) and a time interval to be used in the analysis of the bugs. To extract the source code models, the module performs the following tasks: (a) it extracts the source code versions from the version control platform at bi-weekly intervals; (b) it removes test classes, assuming that such classes are implemented in directories and subdirectories whose names start with the words “Test” or “test”; and (c) it parses the source code versions and generates MSE files using the VerveineJ tool (Ducasse et al. 2011; VerveineJ parser 2014). MSE is the default file format supported by the Moose platform to persist source code models.
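Tasks (a) and (b) can be illustrated with a minimal Python sketch. It is not the tool's implementation: the function names `biweekly_dates`, `is_test_file`, and `keep_production_files` are hypothetical, and the sketch only computes the snapshot dates and applies the directory-name heuristic for test classes described above. Checking out each version and generating MSE files with VerveineJ are out of its scope.

```python
from __future__ import annotations

from datetime import date, timedelta
from pathlib import PurePosixPath


def biweekly_dates(start: date, end: date) -> list[date]:
    """Return snapshot dates from start to end (inclusive), 14 days apart."""
    dates = []
    current = start
    while current <= end:
        dates.append(current)
        current += timedelta(days=14)
    return dates


def is_test_file(path: str) -> bool:
    """True if any directory in the path starts with 'Test' or 'test'."""
    directories = PurePosixPath(path).parts[:-1]  # drop the file name
    return any(d.startswith(("Test", "test")) for d in directories)


def keep_production_files(paths: list[str]) -> list[str]:
    """Discard files that live under test directories."""
    return [p for p in paths if not is_test_file(p)]


if __name__ == "__main__":
    snapshots = biweekly_dates(date(2010, 1, 1), date(2012, 12, 28))
    print(len(snapshots), snapshots[0], snapshots[-1])
    print(keep_production_files(["src/core/A.java", "src/tests/ATest.java"]))
```

For the interval 2010-01-01 to 2012-12-28, this yields 79 snapshot dates, which agrees with the 79 versions reported for the Equinox Framework in Section 3.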
2.2 Time series creation
Table 1 Source code metrics considered by BugMaps-Granger
Metric: Description
WMC: Weighted methods per class
DIT: Depth of inheritance tree
RFC: Request for class
NOC: Number of children
CBO: Coupling between object classes
LCOM: Lack of cohesion in methods
FAN-IN: Number of classes that reference a given class
FAN-OUT: Number of classes referenced by a given class
NOA: Number of attributes
LOC: Number of lines of code
NOM: Number of methods
To create the time series of defects for each class, the module receives as input a CSV file containing the bugs (IDs and creation dates) collected from the bug tracking platforms (e.g., Bugzilla, Jira, Mantis, etc.). Basically, the module maps the bugs to their respective commits, using the mapping strategy presented in detail in (Couto et al. 2012; Couto et al. 2014). Next, the source code files changed by such commits are used to identify the classes changed to fix the respective bugs.
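The last step can be sketched as follows, assuming the bug-to-commit mapping has already been done and each bug is available as a tuple (bug ID, creation date, classes changed to fix it). The function `defect_time_series` is hypothetical, and placing each bug in the bi-weekly interval that contains its creation date is an assumption of this sketch. The actual mapping strategy is the one described in the cited papers.

```python
from __future__ import annotations

from collections import defaultdict
from datetime import date


def defect_time_series(fixes, start: date, periods: int, days: int = 14):
    """Count defects per class and per interval.

    fixes: iterable of (bug_id, creation_date, classes_changed).
    Returns a dict mapping each class name to a list with `periods` counts.
    """
    series = defaultdict(lambda: [0] * periods)
    for _bug_id, created, classes in fixes:
        index = (created - start).days // days  # interval holding the date
        if 0 <= index < periods:                # ignore bugs outside the range
            for cls in set(classes):            # count each class once per bug
                series[cls][index] += 1
    return dict(series)


if __name__ == "__main__":
    # Illustrative records only, not data from the case study.
    fixes = [
        (1, date(2010, 1, 5), ["A", "B"]),
        (2, date(2010, 1, 20), ["A"]),
    ]
    print(defect_time_series(fixes, date(2010, 1, 1), periods=3))
```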
2.3 Granger test module
This module applies the Granger Causality Test considering the metrics and defects time series. To apply the Granger Test, the module relies on Algorithm 1. In this algorithm, Classes is the set of all classes of the system (line 1) and Defects[c] is the time series with the number of defects (line 2). The algorithm relies on function d_check (line 3) to check whether the defects in the time series d conform to the following preconditions:
Algorithm 1 Applying the Granger Test
P1: The time series must have at least 30 values. The motivation for this precondition is the fact that classes that only existed for a small proportion of the time frame considered in the analysis do not present a considerable history of defects to qualify their use in predictions.
P2: The values in the time series of defects must not be all equal to zero. The motivation for this precondition is that it is straightforward to predict defects for classes that never presented a defect in their lifetime; probably, they will remain with zero defects in the future.
P3: The time series of defects must be stationary, which is a precondition required by the Granger Test (Fuller 1995).
Suppose that a given class c passed the previous preconditions. For this class, suppose also that M[n][c] (line 5) is the time series with the values of the n-th considered source code metric, 1 ≤ n ≤ NumberOfMetrics. The algorithm relies on function m_check (line 6) to test whether time series m (a time series with metrics values) conforms to the following preconditions:
P4: The time series of source code metrics must not be constant. In other words, metrics time series whose values never change must be discarded, since variations in the independent variables are the key event to observe when computing Granger causality.
P5: The time series of source code metrics must be stationary, as defined for the defects series.
Finally, for the time series m (source code metrics) and d (defects) that passed preconditions P1 to P5, function granger(m,d) checks whether m Granger-causes d (line 7). In practice, to apply the test, BugMaps-Granger relies on the function granger.test() provided by the MSBVAR package (MSBVAR package 2012) of the R system.
It is worth mentioning that we previously performed an extensive study to evaluate the application of the Granger Causality Test to software defect prediction (Couto et al. 2014). Basically, we focused on answering questions such as: (a) How many time series pass the preconditions related to defects (preconditions P1, P2, P3)? (b) How many time series pass the preconditions related to source code metrics (preconditions P4 and P5)? (c) How many classes present positive results on the Granger Test? (d) What is the number of defects potentially covered by Granger? To answer these questions, we used a dataset including time series of source code metrics and defects for four real-world systems (Eclipse JDT Core, Eclipse PDE UI, Equinox Framework, and Lucene) (Couto et al. 2013b).
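The control flow described for Algorithm 1 can be sketched in Python as below. This is an illustration based only on the textual description: the names `apply_granger`, `is_stationary`, and `granger` are hypothetical, and both the stationarity test (P3, P5) and the Granger test itself are passed in as functions rather than implemented. In the tool, the Granger test is computed in R.

```python
from __future__ import annotations

from typing import Callable, Mapping, Sequence

Series = Sequence[float]


def d_check(d: Series, is_stationary: Callable[[Series], bool],
            min_length: int = 30) -> bool:
    """Preconditions P1 (length), P2 (not all zero), P3 (stationary)."""
    return len(d) >= min_length and any(v != 0 for v in d) and is_stationary(d)


def m_check(m: Series, is_stationary: Callable[[Series], bool]) -> bool:
    """Preconditions P4 (not constant) and P5 (stationary)."""
    return len(set(m)) > 1 and is_stationary(m)


def apply_granger(defects: Mapping[str, Series],
                  metrics: Mapping[str, Mapping[str, Series]],
                  is_stationary: Callable[[Series], bool],
                  granger: Callable[[Series, Series], bool]) -> dict[str, list[str]]:
    """Return, per class, the metrics whose series Granger-cause its defects.

    defects: class name -> defects time series.
    metrics: metric name -> class name -> metric time series.
    """
    results: dict[str, list[str]] = {}
    for cls, d in defects.items():               # for each class c, d = Defects[c]
        if not d_check(d, is_stationary):        # P1 to P3
            continue
        for name, per_class in metrics.items():  # m = M[n][c]
            m = per_class.get(cls)
            if m is None or not m_check(m, is_stationary):  # P4 and P5
                continue
            if granger(m, d):                    # does m Granger-cause d?
                results.setdefault(cls, []).append(name)
    return results
```

Passing the two statistical tests as parameters keeps the sketch independent of any particular statistics library; any implementation with the same shape can be plugged in.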
2.4 Visualization module
This module receives the following input data: a file containing the bugs mapped to their respective classes and the Granger results, a model extracted from the last source code version, and the source code itself of the system under analysis. From this information, the module provides four interactive visualization browsers:
Two browsers are used for analysis. The first one deals with the classes, the number of bugs, and the Granger results of the system under analysis (called Granger browser) while the second one deals with the complexity of the bugs (called Bug as Entity browser).
Two browsers are used to rank the classes and the metrics most involved with bugs.
Such browsers are implemented using visualization packages provided by the Moose Platform. Basically, the visualizations are based on Distribution Map, a generic technique to reason about the results of software analysis and to investigate how a given phenomenon is distributed across a software system (Ducasse et al. 2006). Using a Distribution Map, three metrics can be displayed through the height, width, and color of the objects in the map. In our maps, rectangles represent classes or bugs and containers represent packages.
3 Results and discussion
In this section, we provide an example of use considering data from the Equinox Framework and Eclipse JDT Core systems collected over three years. For Equinox Framework, the tool extracted 79 source code versions at bi-weekly intervals, including 417 classes, from 2010-01-01 to 2012-12-28. For Eclipse JDT Core, the tool extracted 78 source code versions at bi-weekly intervals, including 2,467 classes, from 2005-01-01 to 2007-12-15. In a second step, for each class, the tool created eleven time series of source code metrics (one for each metric in Table 1) and one time series of defects. Finally, for each pair of time series (source code metrics and defects), the tool applied the Granger Test to identify causal relations. We analyzed the Granger results for both systems according to the proposed visualizations, as discussed next.
The Granger browser can also be used to avoid future defects. For example, with this result in hand, a maintainer (before making a commit with changes to the StateImpl class) can verify whether such changes heavily affect the values of source code metrics that Granger-caused defects in the past (in our example, the metrics that Granger-caused defects were CBO, WMC, and RFC). If the change affects these metrics, the maintainer can for example perform extra software quality assurance activities in this class (like unit testing or code inspection).
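A minimal sketch of such a pre-commit check is shown below. It is only an illustration: the function `risky_metrics` and the 10% relative-change threshold are assumptions of the sketch, since the text does not quantify what it means for a change to heavily affect a metric, and the metric values in the usage example are made up.

```python
from __future__ import annotations


def risky_metrics(before: dict[str, float], after: dict[str, float],
                  granger_positive: set[str],
                  threshold: float = 0.10) -> list[str]:
    """Return Granger-positive metrics whose relative change exceeds threshold."""
    flagged = []
    for metric in sorted(granger_positive):
        old, new = before.get(metric), after.get(metric)
        if old is None or new is None:
            continue  # metric not measured in one of the two versions
        if old == 0:
            change = 0.0 if new == 0 else float("inf")
        else:
            change = abs(new - old) / abs(old)
        if change > threshold:
            flagged.append(metric)
    return flagged


if __name__ == "__main__":
    # Hypothetical metric values for a class before and after a change.
    before = {"CBO": 10, "WMC": 50, "RFC": 40, "DIT": 2}
    after = {"CBO": 14, "WMC": 52, "RFC": 40, "DIT": 3}
    print(risky_metrics(before, after, {"CBO", "WMC", "RFC"}))
```

In this made-up example only CBO is flagged: DIT also changes, but it is not among the metrics that Granger-caused defects in the class.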
3.2 Bug as entity
Figure 5(a) shows the bugs of the Equinox Framework created in 2010. We can observe that all bugs from 2010 were fixed (i.e., there are no bugs in blue), that only two bugs remained open for more than three months (bugs going to red), and that complex bugs (long width) are dispersed in time. Figure 5(b) shows the bugs of the Eclipse JDT Core created in 2005. Similar to the Equinox Framework, all bugs were fixed and few bugs remained open for more than three months. In addition, most bugs have low complexity (short width). However, in a detailed analysis, we can also observe that the highlighted bug (ID 89096) is quite complex. More specifically, the development team changed 75 classes in order to fix this particular bug, which is related to a performance problem in the resource bundle mechanism (a requirement scattered across the classes of the JDT Core).
3.3 Bug ranking
3.4 Granger ranking
Based on these results, we can conclude that metrics related to complexity (WMC), coupling (CBO, RFC, and FAN-OUT), and LOC tend to have an impact on the occurrence of defects in the Equinox Framework and Eclipse JDT Core systems, at least according to Granger. Conversely, metrics related to inheritance (such as DIT and NOC) tend to have a small influence on the occurrence of defects.
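A ranking of this kind amounts to counting, for each metric, the number of classes with a positive Granger test. The sketch below shows one way to compute it. The function `rank_metrics` and its input format (class name mapped to the metrics with positive tests) are hypothetical, and the data in the usage example is made up rather than taken from the case study.

```python
from __future__ import annotations

from collections import Counter


def rank_metrics(positive_by_class: dict[str, list[str]]) -> list[tuple[str, int]]:
    """Rank metrics by the number of classes with a positive Granger test."""
    counts = Counter(
        metric
        for metrics in positive_by_class.values()
        for metric in set(metrics)  # count each metric once per class
    )
    # Highest count first; ties are broken alphabetically for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    example = {"A": ["CBO", "WMC"], "B": ["CBO"], "C": ["RFC", "CBO", "WMC"]}
    print(rank_metrics(example))
```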
In this paper, we described a tool that infers and provides visualizations about causality relations between source code metrics and bugs. The BugMaps-Granger tool extracts time series of defects from software systems and allows the visualization of different bug measures, including the source code properties that Granger-caused bugs. The ultimate goal of BugMaps-Granger is to highlight changes in the source code that are more prone to bugs, and the source code metrics that can be used to anticipate the occurrence of bugs in the changed classes. With this tool in hand, maintainers can perform at least two main actions for improving software quality: (a) refactoring the source code properties that Granger-caused bugs and (b) improving unit test coverage in classes with more bugs.
As future work, we intend to extend BugMaps-Granger with other internal software quality metrics, including metrics associated with violations in the static architecture of software systems, as revealed by the DCL language (Terra and Valente 2009) or the ArchLint tool (Maffort et al. 2013), for example. Another possible research thread concerns the relations between defects and code smells. In this case, we intend to start by investigating the relations between defects and methods located in inappropriate classes (i.e., feature envy instances), as revealed by the JMove recommendation system (Sales et al. 2013). In addition, we plan to extend BugMaps-Granger with a new functionality for alerting maintainers about the future occurrence of defects. We intend to implement this tool as a plug-in for version control platforms, like SVN and Git. Basically, this new tool should trigger alarms whenever risky changes are committed to version control platforms.
5 Availability and requirements
To execute BugMaps-Granger, the requirements of the target system are:
Identifiers and creation dates of bugs stored in a CSV file.
URL or directory path of the version control platform (SVN or Git).
Additional information about BugMaps-Granger:
Project name: BugMaps-Granger.
Project home page: http://aserg.labsoft.dcc.ufmg.br/bugmaps/.
Operating system(s): MacOS, Linux, and Windows.
Programming language: Java, Smalltalk, and R.
License: BugMaps-Granger is an open source project, distributed under an MIT license.
a Since most of our visualizations make heavy use of colors, we provide high-resolution versions of these figures in a companion website: http://aserg.labsoft.dcc.ufmg.br/bugmaps.
This research is supported by grants from FAPEMIG, CNPq, and CAPES (Brazil) and INRIA (France).
- Chidamber SR, Kemerer CF: A metrics suite for object oriented design. IEEE Trans Softw Eng 1994, 20(6):476–493. doi:10.1109/32.295895
- Couto C, Silva C, Valente MT, Bigonha R, Anquetil N: Uncovering causal relationships between software metrics and bugs. In 16th European Conference on Software Maintenance and Reengineering (CSMR). USA: IEEE Computer Society; 2012:223–232.
- Couto C, Pires P, Valente MT, Bigonha R, Anquetil N: BugMaps-Granger: A Tool for Causality Analysis between Source Code Metrics and Bugs. Brazilian Conference on Software: Theory and Practice (CBSoft), Tools Session, Brazilian Computer Society, Brazil 2013a.
- Couto C, Maffort C, Garcia R, Valente MT: COMETS: A Dataset for Empirical Research on Software Evolution Using Source Code Metrics and Time Series Analysis. ACM SIGSOFT Softw Eng Notes 2013b, 38(1):1–3.
- Couto C, Pires P, Valente MT, Bigonha R, Anquetil N: Predicting software defects with causality tests. J Syst Soft 2014. doi: http://dx.doi.org/10.1016/j.jss.2014.01.033
- Dal Sasso T, Lanza M: A closer look at bugs. In 1st Working Conference on Software Visualization (VISSOFT). USA: IEEE Computer Society; 2013:1–4.
- D’Ambros M, Lanza M, Pinzger M: A bug’s life: Visualizing a bug database. In 4th International Workshop on Visualizing Software for Analysis and Understanding (VISSOFT). Canada: IEEE Computer Society; 2007:113–120.
- D’Ambros M, Lanza M: Distributed and collaborative software evolution analysis with churrasco. Sci Comput Program 2012, 75(4):276–287.
- Ducasse S, Girba T, Kuhn A: Distribution Map. In 22nd International Conference on Software Maintenance (ICSM). USA: IEEE Computer Society; 2006:203–212.
- Ducasse S, Anquetil N, Bhatti MU, Hora A, Laval J, Girba T: MSE and FAMIX 3.0: an Interexchange Format and Source Code Model Family. Technical report, RMOD - INRIA Lille - Nord Europe, Software Composition Group - SCG; 2011.
- Fuller WA: Introduction to Statistical Time Series. USA: John Wiley & Sons; 1995:546–663.
- Granger C: Some properties of time series data and their use in econometric model specification. J Econometrics 1981, 16(6):121–130.
- Hora A, Couto C, Anquetil N, Ducasse S, Bhatti M, Valente MT, Martins J: Bugmaps: A tool for the visual exploration and analysis of bugs. In 16th European Conference on Software Maintenance and Reengineering (CSMR Tool Demonstration). USA: IEEE Computer Society; 2012.
- Hovemeyer D, Pugh W: Finding bugs is easy. SIGPLAN Notices 2004, 39(12):92–106. doi:10.1145/1052883.1052895
- Maffort C, Valente MT, Anquetil N, Hora A, Bigonha M: Heuristics for discovering architectural violations. In 20th Working Conference on Reverse Engineering (WCRE). USA: IEEE Computer Society; 2013:222–23.
- Moose platform 2014. http://www.moosetechnology.org
- MSBVAR package 2012. http://cran.r-project.org/web/packages/MSBVAR/index.html
- Nierstrasz O, Ducasse S, Girba T: The story of Moose: an agile reengineering environment. In 10th European Software Engineering Conference (ESEC). USA: ACM; 2005:1–10.
- Sales V, Terra R, Miranda LF, Valente MT: Recommending move method refactorings using dependency sets. In 20th Working Conference on Reverse Engineering (WCRE). USA: IEEE Computer Society; 2013:232–241.
- Sliwerski J, Zimmermann T, Zeller A: Hatari: Raising risk awareness. In 10th European Software Engineering Conference (ESEC). USA: ACM; 2005:107–110.
- Tavares A, Valente MT: A gentle introduction to OSGi. ACM SIGSOFT Softw Eng Notes 2008, 33(5):1–5.
- Terra R, Valente MT: A dependency constraint language to manage object-oriented software architectures. Softw: Pract Exp 2009, 32(12):1073–1094.
- VerveineJ parser 2014. http://www.moosetechnology.org/tools/verveinej
- Wettel R: Visual exploration of large-scale evolving software. In 31st International Conference on Software Engineering (ICSE). USA: IEEE Computer Society; 2009:391–394.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.