Discrimination Analysis for Predicting Defect-Prone Software Modules

Ying Ma; Ke Qin; Shunzhi Zhu

doi:10.1155/2014/675368

J. Appl. Math. 2014: 1-14 (2014). DOI: 10.1155/2014/675368

Abstract

Software defect prediction studies usually build models without analyzing the data used in the procedure. As a result, the same approach performs differently on different data sets. In this paper, we introduce discrimination analysis as a method for gaining insight into the inherent properties of software data. Based on this analysis, we find that the data sets used in this field are nonlinearly separable and class-imbalanced. Unlike prior work, we exploit the kernel method to nonlinearly map the data into a high-dimensional feature space. To combat these two problems, we propose an algorithm based on kernel discrimination analysis, called KDC, to build a more effective prediction model. Experimental results on data sets from different organizations indicate that KDC is more accurate in terms of $F$-measure than the state-of-the-art methods. We are optimistic that our discrimination analysis method can guide further studies of data structure, which may yield useful knowledge from data science for building more accurate prediction models.
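
The abstract reports results in terms of $F$-measure on class-imbalanced data. As a minimal sketch of why that metric matters here, the snippet below compares accuracy with $F$-measure for two hypothetical predictors. The module counts and predictions are invented for illustration and are not taken from the paper, and the helper names `accuracy` and `f_measure` are assumptions of this sketch, not part of KDC.

```python
def accuracy(y_true, y_pred):
    """Fraction of modules whose label was predicted correctly."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f_measure(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the defect-prone class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No defective module was found, so precision and recall are both 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical imbalanced data set: 10 defective modules (1), 90 clean ones (0).
y_true = [1] * 10 + [0] * 90

# Predictor A labels every module as clean.
pred_a = [0] * 100

# Predictor B finds 8 of the 10 defective modules but raises 12 false alarms.
pred_b = [1] * 8 + [0] * 2 + [1] * 12 + [0] * 78

for name, pred in (("A", pred_a), ("B", pred_b)):
    print(name, round(accuracy(y_true, pred), 3), round(f_measure(y_true, pred), 3))
```

In this made-up case, predictor A has the higher accuracy even though it never flags a defective module, while its $F$-measure is zero. That gap is the usual reason for preferring $F$-measure when the defect-prone class is rare.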

Citation

Ying Ma, Ke Qin, Shunzhi Zhu. "Discrimination Analysis for Predicting Defect-Prone Software Modules." J. Appl. Math. 2014: 1-14, 2014. https://doi.org/10.1155/2014/675368