SAL: An Effective Method for Software Defect Prediction

Paper ID: 116
Sadia Sharmin, Md Rifat Arefin, M. Abdullah-Al Wadud, Naushin Nower, Mohammad Shoyaib
Institute of Information Technology (IIT), University of Dhaka, Bangladesh
Tuesday, July 5, 2022

Upload: sadia-sharmin

Post on 16-Feb-2017


Page 1: SAL: An Effective Method for Software Defect Prediction

SAL: AN EFFECTIVE METHOD FOR SOFTWARE DEFECT PREDICTION

Paper ID: 116

Sadia Sharmin, Md Rifat Arefin, M. Abdullah-Al Wadud, Naushin Nower, Mohammad Shoyaib

Institute of Information Technology (IIT), University of Dhaka, Bangladesh

May 1, 2023

Page 2: SAL: An Effective Method for Software Defect Prediction

CONTENTS

Background
Motivation
Problem Specification
Literature Review
Methodology
Result Analysis and Discussion
Future Work

Page 3: SAL: An Effective Method for Software Defect Prediction

BACKGROUND

Software Defect
Any flaw or imperfection in a software work product or software process

Software Defect Prediction
An approach for identifying the defective parts of a software product before it is tested or released

Page 4: SAL: An Effective Method for Software Defect Prediction

AN OVERVIEW OF SOFTWARE DEFECT PREDICTION PROCESS

[Flow diagram. Stages: Data Set, Pre-processing, Attribute Selection, Training Data, Training, Prediction Model, Testing Data, Prediction Result]

Page 5: SAL: An Effective Method for Software Defect Prediction


MOTIVATION

Identifying the software bugs in an early stage

Allocating the test resources efficiently

Minimizing the cost of software development

Improving the quality and productivity of software

Page 6: SAL: An Effective Method for Software Defect Prediction

WHY PRE-PROCESSING IS NEEDED

Noisy data
Outliers
Missing or conflicting values
Inconsistency

Page 7: SAL: An Effective Method for Software Defect Prediction

WHY ATTRIBUTE SELECTION IS NEEDED

Attributes are not equally important
Some are highly responsible for the prediction outcome
Some may decrease the performance

Page 8: SAL: An Effective Method for Software Defect Prediction

LITERATURE REVIEW

Pre-processing

Log filtering (Song et al. [2])
Min-max normalization (Nam et al. [1])
Z-score normalization (Nam et al. [1])
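As an illustration of the two normalization schemes named on this slide, here is a minimal Python sketch. The function names and the sample values are hypothetical, and the population standard deviation is an assumption; the cited works may compute these differently.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant attribute carries no spread; map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score_normalize(values):
    """Shift to zero mean and scale by the (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical values of one software metric across five modules.
metric = [10, 20, 30, 40, 50]
print(min_max_normalize(metric))  # [0.0, 0.25, 0.5, 0.75, 1.0]
print(z_score_normalize(metric))
```

Both functions operate on a single attribute (one column of the data set) at a time.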

Page 9: SAL: An Effective Method for Software Defect Prediction

LITERATURE REVIEW

Attribute Selection Method

A threshold-based feature selection (Wang et al. [3])
A hybrid attribute selection method (Gao et al. [4])
A general defect prediction framework (Song et al. [2])

Page 10: SAL: An Effective Method for Software Defect Prediction

METHODOLOGY

SAL: Selection of Attribute with Log filtering

Pre-processing

Each attribute value $n$ is replaced by $\ln(n + \varepsilon)$, where $\varepsilon = 0.01$
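The log filter on this slide can be sketched in a few lines of Python. This is only an illustration: the function name and sample values are hypothetical, and the sketch assumes non-negative metric values, so that adding the constant 0.01 keeps the logarithm defined when a value is 0.

```python
import math

EPSILON = 0.01  # the constant given on the slide


def log_filter(values, epsilon=EPSILON):
    """Replace each attribute value n by ln(n + epsilon)."""
    return [math.log(v + epsilon) for v in values]


# Hypothetical metric values for four modules; note that 0 is handled.
metric = [0, 1, 25, 300]
print(log_filter(metric))
```

The transformation compresses large values far more than small ones, which reduces the influence of extreme metric values on the classifier.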

Page 11: SAL: An Effective Method for Software Defect Prediction

ATTRIBUTE SELECTION PROCESS

Attribute Set -> Attribute Ranking -> Best Set Selection

Page 12: SAL: An Effective Method for Software Defect Prediction

ATTRIBUTE RANKING

Attributes: A1, A2, A3, A4, A5, ..., An

Page 13: SAL: An Effective Method for Software Defect Prediction

ATTRIBUTE RANKING

Individual Balance value of each attribute:

A1    0.564
A2    0.764
A3    0.685
A4    0.798
A5    0.892
...   ...
An    0.789

Page 14: SAL: An Effective Method for Software Defect Prediction

ATTRIBUTE RANKING

Pair-wise combination of the attributes:

A1A2, A1A3, ..., A3A1, A3A2, ..., AmAn

Page 15: SAL: An Effective Method for Software Defect Prediction

ATTRIBUTE RANKING

Individual Balance value:

A1    0.034
A2    0.034
A3    0.456
A4    0.348
A5    0.784
...   ...
An    0.789

Pair-wise Balance value:

A1A2    0.896
A1A3    0.734
...     ...
A3A1    0.587
A3A2    0.669
...     ...
AmAn    0.897

Page 16: SAL: An Effective Method for Software Defect Prediction

ATTRIBUTE RANKING

Average Balance value for each attribute:

A1    0.765
A2    0.534
A3    0.679
A4    0.869
A5    0.987
...   ...
An    0.897

Page 17: SAL: An Effective Method for Software Defect Prediction

ATTRIBUTE RANKING

Average Balance value for each attribute:

$$\text{Average Balance Value} = \frac{\text{Individual value} + \text{Average value of } n \text{ pairs}}{2}$$

Page 18: SAL: An Effective Method for Software Defect Prediction

ATTRIBUTE RANKING

Average Balance value for each attribute, sorted in decreasing order:

A5     0.887
A4     0.869
A10    0.765
A8     0.750
A9     0.696
...    ...
An     0.523
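The ranking step built up on the slides above (the average of an attribute's individual Balance and the mean of its pair-wise Balance values, followed by a sort in decreasing order) could be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name and the numbers are hypothetical, the Balance values are assumed to have been computed beforehand by training a classifier, and each pair is treated as unordered.

```python
def rank_attributes(individual, pairwise):
    """Rank attributes by (individual Balance + mean pair-wise Balance) / 2.

    individual: dict mapping attribute name -> Balance of that attribute alone
    pairwise:   dict mapping (attr_a, attr_b) -> Balance of that pair
    Returns a list of (attribute, score) sorted by score, highest first.
    """
    scores = {}
    for attr, own in individual.items():
        # Collect the Balance of every pair that contains this attribute.
        pair_values = [b for pair, b in pairwise.items() if attr in pair]
        # Assumption: with no pairs available, fall back to the individual value.
        pair_mean = sum(pair_values) / len(pair_values) if pair_values else own
        scores[attr] = (own + pair_mean) / 2
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical Balance values for three attributes.
individual = {"A1": 0.50, "A2": 0.70, "A3": 0.60}
pairwise = {("A1", "A2"): 0.80, ("A1", "A3"): 0.60, ("A2", "A3"): 0.90}

for attr, score in rank_attributes(individual, pairwise):
    print(attr, round(score, 3))
# A2 0.775
# A3 0.675
# A1 0.6
```

Averaging in the pair-wise values lets an attribute that is weak on its own but useful in combination move up the ranking.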

Page 19: SAL: An Effective Method for Software Defect Prediction

SELECT BEST SET OF ATTRIBUTES

Ranking of Attributes: A5, A4, A10, A8, A9, ..., An

Best Set of Attributes: (empty)



Page 22: SAL: An Effective Method for Software Defect Prediction

SELECT BEST SET OF ATTRIBUTES

Ranking of Attributes: A4, A10, A8, A9, ..., An

Best Set of Attributes: A5

A5 (1st ranked): 0.887



Page 25: SAL: An Effective Method for Software Defect Prediction

SELECT BEST SET OF ATTRIBUTES

Ranking of Attributes: A10, A8, A9, ..., An

Best Set of Attributes: A5

A5 (1st ranked): 0.887
A4 (2nd ranked)

Page 26: SAL: An Effective Method for Software Defect Prediction

SELECT BEST SET OF ATTRIBUTES

Ranking of Attributes: A10, A8, A9, ..., An

Best Set of Attributes: A5

A5 (1st ranked): 0.887
A4 (2nd ranked)
Candidate set: A5A4

Page 27: SAL: An Effective Method for Software Defect Prediction

SELECT BEST SET OF ATTRIBUTES

Ranking of Attributes: A10, A8, A9, ..., An

Best Set of Attributes: A5

A5 (1st ranked): 0.887 (previous)
A4 (2nd ranked)
Combined Balance value of A5A4: 0.891 (new)

Page 28: SAL: An Effective Method for Software Defect Prediction

SELECT BEST SET OF ATTRIBUTES

Ranking of Attributes: A10, A8, A9, ..., An

Best Set of Attributes: A5

A5 (1st ranked): 0.887 (previous)
A4 (2nd ranked)
Combined Balance value of A5A4: 0.891 (new)

new value > previous value, so A4 is added to the best set


Page 30: SAL: An Effective Method for Software Defect Prediction

SELECT BEST SET OF ATTRIBUTES

Ranking of Attributes: A10, A8, A9, ..., An

Best Set of Attributes: A5, A4

Combined Balance value of A5A4: 0.891


Page 32: SAL: An Effective Method for Software Defect Prediction

SELECT BEST SET OF ATTRIBUTES

Ranking of Attributes: A8, A9, ..., An

Best Set of Attributes: A5, A4

Combined Balance value of A5A4: 0.891
A10 (3rd ranked)

Page 33: SAL: An Effective Method for Software Defect Prediction

SELECT BEST SET OF ATTRIBUTES

Ranking of Attributes: A8, A9, ..., An

Best Set of Attributes: A5, A4

Combined Balance value of A5A4: 0.891
A10 (3rd ranked)
Candidate set: A5A4A10

Page 34: SAL: An Effective Method for Software Defect Prediction

SELECT BEST SET OF ATTRIBUTES

Ranking of Attributes: A8, A9, ..., An

Best Set of Attributes: A5, A4

Combined Balance value of A5A4: 0.891
A10 (3rd ranked)
Combined Balance value of A5A4A10: 0.856 (new)

Page 35: SAL: An Effective Method for Software Defect Prediction

SELECT BEST SET OF ATTRIBUTES

Ranking of Attributes: A8, A9, ..., An

Best Set of Attributes: A5, A4

Combined Balance value of A5A4: 0.891 (previous)
A10 (3rd ranked)
Combined Balance value of A5A4A10: 0.856 (new)

new value < previous value

Page 36: SAL: An Effective Method for Software Defect Prediction

SELECT BEST SET OF ATTRIBUTES

Ranking of Attributes: A8, A9, ..., An

Best Set of Attributes: A5, A4

Combined Balance value of A5A4: 0.891
A10 (3rd ranked): Discarded

Page 37: SAL: An Effective Method for Software Defect Prediction

SELECT BEST SET OF ATTRIBUTES

Ranking of Attributes: A8, A9, ..., An

Best Set of Attributes: A5, A4

Continue this process ...

Page 38: SAL: An Effective Method for Software Defect Prediction

SELECT BEST SET OF ATTRIBUTES

Best Set of Attributes: A5, A4, A9, A12, A7
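The selection walk-through on the preceding slides amounts to a greedy pass over the ranked attributes: an attribute is kept only if adding it raises the combined Balance value. A minimal Python sketch under stated assumptions follows. `select_best_set` and `evaluate_balance` are hypothetical names; `evaluate_balance` stands in for training the classifier on a candidate set and measuring its Balance; ties are discarded here because the slides only show the greater-than and less-than cases.

```python
def select_best_set(ranked_attributes, evaluate_balance):
    """Greedy selection over attributes already sorted by rank.

    ranked_attributes: attribute names, best ranked first
    evaluate_balance:  callable taking a list of attributes and returning
                       the Balance obtained with exactly those attributes
    Returns (best_set, best_balance).
    """
    best_set = []
    best_balance = float("-inf")
    for attr in ranked_attributes:
        candidate = best_set + [attr]
        new_balance = evaluate_balance(candidate)
        if new_balance > best_balance:
            # new value > previous value: keep the attribute
            best_set, best_balance = candidate, new_balance
        # otherwise the attribute is discarded and the set is unchanged
    return best_set, best_balance


# Stand-in evaluator using the Balance values shown on the slides.
slide_values = {
    ("A5",): 0.887,
    ("A5", "A4"): 0.891,
    ("A5", "A4", "A10"): 0.856,
}
best_set, best_balance = select_best_set(
    ["A5", "A4", "A10"], lambda attrs: slide_values[tuple(attrs)]
)
print(best_set, best_balance)  # ['A5', 'A4'] 0.891
```

With the slide values, A4 is accepted (0.891 > 0.887) and A10 is discarded (0.856 < 0.891), matching the walk-through.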

Page 39: SAL: An Effective Method for Software Defect Prediction

RESULT AND DISCUSSIONS

Data set: NASA MDP repository and PROMISE repository
Classifier: Naïve Bayes
Performance Metrics: Balance, AUC
Programming Language: Java
Machine Learning Tool: WEKA

Page 40: SAL: An Effective Method for Software Defect Prediction

PERFORMANCE MEASUREMENT SCALES

Confusion Matrix

                    Predicted Positive    Predicted Negative
Actual Positive     TP                    FN
Actual Negative     FP                    TN
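The slides use Balance as a metric without defining it. The sketch below assumes the definition commonly used in the defect prediction literature: with probability of detection pd = TP / (TP + FN) and probability of false alarm pf = FP / (FP + TN), Balance is one minus the normalized Euclidean distance from the ideal point (pf = 0, pd = 1). The function name and the counts are hypothetical.

```python
import math


def pd_pf_balance(tp, fn, fp, tn):
    """Compute pd, pf and Balance from confusion-matrix counts.

    Assumes at least one actual positive and one actual negative,
    otherwise the divisions below are undefined.
    """
    pd = tp / (tp + fn)  # probability of detection (recall)
    pf = fp / (fp + tn)  # probability of false alarm
    balance = 1 - math.sqrt((0 - pf) ** 2 + (1 - pd) ** 2) / math.sqrt(2)
    return pd, pf, balance


# Hypothetical counts: 50 defective and 100 non-defective modules.
pd, pf, balance = pd_pf_balance(tp=40, fn=10, fp=20, tn=80)
print(pd, pf, round(balance, 3))  # 0.8 0.2 0.8
```

A Balance of 1 corresponds to detecting every defective module with no false alarms; lower values mean the classifier is further from that ideal point.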

Page 41: SAL: An Effective Method for Software Defect Prediction

RESULT AND DISCUSSIONS

Comparison of AUC values of different methods

Data set    [5]      [6]      [7] Lowest    [7] Highest    SAL
CM1         0.702    0.723    0.550         0.724          0.7946
KC1         0.79     0.790    0.592         0.800          0.8006
KC2         -        -        0.591         0.796          0.8449
KC3         0.677    -        0.569         0.713          0.8322
KC4         -        -        -             -              0.8059
MC1         -        -        -             -              0.8110
MC2         0.739    -        -             -              0.7340
MW1         0.724    -        0.534         0.725          0.7340
PC1         0.799    -        0.692         0.882          0.8369
PC2         0.805    -        -             -              0.8668
PC3         0.78     0.795    -             -              0.8068
PC4         0.861    -        -             -              0.9049
PC5         -        -        -             -              0.9624
JM1         -        0.717    -             -              0.7167
AR1         -        -        -             -              0.8167
AR3         -        -        0.580         0.699          0.8590
AR4         -        -        0.555         0.671          0.8681
AR5         -        -        0.614         0.722          0.925
AR6         -        -        -             -              0.7566

Page 42: SAL: An Effective Method for Software Defect Prediction

RESULT AND DISCUSSIONS

Comparison of Balance values of different methods

Dataset    [2]      [8]      [9]       SAL
CM1        0.695    0.663    0.5500    0.680
JM1        0.585    0.678    -         0.6152
KC1        0.707    0.718    -         0.7244
KC2        -        0.753    -         0.7835
KC3        0.708    0.693    0.6037    0.7529
KC4        0.691    -        -         0.7036
MC1        0.793    -        -         0.6904
MC2        0.614    0.620    -         0.6847
MW1        0.661    0.636    0.7202    0.6577
PC1        0.668    0.688    0.5719    0.7040
PC2        -        -        0.7046    0.7468
PC3        0.711    0.749    0.7114    0.7232
PC4        0.821    0.854    0.7450    0.8272
PC5        0.904    -        -         0.9046
AR1        0.411    -        -         0.6651
AR3        0.661    -        -         0.8238
AR4        0.683    -        -         0.7051
AR6        0.492    -        -         0.5471

Page 43: SAL: An Effective Method for Software Defect Prediction

FUTURE WORK

Cross-project defect prediction
Using other publicly available datasets

Page 44: SAL: An Effective Method for Software Defect Prediction

REFERENCES

[1] Nam, Jaechang, Sinno Jialin Pan, and Sunghun Kim. "Transfer defect learning." In Proceedings of the 2013 International Conference on Software Engineering, pp. 382-391. IEEE Press, 2013.

[2] Song, Qinbao, Zihan Jia, Martin Shepperd, Shi Ying, and Jin Liu. "A general software defect-proneness prediction framework." IEEE Transactions on Software Engineering 37, no. 3 (2011): 356-370.

[3] Wang, Huanjing, Taghi M. Khoshgoftaar, and Naeem Seliya. "How many software metrics should be selected for defect prediction?" In FLAIRS Conference, 2011.

[4] Gao, Kehan, Taghi M. Khoshgoftaar, and Huanjing Wang. "An empirical investigation of filter attribute selection techniques for software quality classification." In IEEE International Conference on Information Reuse & Integration (IRI'09), pp. 272-277. IEEE, 2009.

[5] Wahono, Romi Satria, and Nanna Suryana Herman. "Genetic Feature Selection for Software Defect Prediction." Advanced Science Letters 20, no. 1 (2014): 239-244.

[6] Abaei, Golnoush, and Ali Selamat. "A survey on software fault detection based on different prediction approaches." Vietnam Journal of Computer Science 1, no. 2 (2014): 79-95.

Page 45: SAL: An Effective Method for Software Defect Prediction

REFERENCES

[7] Ren, Jinsheng, Ke Qin, Ying Ma, and Guangchun Luo. "On software defect prediction using machine learning." Journal of Applied Mathematics 2014 (2014).

[8] Wang, Shuo, and Xin Yao. "Using class imbalance learning for software defect prediction." IEEE Transactions on Reliability 62, no. 2 (2013): 434-443.

[9] Khan, Jobaer, Alim Ul Gias, Md Saeed Siddik, Md Hafizur Rahman, Shah Mostafa Khaled, and Mohammad Shoyaib. "An attribute selection process for software defect prediction." In 2014 International Conference on Informatics, Electronics & Vision (ICIEV), pp. 1-4. IEEE, 2014.
