TUESDAY, AUGUST 22, 2017

Predicting Code Inspection Faults

Inspections are one of the most common methods of review performed in software teams. The goal of code inspection is to identify software faults early in the software development lifecycle. Teams are faced with the challenge, however, of determining whether those inspections are effective. One way to quantify this is by predicting the total number of faults that can be found in an inspection. A prediction model can be applied to evaluate an inspection process and refine the achieved quality level.

Inspection fault density (the number of operational faults detected divided by the number of lines of code inspected) is used by development teams to evaluate the effectiveness of code inspections. Teams are required to re-inspect code whenever an inspection does not meet the inspection fault density guideline – which can potentially waste resources. Conversely, a block of code under inspection can be passed without any further search for hidden (but existing) faults as soon as the guideline is satisfied. Hence, there is a need for a fault prediction model based on the various factors associated with software development and inspection. This article describes the development of such a model.
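As a concrete illustration of this rule, here is a minimal sketch in Python. The guideline threshold and the inspection figures are assumed values for illustration, not the teams' actual numbers.

    # Sketch of the inspection fault density rule described above.
    # GUIDELINE is an assumed threshold (faults per KLOC), not the teams' real value.

    def fault_density(faults_found: int, loc_inspected: int) -> float:
        """Inspection fault density in faults per 1,000 lines of code (KLOC)."""
        return faults_found / (loc_inspected / 1000.0)

    GUIDELINE = 5.0  # assumed: at least 5 faults/KLOC expected from a thorough inspection

    density = fault_density(faults_found=2, loc_inspected=800)
    if density < GUIDELINE:
        print(f"{density:.2f} faults/KLOC is below the guideline: re-inspect.")
    else:
        print(f"{density:.2f} faults/KLOC meets the guideline: pass.")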

Prediction Model Foundation

The number of faults detected during code inspection is characterized by the Poisson distribution, which models counts of events occurring at a constant mean rate over a specified interval. An inspection model based on the Poisson distribution therefore assumes that the mean rates of fault introduction and fault detection are constant over a code block of specified length. In practice, however, this assumption raises several potential problems:

  1. Different inspections generally cover code blocks of different lengths.
  2. The rates of fault introduction and detection probably vary over different coding languages. Other variables may also influence these rates.
  3. The fault detection rate will vary according to the inspector’s skill and preparation.

All of these problems are addressed by the statistical procedure called generalized linear modeling (GLM). Like ordinary regression, a GLM relates a response variable to one or more continuous predictors and categorical factors. With a Poisson response and a log link, the model takes the form log(expected faults) = β0 + β1x1 + … + βkxk; this generalizes the simple Poisson model so that the fault detection rate can also depend on the other factors present in the model.

To develop a model that predicts the expected number of faults to be found in an inspection, the potential factors that influence fault detection must first be identified. Based on historical knowledge, the following eight factors were identified for further investigation (a model-fitting sketch follows the list):

  1. Type of inspected file: .c file / .h file / combination of both .c and .h
  2. Inspected size in lines of code (LOC): Only formal inspections covering at least 40 LOC were considered.
  3. Inspected size squared: This term is included to incorporate nonlinearity into the proposed model, since the number of faults detected was observed to follow a nonlinear trend with respect to the inspected size (LOC).
  4. Type of work: Inspections done on a new feature (newly written code) or port feature (code taken from existing library).
  5. Complexity of inspected item: Essentially, a measure of the extent to which the code is prone to defects. The average numbers of development branches and panic branches were used as the complexity measure. (Both are types of revision-control branches: a development branch holds code that is under development and not yet officially released, used to build new features or modules, while a panic branch is created when a change request is raised to rectify an operational fault in the system.) If a particular block of code has undergone frequent changes, the probability that it is defect-prone is also high. Depending on the average numbers of development and panic branches, inspections were classified into one of three categories: 1) high-risk (an average of 10 development branches and 2 panic branches), 2) medium-risk (an average of 5 development branches and 1 panic branch) and 3) low-risk (no development or panic branches).
  6. Programming language: Inspections done on code written in C or C++ language.
  7. Development teams: Inspection carried out by Team A or Team B.
  8. Platform: The technological platform on which the software and its user interfaces are developed, such as Platform C or Platform D.
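The article does not show the fitting code, but a minimal sketch of how such a model could be fit with Python's statsmodels is given below. The file name inspections.csv and the column names (faults, file_type, work_type, loc, dev_branches, panic_branches) are hypothetical.

    # Hypothetical sketch: fit a Poisson GLM (log link) of fault counts on the
    # candidate factors above. Data file and column names are assumptions.
    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    df = pd.read_csv("inspections.csv")  # one row per historical inspection

    model = smf.glm(
        "faults ~ C(file_type) + C(work_type) + loc + I(loc**2)"
        " + dev_branches + panic_branches",
        data=df,
        family=sm.families.Poisson(),  # log link is the Poisson default
    ).fit()
    print(model.summary())  # estimates, Wald CIs and significance, as in Table 2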

Model Development

Hypothesis testing was carried out to determine the significance levels of the aforementioned factors. The results are shown in Table 1.

Table 1: Hypothesis Testing Results for Potential Factors
    Factor                           | Statistical hypothesis              | p-value | Conclusion
    File type (.h and .c)            | H0: p.h = p.c; H1: p.h ≠ p.c        | 0.045   | Statistically significant
    Type of work (new and port)      | H0: pnew = pport; H1: pnew ≠ pport  | 0.000   | Statistically significant
    Programming language (C and C++) | H0: pC = pC++; H1: pC ≠ pC++        | 0.312   | Fail to reject null hypothesis
    Development teams (A and B)      | H0: pA = pB; H1: pA ≠ pB            | 0.368   | Fail to reject null hypothesis
    Platform (C and D)               | H0: pC = pD; H1: pC ≠ pD            | 0.620   | Fail to reject null hypothesis

where p* is the inspection fault density for the specified criteria.
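The article does not name the specific test behind Table 1; a two-sample t-test on per-inspection fault densities is one plausible choice, sketched below with made-up data.

    # Illustrative two-group comparison of inspection fault densities,
    # in the spirit of Table 1. The test choice and the data are assumptions.
    import numpy as np
    from scipy import stats

    density_new = np.array([4.1, 5.3, 6.0, 4.8, 5.5])   # faults/KLOC, new-feature work
    density_port = np.array([2.2, 3.0, 2.7, 3.4, 2.5])  # faults/KLOC, port-feature work

    t_stat, p_value = stats.ttest_ind(density_new, density_port, equal_var=False)
    print(f"p = {p_value:.3f}")  # p < 0.05 -> reject H0: p_new = p_port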

Based on the hypothesis-testing results, it was concluded that two factors – file type and type of work – have a significant effect on the number of faults detected in an inspection. Next, a GLM was developed using the number of operational faults as the response variable. The results of the GLM are shown in Table 2.

Table 2: Parameter Estimates for GLM Model (Log Link Function)

    Parameter                    | Estimate  | Std. error | 95% Wald CI (lower, upper) | Wald chi-square | df | Significance
    (Intercept)                  | -0.108    | 0.5458     | (-1.178, 0.962)            | 0.039           | 1  | 0.843
    [file = .c]                  | -0.656    | 0.4532     | (-1.544, 0.233)            | 2.093           | 1  | 0.048
    [file = .c and .h]           | -0.460    | 0.4195     | (-1.282, 0.362)            | 1.202           | 1  | 0.023
    [file = .h]                  | 0 (reference)
    [work = new]                 | 0.579     | 0.2607     | (0.068, 1.090)             | 4.928           | 1  | 0.026
    [work = port]                | 0 (reference)
    Inspected size               | 0.003     | 0.0009     | (0.001, 0.005)             | 10.342          | 1  | 0.001
    Inspected size squared       | -8.58E-07 | 4.37E-07   | (-1.72E-06, -1.31E-09)     | 3.853           | 1  | 0.050
    Average development branches | 0.030     | 0.0248     | (-0.019, 0.079)            | 1.441           | 1  | 0.030
    Average panic branches       | 0.288     | 0.4396     | (-0.573, 1.150)            | 0.430           | 1  | 0.012

Adjusted R² = 60.03%
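To make the table concrete: with the log link, the expected fault count is the exponential of the linear predictor. Below is a sketch using the point estimates from Table 2; the example inspection's parameters are made up.

    # Expected faults = exp(linear predictor), using Table 2 point estimates.
    import math

    INTERCEPT = -0.108
    FILE = {".c": -0.656, ".c and .h": -0.460, ".h": 0.0}  # .h is the reference level
    WORK = {"new": 0.579, "port": 0.0}                     # port is the reference level
    B_LOC, B_LOC2 = 0.003, -8.58e-07
    B_DEV, B_PANIC = 0.030, 0.288

    def predicted_faults(file_type, work_type, loc, dev_branches, panic_branches):
        eta = (INTERCEPT + FILE[file_type] + WORK[work_type]
               + B_LOC * loc + B_LOC2 * loc ** 2
               + B_DEV * dev_branches + B_PANIC * panic_branches)
        return math.exp(eta)

    # Hypothetical inspection: new .c code, 500 LOC, medium risk (5 dev / 1 panic branch)
    print(f"expected faults = {predicted_faults('.c', 'new', 500, 5, 1):.1f}")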

Prediction Charts

The statistical results shown in Table 2 were used to generate prediction charts with 95 percent confidence limits; the upper and lower limits were calculated from the 95 percent Wald confidence intervals of the parameter estimates in Table 2. Since the inspected file type, work type, average number of development branches and average number of panic branches are significant predictors of faults (Table 2), separate prediction charts were generated for combinations of these factors; two of them are shown in Figures 1 and 2.

Figure 1: Inspection Fault Prediction Chart – Low Risk

Figure 2: Inspection Fault Prediction Chart – High Risk

Similar inspection fault prediction charts may be plotted for other combinations of the predictor variables, based on the requirements of the software development effort; a plotting sketch follows.
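The sketch below uses the low-risk, new .c combination as an illustrative choice. The crude band simply plugs the Wald limits from Table 2 into the linear predictor, mirroring the article's description of how the chart limits were computed; it is not a formal prediction interval.

    # Sketch of a prediction chart in the style of Figures 1 and 2.
    import numpy as np
    import matplotlib.pyplot as plt

    loc = np.arange(40, 2001, 10)  # formal inspections cover at least 40 LOC

    # Low-risk (no branches), new .c code; coefficients from Table 2.
    point = np.exp(-0.108 - 0.656 + 0.579 + 0.003 * loc - 8.58e-07 * loc ** 2)
    low   = np.exp(-1.178 - 1.544 + 0.068 + 0.001 * loc - 1.72e-06 * loc ** 2)
    high  = np.exp( 0.962 + 0.233 + 1.090 + 0.005 * loc - 1.31e-09 * loc ** 2)

    plt.plot(loc, point, label="expected faults")
    plt.fill_between(loc, low, high, alpha=0.2, label="95% limits (Table 2)")
    plt.xlabel("Inspected size (LOC)")
    plt.ylabel("Predicted faults")
    plt.title("Low risk, new .c code (illustrative)")
    plt.legend()
    plt.show()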

Model Validation

Model validation was performed on inspections from an ongoing project. Based on the existing guidelines for inspection fault density, 28 inspections had been judged free of defects. The parameters of these 28 inspections were fed into the GLM, and seven inspections were selected for re-inspection. Development teams found additional operational faults in four of these seven inspections. The model also correctly identified 16 inspections as bug-free, saving approximately 200 staff-hours – a significant improvement.
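A sketch of how such a screen might work: among inspections that reported zero faults, flag those whose model prediction is high for re-inspection. The cut-off and the example rows below are illustrative assumptions, and the helper repeats the Table 2 point estimates from the earlier sketch.

    # Sketch of the validation screen: flag zero-fault inspections whose model
    # prediction is high. Threshold and example rows are illustrative assumptions.
    import math

    FILE = {".c": -0.656, ".c and .h": -0.460, ".h": 0.0}
    WORK = {"new": 0.579, "port": 0.0}

    def predicted_faults(file_type, work_type, loc, dev, panic):
        eta = (-0.108 + FILE[file_type] + WORK[work_type]
               + 0.003 * loc - 8.58e-07 * loc ** 2 + 0.030 * dev + 0.288 * panic)
        return math.exp(eta)

    THRESHOLD = 3.0  # assumed cut-off on expected faults for re-inspection

    zero_fault_inspections = [            # (file, work, LOC, dev, panic) - made up
        (".c", "new", 900, 10, 2),
        (".h", "port", 120, 0, 0),
        (".c and .h", "new", 450, 5, 1),
    ]

    for row in zero_fault_inspections:
        expected = predicted_faults(*row)
        verdict = "re-inspect" if expected >= THRESHOLD else "accept as defect-free"
        print(f"{row}: expected {expected:.1f} faults -> {verdict}")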

Future Work

Because the model predicts the expected number of faults based on the detection ability of the teams, its output remains a guideline. The guideline can be converted into goals by including testing defects, which are simply faults that escaped inspection and were detected later in testing. The model explains about 60 percent of the overall variation in operational faults; the explained variation can be improved by adding more predictor variables, such as inspection rate, preparation rate, team size and the experience level of the inspection team.





Comments

Jeff Larsen

Excellent article and very well written.

Thanks for sharing, Jeff

Shibnarayan

Thanks for your comments Jeff.

Regards,
Shiv

BRAJA GOPAL SAHOO

First of all it is a very good article.

My observation is that you assume a constant mean arrival rate over a specified interval for your model, but that arrival rate may change depending on the type of work, i.e., new feature or port feature.

BRAJA GOPAL SAHOO

Secondly, you could also mention the R-squared value for your GLM model; by comparing R-squared and adjusted R-squared, one can check model parsimony with respect to the predictor variables.

Dasharath Koulage

Very good article.

Shibnarayan

Thanks for your comments, BG. Yes, the assumption of a constant fault-introduction rate is not practically feasible, so a GLM is used instead of a homogeneous Poisson process model. The basic assumption here is that the mean arrival rate of faults is not constant and varies depending on the length of the inspected code, the experience level of the inspectors, the work type (new or port), etc.

Regarding your second comment: yes, you are right. Providing the R-squared value is important for checking whether the model is over-fitted. In this case the R-squared value was close to the adjusted R-squared: R-squared was 63% and adjusted R-squared was 60.03%.

Again Thanks a lot for your feedback/suggestions.

Regards,
Shib

Russell Lindquist

Thank you for this article. I haven’t seen much written about human errors in regards to using the Poisson distribution. It was good to see how you looked into potential variables influencing predicted error density.

—Russell

Shibnarayan

Thanks for your feedback.

Regards,
Shiv


