According to respondents of the 11th Annual iSixSigma Global Salary Survey, the average annual worldwide Black Belt (BB) salary declined $224, a decrease of less than 1 percent. The difference is not statistically significant, indicating that salaries have not substantially changed over the past year. Only time will tell if this stagnation is the beginning of a reversal in the overall upward trend of average salaries in the decade since this research was first undertaken, or simply part of the pattern of continued gradual increase over time (Figure 1).
The average salary of Champions, on the other hand, increased dramatically to $127,500, the highest average salary worldwide for Champions since the iSixSigma salary survey began in 2004. However, with so few Champion data points, not too much can be read into this. Last year the average salary for Champions was $90,833, and $88,929 in the 2012 report (Figure 2).
Although business professionals (BPs) may not be in Six Sigma roles currently, nearly half of these professionals (47 percent) are certified as BBs and 29 percent are certified as Master Black Belts (MBBs) (Figure 3).
Data for the Annual iSixSigma Global Salary Survey is collected from the iSixSigma Job Shop, where participants answer several required questions (e.g., location, highest level of education, salary range, etc.). Only information from those who provided or updated their information within the prior 12 months was included in the analysis.
This year's salary survey includes responses from 1,391 participants.
Please note: Survey participants provide salary information by ranges, beginning at < $20,000, then $20,001 to $25,000, continuing at increments of $5,000 to the final salary range of > $200,000. In analyzing the salary data, each range was converted to the median salary for that specific range. For example, < $20,000 was calculated to be $17,500; $20,001 to $25,000 was calculated to be $22,500; and so on.
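The range-to-median conversion described above amounts to taking the midpoint of each salary bucket. A minimal sketch in Python (the function name is illustrative, and treating "< $20,000" as a $15,000–$20,000 bucket is an inference from the stated $17,500 result, not part of the published methodology):

```python
def bucket_median(lower, upper):
    # Each survey range is represented by its midpoint, e.g.,
    # $20,001-$25,000 -> $22,500. Bucket edges are passed as the
    # nominal $5,000 boundaries.
    return (lower + upper) / 2

# The open-ended "< $20,000" range is treated here as $15,000-$20,000,
# consistent with the survey's stated conversion of that range to $17,500.
print(bucket_median(15_000, 20_000))  # 17500.0
print(bucket_median(20_000, 25_000))  # 22500.0
```

How the open-ended top range (> $200,000) was converted is not stated in the survey note, so it is deliberately not handled here.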
Black Belt: Full-time professional who leads Six Sigma projects. Typically has four to five weeks of classroom training in methods, statistical tools and team skills. Sometimes provides coaching and Six Sigma expertise to Green Belts.
Master Black Belt: An expert in Six Sigma methodology and statistical tools who provides strategic Six Sigma guidance within a specific function or business unit. An MBB often has prior experience as a BB. Responsible for coaching, mentoring and/or training BBs, an MBB often helps the Six Sigma Deployment Leader and Champions keep the initiative on track.
Champion: Middle- or senior-level executive who sponsors a specific Six Sigma project or effort, ensuring that resources are available and cross-functional issues are resolved.
Deployment Leader: Senior-level executive responsible for implementing Six Sigma enterprise-wide. Typically reports to C-level executives. Responsible for developing, implementing and maintaining a standardized, company-wide quality system focused on customer satisfaction, defect prevention and continuous improvement.
Quality Professional and Quality Executive: While not universally regarded as Six Sigma roles, QPs and QEs may have Six Sigma responsibilities and qualifications.
Business Professional: Although they may not be in Six Sigma roles currently, the majority of these professionals possess Six Sigma certification and may have project experience.
Rehoboth, Massachusetts (July 20, 2014) – Six Sigma Integration, Inc. (www.sixsigmaintegration.com) announces the release of Lean Six Sigma for Supply Chain Management, Second Edition. Sales of the book have already begun. Orders are being accepted at:
http://www.mhprofessional.com/product.php?isbn=0071793054
Lean Six Sigma for Supply Chain Management, Second Edition has been fully revised to cover recent dramatic developments in supply chain improvement methodologies. This strategic guide brings together the Six Sigma and Lean manufacturing tools and techniques required to eliminate supply chain issues and increase profitability. The updated edition offers new coverage of enterprise kaizen events, big data analytics, customer loyalty metrics, security, sustainability, and design for excellence.
The structured 10-Step Solution Process presented in the book ensures that clear goals are established and tactical objectives are consistently met through the deployment of aligned Lean Six Sigma projects. Written by a Master Black Belt and Lean Six Sigma consultant, this practical resource also provides an inventory model and Excel templates for download at www.mhprofessional.com/LSSSCM2.
Lean Six Sigma for Supply Chain Management, Second Edition, covers:
Hardcover: 400 pages
Publisher: McGraw-Hill Professional; 2nd edition (May 13, 2014)
Language: English
ISBN-10: 0071793054
ISBN-13: 978-0071793056
Complete with roadmaps and checklists, this book will help busy supply chain and Lean Six Sigma professionals discover more efficient ways to manage and analyze business processes and, ultimately, increase overall operational efficiency.
James William Martin is president of the consulting firm Six Sigma Integration, Inc. As a Lean Six Sigma consultant and Master Black Belt for 10 years, he has trained and mentored more than 3,000 belts, executives, and deployment champions worldwide in a dozen different industries. He is also the author of Unexpected Consequences: Why the Things We Trust Fail; Lean Six Sigma for the Office; Operational Excellence: Using Lean Six Sigma to Translate Customer Value through Global Supply Chains; and Measuring and Improving Performance: Information Technology Applications in Lean Systems. He has served as an instructor at the Providence College Graduate School of Business since 1988. He holds an M.S. in mechanical engineering from Northeastern University, an M.B.A. from Providence College, and a B.S. in industrial engineering from the University of Rhode Island.
A simulation study was undertaken to quantify the relationship between guard banding, percent tolerance (also known as the precision-to-tolerance [P/T] ratio), and the probability of misclassification – all in the presence of varying distributions for both part values and gage error. A response surface designed experiment was utilized to generate a balanced set of factor level combinations. The following four factors were used to summarize the important characteristics of the gage and part distributions:
1. True value capability index, P_{pk}. This factor describes the variance of the true value population with respect to the specification limits, including the centering of the true value mean, µ, within them:

P_{pk} = min(CPL, CPU)   (8a)

where CPL = (µ − LSL) ⁄ (3σ_{p}), based on the difference between the center line and the LSL (lower specification limit), and CPU = (USL − µ) ⁄ (3σ_{p}), based on the difference between the USL (upper specification limit) and the center line.
2. The ratio

P_{pk} ⁄ P_{p}   (9a)

where P_{p} is described by equation (9b):

P_{p} = (USL − LSL) ⁄ (6σ_{p})   (9b)

This factor describes the "centeredness" of the true value population within the specification limits.
3. The ratio

σ_{p} ⁄ σ_{g}   (10)

This factor describes gage variance with respect to true value variance. It is effectively the inverse of percent process [see equation (4) in Part 1] divided by 100 percent.
4. Guard band count, k. As defined in equation (11), k is the number of gage standard deviations (σ_{g}) taken within each specification limit to establish guard-banded specification limits:

USL_{gb} = USL − kσ_{g}, LSL_{gb} = LSL + kσ_{g}   (11)
Assuming normal distributions for gage error and the true value population, these four factors can be used to establish probability density functions for gage variability and the "true" value population with respect to specification limits, without requiring absolute values of gage variance or true value variance. (Note: In reality, there is no such thing as a "true" value, since GR&R only seeks to understand the precision of the measurements versus some measurement space [either the tolerance or the observed spread of the parts – which also includes the gage variation]. References to a "true" value imply that accuracy was studied, but accuracy is not covered in this article.)
These probability functions can be used to calculate percent tolerance and probability of misclassification. The four factors can be combined to establish a space of typical gage variance, true value population variance and guard banding, which is then used to map percent tolerance and probability of misclassification over the various combinations of these factors. Once these two gage metrics are mapped over combinations of these factors, relationships between the two metrics can be established over the same space. If only a 1-sided specification is available, the same probability distribution functions can still be established and the same mapping is still possible.
These four factors are not independent of one another, and when combined to map and compare gage metrics they will not form an orthogonal comparison grid, in which grid lines would be perpendicular at their intersections. However, comparison between gage metrics over ranges of each factor provides a means to draw general conclusions about the effectiveness of each metric in various circumstances without having absolute measured values. The lack of orthogonality must be taken into account when making this comparison, but it does not preclude obtaining useful information from the study.
Percent tolerance and probability of misclassification can be modeled over combinations of the four factors using response surface analysis, in which the models can be represented using contour plots of the fitted surfaces. The type of response surface design of experiment (DOE) chosen for this study is a central composite design (CCD).^{1} In this type of design, factorial combinations of factors – along with center points and axial points – are used to structure the study inputs. The resulting data can be used to fit models involving primary factors, their interactions and second-order polynomial terms. The ratio defined in (10) can vary over multiple orders of magnitude; for this simulation it was converted to a natural log scale, allowing a range spanning multiple orders of magnitude to be entered on a linear scale. Experimental results can be used to estimate multifactor regression equations, which can then be used to numerically predict responses over the design space.
The design space is chosen to represent typical ranges of equations (8), (9), (10) and (11), while avoiding conditions where combinations of (10) and (11) would satisfy (7), thereby resulting in null values for probability of misclassification. The values of axial points are selected as 1.1 times larger than the extent of primary factorial points away from the center point for similar reasons. The lower axial point for (11) is set to zero to avoid a negative value. The values of the factorial points, center points and axial points for the three remaining inputs to the design are shown in Table 1.
Table 1: DOE Inputs for Response Surface Mapping Percent Tolerance and Probability of Misclassification

| Factor | Center | Factorial (Lower, Upper) | Axial (Lower, Upper) |
| --- | --- | --- | --- |
| P_{pk} ⁄ P_{p} | 1.0 | 0.5, 1.0 | 0.45 |
| P_{pk} | 1.0 | 0.5, 1.5 | 0.45, 1.55 |
| ln(σ_{p} ⁄ σ_{g}) | 1.5 | 0, 3 | −0.15, 3.15 |
| Guard band k | 1.0 | 0, 2 | 0, 2.1 |
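The structure of such a design can be sketched in coded units as follows. This is a minimal illustration, not the actual design used in the study: the axial distance of 1.1 follows the article's "1.1 times" rule, the single center point is an assumption, and decoding to the natural units in Table 1 (including clamping the guard band axial point at zero) would be a separate step.

```python
import itertools

def ccd_points(n_factors, alpha=1.1, n_center=1):
    # Central composite design in coded units: 2^k factorial corners
    # at +/-1, two axial points per factor at +/-alpha, plus center runs.
    pts = [list(p) for p in itertools.product((-1.0, 1.0), repeat=n_factors)]
    for i in range(n_factors):
        for a in (-alpha, alpha):
            axial = [0.0] * n_factors
            axial[i] = a
            pts.append(axial)
    pts.extend([0.0] * n_factors for _ in range(n_center))
    return pts

design = ccd_points(4)  # 16 corners + 8 axial points + 1 center = 25 runs
```

Each coded point would then be mapped to actual factor values using the center and half-range of each row in Table 1.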
Given a value for each of the four factors and known specification limits, the values of process mean, gage variance and process variance can be calculated. From here, percent tolerance and probability of misclassification can be estimated. An LSL of zero and USL of 100 were used to numerically simulate probability distribution functions for gage variance and true value population based on combinations of equations (8a), (9a) and the natural log of (10). These probability distribution functions were used to calculate percent tolerance according to equation (5) and to estimate the probability of misclassifying a bad unit as good and a good unit as bad for each combination of the four factors in the CCD DOE. All outputs were found to vary over more than 2 orders of magnitude and, as a result, the outputs were analyzed using a natural log scale to simplify analysis. (Population simulation, probability of misclassification estimation and CCD DOE analysis were done with Minitab v16.)
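That back-calculation can be sketched as follows. The function and variable names are illustrative, and placing the off-center mean toward the upper specification limit is an assumption (the factors determine only how far the mean sits from the nearer limit, not which side):

```python
LSL, USL = 0.0, 100.0  # specification limits used in the simulation

def distribution_params(ppk, ppk_over_pp, sigma_ratio):
    # Recover process mean, process sigma and gage sigma from the first
    # three factors; guard band k does not affect the distributions.
    # sigma_ratio is the ratio sigma_p / sigma_g from equation (10).
    pp = ppk / ppk_over_pp               # Pp from Ppk and Ppk/Pp
    sigma_p = (USL - LSL) / (6.0 * pp)   # Pp = (USL - LSL) / (6 sigma_p)
    # Assume the mean sits at or above center, so Ppk is set by the
    # upper limit: Ppk = (USL - mu) / (3 sigma_p)
    mu = USL - 3.0 * sigma_p * ppk
    sigma_g = sigma_p / sigma_ratio
    return mu, sigma_p, sigma_g

mu, sp, sg = distribution_params(1.0, 1.0, 3.0)  # centered process, Ppk = 1
```

With P_pk = P_p = 1 the mean lands at the tolerance center (50), as expected for a fully centered population.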
The DOE analysis of variance (ANOVA) table provides information regarding significant terms and lack-of-fit for each of the outputs studied. Significant terms are taken as having a p-value less than 0.05.^{1} The predicted R^{2} is chosen to determine lack-of-fit and the usefulness of the model to predict results; it captures the percentage of response variation explained by relationships with the inputs, using predicted model output versus observed output to quantify lack of fit. The second-order polynomial term for (9a) was found to be significant for the probability of good observed as bad and for percent tolerance. The influence of (9a) on percent tolerance is due to non-orthogonality of the input factors. Based on the insignificance of the first-order and interaction terms for (9a), the DOE input factors were reduced and interaction terms including (9a) were removed. The first-order and second-order terms associated with (9a) were left in subsequent analysis due to the significance of the second-order term in the model for the probability of good observed as bad. Goodness of model fit and factor significance for each of the three measurement system analysis metrics, using the reduced model terms, are shown in Table 2.
Table 2: Goodness of Model Fit and Factor Significance

| Term | Bad Observed as Good | Good Observed as Bad | Percent Tolerance |
| --- | --- | --- | --- |
| Predicted R^{2} (%) | 97.4 | 98.44 | 99.99 |
| p-values: | | | |
| Constant | 0.166 | 0.001 | 0 |
| P_{pk} | 0.68 | 0.008 | 0 |
| P_{pk} ⁄ P_{p} | 1 | 1 | 1 |
| ln(σ_{p} ⁄ σ_{g}) | 0.005 | 0 | 0 |
| Guard band k | 0 | 0.068 | 0.024 |
| P_{pk} × P_{pk} | 0 | 0 | 0 |
| (P_{pk} ⁄ P_{p}) × (P_{pk} ⁄ P_{p}) | 0.057 | 0.002 | 0 |
| ln(σ_{p} ⁄ σ_{g}) × ln(σ_{p} ⁄ σ_{g}) | 0.67 | 0 | 0.01 |
| Guard band k × Guard band k | 0.188 | 0.491 | 0.01 |
| P_{pk} × ln(σ_{p} ⁄ σ_{g}) | 0.007 | 0 | 1 |
| P_{pk} × Guard band k | 0 | 0 | 1 |
| ln(σ_{p} ⁄ σ_{g}) × Guard band k | 0.028 | 0.109 | 1 |
Response surface contour plots for the P/T ratio and the probability of bad misclassified as good are overlaid in Figure 4 over the studied range of ln(σ_{p} ⁄ σ_{g}) and P_{pk}. Two overlaid contour plots are drawn for guard band k = 0 and 2, respectively. Both plots have a fixed value of P_{pk} ⁄ P_{p} = 1.
The bands defined by adjacent contour lines indicate the sensitivity of each output to the input factors on each axis. The probability of misclassifying bad as good shows the most sensitivity to P_{pk} and is relatively insensitive to ln(σ_{p} ⁄ σ_{g}). The opposite is true for the P/T ratio. This trend holds for both plots at the two different guard band values. This sensitivity analysis establishes that the probability of misclassification depends more on the probability that a value is bad or good than on the probability that the measured value differs from the true value. Curvature is shown in the P/T ratio response, which indicates sensitivity to both P_{pk} and ln(σ_{p} ⁄ σ_{g}); this curvature is due to non-orthogonality of the input factors. According to equation (5), the P/T ratio is not dependent on process standard deviation – and is only dependent on process mean for one-sided specifications. In this model, however, gage standard deviation is established based on a ratio with process standard deviation, and process standard deviation is an input to the factors on each plot axis.
The influence of guard banding on each of the two outputs is established by comparing the two plots in Figure 4. P/T ratio does not change as a function of guard banding, which is expected according to equation (5). The probability of misclassifying bad as good changes such that the probability is reduced for lower values of process capability. Guard banding has more influence on the probability of misclassifying bad as good when the probability is greater than 1 in 1,000,000; for values equal to or less than this level, guard banding has a smaller influence on reducing the probability of misclassification (i.e., at a higher process capability).
The difference in sensitivity of each output over the plot range at both values of guard banding illustrates four conditions for gage precision, as defined by P/T ratio, and probability of misclassification. They are:
1. P/T ratio is within typical acceptance limits and the probability of misclassifying bad as good is relatively small.
2. P/T ratio is larger than typical acceptance limits; however, the probability of misclassifying bad as good is relatively small.
3. P/T ratio is larger than typical acceptance limits and the probability of misclassifying bad as good is relatively large.
4. P/T ratio is within typical acceptance limits and the probability of misclassifying bad as good is relatively large.
For conditions 1 and 3, the P/T ratio and probability of misclassifying bad as good agree in their assessment of gage suitability for decision making. In condition 1, the gage is generally considered suitable; in condition 3, it is generally considered ill-suited for decision making. For conditions 2 and 4, the P/T ratio and probability of misclassifying bad as good disagree in their assessment of gage suitability for decision making. In condition 2, the gage is considered imprecise; however, the underlying true value population is sufficiently far from the specification values to minimize the probability of misclassifying bad as good. This condition may avoid the risk of false acceptance of nonconforming values, but additional cost may reside in the probability of misclassifying good as bad. In condition 4, the gage is considered precise, but the underlying true value population is close enough to the specification values that the probability of misclassifying bad values as good remains high. Here the gage may be precise enough to differentiate values within the specification tolerance, but the magnitude of measurement error is still large enough to create significant risk in using the measurement system to sort values, where the sort condition is based on specification limits.
Guard banding reduces the probability of misclassifying bad as good, thereby increasing the suitability of a measurement system for making effective decisions at lower values of process capability. The probability of misclassifying bad as good is not reduced to zero over the entire range of P_{pk} shown. Even for guard banding at 2σ_{g}, the probability of misclassifying bad as good can remain relatively high at low process capability.
Guard banding has been shown to increase the probability of misclassifying good as bad. This is illustrated in Figure 5, in which response surface contour plots for the probability of good misclassified as bad are overlaid with the same contours shown in Figure 4. The plot ranges of ln(σ_{p} ⁄ σ_{g}) and P_{pk} are the same as in Figure 4. Two overlaid contour plots are drawn for guard band k = 0 and 2, respectively. Both plots have a fixed value of P_{pk} ⁄ P_{p} = 1.
As in Figure 4, the bands defined by adjacent contour lines indicate the sensitivity of each output to the input factors on each axis. The probability of misclassifying good as bad is nearly equally sensitive to P_{pk} and ln(σ_{p} ⁄ σ_{g}); it increases as either P_{pk} or ln(σ_{p} ⁄ σ_{g}) decreases. The influence of guard banding on the probability of misclassifying good as bad is seen by comparing the two plots in Figure 5. When no guard banding is applied, the probability of misclassifying good as bad is less than 1 percent over the range of P_{pk} and ln(σ_{p} ⁄ σ_{g}) where P/T and the probability of misclassifying bad as good would be considered generally acceptable. When guard banding is applied, the probability of misclassifying good as bad increases at lower values of P_{pk} and ln(σ_{p} ⁄ σ_{g}). For the extremely low values of P_{pk} and ln(σ_{p} ⁄ σ_{g}) shown in the plot, guard banding satisfies the condition defined by equation (7); the result indicates that all values would be classified as bad. Comparing the two probabilities of misclassification illustrates that as guard banding is applied, the probability of bad misclassified as good decreases, but the probability of good misclassified as bad increases. This trade-off is most pronounced at low values of P_{pk} and ln(σ_{p} ⁄ σ_{g}).
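This trade-off can be reproduced directly from the joint distribution by numeric integration over the true value density, under the same normality assumptions; the function names are illustrative, and the default limits mirror the LSL = 0, USL = 100 used in the simulation.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def misclass_probs(mu, sigma_p, sigma_g, k, lsl=0.0, usl=100.0):
    # Guard-banded acceptance limits: k gage standard deviations
    # inside each specification limit.
    lo, hi = lsl + k * sigma_g, usl - k * sigma_g

    def p_accept(x):
        # Probability that a part with true value x measures inside the
        # guard-banded limits, given gage error ~ N(0, sigma_g).
        return norm.cdf(hi, x, sigma_g) - norm.cdf(lo, x, sigma_g)

    f = lambda x: norm.pdf(x, mu, sigma_p)  # true value density
    bad_as_good = (quad(lambda x: f(x) * p_accept(x), -np.inf, lsl)[0]
                   + quad(lambda x: f(x) * p_accept(x), usl, np.inf)[0])
    good_as_bad = quad(lambda x: f(x) * (1.0 - p_accept(x)), lsl, usl)[0]
    return bad_as_good, good_as_bad
```

Evaluating at, say, a centered population with P_pk = 1 and σ_p/σ_g = 3 shows the pattern in Figures 4 and 5: raising k from 0 to 2 drives the bad-as-good probability down and the good-as-bad probability up.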
The four conditions previously established can be summarized to include information on the probability of misclassifying good as bad.
| | P_{pk} < 1.3 | P_{pk} > 1.3 |
| --- | --- | --- |
| ln(σ_{p} ⁄ σ_{g}) > 2.0 | Condition 4 | Condition 1 |
| ln(σ_{p} ⁄ σ_{g}) < 2.0 | Condition 3 | Condition 2 |
For conditions 1 and 3, the P/T ratio and the probabilities of misclassification agree in their assessment of gage suitability for decision making. Based on this analysis, a gage is suited for decision making in condition 1. For condition 2, the imprecision of the gage does not create a significant risk of false acceptance. A significant risk of false rejection (i.e., excess scrap), however, may exist, and the gage's suitability for decision making is conditional on the circumstances surrounding the measurement. For condition 4, the gage's precision may not effectively eliminate the risk of misclassifying good values as bad or bad values as good; the gage may not be suitable for applications that sort values within a population where the sorting condition is based on specification limits.
The aforementioned DOE simulation approach and results are based on normal distribution assumptions for the underlying true value populations and gage error. Although gage error typically follows a normal distribution, underlying true value populations may not. Typical GR&R results should be relatively insensitive to non-normal data sets, whereas the probability of misclassification will be sensitive to them.
Lognormal true value populations and any true value population with significant skew will not have symmetric probability density functions about the distribution arithmetic mean. As a result, these DOE simulations cannot be generalized based on the exact same factors; results need to be based on absolute distribution position relative to specification values. Initial attempts to repeat this trial using nonnormal distributions suggest general agreement with these results, except that the results are dependent on the absolute position of the population within the specification limits.
Comparison of GR&R metrics, namely P/T ratio, with the probabilities of misclassification reveals that precise measurement systems may still generate results with significant probabilities of misclassification. A DOE study using numeric simulation revealed that the most significant risk of misclassifying measured values exists when a gage is considered precise with respect to specification tolerance, but the population of values being measured resides near or outside of specification tolerance. Where true value population is defined by a P_{pk} capability metric less than 0.75, a gage may be classified as suitable with a P/T ratio less than 10 percent, while probabilities of misclassifying good values as bad or bad values as good may be greater than 1 in 10,000.
Guard banding influences probabilities of misclassification, and can be used to reduce the probability of misclassifying bad values as good at the cost of increasing the probability of misclassifying good values as bad. Guard banding has the most influence on probability of misclassification for imprecise measurement systems and when a population of values being measured has low capability index.^{2}
1. Montgomery, D.C., and G.C. Runger. "Gage Capability and Designed Experiments Part II: Experimental Design Methods and Variance Component Estimation." Quality Engineering 6 (1993b): 289–305.
2. Taylor, Wayne. "Generic SOP – Statistical Methods for Measurement System Variation, Appendix B: Gage R&R Study." Wayne Taylor Enterprises, Inc., n.d.
Measurement system analysis has been a major part of process characterizations and improvements, with key guidance from the original Automotive Industry Action Group reference manual. There has been some critical review, particularly of summary metrics such as gage repeatability and reproducibility (GR&R) and percent tolerance (also known as precision-to-tolerance ratio) and what are considered acceptable values.^{1}^{,}^{2}^{,}^{3} In industries such as medical devices, measurement system analyses can influence decisions that can critically affect the lives of patients. Consequently, the impact of measurement system analyses on decision making, as related to the probability of misclassification, is of vital importance.
This article contains a description of available GR&R metrics and their applicability to two broad cases of comparative study: 1) comparing one population to another and 2) comparing individual values to a specification. These cases represent the most common applications of measurement systems. Examples include comparing two groups based on a change from one group to the next and comparing individual values to a specification when making accept or reject decisions.
Precision describes the relative spread in measured values for a common item. The precision of a measurement system is commonly assessed using a GR&R. GR&R studies quantify gage variance, and this variance can be compared to a targeted measurement range. The range may encompass the variability observed over the GR&R study, the variability of a separate population or a specification interval. In each case, a gage is generally considered precise enough when gage variation is small compared to the measurement range of interest.
A GR&R study is a designed experiment in which factors influencing gage variance are studied using a full factorial design.^{4}^{,}^{1} GR&R studies typically include two factors: 1) a random factor representing parts and 2) a factor influencing measurement variation (operator is commonly chosen as the second factor). Additional factors may be added, where each factor may represent additional sources of variation. These additional factors may include measurement instruments, environment, etc. A GR&R study can identify contributions to gage variance in terms of repeatability and reproducibility based on the factorial structure of the controlled experiment. Repeatability describes the contribution to gage variance from a measurement system when the same part is measured multiple times with all other factors held constant. Reproducibility describes the relative contribution to gage variance from additional factors and interactions between additional factors and parts.
Gage error can be estimated using random-effects analysis of variance (ANOVA) methods, provided certain assumptions are met.
The total measurement variance σ_{m}^{2} can be defined as

σ_{m}^{2} = σ_{p}^{2} + σ_{g}^{2}   (1)

In equation (1), σ_{p}^{2} and σ_{g}^{2} represent variance from the measured population and variance from the measurement system, respectively. Gage error can be defined by contributions from repeatability and reproducibility error as follows:

σ_{g}^{2} = σ_{repeat}^{2} + σ_{reproduce}^{2}   (2)

In equation (2), σ_{repeat}^{2} and σ_{reproduce}^{2} represent variance from repeatability and reproducibility, respectively. Reproducibility variance can be further broken down based on the contribution from each contributing factor (e.g., operators) and interactions between factors (e.g., operators and parts). Measurement standard deviation σ_{m} is the square root of total measurement variance.
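A minimal sketch of these variance component estimates for a fully crossed parts × operators study, using the standard random-effects expected mean squares (negative component estimates are truncated at zero, a common convention; function names and the simulated example are illustrative):

```python
import numpy as np

def grr_components(y):
    # y: measurements shaped (parts, operators, replicates)
    # from a fully crossed GR&R study.
    p, o, r = y.shape
    grand = y.mean()
    part_m = y.mean(axis=(1, 2))
    oper_m = y.mean(axis=(0, 2))
    cell_m = y.mean(axis=2)

    ms_part = o * r * ((part_m - grand) ** 2).sum() / (p - 1)
    ms_oper = p * r * ((oper_m - grand) ** 2).sum() / (o - 1)
    inter = cell_m - part_m[:, None] - oper_m[None, :] + grand
    ms_po = r * (inter ** 2).sum() / ((p - 1) * (o - 1))
    ms_err = ((y - cell_m[:, :, None]) ** 2).sum() / (p * o * (r - 1))

    var_repeat = ms_err                               # repeatability
    var_po = max((ms_po - ms_err) / r, 0.0)           # part x operator
    var_oper = max((ms_oper - ms_po) / (p * r), 0.0)  # operator
    var_part = max((ms_part - ms_po) / (o * r), 0.0)  # part-to-part
    return {"repeat": var_repeat,
            "reproduce": var_oper + var_po,           # equation (2) split
            "part": var_part}

# Illustrative check on simulated data with known variance components.
rng = np.random.default_rng(7)
y = (rng.normal(0.0, 2.0, (30, 1, 1))    # part effects, sd 2
     + rng.normal(0.0, 0.5, (1, 3, 1))   # operator effects, sd 0.5
     + rng.normal(0.0, 1.0, (30, 3, 3))) # repeatability error, sd 1
est = grr_components(y)
```

On the simulated data the recovered components should sit near the true values (part variance 4, repeatability 1, reproducibility 0.25), up to sampling noise.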
Gage performance is commonly defined by comparing the relative percentage that gage error consumes over a measurement range. Three different percentages can be readily calculated using gage standard deviation (σ_{g}) obtained from ANOVA analysis of a GR&R study:
Percent study variation = 100% × (6σ_{g}) ⁄ (6σ_{s})   (3)

Percent process = 100% × (6σ_{g}) ⁄ (6σ_{p})   (4)

Percent tolerance, 2-sided specification = 100% × (6σ_{g}) ⁄ (USL − LSL)   (5a)

Percent tolerance, 1-sided upper specification = 100% × (6σ_{g}) ⁄ [2(USL − µ)]   (5b)

Percent tolerance, 1-sided lower specification = 100% × (6σ_{g}) ⁄ [2(µ − LSL)]   (5c)
In equation (3), σ_{s} is the observed standard deviation of combined measurements taken during a GR&R study. In equations (5a), (5b) and (5c), 6σ_{g} in the numerator can be substituted with 5.15σ_{g} depending on the desired proportion of the distribution in the comparison range. Under the assumption of normality, a value of 5.15 in the numerator implies 99 percent of the measurement range, while 6 implies 99.73 percent of the measurement range.
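The three percentages reduce to simple ratios, sketched below (function names are illustrative; the factor 6 can be swapped for 5.15 as noted above):

```python
def percent_study(sigma_g, sigma_s):
    return 100.0 * (6 * sigma_g) / (6 * sigma_s)      # equation (3)

def percent_process(sigma_g, sigma_p):
    return 100.0 * (6 * sigma_g) / (6 * sigma_p)      # equation (4)

def percent_tolerance(sigma_g, lsl=None, usl=None, mu=None):
    # Equations (5a)-(5c): the comparison range is the full tolerance
    # for a 2-sided specification, or twice the distance from the
    # population mean to the single limit for a 1-sided specification.
    if lsl is not None and usl is not None:
        span = usl - lsl
    elif usl is not None:
        span = 2.0 * (usl - mu)
    else:
        span = 2.0 * (mu - lsl)
    return 100.0 * (6 * sigma_g) / span

print(percent_tolerance(1.0, lsl=0.0, usl=60.0))  # 10.0
```

In the printed example the gage's 6σ_g spread consumes 10 percent of a tolerance of 60 units.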
In all of the above equations, a lower percentage implies a gage error is smaller compared with the comparison range; therefore, the gage is better able to distinguish between measured values over the same range. The percentage type chosen to define measurement system acceptance depends on the intended purpose of the gage. Each percentage implies different information regarding gage suitability for a specific use.
Percent study variation provides insight into the measurement system's capability to distinguish parts from one another over the range of measured variation observed during the gage study. The denominator in (3) depends on the range of measured values used in the particular GR&R study: using the same measurement system, a larger measured range of parts will result in a larger total study variance and a lower percent study variation. When the goal is to determine gage suitability for comparing parts over a range in subsequent studies, and when the parts chosen for the gage study represent the populations to be measured in the future, percent study variation may be used as the basis for gage acceptance.
Percent process provides insight into the measurement system's capability to distinguish parts over a historic process operating range, where the process operating range is typically defined as 5.15 or 6 process standard deviations. If the parts chosen for the gage study are representative of historic process variation, then percent study variation will be approximately the same as percent process. In many cases, however, the parts chosen for a GR&R study span a narrower or broader range than the projected or observed process operating range. In these cases, percent process will be larger or smaller than percent study variation.

Percent tolerance provides insight into the measurement system's capability to distinguish parts from one another over a range of part acceptance defined by specification limits. In the case of a 1-sided specification, the range is defined by twice the distance between the population mean and the specification limit. Equations (5a), (5b) and (5c) show that as the specification tolerance narrows (or the distance between the population mean and specification limit narrows), the same gage will consume more of the measurement range of interest, and percent tolerance will increase. The intended purpose of many measurement applications is to compare values to an acceptance range. Percent tolerance is therefore a common metric for accepting a gage as suitable for most measurement applications.
Criteria for acceptable percentages for equations (3) through (5) vary depending on application. Multiple sources list a gage as acceptable when the percentage is less than 10 percent (i.e., the gage precision error is an order of magnitude smaller than the measurement range of interest) and unacceptable when it is greater than 30 percent.^{2}^{,}^{5}^{,}^{6}^{,}^{7} In the case where the percent is between 10 percent and 30 percent, acceptance may be conditional on other influencing factors – such as cost or risk associated with measurement error, including the risk of measurement misclassification error.
In each of these cases gage precision is established by comparing gage variation to a measurement range of interest. When the gage precision error is small compared to the measurement range of interest, the gage is well suited to differentiate or compare measured values within the range. (The cases described above have not expressly defined gage suitability for differentiating measured values about a specification.)
To establish gage suitability to effectively measure a value for the purpose of comparing this value with a specification limit, consider a zone on each side of the specification defined by 2.33σ_{g}. Units more than 2.33σ_{g} away from either side of the specification have no less than a 99 percent chance of being properly classified about the specification value. Using normal distribution assumptions and equation (5), a GR&R percent tolerance value less than 25 percent can ensure that values more than 10 percent outside of the specification tolerance are rejected more than 99 percent of the time. It follows that a smaller GR&R percent tolerance will increase the confidence in appropriately classifying values more than 10 percent of the specification tolerance away from a specification.^{8} This further reinforces the usefulness of percent tolerance as a universal measurement-system acceptance criterion. Low percent tolerance, however, does not eliminate all uncertainty regarding the possibility of gage misclassification error.
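The 99 percent figure follows directly from the normal distribution: a unit whose true value lies 2.33 gage standard deviations from a limit is pushed across that limit by gage error less than 1 percent of the time. A quick check using only the standard library:

```python
from math import erf, sqrt

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# A unit 2.33 gage standard deviations away from a specification limit
# is misclassified only if gage error exceeds 2.33 sigma_g in the wrong
# direction, which happens less than 1 percent of the time.
p_correct = norm_cdf(2.33)
print(round(p_correct, 4))  # 0.9901
```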
Values that fall within 2.33σ_{g} of a specification have a higher chance that gage error will misclassify the measured value to the incorrect side of the specification. There are two possible results for misclassification when considering a specification limit: a bad part may be misclassified as good (false acceptance), or a good part may be misclassified as bad (false rejection).
Figure 1 shows misclassification types due to gage error for measured values adjacent to a specification limit assuming zero measurement bias.
These types of misclassification are expressed as conditional probabilities, in that the estimated value of the part is assumed to be fixed. Given that real manufacturing scenarios involve an entire distribution of part values, gage adequacy is more completely evaluated through the joint distribution of part values and gage error.
The probability of misclassifying a bad part as "good" can be approximated by counting how often a true value outside of the specification tolerance produces a measured value inside the specification limits. The probability of misclassifying a good part as "bad" can be found in similar fashion, with the counts about the specification limit switched. To obtain a reasonable approximation of small probabilities, the number of random draws in the simulation must be of sufficient size.
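The counting procedure described above can be sketched as a Monte Carlo simulation over the joint distribution of part values and gage error. The distribution parameters below are assumptions for illustration, not the article's actual study data:

```python
import random

def misclassification_rates(mu, sigma_part, sigma_gage, lsl, usl,
                            n=200_000, seed=1):
    """Monte Carlo estimate of the two misclassification probabilities
    for the joint distribution of true part values and gage error."""
    rng = random.Random(seed)
    bad_as_good = good_as_bad = 0
    for _ in range(n):
        true = rng.gauss(mu, sigma_part)          # true part value
        meas = true + rng.gauss(0.0, sigma_gage)  # measured value
        truly_good = lsl <= true <= usl
        reads_good = lsl <= meas <= usl
        if not truly_good and reads_good:
            bad_as_good += 1
        elif truly_good and not reads_good:
            good_as_bad += 1
    return bad_as_good / n, good_as_bad / n

# Hypothetical process centered at 10.0 against a 9.5-10.5 tolerance.
p_fa, p_fr = misclassification_rates(mu=10.0, sigma_part=0.25,
                                     sigma_gage=0.05, lsl=9.5, usl=10.5)
print(p_fa, p_fr)
```

Re-running with the mean shifted toward a limit shows both rates climbing, which is the behavior the next paragraphs describe.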
The probability of misclassification will increase if 1) the probability that a measured value resides near a specification value increases, 2) the gage error increases or 3) both increase. Comparing the results of probability of misclassification, which are based on both part population and gage error taken with respect to specification values, to percent tolerance, which is only dependent on gage error, it follows that percent tolerance provides partial information for ensuring that measured values are classified correctly.
In addition, the probability of misclassification increases as the part population encroaches on a specification value. As percent tolerance is independent of part population, it is possible to achieve low values of percent tolerance (less than 10 percent), while the probability of misclassification and associated rates of misclassification may be unacceptably large. Using percent tolerance as a single metric for measurement system capability provides limited information in the case of sorting values that reside close to a specification. This suggests that probability of misclassification values can be used to supplement percent tolerance metrics to ensure measurement systems are applied appropriately in sorting applications. It is important to note that probability of misclassification alone does not provide any insight into gage precision. Comparison of percent tolerance with probability of misclassification suggests that both metrics should be leveraged to establish gage capability in the case where a gage is used to sort values about a specification.
When there is a need to reduce the risk of misclassifying bad values as good, the specification range used to make accept-reject decisions can be narrowed – forcing the misclassification of bad as good to occur less frequently within the narrowed specification range. This concept is commonly referred to as guard banding, and is shown in Figure 2.
For guard banding to effectively reduce the probability of misclassifying bad as good, the amount by which the specification is pulled into the specification range must be a function of gage variance. As gage variance increases, the guard-banded region must sit sufficiently within the allowable specification range to encompass gage error. Thus, the guard-banded region will include the measurement range adjacent to the specification value where misclassification is most likely to occur. Guard banding can be quantified as a count of gage standard deviations by which the specification tolerance is reduced within the allowable specification tolerance. For the purposes of this article, guard banding is the count of gage standard deviations taken in with respect to a single specification. For example, guard banding 2σ_{g} within each side of a 2-sided specification would reduce the specification tolerance by 4σ_{g}.
Although guard banding decreases the probability of misclassifying bad values as good, it also increases the probability of misclassifying good values as bad. As the specification range is narrowed, any values that fall between the guard-banded specification and the nominal specification have a higher probability of being misclassified as bad. Guard banding thus introduces a tradeoff between reducing the risk of false acceptance and increasing the scrap rate. This tradeoff is shown in Figure 3.
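This tradeoff can be reproduced in simulation: truth is judged against the original specification, while acceptance uses the guard-banded limits. As before, the distribution parameters are illustrative assumptions:

```python
import random

def guard_band_tradeoff(k_values, mu, sigma_part, sigma_gage, lsl, usl,
                        n=100_000, seed=2):
    """For each guard-band width k (gage standard deviations per side),
    estimate P(bad accepted) and P(good rejected). Truth uses the
    original spec; accept/reject uses the narrowed spec."""
    rng = random.Random(seed)
    parts = [(rng.gauss(mu, sigma_part), rng.gauss(0.0, sigma_gage))
             for _ in range(n)]
    out = {}
    for k in k_values:
        lo, hi = lsl + k * sigma_gage, usl - k * sigma_gage
        fa = fr = 0
        for true, err in parts:
            truly_good = lsl <= true <= usl
            accepted = lo <= true + err <= hi
            if accepted and not truly_good:
                fa += 1
            elif not accepted and truly_good:
                fr += 1
        out[k] = (fa / n, fr / n)
    return out

for k, (fa, fr) in guard_band_tradeoff([0, 1, 2], 10.0, 0.25, 0.05,
                                       9.5, 10.5).items():
    print(k, fa, fr)
```

Widening the guard band monotonically lowers false acceptance and raises false rejection, which is the shape of the curves in Figure 3.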
A limit to the tradeoff encountered with guard banding can be reached if 1) gage error is sufficiently large with respect to the specification range, and 2) the guard-band count of gage standard deviations taken within the specification tolerance consumes the entire specification range. The point at which guard banding will consume the entire specification range can be obtained by substituting equation (6) into the denominator of (5a), and solving for k, where k is equal to the number of gage standard deviations taken in from each specification limit. The result is provided in equation (7).
The effect of guard banding on the probabilities of misclassification can be estimated with the same approach, with one change: observations from the measured distribution are classified as good or bad against the original specification value, while classification as accepted or rejected is established against the guard-banded value.
1. Montgomery, D.C., and G.C. Runger. "Gage Capability and Designed Experiments. Part I: Basic Methods." Quality Engineering 6 (1993a): 115-135.
2. Automotive Industry Action Group. Measurement System Analysis: Reference Manual, 3rd Edition. Southfield, MI: American Society for Quality Control, 2003.
3. Wheeler, Donald J. "An Honest Gage R&R Study." SPC Inc., 2008.
4. Montgomery, D.C., and G.C. Runger. "Gage Capability and Designed Experiments. Part II: Experimental Design Methods and Variance Component Estimation." Quality Engineering 6 (1993b): 289-305.
5. Engel, J., and B. De Vries. "Evaluating a Well-Known Criterion for Measurement Precision." Journal of Quality Technology 29 (1997): 469-476.
6. Majeske, Karl D., and Chris Gearhart. "Approving Measurement Systems When Using Derived Values." Quality Engineering 18 (2006): 523-532.
7. Erdmann, Tashi P., Ronald J.M.M. Does, and Soren Bisgaard. "Quality Quandaries: A Gage R&R Study in a Hospital." Quality Engineering 22 (2010): 46-53.
8. Taylor, Wayne. "Generic SOP – Statistical Methods for Measurement System Variation, Appendix B: Gage R&R Study." Wayne Taylor Enterprises, Inc., n.d.
When it comes to Six Sigma certification, companies take varied approaches – some leave the responsibility for certification completely on the Six Sigma professional, while others make the certification process an internal part of the company's Six Sigma initiative. Overall, the results of iSixSigma's certification survey show that Six Sigma practitioners are interested in obtaining certification for reasons that are not necessarily tangible. Six Sigma professionals who are certified report that they receive respect and admiration from their colleagues, gain increased job satisfaction from being certified, and are considered more credible Six Sigma practitioners. Throughout the report, findings are compared to a similar iSixSigma research study conducted in 2007.
Of the different Six Sigma roles within an organization, process owners are by far the least likely group to be certified – 67 percent of respondents in this role reported no certification.
In comparison, Green Belts (GBs) and Champions, the next least certified groups, reported no certification at the rates of 21 percent and 22 percent, respectively. At the other end of the spectrum, 85 percent of Black Belts (BBs) and 99 percent of Master Black Belts (MBBs) hold some level of Six Sigma certification, most often matching their current role (Table 1).
Table 1: What is the highest level of Six Sigma certification you have earned? (N = 446)  
Current Role  
Process Owner  Green Belt  Black Belt  Master Black Belt  Champion  Deployment Leader  
No certification  67%  21%  15%  1%  22%  23% 
Green Belt  10%  75%  5%  3%  22%  7% 
Black Belt  19%  4%  77%  19%  11%  26% 
Master Black Belt  5%  0%  3%  76%  33%  50% 
Champion  0%  0%  0%  1%  11%  4% 
The most common requirement for receiving Six Sigma certification is completing a training curriculum. For the majority of certified BBs (86 percent), GBs (84 percent) and MBBs (84 percent), passing a test was also required in order to receive their certification. Completing one or more Six Sigma projects was required for 96 percent of BBs and 92 percent of MBBs. Coaching projects was required for 36 percent of certified BBs and 83 percent of MBBs (Table 2).
Table 2: Which of the following were required in order to receive your Six Sigma certification? (Check all that apply.) (N = 419)  
Certification  
Green Belt  Black Belt  Master Black Belt  
Completion of training curriculum  78%  92%  98% 
Passing a test  84%  86%  84% 
Complete one or more Six Sigma projects  67%  96%  92% 
Coaching Black Belt projects  3%  10%  83% 
Coaching Green Belt projects  9%  36%  64% 
Why are Six Sigma practitioners working toward certification? The most popular reasons are the same now as in 2007 – being a more credible Six Sigma expert, followed by the desire to get ahead within a company, for the challenge, or because the company requires it. Fifty-eight percent of MBBs stated they are currently seeking certification to become a more credible Six Sigma expert; 54 percent of BBs stated they are seeking BB certification for the same reason. Only 2 percent of MBBs said they are going through the certification process to get a job at a different company, while 19 percent stated they are doing so to get ahead within their own companies (Table 3).
Table 3: Which of the following best describes why you sought or are seeking certification?  
Results from 2007 (N = 1,077)  Results from 2013 (N = 442)  
To be a more credible Six Sigma expert  31%  41% 
To get ahead within my company  20%  15% 
For the challenge  12%  8% 
My company requires/required it  10%  8% 
For the prestige of being Six Sigma certified  9%  7% 
To get a new job at a different company  7%  6% 
Other  10%  8% 
Survey Methodology: iSixSigma staff designed the survey with input from Six Sigma practitioners. Six Sigma professionals were invited by email to participate in the survey. The survey drew responses from 451 individuals. Some reported totals do not add to 100 percent because of rounding and survey questions that allowed more than one response to be selected.
When implementing SPC, activities and responsibilities should be clearly established. In the first loop (the control loop, see Figure 1) of SPC, an operator looks at subgroups to address any items that are out of control in the process. When the operator is not able to handle the out-of-control variables in the first loop, or if preventive action is required, the second level should be activated (the improve loop). At this stage, production management or engineering personnel can take appropriate actions to avoid recurrence of the problem.
In modern production systems there is not much overcapacity of operators and engineers; the time to work on action plans and improvement activities resulting from out-of-control variables is limited. Clear priorities need to be set by management so that operators, production management and engineers will address the correct problems. This is the third (and final) SPC loop: planning.
During the implementation of SPC, organizations may have problems managing the number of items that are out of control in the early stages. The usual reasons for the large number of out-of-control factors include the following:
More important with regard to this reason, however, is that out-of-control signals on the control chart monitoring the process average place the emphasis on the wrong chart. The more important control chart is the one monitoring process dispersion (commonly the range or sigma chart); in general, this chart should be in control before attention is given to the average chart.
Too many out-of-control points is one of the most common reasons why SPC implementations fail. If there are 30 out-of-control points in a shift and there is only time to address three or four of them, then it seems futile to make the effort, as progress can never be made. With such dismal prospects for addressing all the problems, operators essentially throw their hands up in defeat. There should never be more out-of-control points than can be addressed.
There are two ways to address the problem of too many out-of-control factors in the early stages of an SPC implementation:
The advantage of the first option is that SPC will be used as it is intended to address critical variables. The disadvantage of this option is that severe quality problems can appear on quality characteristics that are not being monitored, in which case limited data would be available for further analysis of the process and for reporting to customers and management.
The advantage of the second option is that all quality data is monitored and can be used for analysis and reporting. The disadvantage of this option is that more effort and knowledge is required to manage the control charts and the calculation of the control limits.
The next section describes an approach to managing control charts during SPC implementation when dealing with the second option.
At the start of an SPC implementation all efforts must be aimed at controlling the variation of the dispersion (via a range or sigma control chart). When the dispersion of the process is out of control, the average of the process is likely also out of control. This is because there is a relation between the variation in the dispersion and the variation of the average, as shown in this equation: σ_{x̄} = σ̂/√n, where n is the subgroup size.
Problems on the dispersion chart should be addressed first. When control limits on the average chart are calculated in the standard way (three sigma from the average), out-of-control variables on the dispersion chart will also lead to out-of-control factors on the average chart. These out-of-control factors on the average chart are not an indication of changes in the process average, however, but are a logical result of changes in the dispersion. When operators without adequate knowledge of SPC see out-of-control variables on both the average and dispersion charts, they will likely start working on the problem on the average chart because such problems are easier to address: just adjust the process.
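The relation between subgroup-average variation and within-subgroup dispersion is commonly σ_{x̄} = σ/√n for subgroups of size n. A short simulation (illustrative only, not from the original article) confirms that subgroup averages vary less than individual values by the factor 1/√n:

```python
import math
import random
import statistics

# Subgroup averages vary less than individual values by a factor of
# 1/sqrt(n), so inflated dispersion also inflates the average chart.
rng = random.Random(3)
n, subgroups = 5, 4000  # subgroup size and number of subgroups
means = [statistics.fmean(rng.gauss(0.0, 1.0) for _ in range(n))
         for _ in range(subgroups)]
sd_of_means = statistics.stdev(means)
print(round(sd_of_means, 3), round(1 / math.sqrt(n), 3))
```

Both printed values come out near 0.447 for unit-variance data, which is why a shift in dispersion shows up on the average chart even when the process average has not moved.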
The way to avoid this issue is to begin charting without putting the limits of the average chart at three sigma. There are three options:
When the C_{p} value is high enough, the third method is preferred because the limits are still calculated based on the process variation. This method still gives an early warning when a disturbance of the process average will lead to defective products.
When the process average fluctuates, a certain amount of fluctuation can be allowed before starting to look for the cause, as long as a required (as determined by the customer or industry) C_{pk} value is achieved. The control limits may be set at a level such that the C_{pk} is higher than a required value. In other words, the distance between the upper specification limit (USL) and the process average should be higher than 3 × C_{pk(required)} × σ̂, where σ̂ is the estimated standard deviation of the population.
In Figure 2 the dotted red lines are the control limits. The solid red lines are the modified control limits based on a required C_{pk} value of 1.67 (used in the automotive industry). The yellow lines reflect the upper and lower specification limits although normally these are not shown.
The modified control limits can be calculated as follows. The process average is allowed to shift to a level where the distance between the USL and the process average is 3 × C_{pk(required)} × σ̂. The distance between the average and the control limit of the averages chart is 3σ̂/√n. This means that the distance between the USL and the modified upper control limit is 3 × C_{pk(required)} × σ̂ - 3σ̂/√n, or 3σ̂ × (C_{pk(required)} - 1/√n).
The requirement for using this formula is that the C_{p} value of the process is higher than the required C_{pk}. If that is not the case, then the modified control limits will be too close to the process average and result in false alarms.
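Assuming the standard modified-limit construction (the mean is allowed to drift until USL minus the mean equals 3 × C_{pk(required)} × σ̂, with the limit placed 3σ̂/√n above that worst-case mean), the calculation can be sketched as follows; the example values are hypothetical:

```python
import math

def modified_ucl(usl, sigma_hat, n, cpk_req):
    """Modified upper control limit, assuming the standard construction:
    allow the mean to drift until USL - mean = 3*cpk_req*sigma_hat, then
    place the limit 3*sigma_hat/sqrt(n) above that worst-case mean."""
    return usl - 3.0 * sigma_hat * (cpk_req - 1.0 / math.sqrt(n))

# Hypothetical process: USL = 10.0, sigma_hat = 0.2, subgroups of 5,
# automotive-style required Cpk of 1.67.
ucl = modified_ucl(usl=10.0, sigma_hat=0.2, n=5, cpk_req=1.67)
print(round(ucl, 3))  # 9.266
```

A symmetric calculation gives the modified lower control limit against the LSL. Note that if C_{p} is below the required C_{pk}, the modified limit lands inside the standard three-sigma limit, which is the false-alarm condition described above.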
Modified control limits are not recommended for every situation. If it is desirable to find special causes of variation, calculate the control limits in the standard way. Use modified control limits as a method to distinguish between priorities among the different process characteristics being monitored.
Consider the following example. In the Midwest a manufacturing company with more than 1,500 employees worked a round-the-clock 24/7 schedule. The company manufactures a product that is sticky; it requires high-velocity, high-volume production through one of 14 different cutting and packing machines.
Four teams worked rotating 12-hour shifts on those 14 different machines. Employee morale was consistently a problem due to the long shifts; in particular, some employees ended up working most weekends. At the time, the throughput efficiency was about 65 percent.
The finance department calculated that if the throughput efficiency of the product increased to 70 percent, enough could be manufactured in 16 hours of production to meet customer demand. Then the company could move to a system of two eight-hour shifts, giving more free time to the workers and saving the company money.
In this high-velocity (more than 2,000 pieces a minute) environment, the product needs to enter the first machine in correct alignment in order to move through a series of rotating metallic parts, where it then meets and is wrapped by paper.
Over time, the product gets out of alignment due to its high speed of motion. Due to its sticky nature, the product can also get stuck on the metallic parts of the machinery; this slows down or stops the process as a whole. At that point, an operator must stop the machinery, open up a gear box and clear out the jam by removing paper and cleaning off residual sticky product, or by troubleshooting via a number of other methods.
This slowdown or stoppage happened about five to 10 times per hour per machine.
The operators are experienced and highly adept at finding and correcting the problems of the jammed machinery. They bring the process back up as quickly as possible (and their speed increases over time), which improves the overall efficiency and throughput of the line.
The plant engineers knew that efficiency varied with product type: some brands are stickier than others. The stickier the product, the more clogs in the gears of the machinery can be expected. Certain machines perform better than other machines. And some operators are better than others. But which of those variables was most important? And by how much?
Answering these questions was critical each 1 percent movement on efficiency would save approximately $250,000 according to the finance department.
Data captured by program logic controllers on each of the machines included how many pieces of product they wrapped and at what times. The company had data on the following variables:
The company needed to solve the original problem of which variable was the biggest driver of efficiency loss among the following:
Machine difference was clear from the data collected over a six-month period, as shown in Table 1.
Table 1: Average Throughput Percentages Per Machine  
Machine  Average Throughput (%) 
G  80.17 
R  79.47 
A  75.10 
Q  73.87 
E  64.92 
C  63.77 
B  62.87 
D  62.41 
P  61.09 
F  59.26 
O  58.11 
S  57.86 
H  56.41 
Note: One machine was down during this period so data was collected for only 13 machines. 
One machine was tracking well at 80.17 percent throughput while another was tracking poorly at only 56.41 percent throughput.
The variance in throughput by the top 20 percent (by volume) of product flavors was recorded and is shown in Table 2.
Table 2: Average Throughput Percentages by Flavor  
Flavor  Average Throughput (%) 
c  87.58 
b  86.57 
a  76.75 
d  76.37 
e  73.69 
h  65.82 
g  59.88 
i  58.54 
f  54.63 
j  50.98 
Flavor c ran well in the machines to the point of an 87.58 percent throughput, while flavor j struggled with only 50.98 percent throughput.
The operator tied to each machine was also looked at; Table 3 shows the top 5 percent and bottom 5 percent of performers.
Table 3: Average Throughput Percentages by Operator  
Operator  Average Throughput (%) 
Alfred  82.48 
Betty  81.56 
Charlie  78.78 
David  77.49 
Pat  64.42 
Quinten  62.94 
Randy  61.27 
Steve  56.56 
The best operator performed at an 82.48 percent throughput while the worst was at 56.56 percent.
The three variables show a wide range in performance. The real question was: Are the machine efficiency numbers driven by who the operator is and/or what flavor is being run?
It was known that the more machines an operator worked on, the worse that operator's overall performance was.
Table 4 shows the results of a binary logistic regression. Based on the number of lines worked, Table 4 displays the probability of a given operator making the goal of 70 percent throughput in the past six months of work.
Table 4: Probability of Operator Meeting 70 Percent Throughput Goal  
Number of Lines Worked  Probability of Meeting Goal (%) 
1  79.99 
2  66.62 
3  48.50 
4  30.77 
5  17.34 
6  9.01 
7  4.46 
8  2.16 
9  1.03 
The probability of operators meeting the production goal of 70 percent throughput dipped below 50 percent if they worked on three or more lines. It seemed that the company needed operators to master one to two machines rather than working on several.
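A binary logistic model of this shape can be evaluated directly. The coefficients below are hypothetical values chosen to roughly reproduce Table 4, since the article does not publish the fitted model output:

```python
import math

# Hypothetical coefficients approximately matching Table 4 (fit by hand
# to the published probabilities; the actual model output is not shown).
B0, B1 = 2.13, -0.744

def p_meet_goal(lines_worked):
    """Binary logistic model: probability of an operator meeting the
    70 percent throughput goal given the number of lines worked."""
    return 1.0 / (1.0 + math.exp(-(B0 + B1 * lines_worked)))

for lines in range(1, 10):
    print(lines, round(100 * p_meet_goal(lines), 2))
```

Because the coefficient on lines worked is negative, the probability falls monotonically, dropping below 50 percent at three lines, consistent with the conclusion that operators should master one or two machines.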
The next step was to determine which of the four variables (machine, flavor, operator and hour of day) had the most significant influence on percentage throughput. The results of a general linear model ANOVA (analysis of variance) are shown in Figure 1.
This analysis showed that all of the variables are statistically significant, including all of the interactions except the following:
Table 5: Estimated Effects and Coefficients for Throughput (%)  
Term  Effect  Coef  SE Coef  T  P 
Constant  60.927  1.0199  59.74  0.0000  
Machine  9.140  4.570  0.6951  6.57  0.0000 
Operator  15.303  7.652  0.6466  11.83  0.0000 
Flavor  7.045  3.523  0.5133  6.86  0.0000 
Hour  16.803  8.402  0.9949  8.44  0.0000 
Line*flavor  6.192  3.096  1.5202  2.04  0.052 
Operator*flavor  9.617  4.809  1.5010  3.20  0.001 
Line*flavor*hour  4.389  2.195  1.4884  1.47  0.140 
Operator*flavor*hour  9.641  4.820  1.4883  3.24  0.002 
The ANOVA revealed the variables with the bigger impacts, but further analysis was needed to show the magnitudes of these differences (see Table 5).
The results show that hour and operator are the most significant variables at two times the impact of flavor or machine.
The significant variables ordered by impact are shown in Table 6.
Table 6: Significant Variables by Magnitude of Coefficient  
Factor(s)  Coefficient 
Hour  8.40 
Operator  7.65 
Operator*flavor*hour  4.82 
Operator*flavor  4.80 
Line  4.57 
Flavor  3.52 
Line*flavor  3.1 
The hour at which the work is performed had the biggest impact: a large drop-off in throughput percentage occurred at 5 a.m. and 5 p.m., during the turnover at shift changes. Operators were the next largest driver.
Operators mattered more than flavor and machine when it came to how efficiently the plant ran; operator interaction with flavor had a bigger impact than the line and/or the flavor acting alone. This showed that in order to improve throughput, attention needed to focus on the operators, as is the case in many manufacturing problems.
Although the assumption was that the problem would lie with the machinery or other technology, this issue was one of human resources at its core.
It was known which operators run best on which machines. As the plant operated, however, workers came in for their shifts and went to whichever machine they wanted or to whichever happened to be open and began their shifts. But what if at the start of each shift the operators went to the machines they performed the best on? Would that correspond to an immediate uptick in efficiency?
Figure 3 shows the data for each operator on each machine and how each operator performed over the past six months for one particular shift of operators.
Using this data, the company could assign a good operator for each machine. (Surprisingly, sometimes the same operators perform just as well on two machines as one.)
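Choosing which operator runs which machine is a classic assignment problem. A brute-force sketch with a hypothetical throughput matrix (illustrative numbers only; the plant's actual six-month data is not reproduced here):

```python
from itertools import permutations

# Hypothetical throughput matrix: rows = operators, columns = machines.
throughput = [
    [82, 75, 70],   # operator 1
    [78, 81, 69],   # operator 2
    [71, 74, 77],   # operator 3
]

def best_assignment(matrix):
    """Brute-force assignment maximizing total throughput. Fine for a
    small crew; the Hungarian algorithm scales to a full shift."""
    n = len(matrix)
    best = max(permutations(range(n)),
               key=lambda cols: sum(matrix[r][c]
                                    for r, c in enumerate(cols)))
    return list(best), sum(matrix[r][c] for r, c in enumerate(best))

cols, total = best_assignment(throughput)
print(cols, total)  # [0, 1, 2] 240
```

Here each operator happens to be assigned to the machine where they individually run best, but in general the optimum balances the whole crew rather than greedily picking each operator's favorite machine.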
Just putting the operators on their best machines raised the average throughput percentage to 71.79 percent, just above the goal of 70 percent. This allowed the operators to change to eight-hour shifts.
As a 1-percent efficiency increase saved $250,000, the improvement over the prior 12 months' average throughput of 65 percent equated to approximately $1.7 million annualized. Just putting the right people in the right spots saved millions of dollars. In addition, employees were happier with the new work schedule.
After a six-week experiment, the throughput had increased to more than 73 percent and the plant made the shift changes permanent.
Since its founding, TriState has grown to three product lines with a fourth in process. Along with pipe guides, TriState is a manufacturer of industrial trailers for two Fortune 500 original equipment manufacturers (OEMs) and also is a component supplier for the rail car industry. Its newest division, TriState Automation, is putting lessons learned within TriState's own manufacturing facility to use for others in the automation of welding and material handling using robotics.
As is the case with any Lean organization, TriState works to keep the lessons of the methodology fresh: continuous improvement is a journey rather than a standalone event. Such is the case for the trailer product line, which represents one of the more complex elements of TriState's business. TriState manufactures an estimated 4,000 trailers each year, with 95 percent of volume represented by six models. The remaining 5 percent of the business is comprised of several specialty models. Customers expect delivery within two weeks of order placement, and often sooner. Considering the challenges of mix, volume spikes and aggressive lead time expectations, TriState maintains excellent supplier ratings, including a 12-month perfect supplier scorecard for quality and delivery with one of its key OEMs.
So why the need to take continuous improvement to the next level for trailers at TriState? While the performance results remain strong, the effort to maintain such strong performance is not as systematic as desired. There are occasional material expedites and gaps in standard work, and both 5S (sort, straighten, shine, standardize, sustain) and safety performance remain opportunities for improvement. The trailers team went to work in early 2014 to review this value stream for improvement. Many of the structural process improvements will be applied to the other business units as well.
As a starting point, standard work has been developed for TriState's safety process. This process outlines all of the requirements for an annual program that consists of training, an updated first aid management process and performance metrics that are part of every team member's merit objectives.
The team has been on a streak of work days without lost time due to injury and is working to keep this streak ongoing. An updated 6S (the usual 5S plus safety) process was initiated in April 2014. This process is simpler than a traditional 1-to-5 rating scale for each element; it deploys a red-yellow-green (poor-average-outstanding) rating system for each category. Safety standard work scorecards for each business unit reinforce the need for daily tracking of safety performance, including first aid occurrences, timely completion of accident reporting and implementation of monthly prevention-oriented job hazard analyses (Figure 1).
The next challenge for TriState was to simplify the management and execution of the seasonal mix elements of the trailer business. The team began by implementing a new visual master schedule. After several attempts, the team concluded the schedule could only be managed in two-week buckets due to the dynamic nature of customer requirements. Review of the master schedule revealed patterns and trends, which supported the development of the future state map.
The current state process for component parts was mapped, including quantifying the baseline production rates for individual components. This data will become important during future state process design as make quantities are smart-sized against market demand trends.
Development of the future state map noted three internal shipping destinations for trailers or trailer components within the facility: a kitted trailer line, a final assembled trailer line and a component kit line. These shipping destinations are supported by an adjacent fabrication and welding operation. Personnel are not shared between the fabrication and shipping groups but are shared across the three shipping destinations. Each shipping area competes for some mix of common components, however, which often results in conflicts. The process lacked a systematic method for gaining visibility into the part mix and then allocating parts to each of the three areas accordingly.
The team performed a market rate of demand (MRD) analysis (or forecast) based on the past four years of sales data. Takt times for each of the three consumption points were developed based on peak seasonality trends. The MRD analysis indicated the peak season can last more than four months each year.
The team then identified the need to upgrade the existing demand-based pull system. The legacy Kanban system used cards as the replenishment signal system. Like many Kanban card-based systems, a one-size-fits-all approach was initially deployed in the spirit of standard work and consistency. While initial results were great, over time the team learned one signal method was not always optimal for each demand-based pull part family.
The main issues facing the legacy Kanban system were lot sizing (too many or too few) and a lack of prioritization processes for different make items manufactured in the same cell or by the same people. In addition, variances in part processing created challenges. For example, some parts come directly from a paint line to assembly, while others require secondary processing. The legacy system had difficulties dealing with these differences and had essentially been abandoned by supervision; a manual system had taken over. The three internal shipping destinations often competed for like components, leading to material shortages and bad feelings within the team.
The new material pull system utilizes returnable carts (as shown in Figure 2), which are dedicated to each of the three end shipping destinations. Upon consumption, each cart is returned to the appropriate make department in a dedicated swim-lane location to facilitate consumption-based replenishment. Cart sizing was determined based on safety considerations, MRD requirements, space constraints and make-part processing times. Multiple carts are used so the system can incrementally replenish and be throttled up or down depending on seasonality. Where it makes sense, part families are kitted on the same cart to ensure that complete kits always remain whole.
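Container counts for a pull system like this are often estimated with the classic Kanban sizing formula: demand during the replenishment lead time, padded by a safety factor, divided by container capacity. A sketch under assumed figures (all numbers here are hypothetical, not TriState's):

```python
import math

def carts_required(daily_demand: float, lead_time_days: float,
                   safety_factor: float, cart_capacity: int) -> int:
    """Classic Kanban sizing: demand over the replenishment lead time,
    padded by a safety factor, divided by cart capacity (rounded up)."""
    return math.ceil(daily_demand * lead_time_days * (1 + safety_factor)
                     / cart_capacity)

# Hypothetical: 40 parts/day, 1.5-day replenishment, 25% safety stock, 20 parts per cart
print(carts_required(40, 1.5, 0.25, 20))  # -> 4
```

Throttling for seasonality then amounts to adding or removing carts from circulation as the demand input changes.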
This process is being developed by a cross-functional team of senior managers, project engineers, supervision and shop floor team members. The team is starting with a series of four parts and will move to the balance of end items incrementally. The goal is to have the right parts at the right place at the right time, and nothing else. A tremendous side benefit of this system is that the returnable carts serve as standard work mechanisms. Cart sizes represent half-shift or full-shift increments; these quantities aligned with MRD requirements for the first parts implemented in the new system. Once make items are mastered, the team will move on to the challenges faced by the supply chain.
Another objective of this improvement initiative is to greatly simplify the work and get supervisors out of firefighting mode and deeper into process improvement mode. Team metrics boards (as shown in Figure 3) have been created, and weekly Gemba walks have begun, focusing on the questions: "How is your business unit performing?" and "How is the team working to improve the business as compared to last week?" All of these efforts support customer initiatives to reduce cost within their own supply chains; several TriState customers have rigorous, formally established annual expectations that are measured.
TriState places a strong emphasis on not only what work is being accomplished but also how the work is being accomplished. The TriState quarterly bonus system was amended this year to place a sizeable emphasis on how team members work together in a collaborative spirit. This element of the bonus system is known as the harmony factor.
And no Lean journey would be complete without recognition and celebration. Each month that TriState meets key business objectives, the team celebrates with a Friday afternoon shutdown, complete with creative luncheon events. The savings generated through Lean initiatives keep TriState strong and well positioned for growth.
As heat increases, corn kernel popping peaks and the scrambling activity accelerates like a tsunami. If not carefully watched, the popcorn kernels may burn, create that awful smell, start smoking and trigger a fire alarm (waste).
This popcorn-tsunami analogy is similar to email strings. The seemingly calm flow of a typical day's email may escalate to a tsunami level in a matter of minutes. For example, there might be a complaint email from a top customer who copied the CEO to elevate the heat factor. What happens next may lead to an email-popcorn-tsunami if no escalation path has been defined. Sound familiar?
At first glance, email communication may seem more efficient and faster than the old-fashioned in-person discussion or telephone call. This is probably true until the email distribution goes out of control and becomes an email-popcorn-tsunami!
How can an organization prevent this waste?
Company ABC's practice of forwarding a complaint to all process owners was analyzed as part of a streamlining initiative.
Waste
It was determined that the average number of workers on this email distribution list was 50, but only five employees had the ability to investigate a complaint.
Deliverable
Quantify waste and recommend actions to reduce waste by at least 50 percent.
Actions
Measures of Success
Measures of email waste and savings after implementation of the initiative are shown in the table below.
| | Before the Streamlining Initiative | After the Streamlining Initiative | Savings* |
|---|---|---|---|
| Number of people receiving an email | 100 | 5 | |
| Total time to read one email (at 0.5 minutes per person) | 50 minutes | 2.5 minutes | 237 minutes per week, or 205 hours per year |

*Based on a frequency of five emails per week.
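The savings arithmetic in the table can be reproduced in a few lines (figures taken from the table; the annual total assumes 52 working weeks):

```python
read_time_min = 0.5      # average minutes to read one email (from the table)
before, after = 100, 5   # recipients before/after the streamlining initiative
emails_per_week = 5      # frequency from the table footnote

minutes_saved_per_email = (before - after) * read_time_min        # 47.5 minutes
weekly_minutes_saved = minutes_saved_per_email * emails_per_week  # 237.5 minutes
annual_hours_saved = weekly_minutes_saved * 52 / 60               # ~205.8 hours

print(weekly_minutes_saved, round(annual_hours_saved, 1))  # -> 237.5 205.8
```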
Due to its large rural area, the City has 21 fire stations, staffed by both full-time fire fighters and volunteer fire fighters (VFFs). (Note: Although called volunteers, VFFs receive a minimum of two hours of pay for responding to calls.) Of the 21 stations, 19 are staffed predominantly with volunteers.
When a call for assistance is made to the central 911 center, the call is directed to the individual fire stations. Each volunteer has a one-way radio and responds to a call by going to the fire station.
A significant amount of variation existed in the number of VFFs responding to medical calls. Existing City guidelines required four VFFs to respond on-scene to any medical call; however, in a two-year period, more than four VFFs responded to 77 percent of calls. In addition to this over-response, in 17 percent of calls, VFFs were not even required upon their arrival (for example, an ambulance arrived first and tended to any medical needs).
Medical calls include both high-acuity and non-life-threatening calls. The City wanted to determine the cost of VFFs' attendance at non-life-threatening calls and to review the necessity of the VFF response.
In this improvement project, the City wanted to:
At this point, the first step for the City was to review its cost of poor quality. In 2012 and 2013, 77 percent of medical calls resulted in more than four VFFs responding. The cost of this excess response was $131,140 annually. The cost of sending VFFs to non-life-threatening calls during the same period was $130,648 annually.
The cost of vehicle fuel, maintenance and repairs associated with attending non-life-threatening calls, in addition to false alarms (medical aid not required once on scene or a true patient false call), was $74,400 annually during the two-year period under review.
Descriptive statistics (a method of statistical analysis of numeric data, discrete or continuous, that provides information about centering, spread and normality) was used to measure and establish the baseline for:
Table 1: Baseline Metrics

| Metric | Baseline |
|---|---|
| Number of VFFs attending medical calls | Mean = 6.74; median = 7; standard deviation = 2.75; max = 20; min = 1. Displayed fluctuations in the number of VFFs responding |
| Number of calls based upon immediate* vs. timed criteria** | Immediate = 244 calls (12%); 10-minute timed criteria = 875 calls (43%); unknown*** = 928 calls (45%). Displayed fluctuation in types of response calls |
| Total number of calls per year | In 2012 and 2013, VFFs responded to approximately 1,000 medical calls in each calendar year |

*An immediate-level call includes asphyxia, required CPR administration, a required defibrillator, a patient in shock or absent vital signs.
**A timed-criteria call is one where the patient is not in life-threatening danger and the fire fighters would respond only if the ambulance is delayed by 10 minutes.
***Unknown calls include medical aid not required on arrival, medical false alarm, a medical call that results in no action required, and calls coded as "other medical call" by the fire fighter post-event.
The descriptive statistics analysis identified a median of seven VFFs responding per call. The baseline measure also showed that nearly half of all calls were recorded in a way that made it difficult to ascertain the acuity of the medical call; the system's coding options included not required on arrival, medical false alarm, medical call no action required and other medical call.
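The baseline figures in Table 1 come from ordinary descriptive statistics. A sketch using Python's standard library (the turnout sample below is invented for illustration, not the City's data):

```python
import statistics

# Illustrative only: a made-up sample of per-call VFF turnout counts
turnout = [7, 9, 4, 6, 8, 7, 10, 5, 7, 6, 12, 3, 7, 8, 6]

print("mean   =", round(statistics.mean(turnout), 2))
print("median =", statistics.median(turnout))
print("st dev =", round(statistics.stdev(turnout), 2))
print("max    =", max(turnout), " min =", min(turnout))
```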
The control chart in Figure 1 displays a process that is erratic, as evidenced by the number of times the process fails (indicated by the red circles). Predicting the future performance of the process was therefore impossible.
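Control limits for an individuals (I) chart like the one in Figure 1 are conventionally set at the center line plus or minus 2.66 times the average moving range. A sketch on invented data (not the City's):

```python
import statistics

def i_chart_limits(values):
    """Individuals (I) chart limits: center line +/- 2.66 * average moving range."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = statistics.mean(values)
    spread = 2.66 * statistics.mean(moving_ranges)
    return center - spread, center, center + spread

# Invented per-call turnout counts for illustration
lcl, cl, ucl = i_chart_limits([7, 9, 4, 6, 8, 7, 10, 5, 7, 6])
print(round(lcl, 2), round(cl, 2), round(ucl, 2))
```

Points beyond these limits (the red circles in Figure 1) signal special-cause variation, which is what made the baseline process unpredictable.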
The starred points in Figure 2 reflect the root causes that needed to be addressed in the future process for deployment of VFFs to medical calls. The root causes were determined at a team brainstorming session.
Next, the team wondered if certain fire stations had a higher turnout of VFFs to calls than others.
The cause-and-effect analysis (Figure 2) identified that there was no active management of the number of VFFs attending a call (see Manpower branch). A Kruskal-Wallis one-way analysis of variance test was performed because the samples were independent and could not be assumed to be normally distributed. The Kruskal-Wallis analysis showed a statistically significant difference in the number of VFFs attending based upon fire station location.
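The Kruskal-Wallis H statistic is computed from pooled ranks; `scipy.stats.kruskal` does this directly, but a standard-library sketch (with hypothetical per-station turnout samples, not the City's data) shows the mechanics behind the rank sums reported in Table 2:

```python
def kruskal_h(*groups):
    """Kruskal-Wallis H statistic (average ranks for ties; no tie correction)."""
    pooled = sorted((v, gi) for gi, g in enumerate(groups) for v in g)
    n = len(pooled)
    rank_sums = [0.0] * len(groups)
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # tied values share the average rank
        for k in range(i, j):
            rank_sums[pooled[k][1]] += avg_rank
        i = j
    return (12 / (n * (n + 1))
            * sum(rs ** 2 / len(g) for rs, g in zip(rank_sums, groups))
            - 3 * (n + 1))

# Hypothetical turnout samples for three stations
print(round(kruskal_h([5, 7, 9], [4, 4, 6], [10, 11, 12]), 2))  # -> 6.49
```

The resulting H is compared against a chi-squared critical value with (number of groups - 1) degrees of freedom, exactly as in the bottom rows of Table 2.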
After this, the team looked at whether VFFs were arriving after an ambulance was already on scene (which then precluded the need for fire fighters).
A Z value estimate was used to test the proportion of times in the future that VFFs would arrive on scene ahead of an ambulance. This test was performed because, in the cause-and-effect diagram (Figure 2), VFFs were noted as attending medical calls when an ambulance was already on scene, which is a waste of manpower, machinery and material.
The resulting Z value estimate indicated a 56 percent to 66 percent probability that VFFs will arrive ahead of the ambulance at future calls (Table 2). Therefore, 34 percent to 44 percent of the time when VFFs attend a medical call, they create a duplication of service, which is an unnecessary cost to the municipality.
Table 2: Graphical Tools

| Group | Rank Sum | Observations |
|---|---|---|
| Omemee | 172918.5 | 199 |
| Bobcaygeon | 137164.5 | 160 |
| Dunsford | 76147.5 | 51 |
| Emily | 79920 | 82 |
| Bethany | 65993.5 | 76 |
| Pontypool | 137587 | 118 |
| Janetville | 113523.5 | 95 |
| Little Britain | 100346 | 148 |
| Oakwood | 80563 | 76 |
| Cameron | 63823 | 71 |
| Woodville | 80864 | 84 |
| Kirkfield | 425680 | 297 |
| Carden | 22613 | 51 |
| Norland | 203740 | 142 |
| Kinmount | 26640.5 | 40 |
| Coboconk | 106645.5 | 138 |
| Burnt River | 71502 | 79 |
| Baddow | 35163.5 | 48 |
| Fenelon Falls | 95353 | 92 |

| Statistic | Value |
|---|---|
| H statistic | 448.9919 |
| Degrees of freedom | 18 |
| P-value | 0.00 |
| Chi-squared critical | 28.8693 |
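The Z value estimate behind the 56-to-66-percent interval is a normal-approximation (Wald) confidence interval for a proportion. A sketch with hypothetical counts (the 610-of-1,000 figure is invented; it merely produces an interval in the same neighborhood the article reports):

```python
import math

def proportion_ci(successes: int, n: int, z: float = 1.96):
    """Normal-approximation (Wald) 95% confidence interval for a proportion."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

# Hypothetical: VFFs arrived ahead of the ambulance in 610 of 1,000 observed calls
low, high = proportion_ci(610, 1000)
print(round(low, 2), round(high, 2))  # -> 0.58 0.64
```

The complement of this interval is the estimated share of calls where VFF attendance duplicates the ambulance response.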
Selecting and Testing Solutions
With the problems identified and measured, the team next embraced the Pugh matrix (Figure 3 below), a decision-making tool used to formally compare concepts based on customer needs and functional criteria. Using that tool, the City determined that the optimum solution for managing the number of VFFs responding to medical calls was to enact steps consistently among all fire stations in conjunction with the central 911 operators:
Controlling the New Process
After the new process was implemented, a control plan was used to complete the project work and hand off the improved process to the process owner (the executive assistant to the Fire Chief), with procedures for maintaining the gains. Implementation required the Fire Chief to convince both the volunteer fire fighters and the City Council. Council members needed assurance that no patient would be left vulnerable and in need of care. The Paramedic Chief spoke to Council as the subject matter expert to assure Council that all calls were being monitored and that, at any time, a paramedic supervisor had the authority to call in the fire services if need be.
Results
During the project measurement phase, the baseline median was 7. With the process improvements implemented starting January 6, 2014, the median as of March 14, 2014, had decreased to 5. The percentage of medical calls that VFFs respond to has been reduced by approximately 45 percent, since VFFs now attend only high-acuity medical calls. The new process to monitor and control the number of responding VFFs, as well as the types of calls they respond to, resulted in initial savings of $43,944 for the first two months of 2014, with projected annual savings of $261,788.