Accurate predictions, optimal decisions, and an explanation of root cause and its effects on a process are just a few of the many types of solutions managers are expected to deliver when complex problems appear. Yet, discriminant analysis, one of the most powerful tools used to solve problems in a process, is often neglected in college engineering courses and certification programs. As a result, discriminant analysis remains underappreciated and underutilized.
The goal of discriminant analysis is to define a discriminant function that assigns an observation to one of two classes (the same approach extends to more than two). The applications are wide-ranging, but to build such a function, practitioners first need a complete data set that contains both the observations and their true class membership, or classification.
The following example outlines the creation of a data set and the use of discriminant analysis. A director of operations has been plagued with the same problem for months. The process is highly capable and internal scrap is low, but roughly 3 percent of the shipped products fail when the customer installs them into an assembly. Although the parts are within customer specifications, they are still failing. It may take months for a detailed chemical analysis to provide the director with a true root cause, but even then, there are no guarantees. However, upon graphing the last batch of customer returns against a random sample of parts ready for shipment, a clear pattern emerges (Figure 1).
It is much cheaper to scrap production inside the facility than it is to be charged for the customer’s complete assembly that failed because of a defective product. Therefore, the director’s goal is to identify the units that are likely to fail, and to separate them from the rest, as soon as they come off the production line.
At first glance, however, the situation appears hopeless: customer returns and normal parts overlap in the scatter plot. In situations like this, statistical analysis software can be of great help. By using software with a discriminant analysis feature, practitioners can separate production with a high degree of accuracy and at minimal cost.
The software will analyze the full data set and produce two group membership functions, one for each group. These equations are constructed to minimize misclassification of the sample data set. As shown in Figure 2, the functions achieve a 91 percent accuracy level. Each data point is run through both membership functions, and practitioners assign it to the group whose function returns the higher value. This solution can mitigate risk until the director of operations can identify the true root cause of the failures.
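To make the idea of group membership functions concrete, here is a minimal sketch in Python of one common approach, linear discriminant analysis with a pooled covariance matrix. It is not the output of any particular software package, and the article does not say which method its software uses. The measurements, the labels "good" and "returned", and all function names are hypothetical, invented for illustration. Unlike the parts in Figure 1, the toy data here is cleanly separated.

```python
import math
from statistics import mean

# Hypothetical (feature_1, feature_2) measurements, invented for illustration.
GOOD = [(10.0, 5.0), (10.4, 5.3), (9.6, 4.9), (10.2, 4.7), (9.8, 5.1)]
RETURNED = [(12.0, 7.0), (12.3, 6.8), (11.7, 7.3), (12.1, 7.2), (11.9, 6.7)]


def group_mean(points):
    """Mean of each of the two features for one group."""
    return mean(p[0] for p in points), mean(p[1] for p in points)


def pooled_covariance(groups):
    """Pooled within-group covariance (sxx, sxy, syy) for two features."""
    sxx = sxy = syy = 0.0
    for points in groups.values():
        mx, my = group_mean(points)
        for x, y in points:
            sxx += (x - mx) ** 2
            sxy += (x - mx) * (y - my)
            syy += (y - my) ** 2
    dof = sum(len(p) for p in groups.values()) - len(groups)
    return sxx / dof, sxy / dof, syy / dof


def fit_membership_functions(groups):
    """Return {label: (constant, coef_1, coef_2)}, one linear function per group."""
    sxx, sxy, syy = pooled_covariance(groups)
    det = sxx * syy - sxy ** 2
    # Inverse of the 2x2 pooled covariance matrix.
    ixx, ixy, iyy = syy / det, -sxy / det, sxx / det
    total = sum(len(p) for p in groups.values())
    functions = {}
    for label, points in groups.items():
        mx, my = group_mean(points)
        c1 = ixx * mx + ixy * my
        c2 = ixy * mx + iyy * my
        # The log of the group's share of the sample acts as a prior.
        constant = -0.5 * (c1 * mx + c2 * my) + math.log(len(points) / total)
        functions[label] = (constant, c1, c2)
    return functions


def classify(functions, point):
    """Score the point with every function; the highest score wins."""
    scores = {
        label: c0 + c1 * point[0] + c2 * point[1]
        for label, (c0, c1, c2) in functions.items()
    }
    return max(scores, key=scores.get)


groups = {"good": GOOD, "returned": RETURNED}
functions = fit_membership_functions(groups)
print(classify(functions, (10.1, 5.2)))
```

Each new part is scored by both functions and takes the label of the higher score, which is the assignment rule described above.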
Thanks to the findings of the discriminant analysis, the director has a statistically valid procedure for classifying future production. The next step is to monitor the performance of this new procedure. Although the next batch of returns should be smaller because of the new decision support tool, practitioners should perform another round of discriminant analysis when they receive the next batch of parts returned by the customer. The accuracy of the tool generally improves as the sample used to construct it grows, although the gains diminish and the overlap between the two groups limits how accurate it can become. In every sense, creating this decision support tool is a continuous improvement project in itself.
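Monitoring can be as simple as tallying how the tool's calls compare with what actually happened. The sketch below is one possible way to do that in Python. It assumes that the true outcome of every part in a batch eventually becomes known, which may not hold for parts scrapped before shipment. The labels, the sample batch, and the function names are hypothetical.

```python
from collections import Counter


def confusion_counts(actual, predicted):
    """Count (actual, predicted) label pairs for a batch of parts."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return Counter(zip(actual, predicted))


def accuracy(counts):
    """Share of parts whose predicted label matched the actual outcome."""
    total = sum(counts.values())
    correct = sum(n for (a, p), n in counts.items() if a == p)
    return correct / total


# Hypothetical outcomes for a batch of eight parts, invented for illustration.
actual = ["good", "good", "returned", "good",
          "returned", "good", "good", "returned"]
predicted = ["good", "good", "returned", "returned",
             "good", "good", "good", "returned"]

counts = confusion_counts(actual, predicted)
print("accuracy:", accuracy(counts))
# Failing parts the tool let through (the costly error for the customer).
print("missed failures:", counts[("returned", "good")])
# Good parts scrapped unnecessarily (the cheaper, internal error).
print("false alarms:", counts[("good", "returned")])
```

Tracking the two kinds of error separately matters here because they do not cost the same: a missed failure means paying for the customer's failed assembly, while a false alarm costs only one scrapped part.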
Humans are excellent classifiers by nature. By the time they learn to crawl, children are already classifying objects as interesting or boring, food as tasting good or bad, and activities as pleasurable or not pleasurable. Unlike the environment that managers operate within, however, children receive immediate feedback when they make a less than optimal choice, aiding in their classification. This feedback is what allows them to learn and continually make better decisions in the future.
Managers are often not afforded this luxury of automatic feedback because the optimal threshold they can use to classify future observations correctly will never be known. But by using the discriminant analysis function within statistical software packages, they can classify future production volume and come closer to the optimal threshold.