Some practitioners mistakenly believe that data need not be transformed before creating an individuals control chart, even when the underlying process response is not normally distributed. An individuals control chart, however, is not robust to non-normal data. When the data are not normal, it is therefore important to transform them or to use an alternative control charting approach.

### Necessary Transformation

Consider a hypothetical application of the individuals control chart involving an accounts receivable department sending invoices to customers for payment. The difference between payment date and due date often follows a lognormal distribution.

The following data can be considered a random selection of one invoice per day for 1,000 days, where the due date for each invoice was subtracted from its payment date. A positive value of 10, for instance, indicates that an invoice payment was 10 days late.

In this example, 1,000 points were randomly generated from a lognormal distribution with a location parameter of 2, a scale parameter of 1 and a threshold of 0 (i.e., lognormal 2, 1, 0). The location and scale are the mean and standard deviation of the natural log of the data. The distribution from which these samples were drawn is shown in Figure 1. This simplified illustration assumes that nobody pays early, which is why the threshold is set to zero. A normal probability plot of the 1,000 sample data points is shown in Figure 2.
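A comparable data set can be generated with a few lines of Python. This is a minimal sketch using only the standard library; the function name `simulate_days_late` and the seed are illustrative assumptions, so the values will not match the article's sample point for point.

```python
import random
import statistics


def simulate_days_late(n=1000, location=2.0, scale=1.0, threshold=0.0, seed=1):
    """Draw n values from a three-parameter lognormal distribution.

    location and scale are the mean and standard deviation of the natural
    log of (value - threshold). The seed is an arbitrary assumption.
    """
    rng = random.Random(seed)
    return [threshold + rng.lognormvariate(location, scale) for _ in range(n)]


days_late = simulate_days_late()

# Every value is positive because the threshold is 0 (nobody pays early).
print(len(days_late), min(days_late) > 0)
# The sample median should land near exp(2), roughly 7.4 days late.
print(round(statistics.median(days_late), 1))
```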

From Figure 2, the null hypothesis of normality can be rejected both statistically, because of the low p-value, and visually, because the plotted points do not follow a straight line. This is consistent with the problem setting, where there is no reason to expect a normal distribution for the output of such a process. A lognormal probability plot of the data is shown in Figure 3.

From Figure 3, a practitioner would not reject the null hypothesis that the data come from a lognormal distribution, because the p-value is not below the criterion of 0.05 and the plotted points tend to follow a straight line. Hence, it is reasonable to model the distribution of this variable as lognormal.
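The straight-line judgment in the two probability plots can be approximated numerically. The sketch below does not reproduce the p-values in the figures. It swaps in a simpler diagnostic, the correlation between the ordered data and the matching normal quantiles. The data are simulated with an arbitrary seed, and the plotting position `(i + 0.5) / n` is one common convention chosen here as an assumption.

```python
import math
import random
from statistics import NormalDist, fmean


def probability_plot_correlation(values):
    """Correlation between sorted data and standard normal quantiles.

    A value close to 1 means the points fall near a straight line on a
    normal probability plot.
    """
    n = len(values)
    ordered = sorted(values)
    std_normal = NormalDist()
    # Assumed plotting positions: (i + 0.5) / n for i = 0 .. n - 1.
    quantiles = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    mean_q, mean_v = fmean(quantiles), fmean(ordered)
    sxy = sum((q - mean_q) * (v - mean_v) for q, v in zip(quantiles, ordered))
    sxx = sum((q - mean_q) ** 2 for q in quantiles)
    syy = sum((v - mean_v) ** 2 for v in ordered)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(1)  # arbitrary seed, not the article's data
days_late = [rng.lognormvariate(2.0, 1.0) for _ in range(1000)]

r_raw = probability_plot_correlation(days_late)
r_log = probability_plot_correlation([math.log(x) for x in days_late])
print(f"raw data: {r_raw:.3f}, log of data: {r_log:.3f}")
```

The raw data should give a correlation well below 1, while the logged data should sit very close to 1. This mirrors the curved plot in Figure 2 and the straight one in Figure 3.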

If the individuals control chart were robust to non-normality, an individuals control chart of the randomly generated data should be in statistical control. In the most basic sense, using the simplest run rule (a point is “out of control” when it is beyond the control limits), such data would be expected to give a false alarm about three times out of 1,000 points, on average. Further, a practitioner could expect false alarms below the lower control limit to be as likely as false alarms above the upper control limit.
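The expected false-alarm count follows directly from the normal distribution. As a rough check, assuming three-sigma limits and normally distributed data with known mean and standard deviation:

```python
from statistics import NormalDist

z = 3.0  # control limits sit three standard deviations from the center line
std_normal = NormalDist()

p_above = 1.0 - std_normal.cdf(z)   # chance of a point above the upper limit
p_below = std_normal.cdf(-z)        # chance of a point below the lower limit
false_alarm_rate = p_above + p_below
expected_per_1000 = 1000 * false_alarm_rate

print(f"{false_alarm_rate:.4f}")    # 0.0027
print(f"{expected_per_1000:.1f}")   # 2.7
```

The two tail probabilities are equal, which is why false alarms are expected to be split evenly between the two limits when the data are normal.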

Figure 4 shows an individuals control chart of the randomly generated data.

The individuals control chart shows many out-of-control points beyond the upper control limit. In addition, the data have a physical lower boundary of 0, which lies well above the lower control limit of -22.9, so no point can ever signal below that limit. If no transformation were needed when plotting non-normal data in a control chart, a practitioner would expect to see a random scatter pattern within the control limits, which is not the case in Figure 4.
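This behavior can be reproduced with simulated data. The article does not state how its limits were computed, so the sketch below assumes the usual moving-range method, where the limits are the mean plus or minus 2.66 times the average moving range (2.66 is 3 divided by d2 = 1.128). The seed is arbitrary, so the limits will be close to the article's values, not identical.

```python
import random
from statistics import fmean


def individuals_limits(values):
    """Return (LCL, center line, UCL) for an individuals chart.

    Assumes the conventional moving-range estimate of sigma:
    limits = mean +/- 2.66 * average moving range of span 2.
    """
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = fmean(values)
    half_width = 2.66 * fmean(moving_ranges)
    return center - half_width, center, center + half_width


rng = random.Random(1)  # arbitrary seed, not the article's data
days_late = [rng.lognormvariate(2.0, 1.0) for _ in range(1000)]

lcl, center, ucl = individuals_limits(days_late)
above = sum(x > ucl for x in days_late)
below = sum(x < lcl for x in days_late)

print(f"LCL={lcl:.1f}  center={center:.1f}  UCL={ucl:.1f}")
print(f"points above UCL: {above}, points below LCL: {below}")
```

With untransformed lognormal data the lower limit comes out negative, no point can fall below it, and the count above the upper limit is far larger than the roughly three false alarms expected for normal data.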

Figure 5 shows a control chart using a Box-Cox transformation with a lambda value of 0 (the natural logarithm), which is the appropriate transformation for lognormally distributed data.

This control chart is much better behaved than the control chart in Figure 4. Almost all 1,000 points are in statistical control. The number of false alarms is consistent with the design and definition of the individuals control chart control limits.
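The effect of the transformation can be sketched the same way. A Box-Cox transformation with lambda equal to 0 is the natural log, so the code below takes logs of simulated lognormal data and then computes individuals chart limits. The moving-range method (2.66 times the average moving range) and the seed are assumptions, as the article does not give these details.

```python
import math
import random
from statistics import fmean


def individuals_limits(values):
    """Return (LCL, center line, UCL), assuming mean +/- 2.66 * average MR."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = fmean(values)
    half_width = 2.66 * fmean(moving_ranges)
    return center - half_width, center, center + half_width


rng = random.Random(1)  # arbitrary seed, not the article's data
days_late = [rng.lognormvariate(2.0, 1.0) for _ in range(1000)]

# Box-Cox with lambda = 0 reduces to the natural logarithm.
transformed = [math.log(x) for x in days_late]

lcl, center, ucl = individuals_limits(transformed)
out_of_control = sum(x < lcl or x > ucl for x in transformed)

print(f"LCL={lcl:.2f}  center={center:.2f}  UCL={ucl:.2f}")
print(f"points beyond the limits: {out_of_control} of {len(transformed)}")
```

On the log scale the limits should sit near 2 plus or minus 3, and only a handful of points should fall outside them.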

### Finding the Process Capability Metric

By using a lognormal probability plot, it is possible to obtain a best estimate of process capability for this fictitious process when no specification exists: 80 percent of all invoices are paid between 2.1 and 27.4 days beyond the due date, with a median of 7.7 days late (Figure 6).
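For comparison, the same percentiles can be computed for the generating distribution itself. The sketch below uses the true parameters (location 2, scale 1, threshold 0), not a fit to the sample. Its results are therefore population values and differ slightly from the sample-based estimates read from the probability plot. The helper name `lognormal_percentile` is an illustrative assumption.

```python
import math
from statistics import NormalDist


def lognormal_percentile(p, location, scale, threshold=0.0):
    """Percentile of a three-parameter lognormal distribution.

    The p-th percentile is threshold + exp(location + scale * z_p),
    where z_p is the standard normal quantile for probability p.
    """
    z = NormalDist().inv_cdf(p)
    return threshold + math.exp(location + scale * z)


low = lognormal_percentile(0.10, 2.0, 1.0)
median = lognormal_percentile(0.50, 2.0, 1.0)
high = lognormal_percentile(0.90, 2.0, 1.0)

# 80 percent of the population lies between the 10th and 90th percentiles.
print(f"{low:.1f} {median:.1f} {high:.1f}")  # 2.1 7.4 26.6
```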