Any suggested course of action should be guarded, because the right answer is highly case-specific (i.e., circumstantial by nature).  Some consultants make their living resolving such “catch-22” situations.  Finding a balance point is also more art than science.  It is much like asking the question: “Where is the balance point between fashion and exhibitionism?”  There is no “universal answer” to this question.  An answer would depend upon the underlying circumstances, as well as the value system of the individual providing the answer (not to mention the readership).  Therefore, the perceived quality of an answer would also be relative.

In short, there are many ways to handle the application scenario you have described.  It is quite possible that a full-length book could be written around your question.  Since such a dissertation cannot be provided at this juncture, let us define and interrogate one of the many possible scenarios.

Consider the case where the performance data were gathered in accordance with a rational sampling strategy.  We will say that the short-term data were fully independent and normal.  We will also say that a few anomalies and statistical inconsistencies (i.e., nonrandom variations) were observed between subgroups, but not within subgroups.  Owing to this, the long-term distribution was determined to be non-normal.  In addition, there was evidence of a sinusoidal pattern in the process mean.

Given the blocking powers of a rational sampling strategy, the nonrandom variations would be naturally forced into the SS.b term and not the SS.w term (per the convention of one-way analysis-of-variance).  With this information as a backdrop, we now better understand how the short-term standard deviation could be rationally estimated.  Thus, we could evaluate the instantaneous reproducibility in the form S.st = sqrt(SS.w / (g(n-1))), where g is the number of subgroups and n is the subgroup size.  Given this, we might then seek to compute the two primary indices of short-term capability; namely, Z.st and Cp.  Clarifying our nomenclature, we must acknowledge that Z.st is a legitimate statistic and Cp is a performance figure-of-merit.
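To make the arithmetic concrete, the following is a minimal sketch of the one-way partition described above, not a definitive implementation.  The function names, the sample data, and the specification limits are illustrative assumptions.  In particular, Z.st is computed here under the assumption of a symmetric two-sided specification with the target at its midpoint, in which case Z.st = 3 Cp; other conventions exist.

```python
import math

def within_subgroup_stats(subgroups):
    """Return (SS.w, SS.b, S.st) for g equal-size subgroups of size n."""
    g = len(subgroups)
    n = len(subgroups[0])
    if any(len(s) != n for s in subgroups):
        raise ValueError("all subgroups must have the same size n")

    means = [sum(s) / n for s in subgroups]
    grand_mean = sum(sum(s) for s in subgroups) / (n * g)

    # Within-subgroup sum of squares: deviations from each subgroup's own mean.
    ss_w = sum((x - m) ** 2 for s, m in zip(subgroups, means) for x in s)
    # Between-subgroup sum of squares: where shifts in the mean end up.
    ss_b = n * sum((m - grand_mean) ** 2 for m in means)

    # Short-term standard deviation, pooled over g(n-1) degrees of freedom.
    s_st = math.sqrt(ss_w / (g * (n - 1)))
    return ss_w, ss_b, s_st

def short_term_capability(s_st, lsl, usl):
    """Return (Cp, Z.st), ASSUMING the target sits at the spec midpoint."""
    cp = (usl - lsl) / (6.0 * s_st)
    target = (usl + lsl) / 2.0          # assumption, not stated in the text
    z_st = (usl - target) / s_st
    return cp, z_st

# Hypothetical data: three subgroups whose means differ (8, 10, 12),
# but whose internal spread is identical.
data = [[9, 10, 11], [11, 12, 13], [7, 8, 9]]
ss_w, ss_b, s_st = within_subgroup_stats(data)
cp, z_st = short_term_capability(s_st, lsl=4.0, usl=16.0)
print(ss_w, ss_b, s_st, cp, z_st)
```

In this made-up data set the movement of the subgroup means lands entirely in SS.b, so S.st reflects only the within-subgroup spread.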

Continuing our discussion, we must also recognize that Cpk will always be equal to or less than Cp, owing to the nature of k: since Cpk = Cp(1 - k) and k cannot be negative, the correction can only reduce Cp.  As is well known, the relative magnitude of k is primarily dependent upon the static condition of the process center (i.e., grand mean).  However, when the quantity ng is relatively large, we also recognize that the grand mean is fairly insensitive to extreme values (of a random or nonrandom origin).  Consequently, the parameter “k” is not easily influenced by such anomalies, nor is it subject to any preexisting conditions of a statistical nature (e.g., a normal distribution).  In this context, it is merely a linear correction to Cp.  Owing to the merits of these arguments, it should now be evident that Cpk can be meaningfully reported, even in the presence of several different types of statistical anomalies.
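The role of k as a linear correction can be sketched as follows.  This assumes the common definition k = |T - grand mean| / ((USL - LSL) / 2), with the target T taken as the specification midpoint; the function names, limits, and data are hypothetical and serve only to illustrate the argument.

```python
def k_offset(grand_mean, lsl, usl):
    """Off-center fraction k, ASSUMING the target is the spec midpoint."""
    midpoint = (usl + lsl) / 2.0
    half_width = (usl - lsl) / 2.0
    return abs(midpoint - grand_mean) / half_width

def cpk_from_cp(cp, k):
    """Cpk as a linear correction to Cp: Cpk = Cp * (1 - k)."""
    return cp * (1.0 - k)

LSL, USL = 4.0, 16.0   # hypothetical specification limits
CP = 2.0               # hypothetical short-term Cp

# A process centered one unit above the midpoint.
k_shifted = k_offset(11.0, LSL, USL)
cpk_shifted = cpk_from_cp(CP, k_shifted)

# Insensitivity of the grand mean when ng is large: one extreme value
# among ng = 100 observations moves the grand mean only from 10.0 to 10.1.
values = [10.0] * 99 + [20.0]
grand_mean = sum(values) / len(values)
k_outlier = k_offset(grand_mean, LSL, USL)
cpk_outlier = cpk_from_cp(CP, k_outlier)

print(k_shifted, cpk_shifted)
print(grand_mean, k_outlier, cpk_outlier)
```

Because k depends only on the grand mean and the specification limits, the single extreme value changes Cpk very little here, and no distributional assumption enters the calculation of k itself.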