Manbir Singh
June 26, 2012
Data Transformation
Tagged: Box Cox Transformation
| Author | Posts |
|---|---|
| June 26, 2012 at 7:15 am #183424 | |
|
Manbir Singh @talk2manbir Reputation - 76 Rank - Aluminum
|
It is advised that if data are not normal, we should transform them to make them normal. A commonly used transformation is Box-Cox. What is the need to transform data when we have nonparametric tests available? |
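For readers unfamiliar with the Box-Cox transformation named in the question, here is a minimal sketch of its usual definition: y = (x^λ - 1)/λ for λ ≠ 0 and y = ln(x) for λ = 0, applied to strictly positive data. The function names `box_cox` and `skewness` and the sample values are illustrative assumptions, not anything from the thread. Statistical packages normally estimate λ from the data; here it is simply supplied by the caller.

```python
import math


def box_cox(values, lam):
    """Apply the Box-Cox transform for a caller-supplied lambda.

    Assumes strictly positive data; lambda is NOT estimated here.
    """
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        # Limiting case of (x**lam - 1) / lam as lam approaches 0
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def skewness(values):
    """Population skewness (third standardized moment), as a rough symmetry check."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Made-up, right-skewed illustration data (each value doubles the previous one)
data = [1.0, 2.0, 4.0, 8.0, 16.0]
transformed = box_cox(data, 0)  # lambda = 0 is the log transform

print("skewness before:", skewness(data))
print("skewness after: ", skewness(transformed))
```

With this deliberately chosen sample, the log transform makes the values evenly spaced, so the skewness drops to essentially zero. Real data will rarely behave this neatly.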
| June 26, 2012 at 9:01 am #183431 | |
|
Joel Smith @joelatminitab Reputation - 974 Rank - Copper
|
Manbir- Whether or not you should use a transformation depends greatly on what tool you are using. For example, it is not necessary on a control chart unless your data are highly skewed or you are getting nonsensical control limits, and even then a different chart may solve the issue without transformation. With respect to nonparametric tests, there are two things to consider. The first is that although normality may be an assumption of a test in the mathematical sense, research shows that many of those tests (1- and 2-Sample T, One-way ANOVA, etc.) are very robust and that normality is not required. Second, nonparametric tests are, almost across the board, much less powerful than their parametric counterparts, so you give up a lot when switching to one. Capability Analysis is one tool that is highly sensitive to the distribution assumed. If your data are non-normal, then Capability is available for many other distributions. If your data don’t seem to fit a distribution and are bi-modal or have some other very unusual shape, your process may not be in control or your data may represent more than one process. Good luck, Joel |
| June 26, 2012 at 9:03 am #183432 | |
|
Robert Butler @rbutler Reputation - 2138 Rank - Silver
|
If that is what has been advised then it’s not very good advice. The first thing you need to do is plot the data (histogram, normal probability plot, time plot, whatever makes sense) and look at what you have. Once you know what it looks like, you then need to have some understanding of what it is that you want to do with the data since, in many instances, data normality isn’t an issue. As for nonparametric tests, they are not a cure-all. They too have issues and you need to know what they are before you decide to use them. |
| June 26, 2012 at 10:05 pm #183454 | |
|
Manbir Singh @talk2manbir Reputation - 76 Rank - Aluminum
|
Thanks Robert & Joel. This helps a lot. |
| June 26, 2012 at 10:19 pm #183456 | |
|
Chris Seider @cseider Reputation - 3002 Rank - Titanium
|
If you are talking 5, 10, 15, 20+% defective, don’t even bother coming up with the right distribution to get the most appropriate process capability. As @rbutler stated well, go look for your causes and begin to eliminate the reasons. Customers don’t care what distribution the data follow; they care about the % out of specification. If you do a basic process capability (normal) using Minitab, just use the % or ppm observed in the lower left box. Make this your primary metric, and do what rbutler advised. Some advocate transformations, but I advise my belts to begin solving the problems first; deciding which distribution might make sense can come afterward.
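The observed % or ppm out of specification described above needs no distributional assumption at all: it is just a count of measurements outside the limits. A minimal sketch follows; the function name `observed_out_of_spec`, the spec limits, and the measurements are made up for illustration and are not Minitab output.

```python
def observed_out_of_spec(measurements, lsl, usl):
    """Count measurements outside [lsl, usl]; report observed percent and ppm.

    No distribution is fitted: this is purely what was observed in the sample.
    """
    n = len(measurements)
    if n == 0:
        raise ValueError("need at least one measurement")
    out = sum(1 for m in measurements if m < lsl or m > usl)
    return {
        "count": out,
        "percent": 100.0 * out / n,
        "ppm": 1_000_000 * out / n,
    }


# Hypothetical measurements and spec limits, for illustration only
sample = [9.8, 10.1, 10.4, 9.5, 10.0, 10.9, 10.2, 9.9, 10.0, 10.3]
result = observed_out_of_spec(sample, lsl=9.6, usl=10.6)
print(result)
```

Because it is a raw count, the figure is only as good as the sample size behind it. A small sample with zero observed defects says little about the true defect rate.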
|
| July 5, 2012 at 2:26 pm #183720 | |
|
Darth @Darth Reputation - 1285 Rank - Silver
|
@joelatminitab Hey Joel, hope all is well. Circling back to the last comment the poster made, he concluded that he could go ahead and do ANOVA if the normality wasn’t too bad. Let’s not forget about the notion of equal variances. That might be a bigger issue than normality, eh? |
| July 6, 2012 at 5:28 am #183733 | |
|
Joel Smith @joelatminitab Reputation - 974 Rank - Copper
|
@Darth – That’s right. Alternatively, the Welch Test is not sensitive to differences in variation. The Assistant Menu in Minitab uses the Welch Test for One-Way ANOVA while the Stat Menu uses the traditional F-Test. If the equal variances assumption is not met, I’d recommend trying Welch first before resorting to a nonparametric test. |
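To show what the Welch approach does differently, here is a minimal sketch of the standard Welch one-way ANOVA statistic, which weights each group by n/s² instead of pooling the variances. It is written from the textbook formula and is not a reproduction of Minitab's implementation. The function name `welch_anova` and the data are assumptions for illustration. Only the F statistic and its degrees of freedom are computed, because a p-value needs an F-distribution routine that the Python standard library does not provide.

```python
from statistics import mean, variance


def welch_anova(groups):
    """Welch's one-way ANOVA: returns (F, df1, df2) without assuming equal variances.

    Each group needs at least two observations and a nonzero sample variance.
    """
    k = len(groups)
    if k < 2:
        raise ValueError("need at least two groups")
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    # Precision weights: groups with noisy means count for less
    weights = [n / variance(g) for n, g in zip(sizes, groups)]
    total_w = sum(weights)
    weighted_grand_mean = sum(w * m for w, m in zip(weights, means)) / total_w

    between = sum(
        w * (m - weighted_grand_mean) ** 2 for w, m in zip(weights, means)
    ) / (k - 1)
    lam = sum(
        (1 - w / total_w) ** 2 / (n - 1) for w, n in zip(weights, sizes)
    )
    f_stat = between / (1 + 2 * (k - 2) * lam / (k ** 2 - 1))
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)
    return f_stat, df1, df2


# Made-up groups with visibly different spreads
group_a = [10.1, 9.8, 10.3, 10.0, 9.9]
group_b = [11.5, 9.0, 12.2, 8.7, 10.9, 11.8]
group_c = [10.6, 10.4, 10.9, 10.5, 10.7]

f_stat, df1, df2 = welch_anova([group_a, group_b, group_c])
print(f"Welch F = {f_stat:.3f} on ({df1}, {df2:.2f}) degrees of freedom")
```

With exactly two groups, this statistic reduces to the square of Welch's two-sample t statistic, and the denominator degrees of freedom reduce to the Welch-Satterthwaite value.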
| July 6, 2012 at 8:39 am #183735 | |
|
Darth @Darth Reputation - 1285 Rank - Silver
|
@joelatminitab Just making sure you are still on top of things :-). |
| July 7, 2012 at 10:31 am #183752 | |
|
Mike Carnell @Mike-Carnell Reputation - 3168 Rank - Titanium
|
@Darth @joelatminitab I am sure Joel feels much better knowing that you are checking up on him. It looks like you have picked up a Canadian accent? |
© Copyright iSixSigma 2000-2013. Any reproduction or other use of content without the express written consent of iSixSigma is prohibited.