# The Dangers of Johnson (and Other) Transformations


#54001

Karlsson
Participant

Hi,

I have a quick question for those of you with a more in-depth statistical understanding than I have.

When I first went through my six sigma training before I got my green belt, I was taught to NEVER use a Johnson or other transformation in order to analyze non-normal data. The instruction was to instead try to find out why my data was non-normal and see what I could do to fix it.

I have lived by this mantra now for several years, but I was challenged on it the other day in a project in which there was no desire to take extra time and samples to figure out their deliverables. It was much easier for them to just do a Johnson transformation and be done with it.

That is when I realized that I don’t understand the dangers well enough to make an educated decision about when or why it is safe to use a Johnson transformation.

Does anyone have a good understanding of what – exactly – the dangers are when it comes to these transformations, and why it is a good idea to avoid them?

Thanks,
Matt

#192686

Mike Carnell
Participant

@mkarlsson The advice to fix all your non-normal data was just plain stupid advice. There are a lot of things in this world that are not normally distributed. Time is frequently not normally distributed particularly when it is bounded by zero on one end (actually it is always bounded by zero – always on the same end even – but the data doesn’t always bump up against it).

It is always good to run through several iterations of graphs just to look at the data in different ways to try to understand it. I wouldn’t get wrapped around the axle just because it was not normal. I would call your instructor and let him know what crap advice he handed out (I wasn’t your instructor, was I? If I was, don’t call).

The other thing I don’t understand is why you want to transform things. There are tools to analyze non-normal data. If you have Minitab, there is capability analysis for non-normal data, nonparametric tests, and Levene’s test. The help menu is good in terms of telling you the assumptions. Stop screwing around making the analysis a science project. Taking the result of the analysis and doing something with it is where the fun (and your future) is. Sitting at your computer transforming stuff that doesn’t need to be transformed makes you look like a nerd.

Just my opinion.

#192687

Robert Butler
Participant

It’s an interesting mantra, it’s boilerplate, and it is simultaneously right and wrong. There’s nothing dangerous about transforms. The only danger, and this is what your instructor was trying to make you understand, is the blind application of a transform (or any other statistical tool for that matter) without looking at the data (that is, really looking at it). In short, he/she wanted to make sure that you do first things first.

First thing – plot the data and look at it and spend some time understanding the process from whence the data came.
a. Plots should include a histogram, a boxplot, and a normal probability plot.
b. You should know how various distributions look when they are plotted in the above manner.
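For anyone without a stats package handy, here is a minimal sketch of what a normal probability plot is actually measuring. It is not Minitab's method: the plotting positions `(i - 0.5) / n`, the helper names, and both data sets are assumptions made up for illustration. The closer the correlation between the theoretical normal quantiles and the sorted data is to 1, the straighter the plot would look.

```python
import math
from statistics import NormalDist, mean

def normal_plot_points(data):
    """Return (theoretical normal quantile, observed value) pairs."""
    ordered = sorted(data)
    n = len(ordered)
    std_normal = NormalDist()
    # Plotting positions (i - 0.5) / n are one common convention among several.
    quantiles = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(quantiles, ordered))

def straightness(data):
    """Pearson correlation of the plot points; near 1 means nearly a straight line."""
    pts = normal_plot_points(data)
    qs = [q for q, _ in pts]
    xs = [x for _, x in pts]
    q_bar, x_bar = mean(qs), mean(xs)
    cov = sum((q - q_bar) * (x - x_bar) for q, x in pts)
    q_ss = sum((q - q_bar) ** 2 for q in qs)
    x_ss = sum((x - x_bar) ** 2 for x in xs)
    return cov / math.sqrt(q_ss * x_ss)

# Made-up illustrative data, not from any real process.
symmetric = [9.1, 9.6, 9.8, 10.0, 10.1, 10.2, 10.4, 10.9]
skewed = [0.2, 0.3, 0.4, 0.5, 0.7, 1.1, 2.5, 9.0]

print(f"symmetric: {straightness(symmetric):.3f}")
print(f"skewed:    {straightness(skewed):.3f}")
```

A single number like this is no substitute for looking at the histogram and boxplot as well; it only shows why a skewed sample bends away from the line.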

Second thing – if the data is non-normal how is it non-normal? Is it multimodal? Is it truncated? Does it appear to have a natural upper or lower bound? Do the tails look “too heavy”? etc.

Third thing – given the items listed above you should know what to do when confronted with any of them.

For example:

Multimodal – probably means multiple feeds of some kind therefore you are making multiple products – better find out the story between the modes before doing anything else.

Truncated – who’s cutting off the tails of the product distribution and why?

Apparent natural upper/lower bound – why – many processes have natural bounds and if you are operating too close to those bounds your distribution will always be non-normal – see Bothe – Measuring Process Capability Chapter 8 “Measuring Capability for Non-Normal Data” for lots of examples.

Tails too heavy – why – does it matter?

etc.

After satisfying yourself that the data is representative of the process when everything is under control then, and only then, should you give some thought to the need for data transformation.

If you do transform, you will need to know what the transform is doing for you and whether it matters. For example, if your data is truncated (this typically happens when a supplier is cherry-picking material lots), all the transforms in the world won’t make that data normal – it is a truncated whatever and it will remain so.

If you do transform – run whatever analysis you are running with both transformed and untransformed data to see if anything changes with respect to outcomes or actions that might be taken as a result of the analysis. If nothing changes then you might want to ask why you would want to bother with a transform in the first place.
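A rough sketch of that "run it both ways" check, under stated assumptions: the analysis is a two-sample comparison using Welch's t statistic, the transform is a plain natural log standing in for whatever transform is actually in use, and the two groups are invented numbers. The function name `welch_t` is just a label for this example.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for the difference in means of two samples."""
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error

# Made-up cycle times for two hypothetical process settings.
group_a = [1.0, 1.2, 1.5, 2.0, 3.5, 1.1, 1.4, 1.8]
group_b = [2.2, 2.8, 3.1, 4.0, 7.5, 2.5, 3.0, 3.6]

t_raw = welch_t(group_a, group_b)
t_log = welch_t([math.log(x) for x in group_a],
                [math.log(x) for x in group_b])

print(f"t on raw data: {t_raw:.2f}")
print(f"t on log data: {t_log:.2f}")
```

If both versions point to the same action, the transform is not buying much; if they disagree, that disagreement is the thing to investigate.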

One final thought. If your crew is running a Johnson transform – which one are they running? (The Johnson family has three: SB, SL, and SU.)

#192688

MBBinWI
Participant

@mkarlsson – Both Mike and Robert have given you excellent advice (as would be expected). I’m more concerned that you have a team “in which there was no desire to take extra time and samples to figure out their deliverables. It was much easier for them to just do a Johnson transformation and be done with it.” Does that mean that they have done what Robert describes above? If so, then plow on, if not, how are you going to determine what to do? Seek first to understand.
There is one very real risk in transforming data, and that is that you also need to transform the spec limits for a capability study. This often confuses people and they try to “untransform” them (even putting in reference lines in Minitab). This can cause more problems than originally existed.
There is nothing inherently right or wrong with transformation, you just need to use it appropriately.
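To make the spec-limit point concrete, here is a small sketch. It assumes a simple natural-log transform as a stand-in for a Johnson transform, a one-sided upper spec limit, and made-up data; `upper_capability` is a hypothetical helper computing a Ppu-style index, (USL - mean) / (3 * s).

```python
import math
from statistics import mean, stdev

def upper_capability(values, usl):
    """Ppu-style index: distance from the mean to the upper spec in 3-sigma units."""
    return (usl - mean(values)) / (3 * stdev(values))

# Made-up cycle times (hours) and an assumed upper spec limit in hours.
times = [2.2, 2.8, 3.1, 4.0, 7.5, 2.5, 3.0, 3.6]
usl = 8.0

log_times = [math.log(t) for t in times]

# Consistent: data and spec limit are both on the log scale.
ppu_consistent = upper_capability(log_times, math.log(usl))

# Mixed scales: log data against the raw spec limit. This number is meaningless.
ppu_mixed = upper_capability(log_times, usl)

print(f"consistent scales: {ppu_consistent:.2f}")
print(f"mixed scales:      {ppu_mixed:.2f}")
```

The mixed-scale figure comes out far larger and would make the process look much more capable than it is, which is exactly the confusion to watch for.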

#192718

Darth
Participant

In summary:
1. Try to understand the data and see if there are some obvious reasons why it might not be normal. No rule says all data and all processes are normally distributed.
2. Try using nonparametric methods.
3. If the first two don’t really give you what you need, then cautiously transform the data keeping in mind that the units of measure will be totally different than the original data and can cause some confusion if looked at by someone unfamiliar with transformed data.

#192742

Member

Yeayahhhhh!
Yeayahhhhh!
The voice of reason prevails!
Kudos to all.

#192743

MBBinWI
Participant

@spazwhatsup – Got a problem, dude?

#192744

Darth
Participant

“Any distribution can be characterized by four parameters, whose calculations are the same for any distribution:”

You then provide a generic description of the mean, s.d., skewness and kurtosis. I would argue that the calculations for the s.d. of a binomial and a Poisson are quite a bit different from those for the Normal. While all distributions have descriptors of central tendency, variation and shape, I feel your sentence is misleading and possibly incorrect.

#192746

Katie Barry
Member

Note to all: The link that @[email protected] provided (and @Darth is referring to) has since been removed. We ask that forum participants do not promote products, services or businesses on the discussion forum, including articles contained on their professional websites. [ Forum Etiquette guidelines – https://www.isixsigma.com/topic/forum-etiquette/ ]

#192748

Paul Keller
Participant

Katie: Sorry about that. I was simply trying to provide more information than can easily be typed into this forum.

#192759

Member

Very good point Paul.

#192773

Darth
Participant

@[email protected] Whoops, now that Katie has removed the links, I can’t really follow up other than to question what you mean by a general formula for calculating s.d. We are all familiar with the formula for continuous data, but the calculations for discrete data are all very different, e.g., the s.d. for a Poisson is the sq. root of lambda, while the s.d. for a binomial uses the sq. rt. of a term in p and (1-p), etc. I do agree, from a more macro perspective, that worrying about an uncontrolled process makes little sense, but then again, Shewhart designed the control chart to be very robust to distribution shape, and this is well documented by Wheeler in his writings.
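The two kinds of calculation being argued about can be put side by side in a short sketch. The model formulas are the standard ones, sqrt(lambda) for a Poisson and sqrt(n·p·(1-p)) for a binomial; the simulation settings (n = 20, p = 0.3, 5000 draws, seed 0) are arbitrary choices for illustration.

```python
import math
import random
from statistics import stdev

def poisson_sd(lam):
    """Standard deviation implied by a Poisson model with mean lam."""
    return math.sqrt(lam)

def binomial_sd(n, p):
    """Standard deviation implied by a binomial model with n trials, probability p."""
    return math.sqrt(n * p * (1 - p))

# Simulate binomial counts as sums of Bernoulli trials (arbitrary settings).
rng = random.Random(0)
n, p = 20, 0.3
draws = [sum(rng.random() < p for _ in range(n)) for _ in range(5000)]

# The generic sample formula makes no assumption about the distribution.
empirical_sd = stdev(draws)

print(f"binomial model s.d.: {binomial_sd(n, p):.3f}")
print(f"sample s.d.:         {empirical_sd:.3f}")
```

When the data really do follow the assumed model, the two numbers should land close together; the model formula is a shortcut that is only valid under that assumption.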

#192777

Paul Keller
Participant

As I said, the formula for calculating the standard deviation of a set of empirical data is different from the formulas for calculating the standard deviation of a presumed distribution.
You might investigate some of the peer-reviewed journals on quality engineering, including the Journal of Quality Technology (published by ASQ), Quality Engineering (published by ASQ) and Technometrics (jointly published by ASQ and ASA). They’ve been publishing peer-reviewed articles for many years, discussing the statistical basis of the control charts, their detection levels and false alarm rates. It is not magic that the control chart is robust to distribution. It is the very nature of its basis in statistical theory. Yet each control chart, like all statistical tests, has limitations. It should be obvious that a control limit defined well below zero, when the process metric cannot physically go below zero, is modeling the process poorly. The danger of this poor model is that you fail to detect real process changes, or that you react to perceived process changes that do not exist. Shewhart, Deming, and many others have discussed these issues as central to the need for control charts, so it is rather ironic when a chart is incorrectly used to result in that same dreaded outcome.
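A quick sketch of the below-zero control limit problem, assuming an individuals (X) chart with the usual moving-range limits, mean ± 2.66 × average moving range. The cycle times are made up and deliberately skewed toward zero; `individuals_limits` is a hypothetical helper name.

```python
from statistics import mean

def individuals_limits(values):
    """Return (LCL, center line, UCL) for an individuals chart."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    # 2.66 is 3 / d2, with d2 = 1.128 for moving ranges of size 2.
    half_width = 2.66 * mean(moving_ranges)
    return center - half_width, center, center + half_width

# Made-up cycle times (hours); none can be negative.
cycle_times = [1.2, 0.4, 3.8, 0.9, 0.3, 7.5, 1.1, 0.6, 2.4, 0.5]

lcl, center, ucl = individuals_limits(cycle_times)
print(f"LCL = {lcl:.2f}, center = {center:.2f}, UCL = {ucl:.2f}")
```

With these numbers the lower limit falls well below zero, so no possible observation could ever signal on the low side, which is the modeling problem described above.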
