# Box Cox transformation


- This topic has 3 replies, 4 voices, and was last updated 9 years, 5 months ago by broderick.

- February 2, 2011 at 3:51 pm #53714

Jaime (Participant, @JCastillo)

Have a good day,

I have transformed some data with a Box-Cox transformation, but lambda is not between -5 and 5. Can I use the transformed data to calculate the Cp, Cpk, Pp and Ppk indices? Thanks.

February 3, 2011 at 5:16 am #191199

Does your transformed data pass a normality test? If not, don’t use it.

Non-normality has many causes, and most of them have to do with a poorly controlled process. Most are not suitable for transformation. Instead, understand the underlying reason for the non-normality and fix the issue.
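As a rough illustration of the "does it pass a normality test?" check, here is a minimal sketch of an Anderson-Darling test written with only the Python standard library. The function names and the sample data are hypothetical, and the 0.752 cutoff is the commonly tabulated 5% critical value for the case where the mean and standard deviation are estimated from the data; statistical packages report a p-value instead and may differ in detail.

```python
import math
from statistics import NormalDist, mean, stdev

# Commonly tabulated 5% critical value when mean and sd are estimated (assumption).
CRITICAL_5_PERCENT = 0.752

def anderson_darling_normal(data):
    """Return the small-sample-adjusted A-squared statistic against a normal fit."""
    n = len(data)
    m, s = mean(data), stdev(data)
    z = sorted((v - m) / s for v in data)
    std_normal = NormalDist()
    # Clamp the CDF away from 0 and 1 so the logarithms stay finite.
    cdf = [min(max(std_normal.cdf(v), 1e-12), 1.0 - 1e-12) for v in z]
    total = sum(
        (2 * i + 1) * (math.log(cdf[i]) + math.log(1.0 - cdf[n - 1 - i]))
        for i in range(n)
    )
    a2 = -n - total / n
    # Adjustment for estimated parameters.
    return a2 * (1.0 + 0.75 / n + 2.25 / n ** 2)

def looks_normal(data):
    """True when the adjusted statistic is below the 5% critical value."""
    return anderson_darling_normal(data) < CRITICAL_5_PERCENT

# Hypothetical, strongly right-skewed measurements (e.g. cycle times).
cycle_times = [0.2, 0.3, 0.3, 0.4, 0.5, 0.6, 0.8, 1.1, 1.5, 2.4, 3.9, 7.5]
print(anderson_darling_normal(cycle_times), looks_normal(cycle_times))
```

If the transformed data still fail a check like this, the transformation has not done its job, whatever lambda was used.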

February 7, 2011 at 12:59 pm #191202

Woodley (Participant, @gwoodley)

Good morning:

Data can exist in many different “shapes”; that is why we have distributions. Typically, the first thing we test for is normality, and there are three common tests for it: Anderson-Darling, Ryan-Joiner, and Kolmogorov-Smirnov. If the tests fail, we then check for stratification (split or layered data). If that is not the reason for the non-normality, the next option is to find out which distribution the data come from (e.g. exponential, as with compound interest or acceleration). Failing that, the logical sequence suggests trying the Box-Cox transformation. Lambda does go from -5 to 5, with each value representing a different transformation: the input values are each raised to the power of lambda, except for lambda = 0, which gives the natural (Napierian) logarithm. Some software lets you set lambda yourself, and you can also simulate the effect of different lambda values on your data sets and test each result for normality.
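To make the "try different lambda values" idea concrete, here is a minimal sketch that scans lambda from -5 to 5 and keeps the value with the highest Box-Cox profile log-likelihood. The function names, the grid step, and the simulated sample are all assumptions for illustration, and this is not necessarily how any particular package picks lambda. The code uses the textbook scaled form (x^lambda - 1) / lambda, which differs from the plain power x^lambda only by a linear rescaling and so has the same effect on normality.

```python
import math
import random

def box_cox(x, lam):
    """Box-Cox transform of positive values: natural log at lambda = 0, scaled power otherwise."""
    if any(v <= 0 for v in x):
        raise ValueError("Box-Cox requires strictly positive data")
    if abs(lam) < 1e-12:
        return [math.log(v) for v in x]
    return [(v ** lam - 1.0) / lam for v in x]

def profile_loglik(x, lam):
    """Profile log-likelihood of lambda, up to an additive constant."""
    y = box_cox(x, lam)
    n = len(y)
    centre = sum(y) / n
    var = sum((v - centre) ** 2 for v in y) / n
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(math.log(v) for v in x)

def best_lambda(x, lo=-5.0, hi=5.0, step=0.1):
    """Grid search over [lo, hi]; returns the lambda with the highest log-likelihood."""
    steps = int(round((hi - lo) / step))
    grid = [round(lo + i * step, 10) for i in range(steps + 1)]
    return max(grid, key=lambda lam: profile_loglik(x, lam))

# Simulated lognormal data (hypothetical); a log transform, i.e. lambda near 0, suits it.
random.seed(0)
sample = [random.lognormvariate(0.0, 1.0) for _ in range(200)]
print(best_lambda(sample))
```

A best lambda that lands on the edge of the grid (exactly -5 or 5) is a hint that no power transformation in this family fits the data well.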

The final choice is the Johnson transformation. Either way, you can in most cases determine process capability. Some transformed data sets do not fit well, but we can cross that bridge later. Minitab is, by far, a great tool to assist you.

George
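For readers wondering what "determine process capability" on transformed data involves, here is a minimal sketch. It assumes the plain power form of the transformation described above (x raised to lambda, natural log at lambda = 0) and it uses the overall standard deviation, so the results are Pp and Ppk; Cp and Cpk would use a within-subgroup estimate of sigma instead. The function names, data, and specification limits are hypothetical. The key point is that the specification limits must go through the same transformation as the data.

```python
import math
from statistics import mean, stdev

def power_transform(v, lam):
    """Plain power form: v ** lambda, or the natural log when lambda is 0."""
    return math.log(v) if lam == 0 else v ** lam

def pp_ppk(data, lsl, usl, lam):
    """Pp and Ppk computed on the transformed scale, using the overall standard deviation."""
    y = [power_transform(v, lam) for v in data]
    # Transform the limits too; a negative lambda reverses their order, so sort them.
    t_lsl, t_usl = sorted((power_transform(lsl, lam), power_transform(usl, lam)))
    m, s = mean(y), stdev(y)
    pp = (t_usl - t_lsl) / (6 * s)
    ppk = min((t_usl - m) / (3 * s), (m - t_lsl) / (3 * s))
    return pp, ppk

# Hypothetical measurements and spec limits, with a square-root transform (lambda = 0.5).
measurements = [1.0, 4.0, 9.0, 16.0, 25.0]
print(pp_ppk(measurements, lsl=0.25, usl=36.0, lam=0.5))
```

Indices computed this way are only meaningful if the transformed data actually look normal and the process is stable.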

February 7, 2011 at 5:52 pm #191204

I would agree with Stan and avoid the complexity of Mr. Woodley’s suggestion for the moment. Simply plot the data and (assuming you have already established its stability) view the distribution’s shape relative to the process / data type.

Maybe you are working with cycle times, pricing data, etc., where it isn’t reasonable to expect normality. You can do everything you want to do (e.g. baseline your capacity) without transformation.

Simpler is better, both from an analytics and a communications perspective. Analytically, you are adding complexity where it isn’t needed, and from a communications standpoint, if your stakeholders’ eyes haven’t glazed over after you explained “sigma”, they certainly will once you throw Box-Cox or Johnson transformations at them.

In reality, making the complicated simpler is our true job, but that is just my “two cents”. Good luck.
