Box Cox Transformation


Viewing 3 posts - 1 through 3 (of 3 total)


    Good day,
    I did a Box-Cox transformation, but the lambda value is not between -5 and 5. What does that mean? How can I calculate the Cpk index if the data are not normally distributed? Thanks.


    Alpaslan Terekli

    If you cannot transform your data to normality with Box-Cox:
    Reason 1: Your measurement system is not precise enough to measure intermediate or extreme values. Please check with a dot plot. If the plot shows a mass of dots at the minimum and maximum, you can assume the measurement system is not capable of measuring beyond a certain range. If the columns of dots on the plot have a certain distance (gaps) between them, your measurement system is most probably not good enough to resolve intermediate values. Such data usually cannot be transformed to normality by Box-Cox.
    Reason 2: Check the dot plot again. If you see more than one hill, that is also a reason why you cannot transform your data to normality. More than one hill is the outcome of changes in the process (different machine, different shift, etc.). You can also see this on a probability plot: the points will follow an S shape.
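The dot-plot check from Reason 1 can also be approximated in numbers. The following is only a rough sketch: the function name `dot_plot_summary` and the sample values are made up for illustration, and what counts as "too many" points at the extremes or "too few" distinct levels is left to your own judgement.

```python
from collections import Counter

def dot_plot_summary(values):
    """Summarise what a dot plot would show about measurement resolution."""
    counts = Counter(values)
    levels = sorted(counts)            # distinct measured values = dot columns
    n = len(values)
    # Distances between neighbouring dot columns; a constant, large gap
    # suggests the gauge rounds to coarse steps.
    gaps = [round(b - a, 10) for a, b in zip(levels, levels[1:])]
    return {
        "distinct_levels": len(levels),
        "share_at_min": counts[levels[0]] / n,
        "share_at_max": counts[levels[-1]] / n,
        "smallest_gap": min(gaps) if gaps else None,
    }

# Made-up readings that pile up at both ends and move in 0.5 steps
readings = [1.0, 1.0, 1.0, 1.5, 2.0, 2.0, 2.5, 3.0, 3.0, 3.0]
summary = dot_plot_summary(readings)
print(summary)
```

A high share at the minimum or maximum, or only a handful of distinct levels, points to the measurement problem described in Reason 1. This does not detect the multiple hills of Reason 2; for that the plot itself is still the better tool.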
    Last hint: you can use the Johnson transformation.
    Hope this helps.



    Hi James,
    The Box-Cox transformation searches for the best lambda (L) between -5 and 5 such that Y**L is as close as possible to a normal distribution (the exception is L = 0, where Log(Y) is taken).
    This ‘goes wrong’ if:

    • The data contain negative or zero values: the calculation is not possible and Box-Cox returns no result. {Add a constant to all data so that every value is positive; choose it large enough to stay on the safe side for the next sample.}
    • Box-Cox finds that the best L is not within [-5, 5]: the data are ‘not transformable’. This often happens with data that were measured ‘too roughly’, as the other poster mentioned. {Measure better.}
    • Box-Cox finds an L, but the transformed data are still not normally distributed. This can happen if you have several populations (camel hump). {Split the populations.}

    Make a picture of the raw data first (so you see what is going on). Look for outliers and combined populations.
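To make the lambda search concrete, here is a minimal sketch in plain Python. It is not what any particular statistics package does internally: the function names, the 0.01 grid step, and the sample data are assumptions for illustration. It scores each lambda with the usual Box-Cox profile log-likelihood, using the scaled form (Y**L - 1)/L, which is a linear rescaling of Y**L and so gives the same shape.

```python
import math

def boxcox_loglik(data, lam):
    """Profile log-likelihood of lambda for strictly positive data."""
    n = len(data)
    logs = [math.log(x) for x in data]
    if abs(lam) < 1e-12:
        y = logs                                    # lambda = 0: log transform
    else:
        y = [math.expm1(lam * lx) / lam for lx in logs]   # (x**lam - 1) / lam
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / n       # assumes data are not all equal
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(logs)

def best_lambda(data, lo=-5.0, hi=5.0, step=0.01):
    """Grid search for lambda in [lo, hi]; also reports if it hit a boundary."""
    if min(data) <= 0:
        raise ValueError("Box-Cox needs strictly positive data; add a constant first")
    steps = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(steps + 1)]
    lam = max(grid, key=lambda g: boxcox_loglik(data, g))
    at_edge = lam in (grid[0], grid[-1])            # best value stuck at -5 or 5
    return lam, at_edge

# Made-up data whose logarithms are symmetric, so lambda should land near 0
sample = [math.exp(z) for z in (-1.2, -0.8, -0.4, -0.2, 0.0, 0.2, 0.4, 0.8, 1.2)]
lam, at_edge = best_lambda(sample)
print(lam, at_edge)
```

If `at_edge` comes back true, the best lambda is pinned to the boundary of the search range, which corresponds to the ‘not transformable’ case above. Finding a lambda inside the range still does not prove normality, so test the transformed data afterwards.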
    If the data are not normally distributed:
    If there is one population:

    • Find out which distribution fits (in Minitab: Stat -> Qual Tools -> IDI) and calculate Cpk from it (in Minitab: Stat -> Qual Tools -> Cap Ana -> nonnormal -> choose the correct function).
    • Replace 3*Sigma in the Cpk calculation with (P[0.9985]-P[0.0015])/2, where P[k] is the location such that k*100% of the data points lie to its left. (These percentiles approximate the ±3 sigma points of a normal distribution, whose exact tail probabilities are 0.00135 and 0.99865.)

    If there is more than one population: split them up and calculate each one separately. Do not combine or average the Cpks!
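The percentile replacement for 3*Sigma can be sketched as follows. This is an illustration under assumptions, not a standard routine: the function names and the data are made up, the percentiles are read directly from the sample with linear interpolation, and the mean is used as the process centre as in the ordinary Cpk formula (the median is another common choice). Empirical percentiles this far out in the tails only make sense with many data points; with small samples they are usually taken from a fitted distribution instead.

```python
def percentile(sorted_vals, k):
    """Location with a fraction k of the points to its left (linear interpolation)."""
    pos = k * (len(sorted_vals) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_vals) - 1)
    frac = pos - lower
    return sorted_vals[lower] + frac * (sorted_vals[upper] - sorted_vals[lower])

def percentile_cpk(data, lsl, usl, p_low=0.0015, p_high=0.9985):
    """Cpk with 3*Sigma replaced by (P[p_high] - P[p_low]) / 2."""
    vals = sorted(data)
    half_spread = (percentile(vals, p_high) - percentile(vals, p_low)) / 2.0
    centre = sum(vals) / len(vals)      # assumption: mean as process centre
    return min(usl - centre, centre - lsl) / half_spread

# Made-up, evenly spread values purely to show the call
values = [float(i) for i in range(1001)]
print(percentile_cpk(values, lsl=-100.0, usl=1100.0))
```

Run this once per population; as said above, the resulting Cpk values should not be averaged.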
    Good luck
