iSixSigma

Statistic for skewed data


  • #30152

    Ex GB
    Participant

    Which statistic should I use to describe a skewed population? Some people told me we should use the mean, but others use the median.
    Please advise.
    Ex-GB

    #78270

    James A
    Participant

    Morning Ex-GB,
    I’d try using kurtosis to describe the skew. The NIST website is a great source of statistics material, and it’s free. The following link will take you to the area I think you need to look at for more information:
    http://www.itl.nist.gov/div898/handbook/eda/section3/eda35b.htm
    James A

    #78271

    Ex GB
    Participant

    Thanks James,
    It seems that kurtosis is used to describe the skewness of the distribution. Which parameter should I use to describe the central value of the distribution: the arithmetic average, the median, or something else?
    Regards
    ex-GB
     

    #78272

    James A
    Participant

    I’d try using the arithmetic mean. If you think about it, it makes sense, as it is the same statistic you would use in SPC, which also deals with ‘normal’, albeit occasionally skewed, data.
    Statistician I’m not, so others may disagree, but that’s where I’d go.
    James A

    #78273

    Ron
    Member

    If you are not discussing this with statisticians then use both. Also plot the data in a histogram.
    Remember, the customer does not feel the mean, so it is important to talk about both the mean and the median and the differences between them. Also, by displaying the data you can tell whether you are dealing with special-cause or common-cause occurrences.
     

    #78277

    Marc Richardson
    Participant

    Kurtosis is a measure of the relative height of the normal, bell-shaped curve. Skewness is a measure of the relative amount of left or right shift of the mean in the distribution. By definition then, a large left or right shift in the mean of a distribution would suggest that the mean is not an accurate measure of the distribution’s central tendency. It is a strong indication that the data are not normally distributed. In these cases, it is worth analyzing the data using nonparametric methods.
     
    Marc Richardson
    Sr. Q.A. Eng.
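
As a rough illustration of the two measures described above, here is a minimal sketch that computes moment-based skewness and excess kurtosis. It assumes the simple population (biased) formulas; statistical packages often apply small-sample corrections, so their values may differ slightly. The function names and sample data are made up for illustration.

```python
def central_moment(data, k):
    """k-th central moment, dividing by n (population form)."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** k for x in data) / n


def skewness(data):
    """Moment skewness: positive for a long right tail, negative for a long left tail."""
    m2 = central_moment(data, 2)
    return central_moment(data, 3) / m2 ** 1.5


def excess_kurtosis(data):
    """Moment kurtosis minus 3, so a normal distribution scores about 0."""
    m2 = central_moment(data, 2)
    return central_moment(data, 4) / m2 ** 2 - 3.0


# Hypothetical samples, chosen only to show the sign of the skewness.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 20]

print("symmetric:   ", skewness(symmetric), excess_kurtosis(symmetric))
print("right-skewed:", skewness(right_skewed), excess_kurtosis(right_skewed))
```

The symmetric sample has zero skewness, while the sample with one large value has clearly positive skewness, which is the situation where the mean is pulled away from the bulk of the data.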

    #78278

    Ashman
    Member

    I suppose that depends on how skewed the data is. You can use the kurtosis to figure out how skewed it is. Once you know that, you can decide whether the mean or median is more appropriate. Most people don’t like to go away from their usual practices, so they will tell you to use the mean all the time. But if your data is not distributed normally or uniformly, then often the median is appropriate, particularly if you have a few points that are skewing your data.
    EX: Sample A has 1000 data points, each equal to 10. Therefore, mean = median = 10. Sample B has 1000 data points, 999 equal to 10 and 1 equal to 1,000,000. The median still equals 10, but the mean is now equal to 1009.99. So you tell me which statistic better represents the data?
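
The arithmetic in this example can be checked with a few lines of Python. This is only a sketch using the standard `statistics` module; the variable names are illustrative.

```python
from statistics import mean, median

# Sample A: 1000 points, all equal to 10.
sample_a = [10] * 1000
# Sample B: 999 points equal to 10 plus a single extreme value.
sample_b = [10] * 999 + [1_000_000]

for name, sample in (("A", sample_a), ("B", sample_b)):
    print(f"Sample {name}: mean = {mean(sample)}, median = {median(sample)}")
```

One extreme point moves the mean from 10 to 1009.99, while the median stays at 10.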

    #78405

    Harjot
    Participant

    Here’s a simple answer to your simple question:
    Use the mean, median, and standard deviation to describe the data.
    For analysis, the median and span might be the best measures. Use your process expertise to understand why the skew is happening. It might be because there are variations in the process, or because there is more than one process hiding in the data. Also, try segmentation to remove the skew.
    All the best. Hope this helps!
    Harjot

    #78414

    Ropp
    Participant

    If the data is severely skewed, then the median and interquartile range may be appropriate. If you suspect that the underlying distribution is normal, then it is important to determine the cause of the skewness in your sample.
    Avoid using measures of skewness and kurtosis unless the sample size is quite large (rules of thumb range from n ≥ 50 to n ≥ 100).
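
For reference, a minimal sketch of computing the median and interquartile range with Python's standard `statistics` module. It assumes the module's default "exclusive" quartile method; other packages use different quartile conventions and may give slightly different values. The helper name and the sample data are hypothetical.

```python
from statistics import mean, median, quantiles


def median_and_iqr(data):
    """Return (median, interquartile range) using the default quartile method."""
    q1, _, q3 = quantiles(data, n=4)
    return median(data), q3 - q1


# Hypothetical right-skewed sample: nine small values and one large one.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]

med, iqr = median_and_iqr(data)
print(f"mean = {mean(data)}, median = {med}, IQR = {iqr}")
```

The single large value raises the mean well above the median, while the median and interquartile range are unaffected by how large that value is.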

    #78689

    Arturo Ruiz Falcó
    Participant

    The answers you have received highlight the mean/median dilemma. I suggest performing a data transformation to achieve normality (and therefore symmetry). You can do it using the Box-Cox transformation, which requires a lot of calculations but is available in most statistical packages (e.g. Minitab). I hope it helps.
    Regards,
    Arturo
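
To give an idea of the calculations involved, here is a minimal sketch of a Box-Cox transformation in plain Python. It assumes strictly positive data and picks lambda by a coarse grid search over the profile log-likelihood; this is not how Minitab or any particular package implements it, and the function names, the grid, and the synthetic sample are all illustrative assumptions.

```python
import math
from statistics import NormalDist, pvariance


def box_cox(x, lam):
    """Box-Cox transform of one positive value; lam = 0 is the log transform."""
    if abs(lam) < 1e-12:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def log_likelihood(data, lam):
    """Profile log-likelihood of lam under a normal model (constants dropped)."""
    n = len(data)
    transformed = [box_cox(x, lam) for x in data]
    log_sum = sum(math.log(x) for x in data)
    return -0.5 * n * math.log(pvariance(transformed)) + (lam - 1.0) * log_sum


def best_lambda(data, grid=None):
    """Grid value of lam with the highest log-likelihood (assumed grid: -2.0 to 2.0)."""
    if grid is None:
        grid = [i / 10 for i in range(-20, 21)]
    return max(grid, key=lambda lam: log_likelihood(data, lam))


# Synthetic right-skewed sample: exponentials of evenly spaced normal quantiles,
# so the log transform (lam = 0) should make it symmetric again.
n = 200
z = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
skewed = [math.exp(v) for v in z]

lam = best_lambda(skewed)
transformed = [box_cox(x, lam) for x in skewed]
print("chosen lambda:", lam)
```

For this synthetic sample the search should settle on a lambda of 0, that is, a plain log transform. Real data will rarely give such a clean answer, so it is worth checking a histogram of the transformed values.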
