iSixSigma

What Is Distribution?


This topic contains 5 replies, has 4 voices, and was last updated by  Rip Stauffer 11 months, 2 weeks ago.

Viewing 6 posts - 1 through 6 (of 6 total)
  • #55902

    Andy Croniser
    Participant

    Anyone want to take a shot at explaining this in simplified terms?

    I would appreciate it.

    #202082

    MBBinWI
    Participant

    @andycroniser – why don’t you provide your explanation and we can give you feedback?

    #202087

    Chris Seider
    Participant

    Try a histogram, or Stat > Basic Statistics > Graphical Summary in Minitab.

    @MBBinWI please tell me you’re staying warm and safe–brrr it’s kinda cold up there! GO PACK GO!

    #202088

    Mike Bonnice
    Participant

    A distribution is a description of the amount of variation and the kind of variation.

    #202091

    Jay Armstrong

    It’s an arrangement of data that reflects how often each value under study occurs. Think of a forest: the heights of the trees form one kind of distribution. A few trees will be very small (1 to 3 feet); most will be near average height (15 to 30 feet); and a very few will be taller than 30 feet. That range of heights, together with how often each height occurs, is the distribution, and it is usually drawn as a histogram.
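To make the forest picture concrete, here is a minimal Python sketch that bins a set of tree heights and prints a text histogram. The heights are simulated (the mean of 22 feet, the spread of 7 feet and the 5-foot bin width are assumptions chosen for illustration, not measurements), and `height_histogram` is a hypothetical helper name.

```python
import random
from collections import Counter


def height_histogram(heights, bin_width=5):
    """Count how many heights fall in each bin [left, left + bin_width)."""
    counts = Counter(int(h // bin_width) * bin_width for h in heights)
    return dict(sorted(counts.items()))


# Simulated heights in feet: made-up numbers, not real forest data.
rng = random.Random(0)
heights = [max(1.0, rng.gauss(22, 7)) for _ in range(200)]

hist = height_histogram(heights)
for left, count in hist.items():
    # One '#' per tree in the bin, so the bars show how the data pile up.
    print(f"{left:>3}-{left + 5:<3} ft | {'#' * count}")
```

The printed bars are the empirical distribution: the bin labels give the range of values and the bar lengths give the frequencies.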

    #202099

    Rip Stauffer

    We usually talk about a distribution in two different senses:
    1. Empirical distributions: how does a set of collected data pile up when plotted with a number scale that includes all the values as the x-axis and frequency (number of observations) as the y-axis? This pile of data can be called the “distribution” of observations. It is common to depict the empirical distribution in a histogram (for continuous data), or a discrete plot in the case of discrete values. We use statistics to estimate the spread, center and other important parameters of these empirical piles of data (xbar, for instance, is the statistic we use to estimate mu; S is the estimate of sigma).
    2. Theoretical distributions: Models we use for inferential statistics. These models are based on mathematical operations performed using parameters for shape, center and spread (mu, for example, or sigma). If the empirical pile appears similar to the theoretical distribution, we often assume that the data from the phenomenon of interest (population or process) will be distributed similarly to the theoretical model, so we extrapolate from the empirical to the theoretical.
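The two senses above can be put side by side in a short Python sketch: compute xbar and S from a sample, build a normal model from those estimates, and compare how much of the data falls within k standard deviations with what the model says. The data are simulated and the choice of a normal model is an assumption made for illustration; `empirical_fraction` is a hypothetical helper.

```python
import random
from statistics import NormalDist, mean, stdev

# Simulated observations: an assumption for illustration only.
rng = random.Random(1)
data = [rng.gauss(50, 4) for _ in range(500)]

# Empirical side: statistics computed from the pile of data.
xbar = mean(data)   # estimate of mu
s = stdev(data)     # sample standard deviation, estimate of sigma

# Theoretical side: a normal model using those estimates as parameters.
model = NormalDist(mu=xbar, sigma=s)


def empirical_fraction(values, low, high):
    """Fraction of observations that fall in [low, high]."""
    return sum(low <= v <= high for v in values) / len(values)


for k in (1, 2, 3):
    low, high = xbar - k * s, xbar + k * s
    observed = empirical_fraction(data, low, high)
    expected = model.cdf(high) - model.cdf(low)
    print(f"within {k} s: observed {observed:.3f}, model {expected:.3f}")
```

If the observed and model fractions are close, the pile of data is not inconsistent with the model; that is a judgment about similarity, not proof that the process follows the model.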

    A few important notes:
    1. The curves of many commonly used distribution models are asymptotic; that is, the tails of the distribution approach zero but never actually reach it. In a histogram, the tails do reach zero, and they do not stretch out to infinity in either direction.
    2. One important notion that’s often neglected in these discussions is that a distribution implies predictability or homogeneity…that the observations all come from the same universe (population or process cause system).
    3. From Don Wheeler: “a probability model is, at best, a limiting characteristic of an infinite sequence of data. Therefore, it cannot be a property of any finite portion of that sequence.” What this means, essentially, is that data are never “normally distributed.” The best we can say is that our observations pile up in a way that is not inconsistent with the model we are going to use to characterise them.
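The first note, about asymptotic tails, can be checked with a small Python sketch. It compares the density of a standard normal model far from the center with the number of observations a finite sample has out there. The sample values are made up for illustration, and `count_at_or_beyond` is a hypothetical helper.

```python
from statistics import NormalDist

# Theoretical model: a standard normal curve.
model = NormalDist(mu=0, sigma=1)

# A small made-up sample (illustrative values, not real data).
sample = [-2.1, -0.7, -0.3, 0.0, 0.4, 0.9, 1.8]


def count_at_or_beyond(values, threshold):
    """Number of observations at or above the threshold."""
    return sum(v >= threshold for v in values)


for z in (2, 4, 6):
    density = model.pdf(z)                  # small, but greater than zero
    count = count_at_or_beyond(sample, z)   # zero once past the largest value
    print(f"z={z}: model density {density:.3e}, observations at or beyond: {count}")
```

The model's curve keeps a positive height at every value shown, while the sample's histogram is empty past its largest observation, which is one reason a finite set of data can only ever be compared with a model rather than be one.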

