Discrete Vs Continuous Data
- This topic has 13 replies, 7 voices, and was last updated 12 years, 10 months ago by
aiman farid.
May 4, 2002 at 5:49 pm #29382
Team,
What are the key differences between discrete and continuous data, and how are both measured and sampled?
May 5, 2002 at 11:02 pm #75156
Gabriel (Participant, @Gabriel)
The difference is that “continuous” data can take any value: between any two values there will ALWAYS be infinitely many intermediate values. Discrete data can only take certain values. For example, the natural numbers are discrete: even though there are infinitely many of them, between any two of them there is only a finite number of values (strictly between 1 and 10 there are 8 values). The real numbers are continuous. No matter how close two values are, you will find infinitely many values between them (between 1 and 2 you have 1.1, 1.01, 1.001, etc.).
But measured data is always discrete, because the precision is never infinite (if your caliper has a resolution of 0.01 mm you will not find any “data” between 2.11 and 2.12, for example). However, if the precision is very good compared with the variation of the data, then the data is usually treated as if it were continuous.
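A minimal Python sketch of this point. The 0.01 mm resolution comes from the caliper example above; the simulated process (mean 2.115 mm, standard deviation 0.02 mm), the finer 0.0001 mm instrument, and the function names are assumptions made only for illustration.

```python
import random

def measure(true_value, resolution):
    """Return the reading an instrument with the given resolution would show."""
    steps = round(true_value / resolution)   # nearest scale division
    return round(steps * resolution, 10)     # trim floating-point noise

def distinct_readings(true_values, resolution):
    """Sorted list of the different readings the instrument reports."""
    return sorted({measure(v, resolution) for v in true_values})

random.seed(1)
# Hypothetical process: 1000 diameters, mean 2.115 mm, std dev 0.02 mm.
parts = [random.gauss(2.115, 0.02) for _ in range(1000)]

coarse = distinct_readings(parts, 0.01)     # caliper from the example
fine = distinct_readings(parts, 0.0001)     # assumed finer instrument

# Readings snap to scale divisions: nothing lands between 2.11 and 2.12.
print(measure(2.1149, 0.01), measure(2.116, 0.01))
print(len(coarse), "distinct readings at 0.01 mm")
print(len(fine), "distinct readings at 0.0001 mm")
```

With the coarse instrument the 1000 parts collapse onto a handful of values; with the finer one there are so many distinct readings relative to the spread that treating the data as continuous becomes a reasonable approximation.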
By the way, you didn’t mean to ask about “attribute” and “variable”, right? Those are sometimes mistaken for “discrete” and “continuous”, but they are not the same.
May 6, 2002 at 2:26 pm #75188
Marc Richardson (Participant, @Marc-Richardson)
Discrete:
Anything that cannot be measured on an infinitely divisible scale.
Attribute: go/no go, accept/reject, male/female… (only 2 categories)
Categorical: colors, nonconformance categories in a Pareto chart… (more than 2 categories, order does not count)
Ordinal: grades, survey rankings… (more than 2 categories, order counts)
Continuous:
Anything that can be measured on an infinitely divisible scale, regardless of the limitations of your measuring device: temperature, time, mass, length…
Marc Richardson,
Sr. Quality Assurance Engineer
May 6, 2002 at 2:35 pm #75191
Let’s start at ground zero:
There are two types of data – Attribute Data and Variables Data.
Attribute data is the lowest level. It is purely binary: good/bad.
We then convert the attribute data into a form of variables data called discrete data. We do this by counting the number of good and the number of bad units. We can embellish this slightly by referring to count data or percentage data. Attribute data converted into this more useful form is called discrete data.
Variables data consists of two types: discrete and continuous. I described discrete above. Continuous data is the type you collect with measuring instruments. It is called continuous because the readings can fall anywhere along a continuous numerical scale (decimal values).
May 6, 2002 at 2:59 pm #75197
Gabriel (Participant, @Gabriel)
I don’t understand:
“Anything that can be measured on an infinitely divisible scale, regardless of the limitations of your measuring device”
If the limitation of your measuring device is that its scale is not infinitely divided (a limitation every instrument has), how can anything be measured on an infinitely divisible scale?
May 6, 2002 at 3:20 pm #75198
Marc Richardson (Participant, @Marc-Richardson)
The emphasis is on “can be”, as in “theoretically can be”. If you have 2 categories, go/no go, for example, you cannot create a third category in between them. It is logically impossible. If, on the other hand, you want to measure the center distance between two holes, you can start with a ruler divided into 32nds of an inch. If this doesn’t provide enough resolution, you can use a pair of calipers with a resolution of 0.001 inches. If this isn’t sufficient, you can measure it with a CMM with a resolution of 0.0001 inches. It is at least theoretically possible that the scale can be divided yet again to 0.00001, 0.000001 and so on, infinitely.
Marc Richardson,
Sr. Quality Assurance Engineer
May 6, 2002 at 4:05 pm #75200
Gabriel (Participant, @Gabriel)
Thanks! Now I understand what you meant. Anyway, it seems to me that, with those definitions, the boundaries between some kinds of variables are somewhat “grey”.
For example: if you check a diameter with a plug gauge, is it attribute data? The diameter itself is a continuous variable (it matches your definition), but the measurement (the data) is not. A go/no-go gage (with MAX and MIN) is a typical case of what is usually called “attribute”, but you have three categories, not two (small, OK, and big). Sometimes two go/no-go gages are (or were) used for the same characteristic: one (stricter) to control the process, and the other (at the spec limits) to control the product. Then you have 5 categories: too small (reject the product and correct the process), small (accept the product but correct the process), OK (go on), big, and too big. Is it attribute or variable? I could use enough go/no-go gages to get better discrimination than a caliper, for example. I’m not really sure how to classify it. In a previous post I said something about “attribute” vs “variable” (the terms are used in SPC, for example: attribute chart, variable chart), but in fact an “attribute” is also a variable with few possible values.
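The two-gage scheme described above can be sketched in Python as a simple binning of a continuous diameter into five categories. The limit values and the `classify` function are hypothetical, chosen only to illustrate the idea; they are not from any real gage specification.

```python
def classify(diameter, control_limits, spec_limits):
    """Return the category that two nested go/no-go gages would report.

    control_limits: (min, max) of the stricter gage used to control the process.
    spec_limits:    (min, max) of the gage set at the specification limits.
    """
    control_min, control_max = control_limits
    spec_min, spec_max = spec_limits
    if diameter < spec_min:
        return "too small"   # reject the product and correct the process
    if diameter < control_min:
        return "small"       # accept the product but correct the process
    if diameter <= control_max:
        return "OK"          # go on
    if diameter <= spec_max:
        return "big"
    return "too big"

# Hypothetical limits in mm, for illustration only.
CONTROL = (9.98, 10.02)
SPEC = (9.95, 10.05)

for d in (9.93, 9.96, 10.00, 10.03, 10.07):
    print(d, "->", classify(d, CONTROL, SPEC))
```

Adding more gage pairs just adds more bins, which is why a large enough set of gages starts to behave like a coarse measuring scale.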
I took the following from the electronic textbook of StatSoft:
Measurement scales. Variables differ in “how well” they can be measured, i.e., in how much measurable information their measurement scale can provide. There is obviously some measurement error involved in every measurement, which determines the “amount of information” that we can obtain. Another factor that determines the amount of information that can be provided by a variable is its “type of measurement scale.” Specifically variables are classified as (a) nominal, (b) ordinal, (c) interval or (d) ratio.
a. Nominal variables allow for only qualitative classification. That is, they can be measured only in terms of whether the individual items belong to some distinctively different categories, but we cannot quantify or even rank order those categories. For example, all we can say is that 2 individuals are different in terms of variable A (e.g., they are of different race), but we cannot say which one “has more” of the quality represented by the variable. Typical examples of nominal variables are gender, race, color, city, etc.
b. Ordinal variables allow us to rank order the items we measure in terms of which has less and which has more of the quality represented by the variable, but still they do not allow us to say “how much more.” A typical example of an ordinal variable is the socioeconomic status of families. For example, we know that upper-middle is higher than middle but we cannot say that it is, for example, 18% higher. Also this very distinction between nominal, ordinal, and interval scales itself represents a good example of an ordinal variable. For example, we can say that nominal measurement provides less information than ordinal measurement, but we cannot say “how much less” or how this difference compares to the difference between ordinal and interval scales.
c. Interval variables allow us not only to rank order the items that are measured, but also to quantify and compare the sizes of differences between them. For example, temperature, as measured in degrees Fahrenheit or Celsius, constitutes an interval scale. We can say that a temperature of 40 degrees is higher than a temperature of 30 degrees, and that an increase from 20 to 40 degrees is twice as much as an increase from 30 to 40 degrees.
d. Ratio variables are very similar to interval variables; in addition to all the properties of interval variables, they feature an identifiable absolute zero point, thus they allow for statements such as x is two times more than y. Typical examples of ratio scales are measures of time or space. For example, as the Kelvin temperature scale is a ratio scale, not only can we say that a temperature of 200 degrees is higher than one of 100 degrees, we can correctly state that it is twice as high. Interval scales do not have the ratio property. Most statistical data analysis procedures do not distinguish between the interval and ratio properties of the measurement scales.
May 6, 2002 at 5:39 pm #75208
Marc Richardson (Participant, @Marc-Richardson)
“For example: if you check a diameter with a plug gauge, is it attribute? The diameter itself is a continuous variable (it matches your definition), but the measurement (data) is not.”
I would say it is attribute data; the focus here is on the measurement, not on what is being measured.
“A go/no-go gage (with MAX and MIN) is a typical case of what is usually called ‘attribute’, but you have three categories, not two (small, OK, and big).”
I would call this discrete, categorical data.
“Sometimes two go/no-go gages are (or were) used for the same characteristic. One (stricter) was used to control the process, and the other (at the spec limits) to control the product. Then you had 5 categories: too small (reject the product and correct the process), small (accept the product but correct the process), OK (go on), big, and too big. Is it attribute or variable?”
I would still call it discrete, categorical data.
“I could use enough go/no-go gages to get better discrimination than a caliper, for example.”
And now you are into the grey area you mentioned. At some point, if you had a set of pin gages in, say, 0.0001 increments, you would have, for all practical intents and purposes, continuous data. But I think at this point we are splitting hairs. The important points are to make sure that your measurement system has enough resolution; that it is calibrated; that its bias is not excessive; that it is stable over time and accurate throughout its measurement range; and that your measurements are repeatable and reproducible.
Marc Richardson
Sr. Quality Assurance Engineer
May 7, 2002 at 7:58 pm #75247
Team,
Thank you for all the terrific answers.
1) Could someone also tell me why the sampling methodologies are different for continuous and discrete data?
2) Can someone forward a calculator for each?
3) Why do we prefer to monitor processes by continuous data over discrete data?
Again, thanks to everyone for the great support you have extended in improving my understanding of the subject.
May 8, 2002 at 12:11 pm #75259
Gabriel (Participant, @Gabriel)
For me, as I said, the data is always discrete, because your measuring system cannot show infinitely fine divisions. For example, it is an axiom that no two parts are exactly alike, yet in an X-R chart a range of “zero” is possible, simply because the measurement is not good enough to detect the difference between the parts. But if the measuring system has good discrimination (compared with the variation of the population you want to assess), then “continuous” is a good approximation. So let’s take the typical “good/bad” case as “discrete” and a measurement (like one taken with a caliper) as “continuous”.
“1) Could someone also tell me why the sampling methodologies are different for continuous and discrete data? 3) Why do we prefer to monitor processes by continuous data over discrete data?”
Simple: each “continuous” data point carries a lot more information. Compare saying “good” with saying “0.02 mm from the target”, or “bad” with “0.02 mm above the upper specification limit”. With continuous data you can take a sample (let’s say 200 parts), calculate the mean and the standard deviation, and say “the average is 4.5 sigma from one spec limit and 7.5 sigma from the other, and the distribution has a normal shape”. From this you can estimate that your process produces about 3.4 nonconforming parts per million, even though you didn’t find a single nonconforming part in the sample. Due to statistical sampling error, the real population (process) may have, let’s say, between 0.5 and 20 PPM. Can you imagine how many parts you would have to inspect to find such a fraction of nonconforming parts, and reach the same conclusion, with attribute data? Not 200, nor 20,000. Probably millions!
If you want to monitor a process (let’s take SPC), attribute data can only tell you whether the fraction nonconforming in the sample is “normal” relative to the usual process performance. This means that you are willing to accept nonconforming product as a natural part of both the process and the sample. And the lower the fraction of nonconforming product, the larger the sample you need. With “continuous” data you can monitor how the process is performing (i.e., whether there was a shift, whether there are trends, whether variation has increased, etc.) with a small sample, and without waiting to manufacture a single nonconforming part to find those “signals”.
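The 4.5 sigma / 7.5 sigma figures above can be checked with a short Python sketch using only the standard library. It assumes a perfectly normal process, as the post does; the function names and the 95% confidence level are assumptions added for illustration, and the second function only answers “how many parts until at least one nonconforming part is likely to show up”, which is a lower bound on what an attribute study would need.

```python
import math
from statistics import NormalDist

def ppm_outside(z_lower, z_upper):
    """Expected nonconforming parts per million for a normal process whose
    mean is z_lower sigma from the lower limit and z_upper sigma from the upper."""
    std_normal = NormalDist()
    tail = std_normal.cdf(-z_lower) + std_normal.cdf(-z_upper)
    return tail * 1e6

def parts_to_see_one_defect(p, confidence=0.95):
    """Smallest n with P(at least one nonconforming part in n) >= confidence."""
    # P(no defect in n parts) = (1 - p)**n, so solve (1 - p)**n <= 1 - confidence.
    return math.ceil(math.log(1 - confidence) / math.log1p(-p))

ppm = ppm_outside(4.5, 7.5)
n = parts_to_see_one_defect(ppm / 1e6)

print(f"{ppm:.1f} PPM nonconforming")
print(f"{n:,} parts inspected to expect at least one defect (95% confidence)")
```

Under these assumptions the estimate is about 3.4 PPM, and an attribute check would need on the order of 0.9 million parts just to be reasonably sure of seeing a single nonconforming part, which is consistent with the “not 200, nor 20,000” argument.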
“2) Can someone forward a calculator for each?”
I can’t, and I don’t think anybody can, because it depends on what you want to do with the sample (SPC, incoming inspection, process study, DOE?).
Hope this helps.
May 9, 2002 at 8:18 pm #75347
Gabriel,
Thanks for your valuable input!
November 3, 2006 at 5:03 am #146458
What is the definition of discrete and continuous data?
August 28, 2009 at 9:33 pm #185069
All,
Which always has the largest sample size: “continuous” or “discrete” data?
ed
August 29, 2009 at 7:43 am #185071
aiman farid (Participant, @aiman-farid)
The largest sample size is with “discrete” data.
The forum ‘General’ is closed to new topics and replies.