Do I Treat My Data as Discrete or Continuous?
This topic contains 7 replies, has 7 voices, and was last updated by Robert Butler 3 months, 1 week ago.
Hello!
I’m measuring the opacity of plastic PET preforms with a sensor that reads how much light is transmitted through the preform. The readings are in %, where 99% means totally transparent and 0% means totally opaque.
So my question is: can I treat this collected data as continuous data, since each preform will have a different % reading? For example:
Preform 1: 60%
Preform 2: 62%
Preform 3: 68%
Etc.
What do you think? Can your measurements have fractional values or just fixed values, considering that % is fractional?
@wess702 – like many situations, it depends. The fundamental issue is how many levels of data will you have? The more you have, the more “continuous” it is. You can have a meter that reads continuous data, but if the actual readings that you observe only have 2 levels, do you have continuous or discrete data?
Generally at least 5 levels are required, and as I say, the more, the better.
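As a rough illustration of the "how many levels" check, here is a minimal Python sketch. The readings are made-up values, `count_levels` is a hypothetical helper name, and the cutoff of 5 is only the rule of thumb mentioned above, not a formal standard.

```python
from collections import Counter

# Hypothetical transmission readings in percent (made-up values).
readings = [60, 62, 68, 62, 65, 60, 71, 68, 66, 63]

def count_levels(values):
    """Return the number of distinct values observed in the data."""
    return len(Counter(values))

levels = count_levels(readings)
# Rule of thumb from this thread: at least 5 levels, and more is better.
looks_continuous = levels >= 5
print(f"{levels} distinct levels; treat as continuous? {looks_continuous}")
```

If the count comes back very low, the first thing to look at is probably the resolution of the sensor rather than the analysis method.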
I’ve seen folks in the long-ago past imply that % is a discrete measurement. That’s silly in my mind. % conversion, % yield, % transmission, etc. can often be continuous measurements. Just because the precision of measurement may need to be improved doesn’t make something discrete.
Use means, medians, and s.d. to get an understanding of how your process performs.
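A minimal sketch of those summary statistics, using only Python's standard `statistics` module. The readings are made-up values for illustration, and `stdev` here is the sample standard deviation (n - 1 in the denominator).

```python
import statistics

# Hypothetical transmission readings in percent (made-up values).
readings = [60, 62, 68, 65, 71, 66, 63]

mean = statistics.mean(readings)      # arithmetic average
median = statistics.median(readings)  # middle value of the sorted readings
sd = statistics.stdev(readings)       # sample standard deviation (n - 1)

print(f"mean={mean:.2f}%  median={median:.2f}%  s.d.={sd:.2f}%")
```

A large gap between the mean and the median would hint at skew or outliers worth looking at before any further analysis.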
It is correct to treat your % measurements as continuous data. A percentage is something that “can take any value”: 0%, 33.33%, 99.99287477%, or 100%. That makes it continuous.
Graph it, for example with a dot plot. If you see discrete bins, that can create challenges for your analysis. I have no source to back this up, but it appears that with fewer than 9-12 discrete bins, statistical tools that assume continuous data tend to behave badly, especially when you attempt to validate the assumptions (like the behavior of the residuals). Once you get above 15 discrete buckets (again, based on observation, not industry sources), many continuous data analysis tools seem not to notice. I tell my students to look for a spread in the dot plot like peanut butter on toast. If things are smeared fairly evenly (visually) and you don’t see discrete bins, you can attempt to analyze the data as continuous. Just verify you meet the assumptions of the tool you are using.
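For anyone without a plotting package handy, a crude text dot plot is enough to see whether the data falls into a few discrete bins. This is only a sketch: the readings are made-up values and `dot_plot` is a hypothetical helper, not part of any library.

```python
from collections import Counter

# Hypothetical transmission readings in percent (made-up values).
readings = [60, 62, 68, 62, 65, 60, 71, 68, 66, 63]

def dot_plot(values):
    """Build one text line per distinct value, with one dot per observation."""
    counts = Counter(values)
    return [f"{value:>6} | {'.' * counts[value]}" for value in sorted(counts)]

for line in dot_plot(readings):
    print(line)
```

The number of lines printed is the number of distinct bins, which can be compared against the informal 9-12 and 15 bucket thresholds described above.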
Thanks Bhavani
@nai102 When you say “then statistical tools that assume continuous analysis tend to behave badly especially when you attempt to validate the assumptions (like the behavior of the residuals)” I’m assuming you are referring to the residual patterns you get when the Y variable takes only a limited number of values, for example a 1-5 Likert scale rating.
If this is what you mean then there is nothing odd about the residual behavior – what you will see is a series of slanted bands that cross the 0 reference line. The number of bands will equal the number of scale values. This form of residual plot is well known. Searle wrote a short piece about this pattern back in 1988 in The American Statistician (I think it was the August issue).
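The banding is easy to reproduce. The sketch below uses simulated data (the predictor, the 1-5 rating, and the noise level are all assumptions made up for illustration) and a plain least-squares fit written out by hand. For a fixed rating y, the residual is (y - intercept) - slope * x, so every observation with that rating falls on the same straight line.

```python
import random

random.seed(0)  # fixed seed so the made-up data is reproducible

# Made-up data: x is a continuous predictor, y is a 1-5 rating loosely tied to x.
xs = [random.uniform(0, 10) for _ in range(200)]
ys = [min(5, max(1, round(1 + 0.4 * x + random.gauss(0, 1)))) for x in xs]

# Ordinary least-squares fit of y = intercept + slope * x.
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Group the (x, residual) points by rating: each group is one band.
bands = {}
for x, y, r in zip(xs, ys, residuals):
    bands.setdefault(y, []).append((x, r))

print(f"distinct ratings: {len(set(ys))}, residual bands: {len(bands)}")
print(f"each band has slope {-slope:.3f} when residuals are plotted against x")
```

Plotted against x, the bands have slope equal to minus the fitted slope. Plotted against the fitted values instead, each band has slope -1.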
There are two schools of thought with respect to treating this kind of data as continuous – one says you can’t and the other says you can and, as far as I know, both are equally valid. If you do treat this data as continuous you will have to contend with residual plots like the ones previously mentioned. There is also a paper (I can’t lay my hands on it right now) which recommends a P < .01 rather than a P < .05 as a cut point for significance. I don’t recall the specifics of the proof for why this should be so but it is the practice I follow.