p values
- This topic has 15 replies, 15 voices, and was last updated 18 years, 6 months ago by MM.
October 31, 2001 at 9:55 pm #28120
I’m having a problem understanding the concept of the ‘P-Value’ as it relates to correlation. If my understanding is correct, when evaluating a correlation with a low ‘P-Value’, we say it’s significant. But what does this really mean? What does the value represent? Thanks
October 31, 2001 at 10:26 pm #69636
I’m guessing that you’re discussing the p-value related to the correlation coefficient. The p-value is the probability, assuming the null hypothesis is true, of getting a result at least as extreme as the one observed. If the p-value is less than some significance level, alpha (typically practitioners use an alpha of 0.05), then we say that the result is statistically significant (at the 5% level) and we reject the null hypothesis, accepting a 5% risk of doing so in error.
For the test I think you’re alluding to, it would indicate that we would reject the null hypothesis that rho (the true correlation coefficient) is equal to zero, hence there may be some evidence to suggest that a linear relation is present. Don’t ignore a scatter diagram though, of course!
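For instance, here is a quick sketch of that test in Python (an editor's illustration, not from the thread; the data are made up). `scipy.stats.pearsonr` reports exactly this pair: the sample correlation r and the two-sided p-value for the null hypothesis rho = 0.

```python
from scipy.stats import pearsonr

# Hypothetical paired measurements with a strong linear relationship
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1, 14.2, 15.9, 18.1, 20.0]

r, p = pearsonr(x, y)
print(r, p)  # r is close to 1 and p is far below 0.05, so we reject rho = 0
```

As KMB says, the scatter plot should still be inspected; the test alone only addresses linear association.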
Hope this helps.

November 1, 2001 at 1:08 pm #69640
Dave Strouse
KMB is exactly right, but maybe a further discussion will help.
Imagine that there is a universe of points from the process you are studying. You take a sample of those to see if you can prove or disprove correlation.
The truth you assume as a starting point (the null hypothesis) is that there is no correlation.
Now you develop some test, a way to mathematically relate the sample to a statistic that estimates rho, the true correlation. You then compare it to some reference distribution.
The p-value is the probability (p) that you selected a sample showing some correlation, i.e. a sample suggesting rho is not zero, when in fact rho is zero, the truth you assumed.
In other words, it is the probability that your sample indicates the state of nature in the universe is different from the truth you assumed, when your assumption was actually the correct one. In this case it is the probability of seeing a sample correlation as far from zero as yours when the true correlation really is zero.
Usually, if we have no more than a one in twenty (0.05) chance of making the wrong decision, we are satisfied that there is a difference, i.e. statistical significance, and we reject the null hypothesis that there is no difference. You can set this level based on your need to be right. In drug testing work, for instance, an alpha of 0.01 is often used, since the consequences of being wrong are much more severe than being wrong about a knob for a radio.
Hope that helps.
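Dave's description can be demonstrated with a small simulation (an editor's sketch; the sample size, seed, and trial count are arbitrary choices): draw many samples from two genuinely uncorrelated variables and count how often the test wrongly declares a correlation at alpha = 0.05.

```python
import numpy as np

rng = np.random.default_rng(1)
n, trials = 30, 2000
t_crit = 2.048  # two-sided critical t for alpha = 0.05 with n-2 = 28 df (from tables)

false_alarms = 0
for _ in range(trials):
    x = rng.normal(size=n)
    y = rng.normal(size=n)  # generated independently of x, so the true rho is 0
    r = np.corrcoef(x, y)[0, 1]
    t0 = r * np.sqrt(n - 2) / np.sqrt(1 - r**2)
    if abs(t0) > t_crit:    # equivalent to p < 0.05
        false_alarms += 1

print(false_alarms / trials)  # close to 0.05, as Dave describes
```

About one sample in twenty "shows" a correlation that is not there, which is exactly the wrong-decision rate the 0.05 level accepts.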
November 1, 2001 at 1:53 pm #69642
What KMB and Dave are saying is . . .
If the p-value for a correlation coefficient test is less than 0.05, it indicates that the correlation coefficient IS significantly different from zero (either positive or negative) at the alpha = 0.05 level.
This means that there is some significant amount of linear relationship between your two variables of interest.
This test uses a test statistic t0=[r*SQRT(n-2)]/SQRT(1-r^2). It has been proven that IF the true correlation coefficient is equal to zero, then t0 will follow the t-distribution with n-2 degrees of freedom.
Extreme values of this t0 statistic are unlikely to occur if t0 really follows a t-distribution with n-2 df, so they indicate that the premise is false, i.e. that the true correlation coefficient is NOT equal to zero.
Extreme values of t0 are characterized by small p-values. Thus small p-values indicate that the true correlation coefficient is NOT equal to zero.
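Ken's t0 formula can be checked directly (an editor's sketch with invented data): compute t0 by hand, take the two-sided tail area from the t-distribution with n-2 degrees of freedom, and compare with the p-value `scipy.stats.pearsonr` reports.

```python
import math
from scipy.stats import pearsonr, t

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [1.2, 1.9, 3.4, 3.6, 5.5, 5.9, 7.2, 7.8]
n = len(x)

r, p_scipy = pearsonr(x, y)

# t0 = [r * SQRT(n-2)] / SQRT(1 - r^2), as in the post above
t0 = r * math.sqrt(n - 2) / math.sqrt(1 - r**2)
p_manual = 2 * t.sf(abs(t0), n - 2)  # two-sided tail area, n-2 df

print(t0, p_manual, p_scipy)  # p_manual agrees with p_scipy
```

A large |t0| lands far in the tails, giving the small p-value that Ken describes.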
Ken K.

November 1, 2001 at 5:06 pm #69647
Vinny Tuccillo
What KMB and Dave and Ken are saying (just kidding):
In very, very basic terms, the p-value, taken as a percent, is the chance that a value like the one you got (in this case the correlation coefficient) could arise from random chance alone, with no real relationship present. We usually want this to be a small number like 5% (p = .05). If you get a correlation coefficient of .9, you want to know that the .9 is real and not just due to random variation; that’s where the p-value can help you.
Hope I didn’t confuse things more.

November 1, 2001 at 5:29 pm #69650
Jairo Brandao
The p-value is a way to be confident in your decision when you are working from a sample. The p-value related to the correlation coefficient represents the probability of seeing such a correlation by chance when the null hypothesis is actually true (a concept used a lot in advanced statistical analysis).
For example, a p-value less than 5% (0.05) means that the result is statistically significant. Thanks

November 1, 2001 at 5:51 pm #69655
I agree with the other responses and therefore please excuse the redundancies in my comments below…
Hypothesis tests are typically used in the Analyze Phase to identify the critical x’s (inputs) of a process. Generally, these critical x’s are assumed to exist when we reject the null hypothesis. The confidence level used in these hypothesis tests is often set at 95%, i.e. a significance level (alpha) of 0.05. When a hypothesis test is performed, the p-value represents the probability of getting a sample as extreme (or worse) as the one observed, assuming the null hypothesis is true. We therefore reject the null hypothesis when the p-value is less than the significance level we established.
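As a concrete illustration (an editor's sketch; the measurements are invented): testing whether a candidate critical x shifts the process output, using the usual p < alpha decision rule.

```python
from scipy.stats import ttest_ind

# Hypothetical output measurements at two settings of a candidate critical x
setting_a = [10.1, 10.3, 9.9, 10.2, 10.0]
setting_b = [11.0, 11.2, 10.8, 11.1, 10.9]

alpha = 0.05
t_stat, p = ttest_ind(setting_a, setting_b)
decision = "reject H0" if p < alpha else "fail to reject H0"
print(p, decision)  # the means clearly differ, so p is small and H0 is rejected
```

Rejecting H0 here is the evidence that this x really is a critical input.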
I hope this helps.

November 1, 2001 at 7:05 pm #69663
Don Nanneman
The basic notion of the p-value is this: what is the probability (p) that the association seen in the data (in this case the correlation) would have been seen by chance, i.e. if in fact there is no relationship between the variables?
Or, more accurately: what is the probability of again getting this value for the extent of correlation if in fact there is no correlation? If the p-value is 0.05, the common understanding is that the observed relationship would be expected in 5% of measurements if there were no correlation. You would expect 1 out of 20 samples to show a correlation when in fact the variables were not correlated. Hope this helps. Don

November 1, 2001 at 7:39 pm #69667
I always struggled with p-values until I heard this “helper”:
First, the null hypothesis is always no difference (i.e. for normality, the sample is no different from the normal curve; for a 2-sample t, the two samples are no different; etc.)
The alternative is always that there is a difference.
Then remember this saying: if the p is high (>.05) the null will fly – if the p is low (<.05) the null must GO!
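The rhyme reduces to a one-liner (an editor's sketch; the function name is made up):

```python
def null_flies(p, alpha=0.05):
    """If the p is high, the null will fly; if the p is low, the null must go."""
    return p > alpha  # True: fail to reject H0; False: reject H0

print(null_flies(0.30))  # True  -> the null will fly
print(null_flies(0.01))  # False -> the null must GO
```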
November 1, 2001 at 10:38 pm #69672
Mark Grubert
In statistics you can virtually never get data from an entire population, so you have to take samples. A small p-value is an indication that there is a high chance the factor or data is significant, although there are never any blacks and whites. How sensitively (or how thoroughly) you want to analyze the data comes down to where you set your significance cutoff. The higher you set that cutoff, the more readily you will flag effects as significant, which can matter during a DOE when you are trying to measure interaction effects: an effect that is not significant against a cutoff of .05 may be significant against a cutoff of .20. Basically, the cutoff is an arbitrary, but well thought out, level for detecting significance in your data.
P-values in normality tests are a little different. If the p-value is above .05, you have no real evidence that your data are non-normal, so chances are good you can treat them as normal. If it is less than .05, chances are good they are not normal and you should look for other options. Again, this is just an inference made from the sample of the population you took. But you’d better believe it’s usually dead on.

November 1, 2001 at 10:55 pm #69673
Just for completeness, there are other null hypotheses that may be tested. For example, we may investigate the null hypothesis that the data came from a Normal distribution (via the Anderson-Darling test, Ryan-Joiner test, etc.), or that the process mean is equal to some specified value, e.g. H0: mu = 265g, etc.
Your rhyme is still valid, though one should always consider the sample size selected – would we be able to detect a difference, say, of a certain size if it were truly present? That’s why we have power and sample size computational functionality as well, of course.

November 2, 2001 at 12:47 am #69674
Tito Andrei Fontanos
The p-value is the smallest alpha level at which you would reject the null hypothesis. Looking at your normal curve, if you see that your p-value is more extreme than your alpha (in other words, the tail area it marks off is smaller than alpha), you reject your null hypothesis and accept your alternative. That makes sense, since you already set your alpha to be small, say .01; a p-value even smaller than that is strong grounds for declaring significance.
But you must be careful when testing for rho greater than 0. Remember that it is easy to be misled when you have only a few points sampled. Imagine you have just 3 points and, judging from the way they are placed on a scatter plot, they lie in a relatively straight line. Would you accept this as a linear relationship? What if you had a few more points? A good approach is to use a scatter plot in conjunction with your formal test (i.e. a t-test).

November 2, 2001 at 1:57 am #69675
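The three-point caution above can be checked numerically (an editor's sketch, standard library only; the points are made up). With n = 3 there is only 1 degree of freedom, where the t-distribution is the Cauchy distribution, so the two-sided p-value has a closed form:

```python
import math

x = [1.0, 2.0, 3.0]
y = [1.0, 2.1, 2.9]  # three nearly collinear points
n = len(x)

# Pearson correlation computed by hand
mx, my = sum(x) / n, sum(y) / n
sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
sxx = sum((a - mx) ** 2 for a in x)
syy = sum((b - my) ** 2 for b in y)
r = sxy / math.sqrt(sxx * syy)

t0 = r * math.sqrt(n - 2) / math.sqrt(1 - r**2)
p = 1 - 2 * math.atan(abs(t0)) / math.pi  # Cauchy (1 df) two-sided tail area

print(r, p)  # r is about 0.996, yet p is about 0.058: not significant at 0.05
```

Despite a sample correlation near 1, the test cannot rule out chance with only three points, which is exactly the reason to pair the formal test with a scatter plot and an adequate sample size.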
J. Angulo
The p-value is the observed significance level of the test in question (regression, hypothesis test, F test, difference of means, etc.). For example, if you decide to set your significance level at 0.05, which means you are willing to allow a 5% chance of erring in your final analysis (i.e. finding a significant association when one does not really exist), and the p-value is 0.0035, then your model “passes the test”. In fact, you could even state with (1 - 0.0035) = 99.65% confidence that your result is statistically significant.
February 15, 2002 at 4:20 pm #72212
JP,
As previously mentioned, when your p-value is low the null must go, but the key question is “what is the null?” If we are talking linear regression, the null hypothesis is: slope = 0 (Y does not depend on X); changing the input (X) will not change the output (Y). If p < 0.05, reject the aforementioned statement and conclude that as the input changes, so does my output. Note, there is a 1 in 20 chance (p = 0.05) that you have made an error.

February 15, 2002 at 10:10 pm #72226
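The regression reading above is what `scipy.stats.linregress` reports (an editor's sketch; the data are invented): its p-value tests exactly the null hypothesis slope = 0.

```python
from scipy.stats import linregress

# Hypothetical input/output data with a clear linear trend
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.3, 4.1, 5.8, 8.2, 9.9, 12.1, 13.8, 16.2]

fit = linregress(x, y)
alpha = 0.05
print(fit.slope, fit.pvalue)  # p-value for H0: slope = 0
print("reject H0: X matters" if fit.pvalue < alpha else "fail to reject H0")
```

A small p-value here is the evidence that changing X really does change Y.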
P Diddy
P value is the amount charged by P-diddy for his concerts…!!
P-diddy and Puff daddy concerts…… get yourself a great deal at .05 the value….. which is what we call the P-value..!!

December 19, 2003 at 3:35 pm #93708
I teach my middle school-age son this way so that he can understand:
Smaller P ==> More Significant Difference ==> Believe Difference in the Results
Bigger P ==> Less Significant ==> Do Not Trust the Difference in the Results, because there is a big chance that random factors produced them.