iSixSigma

p values

Viewing 16 posts - 1 through 16 (of 16 total)
  • #28120

    JP
    Participant

    I’m having a problem understanding the concept of the ‘P-Value’ as it relates to correlation. If my understanding is correct, when evaluating a correlation with a low ‘P-Value’, then we say it’s significant. But what does this really mean? What does the value represent? Thanks

    0
    #69636

    kmb
    Participant

    I’m guessing that you’re discussing the p-value related to the correlation coefficient.  Strictly speaking, the p-value is the probability, assuming the null hypothesis is true, of getting a result at least as extreme as the one you observed.  If the p-value is less than some significance level, alpha (practitioners typically use an alpha of 0.05), then we say that the result is statistically significant (at the 5% level), i.e. a test run this way would incorrectly reject a true null hypothesis no more than 5% of the time.
    For the test I think you’re alluding to, it would indicate that we would reject the null hypothesis that rho (the true correlation coeff) is equal to zero, hence there may be some evidence to suggest that a linear relation is present.  Don’t ignore a scatter diagram though, of course!!!
    Hope this helps.

    0
    #69640

    Dave Strouse
    Participant

    KMB is exactly right, but maybe a further discussion will help.
    Imagine that there is a universe of points from the process you are studying. You take a sample of those to see if you can prove or disprove correlation.
    The truth that you are assuming is that there is no or null correlation.
    Now you develop some test or way to mathematically relate the sample to some statistic, in this case r, the sample estimate of rho. You then compare it to some reference distribution.
    The probability (p) that you selected the sample in such a way that it shows at least this much correlation, i.e. that it makes rho look non-zero, when in fact rho is zero (the truth you assumed), is the p value.
    In other words, it is the probability that your sample indicates that the state of nature in the universe is different from the truth you assumed when your assumption was the correct one. In this case it is the probability that your sample indicates a correlation this strong or stronger when rho is in fact zero. It is not the probability that rho is zero given your sample.
    Usually if we have a one in twenty (0.05) chance or less of making the wrong decision, we are satisfied that there is a difference, i.e. statistical significance, and we reject the null hypothesis that there is no difference. You can set this level based on your need to be right.  In drug testing work, for instance, an alpha of 0.01 is often used since the consequences of being wrong are much more severe than being wrong about a knob for a radio.
    Hope that helps.  
     

    0
    #69642

    Ken K.
    Participant

    What KMB and Dave are saying is . . .
    If the p-value for a correlation coefficient test is less than 0.05, it indicates that the correlation coefficient IS significantly different from zero (either positive or negative) at the alpha = 0.05 level.
    This means that there is some significant amount of linear relationship between your two variables of interest.
    This test uses a test statistic t0=[r*SQRT(n-2)]/SQRT(1-r^2). It has been proven that IF the true correlation coefficient is equal to zero, then t0 will follow the t-distribution with n-2 degrees of freedom.
    Extreme values of this t0 statistic are indicative that t0 does not actually follow a t-distribution with n-2 df, thus it indicates that the true correlation coefficient is NOT equal to zero.
    Extreme values of t0 are characterized by small p-values. Thus small p-values indicate that the true correlation coefficient is NOT equal to zero.
    Ken K.
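    A minimal Python sketch of the test described above, using only the standard library. The function names are made up for illustration, and the p-value comes from numerically integrating the t density (Simpson's rule) rather than from a statistics package, so treat it as an approximation. It assumes |r| < 1 and n > 2.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(t * t / df))

def two_sided_p(t0, df, steps=2000):
    """Approximate P(|T| >= |t0|) using Simpson's rule on [0, |t0|]."""
    upper = abs(t0)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3  # area between -|t0| and +|t0|
    return max(0.0, 1.0 - central_area)

def correlation_test(r, n):
    """Return (t0, p) for H0: rho = 0, given sample correlation r from n pairs."""
    t0 = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
    return t0, two_sided_p(t0, n - 2)

# Illustrative numbers only: r = 0.70 observed on n = 12 pairs.
t0, p = correlation_test(0.70, 12)
print(f"t0 = {t0:.3f}, p = {p:.4f}")
```

    With these made-up inputs the p-value lands below 0.05, so the null hypothesis rho = 0 would be rejected at the 5% level.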

    0
    #69647

    Vinny Tuccillo
    Member

    What KMB and Dave and Ken are saying (just kidding):
    In very, very basic terms, the p-value, if you take it as a percent, is the chance of getting a value at least as extreme as the one you got (in this case the correlation coefficient) from random chance alone, when there is really no relationship.  We usually want this to be a small number, 5% or less (p=.05).  If you get a correlation coefficient of .9, you want to know that the .9 is real and not just due to random variation, and that’s where the p value can help you.
    Hope I didn’t confuse things more.

    0
    #69650

    Jairo Brandao
    Participant

    The p-value is a way to gain confidence in a decision when we are working from a sample. The p-value related to the correlation coefficient represents the probability, if the null hypothesis is true, of seeing a correlation at least as strong as the one observed (a concept used a lot in advanced statistical analysis).
    For example, a p-value of less than 5% (0.05) means that the result is statistically significant. Thanks

    0
    #69655

    Tab
    Member

    I agree with the other responses and therefore please excuse the redundancies in my comments below…

    Hypothesis tests are typically used in the Analyze Phase to identify the critical x’s (inputs) of a process.  Generally, these critical x’s are assumed to exist when we reject the null hypothesis.  The significance level (alpha) used in these hypothesis tests is often set at 0.05 (i.e., 95% confidence, so the p-value threshold is 0.05).  When a hypothesis test is performed, the p-value represents the probability of getting a sample at least as extreme as the one observed, assuming the null hypothesis is true.  We therefore reject the null hypothesis when the p-value is less than the significance level we established.
    I hope this helps.

    0
    #69663

    Don Nanneman
    Participant

    The basic notion of p value is this: what is the probability (p) that the association seen in the data (in this case the correlation) would have been seen by chance (i.e. if in fact there is no relationship between the variables)?
    Or, more accurately, what is the probability that you would get a correlation at least this strong if in fact there is no correlation?
    If the p-value is 0.05, the common understanding is that the observed relationship would be expected in 5% of samples if there was no correlation. You would expect 1 out of 20 samples to show a correlation when in fact the variables were not correlated.
    Hope this helps
    Don
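    The "1 out of 20" idea can be checked with a small simulation. This is only a sketch under stated assumptions: both variables are independent standard normals (so the true correlation is zero), n = 10 pairs per sample, and 2.306 is the tabled two-sided 5% critical t value for 8 degrees of freedom. The function names are hypothetical.

```python
import math
import random

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def false_alarm_rate(n=10, trials=20000, t_crit=2.306, seed=1):
    """Fraction of uncorrelated samples whose |r| passes the two-sided 5% test.

    t_crit is the tabled 97.5th percentile of t with n - 2 df
    (2.306 for 8 df); it must be changed if n is changed.
    """
    rng = random.Random(seed)
    df = n - 2
    r_crit = t_crit / math.sqrt(t_crit ** 2 + df)  # |r| at which p = 0.05
    hits = 0
    for _ in range(trials):
        xs = [rng.gauss(0, 1) for _ in range(n)]
        ys = [rng.gauss(0, 1) for _ in range(n)]  # independent of xs, so rho = 0
        if abs(pearson_r(xs, ys)) > r_crit:
            hits += 1
    return hits / trials

rate = false_alarm_rate()
print(f"share of uncorrelated samples flagged as significant: {rate:.3f}")
```

    The printed share should come out close to 0.05, which is the 1-in-20 false alarm rate.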

    0
    #69667

    Cindy
    Participant

    I always struggled with p-values until I heard this “helper”.
    First, the null hypothesis is always no difference (i.e. for normality, the sample is no different from the normal curve; for a 2-sample t, the two samples are no different; etc.).
    The alternate is always that there is a difference.
    Then remember this saying: if the p is high (>.05) the null will fly; if the p is low (<.05) the null must GO!
     

    0
    #69672

    Mark Grubert
    Participant

    In statistics, you can virtually never get data from an entire population so you have to take samples. A small P-value is just an indication that there is a good chance that the factor or effect is real, although nothing is ever black or white. How sensitively you want to analyze the data is set by the significance level you compare your P-values against. I’m not sure of your exact situation, but the higher you set that threshold, the more effects you will flag, which can be useful during a DOE when you are trying to screen for interaction effects. An effect with a P-value of .10 shows no significance against a threshold of .05, but it does against a threshold of .20. Basically, the threshold is an arbitrary, but well thought out, level for detecting significance in your data. The higher it is, the more you catch, at the cost of more false alarms.
    P-values in normality tests work a little differently, because there the null hypothesis is that the data are normal. If the P-value is above .05, the test has found no evidence against normality, which is not quite proof that your data are normal. If it is less than .05, chances are good the data are not normal and you should look for other options. Again, this is just an inference made from the sample of the population you took, but it is usually a good guide.

    0
    #69673

    kmb
    Participant

    Just for completeness, there are other null hypotheses that may be tested for.  For example, we may investigate the null hypothesis that data may have come from a Normal distribution (via Anderson-Darling, Ryan-Joiner test, etc.) or that the process mean is equal to some specified value, e.g. H0: mu = 265g, etc. 
    Your rhyme is still valid, though one should always consider the sample size selected – would we be able to detect a difference, say, of a certain size if it were truly present?  That’s why we have power and sample size computational functionality as well, of course.
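    The sample size point can be illustrated with a rough power simulation. This is a sketch under stated assumptions: bivariate normal data with a true correlation of 0.3, a two-sided 5% test, and approximate tabled critical t values (2.306 for 8 df, 1.984 for 98 df). The function names are hypothetical.

```python
import math
import random

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def detection_rate(rho, n, t_crit, trials=2000, seed=7):
    """Share of samples in which a true correlation rho is declared significant."""
    rng = random.Random(seed)
    r_crit = t_crit / math.sqrt(t_crit ** 2 + (n - 2))  # |r| at which p = 0.05
    hits = 0
    for _ in range(trials):
        xs = [rng.gauss(0, 1) for _ in range(n)]
        # Mixing x with independent noise gives a population correlation of rho.
        ys = [rho * x + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1) for x in xs]
        if abs(pearson_r(xs, ys)) > r_crit:
            hits += 1
    return hits / trials

small = detection_rate(0.3, n=10, t_crit=2.306)   # 8 df
large = detection_rate(0.3, n=100, t_crit=1.984)  # 98 df
print(f"n = 10:  detected in {small:.0%} of samples")
print(f"n = 100: detected in {large:.0%} of samples")
```

    With only 10 pairs the real correlation is missed most of the time, so a high p-value from a small sample says little about whether a difference exists.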

    0
    #69674

    Tito Andrei Fontanos
    Member

    The p-value is the smallest alpha level at which you would declare significance for the data you observed. Looking at your normal curve, if you see that your p-value is more extreme than your alpha value (in other words, it is smaller in area than your alpha value), you reject your null hypothesis in favor of your alternative. This makes sense: you already set your alpha to be small, say .01, so an even smaller p-value leads you to declare significance.
    But you must be careful when testing for rho greater than 0. Remember that it is easy to reach the wrong conclusion about significance if you have only a few points sampled. Imagine you have just 3 points and, judging from the way they are placed on a scatter plot, they lie in a relatively straight line. Would you accept this as showing a linear relationship? What if you had a few more points? A good approach is to use a scatter plot in conjunction with your formal test (i.e. t-test).
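    The three-point case can be worked out exactly. With n = 3 the correlation t-test has 1 degree of freedom, where the t distribution is the Cauchy distribution, and the two-sided p-value reduces to 1 - (2/pi) * asin(|r|). The sketch below assumes the usual bivariate normal setting for that test; the function name is made up for illustration.

```python
import math

def p_value_three_points(r):
    """Two-sided p-value for H0: rho = 0 when n = 3 (so df = 1).

    With 1 df the tail area of t has a closed form; substituting
    t0 = r / sqrt(1 - r^2) gives p = 1 - (2 / pi) * asin(|r|).
    """
    return 1 - (2 / math.pi) * math.asin(abs(r))

for r in (0.90, 0.95, 0.99, 0.999):
    print(f"r = {r}: p = {p_value_three_points(r):.3f}")
```

    Even r = 0.95 gives a p-value of roughly 0.20 here, so three points lying in a nearly straight line are not enough to declare a linear relationship at the 5% level.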

    0
    #69675

    J. Angulo
    Participant

    The p-value is simply the observed significance level delivered by the test in question (regression, hypothesis test, F test, difference of means, etc.). For example, if you decide to set your significance level at 0.05, which means that you are willing to allow for a 5% chance of erring in your final analysis (i.e. finding a significant association when one does not really exist), and the p-value is 0.0035, then your model “passes the test”. In fact, it would still pass at any significance level down to 0.0035 (0.35%).

    0
    #72212

    TCJ
    Member

    JP,
          As previously mentioned, when your P-value is low then the null must go, but the key question is “what is the null?”  If we are talking linear regression, the null hypothesis is: Slope = 0 (Y does not depend linearly on X).  Changing the input (X) will not change the output (Y).  If P < 0.05, reject the aforementioned statement and conclude that as the input changes so does the output.  Note that at this threshold (alpha = 0.05) there is a 1 in 20 chance of rejecting the null when it is actually true.
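    For simple linear regression with one X, the t statistic for the slope is algebraically the same as the correlation test statistic r*SQRT(n-2)/SQRT(1-r^2), so the two tests give the same p-value. A small sketch with made-up data (the function name and the numbers are for illustration only):

```python
import math

def slope_and_correlation_t(xs, ys):
    """Return (t for H0: slope = 0, t for H0: rho = 0) for paired data."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))

    # Least-squares fit and the standard error of the slope.
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    t_slope = slope / se_slope

    # Correlation coefficient and its test statistic.
    r = sxy / math.sqrt(sxx * syy)
    t_corr = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
    return t_slope, t_corr

xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 2.9, 3.2, 4.8, 5.1, 5.9, 7.4, 7.8]  # made-up data
t_slope, t_corr = slope_and_correlation_t(xs, ys)
print(f"t for slope = {t_slope:.4f}, t for correlation = {t_corr:.4f}")
```

    The two printed values agree up to rounding error, so rejecting Slope = 0 and rejecting rho = 0 are the same decision in this setting.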

    0
    #72226

    P Diddy
    Participant

    P value is the amount charged by P-diddy for his concerts…!!
    P-diddy and Puff daddy concerts……..get yourself a great deal at .05 the value…..which is what we call the P-value..!! 

    0
    #93708

    MM
    Participant

    I teach my middle school-age son this way so that he can understand:
    Smaller P ==> More Significant Difference ==> Believe the Difference in the Results
    Bigger P ==> Less Significant ==> Do Not Trust the Difference in the Results, because there is a big chance that random factors produced the results.

    0
