How does software normally get the p-value?
November 27, 2001 at 10:36 am #28279
In almost every software package, once a histogram or normal probability plot is drawn, a p-value is reported. How is this value calculated? No nominal mean or other reference value is supplied; only a single column of data is keyed in.
Thank you.
November 27, 2001 at 2:13 pm #70201
You did not specify the test that the p-value is associated with. It sort of sounds like you are talking about a test of normality.
There are several goodness-of-fit tests for normality. Probably the most common are the Shapiro-Wilk test (MINITAB offers the closely related Ryan-Joiner test), the Anderson-Darling A^2 test, the Kolmogorov-Smirnov test, and the chi-square test.
The formulas are more complex than can be presented in this format, but they are all covered in Ralph B. D’Agostino & Michael A. Stephens’s book “Goodness-of-Fit Techniques”.
The Shapiro-Wilk test uses the correlation observed in the normal probability plot. The Anderson-Darling and Kolmogorov-Smirnov tests both compare the empirical CDF to that of the best-fit normal curve. The Anderson-Darling test pays more attention to the tails than does the K-S test.
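For anyone who wants to see these ideas in code, here is a rough Python sketch using only the standard library. It computes the probability-plot correlation, the K-S D statistic, and the Anderson-Darling A^2 against a normal curve fitted from the sample itself. The function names are my own, the Blom plotting positions are an assumption (packages differ on this), and the simulated p-value is a stand-in for the tabulated approximations that packages typically use, not a description of what MINITAB does internally. The point it illustrates: when the mean and standard deviation are estimated from the data, the null distribution of these statistics does not depend on the true mean or standard deviation, so a single column of data is enough.

```python
import math
import random
from statistics import NormalDist, mean, stdev

STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def fitted_cdf_values(data):
    """Evaluate the best-fit normal CDF at each sorted data point."""
    xs = sorted(data)
    m, s = mean(xs), stdev(xs)  # parameters estimated from the sample
    return [STD_NORMAL.cdf((x - m) / s) for x in xs]


def ks_statistic(data):
    """D: largest vertical gap between empirical CDF and fitted normal CDF."""
    F = fitted_cdf_values(data)
    n = len(F)
    return max(max((i + 1) / n - F[i], F[i] - i / n) for i in range(n))


def anderson_darling_statistic(data):
    """A^2: the log terms weight discrepancies in the tails more heavily."""
    eps = 1e-12  # clamp so that log(0) cannot occur
    F = [min(max(f, eps), 1 - eps) for f in fitted_cdf_values(data)]
    n = len(F)
    total = sum(
        (2 * i + 1) * (math.log(F[i]) + math.log(1 - F[n - 1 - i]))
        for i in range(n)
    )
    return -n - total / n


def probability_plot_correlation(data):
    """Correlation between sorted data and normal scores (Blom positions,
    an assumed choice)."""
    xs = sorted(data)
    n = len(xs)
    q = [STD_NORMAL.inv_cdf((i + 1 - 0.375) / (n + 0.25)) for i in range(n)]
    mx, mq = mean(xs), mean(q)
    sxy = sum((x - mx) * (z - mq) for x, z in zip(xs, q))
    sxx = sum((x - mx) ** 2 for x in xs)
    sqq = sum((z - mq) ** 2 for z in q)
    return sxy / math.sqrt(sxx * sqq)


def monte_carlo_p_value(data, statistic, n_sim=2000, seed=0):
    """Estimate P(statistic >= observed | data are normal) by simulation.

    Intended for statistics where LARGE values signal non-normality
    (D and A^2). Standard normal samples suffice because the statistic is
    unchanged by shifting or rescaling the data.
    """
    rng = random.Random(seed)
    n = len(data)
    observed = statistic(data)
    hits = sum(
        statistic([rng.gauss(0, 1) for _ in range(n)]) >= observed
        for _ in range(n_sim)
    )
    return (hits + 1) / (n_sim + 1)


if __name__ == "__main__":
    sample_rng = random.Random(1)
    column = [sample_rng.gauss(50, 5) for _ in range(30)]  # made-up data
    print("A^2 =", anderson_darling_statistic(column))
    print("D   =", ks_statistic(column))
    print("r   =", probability_plot_correlation(column))
    print("p(A^2) =", monte_carlo_p_value(column, anderson_darling_statistic))
```

The simulation is slow compared with a lookup formula, but it makes the logic visible: the p-value is just the fraction of truly normal samples of the same size whose statistic is at least as large as the one observed.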
D’Agostino & Stephens recommend the Shapiro-Wilk and Anderson-Darling A^2 tests as among the best to use. They indicate that Shapiro-Wilk type tests are probably the most powerful overall. MINITAB defaults to the Anderson-Darling test, but I don’t really know why.
D’Agostino & Stephens go on to say:
“For testing for normality, the Kolmogorov-Smirnov test is only a historical curiosity. It should never be used. It has poor power in comparison to the above procedures.”
and
“For testing for normality, when a complete sample is available the chi-square test should not be used. It does not have good power when compared to the above tests.”
November 28, 2001 at 2:39 am #70208
Dear Mr. Ken,
What I mean is: is the p-value for the Anderson-Darling test, the K-S test, or the histogram calculated with the same formula? The p-value is the probability of obtaining a result at least as extreme as the one observed if H0 is true. My understanding is that, to calculate a p-value when testing for normality, we need either the nominal mean or standard deviation of the population, or something else from which to get a z-value or t-value, before reading the corresponding p-value from tabulated z or t distribution data.
However, in the software, I simply run a normality test (either Anderson-Darling or K-S) on a column of data, and the normal probability plot is shown together with the A-squared or D value and the p-value. The A-squared for the Anderson-Darling test and the D value for the K-S test are calculated from their respective formulas, but how is the p-value for these tests obtained?
Thank you for your explanation.
