# Conflicting P-values

Six Sigma – iSixSigma Forums Old Forums General Conflicting P-values

Viewing 10 posts - 1 through 10 (of 10 total)
#46301

HengYu
Participant

Please help me understand how to interpret P-values. Why is it that in some tests I get a low P-value (i.e. less than 0.05) for the group of data as a whole, yet the P-values for all the individual factors in the group are higher than 0.05? Does this make sense?

#152780

accrington
Participant

Can you give an example?

#152787

The Force
Member

This can happen because the effect of individual factors may be different when there is already interaction between them (groups or combinations).

#152820

Heng Yu
Participant

Are you saying that items A1, A2, A3, etc. in Factor A may individually have high P-values, while Factor A as a group can have a low P-value because it becomes significant when combined with Factor B?

#152824

HengYu
Participant

Here’s the sample:

Response Information

| Variable | Value | Count |
|---|---|---:|
| Shortshipped? | Y | 25 (Event) |
| | N | 22834 |
| | Total | 22859 |

Logistic Regression Table

| Predictor | Coef | SE Coef | Z | P |
|---|---:|---:|---:|---:|
| Constant | -3.24636 | 0.284575 | -11.41 | 0.000 |
| DES Region | | | | |
| EUR | -3.72838 | 1306.88 | -0.00 | 0.998 |
| NA | 0.182607 | 0.309653 | 0.59 | 0.555 |
| SEA | 0.144157 | 0.313832 | 0.46 | 0.646 |
| SWP | 0.456364 | 0.304302 | 1.50 | 0.134 |
| WA | 0.143504 | 0.320717 | 0.45 | 0.655 |

Log-Likelihood = -189.329

Test that all slopes are zero: G = 12.226, DF = 5, P-Value = 0.032
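For anyone reading along, the Z and P columns in the regression table can be checked by hand. The sketch below assumes each P is a two-sided Wald test (Z = Coef / SE Coef, compared against a standard normal); that assumption is consistent with the printed figures, but it is not taken from the Minitab documentation. The name `wald_test` is made up for this example.

```python
import math

# Coefficients and standard errors copied from the logistic regression table above.
rows = {
    "Constant": (-3.24636, 0.284575),
    "EUR": (-3.72838, 1306.88),
    "NA": (0.182607, 0.309653),
    "SEA": (0.144157, 0.313832),
    "SWP": (0.456364, 0.304302),
    "WA": (0.143504, 0.320717),
}


def wald_test(coef, se):
    """Return (z, two-sided p) for H0: coefficient = 0, using the normal approximation."""
    z = coef / se
    p = math.erfc(abs(z) / math.sqrt(2))  # same as 2 * (1 - Phi(|z|))
    return z, p


for name, (coef, se) in rows.items():
    z, p = wald_test(coef, se)
    print(f"{name:<8} Z = {z:6.2f}  P = {p:.3f}")
```

Each of these P-values tests one coefficient on its own, against the reference level of the factor. A standard error as large as the one for EUR usually means the estimate is unstable, so the Wald P-value for that row says very little either way.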

#152852

BTDT
Participant

HengYu: The low p-values for “constant” and “all slopes are zero” do not mean that the regression is significant. Look at the help file in Minitab or do some research on the interpretation of the results of logistic regression.

Cheers, BTDT
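As background on where the “all slopes are zero” P-value comes from: the G statistic is normally referred to a chi-square distribution with DF degrees of freedom, and the P-value is the upper-tail area. The sketch below uses only the Python standard library; `chi2_sf` is a helper written for this illustration, not a Minitab or library function, and it only handles whole-number degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    t = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-t) * sum of t^j / j! for j = 0 .. df/2 - 1
        return math.exp(-t) * sum(t**j / math.factorial(j) for j in range(df // 2))
    # Odd df: erfc term plus a finite sum over half-integer powers of t
    tail = sum(t ** (j - 0.5) / math.gamma(j + 0.5) for j in range(1, (df + 1) // 2))
    return math.erfc(math.sqrt(t)) + math.exp(-t) * tail


# G and DF as printed in HengYu's output
p_value = chi2_sf(12.226, 5)
print(f"P-Value = {p_value:.3f}")
```

This test asks a single joint question (is at least one slope different from zero?), which is a different question from the one asked by each row of the coefficient table. That is why the two sets of P-values need not agree.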

#152875

BTDT
Participant

HengYu: The regression is also not working very well because you have very few shortshipped orders, and especially very few shortshipped to Europe.

One idea is to use a test of proportion for Europe orders vs. everywhere else. This might be enough evidence to show the stakeholders a reason to dig into the differences in handling orders to different regions. Also consider that different order destinations can be confused with different distribution centres.

When we did a project like this one, we found that there was a major difference in the number of short shipments made to domestic and international destinations. This was caused by the more rigorous requirement for customs documentation for international orders. Very few international orders were divided in order to book incremental sales by the end of a particular month.

Cheers, BTDT
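A rough sketch of the Europe vs. everywhere-else comparison, using the counts HengYu posts further down the thread (0 short shipments in 2863 Europe orders, 25 in the remaining 19996). Two calculations are shown: a pooled two-proportion z-test, and the exact chance of seeing no short shipments at all in Europe if the 25 events fell on orders at random. Both function names are made up for this example, and with this few events the normal approximation in the z-test is shaky.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-sample z-test for H0: p1 == p2. Returns (z, two-sided p)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, math.erfc(abs(z) / math.sqrt(2))


def prob_zero_events(n_group, n_total, events_total):
    """Hypergeometric probability that a group of n_group orders contains
    none of the events_total events, if events fall on orders at random."""
    prob = 1.0
    for i in range(events_total):
        prob *= (n_total - n_group - i) / (n_total - i)
    return prob


z, p = two_proportion_z(0, 2863, 25, 19996)
print(f"z = {z:.2f}, two-sided p = {p:.3f}")
print(f"P(no short shipments in Europe by chance) = {prob_zero_events(2863, 22859, 25):.3f}")
```

On these counts the z-test comes out at roughly 0.06 and the exact probability at roughly 0.035, so the two views straddle the usual 0.05 line. That is a sign of how little information 25 events carry, and a reason to treat any single P-value here with caution.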

#152883

HengYu
Participant

BTDT,
Firstly, thanks for your replies. But what does the slope test tell me, and what is ‘Constant’?
Why look at Europe when it is zero? I have attached the table below to show you the absolute numbers of shortshipped cases.

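| DES Region | SSPD Cases | OK Cases | Grand Total |
|---|---:|---:|---:|
| SWP | 9 | 3406 | 3415 |
| NA | 6 | 5484 | 5490 |
| SEA | 5 | 5201 | 5206 |
| WA | 4 | 4170 | 4174 |
| AM | 1 | 1710 | 1711 |
| EUR | 0 | 2863 | 2863 |
| Grand Total | 25 | 22834 | 22859 |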

#152921

BTDT
Participant

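HengYu: Thank you for posting your data. The situation is much clearer. The biggest problem is that you have very few observations. The zero observations from Europe are an extreme case. Even though it violates some of the assumptions, I would advise that you do a Chi-square test (two-way table in worksheet) on this data. It shows (p-value = 0.049) that there is a non-random relationship between the ratio of short-shipments and region.

Now that the test has shown a non-random difference, look at the processes to identify why these differences may be occurring.

| Row | Statistic | SSPD Cases | OK Cases | Total |
|---|---|---:|---:|---:|
| 1 (SWP) | Observed | 9 | 3406 | 3415 |
| | Expected | 3.73 | 3411.27 | |
| | Chi-Sq contribution | 7.422 | 0.008 | |
| 2 (NA) | Observed | 6 | 5484 | 5490 |
| | Expected | 6.00 | 5484.00 | |
| | Chi-Sq contribution | 0.000 | 0.000 | |
| 3 (SEA) | Observed | 5 | 5201 | 5206 |
| | Expected | 5.69 | 5200.31 | |
| | Chi-Sq contribution | 0.084 | 0.000 | |
| 4 (WA) | Observed | 4 | 4170 | 4174 |
| | Expected | 4.56 | 4169.44 | |
| | Chi-Sq contribution | 0.070 | 0.000 | |
| 5 (AM) | Observed | 1 | 1710 | 1711 |
| | Expected | 1.87 | 1709.13 | |
| | Chi-Sq contribution | 0.406 | 0.000 | |
| 6 (EUR) | Observed | 0 | 2863 | 2863 |
| | Expected | 3.13 | 2859.87 | |
| | Chi-Sq contribution | 3.131 | 0.003 | |
| Total | | 25 | 22834 | 22859 |

Chi-Sq = 11.126, DF = 5, P-Value = 0.049

4 cells with expected counts less than 5.

Cheers, BTDT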
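The chi-square figures above can be reproduced from the six pairs of counts. The sketch below is a plain Pearson chi-square for a two-way table, written with the standard library only; `chi_square_two_way` is a name chosen for this example, and the row order is assumed to match the order of the output above. The P-value step (upper tail of a chi-square distribution with 5 degrees of freedom) is left out to keep the example short.

```python
# (short-shipped, OK) counts per region, in the row order of the output above.
observed = [(9, 3406), (6, 5484), (5, 5201), (4, 4170), (1, 1710), (0, 2863)]


def chi_square_two_way(table):
    """Pearson chi-square for an r x c table. Returns (statistic, df, expected counts)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    # Expected count under independence: row total * column total / grand total
    expected = [[r * c / grand for c in col_totals] for r in row_totals]
    stat = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df, expected


stat, df, expected = chi_square_two_way(observed)
small_cells = sum(1 for row in expected for e in row if e < 5)
print(f"Chi-Sq = {stat:.3f}, DF = {df}")
print(f"{small_cells} cells with expected counts less than 5")
```

Nearly all of the statistic comes from two cells: the 9 short shipments in row 1 against about 3.7 expected, and the 0 in row 6 against about 3.1 expected. With four of the twelve expected counts below 5, the usual chi-square approximation is stretched, which is the assumption BTDT says is being violated.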

#153290

Rogers
Participant

First of all, I looked at your data from a practical point of view. Whenever you analyze short or missed shipments against total shipments, you need to establish the relationship between them. I added the % column to your chart:

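| DES Region | SSPD Cases | OK Cases | Grand Total | % Short |
|---|---:|---:|---:|---:|
| SWP | 9 | 3406 | 3415 | 0.26% |
| NA | 6 | 5484 | 5490 | 0.11% |
| SEA | 5 | 5201 | 5206 | 0.10% |
| WA | 4 | 4170 | 4174 | 0.10% |
| AM | 1 | 1710 | 1711 | 0.06% |
| EUR | 0 | 2863 | 2863 | 0.00% |
| Grand Total | 25 | 22834 | 22859 | 0.11% |
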
You really want to focus on the %, as it tells you from a practical standpoint that you have a higher incident rate in the SWP region. Next, what is your null? What are you trying to prove when running the logistic regression? Based on the data you should pool EUR and re-run since the 0 p-value is less than 0.05.
Hope this helps.
Deb
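The % Short column is simply short shipments divided by total shipments for each region. A small sketch of that arithmetic follows, with the counts copied from the table above; `percent_short` is a name made up for this example.

```python
# (short-shipped, total) per region, copied from the table above.
regions = {
    "SWP": (9, 3415), "NA": (6, 5490), "SEA": (5, 5206),
    "WA": (4, 4174), "AM": (1, 1711), "EUR": (0, 2863),
}


def percent_short(short, total):
    """Short shipments as a percentage of all shipments."""
    return 100.0 * short / total


total_short = sum(s for s, _ in regions.values())
total_orders = sum(t for _, t in regions.values())
overall = percent_short(total_short, total_orders)

for name, (short, total) in regions.items():
    print(f"{name:<4} {percent_short(short, total):.2f}%")
print(f"All  {overall:.2f}%")

# How far SWP sits above the overall rate
swp_ratio = percent_short(*regions["SWP"]) / overall
print(f"SWP rate is {swp_ratio:.1f} times the overall rate")
```

On these counts the SWP rate is roughly 2.4 times the overall rate, which is the practical difference the percentages are meant to bring out.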

