# Hypothesis testing for discrete data


- This topic has 12 replies, 6 voices, and was last updated 8 years, 12 months ago by khatri.

March 18, 2011 at 10:11 pm #53761

Draper (Participant, @Don-Draper)

What are the most often used hypothesis tests for discrete data?

March 19, 2011 at 1:29 am #191349

Strayer (Participant, @Straydog)

March 23, 2011 at 1:44 pm #191365

MBBinWI (Participant, @MBBinWI)

**Don Draper wrote:** What are the most often used hypothesis tests for discrete data?

The one that answers the question that I’m interested in, duh!

March 23, 2011 at 1:46 pm #191366

MBBinWI (Participant, @MBBinWI)

**Straydog wrote:**

Cat got your tongue, Dog? Sometimes I kill myself!!!!!!!!!!!!

March 23, 2011 at 4:37 pm #191370

Draper (Participant, @Don-Draper)

MBBinWI,

Can you do a two-sample t-test (testing the means) on discrete data?

March 23, 2011 at 5:41 pm #191372

Robert Butler (Participant, @rbutler)

How discrete is discrete? If it is binary, no. If it is Likert or count data, sure, go ahead.

If it is something other than binary and if you are really worried – run the t-test and then run the Wilcoxon-Mann-Whitney on the same data and see what you see. The t-test is robust with respect to non-normality but if the data gets too extreme the test can fail to detect a difference in mean location when one exists. The Wilcoxon works under all conditions that would be appropriate for a t-test but it does a better job (has higher power) in cases of extreme asymmetry.
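The "run both tests on the same data and compare" advice above can be sketched in plain Python. This is a minimal illustration, not anyone's production method: the function names, the Likert-style scores, and the choice of the Welch (unequal-variance) form of the t-test are assumptions made here. The Wilcoxon-Mann-Whitney p-value uses the tie-corrected normal approximation without a continuity correction, so it is only approximate for small samples.

```python
import math
from statistics import mean, variance


def welch_t_test(x, y, steps=2000):
    """Welch two-sample t-test; returns (t, df, two-sided p-value)."""
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    # Welch-Satterthwaite degrees of freedom
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    # Two-sided p-value by Simpson integration of the t density on [0, |t|]
    c = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2))
    c /= math.sqrt(df * math.pi)

    def pdf(u):
        return c * (1 + u * u / df) ** (-(df + 1) / 2)

    a = abs(t)
    h = a / steps
    area = pdf(0.0) + pdf(a)
    for k in range(1, steps):
        area += (4 if k % 2 else 2) * pdf(k * h)
    return t, df, max(0.0, 1 - 2 * area * h / 3)


def mann_whitney(x, y):
    """Mann-Whitney U for sample x; returns (U, z, two-sided p-value)."""
    pooled = sorted(list(x) + list(y))
    ranks, tie_sum, i = {}, 0, 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + 1 + j) / 2   # midrank of positions i+1..j
        tie_sum += (j - i) ** 3 - (j - i)    # tie correction term
        i = j
    n1, n2 = len(x), len(y)
    n = n1 + n2
    u = sum(ranks[v] for v in x) - n1 * (n1 + 1) / 2
    var = n1 * n2 / 12 * ((n + 1) - tie_sum / (n * (n - 1)))
    z = (u - n1 * n2 / 2) / math.sqrt(var)
    return u, z, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical 1-5 Likert scores, not data from this thread
before = [2, 3, 3, 1, 2, 4, 3, 2, 3, 2]
after = [4, 5, 3, 4, 5, 4, 2, 5, 4, 3]

t, df, p_t = welch_t_test(after, before)
u, z, p_u = mann_whitney(after, before)
print(f"Welch t = {t:.2f} (df = {df:.1f}), p = {p_t:.4f}")
print(f"Mann-Whitney U = {u:.1f}, z = {z:.2f}, p = {p_u:.4f}")
```

If the two p-values lead to the same conclusion, the non-normality is probably not doing much harm. If they disagree, that is the cue to look harder at the shape of the data.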

If it’s binary and you are looking at samples from two independent populations then you would want to run a test on two proportions.
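For the binary case, a test on two proportions can be sketched as a pooled two-sided z-test. This is the large-sample normal approximation only; the function name and the before/after counts are hypothetical and chosen here for illustration.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sided z-test of H0: p1 == p2 (large-sample approximation)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return z, p_value


# Hypothetical counts: 30 events in 200 patients before, 15 in 200 after
z, p = two_proportion_z_test(30, 200, 15, 200)
print(f"z = {z:.3f}, p = {p:.4f}")
```

The approximation gets shaky when either group has only a handful of events, which is where an exact test is the safer choice.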

March 24, 2011 at 7:59 pm #191381

Draper (Participant, @Don-Draper)

Yes, the data are binary: the patient has sepsis, yes or no.

Death would be binary as well, correct? Suggestions, my good friend?

How can you show a statistically significant improvement over a baseline of septic patients after implementing a new process?


March 24, 2011 at 8:19 pm #191382

Robert Butler (Participant, @rbutler)

The answer is too long to summarize in a forum of this type, but it is not terribly involved. The exact answer to your question can be found in the book:

Fleiss, Levin, and Paik, Statistical Methods for Rates and Proportions, 3rd Edition, Chapter 9, "The Comparison of Proportions from Several Independent Samples," subsection 9.3, "Gradient in proportions: samples qualitatively ordered." That is the situation you have: controls (least likely), sepsis with the new method (more likely), sepsis with the old method (most likely).

Based on your post, the proportions would be: number of deaths / total number in population.

March 24, 2011 at 10:10 pm #191383

Draper (Participant, @Don-Draper)

Robert,

Can this be completed in Minitab 16?

Have you run such a hypothesis test?

March 25, 2011 at 12:30 pm #191386

Robert Butler (Participant, @rbutler)

I don’t know Minitab, so I can’t comment on its capabilities. My guess, however, is that it can do this, or at least the first part.

The first part of this kind of analysis is running a chi-square test on an mx2 contingency table. If that statistic is significant then the second part consists of the calculations needed to identify the proportions contributing to the significant difference.

Just to see if Minitab can run the first part, try this sample problem (from Fleiss, p. 189):

| Group | Smoker | Count |
|---|---|---|
| 1 | 0 | 3 |
| 1 | 1 | 83 |
| 2 | 0 | 3 |
| 2 | 1 | 90 |
| 3 | 0 | 7 |
| 3 | 1 | 129 |
| 4 | 0 | 12 |
| 4 | 1 | 70 |

The table to check would be group x smoker, and the question is the significance of the ratio of smokers to non-smokers.

If Minitab can handle mx2 contingency tables you should get a chi-square value of 12.6 and the p-value should be .0056.
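For anyone without Minitab or SAS, the same check can be sketched in plain Python. The counts are the ones in the sample problem above; the helper names `chi2_sf` and `chi_square_test` are made up for this sketch, and the p-value uses the closed-form chi-square tail for integer degrees of freedom. A statistics library would normally do this for you.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable, integer df >= 1."""
    half = x / 2.0
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    total = 0.0
    for k in range(1, (df - 1) // 2 + 1):
        total += half ** (k - 0.5) / math.gamma(k + 0.5)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


def chi_square_test(table):
    """Pearson chi-square test of independence; returns (stat, df, p)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, chi2_sf(stat, df)


# Rows are groups 1-4; columns are (Smoker = 0, Smoker = 1)
table = [[3, 83], [3, 90], [7, 129], [12, 70]]
stat, df, p = chi_square_test(table)
print(f"chi-square = {stat:.2f}, df = {df}, p = {p:.4f}")
```

The result should agree with the chi-square and p-value quoted above to the precision given there.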

The next piece would be identifying the samples or groups of samples contributing to the significance and you would need to talk to the folks at Minitab to find out if it is possible and if so how to proceed.

As for running this kind of analysis – yes – many times. I do my work in SAS.

March 28, 2011 at 2:57 pm #191395

Andrew Banks (Participant, @BBinNC)

Robert:

Minitab is fine with mx2 tables, either in the format you provided or in two-way format. You just have to select “cross-tab & Chi square” for the table you provided. My results match yours exactly (Chi square 12.6 & p=0.006).

“The next piece would be identifying the samples or groups of samples contributing to the significance and you would need to talk to the folks at Minitab to find out if it is possible and if so how to proceed.”

I had Minitab calculate the contribution to chi-square for each cell. Just by looking, group 4 non-smoker contributes 9.05 to the statistic, while the next highest contributions are on the order of 1. This indicates that the proportion of non-smokers with cancer in group 4 is unexpectedly high (12 versus an expected count of 5.16), correct?
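The per-cell breakdown described here is easy to reproduce by hand. The sketch below uses the counts from Robert's sample problem; the function name is hypothetical. Each cell's contribution is (observed - expected)^2 / expected, and the sign of observed - expected shows the direction of the departure.

```python
def cell_contributions(table):
    """Return (row, col, observed, expected, contribution) for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    cells = []
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            cells.append((i, j, observed, expected,
                          (observed - expected) ** 2 / expected))
    return cells


# Rows are groups 1-4; columns are (Smoker = 0, Smoker = 1)
table = [[3, 83], [3, 90], [7, 129], [12, 70]]
cells = cell_contributions(table)

# Largest contributions first
for i, j, obs, exp, contrib in sorted(cells, key=lambda c: -c[4]):
    direction = "above" if obs > exp else "below"
    print(f"group {i + 1}, smoker={j}: observed {obs}, "
          f"expected {exp:.2f} ({direction}), contribution {contrib:.2f}")
```

Sorting by contribution puts the group 4 non-smoker cell at the top, which is the same reading as the Minitab output.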

Is there a next step?

March 28, 2011 at 4:14 pm #191396

Andrew Banks (Participant, @BBinNC)

As long as the data has only two proportions (i.e. before & after an intervention) you can use the 2-proportions test in Minitab, which uses either raw data OR summarized data.

The 2-Proportions output includes Fisher’s Exact Test alongside the normal-approximation test. The exact test can be more accurate than the Chi Square test when counts are small (like counts of undesirable infections).
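Fisher's exact test for a 2x2 before/after table can be sketched directly from the hypergeometric distribution. The function name and the example counts are hypothetical. The two-sided p-value here sums the probabilities of all tables no more likely than the observed one, which is one common convention; software may differ in how it defines the two-sided version.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    r1, r2, c1 = a + b, c + d, a + c
    denom = comb(r1 + r2, c1)

    def prob(k):
        # Probability that the top-left cell equals k with all margins fixed
        return comb(r1, k) * comb(r2, c1 - k) / denom

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # Small relative tolerance so equally likely tables are not lost to rounding
    return sum(prob(k) for k in range(lo, hi + 1)
               if prob(k) <= p_obs * (1 + 1e-9))


# Hypothetical counts: (infections, no infection) before and after a change
p = fisher_exact_two_sided(12, 28, 3, 37)
print(f"Fisher exact two-sided p = {p:.4f}")
```

Because the calculation enumerates every table with the same margins, it does not rely on expected counts being large.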

April 6, 2011 at 10:11 am #191422

khatri (Participant, @anilkhatri)

Hi,

If you have discrete data, the best test you can go for is the Chi square test (when both Y and X are discrete). When you go to Minitab, there are three tests under it:

1. Cross tabulation: used when both Y and X are in alphabet/character form

2. 1 Way variable: used when Y is a number and X is in character/alphabet form

3. 2 way: used when both Y and X are numbers

Secondly, if your Y is discrete and X is continuous, then you can go for binary logistic regression (BLR). Please note that for this, Y should be binary in nature.

Hope this helps.

Regards

Anil