To Null or Not to Null?
 This topic has 19 replies, 11 voices, and was last updated 19 years, 9 months ago by Andrew M. Brody.

July 30, 2002 at 12:20 pm #29992
billybob (Participant)
Can someone please explain the Null and Fail to accept the Null statements related to the “P” value to me in plain English?
And three cheers for the Penn. Miners!
Thanks..and later,
Billybob

July 30, 2002 at 12:45 pm #77698
Ovidiu Contras (Participant)
Your Null Hypothesis, as I understand it, is your initial statement (the suspect is innocent). Always put what you want to prove (the suspect is guilty) in your Alternate Hypothesis.
The p-value comes into play when asking the question: how sure are you that the suspect is innocent?
If your p-value is less than 0.05, you are 5% or less sure that the suspect is innocent, and 95% or more that the suspect is guilty. Therefore you say that you reject the NH and accept the AH.
As the p-value gets greater than 0.05, you have less confidence in accepting the AH and rejecting the NH.
Nevertheless, a p-value of 0.05 is a sort of “standard”; you can use whatever value is best for you and for what you are trying to achieve.
Hope it helped…..

July 30, 2002 at 12:46 pm #77699
Anonymous (Participant)
If the “P” is low, the null has got to go!

July 30, 2002 at 12:49 pm #77700
Billybob –
The “null” term means “ain’t no difference”. That’s what we are testing. It is our hypothesis or theory or, in down-home terms, our “fancy”.
Since we only take a sample and the very next one we take COULD potentially change our mind, we can never accept that there is no difference. If we find one that is different, then we can reject the null and say they are different. Notice that any further samples won’t change our minds again, we’ve detected a difference! But if we don’t see this difference we will say that we “fail to reject the null”. Again, this statement leaves open the possibility that the next sample could show the difference and we will reject the null, but we don’t have any evidence yet to do so. Example:
You just got in a shipment of possums, say 100 of them. You say you believe they are all the same in gender, say male. That’s your null. So you get a “reference” possum, an old boar you’ve had for years. You compare 20 more possums that you randomly catch by the tail as they amble by. You look at them you know where and compare them to the “reference”. If they are anatomically the same as your reference, you will fail to reject. If one has a different set of plumbing, you will reject the null and conclude that they are not all males. By the way, you and I know that possums like to travel in mixed company, so it’s unlikely that all twenty will turn out to be male. If they do, you may have just chosen a bad sample. In that case you will have committed an error of the second kind (failing to reject a null that is false). But that’s a discussion that will need more possums, raccoons and maybe even an odd chicken or two to finish!

July 30, 2002 at 12:50 pm #77701
The null hypothesis is ALWAYS that “things” are the same. When testing normality, the null is that the distribution of interest is the same as the normal distribution. When comparing two distributions using a t-test, you assume they have the same central location. When testing a specific factor in a regression equation, the null is that the SLOPE is zero, hence no correlation. I know it gets confusing sometimes, but you simply have to remember what the null is in each case.
The p-value is simply the probability that the null is true. In other words, do you have enough statistical evidence to show a difference from the null? In most cases, we choose a p-value of .05 (a 95% confidence) as the decision point.
Hope this helps!

July 30, 2002 at 1:17 pm #77703
Anonymous (Participant)
Ok, so if the p-value is the probability that the null hypothesis is true, does that mean that if p = 0.95 we can say that we have 95% confidence in the null? The answer is NO!
In other words, the p-value is NOT the probability of the null hypothesis itself, but rather the probability of obtaining a sample mean at least as extreme as the one observed, given that the null is true.

July 30, 2002 at 3:09 pm #77707
Bill & Bob,
Think of the P-value as a percentage, i.e. a P-value of 0.05 = 5%. The P-value is then your chance of being wrong if you reject the null.
P.S. Are possums the US equivalent of our Scottish haggis?
Later David!

July 30, 2002 at 3:27 pm #77709
billybob (Participant)
Folks,
Possums can’t be compared to anything else. So let me try to reason this out…… If I look at a possum and say his P-value is .05 that he may be tonight’s supper, then most likely I’ll find him on the plate?
Later,
Billybob

July 30, 2002 at 4:02 pm #77714
Mike Carnell (Participant)
Billybob,
As several people have stated, the null means no difference between the things being compared, so you write it with the equals sign. The decision comes with the alternate: is it greater than, less than, etc.?
Your .05 can be anything you choose it to be. The .05 is standard but if you are in something like pacemakers you might want to be using .01 or better. On the other hand if you are making those little umbrellas that go into drinks then you could run something like .2. It is a reflection of the amount of risk (of an alpha error) you are willing to assume. It will also affect your sample size that is required for a test. If you are willing to assume more risk then it requires less. In some businesses where time to market is more critical than 95% confidence (and there isn’t a life threatening situation) it may be more advantageous to accept more risk and increase your decision making velocity.
There are other times when you just do not have large amounts of data, such as low volume production. You can run with the smaller sample sizes, since you have no choice. It will still give you some quantifiable level of risk/confidence, which you do not get if you just use experience and no data.
The number you choose becomes your decision point. If you draw out the normal distribution and draw in a line that leaves only .05 or 5% of the area under the curve in the tail area you have defined the rejection area. If your p value is less than the .05 then the line that designates that area has to be to the right of the .05. In that case you would reject the null. If it has a larger number than .05 then the line to designate the area is to the left and in statistical terms you “fail to reject” the null.
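The rejection area Mike describes can be put into numbers. What follows is only a minimal sketch using Python's standard library. It assumes a one-sided z-test on a standard normal curve, and the observed z-score of 2.1 is a made-up value chosen for illustration, not something from this thread.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def critical_value(alpha: float) -> float:
    """z-score that leaves `alpha` of the area in the upper tail."""
    return std_normal.inv_cdf(1.0 - alpha)


def upper_tail_p_value(z: float) -> float:
    """Area under the curve to the right of the observed z-score."""
    return 1.0 - std_normal.cdf(z)


def decide(z: float, alpha: float) -> str:
    """Reject the null when the p-value falls below the chosen alpha."""
    if upper_tail_p_value(z) < alpha:
        return "reject the null"
    return "fail to reject the null"


observed_z = 2.1  # hypothetical test statistic, for illustration only
for alpha in (0.01, 0.05, 0.20):  # strict, conventional, lenient
    print(alpha, round(critical_value(alpha), 3), decide(observed_z, alpha))
```

The same observed result is rejected at the lenient and conventional alphas but not at the strict one, which is the trade-off between risk and decision speed described above.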
I wouldn’t get all hung up on the term fail to reject. It is purely the statistical terminology. When you make the decision on the data and move to the next step there is no difference in your action whether you fail to reject or you reject. It gets treated the same.
Good luck

July 30, 2002 at 6:05 pm #77721
Gabriel (Participant)
Hi, Billybob. How are you doing with your GB training? (Or should I ask how is your GB training doing with you?)
I am not an expert on this, but this is what I understand (if I am wrong, please someone correct me to help both Billybob and me!).
The null hypothesis (Ho) is something that you would like to know but that you can’t prove. For example: Will this die deliver each number with exactly the same probability (1/6)? The only way to prove that would be to throw the die infinitely many times. Do these two machines produce parts at the same average? The only way to prove it would be to measure (free of error) every part produced or to be produced by each machine. Does this process deliver parts of the same size with a pressure of 5 as with a pressure of 10? And the same question for a temperature of 40 or 80? And with the raw material supplied by A or by B? And the different combinations of these factors? (Here you have a DOE.) So Ho usually takes the shape of “A=B”, where A and B are the same statistic of two different (or potentially different) populations, or A is the statistic of a population and B is what it is thought that this statistic should be (like in the example of the die).
Because you can’t prove Ho, you set an alternate hypothesis, Ha, that because of the shape of Ho usually takes one of the following shapes: a) “A<B”, b) “A>B” or c) “A not equal to B”.
When you take samples and get estimators for A and B you will not find that est(A)=est(B), even if A and B are actually equal, just because of chance (sampling error). You have two samples that you think come from two populations with the same average. The averages of the samples will hardly be equal even when the means of the two populations are. So, no matter how close the estimators are, you cannot be sure that the two populations’ statistics are equal. However, if the estimators are far from each other, you can more than suspect that they come from different populations. Alpha is the risk of saying that they are from different populations when the difference was only by chance. If you make a test at a low alpha value (low risk, high confidence) you will ask for larger differences between the estimators to say that the populations’ statistics are different (or to reject the null Ho because the alternate Ha is very likely).
The “P” value is a statistic you get from the comparison of the estimators. It tells you how far you could go with alpha and still reject Ho and accept Ha.
For example, if you make a test at an alpha=0.05 and you find a P=0.08, then the difference between the two estimators is not big enough to assure, with 95% confidence (1 - alpha), that the populations’ statistics are different. But the difference would be enough if 90% confidence (alpha=0.1) was enough for you.
Now, not being able to reject Ho is not the same as saying that you will accept it, even though it is usually done this way (if you can’t reject it, then accept it). But, in fact, the best you can do if you don’t reject Ho is to say that you have a certain confidence that the two populations’ statistics do not differ by more than x.
An example (for concept only, without calculations):
You want to know if a die delivers as many odds as evens. Your Ho is it does (Prob(odd)=0.5), and your Ha is it does not (Prob(odd) different from 0.5). If you throw the die 100 times and find 80 odd and 20 even, this would hardly happen by chance if Ho was true, so the P value will be low, so you can say at a high confidence level that Ha is true, and then Ho is false.
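For anyone who wants the calculation behind this example, here is a rough sketch of an exact two-sided binomial test using only Python's standard library. The function names are mine, and the "sum every outcome no more likely than the observed one" rule is one common convention for a two-sided p-value, assumed here rather than taken from the post. It is run on the 80/20 split above and on the 53/47 split discussed next.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k odd results in n throws if Prob(odd) = p."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def two_sided_p_value(k: int, n: int, p0: float = 0.5) -> float:
    """Exact two-sided p-value: total probability, under Ho, of every
    outcome that is no more likely than the observed one."""
    observed = binomial_pmf(k, n, p0)
    # Small relative tolerance so symmetric outcomes are not lost to rounding.
    total = sum(
        binomial_pmf(i, n, p0)
        for i in range(n + 1)
        if binomial_pmf(i, n, p0) <= observed * (1.0 + 1e-9)
    )
    return min(1.0, total)


p_80 = two_sided_p_value(80, 100)  # 80 odd, 20 even
p_53 = two_sided_p_value(53, 100)  # 53 odd, 47 even
print(f"80/20 split: p = {p_80:.3g}")
print(f"53/47 split: p = {p_53:.3g}")
```

The first p-value is tiny and the second is large, which matches the reasoning in the post: the 80/20 result would hardly happen by chance under Ho, while the 53/47 result gives no grounds to reject it.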
Now if you find 53 odd 47 even, you can’t be so sure that odd is not as likely as even, but you can’t say that they are, either. You have a high P value because, if Ho is true, such a result would be very expectable just by chance. But this result would also be very expectable if Prob(odd)=0.55, in which case Ho is not true, so you don’t have evidence either to reject or to accept Ho. What you can say with some high confidence is, for example, that Prob(odd) does not differ from 0.5 by more than 0.15. The risk of being wrong when saying this is the beta risk.

July 30, 2002 at 6:14 pm #77722
billybob (Participant)
Geepers…thanks for all the answers. I guess that’s been my real problem all along; I don’t know Ho!
So I guess I’ll print off all the replies and when Mother and I sit on the dock tonight watching the sunset while chewing on a little possum jerky I’ll try to make some sense of all the replies.
Thanks everyone, Later
Billybob

July 30, 2002 at 6:30 pm #77724
Gabriel (Participant)
Mike: Just a comment on “Failing to reject”.
Note that, using the same alpha level, rejecting the null with a sample of 5 is as safe as rejecting it with a sample of 500. Only that with 5 it will be harder. The problem with 5 is what we do if we can’t reject the null. Because it is hard, it is probable that this will happen even if the null is not true. But 5 is too few to accept the null. You said a lot about the risk of rejecting the null (alpha), but what about the risk of accepting the null (beta)?
Sometimes, when the test is long and/or expensive and/or you have reasons to think that you will be able to reject the null with a small sample, you make the test with a small sample knowing from the beginning that, if you fail to reject the null, you will have to make further investigation (such as taking more samples) before reaching a conclusion, and that you will not just accept the null because you can’t reject it in the first trial.
The example of the gender of the 100 possums is a good example. Your null is that they are all males. Take 5, find 1 female, and you reject the null. If I don’t find 1 female in 5 I won’t sign that they are all male.

July 30, 2002 at 7:51 pm #77727
Gabriel,
I disagree with your comments about continuing to collect samples until you reject the null hypothesis. The correct methodology would be to conduct multiple tests and look for consistency in the result of your inferential statistics.
As you increase your sample size you decrease the likelihood of a type II error and thus you increase the Power of the test (1 - Beta). Looking at a greater detectable difference, increasing sample size, and minimizing variation are all standard ways of dropping type II error. The only caveat, with all else static, is that there is an inverse relationship between type I and II error. There is a higher likelihood of a type II error (and a lower overall Power) at an alpha of .01 vs. an alpha of .05.
An equally important concept to think about is what constitutes the alternate hypothesis. Standard practice is to put not equal, but this needs to be really thought about. For instance, if I have two processes and I will switch only if there is, at the minimum, equality, but prefer increased performance, I can’t make the alternate not equal. It would need to be equal to or greater than. For a given level of significance, a one-tailed test will be more powerful than a two-tailed option since all of the alpha is concentrated in one specific tail vs. being split between the two.
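The relationships in this post (power against sample size, against alpha, and one tail against two) can be checked with a few lines of Python. This is a sketch under simplifying assumptions of my own: a one-sample z-test with known sigma, a true upward shift of 0.5 sigma, and 20 observations as the baseline. None of those numbers come from the thread.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal


def power(shift_sigma: float, n: int, alpha: float, two_tailed: bool = True) -> float:
    """Approximate power (1 - Beta) of a z-test to detect a true upward
    shift of `shift_sigma` standard deviations with n observations."""
    tail_alpha = alpha / 2 if two_tailed else alpha
    z_crit = Z.inv_cdf(1.0 - tail_alpha)
    noncentrality = shift_sigma * sqrt(n)
    upper = 1.0 - Z.cdf(z_crit - noncentrality)  # reject in the upper tail
    lower = Z.cdf(-z_crit - noncentrality) if two_tailed else 0.0
    return upper + lower


shift = 0.5  # hypothetical true shift, in sigma units
print("n=20, alpha=.05, two-tailed:", round(power(shift, 20, 0.05), 3))
print("n=20, alpha=.01, two-tailed:", round(power(shift, 20, 0.01), 3))
print("n=40, alpha=.05, two-tailed:", round(power(shift, 40, 0.05), 3))
print("n=20, alpha=.05, one-tailed:", round(power(shift, 20, 0.05, two_tailed=False), 3))
```

Note that the one-tailed advantage only holds when the true shift is in the direction being tested; a shift in the other direction would go undetected.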
Regards,
Erik

July 30, 2002 at 10:07 pm #77735
Gabriel (Participant)
Erik
In my previous message I just wanted to point out that just accepting Ho every time you fail to reject it may not be a good practice. I didn’t intend to say that continuing to collect samples was the right method.
However, I recognize that hypothesis testing is not my strong side. So I would appreciate it if you could provide further explanation about why continuing to collect samples is not a good methodology.
For example, I want to know if a coin is fair or not. Ho: P(face)=0.5. I choose an alpha level, a minimum detectable difference, and a beta. I start throwing the coin and after each throw I write down the result, calculate P (with all data since the first trial) and calculate beta (taking all the trials as sample size). If P<alpha, I stop and reject Ho. If P>alpha but the calculated beta is less than the beta I had chosen, I stop, accept Ho, and accept the beta risk (that the probability of face differs from 0.5 by more than what I had declared “minimum detectable”) (if the “minimum detectable difference” is small, this will take a lot of samples even if the coin was actually fair). If P>alpha but beta is still too high, I go on with the next trial. Increasing the sample size with a constant alpha and a constant minimum detectable difference does reduce the beta risk, so sooner or later I will reach a conclusion.
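One way to explore this procedure is to simulate it. The sketch below is a simplified version with assumptions of my own. It tosses a fair coin up to 200 times, starts checking after 10 tosses, and uses a normal-approximation z-test at alpha = 0.05. It models only the "reject as soon as P<alpha" half of the rule and ignores the beta-based stopping rule. The seed and run count are arbitrary.

```python
import random
from math import sqrt

Z_CRIT = 1.96  # two-sided critical value for alpha = 0.05 (normal approximation)


def significant(heads: int, n: int) -> bool:
    """z-test of Ho: P(face) = 0.5 using the normal approximation."""
    z = (heads - n / 2) / sqrt(n / 4)
    return abs(z) > Z_CRIT


def false_alarm_rates(max_tosses=200, first_look=10, runs=2000, seed=1):
    """Toss a FAIR coin and compare two rules for rejecting Ho:
    test after every toss, or test once at the end."""
    rng = random.Random(seed)
    peeking_hits = 0
    fixed_hits = 0
    for _ in range(runs):
        heads = 0
        rejected_while_peeking = False
        for n in range(1, max_tosses + 1):
            heads += rng.random() < 0.5
            if n >= first_look and significant(heads, n):
                rejected_while_peeking = True
        peeking_hits += rejected_while_peeking
        fixed_hits += significant(heads, max_tosses)
    return peeking_hits / runs, fixed_hits / runs


peeking_rate, fixed_rate = false_alarm_rates()
print("false alarms when testing after every toss:", peeking_rate)
print("false alarms when testing once at the end: ", fixed_rate)
```

Under these assumptions the coin is fair, so every rejection is a type I error. Testing once at the end stays near the nominal 5%, while testing after every toss rejects far more often, because each extra look is another chance to cross the line by luck. Sequential methods such as Wald's sequential probability ratio test exist to keep the overall alpha under control when you do want to look as you go.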
Sorry if what I am saying is meaningless. What’s wrong with the previous method?
Thanks
Gabriel

July 31, 2002 at 4:20 am #77737
Mike Carnell (Participant)
Gabriel,
I obviously didn’t make my point very well. Let me give it another shot.
There is a huge amount of difference between running a hypothesis test with a sample size of 5 and 500. It doesn’t have anything to do with confidence. If I select an alpha of .05 and a beta of .10 and run a sample size of 5 the test will show a significant difference with a 2.3 sigma shift. If I use a 500 piece sample it will show significance with a 0.2 sigma shift. Without changing the alpha and beta, increasing the sample size increases the sensitivity of the test (it shows significance with less movement).
My point was if I run with a larger alpha risk then I can make the same decision with less samples. If I choose to test for a 1.0 sigma shift at an alpha of .05 and a beta of .10, I need a sample size of 23. If I want to test for the same 1.0 sigma shift but I am willing to take more risk and set alpha at .20 and beta at .10 I only need a sample size of 15. In some situations the reduction of samples is important.
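These sample sizes can be approximated with the usual normal-approximation formula. The sketch below assumes a two-sided comparison of two groups with equal, known sigma. That setup is my assumption, since the post does not say which test or tool produced its numbers.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal


def z_sum(alpha: float, beta: float) -> float:
    """z for the two-sided alpha risk plus z for the beta risk."""
    return Z.inv_cdf(1.0 - alpha / 2) + Z.inv_cdf(1.0 - beta)


def sample_size_per_group(shift_sigma: float, alpha: float, beta: float) -> int:
    """Observations needed in each group to detect `shift_sigma`."""
    return ceil(2.0 * (z_sum(alpha, beta) / shift_sigma) ** 2)


def detectable_shift(n_per_group: int, alpha: float, beta: float) -> float:
    """Smallest shift (in sigma units) detectable with n per group."""
    return z_sum(alpha, beta) * sqrt(2.0 / n_per_group)


print(sample_size_per_group(1.0, alpha=0.05, beta=0.10))  # 1.0 sigma shift
print(sample_size_per_group(1.0, alpha=0.20, beta=0.10))  # same shift, more alpha risk
print(round(detectable_shift(5, 0.05, 0.10), 2))
print(round(detectable_shift(500, 0.05, 0.10), 2))
```

Under these assumptions the formula gives 22 and 14 per group, and detectable shifts of about 2.05 and 0.21 sigma. These are close to, but a little below, the 23, 15, 2.3 and 0.2 quoted above. A calculation based on the t distribution would give somewhat larger requirements at small n, which may account for the gap. The pattern is the same either way: accepting more alpha risk cuts the sample size.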
In situations where samples are at a premium, speed is important, etc. you can choose to accept more risk to expedite the decision.
As far as not rejecting the null or whatever decision you opt not to make you really only have two choices. You either reject it or you don’t. If you have set up a test properly, selected the alpha, beta and sensitivity (sample size) and you execute the plan why wouldn’t you make the decision?
If you make a decision to collect more data you have to consider other factors: first, the test is becoming more sensitive as you collect more data, and your process may not be the same process over an extended period of time without doing more analysis. Over time, more sources of variation are going to creep into the process and pooling data may be a bad decision all by itself.

July 31, 2002 at 12:42 pm #77740
Gabriel (Participant)
Sorry, Mike.
Hypothesis testing is not my strong side, so I am trying to understand.
I (think I) understand what you explain about how taking more risk helps reduce the sample size. I also understand and agree that you reject Ho or don’t, no further choices. Just that (my view) not rejecting Ho is not always the same as accepting it. You can decide neither to accept it nor to reject it, YET, and make further investigation.
What I don’t understand is why the test should necessarily be set up in a way that you always have to reach a final conclusion of the form reject/accept Ho.
Please help me on this, because I am not sure what I am saying is correct: Imagine the following scenario: I won’t reject Ho unless I can do it with 95% confidence (alpha 0.05) and I won’t accept Ho (or definitely decide not to reject it) unless I can prove, with 90% confidence (beta 0.10), that the difference is not greater than 1.0 sigma. According to your numbers, I would need a sample of 23. However, I actually do not know how large the difference is, if any, in what I am testing. If this difference happens to be 2.3 sigma a sample of 5 would be enough, and taking 23 to prove something that could have been proven with 5 would be wasting resources. So I try with 5. If I can reject Ho at alpha 0.05 with that, great! But if I can’t, all I can say is that I am 90% confident that the difference is not greater than 2.3 sigma, and that’s not enough in this scenario. Based on the actual result of the test with 5, I can decide my next step. If I got a P of 0.07 (small, but not small enough for alpha 0.05) maybe I would go for another 5. On the other hand, if I got a P of 0.4, maybe I would go for the remaining 18 to get to 23.
Of course, I don’t mean that such a strategy is always the right one. In fact, I wonder if it would ever be the right one. As you said, not if you have a stability issue (which means that the individuals sampled may belong to different processes). But such an approach could (could?) be applied, for example, if you have a process you have been using up to now and that you know, and you try a new one with which you make a trial batch. Maybe for the process it is not a big deal to make 5 or 5000 parts, but testing can be an issue (long, expensive, etc.). As you have the full batch made at once, you won’t have the problem that, as you collect more data, things could have changed in the process.
In general, this approach would be applicable only when you are sure that you are always collecting data from the same population (as you collect more and more), like taking the sample from the same batch or from a stable process. In particular, this can be especially useful when you suspect that there IS a significant difference (Ho will be finally rejected) and that this difference is quite a bit higher than what you are willing to accept as “minimum significant difference” and when, not sampling, but measuring or testing the sample is a problem.
By the way, it seems to me that if you decide that your conclusions will be only “reject Ho” or “can’t reject Ho”, but never “Accept Ho”, beta risk is just zero, regardless of anything else. You have no risk of accepting Ho when, in fact, the difference was greater than the “significant” one, because you are never accepting Ho.
What do you think?0July 31, 2002 at 7:05 pm #77749You ask is a p value of .95 means a confidence of 95%….. of course not. The confidence interval is 1p. You say the p value represents the probability of getting a sample mean if the null is true. That is correct, but essentially you are saying the same thing. For example, in a ttest, we assume the means of two distributions are the same. So what we are really testing is the distribution of AB. This distribution should have a hypothetical mean of 0, but we have a value. The standard deviation that we are interested in, is the standard deviation of the distribution of averages, which is the standard deviation of AB divided by the size of the sample(corresponding n’s). Hence the pvalue you get is the area under the curve for the given actual observed difference (the actual AB of sampled means) and represents the probability of getting this difference from two assumed equal distributions. The value of p equal to .05 is simply a convienent decision point and IT DOES relate to a confidence interval of 95%!

August 2, 2002 at 2:09 pm #77796
billybob (Participant)
Hey everyone, thanks for your replies which gave me some insight into my pondering questions. But I recently sat down with my BB and we ran through Hypothesis Testing using my project as the source of data. I have to say that this first-hand show-and-tell using my project has helped me understand the Ha / Ho / Null so much better. And after reading all your replies it made all the more sense and was so much easier to grab hold of. So again, thanks to you all.
It’s like so many things we encounter every day: we can talk about problems, but until we actually touch them we never know the whole story.
Later, And don’t look at the Red Sox score from last night; “YIKES”!
Billybob

August 2, 2002 at 2:12 pm #77797
Georgette (Participant)
Thanks, Dave, for the fun response to what could be a dull question!!
And I also appreciate the simple, easy-to-remember phrase from Anonymous “If the p-value’s low, the null’s got to go!” I learn a new thing to teach every time I go on this website!

August 19, 2002 at 1:04 pm #78230
Andrew M. Brody (Participant)
My statistical knowledge would fit in a thimble. However I remember first learning about the Null Hyp. about 10 years ago or so studying for the CQEIT. I had an interesting thought about our court system. Say 95% is the AH (Guilty), and the Jury produces a Guilty verdict in a Capital case where the verdict is based on the premise that the defendant’s guilt is beyond all reasonable doubt. I would equate that with the AH, which is significantly high but still leaves 5% doubt concerning guilt; thus 5% of those executed are truly innocent (this could point to the actual number of innocent people being executed; there is a number, but I forget what it is). This is also true for the NH and the actual guilty going free. Frankly, I’m not even sure this applies in this situation.
The forum ‘General’ is closed to new topics and replies.