ajl
Should a 2 sample t test have equal sample sizes?

Darth
Equal sample sizes are not needed for the 2 sample t test.  The only assumption is that the two groups are normally distributed.  You have the option of testing with the 2 samples having either equal or unequal variances, but each requires a different formula for computing the t statistic.  In addition to deciding on an appropriate alpha for your test, I recommend that you also determine your power based on the sample sizes, to be sure that you can see a difference should one exist.
If you have no idea what I just said above, you better do some homework and research so that you do it correctly.
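For anyone who wants to see the two formulas side by side, here is a minimal sketch using only the Python standard library. The function names (`pooled_t`, `welch_t`, `approx_power`) and the sample data are made up for illustration, and the power function is only a rough normal approximation that assumes a known common sigma; it is not the exact noncentral-t calculation.

```python
import math
from statistics import NormalDist, mean, variance


def pooled_t(a, b):
    """Equal-variance (pooled) t statistic and its degrees of freedom."""
    n1, n2 = len(a), len(b)
    # Pooled variance: sample variances weighted by their degrees of freedom.
    sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean(a) - mean(b)) / se, n1 + n2 - 2


def welch_t(a, b):
    """Unequal-variance (Welch) t statistic with Welch-Satterthwaite df."""
    n1, n2 = len(a), len(b)
    v1, v2 = variance(a) / n1, variance(b) / n2
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return (mean(a) - mean(b)) / math.sqrt(v1 + v2), df


def approx_power(delta, sigma, n1, n2, alpha=0.05):
    """Rough two-sided power via a normal approximation (assumed known sigma).

    Ignores the negligible far-tail rejection probability.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sigma * math.sqrt(1 / n1 + 1 / n2)
    return NormalDist().cdf(abs(delta) / se - z_crit)


if __name__ == "__main__":
    a = [5.1, 4.9, 6.2, 5.8, 5.5]            # hypothetical group 1
    b = [4.2, 4.8, 4.1, 5.0, 4.4, 4.6, 4.3]  # hypothetical group 2
    print("pooled t, df:", pooled_t(a, b))
    print("Welch  t, df:", welch_t(a, b))
    print("power 10/10 :", approx_power(1.0, 1.0, 10, 10))
    print("power  5/15 :", approx_power(1.0, 1.0, 5, 15))
```

Two things worth noticing: with equal sample sizes the pooled and Welch t statistics coincide (only the degrees of freedom differ), and for a fixed total number of observations the balanced split gives the higher approximate power.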

Hey Doc,
It was my understanding that the parametric HT generally had 3 underlying assumptions: normality (to which the tests of means are largely robust), independence, and equal variance, with the latter being the one with the greatest consequences.  Based on your answer, it would appear my understanding is incomplete. Could you elaborate a little for me on using the parametric HT with unequal variances?  Thanks!

MBBinWI
Darth:  Leave the sarcasm to HeeBee.

Darth
No John, the assumption of equal variances applies to ANOVA, not the two sample t test.  It can be done either way: using a pooled s.d. if the variances are equal, or separate variance estimates if they are unequal.

Got it…Thanks Doc!

Szentannai
Hi ajl,
according to an empirical study done using Monte Carlo simulations and quoted in the book “The Statistical Sleuth” by Ramsey and Schafer, the t-test is quite reliable even if the distributions are non-normal: “the normality assumption does not appear to be a concern even for small sample sizes, as long as the skewness is the same in the two samples and the SAMPLE SIZES ARE ROUGHLY EQUAL”
Concerning the requirement that the stdev should be the same: “thus, as suggested by the theory, unequal population standard deviations have little effect on validity (of the test) if THE SAMPLE SIZES ARE EQUAL”
So, interestingly, the answer to the question is yes, this is probably the most important requirement for a two-sample t-test.
regards
Sandor

shubhinetwork
Oh,
I agree with you about the “Sample T Test”, but “The Statistical Sleuth” also covers this.

Robert Butler
I can’t decide if you guys are just clowning around or are trying to be serious or perhaps a mixture of both.
Check any good stat book and you will find that the two sample t-test is robust with respect to normality, does not need equal samples and does not need equal variances.
If you need a reference: Brownlee, Statistical Theory and Methodology in Science and Engineering, 2nd edition, pp. 297-303, has the details concerning samples and variance, and Davies, The Design and Analysis of Industrial Experiments, 2nd edition, pp. 51-54, has the details concerning normality.

Darth
I thought I gave that answer…nice and simple.  Hate guys that try to pile on and make it more confusing and show off.  Don’t always need them fancy schmancy books.  A quick Google will likely give you the answer.

StuW
Nobody has asked ajl whether these experimental units can be related in some way, such that the actual approach should be a paired t-test.  In that case, equal sample sizes become a requirement, and the comparison is made on the difference of measurements made on “like” units.  Given the lack of statistical knowledge that is exhibited in many of the questions presented on this board, it would seem as if this would be the first question to ask.
For independent samples, the requirement for equal sample sizes is not applicable, and my experience is that the variance assumption needs to be looked at in the context of the data and the specific problem being analyzed.
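To make the paired case concrete, here is a minimal sketch (standard library only). The function name `paired_t` and the measurements are hypothetical. It shows why equal sample sizes are forced in the paired design: the test is really a one-sample t test on the per-unit differences.

```python
import math
from statistics import mean, stdev


def paired_t(first, second):
    """Paired t statistic and df, computed on per-unit differences."""
    if len(first) != len(second):
        # Each unit must contribute exactly one measurement to each group.
        raise ValueError("paired data need equal-length samples")
    diffs = [y - x for x, y in zip(first, second)]
    n = len(diffs)
    # One-sample t on the differences; undefined if all differences are equal.
    return mean(diffs) / (stdev(diffs) / math.sqrt(n)), n - 1


if __name__ == "__main__":
    before = [10, 12, 9, 11]   # hypothetical measurements on 4 units
    after = [11, 14, 10, 13]   # same units, second condition
    print(paired_t(before, after))
```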

Taylor
StuW, I agree with you, but John pointed out that “independence” was a requirement for the 2 sample t test. Not trying to bash, but the poster did not ask about a paired t-test.
Just me being an A$$ this morning, no reply necessary.
TGIF

Szentannai
The book cites an experiment, not theory. It is really simple: they generated two sets of random numbers with known characteristics, then they took samples and applied a t-test. They repeated this many times for each pair of sets and counted how many of the confidence intervals actually contained the true difference between the means of the two sets. They repeated this for different distributions and sample sizes. Their experimental conclusion was, as I quoted, that the best results were achieved (most intervals contained the true difference) in the case where the sample sizes were not very different. It is an experimental result: take it, or call it clowning and leave it.
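An experiment of the kind described above can be sketched in a few lines. This is not the book's actual setup: it assumes normal populations with unequal standard deviations (3 and 1), a pooled-variance 95% interval, and two sample-size splits that both give 18 degrees of freedom so a single table value (2.101) can be hard-coded. All names and parameters are made up for illustration.

```python
import math
import random
from statistics import mean, variance

T_CRIT_DF18 = 2.101  # two-sided 95% critical value of t with 18 df (table value)


def pooled_ci_covers(a, b, true_diff, t_crit):
    """True if the pooled-variance interval for mean(a)-mean(b) covers true_diff."""
    n1, n2 = len(a), len(b)
    sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    half_width = t_crit * math.sqrt(sp2 * (1 / n1 + 1 / n2))
    diff = mean(a) - mean(b)
    return diff - half_width <= true_diff <= diff + half_width


def coverage(n1, n2, sd1, sd2, reps=4000, seed=1):
    """Fraction of simulated intervals that contain the true difference (0)."""
    if n1 + n2 - 2 != 18:
        raise ValueError("this sketch hard-codes the critical value for 18 df")
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    hits = 0
    for _ in range(reps):
        a = [rng.gauss(0.0, sd1) for _ in range(n1)]
        b = [rng.gauss(0.0, sd2) for _ in range(n2)]
        hits += pooled_ci_covers(a, b, 0.0, T_CRIT_DF18)
    return hits / reps


if __name__ == "__main__":
    print("equal n (10, 10), sd 3 vs 1:", coverage(10, 10, 3.0, 1.0))
    print("unequal n (5, 15), sd 3 vs 1:", coverage(5, 15, 3.0, 1.0))
    print("unequal n (5, 15), sd 1 vs 1:", coverage(5, 15, 1.0, 1.0))
```

Under these assumptions the equal-size case should stay close to the nominal 95%, while the case where the smaller sample comes from the more variable population should fall well short of it. That matches the quoted conclusion, at least for the pooled-variance version of the test.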

MBBinWI
Oh, Robert – there you go again, being rational and reasonable.  You’re no fun at all.

