Can anyone please explain to me the difference between choosing to use an independent t-test or a paired t-test? Help! Thanks!
Let’s take the 1-Sample ‘T’ test as an example. It computes a confidence interval and performs a hypothesis test of the mean when the population standard deviation (sigma) is unknown.
Paired “T” test - This is appropriate for testing the difference between two means when the data are paired and the paired differences follow a normal distribution. The paired “T” test gives a confidence interval and performs a hypothesis test of the difference between population means when observations are paired. It matches observations that are dependent, which allows the test to account for variability between the pairs. Thus the error term is usually smaller.
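Paired comparisons are used when the observations are naturally matched, so that pair-to-pair variation would otherwise swamp the difference between the means. Because the test works on the differences, it also does not require treating the two population variances as equal.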
In paired comparisons we work with the differences of paired observations.
Using these differences, one can then perform the one-sample t test to determine whether the mean difference is significantly different from a specified delta (d), i.e. Mu1 – Mu2.
Paired comparisons can be used for before and after analysis.
They also can be used in measurement system studies to see if operators are getting the same mean value across the same set of samples.
We use Two-sample “T” test when the samples are drawn independently from two populations.
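To make the "one-sample t test on the differences" idea concrete, here is a minimal sketch in plain Python. The function names and the before/after numbers are made up for illustration; they are not from any particular package. It only computes the t statistic and degrees of freedom; for a p value you would look t up in a t table or use a statistics library (for example SciPy's `scipy.stats.ttest_rel`).

```python
import math
import statistics

def one_sample_t(values, mu0=0.0):
    """Return (t, df) for the null hypothesis mean(values) == mu0."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    t = (mean - mu0) / (sd / math.sqrt(n))
    return t, n - 1

def paired_t(before, after, delta=0.0):
    """Paired t test: a one-sample t test on the pairwise differences."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(after, before)]
    return one_sample_t(diffs, delta)

# Hypothetical before/after measurements on the same five units
before = [10.0, 12.0, 9.0, 11.0, 13.0]
after = [11.0, 14.0, 10.0, 13.0, 14.0]
t, df = paired_t(before, after)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

Note that the order of the two lists matters: element i of `before` must belong to the same unit as element i of `after`.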
Hope it helps.
Hermant makes a good start at the technical distinction. A couple things I would add would be:
1 sample t: compares a sample mean against a criterion (a fixed number)
2 sample t: compares two sample means, when the data is just two ‘pools’ of numbers with no special ordering or relationship
paired t: evaluates two sets of numbers when each number in one set has a matching ‘partner’ in the other set. It focuses on the differences within each pair, making it a more sensitive test because it ‘cancels out’ background differences that might be moving both members of each pair around.
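A small sketch of why pairing is more sensitive, using made-up numbers where each pair sits at a very different level but the within-pair difference is small and consistent. The two-sample statistic here is the unequal-variance (Welch) form, which is my choice for the illustration; the helper names are hypothetical.

```python
import math
import statistics

def two_sample_t(x, y):
    """Welch-style two-sample t statistic: treats x and y as two unrelated pools."""
    se = math.sqrt(statistics.variance(x) / len(x) + statistics.variance(y) / len(y))
    return (statistics.mean(y) - statistics.mean(x)) / se

def paired_t(x, y):
    """Paired t statistic: works only with the within-pair differences."""
    diffs = [b - a for a, b in zip(x, y)]
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(len(diffs)))

# Hypothetical data: big spread between pairs, small shift within each pair
x = [10.0, 20.0, 30.0, 40.0, 50.0]
y = [11.0, 22.0, 31.0, 42.0, 51.0]

t_two = two_sample_t(x, y)
t_pair = paired_t(x, y)
print(f"two-sample t = {t_two:.2f}, paired t = {t_pair:.2f}")
```

Same ten numbers, but the two-sample statistic is buried by the spread between pairs while the paired statistic is not. If the numbers really were two unrelated pools, pairing them up would not be legitimate.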
If anyone should still be working to get their mind around that – I’ve always loved the story that George Box used to tell to convey the idea simply. It goes something like this —
Suppose we have a new synthetic material for making soles for shoes and we have enough to make about 100 shoes. We’d like to compare the new material with leather – using some energetic kids who are willing to wear test shoes and return them after a time for our study.
First question – what’s the best way to ‘spend’ our test material and distribute it to the kids?
We’d probably avoid the weakest plan – giving 50 kids synthetic sole shoes and 50 kids leather shoes and then collect them back, comparing the average wear in each group (this would be a two-sample t test). We’d realize that this plan suffers because of all the variation in the type of play and degree of wear due to each child. All that variation is unaccounted for when we try to see a difference in the mean of each pile of shoes. That’s the weakness of a two-sample t test.
Some intermediate plans might include trying to track how much use kids gave the shoes, etc., but the best plan is probably to give each child one synthetic shoe and one leather shoe. This makes each kid a miniature experiment unto themselves (formally called ‘blocking’ in experimental design). When they return the shoes we *don’t* want to put them in two piles (which would throw away our new precision); instead we want to note the *difference* in wear for each pair. That’s a paired t test. It is more powerful, because kid-to-kid differences in activity are ‘cancelled out’ by our focusing on just the difference in wear. If there is truly no difference between the two materials (the null hypothesis), then the differences will bounce around zero and wash each other out over the long run (giving a t near zero and a high p value). If there truly is a difference favoring one material, the delta in wear will be biased to favor that material, sometimes a little (for sedentary kids) and sometimes a lot (for active kids), but the sum of deltas will stack up away from zero (giving a large t and a small p value).
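For anyone who likes to see it in numbers, here is a rough simulation of the two plans. Everything in it is an assumption made for illustration only: the 10% wear advantage for the synthetic sole, the range of activity levels, the noise level, and the 50 kids per group.

```python
import math
import random
import statistics

rng = random.Random(1)  # fixed seed so the sketch is repeatable

def activity_level():
    # Assumption: kids differ a lot in how hard they are on shoes
    return rng.uniform(1, 10)

def wear(activity, material):
    # Assumption: synthetic soles wear 10% less than leather, plus a little noise
    factor = 0.9 if material == "synthetic" else 1.0
    return activity * factor + rng.gauss(0, 0.05)

# Plan 1: two separate groups of kids, one material each (two-sample t)
leather = [wear(activity_level(), "leather") for _ in range(50)]
synthetic = [wear(activity_level(), "synthetic") for _ in range(50)]
se = math.sqrt(statistics.variance(leather) / 50 + statistics.variance(synthetic) / 50)
t_two_sample = (statistics.mean(leather) - statistics.mean(synthetic)) / se

# Plan 2: every kid wears one shoe of each material (paired t)
diffs = []
for _ in range(50):
    a = activity_level()  # the same kid's activity drives both shoes
    diffs.append(wear(a, "leather") - wear(a, "synthetic"))
t_paired = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(len(diffs)))

print(f"two-sample t = {t_two_sample:.2f}, paired t = {t_paired:.2f}")
```

In this setup the kid-to-kid spread in activity is much larger than the material effect, so the two-sample statistic is typically small while the paired statistic is large, which is the point of the story.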
That’s the ‘allegory’ version of two-sample vs. paired t that I’ve found helps many to ‘get it.’
Oh, one more thing: while we ‘blocked’ by giving each kid one of each shoe, would we give each kid left-leather and right-synthetic?
No, because there is ‘footedness’ (just look at the heels on your shoes or the wear on gym equipment). Best to randomize the assignment of material to foot. This shows another nice DOE concept: ‘block what you can and then randomize within the blocks.’
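Randomizing within the blocks could be sketched like this. The function name, the kid IDs and the seed are all hypothetical; the only idea being shown is that each kid (block) gets both materials, and which foot gets which is decided at random.

```python
import random

def assign_materials(kid_ids, seed=None):
    """For each kid (block), randomly decide which foot gets which material."""
    rng = random.Random(seed)
    plan = {}
    for kid in kid_ids:
        materials = ["leather", "synthetic"]
        rng.shuffle(materials)  # random order within this block
        plan[kid] = {"left": materials[0], "right": materials[1]}
    return plan

# Hypothetical run with six kids; the seed only makes the plan repeatable
plan = assign_materials(range(1, 7), seed=42)
for kid, feet in plan.items():
    print(kid, feet)
```

Every kid still wears one shoe of each material, so the blocking is kept, but any left/right wear bias is spread across both materials instead of lining up with one of them.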
Hope that helps
THANK YOU!!!!! You finally made sense of a confusing subject!
I’m not having any problems with the math in statistics but my professor seems to be lacking in explaining why & when to use each formula. If I were the only one confused, I would say it was me but an entire class is lost!
Between you & Hermant, I can finally see the light!
Hi Dave and Hemant!
That was a very simple way of explaining a deep concept. Great.
I am with a contact center which has multiple clients. We are trying to compare the CSAT performance of our teams between 2 different programs (clients). Can you describe how to do the independent or paired t test in this scenario?