Unpaired (Two Sample) t Test
Menu location: Analysis_Parametric_Unpaired t.
This function gives an unpaired two sample Student t test with a confidence interval for the difference between the means.
The unpaired t method tests the null hypothesis that the population means related to two independent, random samples from an approximately normal distribution are equal (Altman, 1991; Armitage and Berry, 1994).
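Assuming equal variances, the test statistic is calculated as:
$$ t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} $$
$$ s^2 = \frac{\sum_{i=1}^{n_1} (x_{1i} - \bar{x}_1)^2 + \sum_{j=1}^{n_2} (x_{2j} - \bar{x}_2)^2}{n_1 + n_2 - 2} $$
- where x bar 1 and x bar 2 are the sample means, s² is the pooled sample variance, n1 and n2 are the sample sizes and t is a Student t quantile with n1 + n2 - 2 degrees of freedom.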
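The equal-variance calculation can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not StatsDirect's own code: the function name `pooled_t` and the variable names are assumptions made for the sketch, and the data are the rat weight gains from the example further down this page. P values and confidence limits are left out because they need a Student t distribution function.

```python
import math
from statistics import mean

def pooled_t(a, b):
    """Return (standard error, t, df) for the equal-variance unpaired t test."""
    n1, n2 = len(a), len(b)
    m1, m2 = mean(a), mean(b)
    # Sums of squared deviations about each sample mean
    ss1 = sum((x - m1) ** 2 for x in a)
    ss2 = sum((x - m2) ** 2 for x in b)
    df = n1 + n2 - 2
    s2 = (ss1 + ss2) / df                    # pooled sample variance
    se = math.sqrt(s2 * (1 / n1 + 1 / n2))   # combined standard error
    return se, (m1 - m2) / se, df

# Weight gains from the example on this page
high = [134, 146, 104, 119, 124, 161, 107, 83, 113, 129, 97, 123]
low = [70, 118, 101, 85, 107, 132, 94]

se_equal, t_equal, df_equal = pooled_t(high, low)
print(f"SE = {se_equal:.6f}, t = {t_equal:.6f}, df = {df_equal}")
```

The printed standard error, t and df should agree with the "Assuming equal variances" block of the example output.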
Power is calculated as the power achieved with the given sample sizes and variances for detecting the observed difference between means with a two-sided type I error probability of (100-CI%)% (Dupont, 1990).
The unpaired t test should not be used if there is a significant difference between the variances of the two samples; StatsDirect tests for this and gives appropriate warnings. For the situation of unequal variances, StatsDirect calculates Satterthwaite's approximate t test, a method in the Behrens-Welch family (Armitage and Berry, 1994).
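Assuming unequal variances, the test statistic is calculated as:
$$ d = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} $$
$$ df = \frac{\left( \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} \right)^2}{\frac{(s_1^2 / n_1)^2}{n_1 - 1} + \frac{(s_2^2 / n_2)^2}{n_2 - 1}} $$
- where x bar 1 and x bar 2 are the sample means, s1² and s2² are the sample variances, n1 and n2 are the sample sizes, and d is the Behrens-Welch test statistic, evaluated as a Student t quantile with df degrees of freedom using Satterthwaite's approximation.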
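The unequal-variance statistic and Satterthwaite's degrees of freedom can be sketched the same way. Again this is only an illustration under stated assumptions: `welch_t` is a hypothetical helper name, not a StatsDirect function, and the data are the rat weight gains from the example on this page.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Return (standard error, d, df) for the unequal-variance t test."""
    n1, n2 = len(a), len(b)
    # Each sample variance divided by its own sample size
    v1 = variance(a) / n1
    v2 = variance(b) / n2
    se = math.sqrt(v1 + v2)                  # combined standard error
    d = (mean(a) - mean(b)) / se             # Behrens-Welch statistic
    # Satterthwaite's approximate degrees of freedom (usually not an integer)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return se, d, df

# Weight gains from the example on this page
high = [134, 146, 104, 119, 124, 161, 107, 83, 113, 129, 97, 123]
low = [70, 118, 101, 85, 107, 132, 94]

se_unequal, d_unequal, df_unequal = welch_t(high, low)
print(f"SE = {se_unequal:.6f}, d = {d_unequal:.4f}, df = {df_unequal:.6f}")
```

The results should agree with the "Assuming unequal variances" block of the example output, including the fractional degrees of freedom.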
Note that it is often more robust to use the nonparametric Mann-Whitney test as an alternative method in the presence of unequal variances.
Example
From Armitage and Berry (1994, p. 111).
Test workbook (Parametric worksheet: Low Protein, High Protein).
Consider the gain in weight of 19 female rats between 28 and 84 days after birth. 12 were fed on a high protein diet and 7 on a low protein diet.
High protein | Low protein |
134 | 70 |
146 | 118 |
104 | 101 |
119 | 85 |
124 | 107 |
161 | 132 |
107 | 94 |
83 | |
113 | |
129 | |
97 | |
123 | |
To analyse these data in StatsDirect first prepare them in two workbook columns and label these columns appropriately. Alternatively, open the test workbook using the file open function of the file menu. Then select the unpaired t test from the parametric methods section of the analysis menu. Select the columns marked "High protein" and "Low protein" when prompted for data.
For this example:
Unpaired t test
Mean of High Protein = 120 (n = 12)
Mean of Low Protein = 101 (n = 7)
Assuming equal variances
Combined standard error = 10.045276
df = 17
t = 1.891436
One sided P = 0.0379
Two sided P = 0.0757
95% confidence interval for difference between means = -2.193679 to 40.193679
Power (for 5% significance) = 82.25%
Assuming unequal variances
Combined standard error = 9.943999
df = 13.081702
t(d) = 1.9107
One sided P = 0.0391
Two sided P = 0.0782
95% confidence interval for difference between means = -1.980004 to 39.980004
Power (for 5% significance) = 40.39%
Comparison of variances
Two sided F test is not significant
No need to assume unequal variances
Thus we have a difference that is not quite significant at the 5% level. The most important information is, however, conveyed by the confidence interval. The 95% CI includes zero, so we cannot be confident (at the 95% level) that these data show any difference in weight gain. As most of the interval lies on the side of greater weight gain with the high protein diet, and as the test result is in the grey "suggestive" 5%-10% zone, we have good reason to repeat this experiment with larger numbers. Bigger samples will probably shrink the range of uncertainty so that the confidence interval contracts to a narrower band that excludes zero.
N.B. We did not consider a one sided P value here because we could not be absolutely certain that the rats would all benefit from a high protein diet in comparison with those on a low protein diet.
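The 95% confidence interval reported for the equal-variance case can be checked by hand as the difference between means plus or minus a critical t value times the combined standard error. The sketch below takes the means and standard error from the output above. The critical value is an assumption of the sketch: it is hard-coded from standard t tables (two-sided 5%, 17 degrees of freedom, about 2.1098) rather than computed.

```python
# Values taken from the StatsDirect output in the example above
mean_high, mean_low = 120.0, 101.0
se_pooled = 10.045276

# Assumption: two-sided 5% critical value of Student t with 17 degrees
# of freedom, hard-coded from standard t tables (about 2.1098)
t_crit = 2.1098

diff = mean_high - mean_low
half_width = t_crit * se_pooled
lower, upper = diff - half_width, diff + half_width
print(f"difference = {diff:.1f}, 95% CI = {lower:.3f} to {upper:.3f}")
```

Because the critical value is rounded to four decimal places, the limits agree with the reported interval only to about three decimal places.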