Unpaired (Two Sample) t Test

 

Menu location: Analysis_Parametric_Unpaired t.

 

This function gives an unpaired two sample Student t test with a confidence interval for the difference between the means.

 

The unpaired t method tests the null hypothesis that the population means related to two independent, random samples from an approximately normal distribution are equal (Altman, 1991; Armitage and Berry, 1994).

 

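Assuming equal variances, the test statistic is calculated as:

\[ t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}, \qquad s^2 = \frac{\sum_{i=1}^{n_1} (x_{1i} - \bar{x}_1)^2 + \sum_{j=1}^{n_2} (x_{2j} - \bar{x}_2)^2}{n_1 + n_2 - 2} \]

- where x bar 1 and x bar 2 are the sample means, s² is the pooled sample variance, n1 and n2 are the sample sizes and t is a Student t quantile with n1 + n2 - 2 degrees of freedom.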
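The equal variances calculation can be sketched in a few lines of Python. This is only an illustration of the formula, not StatsDirect's own code; the function name pooled_t and the variable names are assumptions made for this sketch, and the data are the weight gains from the worked example on this page.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample1, sample2):
    """Return (t, combined standard error, df) assuming equal variances."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # Pooled variance: within-sample sums of squares divided by n1 + n2 - 2.
    s2 = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df
    se = sqrt(s2 * (1 / n1 + 1 / n2))
    t = (mean(sample1) - mean(sample2)) / se
    return t, se, df


# Weight gains from the worked example on this page.
high_protein = [134, 146, 104, 119, 124, 161, 107, 83, 113, 129, 97, 123]
low_protein = [70, 118, 101, 85, 107, 132, 94]

t, se, df = pooled_t(high_protein, low_protein)
print(f"t = {t:.6f}, combined standard error = {se:.6f}, df = {df}")
```

Run on these data, the sketch should agree with the equal variances results in the example output (t = 1.891436, combined standard error = 10.045276, df = 17).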

 

Power is calculated as the power achieved with the given sample sizes and variances for detecting the observed difference between means with a two-sided type I error probability of (100-CI%)% (Dupont, 1990).

 

The unpaired t test should not be used if there is a significant difference between the variances of the two samples; StatsDirect tests for this and gives appropriate warnings. For the situation of unequal variances, StatsDirect calculates Satterthwaite's approximate t test, a method in the Behrens-Welch family (Armitage and Berry, 1994).

 

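Assuming unequal variances, the test statistic is calculated as:

\[ d = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}, \qquad df = \frac{\left( \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} \right)^2}{\frac{(s_1^2 / n_1)^2}{n_1 - 1} + \frac{(s_2^2 / n_2)^2}{n_2 - 1}} \]

- where x bar 1 and x bar 2 are the sample means, s1² and s2² are the sample variances, n1 and n2 are the sample sizes, and d is the Behrens-Welch test statistic evaluated as a Student t quantile with df degrees of freedom using Satterthwaite's approximation.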
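As a rough check on the unequal variances calculation, the following Python sketch computes d and the Satterthwaite degrees of freedom directly from two samples. It illustrates the standard Welch-Satterthwaite formulas and is not StatsDirect's implementation; the function name welch_t is an assumption, and the data are taken from the worked example on this page.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample1, sample2):
    """Return (d, combined standard error, Satterthwaite df)."""
    n1, n2 = len(sample1), len(sample2)
    # Each sample contributes its own variance divided by its own size.
    v1 = variance(sample1) / n1
    v2 = variance(sample2) / n2
    se = sqrt(v1 + v2)
    d = (mean(sample1) - mean(sample2)) / se
    # Satterthwaite's approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return d, se, df


# Weight gains from the worked example on this page.
high_protein = [134, 146, 104, 119, 124, 161, 107, 83, 113, 129, 97, 123]
low_protein = [70, 118, 101, 85, 107, 132, 94]

d, se, df = welch_t(high_protein, low_protein)
print(f"t(d) = {d:.4f}, combined standard error = {se:.6f}, df = {df:.6f}")
```

The degrees of freedom are generally not a whole number, which is why the example output reports df = 13.081702 for the unequal variances analysis.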

 

Note that it is often more robust to use the nonparametric Mann-Whitney test as an alternative method in the presence of unequal variances.

 

Example

From Armitage and Berry (1994, p. 111).

Test workbook (Parametric worksheet: Low Protein, High Protein).

 

Consider the gain in weight of 19 female rats between 28 and 84 days after birth. 12 were fed on a high protein diet and 7 on a low protein diet.

 

High protein    Low protein
134             70
146             118
104             101
119             85
124             107
161             132
107             94
83
113
129
97
123

 

To analyse these data in StatsDirect first prepare them in two workbook columns and label these columns appropriately. Alternatively, open the test workbook using the file open function of the file menu. Then select the unpaired t test from the parametric methods section of the analysis menu. Select the columns marked "High protein" and "Low protein" when prompted for data.

 

For this example:

 

Unpaired t test

Mean of High Protein = 120 (n = 12)

Mean of Low Protein = 101 (n = 7)

 

Assuming equal variances

Combined standard error = 10.045276

df = 17

t = 1.891436

One sided P = 0.0379

Two sided P = 0.0757

 

95% confidence interval for difference between means = -2.193679 to 40.193679

 

Power (for 5% significance) = 82.25%

 

Assuming unequal variances

Combined standard error = 9.943999

df = 13.081702

t(d) = 1.9107

One sided P = 0.0391

Two sided P = 0.0782

 

95% confidence interval for difference between means = -1.980004 to 39.980004

 

Power (for 5% significance) = 40.39%

 

Comparison of variances

Two sided F test is not significant

No need to assume unequal variances

 

Thus we have a difference that is not quite significant at the 5% level. The most important information is, however, conveyed by the confidence interval. The 95% CI includes zero, so we cannot be confident (at the 95% level) that these data show any difference in weight gain. As most of the interval lies on the side of greater weight gain with the high protein diet, and as the test result is in the grey "suggestive" 5%-10% zone, we have good reason to repeat this experiment with larger numbers. Bigger samples will probably shrink the range of uncertainty, so that the confidence interval may contract to a narrower band that excludes zero.
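The two sided P value and the 95% confidence interval for the equal variances analysis can be checked from the reported difference, combined standard error and degrees of freedom. The Python sketch below uses only the standard library: it integrates the Student t density with Simpson's rule and finds the critical value by bisection. These numerical methods are stand-ins chosen for this illustration, not the algorithms StatsDirect uses, and the function names t_pdf, t_cdf and t_quantile are hypothetical.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """P(T <= x) by Simpson's rule on [0, |x|], using symmetry about zero."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_quantile(p, df, lo=0.0, hi=50.0):
    """Quantile for p >= 0.5, found by bisection on t_cdf."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Values taken from the equal variances output in the example.
diff = 120 - 101          # difference between the sample means
se = 10.045276            # combined standard error
df = 17

t = diff / se
p_two_sided = 2 * (1 - t_cdf(abs(t), df))

confidence = 0.95
t_crit = t_quantile(1 - (1 - confidence) / 2, df)
lower, upper = diff - t_crit * se, diff + t_crit * se

print(f"Two sided P = {p_two_sided:.4f}")
print(f"95% CI = {lower:.4f} to {upper:.4f}")
```

Because the interval is the difference plus or minus the critical t value times the standard error, it includes zero whenever the two sided P value exceeds 0.05, which is the situation in this example.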

 

N.B. We did not consider a one sided P value here because we could not be absolutely certain that the rats would all benefit from a high protein diet in comparison with those on a low protein diet.

 
