Kendall's Rank Correlation

 

Menu location: Analysis_Nonparametric_Kendall Rank Correlation.

 

Kendall's rank correlation provides a distribution-free test of independence and a measure of the strength of dependence between two variables.

 

Spearman's rank correlation is satisfactory for testing a null hypothesis of independence between two variables but it is difficult to interpret when the null hypothesis is rejected. Kendall's rank correlation improves upon this by reflecting the strength of the dependence between the variables being compared.

 

Consider two samples, x and y, each of size n. The total number of possible pairs of observations, (xi, yi) with (xj, yj), is n(n-1)/2. Now consider ordering the observations by their x values and then comparing the order of their y values. A pair of observations is concordant if the observation with the larger x value also has the larger y value, i.e. if xi - xj and yi - yj have the same sign; the pair is discordant if the signs differ. S is the difference between the number of concordant (ordered in the same way, nc) and discordant (ordered differently, nd) pairs: S = nc - nd.
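The pair counting described above can be sketched in a few lines of Python. This is a minimal illustration, not StatsDirect's implementation: the function name concordance_counts and the small data set are hypothetical, and pairs tied on either variable are assumed to count as neither concordant nor discordant.

```python
from itertools import combinations

def concordance_counts(x, y):
    """Return (nc, nd): the numbers of concordant and discordant pairs."""
    nc = nd = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        product = (xi - xj) * (yi - yj)
        if product > 0:      # same ordering on x and y
            nc += 1
        elif product < 0:    # opposite ordering on x and y
            nd += 1
        # product == 0: tied on x or y, counted as neither (assumption)
    return nc, nd

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
nc, nd = concordance_counts(x, y)
S = nc - nd
print(nc, nd, S)
```

The double loop over all n(n-1)/2 pairs is quadratic in n, which is adequate for small samples such as the example on this page.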

 

Tau (τ) is related to S by:

$$\tau = \frac{S}{n(n-1)/2} = \frac{2S}{n(n-1)}$$

 

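If there are tied (same value) observations then τb is used:

$$\tau_b = \frac{S}{\sqrt{\left[\frac{n(n-1)}{2} - \sum_i \frac{t_i(t_i-1)}{2}\right]\left[\frac{n(n-1)}{2} - \sum_i \frac{u_i(u_i-1)}{2}\right]}}$$

- where ti is the number of observations tied at a particular rank of x and ui is the number tied at a particular rank of y.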

 

In the presence of ties the statistic τb is given as a variant of τ adjusted for ties (Kendall, 1970). When there are no ties τb = τ. An approximate confidence interval is given for τb or τ. Please note that the confidence interval does not correspond exactly to the P values of the tests because slightly different assumptions are made (Samra and Randles, 1988).
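As a rough sketch of the tie adjustment, the function below divides S by the square root of the product of the pair counts that remain after removing pairs tied on x and pairs tied on y. The name kendall_tau_b and the data are hypothetical, and this form of the denominator is the standard textbook one, assumed here rather than taken from StatsDirect's code.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

def kendall_tau_b(x, y):
    """Tie-adjusted Kendall tau; equals plain tau when there are no ties."""
    n = len(x)
    s = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        product = (xi - xj) * (yi - yj)
        s += (product > 0) - (product < 0)   # +1 concordant, -1 discordant
    n0 = n * (n - 1) / 2
    # Pairs tied within each group of equal x values and of equal y values
    tx = sum(t * (t - 1) / 2 for t in Counter(x).values())
    ty = sum(u * (u - 1) / 2 for u in Counter(y).values())
    return s / sqrt((n0 - tx) * (n0 - ty))

# Hypothetical data with one tie in x and one tie in y
x = [1, 2, 2, 3]
y = [1, 3, 2, 3]
tau_b = kendall_tau_b(x, y)
print(tau_b)
```

With no tied values both correction sums are zero, so the result reduces to S divided by n(n-1)/2.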

 

The gamma coefficient is given as a measure of association that is highly resistant to tied data (Goodman and Kruskal, 1963):

$$\gamma = \frac{n_c - n_d}{n_c + n_d}$$
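A minimal sketch of the gamma coefficient, assuming the usual Goodman and Kruskal definition (nc - nd) / (nc + nd), in which tied pairs are simply left out of both numerator and denominator. The function name and data are hypothetical.

```python
from itertools import combinations

def goodman_kruskal_gamma(x, y):
    """Gamma = (nc - nd) / (nc + nd); tied pairs are ignored."""
    nc = nd = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        product = (xi - xj) * (yi - yj)
        if product > 0:
            nc += 1
        elif product < 0:
            nd += 1
    # Note: undefined (division by zero) if every pair is tied
    return (nc - nd) / (nc + nd)

# Hypothetical data with ties in both variables
x = [1, 2, 2, 3, 4]
y = [2, 1, 3, 3, 4]
gamma = goodman_kruskal_gamma(x, y)
print(gamma)
```

Because tied pairs drop out of the denominator, gamma is never smaller in magnitude than τb computed on the same data.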

 

Tests for Kendall's test statistic being zero are calculated in exact form when there are no tied data, and in approximate form through a normalised statistic, both with and without a continuity correction (the absolute value of Kendall's score, S, reduced by 1).

 

Technical Validation

An asymptotically distribution-free confidence interval is constructed for τb or τ using the variant of the method of Samra and Randles (1988) described by Hollander and Wolfe (1999).
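The following sketch shows one way such an interval can be computed for τ when there are no ties. It assumes the variance estimator as commonly presented from Hollander and Wolfe, built from the per-observation concordance sums Ci; the function name kendall_tau_ci is hypothetical and the code is not taken from StatsDirect. The data are the Career and Psychology rankings from the example on this page.

```python
from itertools import combinations
from math import sqrt
from statistics import NormalDist

def kendall_tau_ci(x, y, level=0.95):
    """Return (tau, lower, upper) using an asymptotic variance estimate."""
    n = len(x)
    c = [0] * n   # c[i]: sum of +1/-1 concordance signs over pairs involving i
    for i, j in combinations(range(n), 2):
        product = (x[i] - x[j]) * (y[i] - y[j])
        q = (product > 0) - (product < 0)
        c[i] += q
        c[j] += q
    tau = sum(c) / (n * (n - 1))          # sum(c) equals 2S
    c_bar = sum(c) / n
    ss = sum((ci - c_bar) ** 2 for ci in c)
    # Assumed form of the variance estimator (see lead-in)
    variance = (2 / (n * (n - 1))) * (
        2 * (n - 2) / (n * (n - 1) ** 2) * ss + 1 - tau ** 2
    )
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sqrt(variance)
    return tau, tau - half_width, tau + half_width

career = [4, 10, 3, 1, 9, 2, 6, 7, 8, 5]
psychology = [5, 8, 6, 2, 10, 3, 9, 4, 7, 1]
tau, lower, upper = kendall_tau_ci(career, psychology)
print(round(tau, 4), round(lower, 4), round(upper, 4))
```

The interval is symmetric about τ and is not truncated at ±1, so with small samples and strong association a limit may fall outside the attainable range.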

 

In the presence of ties, the normalised statistic is calculated using the extended variance formula given by Hollander and Wolfe (1999). In the absence of ties, the probability of null S (and thus τ) is evaluated using a recurrence formula when n < 9 and an Edgeworth series expansion when n ≥ 9 (Best and Gipps, 1974). In the presence of ties you are guided to make inferences from the normal approximation (Kendall and Gibbons, 1990; Conover, 1999; Hollander and Wolfe, 1999). Note that StatsDirect uses more accurate methods for calculating the P values associated with τ than most other statistical software does, so results may differ between packages.
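To illustrate how an exact null probability can be obtained by recurrence when there are no ties, the sketch below builds the distribution of the number of discordant pairs (inversions) one observation at a time. This is the standard inversion-count recurrence, assumed here for illustration; it is not necessarily the recurrence StatsDirect uses, and it is applied to n = 10 even though the text above states that an Edgeworth expansion is used for n ≥ 9. Function names are hypothetical.

```python
from math import factorial

def inversion_counts(n):
    """counts[k] = number of orderings of n items with k discordant pairs."""
    counts = [1]
    for m in range(2, n + 1):
        new = [0] * (len(counts) + m - 1)
        for k, c in enumerate(counts):
            # Adding the m-th item can create 0 to m-1 new discordant pairs
            for extra in range(m):
                new[k + extra] += c
        counts = new
    return counts

def exact_upper_p(s, n):
    """P(S >= s) under independence, assuming no tied data."""
    total_pairs = n * (n - 1) // 2
    counts = inversion_counts(n)
    # S = total_pairs - 2*nd, so S >= s means nd <= (total_pairs - s) / 2
    max_nd = (total_pairs - s) // 2
    return sum(counts[: max_nd + 1]) / factorial(n)

# S = 23 with n = 10 corresponds to the worked example on this page
p_upper = exact_upper_p(23, 10)
print(round(p_upper, 4), round(2 * p_upper, 4))
```

Doubling the upper-tail probability gives the two sided value because the null distribution of S is symmetric about zero.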

 

Example

From Armitage and Berry (1994, p. 466).

Test workbook (Nonparametric worksheet: Career, Psychology).

 

The following data represent a tutor's ranking of ten clinical psychology students as to their suitability for their career and their knowledge of psychology:

 

 

Career Psychology
4 5
10 8
3 6
1 2
9 10
2 3
6 9
7 4
8 7
5 1

 

To analyse these data in StatsDirect you must first enter them into two columns in the workbook. Alternatively, open the test workbook using the file open function of the file menu. Then select Kendall Rank Correlation from the Nonparametric section of the analysis menu. Select the columns marked "Career" and "Psychology" when prompted for data.

 

For this example:

 

Kendall's tau = 0.5111

Approximate 95% CI = 0.1352 to 0.8870

Upper side (H1 concordance) P = .0233

Lower side (H1 discordance) P = .9767

Two sided (H1 dependence) P = .0466

 

From these results we reject the null hypothesis of mutual independence between the career suitability and psychology knowledge rankings for the students. With a two sided test we are considering the possibility of concordance or discordance (akin to positive or negative correlation). A one sided test would have been restricted to either discordance or concordance; this would be an unusual assumption. In our example we can conclude that there is a statistically significant lack of independence between the tutor's rankings of the students' career suitability and psychology knowledge. The tutor tended to rank students with apparently greater knowledge as more suitable to their career than those with apparently less knowledge, and vice versa.

 
