Compute intraclass correlation to check non-independence of dyadic data

Purpose

I paraphrased Field et al. (2012, chapter 19) below to explain the underlying idea:

In a dyadic context, the ICC represents the proportion of the total variability in the outcome that is attributable to the dyad. If the dyad has a strong effect on its members, the two members behave similarly: variability within dyads is small and most of the variability lies between dyads, so the ICC is large. Conversely, if the dyad has little effect on its members, the outcome varies a lot within dyads, differences between dyads are relatively small, and the ICC is small. A large ICC therefore indicates that variability within levels of a contextual variable (here, the dyad to which a member belongs) is small relative to variability between levels (comparing dyads). As such, the ICC is a good gauge of whether a contextual variable has an effect on the outcome.

The intraclass correlation computed below measures non-independence for indistinguishable dyad members, using ANOVA techniques on an outcome measured at the interval level. See Kenny, Kashy and Cook (2006, pp. 34-35) for more details.
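
For reference, the quantities computed step by step in the Code section can be written compactly. This is a sketch in notation chosen for this notebook, not a quotation from the book: $n$ is the number of dyads, $X_{1i}$ and $X_{2i}$ are the two members' scores in dyad $i$, $m_i$ is the dyad mean, $d_i$ the within-dyad difference, and $M$ the mean of all $2n$ scores.

```latex
\begin{align}
m_i &= \frac{X_{1i} + X_{2i}}{2}, \qquad
d_i = X_{1i} - X_{2i}, \qquad
M = \frac{1}{2n}\sum_{i=1}^{n}\left(X_{1i} + X_{2i}\right) \\
MS_b &= \frac{2\sum_{i=1}^{n}(m_i - M)^2}{n - 1}, \qquad
MS_w = \frac{\sum_{i=1}^{n} d_i^{2}}{2n} \\
ICC &= \frac{MS_b - MS_w}{MS_b + MS_w}, \qquad
F = \frac{MS_b}{MS_w}
\end{align}
```

With these definitions the ICC lies between -1 and 1: it approaches 1 when $MS_w$ is close to zero and becomes negative when members of the same dyad differ more than members of different dyads.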

Requirements

This script requires R, which is free software: go to the R download page.

Your data file needs to be in a dyadic structure, with one row per dyad and one column per member: go to this page to see how to restructure your table in Python.
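
If you would rather stay in R for the restructuring step, the following is a minimal sketch using base R's `reshape()`. It assumes a long table with one row per participant; the table `long` and its columns `member` and `score` are hypothetical names chosen for illustration, so adapt them to your own file.

```r
# Hypothetical long-format table: one row per participant.
long <- data.frame(
  Dyade  = c(1, 1, 2, 2, 3, 3),
  member = c(1, 2, 1, 2, 1, 2),  # position within the dyad (arbitrary for indistinguishable members)
  score  = c(8, 6, 5, 3, 7, 2)
)

# Base R reshape(): one row per dyad, one score column per member.
wide <- reshape(long, idvar = "Dyade", timevar = "member",
                v.names = "score", direction = "wide")

# reshape() names the new columns score.1 and score.2; rename them to match this script.
names(wide) <- c("Dyade", "p1", "p2")
rownames(wide) <- NULL
wide
```

Because the members are indistinguishable, which participant ends up in `p1` and which in `p2` does not matter for the ICC computed below.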

Code

Read .CSV file

In [2]:
data <- read.csv("input.csv", sep = ";")
head(data)
Dyade p1 p2
1 8 6
2 5 3
3 7 2
4 8 5
5 8 7
6 5 6
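
If `input.csv` is not at hand, the example can be reproduced by typing the six dyads shown in the output above directly into a data frame. This is only a convenience sketch: the column names follow the printed table, and the `stopifnot()` checks are an addition that is not part of the original script.

```r
# The six dyads printed above, entered by hand.
data <- data.frame(
  Dyade = 1:6,
  p1 = c(8, 5, 7, 8, 8, 5),
  p2 = c(6, 3, 2, 5, 7, 6)
)

# Basic sanity checks: scores are numeric and each dyad appears only once.
stopifnot(is.numeric(data$p1), is.numeric(data$p2))
stopifnot(!anyDuplicated(data$Dyade))

head(data)
```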

Create new column with mean scores of the two participants

In [3]:
data$mean <- (data$p1 + data$p2) / 2
head(data)
Dyade p1 p2 mean
1 8 6 7.0
2 5 3 4.0
3 7 2 4.5
4 8 5 6.5
5 8 7 7.5
6 5 6 5.5

Create new column with the difference between the two participants' scores

In [4]:
data$distance <- data$p1 - data$p2
In [5]:
data
Dyade p1 p2 mean distance
1 8 6 7.0 2
2 5 3 4.0 2
3 7 2 4.5 5
4 8 5 6.5 3
5 8 7 7.5 1
6 5 6 5.5 -1

Compute mean of all 2n scores

In [6]:
M <- mean(c(data$p1, data$p2), na.rm = TRUE)
M
5.83333333333333

Compute (mean - M)² for each dyad

In [7]:
data$"(mean-M)²" <- (data$mean - M)^2
In [8]:
data
Dyade p1 p2 mean distance (mean-M)²
1 8 6 7.0 2 1.3611111
2 5 3 4.0 2 3.3611111
3 7 2 4.5 5 1.7777778
4 8 5 6.5 3 0.4444444
5 8 7 7.5 1 2.7777778
6 5 6 5.5 -1 0.1111111

Compute distance² for each dyad

In [9]:
data$"distance²" <- (data$distance)^2
In [10]:
data
Dyade p1 p2 mean distance (mean-M)² distance²
1 8 6 7.0 2 1.3611111 4
2 5 3 4.0 2 3.3611111 4
3 7 2 4.5 5 1.7777778 25
4 8 5 6.5 3 0.4444444 9
5 8 7 7.5 1 2.7777778 1
6 5 6 5.5 -1 0.1111111 1

Compute sum of (mean - M)²

In [11]:
SumMeanSquare2 <- sum(data$"(mean-M)²", na.rm = TRUE)
In [12]:
SumMeanSquare2
9.83333333333333

Compute sum of distance²

In [13]:
SumDistance2 <- sum(data$"distance²", na.rm = TRUE)
In [14]:
SumDistance2
44

Compute mean square between dyads (MSb)

In [15]:
MSb <- (2*SumMeanSquare2) / (length(data$Dyade)-1)
In [16]:
MSb
3.93333333333333

Compute mean square within dyads (MSw)

In [17]:
n <- sum(!is.na(data$mean))  # number of dyads with complete data
n
6
In [18]:
MSw <- SumDistance2 / (2*n)
In [19]:
MSw
3.66666666666667
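
As a cross-check, the two mean squares computed by hand should match those of a one-way ANOVA with the dyad as grouping factor. The sketch below re-enters the six dyads so that it runs on its own; the names `long`, `MSb_anova` and `MSw_anova` are chosen for illustration only.

```r
data <- data.frame(Dyade = 1:6,
                   p1 = c(8, 5, 7, 8, 8, 5),
                   p2 = c(6, 3, 2, 5, 7, 6))

# Stack both members' scores: one row per participant, dyad as a factor.
long <- data.frame(
  dyad  = factor(rep(data$Dyade, times = 2)),
  score = c(data$p1, data$p2)
)

# One-way ANOVA with dyad as the grouping factor.
tab <- anova(lm(score ~ dyad, data = long))
ms  <- tab[["Mean Sq"]]  # first entry: between dyads; second: residual (within dyads)

MSb_anova <- ms[1]
MSw_anova <- ms[2]
c(MSb = MSb_anova, MSw = MSw_anova)
```

The between-dyads row has n - 1 degrees of freedom and the residual row has n, which is where the denominators of MSb and MSw come from.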

Compute intraclass correlation (ICC)

In [20]:
ICC <- (MSb - MSw) / (MSb + MSw)
In [21]:
ICC
0.0350877192982455
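
The steps above can also be collected into a single function. This is a minimal sketch, not part of the original script: `dyad_icc` is a hypothetical helper name, and it assumes that dyads with a missing score should be dropped entirely.

```r
# Hypothetical helper: ICC for indistinguishable dyad members (ANOVA approach).
dyad_icc <- function(p1, p2) {
  ok <- !is.na(p1) & !is.na(p2)        # keep complete dyads only (assumption)
  p1 <- p1[ok]
  p2 <- p2[ok]
  n <- length(p1)                      # number of complete dyads
  stopifnot(n >= 2)

  M   <- mean(c(p1, p2))               # mean of all 2n scores
  m   <- (p1 + p2) / 2                 # dyad means
  d   <- p1 - p2                       # within-dyad differences
  MSb <- 2 * sum((m - M)^2) / (n - 1)  # mean square between dyads
  MSw <- sum(d^2) / (2 * n)            # mean square within dyads

  list(n = n, MSb = MSb, MSw = MSw, ICC = (MSb - MSw) / (MSb + MSw))
}

res <- dyad_icc(p1 = c(8, 5, 7, 8, 8, 5), p2 = c(6, 3, 2, 5, 7, 6))
res$ICC
```

On the six example dyads this should reproduce the ICC of about 0.035 shown above. Swapping `p1` and `p2` within any dyad leaves the result unchanged, because only the dyad means and the squared differences enter the formulas.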

Compute F value

In [22]:
F <- MSb / MSw
In [23]:
F
1.07272727272727

Compute p value

In [24]:
# upper tail: probability of observing an F at least this large
p <- pf(q=F, df1=length(data$Dyade)-1, df2=length(data$Dyade), lower.tail=FALSE)
p
0.458224952880836

If the p-value is below .05, the scores are considered non-independent.
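
Because the ICC for indistinguishable members can be negative, a two-tailed version of the test may be more appropriate. The sketch below follows my reading of Kenny, Kashy and Cook (2006): the larger mean square goes in the numerator, the degrees of freedom follow it, and the one-tailed probability is doubled. Treat the doubling rule as an assumption to verify against the book; `icc_f_test` is a hypothetical helper name.

```r
# Hypothetical helper: F test of the ICC for indistinguishable dyad members.
icc_f_test <- function(MSb, MSw, n) {
  if (MSb >= MSw) {
    F_value <- MSb / MSw   # positive ICC: between-dyads mean square on top
    df1 <- n - 1
    df2 <- n
  } else {
    F_value <- MSw / MSb   # negative ICC: within-dyads mean square on top
    df1 <- n
    df2 <- n - 1
  }
  # Upper-tail probability of an F at least this large.
  p_one <- pf(F_value, df1 = df1, df2 = df2, lower.tail = FALSE)
  list(F = F_value, df1 = df1, df2 = df2,
       p_one_tailed = p_one,
       p_two_tailed = min(1, 2 * p_one))  # doubled, capped at 1 (assumption)
}

# Mean squares and number of dyads from the example above.
res <- icc_f_test(MSb = 3.93333333333333, MSw = 3.66666666666667, n = 6)
res
```

The helper avoids naming a variable `F`, since `F` is also R's shorthand for `FALSE`.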

References

Field, Andy, Jeremy Miles, and Zoë Field. Discovering statistics using R. Sage, 2012.
Kenny, David A., Deborah A. Kashy, and William L. Cook. Dyadic data analysis. Guilford Press, 2006.