Jan Stanstrup - 1 year ago 66

R Question

I was wondering if there is a way to turn a logical matrix of comparisons into the letter notation used in multiple-comparison tests, as produced by `multcomp::cld`.

The data I have looks like this:

`test_data <- data.frame(mean=c(1.48, 1.59, 1.81,1.94),CI_lower=c(1.29,1.38,1.54, 1.62),CI_upper=c(1.56,1.84, 2.3, 2.59))`

      mean CI_lower CI_upper
    1 1.48     1.29     1.56
    2 1.59     1.38     1.84
    3 1.81     1.54     2.30
    4 1.94     1.62     2.59

What I am interested in is a notation that says which entries have overlapping CIs to get a final result that looks like this:

`final <- data.frame(mean=c(1.48, 1.59, 1.81,1.94),CI_lower=c(1.29, 1.38,1.54, 1.62),CI_upper=c(1.56,1.84, 2.3, 2.59),letters = c("a","ab","ab","b"))`

      mean CI_lower CI_upper letters
    1 1.48     1.29     1.56       a
    2 1.59     1.38     1.84      ab
    3 1.81     1.54     2.30      ab
    4 1.94     1.62     2.59       b

I made a pitiful attempt that went like this:

    same <- outer(test_data$CI_lower, test_data$CI_upper, "-")
    same <- same < 0
    same <- lower.tri(same, diag = FALSE) & same
    same_ind <- which(same, arr.ind = TRUE)

    groups <- as.list(as.numeric(rep(NA, nrow(test_data))))
    for (i in 1:nrow(same_ind)) {
      group_pos <- as.numeric(same_ind[i, ])
      for (i2 in group_pos) {
        groups[[i2]] <- c(groups[[i2]], i)
      }
    }

    letters_notation <- sapply(groups, function(x) {
      x <- x[!is.na(x)]
      x <- letters[x]
      x <- paste0(x, collapse = "")
      return(x)
    })

which gives this:

      mean CI_lower CI_upper letters
    1 1.48     1.29     1.56      ab
    2 1.59     1.38     1.84     acd
    3 1.81     1.54     2.30     bce
    4 1.94     1.62     2.59      de

Any ideas for how to do this?


Answer Source

Following David Arenburg's suggestion and this nice write-up (http://menugget.blogspot.it/2014/05/automated-determination-of-distribution.html), I found a solution.

```
library(igraph)

test_data <- data.frame(mean = c(1.48, 1.59, 1.81, 1.94),
                        CI_lower = c(1.29, 1.38, 1.54, 1.62),
                        CI_upper = c(1.56, 1.84, 2.3, 2.59))
n <- nrow(test_data)

# g[i, j] is TRUE when interval i starts at or after the end of interval j (no overlap)
g <- outer(test_data$CI_lower, test_data$CI_upper, "-")
g <- !(g < 0)
g <- g + t(g)  # symmetrise: 1 marks a non-overlapping pair, 0 an overlapping one
g <- g != 1    # TRUE where the two intervals overlap (diagonal included)
rownames(g) <- 1:n # change row names
colnames(g) <- 1:n # change column names

# Re-arrange data into an "edge list" for use in igraph (i.e. which groups are "connected") - Solution from "David Eisenstat"
same <- which(g == 1)
g2 <- data.frame(N1 = ((same - 1) %% n) + 1, N2 = ((same - 1) %/% n) + 1)
g2 <- g2[order(g2[[1]]), ]  # sort by first vertex so igraph's vertex ids match the row numbers
# simplify() gets rid of loops and duplicate edges
# (graph_from_data_frame() is the current name of graph.data.frame())
g3 <- simplify(graph_from_data_frame(g2, directed = FALSE))

# Calculate the maximal cliques - these are groupings where every node is connected to all others
cliq <- max_cliques(g3)  # current name of maximal.cliques() - Solution from "majom"
cliq2 <- lapply(cliq, as.numeric)

# Reorder by level order - Solution from "MrFlick"
ml <- max(sapply(cliq, length))
reord <- do.call(order, data.frame(
  do.call(rbind,
    lapply(cliq2, function(x) c(sort(x), rep.int(0, ml - length(x))))
  )
))
cliq <- cliq[reord]
cliq

# Generate labels to factor levels
lab.txt <- vector(mode = "list", n) # empty list
lab <- letters[seq(cliq)]           # clique labels
for (i in seq(cliq)) {              # loop to concatenate clique labels
  for (j in cliq[[i]]) {
    lab.txt[[j]] <- paste0(lab.txt[[j]], lab[i])
  }
}
```

```
unlist(lab.txt)
[1] "a" "ab" "ab" "b"
```
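For reference, here is a minimal base-R sketch of the same idea without igraph. It is not the method from the write-up above. It relies on a property of intervals: a maximal set of mutually overlapping intervals is always the set of intervals that contain the largest lower bound among its members. The function name `overlap_letters` is made up for illustration. The sketch assumes every interval has `lower < upper`, treats intervals that merely touch as non-overlapping (as the code above does), and runs out of labels beyond 26 groups.

```r
# Hypothetical helper (not from any package): letter notation for overlapping intervals.
overlap_letters <- function(lower, upper) {
  stopifnot(length(lower) == length(upper), all(lower < upper))
  n <- length(lower)

  # member[j, i] is TRUE when interval j contains the lower bound of interval i
  member <- outer(lower, lower, "<=") & outer(upper, lower, ">")

  # One candidate group per interval, with duplicates dropped
  groups <- unique(lapply(seq_len(n), function(i) which(member[, i])))

  # Keep only maximal groups: those contained in no candidate other than themselves
  is_maximal <- vapply(groups, function(g) {
    sum(vapply(groups, function(h) all(g %in% h), logical(1))) == 1
  }, logical(1))
  groups <- groups[is_maximal]

  # Assign letters in order of each group's smallest member
  groups <- groups[order(vapply(groups, min, integer(1)))]

  labels <- character(n)
  for (k in seq_along(groups)) {
    idx <- groups[[k]]
    labels[idx] <- paste0(labels[idx], letters[k])  # assumes at most 26 groups
  }
  labels
}

test_data <- data.frame(mean = c(1.48, 1.59, 1.81, 1.94),
                        CI_lower = c(1.29, 1.38, 1.54, 1.62),
                        CI_upper = c(1.56, 1.84, 2.3, 2.59))

# Gives c("a", "ab", "ab", "b") for the example data
test_data$letters <- with(test_data, overlap_letters(CI_lower, CI_upper))
test_data
```

This shortcut only works because the groups come from intervals on a line. For an arbitrary logical comparison matrix, the maximal-clique approach above is the general one.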
