arielle arielle - 1 month ago 9
R Question

R: Calculating row mean based on column name partial match

I have a table that looks like this:

  er er.1 as as.1 as.2 rt op
a  1    6 90    8    6  4 87
b  1    8 56    7    5  5  9
c  8    7  6    4    5  9  6
d  1    0  8    6    4  3  6
e  9    7  2    4    3 89  7


I would like to calculate the row mean across the columns whose names partially match (i.e. share the same prefix), to give a result like this:

   er          as rt op
a 3.5 34.66666667  4 87
b 4.5 22.66666667  5  9
c 7.5  5           9  6
d 0.5  6           3  6
e 8    3          89  7


I did find some useful tips on this question:

Calculate row means based on (partial) matching column names

but it does not seem to be working for me. Here are the commands that I used:

test <- read.table("test.txt", header=TRUE, row.names=1)

colnames <- c("er", "er", "as", "as", "as", "rt", "op")

means <-sapply(colnames, function(x) rowMeans(test [, grep(x, names(test))] ) )


This last command gives me the following error:

Error in rowMeans(test[, grep(x, names(test))]) :
'x' must be an array of at least two dimensions


Here is the dput of my data frame:

structure(list(er = c(1L, 1L, 8L, 1L, 9L), er.1 = c(6L, 8L, 7L,
0L, 7L), as = c(90L, 56L, 6L, 8L, 2L), as.1 = c(8L, 7L, 4L, 6L,
4L), as.2 = c(6L, 5L, 5L, 4L, 3L), rt = c(4L, 5L, 9L, 3L, 89L
), op = c(87L, 9L, 6L, 6L, 7L)), .Names = c("er", "er.1", "as",
"as.1", "as.2", "rt", "op"), class = "data.frame", row.names = c("a",
"b", "c", "d", "e"))


Any idea why I am getting this error and how I could fix this?

Thank you!

Answer

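The error occurs because "rt" and "op" each match only one column, so test[, grep(x, names(test))] is dropped to a plain vector, and rowMeans needs an object with at least two dimensions. Instead, we can split the columns by name prefix with split.default and get the rowMeans of each group (df1 is the data frame from the dput above):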

sapply(split.default(df1, sub("\\..*", "", names(df1))), rowMeans)
#        as  er op rt
#a 34.66667 3.5 87  4
#b 22.66667 4.5  9  5
#c  5.00000 7.5  6  9
#d  6.00000 0.5  6  3
#e  3.00000 8.0  7 89
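
If you would rather keep the sapply approach from the question, a minimal sketch of a fix is to pass drop = FALSE so that a single-column selection stays a data frame. The names prefix, groups and means are just illustrative, and df1 is rebuilt here from the dput so that the snippet runs on its own. Matching on the exact prefix instead of grep also avoids accidental partial matches, and unique() keeps the columns in their original order rather than the alphabetical order that split gives.

```r
# Data frame from the question's dput (called df1 here, as in the answer)
df1 <- data.frame(
  er = c(1L, 1L, 8L, 1L, 9L), er.1 = c(6L, 8L, 7L, 0L, 7L),
  as = c(90L, 56L, 6L, 8L, 2L), as.1 = c(8L, 7L, 4L, 6L, 4L),
  as.2 = c(6L, 5L, 5L, 4L, 3L), rt = c(4L, 5L, 9L, 3L, 89L),
  op = c(87L, 9L, 6L, 6L, 7L),
  row.names = c("a", "b", "c", "d", "e")
)

# Strip the ".1", ".2" suffixes to get one prefix per column
prefix <- sub("\\..*", "", names(df1))

# unique() keeps first-appearance order: er, as, rt, op
groups <- unique(prefix)

# drop = FALSE keeps a one-column selection as a data frame,
# so rowMeans() still receives a two-dimensional object
means <- sapply(groups, function(g) rowMeans(df1[, prefix == g, drop = FALSE]))
means
```

The result is a numeric matrix with rows a to e and columns er, as, rt, op; wrap it in as.data.frame() if a data frame is needed.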