Mohere Mohere - 1 month ago 16
R Question

Iterate over a subset of column names

I am new to R. I have a data frame with multiple measurements for a couple of conditions. I would like to loop over the columns that belong to the same condition and test, row by row, whether at least two of the measurements are true measurements (not zero). If so, the mean of that condition's measurements should go into a new data frame; otherwise the value should be 0.

sample <- list(c(8,0,12,5,0,11), c(15,5,0,10,12,13), c(1,1,0,3,0,9),
               c(11,9,8,0,4,7), c(12,5,5,0,9,0), c(1,7,2,0,8,0))
sample <- as.data.frame(sample)
colnames(sample) <- c("x.1","x.2","x.3","y.1","y.2","y.3")


> sample
  x.1 x.2 x.3 y.1 y.2 y.3
1   8  15   1  11  12   1
2   0   5   1   9   5   7
3  12   0   0   8   5   2
4   5  10   3   0   0   0
5   0  12   0   4   9   8
6  11  13   9   7   0   0


My output dataset should ideally look like this:

> Newsample
   x y
1  8 8
2  2 7
3  0 5
4  6 0
5  0 7
6 11 0

Answer

We define a function, f_rowmean, that returns a row's mean when the row has at least two non-zero values and 0 otherwise:

f_rowmean <- function(y) apply(y,1, function(x) ifelse(sum(x!=0)>=2, mean(x), 0))
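
For larger data frames, the same rule can probably be written without the row-wise apply call. The following is only a sketch: f_rowmean_vec and the toy data frame are names made up for illustration, and it assumes all columns are numeric.

```r
# Vectorized variant of the rule (f_rowmean_vec is a hypothetical name for this sketch).
f_rowmean_vec <- function(y) {
  m <- as.matrix(y)                 # accepts a data frame or a matrix
  enough <- rowSums(m != 0) >= 2    # TRUE where a row has at least two non-zero values
  ifelse(enough, rowMeans(m), 0)    # the mean still includes the zeros, as in f_rowmean
}

# Toy input (invented): rows (0,5,1), (12,0,0), (5,10,3)
toy <- data.frame(a = c(0, 12, 5), b = c(5, 0, 10), c = c(1, 0, 3))
f_rowmean_vec(toy)
# values: 2 0 6
```

Note that, as in f_rowmean, zeros still count toward the mean, so the row (0, 5, 1) gives 2 rather than 3.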

And then:

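# "^x\\." matches only names that start with "x."; drop = FALSE keeps a data frame
# even if a condition has a single column
data.frame(x=f_rowmean(sample[,grep("^x\\.", names(sample)), drop=FALSE]),
           y=f_rowmean(sample[,grep("^y\\.", names(sample)), drop=FALSE]))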

#    x y
# 1  8 8
# 2  2 7
# 3  0 5
# 4  6 0
# 5  0 7
# 6 11 0
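
If there are many conditions, typing one grep call per condition gets tedious. A possible generalization is to derive the condition names from the column names themselves. This is a sketch that assumes every column is named `<condition>.<replicate number>`, as in the question; conds is a name introduced here for illustration.

```r
sample <- data.frame(
  x.1 = c(8, 0, 12, 5, 0, 11), x.2 = c(15, 5, 0, 10, 12, 13), x.3 = c(1, 1, 0, 3, 0, 9),
  y.1 = c(11, 9, 8, 0, 4, 7),  y.2 = c(12, 5, 5, 0, 9, 0),    y.3 = c(1, 7, 2, 0, 8, 0)
)

f_rowmean <- function(y) apply(y, 1, function(x) if (sum(x != 0) >= 2) mean(x) else 0)

# Condition = column name with the trailing ".<digits>" removed (assumed naming scheme).
conds <- sub("\\.[0-9]+$", "", names(sample))

# One result column per distinct condition; drop = FALSE keeps a one-column
# group two-dimensional so apply() still works.
Newsample <- as.data.frame(sapply(unique(conds), function(cond) {
  f_rowmean(sample[, conds == cond, drop = FALSE])
}))
Newsample
```

With the question's data this should reproduce the x and y columns shown above.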

EDIT

As for the OP's new problem statement (in the comments): if your data set is in df1, you could do:

res.cols <- c("CAOV-3 Reg", "CAOV-3 Mod", "OVCAR-3Reg", "OVCAR-4Reg", "VOA1056Reg", 
"VOA4698Reg", "VOA4698Mod", "TOV112DReg", "TOV112DMod", "TOV21G Mod", 
"HCC38 Reg", "HCC38 Mod")

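# sapply returns a matrix with one column per entry of res.cols;
# fixed = TRUE matches each name literally rather than as a regular expression
res <- sapply(res.cols, function(x) f_rowmean(df1[, grep(x, names(df1), fixed = TRUE), drop = FALSE]))
res <- as.data.frame(res)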
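
Since df1 is not shown here, the following sketch uses an invented stand-in to illustrate the idea. The replicate suffixes (.1, .2, .3) and all values are assumptions, and only two of the condition names are used.

```r
f_rowmean <- function(y) apply(y, 1, function(x) if (sum(x != 0) >= 2) mean(x) else 0)

# Hypothetical stand-in for df1: two conditions, three replicates each.
# check.names = FALSE keeps the spaces in the column names.
df1 <- data.frame(
  "HCC38 Reg.1" = c(4, 0), "HCC38 Reg.2" = c(2, 0), "HCC38 Reg.3" = c(0, 9),
  "HCC38 Mod.1" = c(1, 3), "HCC38 Mod.2" = c(1, 3), "HCC38 Mod.3" = c(1, 3),
  check.names = FALSE
)
res.cols <- c("HCC38 Reg", "HCC38 Mod")

res <- sapply(res.cols, function(x) {
  # select the columns whose name contains the condition name literally
  f_rowmean(df1[, grep(x, names(df1), fixed = TRUE), drop = FALSE])
})
res <- as.data.frame(res)  # sapply returned a matrix; convert for data frame use
res
```

In this toy data the second row of "HCC38 Reg" has only one non-zero value, so its result is 0.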