Bax Baxov - 1 year ago
R Question

Match patterns in a vector against strings in a data frame

I have a data frame with two columns and a vector of names.
How can I select the rows of the data frame whose name matches one of the strings in the vector?

name = c("p4@HPS1", "p7@HPS2", "p4@HPS3", "p7@HPS4", "p7@HPS5", "p9@HPS6", "p11@HPS7", "p10@HPS8", "p15@HPS9")
expression = c(118.84, 90.04, 106.6, 104.99, 93.2, 66.84, 90.02, 108.03, 111.83)
dataset <- data.frame(name, expression)
nam <- c("HPS5", "HPS6", "HPS9", "HPS2")

The function should return a data frame containing only the specified rows.
I tried

but it didn't work

Answer Source

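We can paste the elements of 'nam' together with collapse = "|" to build a single alternation pattern, pass it as the pattern argument to grep to get the indices of the matching rows, and use those indices to subset the 'dataset'.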

dataset[grep(paste(nam, collapse="|"), dataset$name),]
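To make the steps visible, here is a minimal self-contained sketch of the same approach. It assumes 'dataset' was built with data.frame(name, expression); the names 'pattern', 'idx' and 'result' are only illustrative.

```r
# Rebuild the example data (assumed construction of 'dataset')
name <- c("p4@HPS1", "p7@HPS2", "p4@HPS3", "p7@HPS4", "p7@HPS5",
          "p9@HPS6", "p11@HPS7", "p10@HPS8", "p15@HPS9")
expression <- c(118.84, 90.04, 106.6, 104.99, 93.2, 66.84, 90.02, 108.03, 111.83)
dataset <- data.frame(name, expression)
nam <- c("HPS5", "HPS6", "HPS9", "HPS2")

# Collapse the names into one regular expression with alternation
pattern <- paste(nam, collapse = "|")   # "HPS5|HPS6|HPS9|HPS2"

# grep returns the positions of the elements of 'name' that match
idx <- grep(pattern, dataset$name)      # 2 5 6 9

# Subset the rows by those positions
result <- dataset[idx, ]
```

Note that the rows come back in the order they appear in 'dataset', not in the order of 'nam'. The match is also a substring match, so a pattern such as "HPS1" would match a name ending in "HPS10" as well; if that matters for your data, the pattern could be anchored, for example with a trailing "$".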

If we are using the OP's code, wrap the 'name' column inside a list. Otherwise mapply will step through the individual elements of 'name' in parallel with those of 'nam', and because the two do not have the same number of elements, this will throw a warning that the longer argument is not a multiple of the length of the shorter. With the list wrapper, each pattern in 'nam' is checked against the whole 'name' column, and mapply returns a logical matrix with one column per pattern. We take the rowSums of that matrix and check whether it is greater than 0 to get a logical vector for subsetting the rows.

dataset[rowSums(mapply(grepl, nam, list(dataset$name)))>0,]
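The intermediate objects can be inspected with a short sketch like the one below. It again assumes 'dataset' was built with data.frame(name, expression); 'm', 'keep' and 'result' are illustrative names, not part of the OP's code.

```r
# Rebuild the example data (assumed construction of 'dataset')
name <- c("p4@HPS1", "p7@HPS2", "p4@HPS3", "p7@HPS4", "p7@HPS5",
          "p9@HPS6", "p11@HPS7", "p10@HPS8", "p15@HPS9")
expression <- c(118.84, 90.04, 106.6, 104.99, 93.2, 66.84, 90.02, 108.03, 111.83)
dataset <- data.frame(name, expression)
nam <- c("HPS5", "HPS6", "HPS9", "HPS2")

# One grepl call per pattern; list() makes mapply reuse the whole
# 'name' column for every pattern instead of pairing element by element
m <- mapply(grepl, nam, list(dataset$name))
dim(m)   # 9 rows (one per row of 'dataset'), 4 columns (one per pattern)

# A row is kept if at least one pattern matched it
keep <- rowSums(m) > 0
result <- dataset[keep, ]
```

Since only one argument varies here, sapply(nam, grepl, dataset$name) should build the same matrix without the list wrapper.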
