MFR MFR - 28 days ago 15
R Question

Doesn't the grep function work with "("?

This is my dataset

  userId  source                               transactions
   (dbl)  (chr)                                (chr)
1      1  google / cpc, google / cpc           0, 1
2      2  (direct) / (none)                    0
3      3  (direct) / (none)                    1
4      4  google / organic, (direct) / (none)  0
5      5  google / organic                     0
6      6  google / organic                     0
6 6 google / organic 0


I want to extract all of the rows that contain (direct) / (none), so I wrote the following code:

output <- df[grep("(direct) / (none)", df$source), ]


But it returns an output with 0 observations, although it works well with other values such as google / cpc. What is wrong? Is the problem the "("?

This is the dput output:

dput(df)
structure(list(userId = c(1, 2, 3,
4, 5, 6, 7, 8,
9, 10), source = c("google / cpc, google / cpc",
"(direct) / (none)", "(direct) / (none)", "google / organic",
"google / organic", "google / organic", "(direct) / (none)",
"google / cpc, google / cpc, google / cpc, google / organic, google / cpc",
"(direct) / (none)", "(direct) / (none)"), transactions = c("0, 1",
"0", "1", "0", "0", "0", "0", "0, 0, 0, 0, 0", "0", "1")), .Names = c("userId",
"source", "transactions"), class = c("tbl_df", "data.frame"
), row.names = c(NA, -10L))

Answer

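( has a special meaning in regex: it opens a group, so the unescaped pattern looks for the text direct / none rather than for literal parentheses. You should either escape it as \\(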

grep("\\(direct\\) / \\(none\\)", df$source)

or use fixed = TRUE, which tells grep to treat the pattern as a literal string instead of a regular expression.

grep("(direct) / (none)", df$source, fixed = TRUE)
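
To see the difference side by side, here is a minimal sketch using a few values from the question's source column. The vector src, the small data frame df_demo, and the extra value "direct / none" are assumptions added only for illustration; they are not part of the original data.

```r
# A few values from the question, plus one hypothetical value
# ("direct / none") to show what the unescaped pattern really matches.
src <- c("google / cpc, google / cpc",
         "(direct) / (none)",
         "google / organic",
         "direct / none")

# Unescaped: ( and ) act as grouping operators, so the pattern
# matches the text "direct / none", not the parenthesised string.
unescaped <- grep("(direct) / (none)", src)

# Escaped: \\( and \\) match literal parentheses.
escaped <- grep("\\(direct\\) / \\(none\\)", src)

# fixed = TRUE: the whole pattern is matched as a literal string.
literal <- grep("(direct) / (none)", src, fixed = TRUE)

print(unescaped)  # 4
print(escaped)    # 2
print(literal)    # 2

# Subsetting rows the same way as in the question.
df_demo <- data.frame(userId = 1:4, source = src, stringsAsFactors = FALSE)
output <- df_demo[grep("(direct) / (none)", df_demo$source, fixed = TRUE), ]
print(output)
```

When the search text contains no wildcards at all, as here, fixed = TRUE is probably the simpler choice, since nothing needs escaping.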