Gin_Salmon - 2 months ago
R Question

Sub-setting data table using & and grepl

I've got a data set with column names like:

names(d)
[1] "Code" "LX(RI)" "LX(VO)" "LX(MV)" "LX(WC189)" "LX(WC035)"
[7] "NX(RI)" "NX(VO)" "NX(MV)" "NX(WC189)" "NX(WC035)" "AX(RI)"
[13] "AX(VO)" "AX(MV)" "AX(WC189)" "AX(WC035)" "SX3I(RI)" "SXI(VO)"
[19] "SXI(MV)" "TX(RI)" "TX(VO)" "TX(MV)" "TX(WC189)" "TX(WC035)"


Each column has several thousand rows associated with it. What I want to do is use grepl to subset the data table's columns, keeping those that end with RI and also retaining the Code column.

Currently I've worked out how to subset all the RI columns into a new data.table, but I can't figure out how to include the Code column.

I have currently:

RI <- d[, grepl("\\(RI", names(d)), with = FALSE]


Which gives me what I want:

names(RI)
[1] "LX(RI)" "NX(RI)" "AX(RI)" "SX3I(RI)" "TX(RI)"


I've been trying (note that I have included &Code):

RI <- d[, grepl("\\(RI&Code", names(d)), with = FALSE]


I want this to return a data table with the following columns:

[1] "LX(RI)" "NX(RI)" "AX(RI)" "SX3I(RI)" "TX(RI)" "Code"


The above is my desired output. However, the pattern matches no column names, so the code returns an empty data table.

A couple of questions:


  • Can I use & in grepl? If so, is my example using & incorrect?

  • If not, are there any suggestions on how to subset for both RI columns and the Code?


Answer

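Use | (regex alternation, meaning OR) in the pattern instead of &. In a regular expression & has no special meaning, so "\\(RI&Code" only matches names containing the literal text (RI&Code, which is why nothing is returned. Try this: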

ab <- c("Code","LX(RI)","LX(VO)","LX(MV)","TX(RI)","NX(RI)","NX(RI)")


ab[grepl("Code|RI",ab)]

[1] "Code"   "LX(RI)" "TX(RI)" "NX(RI)" "NX(RI)"
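
Applied to a data.table, the same idea might look like the sketch below. The toy table d and its values are made up for illustration; only the column names come from the question. The pattern is anchored (^Code$ and \\(RI\\)$) on the assumption that you want exactly the Code column plus names ending in (RI), rather than any name that happens to contain "Code" or "RI".

```r
library(data.table)

# Hypothetical toy table reusing a few column names from the question.
d <- data.table(
  Code     = c("a", "b"),
  `LX(RI)` = 1:2,
  `LX(VO)` = 3:4,
  `NX(RI)` = 5:6,
  `TX(MV)` = 7:8
)

# "&" is a literal character in a regex, so this pattern matches nothing.
any(grepl("\\(RI&Code", names(d)))
# [1] FALSE

# Keep a column if its name is exactly "Code" OR ends with "(RI)".
keep <- grepl("^Code$|\\(RI\\)$", names(d))
RI <- d[, keep, with = FALSE]
names(RI)
# [1] "Code"   "LX(RI)" "NX(RI)"

# Equivalent: combine two logical vectors with R's vectorised OR.
keep2 <- grepl("\\(RI\\)$", names(d)) | names(d) == "Code"
identical(keep, keep2)
# [1] TRUE
```

The columns come back in their original order, so Code is first rather than last; if the order matters, data.table's setcolorder() can move it afterwards.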