kin182 - 2 years ago
R Question

How to extract columns with same name but different identifiers in R

Sorry if it is too basic, but I am not familiar with R.

I have a data frame in which several columns share the same name, so when it was imported into R, numeric suffixes (.1, .2, ...) were appended to make the names unique. Something like this:

A = c(2, 3, 5)
A.1 = c('aa', 'bb', 'cc')
A.2 = c(TRUE, FALSE, TRUE)
B = c(1, 2, 5)
B.1 = c('bb', 'cc', 'dd')
B.2 = c(TRUE, TRUE, TRUE)

df = data.frame(A, A.1, A.2, B, B.1, B.2)

  A A.1   A.2 B B.1  B.2
1 2  aa  TRUE 1  bb TRUE
2 3  bb FALSE 2  cc TRUE
3 5  cc  TRUE 5  dd TRUE

I would like to extract all columns that have the name A, regardless of the suffix, so that the result looks like this:

  A A.1   A.2
1 2  aa  TRUE
2 3  bb FALSE
3 5  cc  TRUE

I know we can do this:

df2 = df[, c("A", "A.1", "A.2")]

But I have many columns of this type, so I do not want to type each name individually. I am sure there are smarter ways to do this.


42-
Answer Source

Try this to get all the columns with names starting with "A"

df2 = df[, grepl("^A", names(df))]
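As a minimal sketch of what that line does, the snippet below rebuilds the example data frame (the A.2 and B.2 values are taken from the printed output in the question) and shows the logical vector that grepl() produces before it is used for indexing. The drop = FALSE argument is an addition not present in the one-liner; it is there so the result stays a data frame even if only one column matches.

```r
# Rebuild the example data frame from the question
df <- data.frame(
  A   = c(2, 3, 5),
  A.1 = c("aa", "bb", "cc"),
  A.2 = c(TRUE, FALSE, TRUE),
  B   = c(1, 2, 5),
  B.1 = c("bb", "cc", "dd"),
  B.2 = c(TRUE, TRUE, TRUE)
)

# grepl() returns one TRUE/FALSE per column name
keep <- grepl("^A", names(df))
print(keep)        # TRUE TRUE TRUE FALSE FALSE FALSE

# A logical vector in the column position keeps only the TRUE columns
df2 <- df[, keep, drop = FALSE]
print(names(df2))  # "A" "A.1" "A.2"
```

Note that "^A" also matches any other name that merely starts with A (for example a hypothetical column called Age). If that matters, a stricter pattern such as "^A(\\.[0-9]+)?$" should match only A itself and A followed by a numeric suffix.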

R's extraction function '[' accepts a logical vector in the column position when it is called with two arguments, so the result of grepl() can be used to select columns directly. You will find the regex functions in R very useful; I recommend reading ?regex, as well as looking for examples by G. Grothendieck on SO and in the R-help archives.
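Since the question mentions many such groups of columns, one possible extension is to recover the base name of every column with sub() and split the data frame by it, rather than writing one pattern per name. This is only a sketch: the variable names base and groups are made up for illustration, and it assumes that a trailing ".<digits>" is always a suffix added on import and never part of a genuine column name.

```r
# Rebuild the example data frame from the question
df <- data.frame(
  A   = c(2, 3, 5),
  A.1 = c("aa", "bb", "cc"),
  A.2 = c(TRUE, FALSE, TRUE),
  B   = c(1, 2, 5),
  B.1 = c("bb", "cc", "dd"),
  B.2 = c(TRUE, TRUE, TRUE)
)

# Strip a trailing ".<digits>" to recover the original column name
base <- sub("\\.[0-9]+$", "", names(df))

# Group the column positions by base name, then extract each group
groups <- lapply(
  split(seq_along(df), base),
  function(i) df[, i, drop = FALSE]
)

print(names(groups))    # "A" "B"
print(names(groups$A))  # "A" "A.1" "A.2"
```

Each element of groups is then a data frame holding one family of columns, so groups$A corresponds to the df2 requested in the question.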
