Swiss12000 - 4 months ago
R Question

Dynamically subset a data frame to the columns after a column-name pattern in R

Input (df)

> df
  gender age LIST_12 LIST_24 LIST_42 anxious happy nervous
1     11  12      20      18      29      31     6      28
2     35  25      26      23       9      34    13      21
3     20   8      28      27      26      26    34      29
4     24  35      10      11      18      25    26       3
5     34   8       4       3      29      33    25      35


Desired output (dfSubset)

What would be the best way to get a subset containing only the columns that come after the LIST_ columns, through to the end? In this case I would like to keep only the anxious, happy and nervous columns.

  anxious happy nervous
1      31     6      28
2      34    13      21
3      26    34      29
4      25    26       3
5      33    25      35


Infos

I know that I can run the following code to select only the columns whose names contain LIST_, but that's not what I am looking for.

dfSubset = subset(x = df, select = grep("LIST_", names(df)))
dfSubset


Reproducible source

df <- structure(list(gender = c(11L, 35L, 20L, 24L, 34L), age = c(12L,
25L, 8L, 35L, 8L), LIST_12 = c(20L, 26L, 28L, 10L, 4L), LIST_24 = c(18L,
23L, 27L, 11L, 3L), LIST_42 = c(29L, 9L, 26L, 18L, 29L), anxious = c(31L,
34L, 26L, 25L, 33L), happy = c(6L, 13L, 34L, 26L, 25L), nervous = c(28L,
21L, 29L, 3L, 35L)), .Names = c("gender", "age", "LIST_12", "LIST_24",
"LIST_42", "anxious", "happy", "nervous"), class = "data.frame", row.names = c(NA,
-5L))

Answer

You could find the position of the last column whose name begins with LIST, add 1, and use that number as the start of a sequence that runs to the last column, ncol(df).

df[(max(grep("^LIST", names(df))) + 1):ncol(df)]
#   anxious happy nervous
# 1      31     6      28
# 2      34    13      21
# 3      26    34      29
# 4      25    26       3
# 5      33    25      35
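
One caveat with the one-liner: if a LIST column happens to be the last column, `(ncol(df) + 1):ncol(df)` counts downward and the subset fails with "undefined columns selected". Here is a minimal sketch of a more defensive variant. The helper name `cols_after_last` and its `pattern` argument are hypothetical, made up for illustration, and it assumes you want an error when no column matches at all.

```r
# Hypothetical helper (not part of the answer above): keep only the columns
# that come after the last column whose name matches `pattern`.
cols_after_last <- function(data, pattern = "^LIST_") {
  hits <- grep(pattern, names(data))   # positions of matching column names
  if (length(hits) == 0L) {
    stop("no column name matches pattern: ", pattern)
  }
  last <- max(hits)
  # seq_len() returns integer(0) when nothing follows the last match,
  # so we get a zero-column data frame instead of a descending sequence.
  keep <- last + seq_len(ncol(data) - last)
  data[keep]
}

# Same data as in the question
df <- data.frame(
  gender  = c(11L, 35L, 20L, 24L, 34L),
  age     = c(12L, 25L, 8L, 35L, 8L),
  LIST_12 = c(20L, 26L, 28L, 10L, 4L),
  LIST_24 = c(18L, 23L, 27L, 11L, 3L),
  LIST_42 = c(29L, 9L, 26L, 18L, 29L),
  anxious = c(31L, 34L, 26L, 25L, 33L),
  happy   = c(6L, 13L, 34L, 26L, 25L),
  nervous = c(28L, 21L, 29L, 3L, 35L)
)

names(cols_after_last(df))
# [1] "anxious" "happy"   "nervous"

# Edge case: LIST_42 is the last column, so nothing comes after it
ncol(cols_after_last(df[1:5]))
# [1] 0
```

Whether an empty result or an error is the better behaviour in the edge case depends on how the subset is used downstream, so treat the choice here as one option rather than the right one.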