Diggy Detroit - 15 days ago
R Question

subset dataset based on multiple column name criteria

I want to know how to subset a dataframe based on multiple column name criteria. My dataframe has the columns shown below.

id team_col_code1 team_col_code2 team_col_code3......team_col_code78 Gender State team_cost_code1 team_cost_code2 team_cost_code3......team_cost_code43


I am trying to subset this dataframe so that the new dataset contains only the columns whose names contain the word col, plus the id and Gender columns.

I am able to create a subset based on column names containing the keyword col, as shown below:

new_Df <- df[, grep("col", colnames(df))]


I am not sure how to include the other two columns, id and Gender, in this subset so that the new dataset looks like the one below.

Expected output

id team_col_code1 team_col_code2 team_col_code3......team_col_code78 Gender


Any help is much appreciated. Thanks.

Answer

It can be as straightforward as selecting the columns by name:

df[c("id", grep("col", names(df), value = TRUE), "Gender")]
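
To see the one-liner above in action, here is a minimal sketch on a toy data frame. The column names mimic the question, but the values and the reduced number of columns are made up for illustration. The second selection, using a single anchored pattern with grepl, is an alternative that is not part of the original answer.

```r
# Toy data frame: names follow the question's layout, values are invented.
df <- data.frame(
  id = 1:3,
  team_col_code1 = c(10, 20, 30),
  team_col_code2 = c(11, 21, 31),
  Gender = c("F", "M", "F"),
  State = c("MI", "OH", "IL"),
  team_cost_code1 = c(100, 200, 300)
)

# The answer's approach: build a character vector of column names.
# value = TRUE makes grep() return the matching names instead of positions,
# so they can be combined with "id" and "Gender" in the order you want.
new_df <- df[c("id", grep("col", names(df), value = TRUE), "Gender")]

# Alternative (an assumption, not from the answer): one pattern for all three.
# ^id$ and ^Gender$ are anchored so they match only those exact names.
# Columns are returned in their original order, not a chosen order.
alt_df <- df[grepl("^id$|col|^Gender$", names(df))]

names(new_df)
names(alt_df)
```

The name-vector form lets you control the column order explicitly, while the single-pattern form keeps whatever order the columns already have. With the layout in the question, both should give id, the team_col_code columns, then Gender.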