tluh - 4 months ago
R Question

Subset R data.frame by index and name in one line

Sample data.frame:

df <- structure(list(a = c(1, 2, 3), b = c(4, 5, 6), c = c(7, 8, 9)), .Names = c("a", "b", "c"), row.names = c(NA, -3L), class = "data.frame")


Output:

df
# a b c
# 1 1 4 7
# 2 2 5 8
# 3 3 6 9


I'd like to get the first and third columns, selecting one by name and the other by column index in the same call.

df[, "a"]
# [1] 1 2 3

df[, 3]
# [1] 7 8 9

df[, c("a", 3)]
# Error in `[.data.frame`(df, , c("a", 3)) : undefined columns selected

df[, c(match("a", names(df)), 3)]
# a c
# 1 1 7
# 2 2 8
# 3 3 9


Are there functions or packages that allow for clean/simple syntax, as in the third example, while also achieving the result of the fourth example?

Answer

Maybe use dplyr?

For interactive use, i.e., if you know the name of the column ahead of time:

library(dplyr)
df %>% select(a, 3)

If you do not know the column name in advance and want to pass it in as a variable:

x <- names(df)[1]
x
# [1] "a"

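# all_of() selects the columns named in a character vector;
# it replaces select_(x, 3), which is deprecated in current dplyr
df %>% select(all_of(x), 3)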

Either way the output is

#  a c
#1 1 7
#2 2 8
#3 3 9
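
If you would rather stay in base R, here is a minimal sketch. The third example in the question fails because c("a", 3) coerces everything to character, so R looks for a column literally named "3". Passing the selectors separately keeps their types intact. Note that col_positions is a hypothetical helper written for this answer, not a function from any package.

```r
# Sample data from the question.
df <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6), c = c(7, 8, 9))

# c() coerces mixed inputs to a single type, here character: "a" and "3".
mixed_vector <- c("a", 3)

# Hypothetical helper (an assumption for illustration, not a library function):
# takes column names and column indices as separate arguments and returns
# the matching integer positions.
col_positions <- function(data, ...) {
  selectors <- list(...)  # a list keeps each selector's original type
  positions <- lapply(selectors, function(s) {
    if (is.character(s)) match(s, names(data)) else as.integer(s)
  })
  positions <- unlist(positions)
  # match() returns NA for names that are not present in the data.
  if (anyNA(positions)) stop("undefined columns selected")
  positions
}

result <- df[, col_positions(df, "a", 3)]
result
#   a c
# 1 1 7
# 2 2 8
# 3 3 9
```

This is just the match() workaround from the question wrapped in a function, so it adds no dependency, but it does not support the other selection helpers that dplyr offers.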