RCN - 11 months ago
R Question

numeric column names in R

I have a data frame as follows:

structure(list(`104` = c(NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, "yes", NA, NA, NA, NA), `15` = c(NA,
NA, NA, NA, ">= 4.0", ">= 4.0", NA, "~ 2", "~ 2", "~ 2", "~ 2",
"~ 2", "~ 2", "< 2.2", "~2.75", NA, "~2.75", "~2.75", "~2.75",
"~2.75")), .Names = c("104", "15"), row.names = 45:64, class = "data.frame")

I know that it is not best practice to have numeric column names; however, it is necessary in this circumstance. I have been manipulating my data frame by retrieving columns with backticks (`).

Unfortunately, I found something funny in the above data frame.

> table(testtest$`10`)


However, there is no column named 10, so it looks like it is retrieving the same column as

> table(testtest$`104`)


I am nervous now, and worry that this may pop up again without my knowing for other columns such as

Any explanation would be helpful!

Answer Source

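This is due to partial matching: $ accepts any unique prefix of a column name, and "10" is a unique prefix of "104". To avoid it, use [[ to extract the columns, since [[ matches names exactly by default. With [[, the incomplete name "10" returns NULL,

while the correct column name, "104", gives the output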

 #[1] NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA  
 #[12] NA    NA    NA    NA    "yes" NA    NA    NA    NA 
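
To make the difference concrete, here is a minimal sketch. It assumes a shortened stand-in for the question's data frame (same column names, only three illustrative rows), so the values are not the original data.

```r
# Shortened stand-in for the data frame in the question
# (same column names, illustrative values only).
testtest <- data.frame(`104` = c(NA, "yes", NA),
                       `15`  = c(">= 4.0", NA, "~ 2"),
                       check.names = FALSE,
                       stringsAsFactors = FALSE)

# `$` falls back to partial matching: "10" is a unique prefix of "104",
# so this silently returns the `104` column.
partial <- testtest$`10`

# `[[` matches exactly by default, so a name that does not exist gives NULL.
exact_missing <- testtest[["10"]]
exact_hit     <- testtest[["104"]]

identical(partial, exact_hit)  # TRUE
is.null(exact_missing)         # TRUE
```

Note that a prefix only matches when it is unique: "1" is a prefix of both "104" and "15", so it would not pick either column.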

According to ?"$":

Both [[ and $ select a single element of the list. The main difference is that $ does not allow computed indices, whereas [[ does. x$name is equivalent to x[["name", exact = FALSE]]. Also, the partial matching behavior of [[ can be controlled using the exact argument.
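
As a rough illustration of the exact argument described above, the sketch below again assumes a shortened stand-in for the question's data frame. The warnPartialMatchDollar option is a base R setting that is not mentioned in the original answer; it is included here as one possible way to be told when $ matches partially.

```r
# Shortened stand-in for the data frame in the question
# (same column names, illustrative values only).
testtest <- data.frame(`104` = c(NA, "yes", NA),
                       `15`  = c(">= 4.0", NA, "~ 2"),
                       check.names = FALSE,
                       stringsAsFactors = FALSE)

# exact = FALSE reproduces what `$` does: "10" partially matches "104".
loose <- testtest[["10", exact = FALSE]]

# exact = TRUE is the default for `[[`: no column is named "10".
strict <- testtest[["10", exact = TRUE]]

# Optionally ask R to warn whenever `$` resolves a name by partial matching.
old <- options(warnPartialMatchDollar = TRUE)
warned <- tryCatch({ testtest$`10`; FALSE }, warning = function(w) TRUE)
options(old)  # restore the previous setting
```

Here loose holds the `104` column and strict is NULL; warned records whether the partial $ lookup raised a warning under that option.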

In general, it is better not to have numeric column names or names that start with numbers. We can prepend a non-numeric character "X" with the convenient function make.names

names(testtest) <- make.names(names(testtest))
names(testtest)
#[1] "X104" "X15"
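
One caveat worth sketching: renaming makes the names syntactically valid, but it does not switch off partial matching, so $ with a prefix of a renamed column still matches. The example assumes the same shortened stand-in data frame as an illustration, not the original data.

```r
# Shortened stand-in for the data frame in the question
# (same column names, illustrative values only).
testtest <- data.frame(`104` = c(NA, "yes", NA),
                       `15`  = c(">= 4.0", NA, "~ 2"),
                       check.names = FALSE,
                       stringsAsFactors = FALSE)

# Make the names syntactically valid by prefixing "X".
names(testtest) <- make.names(names(testtest))
new_names <- names(testtest)

# Backticks are no longer needed, but `$` still matches partially:
# X10 is a unique prefix of X104.
still_partial <- identical(testtest$X10, testtest$X104)

# `[[` remains the exact lookup.
exact_missing <- testtest[["X10"]]
```

So even after renaming, [[ is the safer choice whenever one column name is a prefix of another.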