RCN RCN - 3 months ago 11
R Question

numeric column names in R

I have a data frame as follows:

structure(list(`104` = c(NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, "yes", NA, NA, NA, NA), `15` = c(NA,
NA, NA, NA, ">= 4.0", ">= 4.0", NA, "~ 2", "~ 2", "~ 2", "~ 2",
"~ 2", "~ 2", "< 2.2", "~2.75", NA, "~2.75", "~2.75", "~2.75",
"~2.75")), .Names = c("104", "15"), row.names = 45:64, class = "data.frame")


I know that numeric column names are not best practice, but they are necessary in this case. I have been manipulating my data frame by retrieving columns with backticks (`).

Unfortunately, I found something funny in the above data frame.

> table(testtest$`10`)

yes
1
>


However, there is no column named 10, so it looks like it is actually retrieving column 104:

> table(testtest$`104`)

yes
1
>


I am nervous now: I worry that this could happen again without my noticing for other columns, such as 41 and 4100.

Any explanation would be helpful!
Thanks

Answer

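This is due to partial matching: when there is no exact match, $ falls back to a partial match on the column name, and "10" is a unique prefix of "104". To avoid it, use [[ to extract the columns, since it matches exactly by default: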

testtest[["10"]]
#NULL

while the correct column name gives the output

 testtest[["104"]]
 #[1] NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA  
 #[12] NA    NA    NA    NA    "yes" NA    NA    NA    NA 

According to ?"$"

Both [[ and $ select a single element of the list. The main difference is that $ does not allow computed indices, whereas [[ does. x$name is equivalent to x[["name", exact = FALSE]]. Also, the partial matching behavior of [[ can be controlled using the exact argument.
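
To make the quoted behavior concrete for the columns the question worries about, here is a minimal sketch. The data frame `df` and its values are made up for illustration; only the column names "41" and "4100" come from the question. The `warnPartialMatchDollar` option is a standard R option, but whether it warns for data frames may depend on the R version, so treat that last line as an optional safeguard rather than a guarantee.

```r
# Hypothetical data frame using the column names mentioned in the question
df <- data.frame(a = 1:3, b = 4:6, c = 7:9)
names(df) <- c("41", "4100", "104")

# `$` tries an exact match first, then falls back to partial matching
exact_41  <- df$`41`    # exact match on "41"
partial   <- df$`410`   # unique prefix of "4100", so that column is returned
ambiguous <- df$`4`     # prefix of both "41" and "4100", so NULL

# `[[` matches exactly unless told otherwise
strict <- df[["410"]]                 # NULL
loose  <- df[["410", exact = FALSE]]  # behaves like df$`410`

# Optional: ask R to warn when `$` resolves a name by partial matching
options(warnPartialMatchDollar = TRUE)
```

So an exact name such as 41 is always safe with $; the risk is a mistyped or shortened name silently resolving to a longer column name.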


In general, it is better not to have numeric column names or names that start with numbers. We can prepend a non-numeric character ("X") with the convenient function make.names:

names(testtest) <- make.names(names(testtest))
names(testtest)
#[1] "X104" "X15" 
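
Note that renaming makes the names syntactically valid but does not by itself switch off partial matching. A small sketch, using a made-up two-column data frame `df` with the same column names as in the question:

```r
# Hypothetical stand-in for the data frame in the question
df <- data.frame(x = 1:2, y = 3:4)
names(df) <- c("104", "15")
names(df) <- make.names(names(df))   # names become "X104" and "X15"

via_dollar   <- df$X10       # still partially matches "X104"
via_brackets <- df[["X10"]]  # NULL, because [[ matches exactly
```

For that reason, [[ remains the safer accessor even after the columns are renamed.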