agenis - 11 months ago
R Question

test for existing row.names and col.names in data.frame

Is there a function to determine whether a data.frame has native row names and column names, or only automatically generated ones (1 2 3 4...)? For column names, 'automatically' means, for instance, the names you get when you apply "" to a matrix.

For the row names, I figured out a workaround:

has.row.names = function(df) {
  !all(row.names(df) == seq_len(nrow(df)))
}

However, for the column names I don't see how to do it. The difficulty is that automatic column names sometimes start with V (V1, V2, ...) and sometimes with X (X1, X2, ...).

EDIT: Why I ask this question: I need to perform this test inside a more complex function (somewhat similar to the graphical output of a PCA) that will plot the row names and column names if they exist, and otherwise create more suitable new names.
So it should work for "any" data.frame, with no prior knowledge of the actual names.


Answer Source

Short version: The only time a data frame would not have column names is when the attribute "names" is NULL. So the simple way to check for the existence of column names in a data frame would be something like the following.

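DFHasColNames <- function(x) {
    !is.null(attr(x, "names"))
}
DFHasColNames(mtcars)          # "names" attribute is set
# [1] TRUE
DFHasColNames(unname(mtcars))  # "names" attribute removed
# [1] FALSE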

Extended version: For row names, you can use .row_names_info(). With the default type = 1L, a negative sign indicates the row names were generated automatically.

.row_names_info(mtcars)
# [1] 32   # row names were provided
.row_names_info(iris)
# [1] -150 # row names were generated automatically

You can also view other information by changing the type argument.

type integer. Currently type = 0 returns the internal "row.names" attribute (possibly NULL), type = 2 the number of rows implied by the attribute, and type = 1 the latter with a negative sign for ‘automatic’ row names.

.row_names_info(mtcars, type = 0)
## ... returns attr(mtcars, "row.names")
.row_names_info(iris, type = 0)
## [1]   NA -150
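
As a rough sketch of how this could replace the workaround from the question, the sign of .row_names_info() can be wrapped in a small helper. The name has_row_names is made up for illustration and is not part of base R; the helper only relies on the documented behaviour that type = 1L is negative for automatic row names.

```r
# Hypothetical helper (not part of base R): TRUE when the data frame
# carries explicitly stored row names, FALSE when they are automatic.
has_row_names <- function(df) {
  stopifnot(is.data.frame(df))
  # With the default type = 1L, a negative value marks automatic row names;
  # a data frame with zero rows gives 0, which also counts as "no names".
  .row_names_info(df) > 0L
}

has_row_names(mtcars)  # mtcars stores car models as row names
has_row_names(iris)    # iris has automatic row names
```

With the built-in data sets this gives TRUE for mtcars and FALSE for iris, without comparing every row name against a sequence.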

For column names, it's not so easy. Generally speaking, if you see all NA values for the column names, or names(x) returns NULL, the "names" attribute of x is not set and therefore x has no (column) names.

Otherwise, a prepended X usually means the names came from make.names(), which is used by data.frame() and read.table(), read.csv() and others.

m <- matrix(1:6, 2)
make.names(1:3)
# [1] "X1" "X2" "X3"
data.frame(m)
#   X1 X2 X3
# 1  1  3  5
# 2  2  4  6

whereas you generally get a prepended V from as.data.frame():

as.data.frame(m)
#   V1 V2 V3
# 1  1  3  5
# 2  2  4  6

However, this is not a rule. It depends on the class of the object you're passing in, and on whether you have changed any of the default arguments. The best thing to do would be to sift through the many methods to see if you can discover a pattern.
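
Since there is no reliable marker for automatic column names, one option is a heuristic that only covers the two patterns shown above. The sketch below is an assumption-laden illustration, not an established API: has_default_col_names is a made-up name, and it treats the names as automatic only when they are missing or are exactly V1..Vn or X1..Xn.

```r
# Hypothetical heuristic (not part of base R): TRUE when the column names
# are absent or look exactly like the defaults V1..Vn or X1..Xn.
has_default_col_names <- function(df) {
  nm <- names(df)
  if (is.null(nm)) return(TRUE)  # "names" attribute not set at all
  idx <- seq_along(nm)
  identical(nm, paste0("V", idx)) || identical(nm, paste0("X", idx))
}

m <- matrix(1:6, 2)
has_default_col_names(as.data.frame(m))  # names are V1 V2 V3
has_default_col_names(data.frame(m))     # names are X1 X2 X3
has_default_col_names(mtcars)            # names are mpg, cyl, ...
```

The first two calls return TRUE and the last returns FALSE. A data frame whose author deliberately named the columns X1, X2, ... would be misclassified, so this is only a best guess.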