Elb - 3 months ago
R Question

Subset character columns from a data frame of characters and numbers

I have a data frame composed of numeric and non-numeric columns.

I would like to extract (subset) only the non-numeric columns, i.e. the character ones. I was able to subset the numeric columns using:

sub_num = x[sapply(x, is.numeric)]

but I'm not able to do the opposite using is.character. Can anyone help me?

Answer

OK, I quickly tested my idea and can confirm that the following snippet works:

str(d)
 'data.frame':  5 obs. of  3 variables:
  $ a: int  1 2 3 4 5
  $ b: chr  "a" "a" "a" "a" ...
  $ c: Factor w/ 1 level "b": 1 1 1 1 1


# Get all character columns
# (drop = FALSE keeps a data frame even if only one column matches)
d[, sapply(d, class) == 'character', drop = FALSE]

# Or, for factors, which is the more likely case:
d[, sapply(d, class) == 'factor', drop = FALSE]

# To get both factors and characters, use
d[, sapply(d, class) %in% c('character', 'factor'), drop = FALSE]

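Using the correct class, your sapply approach should work as well: replace is.numeric with is.character, or with is.factor if the columns are actually factors. The comma before sapply is optional; without it the data frame is indexed like a list, and the result is always a data frame.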
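As a minimal sketch of that predicate-based variant, assuming a data frame shaped like d above (it is rebuilt here so the example is self-contained; the helper is_chr_or_fct is a name made up for illustration):

```r
# Rebuild a data frame like the one shown in str(d) above
d <- data.frame(a = 1:5,
                b = rep("a", 5),
                c = factor(rep("b", 5)),
                stringsAsFactors = FALSE)

# List-style indexing (no comma) always returns a data frame
chr_cols <- d[sapply(d, is.character)]
fct_cols <- d[sapply(d, is.factor)]

# Characters and factors together (hypothetical helper)
is_chr_or_fct <- function(col) is.character(col) || is.factor(col)
both_cols <- d[sapply(d, is_chr_or_fct)]

names(chr_cols)   # "b"
names(fct_cols)   # "c"
names(both_cols)  # "b" "c"
```

If is.character selects nothing on your own data, it is worth checking str(x) first: the text columns may have been read in as factors.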

The approach using !is.numeric breaks down as soon as the data frame contains classes other than numeric, factor, or character (one I use very often is POSIXct, for example), because those columns are selected as well.
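The following sketch illustrates the point with a made-up data frame d2 that contains a POSIXct column; the column names and values are assumptions chosen only for the example:

```r
# Hypothetical data frame with a date-time column
d2 <- data.frame(a = 1:3,
                 b = c("x", "y", "z"),
                 t = as.POSIXct(c("2020-01-01", "2020-01-02", "2020-01-03"),
                                tz = "UTC"),
                 stringsAsFactors = FALSE)

# Negating is.numeric keeps every non-numeric column, including t
non_num <- d2[!sapply(d2, is.numeric)]
names(non_num)      # "b" "t"

# Testing explicitly for the wanted classes leaves the date-time column out
wanted <- vapply(d2, function(col) is.character(col) || is.factor(col),
                 logical(1))
names(d2[wanted])   # "b"

# POSIXct columns carry two classes, so sapply(d2, class) is a list here
is.list(sapply(d2, class))  # TRUE
```

Because a column can have more than one class, testing with predicates such as is.character is likely to be more robust than comparing the result of class() against a string.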