R Question

dplyr join warning: joining factors with different levels

When using a join function from the dplyr package, I get this warning:

Warning message:
In left_join_impl(x, y, by$x, by$y) :
joining factors with different levels, coercing to character vector

There is not a lot of information online about this. Any idea what it could be? Thanks!

Answer Source

That's not an error, it's a warning. It's telling you that one of the columns you used in your join was a factor, and that the factor had different levels in the two datasets. So as not to lose any information, the factors were converted to character values. For example:


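library(dplyr)

# stringsAsFactors = TRUE is needed in R >= 4.0, where strings are
# no longer converted to factors by default
x <- data.frame(a = letters[1:7], stringsAsFactors = TRUE)
y <- data.frame(a = letters[4:10], stringsAsFactors = TRUE)

class(x$a)
# [1] "factor"

# NOTE these are different
levels(x$a)
# [1] "a" "b" "c" "d" "e" "f" "g"
levels(y$a)
# [1] "d" "e" "f" "g" "h" "i" "j"

m <- left_join(x, y)
# Joining by: "a"
# Warning message:
# joining factors with different levels, coercing to character vector

class(m$a)
# [1] "character"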
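
If the join uses several key columns, it may not be obvious which one triggers the warning. Below is a minimal base R sketch for finding it. Note that `mismatched_factor_keys` is a hypothetical helper name (not a dplyr function), the column `g` is made up for illustration, and the sketch assumes the join is on the columns the two data frames share by name.

```r
# Example data: column a has different levels in x and y, column g does not.
x <- data.frame(a = letters[1:7],
                g = rep(c("u", "v"), length.out = 7),
                stringsAsFactors = TRUE)
y <- data.frame(a = letters[4:10],
                g = rep(c("u", "v"), length.out = 7),
                stringsAsFactors = TRUE)

# Hypothetical helper: return the names of shared columns that are factors
# in both data frames but whose levels are not identical.
mismatched_factor_keys <- function(x, y) {
  shared <- intersect(names(x), names(y))
  Filter(function(col) {
    is.factor(x[[col]]) && is.factor(y[[col]]) &&
      !identical(levels(x[[col]]), levels(y[[col]]))
  }, shared)
}

mismatched_factor_keys(x, y)
# [1] "a"
```

Only `a` is reported, because `g` has the same levels in both data frames.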

If you want the column to stay a factor, you can make sure that both factors have the same levels before joining:

# Build one shared set of levels, then re-create both factors with it
combined <- sort(union(levels(x$a), levels(y$a)))
n <- left_join(mutate(x, a = factor(a, levels = combined)),
               mutate(y, a = factor(a, levels = combined)))
# Joining by: "a"

class(n$a)
# [1] "factor"
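
If you don't need the column to stay a factor, another option is to do the conversion yourself before joining, so that nothing is coerced implicitly. This is a minimal sketch in base R; the data frames are assumed to look like the example ones used earlier in this answer.

```r
# Example data with a factor key column in each data frame.
x <- data.frame(a = letters[1:7], stringsAsFactors = TRUE)
y <- data.frame(a = letters[4:10], stringsAsFactors = TRUE)

# Convert the key column to character in both data frames up front.
x$a <- as.character(x$a)
y$a <- as.character(y$a)

class(x$a)
# [1] "character"

# x and y can now be passed to left_join(x, y): both keys are plain
# character vectors, so there are no factor levels left to reconcile.
```

The result is the same character column the warning would have given you, but the conversion is now explicit in your code.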
