mattbawn - 1 year ago 62
R Question

# R: ignoring NAs when using unique

I am attempting to find/discard rows based on their similarity in column values and have the following example code:

``````
vec1 <- c("B","D","E","NA")
vec2 <- c("B","D","E","NA")
vec3 <- c("B","C","E","NA")
vec4 <- c("B","D","E","NA")
vec5 <- c("B","NA","E","E")
vec6 <- c("B","NA","NA","NA")

mat1 <- cbind(vec1,vec2,vec3,vec4,vec5,vec6)
mat1
     vec1 vec2 vec3 vec4 vec5 vec6
[1,] "B"  "B"  "B"  "B"  "B"  "B"
[2,] "D"  "D"  "C"  "D"  "NA" "NA"
[3,] "E"  "E"  "E"  "E"  "E"  "NA"
[4,] "NA" "NA" "NA" "NA" "E"  "NA"

rows = apply(mat1, 1, function(i) length(unique(i)) > 1 )
mat2 <- mat1[rows, ]
mat2
     vec1 vec2 vec3 vec4 vec5 vec6
[1,] "D"  "D"  "C"  "D"  "NA" "NA"
[2,] "E"  "E"  "E"  "E"  "E"  "NA"
[3,] "NA" "NA" "NA" "NA" "E"  "NA"
``````

How may I change the code above so that the `NA` values are ignored when counting the unique values in each row? The help file for `unique` suggests there is an `incomparables` argument; is this implemented, and can it be used? I don't necessarily wish to remove the `NA`s, just ignore them.

`rows = apply(mat1, 1, function(i) length(unique(i[!(i=="NA")]))>1)`
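To show what the one-liner above does, here is a minimal sketch that rebuilds `mat1` from the question and applies the same filter. The helper name `n_distinct` and the `drop = FALSE` argument are my additions, not part of the original answer. Note that `"NA"` in this example is a character string, not a real missing value, which is why the comparison is against `"NA"` rather than using `is.na()`.

```r
# Rebuild the example matrix from the question.
# "NA" is a character string here, not a missing value.
mat1 <- cbind(
  vec1 = c("B", "D", "E", "NA"),
  vec2 = c("B", "D", "E", "NA"),
  vec3 = c("B", "C", "E", "NA"),
  vec4 = c("B", "D", "E", "NA"),
  vec5 = c("B", "NA", "E", "E"),
  vec6 = c("B", "NA", "NA", "NA")
)

# Count the distinct values in each row, skipping the "NA" strings.
# (n_distinct is an illustrative name.)
n_distinct <- apply(mat1, 1, function(i) length(unique(i[i != "NA"])))

# Keep only rows that still contain more than one distinct value.
rows <- n_distinct > 1

# drop = FALSE keeps the result a matrix even if a single row is selected.
mat2 <- mat1[rows, , drop = FALSE]
mat2
#      vec1 vec2 vec3 vec4 vec5 vec6
# [1,] "D"  "D"  "C"  "D"  "NA" "NA"
```

The `NA` entries stay in the kept rows; they are only left out of the uniqueness count. If the matrix held real missing values instead of the string `"NA"`, the equivalent filter would presumably be `length(unique(i[!is.na(i)])) > 1`. As for `incomparables`: as far as I can tell it would not help here, since the documentation for `duplicated` says values listed in `incomparables` are never marked as duplicated, so they would all be kept rather than ignored.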