Alexander Alexander - 2 months ago 6
R Question

Mutate and ifelse() fail because of NA values in a column

I've encountered an issue while trying to create a new column with ifelse. A quite similar question is "dplyr error: strange issue when combining group_by, mutate and ifelse. Is it a bug?".

library(dplyr)  # needed for %>%, group_by() and mutate() used below

set.seed(101)
time <- sort(runif(10, 0, 10))
group <- rep(c(1, 2), each = 5)
az <- c(sort(runif(5, -1, 1), decreasing = TRUE), sort(runif(5, -1, 0.2), decreasing = TRUE))

df <- data.frame(time, az, group)

# time az group
#1 0.4382482 0.86326886 1
#2 2.4985572 0.75959146 1
#3 3.0005483 0.46394519 1
#4 3.3346714 0.41374948 1
#5 3.7219838 -0.08975881 1
#6 5.4582855 -0.01547669 2
#7 5.8486663 -0.29161632 2
#8 6.2201196 -0.50599980 2
#9 6.5769040 -0.73105782 2
#10 7.0968402 -0.95366733 2


In df, I am trying to conditionally mutate a clas column. However, since there is an NA in sw_time, the entire clas column also becomes NA, even though group 1 should normally be nrm.

df1 <- df %>%
  group_by(group) %>%
  mutate(sw_time = abs(time[which(az <= 0.8)[1]] - time[which(az > 0)[1]])) %>%
  mutate(clas = as.numeric(ifelse(sw_time < 3, "nrm", "abn")))

Source: local data frame [10 x 5]
Groups: group [2]

time az group sw_time clas
(dbl) (dbl) (dbl) (dbl) (dbl)
1 0.4382482 0.86326886 1 2.060309 NA
2 2.4985572 0.75959146 1 2.060309 NA
3 3.0005483 0.46394519 1 2.060309 NA
4 3.3346714 0.41374948 1 2.060309 NA
5 3.7219838 -0.08975881 1 2.060309 NA
6 5.4582855 -0.01547669 2 NA NA
7 5.8486663 -0.29161632 2 NA NA
8 6.2201196 -0.50599980 2 NA NA
9 6.5769040 -0.73105782 2 NA NA
10 7.0968402 -0.95366733 2 NA NA


Thanks in advance for your help!

Answer

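Converting a character vector to numeric with as.numeric() results in NA for every element, because strings such as "nrm" and "abn" cannot be parsed as numbers. If a numeric column is needed, we can convert to a factor first, which does coerce to integer codes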

df %>%
    group_by(group)%>%
     mutate(sw_time=abs(time[which(az<=0.8)[1]]-time[which(az>0)[1]]),
            clas=as.integer(factor(ifelse(sw_time<3,"nrm","abn"))))
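
To see why the original as.numeric() call produced only NA while the factor route works, here is a minimal base-R sketch. The vector labels is a made-up stand-in for the output of ifelse(). Note that factor() sorts its levels alphabetically by default, so 'abn' is coded as 1 and 'nrm' as 2.

```r
# Hypothetical stand-in for ifelse(sw_time < 3, "nrm", "abn")
labels <- c("nrm", "nrm", "abn", NA)

# Strings that are not numbers cannot be parsed, so every element becomes NA
# (R also emits a coercion warning, silenced here).
direct <- suppressWarnings(as.numeric(labels))

# A factor stores integer codes; default levels are sorted: "abn" = 1, "nrm" = 2
codes <- as.integer(factor(labels))

print(direct)  # NA NA NA NA
print(codes)   # 2 2 1 NA
```

If a specific coding is wanted (for example 'nrm' as 1), the levels argument of factor() can be set explicitly.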

If we are only interested in getting 'nrm', 'abn', just remove the as.integer(factor(...)) wrapping

df%>%
  group_by(group)%>%
  mutate(sw_time=abs(time[which(az<=0.8)[1]]-time[which(az>0)[1]]),
          clas=ifelse(sw_time<3,"nrm","abn"))
#        time          az group  sw_time  clas
#       <dbl>       <dbl> <dbl>    <dbl> <chr>
#1  0.4382482  0.86326886     1 2.060309   nrm
#2  2.4985572  0.75959146     1 2.060309   nrm
#3  3.0005483  0.46394519     1 2.060309   nrm
#4  3.3346714  0.41374948     1 2.060309   nrm
#5  3.7219838 -0.08975881     1 2.060309   nrm
#6  5.4582855 -0.01547669     2       NA  <NA>
#7  5.8486663 -0.29161632     2       NA  <NA>
#8  6.2201196 -0.50599980     2       NA  <NA>
#9  6.5769040 -0.73105782     2       NA  <NA>
#10 7.0968402 -0.95366733     2       NA  <NA>
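
The NA in sw_time for group 2 is expected rather than a bug: no az value in that group is greater than 0, so which(az > 0)[1] has nothing to return. A small base-R sketch of this, using the first two group 2 rows from the question:

```r
# First two rows of group 2 from the question's data
time <- c(5.4582855, 5.8486663)
az   <- c(-0.01547669, -0.29161632)

idx_pos <- which(az > 0)[1]     # which() returns integer(0); [1] of that is NA
idx_low <- which(az <= 0.8)[1]  # 1

sw_time <- abs(time[idx_low] - time[idx_pos])  # arithmetic with NA gives NA
label   <- ifelse(sw_time < 3, "nrm", "abn")   # an NA condition gives an NA result

print(idx_pos)  # NA
print(label)    # NA
```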

We can also use data.table

library(data.table)
setDT(df)[, c("sw_time", "clas") := {
           v1 <- abs(time[which(az <= 0.8)[1]] - time[which(az > 0)[1]])
          .(v1 , c("abn", "nrm")[(v1 < 3) + 1]) },
                      by = group]
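
The data.table version replaces ifelse() with a lookup into c("abn", "nrm"). The idea, sketched here in base R with partly made-up sw_time values, is that adding 1 to a logical gives index 1 for FALSE and 2 for TRUE, while NA stays NA.

```r
# Illustrative values; only the first one comes from the question
v1 <- c(2.060309, 4.5, NA)

idx  <- (v1 < 3) + 1           # TRUE -> 2, FALSE -> 1, NA -> NA
clas <- c("abn", "nrm")[idx]   # index 2 picks "nrm", index 1 picks "abn"

print(clas)  # "nrm" "abn" NA
```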

If the final output does not need to contain 'nrm', 'abn', we don't need the ifelse part. We can directly use as.integer(sw_time < 3).
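
As a quick check of that last suggestion, a minimal sketch with the two sw_time values that occur in the question (2.060309 for group 1, NA for group 2):

```r
sw_time <- c(2.060309, 2.060309, NA)

# TRUE becomes 1, FALSE becomes 0, and NA is preserved
flag <- as.integer(sw_time < 3)

print(flag)  # 1 1 NA
```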
