MFR MFR - 2 months ago 14
R Question

Using dplyr first function but ignoring a particular character

I want to add the first feature of each customer in the following dataset as a new column

mydf <- data.frame(customer = c(1, 2, 1, 2, 2, 1, 1),
                   feature = c("other", "a", "b", "c", "other", "b", "c"))

  customer feature
1        1   other
2        2       a
3        1       b
4        2       c
5        2   other
6        1       b
7        1       c


by using dplyr. However, I want my code to ignore the "other" feature and choose the first feature that is not "other".

So the following code is not sufficient:

library(dplyr)
# first() returns "other" for customer 1, which is not what I want
new <- mydf %>%
  group_by(customer) %>%
  mutate(firstfeature = first(feature))


How can I ignore "other" so that I get the following desired output?

  customer feature firstfeature
1        1   other            b
2        2       a            a
3        1       b            b
4        2       c            a
5        2   other            a
6        1       b            b
7        1       c            b

Answer Source

With dplyr we can group by customer and, within each group, take the first feature that is not "other".

library(dplyr)
mydf %>%
   group_by(customer) %>%
   mutate(firstfeature = feature[feature != "other"][1])


#  customer feature firstfeature
#     <dbl>   <chr>        <chr>
#1        1   other            b
#2        2       a            a
#3        1       b            b
#4        2       c            a
#5        2   other            a
#6        1       b            b
#7        1       c            b
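
The key step is the expression feature[feature != "other"][1]: the logical index drops every "other", and [1] takes the first value that remains. A minimal sketch on a plain vector, using the features of customer 1 from the question. The only_other vector is a hypothetical case that is not in the question, added to show what happens when a group contains nothing but "other":

```r
# Features of customer 1, in row order (taken from mydf in the question)
feature <- c("other", "b", "b", "c")

# The logical index drops every "other"; [1] picks the first remaining value
kept <- feature[feature != "other"]
first_feature <- kept[1]
print(first_feature)  # "b"

# Hypothetical group with only "other": indexing an empty vector with [1] gives NA
only_other <- c("other", "other")
print(only_other[only_other != "other"][1])  # NA
```

If such all-"other" groups can occur in your data, you would have to decide separately how to handle the resulting NA.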

Similarly, we can do this in base R with ave:

mydf$firstfeature <- ave(mydf$feature, mydf$customer,
                         FUN = function(x) x[x != "other"][1])
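
For reference, here is a self-contained sketch of the base R approach that runs without dplyr. It assumes feature is stored as a character column, so stringsAsFactors = FALSE is set explicitly (an addition here; it is already the default from R 4.0.0 onward). The helper name first_non_other is only illustrative.

```r
# Rebuild the example data from the question, keeping feature as character
mydf <- data.frame(customer = c(1, 2, 1, 2, 2, 1, 1),
                   feature  = c("other", "a", "b", "c", "other", "b", "c"),
                   stringsAsFactors = FALSE)

# Illustrative helper: first element of x that is not "other"
first_non_other <- function(x) x[x != "other"][1]

# ave() applies the helper per customer and recycles the result to every row of the group
mydf$firstfeature <- ave(mydf$feature, mydf$customer, FUN = first_non_other)
print(mydf$firstfeature)
```

Because ave returns a vector of the same length as its input, the result can be assigned straight back to the data frame without any join or merge.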