Sebastian - 1 month ago

R Question

I have a data.frame with two variables. I need to group the rows by var1 and replace every x in var2 with the one other (non-x) value that occurs in that group.

For example:

`var1 var2`

1 1 a

2 2 a

3 2 x

4 3 b

5 4 c

6 5 a

7 6 c

8 6 x

9 7 c

10 8 x

11 8 b

12 8 b

13 9 a

Outcome should be:

`var1 var2`

1 1 a

2 2 a

3 2 a <-

4 3 b

5 4 c

6 5 a

7 6 c

8 6 c <-

9 7 c

10 8 b <-

11 8 b

12 8 b

13 9 a

I did manage to solve this example:

`dat <- data.frame(var1=c(1,2,2,3,4,5,6,6,7,8,8,8,9), var2=c("a","a","x","b","c","a","c","x","c","x","b","b","a"))`

dat %>% group_by(var1) %>% mutate(

var2 = as.character(var2),

var2 = ifelse(var2 == 'x',var2[order(var2)][1],var2))

But this does not work for my real data because of the ordering :(

I need another approach. I am thinking of something like checking explicitly for "not x", but I have not come up with a solution.

Any help appreciated!

Answer

We can use `data.table`. Convert the 'data.frame' to a 'data.table' (`setDT(df1)`); then, grouped by 'var1', we take the 'var2' values that are not 'x', select the first observation, and assign (`:=`) it to 'var2'.

```
library(data.table)
setDT(df1)[, var2 := var2[var2!='x'][1], var1]
```
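For readers without either package, the same "drop the x values, keep the first survivor" idea can be sketched in base R. This is only an illustrative analogue, not the code from the answer: it rebuilds the example data under the name `dat` (with the seventh value set to "c" as in the table in the question), and `g8` and `first_non_x` are hypothetical helper names.

```r
# Example data from the question, kept as character (not factor).
dat <- data.frame(
  var1 = c(1, 2, 2, 3, 4, 5, 6, 6, 7, 8, 8, 8, 9),
  var2 = c("a", "a", "x", "b", "c", "a", "c", "x", "c", "x", "b", "b", "a"),
  stringsAsFactors = FALSE
)

# The core idiom on a single group: drop the "x" entries, take the first one left.
g8 <- dat$var2[dat$var1 == 8]
first_non_x <- g8[g8 != "x"][1]

# ave() applies the same function within each level of var1 and
# returns the results in the original row order.
dat$var2 <- ave(
  dat$var2, dat$var1,
  FUN = function(v) rep(v[v != "x"][1], length(v))
)
```

No sorting is involved, so the result does not depend on how the other value compares with "x" alphabetically.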

Or with `dplyr`

```
library(dplyr)
df1 %>%
  group_by(var1) %>%
  mutate(var2 = var2[var2 != "x"][1])
#     var1  var2
#    <int> <chr>
# 1      1     a
# 2      2     a
# 3      2     a
# 4      3     b
# 5      4     c
# 6      5     a
# 7      6     c
# 8      6     c
# 9      7     c
# 10     8     b
# 11     8     b
# 12     8     b
# 13     9     a
```
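Note that `var2[var2 != "x"][1]` overwrites every row of a group with the first non-x value, and yields `NA` for a group that contains only x. That is fine for the data in the question, where each group has a single other value. If the real data may break that assumption, a more defensive variant could look like the sketch below. The small `dat` here is made-up test data and `fill` is a hypothetical temporary column, not something from the original post.

```r
library(dplyr)

# Made-up data: group 2 has a value that sorts after "x",
# group 3 contains nothing but "x".
dat <- data.frame(
  var1 = c(1, 2, 2, 3, 3, 4),
  var2 = c("a", "x", "z", "x", "x", "b"),
  stringsAsFactors = FALSE
)

res <- dat %>%
  group_by(var1) %>%
  mutate(
    # First non-x value in the group; NA if the group is all "x".
    fill = var2[var2 != "x"][1],
    # Touch only the "x" rows, and only when a replacement exists.
    var2 = if_else(var2 == "x" & !is.na(fill), fill, var2)
  ) %>%
  ungroup() %>%
  select(-fill)
```

Here group 2 becomes "z", "z" even though "z" sorts after "x", and the all-x group 3 is left unchanged instead of turning into `NA`.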

Source (Stackoverflow)
