Saltaf Saltaf - 1 year ago 96
R Question

Comparing each element in two columns and set another column

I have a data frame (after fread from a file) with two columns (dep and label). I want to set another column (mark) with id value depending on the match. If the 'dep' entry matches 'lablel' entry, mark get the 'id' of the matched 'label'. For no match, mark get the value of its own 'id'. Currently, I have work around solution with loops but I know there should be a neat way to do it in R specifics.

trace <- data.table(id=seq(1:7),dep=c(-1,45,-1,47,-1,45,43),
label=c(99,40,43,45,47,42,48), mark=rep("",7))
id dep label mark
1: 1 -1 99 1
2: 2 45 40 4
3: 3 -1 43 3
4: 4 47 45 5
5: 5 -1 47 5
6: 6 45 42 4
7: 7 43 48 3


I know loops are slow in r and just to give example the following naive for/while works for small sizes but my data set is huge.

trace$mark <- trace<-id
for (i in 1:length(trace$id)){
val <- trace$dep[i]
j <- 1
while(j<=i && val !=-1 && val!=0){ // don't compare if val is -1/0

if(val==trace$label[j]){
trace$mark[i] <- trace$mark[j]
}
j <-j +1
}
}


I have also tried using the following approach but it works only if there is a single match.

match <- which(trace$dep %in% trace$label)
match_to <- which(trace$label %in% trace$dep)
trace$mark[match] <- trace$mark[match_to]

Answer Source

This solution might help:

trace[trace[,.(id,dep=label)],mark:=as.character(i.id),on="dep"]
trace[mark=="",mark:=as.character(id)]
# id dep label mark
# 1:  1  -1    99    1
# 2:  2  45    40    4
# 3:  3  -1    43    3
# 4:  4  47    45    5
# 5:  5  -1    47    5
# 6:  6  45    42    4
# 7:  7  43    48    3

Update:

To make sure you are not matching dep with 0 or -1 values you can just add another line.

trace[dep %in% c(0,-1), mark:= as.character(id)]

OR Try this:

trace[trace[!dep %in% c(0,-1),.(id,dep=label)],mark:=as.character(i.id),on="dep"]
trace[mark=="",mark:=as.character(id)]
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download