cimentadaj - 1 year ago 75
R Question

# Dynamically replace values of a column based on available values from another column

Suppose I have this data frame

```r
set.seed(2)
df <- data.frame(c1 = sample(c(0:3, NA), 50, replace = TRUE),
                 c2 = sample(c(0:3, NA), 50, replace = TRUE),
                 c3 = sample(c(0:3, NA), 50, replace = TRUE),
                 c4 = sample(c(0:3, NA), 50, replace = TRUE))

head(df)
#   c1 c2 c3 c4
# 1  0  0  1  0
# 2  3  0  2  1
# 3  2  3 NA NA
# 4  0 NA NA  1
# 5 NA  1  1  3
# 6 NA NA  2  1
```

When c4 is 0, I'd like to replace it with the next available non-NA value in c3. If c3 is NA, then c2 and so on.

I'm trying to learn how to do it, so don't just throw in the answer! If it's alright, suggest possible solutions. Thanks in advance.

Edit:

Expected output:

```r
head(df)
#   c1 c2 c3 c4
# 1  0  0  1  1 # This would be the only difference with the head output from above
# 2  3  0  2  1
# 3  2  3 NA NA
# 4  0 NA NA  1
# 5 NA  1  1  3
# 6 NA NA  2  1
```

This is how you can do it without looping through each row:

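```r
# Index of the last column (the one whose zeros get replaced)
c4 <- ncol(df)
# Per row: position of the rightmost non-NA, non-zero value before the last column
inds <- max.col(!is.na(df[, -c4]) & df[, -c4] != 0, ties.method = "last")
# Rows with a zero in the last column (which() drops the NAs)
zeroinds <- which(df[, c4] == 0)
# Matrix indexing picks one (row, column) cell per zero row
df[zeroinds, c4] <- df[cbind(zeroinds, inds[zeroinds])]

head(df, 10)
#    c1 c2 c3 c4
# 1   0  0  1  1
# 2   3  0  2  1
# 3   2  3 NA NA
# 4   0 NA NA  1
# 5  NA  1  1  3
# 6  NA NA  2  1
# 7   0  3 NA NA
# 8  NA NA  2  2
# 9   2  3  0  3
# 10  2  3  0  1
```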
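To see what the `max.col` call is doing, here is a small sketch on a hand-made data frame. The data frame `toy` and the names `usable` and `last_usable` are made up for illustration and are not part of the question's data.

```r
# Hand-made example, not the data from the question
toy <- data.frame(c1 = c(2, NA, NA),
                  c2 = c(0, 3, 1),
                  c3 = c(NA, 0, 3))

# TRUE where a cell holds a usable value (neither NA nor zero).
# For NA cells, !is.na() is FALSE, and FALSE & NA evaluates to FALSE.
usable <- !is.na(toy) & toy != 0

# Column position of the rightmost TRUE in each row:
# TRUE counts as the row maximum, and "last" resolves ties to the right
last_usable <- max.col(usable, ties.method = "last")
last_usable
# [1] 1 2 3
```

Row 1 only has a usable value in `c1`, row 2 in `c2`, and row 3 has usable values in both `c2` and `c3`, so the rightmost one wins.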

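Here is how it works:

1. `c4` holds the index of the last column.
2. `inds` holds, for each row, the position of the last (rightmost) column before `c4` whose value is neither NA nor zero, i.e. the one closest to `c4`. That is what `"last"` asks `max.col` for.
3. `zeroinds` holds the indices of the rows that have a zero in the last column.
4. The zeros at `zeroinds` are replaced with the value found at column `inds` of the same row.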
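One case the steps above do not cover: if a row has a zero in the last column and nothing usable to its left, every candidate is tied, so `max.col(..., "last")` still returns the last candidate column and the zero gets overwritten with whatever sits there (possibly NA). Below is a minimal sketch of the same steps with a guard for that case. The function name `fill_last_zero` and the `has_usable` check are assumptions added for illustration, not part of the answer above.

```r
# Hypothetical helper: same idea as the steps above, plus a guard
fill_last_zero <- function(df) {
  last <- ncol(df)
  before <- as.matrix(df[, -last, drop = FALSE])

  # TRUE where a cell is neither NA nor zero
  usable <- !is.na(before) & before != 0
  # Rightmost usable column per row
  inds <- max.col(usable, ties.method = "last")
  # Rows that actually have at least one usable value
  has_usable <- rowSums(usable) > 0

  # Rows with a zero in the last column AND something to replace it with
  zero_rows <- which(df[[last]] == 0 & has_usable)
  if (length(zero_rows) > 0) {
    df[zero_rows, last] <- before[cbind(zero_rows, inds[zero_rows])]
  }
  df
}

# Hand-made example: row 3 has no usable value, so its zero is kept
toy2 <- data.frame(c1 = c(2, NA, 0),
                   c2 = c(NA, 3, NA),
                   c3 = c(0, 0, 0))
result <- fill_last_zero(toy2)
result$c3
# [1] 2 3 0
```

Whether a zero with no replacement should stay zero or become NA depends on what you need; the guard here simply leaves it untouched.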