Eldar Agalarov - 9 months ago 64

R Question

I have a matrix in which some rows are NA in every column. I want to replace each such all-NA row with the value from the K-th column of the previous non-NA row.

For example, this matrix:

          [,1] [,2]
     [1,]   NA   NA
     [2,]   NA   NA
     [3,]    1    2
     [4,]    2    3
     [5,]   NA   NA
     [6,]   NA   NA
     [7,]   NA   NA
     [8,]    6    7
     [9,]    7    8
    [10,]    8    9

It must be transformed into the following matrix, using the 2nd column for the replacement (the leading all-NA rows have no previous value, so they stay NA):

          [,1] [,2]
     [1,]   NA   NA
     [2,]   NA   NA
     [3,]    1    2
     [4,]    2    3
     [5,]    3    3
     [6,]    3    3
     [7,]    3    3
     [8,]    6    7
     [9,]    7    8
    [10,]    8    9

I wrote a function for this, but it uses a loop:

    # Replace each all-NA row with the value from column k of the previous row,
    # provided that the previous row contains no NAs.
    na.replace <- function(x, k) {
      cols <- ncol(x)
      # seq_len(nrow(x))[-1] is empty for a one-row matrix, unlike 2:nrow(x)
      for (i in seq_len(nrow(x))[-1]) {
        if (sum(is.na(x[i - 1, ])) == 0 && sum(is.na(x[i, ])) == cols) {
          x[i, ] <- x[i - 1, k]
        }
      }
      x
    }

This function seems to work correctly, but I want to avoid the loop. Can anyone advise how I can do this replacement without using loops?

agstudy suggested a vectorized, loop-free solution:

    na.replace <- function(mat, k){
      idx <- which(rowSums(is.na(mat)) == ncol(mat))
      mat[idx,] <- mat[ifelse(idx > 1, idx-1, 1), k]
      mat
    }

However, this solution returns different, incorrect results compared with my loop-based version. Why does this happen? In theory, the loop and loop-free solutions should be equivalent.

Answer Source

In the end I wrote my own vectorized version, which returns the expected output:

```
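library(zoo)  # provides na.locf()

na.replace <- function(x, k) {
  isNA <- is.na(x[, k])  # rows whose k-th column is NA
  # carry the last observation in column k forward; na.rm = FALSE keeps leading NAs
  x[isNA, ] <- na.locf(x[, k], na.rm = FALSE)[isNA]
  x
}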
```
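
Why the one-shot indexed assignment from the question differs from the loop can be seen on a small example. The following is a minimal sketch in base R; the matrix `m` and the names `one_shot` and `looped` are illustrative and not from the original post, and the index expression is simplified to `idx - 1` on the assumption that the first row is not NA. The right-hand side is evaluated once against the unmodified matrix, so in a run of consecutive NA rows only the first one sees a non-NA predecessor, whereas the loop reads rows it has already filled.

```r
# Illustrative data: rows 2-4 are all NA, row 1 holds the last observation.
m <- matrix(c(1, NA, NA, NA, 5,
              2, NA, NA, NA, 6), ncol = 2)
k <- 2

# One-shot version: every NA row reads column k of its ORIGINAL predecessor.
idx <- which(rowSums(is.na(m)) == ncol(m))   # rows 2, 3, 4
one_shot <- m
one_shot[idx, ] <- m[idx - 1, k]             # values read: 2, NA, NA

# Loop version: row i reads row i - 1 after that row has already been filled.
looped <- m
for (i in idx) {
  looped[i, ] <- looped[i - 1, k]
}

one_shot[, 1]  # 1  2 NA NA  5
looped[, 1]    # 1  2  2  2  5
```

So the two versions only agree when no two all-NA rows are adjacent, which is presumably why a carry-forward step such as `na.locf` is needed.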

**UPDATE**

A better solution that does not need any packages:

```
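na.lomf <- function(x) {
  if (length(x) > 0L) {
    # positions that hold an observed (non-NA) value
    non.na.idx <- which(!is.na(x))
    # a leading NA has nothing to carry forward, so it starts its own run
    if (is.na(x[1L])) {
      non.na.idx <- c(1L, non.na.idx)
    }
    # repeat the first value of each run for the length of that run
    rep.int(x[non.na.idx], diff(c(non.na.idx, length(x) + 1L)))
  }
}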
na.lomf(c(NA, 1, 2, NA, NA, 3, NA, NA, 4, NA))
# [1] NA 1 2 2 2 3 3 3 4 4
```
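
To connect this back to the matrix problem, `na.lomf` can stand in for `na.locf` inside `na.replace`. The sketch below is one possible way to wire the two together, not necessarily how the author used it. It assumes, as in the `na.locf` version above, that a row counts as missing when its K-th column is NA, and it repeats a compact copy of `na.lomf` so that it runs on its own. The test matrix is the one from the question.

```r
# Compact copy of the base-R carry-forward helper, so the example is self-contained.
na.lomf <- function(x) {
  if (length(x) == 0L) return(x)
  non.na.idx <- which(!is.na(x))
  if (is.na(x[1L])) non.na.idx <- c(1L, non.na.idx)
  rep.int(x[non.na.idx], diff(c(non.na.idx, length(x) + 1L)))
}

# Assumption: a row is treated as missing when its k-th column is NA.
na.replace <- function(x, k) {
  isNA <- is.na(x[, k])
  # the filled values are recycled across all columns of the selected rows
  x[isNA, ] <- na.lomf(x[, k])[isNA]
  x
}

# Matrix from the question.
m <- cbind(c(NA, NA, 1, 2, NA, NA, NA, 6, 7, 8),
           c(NA, NA, 2, 3, NA, NA, NA, 7, 8, 9))
res <- na.replace(m, 2)
res[5:7, ]   # every entry is 3; rows 1-2 remain NA
```

Because `na.lomf` keeps leading NAs, the first two rows are left untouched, which matches the expected output in the question.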