Dylan K - 9 months ago 128

R Question

I want to obtain the difference between consecutive rows in a data frame, which is what the built-in diff() function does. But my data is of the bigz class (gmp package), so I cannot use the existing function.

```
class(MyData$IntIndex)
[1] "bigz"
diff(MyData$IntIndex)
Error in r[i1] - r[-length(r):-(length(r) - lag + 1L)] :
  non-numeric argument to binary operator
```

Perhaps there is a package with a function that could solve my problem? Or something else I could do?

Answer

Since `diff` is an S3 generic, and pretty straightforward to implement, you can just add your own `diff.bigz` method on the fly. Here is a very basic example for the default case of `lag = 1`, `differences = 1`:

```
library(gmp)
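z <- as.bigz(
  c("1000000000000000000000000000",
    "1000000000000000000000000010",
    "1000000000000000000000000021",
    "1000000000000000000000000033",
    "1000000000000000000000000047")
)
# S3 method for 'bigz'; '...' keeps the signature consistent with the generic
diff.bigz <- function(x, ...) {
  # every element except the first, minus every element except the last
  x[-1] - x[-length(x)]
}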
diff(z)
#Big Integer ('bigz') object of length 4:
#[1] 10 11 12 14
```

If you want something more elaborate, translating `diff.default` shouldn't be too difficult:

```
diff.default
# function (x, lag = 1L, differences = 1L, ...)
# {
#     ismat <- is.matrix(x)
#     xlen <- if (ismat)
#         dim(x)[1L]
#     else length(x)
#     if (length(lag) != 1L || length(differences) > 1L || lag <
#         1L || differences < 1L)
#         stop("'lag' and 'differences' must be integers >= 1")
#     if (lag * differences >= xlen)
#         return(x[0L])
#     r <- unclass(x)
#     i1 <- -seq_len(lag)
#     if (ismat)
#         for (i in seq_len(differences)) r <- r[i1, , drop = FALSE] -
#             r[-nrow(r):-(nrow(r) - lag + 1L), , drop = FALSE]
#     else for (i in seq_len(differences)) r <- r[i1] - r[-length(r):-(length(r) -
#         lag + 1L)]
#     class(r) <- oldClass(x)
#     r
# }
# <bytecode: 0x62f5c78>
# <environment: namespace:base>
```
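As a rough illustration of what such a translation could look like, here is a minimal sketch that adds `lag` and `differences` to the method. It makes a few simplifying assumptions that are mine, not part of `diff.default`: it handles plain `bigz` vectors only (no matrix branch), it skips `unclass()` and the class restoration because subtraction on `bigz` already returns `bigz`, it uses positive indices instead of negative ones, and it raises an error when the input is too short rather than returning an empty vector.

```r
library(gmp)

# Sketch of a diff() method for 'bigz' vectors, modelled on diff.default.
# Assumptions: x is a plain bigz vector (no matrix support), and the
# too-short case raises an error instead of returning an empty vector.
diff.bigz <- function(x, lag = 1L, differences = 1L, ...) {
  if (length(lag) != 1L || length(differences) != 1L ||
      lag < 1L || differences < 1L)
    stop("'lag' and 'differences' must be integers >= 1")
  if (lag * differences >= length(x))
    stop("'x' needs more than lag * differences elements")
  r <- x
  for (i in seq_len(differences)) {
    n <- length(r)
    # elements (lag + 1)..n minus elements 1..(n - lag)
    r <- r[(lag + 1L):n] - r[seq_len(n - lag)]
  }
  r
}

z <- as.bigz(c("1000000000000000000000000000",
               "1000000000000000000000000010",
               "1000000000000000000000000021",
               "1000000000000000000000000033",
               "1000000000000000000000000047"))

diff(z)                    # first differences: 10 11 12 14
diff(z, lag = 2)           # z[i + 2] - z[i]: 21 23 26
diff(z, differences = 2)   # differences of the first differences: 1 1 2
```

For a data frame column, the same call should work as `diff(MyData$IntIndex)` once the method is defined in the session. If you need the matrix case or the empty-result behaviour of `diff.default`, those branches would have to be ported separately.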