Gilles Cosyn - 7 months ago 67

R Question

I have a problem dealing with time series in R.

```
#--------------read data
wb = loadWorkbook("Countries_Europe_Prices.xlsx")
df = readWorksheet(wb, sheet="Sheet2")

x <- df$Year
y <- df$Index1
y <- lag(y, 1, na.pad = TRUE)
cbind(x, y)
```

It gives me the following output:

```
         x    y
 [1,] 1974   NA
 [2,] 1975 50.8
 [3,] 1976 51.9
 [4,] 1977 54.8
 [5,] 1978 58.8
 [6,] 1979 64.0
 [7,] 1980 68.8
 [8,] 1981 73.6
 [9,] 1982 74.3
[10,] 1983 74.5
[11,] 1984 72.9
[12,] 1985 72.1
[13,] 1986 72.3
[14,] 1987 71.7
[15,] 1988 72.9
[16,] 1989 75.3
[17,] 1990 81.2
[18,] 1991 84.3
[19,] 1992 87.2
[20,] 1993 90.1
```

But I want the first value in y to be 50.8, and so forth. In other words, I want a negative lag. How can I do that?

My problem is very similar to the question "Basic lag in R vector/dataframe", but I still cannot solve it. I guess I do not yet understand the solution(s) given there...

Answer

How about the 'lead' function from the dplyr package? Doesn't it do exactly the job of Ahmed's function?

```
library(dplyr)  # provides lead(), which shifts values toward the start and pads the end with NA
cbind(x, lead(y, 1))
```
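To see what a one-step lead does without installing anything, here is a minimal base-R sketch on the first four rows from the question. The `c(y[-1], NA)` idiom is only a stand-in for `lead(y, 1)`, not dplyr's implementation, and it assumes the unlagged `Index1` starts at 50.8 in 1974, which is what the lagged output above suggests.

```r
# Toy data: first four rows, assuming the unlagged series starts at 50.8 in 1974
x <- c(1974, 1975, 1976, 1977)
y <- c(50.8, 51.9, 54.8, 58.8)

# Drop the first value and pad the end with NA: a one-step lead
y_lead <- c(y[-1], NA)

cbind(x, y_lead)
```

Each year is now paired with the following year's value, and the last row gets NA because there is nothing left to pull forward.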

If you want to be able to calculate either positive or negative lags with the same function, I suggest a 'shorter' version of his 'shift' function:

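```
# Shift x by `lag` positions: a positive value lags, a negative value leads.
shift <- function(x, lag) {
  # sign(lag)/2 + 1.5 maps -1 to 1 (lead) and +1 to 2 (lag)
  switch(sign(lag) / 2 + 1.5,
         dplyr::lead(x, abs(lag)),
         dplyr::lag(x, abs(lag)))
}
```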

It sets up two cases, one using lag and the other lead, and picks one depending on the sign of your lag. The `/2 + 1.5` maps the sign values {-1, +1} to the indices {1, 2} that switch expects, so a negative lag selects lead and a positive one selects lag.
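If you would rather not depend on dplyr, the same idea can be sketched in base R with an explicit if/else instead of the switch trick. The name `shift_base` and its argument `k` are made up for this illustration; this is one possible way to write it, not the function from the linked question.

```r
# Hypothetical base-R variant of the shift() idea; not part of dplyr.
shift_base <- function(x, k) {
  n <- min(abs(k), length(x))
  pad <- x[rep(NA_integer_, n)]          # n NA values of the same type as x
  if (k >= 0) {
    c(pad, x[seq_len(length(x) - n)])    # positive k: lag (pad at the start)
  } else {
    c(x[seq_along(x) > n], pad)          # negative k: lead (pad at the end)
  }
}

# The index arithmetic used by switch() in shift():
idx <- sign(c(-3, 2)) / 2 + 1.5          # 1 2

y <- c(50.8, 51.9, 54.8, 58.8)
lagged <- shift_base(y, 1)               # NA 50.8 51.9 54.8
led    <- shift_base(y, -1)              # 51.9 54.8 58.8 NA
```

The if/else is longer than the switch one-liner but makes the direction of each branch easier to read, and a shift of 0 returns the vector unchanged.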

Source (Stackoverflow)