hbtf.1046 - 4 years ago
R Question

Calculation and replacement in R

I have a dataset like the following, and I need to compare the value for each year in 2005-2009 with the average value over 2002-2004.

Year Firm R
2002 A 30
2003 A 11
2004 A 1
2005 A 7
2006 A 15
2007 A 20
2008 A 3.5
2009 A 8
2002 B 24
2003 B 30
2004 B 25
2005 B 5.2
2006 B 11.8
2007 B 78
2008 B 90
2009 B 57


The issue is that I need to calculate the average over 2002-2004 for each firm and replace the values in 2002-2004 with that calculated average. For example, the new dataset should look like this:

Year Firm R
2002 A 14
2003 A 14
2004 A 14
2005 A 7
2006 A 15
2007 A 20
2008 A 3.5
2009 A 8
2002 B 26.333
2003 B 26.333
2004 B 26.333
2005 B 5.2
2006 B 11.8
2007 B 78
2008 B 90
2009 B 57


I have tried to use the following code:

df$R[df$Year==2002 & df$Year==2003 & df$Year==2004] = (df$R[df$Year==2002] + df$R[df$Year==2003] + df$R[df$Year==2004])/3


but when I apply it, nothing changes.
I hope you can help with this issue.

Answer Source

You can use data.table for this if you like:

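library(data.table)

# Rebuild the sample data from the question
year <- rep(2002:2009, 2)
firm <- rep(c("A", "B"), each = 8)
r <- c(30, 11, 1, 7, 15, 20, 3.5, 8, 24, 30, 25, 5.2, 11.8, 78, 90, 57)

aa <- data.table(year, firm, r)

# For the 2002-2004 rows only, overwrite r with the per-firm mean (aa is updated in place)
aa[year >= 2002 & year <= 2004, r := mean(r), by = firm]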

This gives the following result:

    year firm        r
 1: 2002    A 14.00000
 2: 2003    A 14.00000
 3: 2004    A 14.00000
 4: 2005    A  7.00000
 5: 2006    A 15.00000
 6: 2007    A 20.00000
 7: 2008    A  3.50000
 8: 2009    A  8.00000
 9: 2002    B 26.33333
10: 2003    B 26.33333
11: 2004    B 26.33333
12: 2005    B  5.20000
13: 2006    B 11.80000
14: 2007    B 78.00000
15: 2008    B 90.00000
16: 2009    B 57.00000
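
As for why the attempt in the question changes nothing: the condition df$Year==2002 & df$Year==2003 & df$Year==2004 asks for rows whose year equals all three values at once, so it matches no rows and nothing is replaced. If you would rather stay in base R, a minimal sketch could look like the one below. It assumes the data frame is called df with columns Year, Firm and R as in the question; the helper name base is only illustrative.

```r
# Sample data from the question (data frame and column names assumed as posted).
df <- data.frame(
  Year = rep(2002:2009, times = 2),
  Firm = rep(c("A", "B"), each = 8),
  R    = c(30, 11, 1, 7, 15, 20, 3.5, 8,
           24, 30, 25, 5.2, 11.8, 78, 90, 57)
)

# A single Year cannot equal 2002 AND 2003 AND 2004, so this counts zero rows.
n_matched <- sum(df$Year == 2002 & df$Year == 2003 & df$Year == 2004)

# Select the baseline rows with %in% instead ("base" is an illustrative name).
base <- df$Year %in% 2002:2004

# ave() computes the mean within each firm and repeats it for every
# baseline row of that firm, so it can be assigned straight back.
df$R[base] <- ave(df$R[base], df$Firm[base], FUN = mean)
```

Rows outside 2002-2004 are left untouched because only the elements selected by base are reassigned.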