st19297 - 1 year ago 62
R Question

# How to mix daily and quarterly data in R?

I have quarterly fundamental data from Compustat that looks like this:

``````fund<-data.frame(quarterlydate=as.Date(c("03/31/1966","06/30/1966"), '%m/%d/%Y'),
gvkey=c(1000,1000,1001,1001), tic=c("XTL", "XTL", "ABL","ABL"),
sales=c(70,75,20,22))

> fund
quarterlydate gvkey tic sales
1    1966-03-31  1000 XTL    70
2    1966-06-30  1000 XTL    75
3    1966-03-31  1001 ABL    20
4    1966-06-30  1001 ABL    22
``````

I also have daily price data from CRSP that looks like this:

``````prices<-data.frame(dailydate=seq(as.Date("1966/01/01"), as.Date("1966/06/30"), "days"), gvkey=c(rep(1000, 181),rep(1001, 181)),
tic=c(rep("XTL",181), rep("ABL",181)),
price=floor(runif(length(seq(as.Date("1966/01/01"), as.Date("1966/06/30"), "days")), min=0, max=50)))

dailydate gvkey tic price
1 1966-01-01  1000 XTL    44
2 1966-01-02  1000 XTL    42
3 1966-01-03  1000 XTL    42
4 1966-01-04  1000 XTL    16
5 1966-01-05  1000 XTL    27
6 1966-01-06  1000 XTL    36

> tail(prices)
dailydate gvkey tic price
357 1966-06-25  1001 ABL     0
358 1966-06-26  1001 ABL    28
359 1966-06-27  1001 ABL     4
360 1966-06-28  1001 ABL    18
361 1966-06-29  1001 ABL    49
362 1966-06-30  1001 ABL     4
``````

Question:

1) How can I merge such a quarterly and daily datasets to have a dataframe like the one below?

2) How can I calculate the average quarterly price and assign the values to the quarters? (the "average_quarterly_price" variable in the table below)

I want a merged data frame like this:

`````` dailydate   quarterlydates gvkey tic  price  sales average_quarterly_price
1 1966-01-01   1966-03-31    1000  XTL     1    70   32
2 1966-01-02   1966-03-31    1000  XTL    10    70   32
3 1966-01-03   1966-03-31    1000  XTL    14    70   32
4 1966-01-04   1966-03-31    1000  XTL    29    70   32
5 1966-01-05   1966-03-31    1000  XTL     1    70   32
6 1966-01-06   1966-03-31    1000  XTL    43    70   32
.
.
.
182 1966-04-01  1966-06-31   1000  XTL    11    75   41
183 1966-04-02  1966-06-31   1000  XTL    8     75   41
184 1966-04-03  1966-06-31   1000  XTL    16    75   41
185 1966-04-04  1966-06-31   1000  XTL    14    75   41
186 1966-04-05  1966-06-31   1000  XTL    14    75   41
187 1966-04-06  1966-06-31   1000  XTL    20    75   41
.
.
.
364 1966-01-01  1966-03-31   1001  ABL    18    20   15
365 1966-01-02  1966-03-31   1001  ABL    10    20   15
366 1966-01-03  1966-03-31   1001  ABL    13    20   15
367 1966-01-04  1966-03-31   1001  ABL    13    20   15
368 1966-01-05  1966-03-31   1001  ABL    11    20   15
369 1966-01-06  1966-03-31   1001  ABL    13    20   15
.
.
.
545 1966-04-01  1966-06-31   1001  ABL    14    22   16
555 1966-04-02  1966-06-31   1001  ABL    21    22   16
556 1966-04-03  1966-06-31   1001  ABL    18    22   16
557 1966-04-04  1966-06-31   1001  ABL    18    22   16
558 1966-04-05  1966-06-31   1001  ABL    17    22   16
559 1966-04-06  1966-06-31   1001  ABL    18    22   16
.
.
.
724 1966-06-31  1966-06-31  1001  ABL    22    22   16
``````

Of course I am not sure if this is the best dataset format, and would appreciate suggestions. My ultimate purpose is to be able to use both daily and quarterly data in a single analysis. For the sake of example, I want to be able to to find stocks with quarterly Return on Assets in the top 20% percentile AND whose daily prices have been lowest in the last 10 days.

Create a `"yearqtr"` class column in each data frame and then perform a left join of the two data frames using common column names. Finally use `ave` to calculate the mean.
``````library(zoo) # yearqtr class