MichaelChirico - 8 months ago

R Question

I've got individual-level data for which I'm trying to summarize an outcome dynamically by group.

Example:

    library(data.table)
    set.seed(12039)
    DT <- data.table(id = rep(1:100, each = 50),
                     grp = rep(letters[1:4], each = 1250),
                     time = rep(1:50, 100),
                     outcome = rnorm(5000))

I want to know the simplest way to plot the group-level summary, the data for which is contained in:

`DT[ , mean(outcome), by = .(grp, time)]`

I wanted something like:

`DT[ , plot(mean(outcome)), by = .(grp, time)]`

But this doesn't work at all.

The workable option I am surviving on (which could be looped pretty easily) is:

    plot(DT[grp == "a", mean(outcome), by = time])
    lines(DT[grp == "b", mean(outcome), by = time])
    lines(DT[grp == "c", mean(outcome), by = time])
    lines(DT[grp == "d", mean(outcome), by = time])

(with added parameters for colors, etc, excluded for conciseness)
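For concreteness, here is a minimal sketch of what the looped version might look like. The names `agg`, `grps` and `cols` are only illustrative, and the colours are an assumption (plain palette indices). It sets up an empty frame over the full range first, so that no group's line is clipped by the axis limits of the first group plotted.

```r
library(data.table)

set.seed(12039)
DT <- data.table(id = rep(1:100, each = 50),
                 grp = rep(letters[1:4], each = 1250),
                 time = rep(1:50, 100),
                 outcome = rnorm(5000))

# Group-level means; V1 is data.table's default name for an unnamed j result
agg <- DT[ , mean(outcome), by = .(grp, time)]
grps <- unique(agg$grp)
cols <- seq_along(grps)  # assumed colours: palette indices 1..4

# Empty frame spanning all groups, then one line per group
plot(range(agg$time), range(agg$V1), type = "n",
     xlab = "time", ylab = "mean(outcome)")
for (i in seq_along(grps)) {
  sub <- agg[grp == grps[i]]
  lines(sub$time, sub$V1, col = cols[i])
}
legend("topright", legend = grps, col = cols, lty = 1)
```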

This strikes me as not the best way to do this, given what `data.table` can do by group.

Other sources have been pointing me to `matplot`, but that seems to require reshaping `DT` first (with `reshape` or similar).

Answer

Base **R** solution using `matplot` and `dcast`

```
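# Mean outcome for each (grp, time) cell; the question's table is DT, not dt
dt_agg <- DT[ , .(mean = mean(outcome)), by = .(grp, time)]
# Reshape to wide: one row per time, one column per group
dt_cast <- dcast(dt_agg, time ~ grp, value.var = "mean")
# One line per group column, plotted against time
dt_cast[ , matplot(time, .SD[ , !"time", with = FALSE],
                   type = "l", ylab = "mean", xlab = "")]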
```

Result: a single plot with one line per group (`a` to `d`), showing mean outcome against time.
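As a variation on the same idea, the sketch below is self-contained and lets `dcast` do the aggregation itself via `fun.aggregate`, then calls `matplot` outside of `j` so a legend can be added. The names `wide` and `grps`, the colours and the legend position are my own choices for illustration, not part of the answer above.

```r
library(data.table)

set.seed(12039)
DT <- data.table(id = rep(1:100, each = 50),
                 grp = rep(letters[1:4], each = 1250),
                 time = rep(1:50, 100),
                 outcome = rnorm(5000))

# Long to wide in one step: one row per time, one column of means per group
wide <- dcast(DT, time ~ grp, value.var = "outcome", fun.aggregate = mean)

# Every column except time is a group
grps <- setdiff(names(wide), "time")

# matplot draws one line per column of the matrix
matplot(wide$time, as.matrix(wide[ , grps, with = FALSE]),
        type = "l", lty = 1, col = seq_along(grps),
        xlab = "time", ylab = "mean(outcome)")
legend("topright", legend = grps, col = seq_along(grps), lty = 1)
```

The wide table should have 50 rows (one per time point) and the columns `time`, `a`, `b`, `c`, `d`, which is an easy thing to check before plotting.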