rashid - 11 months ago 51

R Question

I have code to fit linear models to data in R. My independent variable is Age and my dependent variable is Costs. I am interested in whether Costs generally increase with Age (in years). However, for some groups the intercept is about 10, whereas for others it is about 1000, so a slope in currency units per year is not helpful for comparison: a slope of 1 currency unit per year is a lot relative to an intercept of 10 but negligible relative to an intercept of 1000. Could anybody help me standardize the slopes in R so that I can compare them after calculating them with `lm`?

```
library(ggplot2)

# Two groups with similar slopes but very different baselines
data.ex <- data.frame(Age = c(1:10, 1:10),
                      Costs = c(11, 12, 13, 14, 15, 12, 17, 18, 19, 20,
                                1001, 1002, 1003, 1004, 999, 1006, 1007, 1008, 1009, 1010),
                      Type = c(rep("A", 10), rep("B", 10)))

# One fitted regression line per group
pt <- ggplot(data = data.ex, aes(x = Age, y = Costs)) +
  geom_smooth(method = "lm") +
  facet_wrap(facets = "Type", nrow = 2)
plot(pt)

# Fit a separate linear model for each group
print(with(data.ex[data.ex$Type == "A", ], lm(Costs ~ Age)))
print(with(data.ex[data.ex$Type == "B", ], lm(Costs ~ Age)))
```

Answer Source

As pointed out by others, setting `scales = 'free'` in `facet_wrap` will make both lines more visible in the plot.
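As a minimal sketch of that suggestion, the plot from the question could be redrawn with independent axes per panel. The data frame is repeated here only so the snippet runs on its own, and it assumes ggplot2 is installed:

```r
library(ggplot2)

# Example data from the question
data.ex <- data.frame(Age = rep(1:10, times = 2),
                      Costs = c(11, 12, 13, 14, 15, 12, 17, 18, 19, 20,
                                1001, 1002, 1003, 1004, 999, 1006, 1007, 1008, 1009, 1010),
                      Type = rep(c("A", "B"), each = 10))

# scales = "free" gives each panel its own axis ranges,
# so the slope in group A is no longer flattened by group B's range
pt <- ggplot(data.ex, aes(x = Age, y = Costs)) +
  geom_smooth(method = "lm") +
  facet_wrap(facets = "Type", nrow = 2, scales = "free")
plot(pt)
```

Since both groups cover the same Age range, `scales = "free_y"` should give essentially the same picture.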

To your other question, your wording is a bit unclear, but it sounds like you're saying, "If baseline costs start at $10, then an increase of $1/year is substantial, whereas at a baseline cost of $1,000, $1/year isn't significant. How do I show that difference?"

One way would be to normalize each group against its intercept:

```
library(dplyr)

# calculate intercepts for each group and extract them:
intercept.ex <- group_by(data.ex, Type) %>%
  do(data.frame(intercept = coef(lm(Costs ~ Age, data = .))[1]))

# normalize the values in each group against their intercepts
data.ex <- merge(data.ex, intercept.ex) %>%
  mutate(Costs = Costs / intercept)

# Age slope = 0.1002
print(with(data.ex[data.ex$Type == "A", ], lm(Costs ~ Age)))

# Age slope = 0.001037
print(with(data.ex[data.ex$Type == "B", ], lm(Costs ~ Age)))
```
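Because `lm` is linear in the response, dividing Costs by a constant divides every coefficient by that same constant. The normalized slope is therefore just the original slope divided by the original intercept. A base-R sketch of that shortcut, which leaves the data untouched (the name `rel.slopes` is only illustrative):

```r
# Example data from the question
data.ex <- data.frame(Age = rep(1:10, times = 2),
                      Costs = c(11, 12, 13, 14, 15, 12, 17, 18, 19, 20,
                                1001, 1002, 1003, 1004, 999, 1006, 1007, 1008, 1009, 1010),
                      Type = rep(c("A", "B"), each = 10))

# Fit one model per group and return slope / intercept,
# i.e. the yearly change as a fraction of the fitted baseline at Age = 0
rel.slopes <- sapply(split(data.ex, data.ex$Type), function(d) {
  fit <- coef(lm(Costs ~ Age, data = d))
  unname(fit["Age"] / fit["(Intercept)"])
})
print(rel.slopes)
```

This should reproduce the two slopes quoted in the comments above, roughly 0.1002 for A and 0.001037 for B.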

I should point out that both slopes are still *statistically significant*, since the relationship between Age and Costs is quite clear. But the *relative effect size* is very different between them.
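To check the significance claim, the p-value of the Age coefficient can be read from `summary()` for each group. This is a sketch on the original, unnormalized example data. Dividing Costs by a constant rescales the slope and its standard error by the same factor, so the p-values are the same before and after normalization.

```r
# Example data from the question
data.ex <- data.frame(Age = rep(1:10, times = 2),
                      Costs = c(11, 12, 13, 14, 15, 12, 17, 18, 19, 20,
                                1001, 1002, 1003, 1004, 999, 1006, 1007, 1008, 1009, 1010),
                      Type = rep(c("A", "B"), each = 10))

# Extract the p-value of the Age slope from each group's model summary
p.values <- sapply(split(data.ex, data.ex$Type), function(d) {
  fit <- lm(Costs ~ Age, data = d)
  summary(fit)$coefficients["Age", "Pr(>|t|)"]
})
print(p.values)
```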