aelwan - 5 months ago 78

R Question

Using `df` and the code below:

```
library(dplyr)
library(ggplot2)
library(ggpmisc)

df <- diamonds %>%
  dplyr::filter(cut %in% c("Fair", "Ideal")) %>%
  dplyr::filter(clarity %in% c("I1", "SI2", "SI1", "VS2", "VS1", "VVS2")) %>%
  dplyr::mutate(new_price = ifelse(cut == "Fair",
                                   price * 0.5,
                                   price * 1.1))

formula <- y ~ x

ggplot(df, aes(x = new_price, y = carat, color = cut)) +
  geom_point(alpha = 0.3) +
  facet_wrap(~clarity, scales = "free_y") +
  geom_smooth(method = "lm", formula = formula, se = F) +
  stat_poly_eq(aes(label = paste(..rr.label..)),
               label.x.npc = "right", label.y.npc = 0.15,
               formula = formula, parse = TRUE, size = 3)
```

I got this plot

In addition to R2, I want to add p-values to the facets as well. I can do this manually by running the regression first, extracting the p-values, and adding them with `geom_text()`. Is there a faster or automated way to do that, e.g. similar to the way the R2 values were added?
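For reference, the manual route described here might look roughly like the sketch below. It assumes one regression of `carat` on `new_price` per `clarity` x `cut` group (matching the lines drawn by `geom_smooth()`). The names `slope_p()`, `p_values` and `p_manual` are illustrative only and not part of any package.

```r
library(dplyr)
library(ggplot2)

# Same data preparation as in the question.
df <- diamonds %>%
  filter(cut %in% c("Fair", "Ideal"),
         clarity %in% c("I1", "SI2", "SI1", "VS2", "VS1", "VVS2")) %>%
  mutate(new_price = ifelse(cut == "Fair", price * 0.5, price * 1.1))

# Hypothetical helper: slope p-value of carat ~ new_price for one group.
slope_p <- function(d) {
  fit <- lm(carat ~ new_price, data = d)
  summary(fit)$coefficients["new_price", "Pr(>|t|)"]
}

# One row per clarity x cut combination that actually has data.
groups <- split(df, list(df$clarity, df$cut), drop = TRUE)
p_values <- do.call(rbind, lapply(groups, function(d) {
  data.frame(clarity = d$clarity[1], cut = d$cut[1], p_value = slope_p(d))
}))

# Place the labels in the bottom-right corner of each facet,
# stacking the two cuts so they do not overlap.
p_manual <- ggplot(df, aes(x = new_price, y = carat, color = cut)) +
  geom_point(alpha = 0.3) +
  facet_wrap(~clarity, scales = "free_y") +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  geom_text(
    data = p_values,
    aes(x = Inf, y = -Inf,
        label = paste0("p = ", signif(p_value, 3)),
        vjust = ifelse(cut == "Fair", -2.5, -1)),
    hjust = 1.1, size = 3, show.legend = FALSE
  )

p_manual
```

The label positions (`hjust`, `vjust`) are hand-tuned guesses and would likely need adjusting for a different layout, which is part of why an automated stat is attractive.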

The p-value I'm talking about is the

Answer

Use `stat_fit_glance`, which is part of the `ggpmisc` package in R. This package is an extension of `ggplot2`, so it works well with it.

```
# Assumes `df` and `formula <- y ~ x` from the question.
library(ggplot2)
library(ggpmisc)

ggplot(df, aes(x = new_price, y = carat, color = cut)) +
  geom_point(alpha = 0.3) +
  facet_wrap(~clarity, scales = "free_y") +
  geom_smooth(method = "lm", formula = formula, se = FALSE) +
  # R2 label, as in the question
  stat_poly_eq(aes(label = paste(..rr.label..)),
               label.x.npc = "right", label.y.npc = 0.15,
               formula = formula, parse = TRUE, size = 3) +
  # p-value label, taken from the one-row summary of the lm fit
  stat_fit_glance(method = "lm",
                  method.args = list(formula = formula),
                  geom = "text",
                  aes(label = paste("P-value = ", signif(..p.value.., digits = 4), sep = "")),
                  label.x.npc = "right", label.y.npc = 0.35, size = 3)
```

`stat_fit_glance` basically takes a model fitted through `lm()` in R and allows its summary to be processed and printed using `ggplot2`. This website has a rundown of some of the functions, like `stat_fit_glance`: http://rpackages.ianhowson.com/cran/ggpmisc/. Also, I believe this gives the model p-value, not the slope p-value (in general); the two would differ for multiple linear regression. For simple linear regression they should be the same, though.
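To illustrate the model versus slope p-value point, here is a small base-R sketch. It uses the built-in `mtcars` data rather than `df`, purely for brevity, and computes the overall F-test p-value by hand. As far as I know, that is the quantity `broom::glance()` reports as `p.value` for an `lm` fit, but treat that as an assumption. The helpers `model_p()` and `coef_p()` are made-up names.

```r
# Hypothetical helper: p-value of the overall F-test for an lm fit.
model_p <- function(fit) {
  f <- summary(fit)$fstatistic
  unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))
}

# Hypothetical helper: p-value of the t-test for a single coefficient.
coef_p <- function(fit, term) {
  summary(fit)$coefficients[term, "Pr(>|t|)"]
}

simple   <- lm(mpg ~ wt, data = mtcars)
multiple <- lm(mpg ~ wt + hp, data = mtcars)

# One predictor: the F-test and the slope t-test agree
# (up to floating-point error), because F equals t squared.
all.equal(model_p(simple), coef_p(simple, "wt"))

# Two predictors: the model p-value is a single number for the whole fit,
# while each coefficient has its own p-value.
c(model = model_p(multiple),
  wt    = coef_p(multiple, "wt"),
  hp    = coef_p(multiple, "hp"))
```

Since the plot in question fits `y ~ x` with a single predictor per group, the label should match the slope p-value in this case.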

Here is the plot produced by the `stat_fit_glance` call: