Jørgen K. Kanters - 6 months ago 51

R Question

I want to plot two variables against each other using ggplot. Later on I want to use a nonlinear fit, but I am getting an error message I do not fully understand. I can see that others have had similar problems, but I could not follow the answers.

I have a dataset `ost` containing 4 variables. There are no NAs in the dataset.

Using ggplot2, I want to plot the data with a regression line. For simplicity I start with a simple linear regression:

```
library(ggplot2)

qt_int <- c(404, 402, 426, 392, 418, 410)
rr <- c(1000, 958, 982, 752, 824, 844)
gender <- c('male','female','female','female','female','female')
deltnr <- c(10445, 1022, 9122, 60, 246, 306)

df <- data.frame(deltnr, gender, qt_int, rr)

p <- ggplot(df, aes(rr, qt_int))
p <- p + geom_point(size = 2)
p <- p + stat_smooth(method = "lm", formula = qt_int ~ rr)
p
```

I get the following warning message:

```
Warning messages:
1: 'newdata' had 80 rows but variables found have 6702 rows
2: Computation failed in `stat_smooth()`:
argumenter antyder forskelligt antal rækker: 80, 6 [En: arguments suggest different number of rows: 80, 6]
```

Strangely enough, it works if I omit `formula`, but I want to do a nonlinear fit later on, so I need to get it working with a formula.

What am I missing?

Answer

Formulas in `stat_smooth()` should use the names of the aesthetics (`x`, `y`, `colour`, etc.), not the names of the variables. See `help("stat_smooth")`:

formula: formula to use in smoothing function, eg. ‘y ~ x’, ‘y ~ poly(x, 2)’, ‘y ~ log(x)’
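A rough way to see why variable names fail is to mimic what `stat_smooth()` appears to do: fit the model on a data frame whose columns are the aesthetics (`x`, `y`), then predict on a grid of 80 `x` values (80 being the default for its `n` argument). The sketch below uses only base R and the six-row example from the question. The names `plot_data`, `grid`, `fit_bad` and `fit_ok` are illustrative and are not part of ggplot2.

```r
# Example data from the question
qt_int <- c(404, 402, 426, 392, 418, 410)
rr     <- c(1000, 958, 982, 752, 824, 844)

# Assumption: the smoother sees a data frame with columns named after the aesthetics
plot_data <- data.frame(x = rr, y = qt_int)

# The prediction grid also only has a column called x
grid <- data.frame(x = seq(min(rr), max(rr), length.out = 80))

# Variable names: lm() cannot find rr or qt_int in plot_data,
# so it silently picks them up from the enclosing environment instead
fit_bad <- lm(qt_int ~ rr, data = plot_data)

# predict() cannot find rr in grid either, falls back to the 6 original
# values and warns that 'newdata' had 80 rows
warn_msg <- NULL
pred_bad <- withCallingHandlers(
  predict(fit_bad, newdata = grid),
  warning = function(w) {
    warn_msg <<- conditionMessage(w)  # keep the message for inspection
    invokeRestart("muffleWarning")
  }
)

# Aesthetic names: both the fit and the prediction use the supplied columns
fit_ok  <- lm(y ~ x, data = plot_data)
pred_ok <- predict(fit_ok, newdata = grid)

length(pred_bad)  # 6 predictions, which cannot be paired with 80 grid points
length(pred_ok)   # 80 predictions, one per grid point
```

The mismatch between 80 grid points and 6 predictions is consistent with the "different number of rows: 80, 6" message in the question.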

OP wrote:

```
p <- p + stat_smooth(method = "lm", formula = qt_int ~ rr)
```

But the correct way to write the formula is:

```
p <- p + stat_smooth(method = "lm", formula = y ~ x)
```

This produces the expected output: the scatter plot with the fitted regression line.
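For reference, here is a self-contained sketch of the corrected plot, together with a curved fit using the `y ~ poly(x, 2)` form quoted from the help page. It assumes ggplot2 is installed and reuses the six-row example data. The object names `p_linear`, `p_quadratic` and `smooth_linear` are made up for illustration, and the quadratic is only a stand-in for whatever nonlinear model is eventually wanted.

```r
library(ggplot2)

# Example data from the question (only the two plotted variables)
df <- data.frame(
  rr     = c(1000, 958, 982, 752, 824, 844),
  qt_int = c(404, 402, 426, 392, 418, 410)
)

# Linear fit: the formula refers to the aesthetics x and y, not to rr and qt_int
p_linear <- ggplot(df, aes(rr, qt_int)) +
  geom_point(size = 2) +
  stat_smooth(method = "lm", formula = y ~ x)

# Curved fit: same idea, still written in terms of x and y
p_quadratic <- ggplot(df, aes(rr, qt_int)) +
  geom_point(size = 2) +
  stat_smooth(method = "lm", formula = y ~ poly(x, 2))

# Inspect the data computed by the smoothing layer (layer 2) without drawing
smooth_linear <- layer_data(p_linear, 2)
nrow(smooth_linear)  # one row per point on the fitted line

# print(p_linear) or print(p_quadratic) draws the plots
```

Because the formula is always expressed in `x` and `y`, switching to a different model only means changing the right-hand side of the formula (and, where needed, `method`), not the `aes()` mapping.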