Jørgen K. Kanters - 1 month ago
R Question

stat_smooth - variable length differs

I want to plot two variables against each other using ggplot2. Later on I want to use a nonlinear fit, but I am getting an error message that I do not fully understand. I can see that others have had similar problems, but I may not be bright enough to understand the answers.

I have a dataset ost containing 4 variables. There are no NAs in the dataset.

Using ggplot2, I want to plot the data with a regression line. For simplicity, I start with a simple linear regression:

library(ggplot2)

qt_int <- c(404, 402, 426, 392, 418, 410)
rr <- c(1000, 958, 982, 752, 824, 844)
gender <- c('male','female','female','female','female','female')
deltnr <- c(10445, 1022, 9122, 60, 246, 306)
df <- data.frame(deltnr, gender, qt_int, rr)

p <- ggplot(df, aes(rr, qt_int))
p <- p + geom_point(size = 2)
p <- p + stat_smooth(method = "lm", formula = qt_int ~ rr)
p


I get the following warning message:


Warning messages:
1: 'newdata' had 80 rows but variables found have 6702 rows
2: Computation failed in stat_smooth():
argumenter antyder forskelligt antal rækker: 80, 6
[En: arguments suggest different number of rows: 80, 6]


Strangely enough, it works if I omit the formula argument. But I want to do a nonlinear fit later on, so I need to get it to work with a formula.

What am I missing?

Answer

Formulas in stat_smooth() must be written in terms of the aesthetics x and y, not the names of the variables mapped to them. See help("stat_smooth"):

formula: formula to use in smoothing function, eg. ‘y ~ x’, ‘y ~ poly(x, 2)’, ‘y ~ log(x)’
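To see why the original formula fails, here is a rough base-R sketch of what happens behind the scenes. It assumes, as a simplification, that stat_smooth() fits the model on an internal data frame whose columns are named x and y, and then predicts on a grid of 80 x values (the default n = 80). The names d, grid, fit_bad and fit_ok are made up for illustration:

```r
# Example vectors from the question, living in the global environment
qt_int <- c(404, 402, 426, 392, 418, 410)
rr     <- c(1000, 958, 982, 752, 824, 844)

# Assumed stand-in for the data stat_smooth() works on: columns x and y only
d    <- data.frame(x = rr, y = qt_int)
grid <- data.frame(x = seq(min(d$x), max(d$x), length.out = 80))

# qt_int and rr are not columns of d, so lm() silently picks up the
# global vectors instead
fit_bad  <- lm(qt_int ~ rr, data = d)
# 'rr' is not in grid either, so predict() falls back to the global rr
# and warns that 'newdata' had 80 rows but the variables found have 6
pred_bad <- predict(fit_bad, newdata = grid)
length(pred_bad)

# Using the column names that actually exist gives one prediction per grid row
fit_ok  <- lm(y ~ x, data = d)
pred_ok <- predict(fit_ok, newdata = grid)
length(pred_ok)
```

With qt_int ~ rr, neither name exists in the internal data frame, so R falls back to whatever vectors of that name it finds outside it. Prediction on the 80-row grid then returns one value per original observation, which is the row mismatch the warning reports.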

OP wrote:

p <- p + stat_smooth(method = "lm", formula = qt_int ~ rr)

But the correct way to write the formula is:

p <- p + stat_smooth(method = "lm", formula = y ~ x)

Which produces the expected output:

[Image: scatter plot of qt_int against rr with the fitted linear regression line]
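Since the goal is a nonlinear fit later on, the same rule carries over: keep x and y in the formula and change only its right-hand side or the method. Below is a minimal sketch using the example data from the question. The power-law model y ~ a * x^b and the way its starting values are obtained are assumptions picked for illustration, not something taken from the question:

```r
library(ggplot2)

df <- data.frame(
  rr     = c(1000, 958, 982, 752, 824, 844),
  qt_int = c(404, 402, 426, 392, 418, 410)
)

# Illustrative starting values for nls(), taken from a log-log linear fit
start_fit <- lm(log(qt_int) ~ log(rr), data = df)
start <- list(a = exp(coef(start_fit)[[1]]), b = coef(start_fit)[[2]])

p <- ggplot(df, aes(rr, qt_int)) +
  geom_point(size = 2) +
  # Curved fit that is still a linear model, so method = "lm" is enough
  stat_smooth(method = "lm", formula = y ~ poly(x, 2)) +
  # Nonlinear least squares; se = FALSE because predictions from an nls
  # fit do not come with standard errors for the confidence band
  stat_smooth(method = "nls", formula = y ~ a * x^b,
              method.args = list(start = start),
              se = FALSE, colour = "red")
p
```

Extra arguments for the fitting function, such as start for nls(), go through method.args rather than directly into stat_smooth().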
