jbssm - 28 days ago 5
R Question

# Using smooth in ggplot2 to fit a linear model using the errors given in the data

I have this data frame:

```
> dat
   x         y        yerr
1 -1 -1.132711 0.001744498
2 -2 -2.119657 0.003889120
3 -3 -3.147378 0.007521881
4 -4 -4.220129 0.012921450
5 -5 -4.586586 0.021335644
6 -6 -5.389198 0.032892630
7 -7 -6.002848 0.048230946
```

I can plot it with error bars and a linear fit, including the fit's standard-error band, like this:

```
library(ggplot2)

# Points with error bars, plus an unweighted linear fit and its confidence band
p <- ggplot(dat, aes(x=x, y=y)) + geom_point()
p <- p + geom_errorbar(aes(ymin=y-yerr, ymax=y+yerr), width=0.09)
p + geom_smooth(method = "lm", formula = y ~ x)
```

But what I need is to use the yerr to fit my linear model. Is it possible with ggplot2?

Well, I found a way to answer this.

In any correctly executed scientific experiment, every measured value has an associated error.

In some cases the variance of that error is the same for all points, but in many, including the case in the original question, it is not. The fit should then take into account the different error variances of the individual measurements.
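To make this concrete, a weighted straight-line fit $y = a + b\,x$ can be written as the minimization below. The symbols $a$, $b$ and $w_i$ are notation introduced here for illustration and do not appear in the original post:

$$
S(a, b) = \sum_{i=1}^{n} w_i \,\bigl(y_i - a - b\,x_i\bigr)^2
$$

Setting all $w_i$ equal recovers the ordinary unweighted fit. Giving a point a larger $w_i$ pulls the line closer to it, so precise measurements should get large weights and noisy ones small weights.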

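The way to do this is to give each point a weight in the fit. For weighted least squares, which is what `lm` does when it is given weights, the weights should be inversely proportional to the variance of the error. Assuming `yerr` is the standard deviation (one-sigma error) of each `y` value, that means `1/yerr^2`. Since `geom_smooth` passes the `weight` aesthetic on to `lm`, it becomes:

```
# weight is only used by geom_smooth, which hands it to lm() as `weights`
p <- ggplot(dat, aes(x=x, y=y, weight = 1/yerr^2)) +
  geom_point() +
  geom_errorbar(aes(ymin=y-yerr, ymax=y+yerr), width=0.09) +
  geom_smooth(method = "lm", formula = y ~ x)
p
```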
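
As a cross-check outside ggplot2, here is a minimal sketch that calls `lm()` directly on the data from the question. It assumes `yerr` is a one-sigma uncertainty, so the inverse-variance weights are `1/yerr^2`. The names `w`, `fit_unweighted`, `fit_weighted`, `slope` and `intercept` are made up for this example. It also recomputes the weighted slope and intercept from the textbook closed-form sums, so you can see that `lm()` with `weights` gives the same line:

```r
# Data from the question
dat <- data.frame(
  x    = -(1:7),
  y    = c(-1.132711, -2.119657, -3.147378, -4.220129,
           -4.586586, -5.389198, -6.002848),
  yerr = c(0.001744498, 0.003889120, 0.007521881, 0.012921450,
           0.021335644, 0.032892630, 0.048230946)
)

# Assumption: yerr is a one-sigma error, so weight = 1 / variance
w <- 1 / dat$yerr^2

# Ordinary and weighted least-squares fits
fit_unweighted <- lm(y ~ x, data = dat)
fit_weighted   <- lm(y ~ x, data = dat, weights = w)

# Closed-form weighted least-squares solution, for comparison with lm()
Sw  <- sum(w)
Sx  <- sum(w * dat$x)
Sy  <- sum(w * dat$y)
Sxx <- sum(w * dat$x^2)
Sxy <- sum(w * dat$x * dat$y)
slope     <- (Sw * Sxy - Sx * Sy) / (Sw * Sxx - Sx^2)
intercept <- (Sy - slope * Sx) / Sw

print(coef(fit_unweighted))
print(coef(fit_weighted))
print(c(intercept = intercept, slope = slope))
```

The weighted coefficients should match the closed-form values and differ from the unweighted ones, because the points with the smallest `yerr` dominate the weighted fit.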