Toni - 1 year ago 116
R Question

# Fit a line to 3d point cloud in R

I have point clouds (xyz-coordinates) that I need to fit a linear model to. I thought I'd use lm() for that.

This is what I tried:

``````library(scatterplot3d)

# example points
x <- c(1,4,3,6,2,5)
y <- c(2,2,4,3,5,9)
z <- c(1,3,5,9,2,2)

# plot
s <- scatterplot3d(x,y,z, type="b")

# fit the model
ff = lm(z ~ x + y) ## in ff\$coefficients are the line paramters z, mx, ny

# create coordinates for a short line (the fit) to plot
llx = c(min(x), max(x))
lly = c(min(y), max(y))
llz = c(
ff\$coefficients[[1]] + llx[1] * ff\$coefficients[[2]] + lly[1] * ff\$coefficients[[3]],
ff\$coefficients[[1]] + llx[2] * ff\$coefficients[[2]] + lly[2] * ff\$coefficients[[3]]
)

## create 2d coordinates to place in scatterplot
p0 <- s\$xyz.convert(llx[1],lly[1],llz[1])
p1 <- s\$xyz.convert(llx[2],lly[2],llz[2])

# draw line
segments(p0\$x,p0\$y,p1\$x,p1\$y,lwd=2,col=2)
``````

Although, the red line looks convincingly like a fit I'm not sure it is. If you rotate the plot, it doesn't look very good.

``````for(i in seq(from=30, to=60, by=1)){
s <- scatterplot3d(x,y,z, type="b", angle=i)
segments(p0\$x,p0\$y,p1\$x,p1\$y,lwd=2,col=2)
Sys.sleep(0.1)
}
``````

Is this just due to 2d projection of the line?!? Can you somehow update the coordinates? I tried to give the \$xyz.convert() function an "angle" attribute, without luck.

Also, when I use only two example points the fit fails.

``````x <- c(1,4)
y <- c(2,5)
z <- c(1,3)
``````

I'd appreciate a confirmation whether I use lm() correctly.
Thanks!

[EDIT]
I learned that lm() fitted a plane to the data based on the model I gave it (z~x+y). This was not what I wanted. In fact, I misunderstood of lm() entirely. Also for 2d data lm(y~x) tries to minimize the vertical space between fit and data. But, I wanted the data to be treated as completely independent (spatial data) and minimize the perpendiculars between fit and data (as first paragraph here: http://mathpages.com/home/kmath110.htm).

The answer marked as correct does exactly this. The principle is called "principle component analysis".

`lm(z ~ x + y)`'s fit points make not a line but a plane. Your segment indeed belongs to the plane.

``````s <- scatterplot3d(x,y,z, type="b")
s\$plane3d(ff)
segments(p0\$x,p0\$y,p1\$x,p1\$y,lwd=2,col=2)

# rgl
library(rgl)
planes3d(ff\$coef[2], ff\$coef[3], -1, ff\$coef[1], col = 4, alpha = 0.3)
segments3d(llx, lly, llz, lwd=2, col=2)
``````

### [EDITED]

What you want is to fit a line to 3-dimensional data, in other words, summarize 3-dim into 1-dim. I think the line consists of principal component analyze's 1st component (i.e., mean + t * PC1, this line minimizes total least squares). I referred to "R mailing help: Fit a 3-Dimensional Line to Data Points" and "MathWorks: Fitting an Orthogonal Regression Using Principal Components Analysis".

``````x <- c(1,4,3,6,2,5)
y <- c(2,2,4,3,5,9)
z <- c(1,3,5,9,2,2)

xyz <- data.frame(x = x, y = y, z = z)
N <- nrow(xyz)

mean_xyz <- apply(xyz, 2, mean)
xyz_pca   <- princomp(xyz)

xyz_fit <- matrix(rep(mean_xyz, each = N), ncol=3) + xyz_pca\$score[, 1] %*% t(dirVector)

t_ends <- c(min(xyz_pca\$score[,1]) - 0.2, max(xyz_pca\$score[,1]) + 0.2)  # for both ends of line
endpts <- rbind(mean_xyz + t_ends[1]*dirVector, mean_xyz + t_ends[2]*dirVector)

library(scatterplot3d)
s3d <- scatterplot3d(xyz, type="b")
s3d\$points3d(endpts, type="l", col="blue", lwd=2)
for(i in 1:N) s3d\$points3d(rbind(xyz[i,], xyz_fit[i,]), type="l", col="green3", lty=2)

library(rgl)