Toni Toni - 1 month ago 16
R Question

Fit a line to 3d point cloud in R

I have point clouds (xyz-coordinates) that I need to fit a linear model to. I thought I'd use lm() for that.

This is what I tried:

library(scatterplot3d)

# example points
x <- c(1,4,3,6,2,5)
y <- c(2,2,4,3,5,9)
z <- c(1,3,5,9,2,2)

# plot
s <- scatterplot3d(x,y,z, type="b")

# fit the model
ff = lm(z ~ x + y) ## in ff$coefficients are the line paramters z, mx, ny

# create coordinates for a short line (the fit) to plot
llx = c(min(x), max(x))
lly = c(min(y), max(y))
llz = c(
ff$coefficients[[1]] + llx[1] * ff$coefficients[[2]] + lly[1] * ff$coefficients[[3]],
ff$coefficients[[1]] + llx[2] * ff$coefficients[[2]] + lly[2] * ff$coefficients[[3]]
)

## create 2d coordinates to place in scatterplot
p0 <- s$xyz.convert(llx[1],lly[1],llz[1])
p1 <- s$xyz.convert(llx[2],lly[2],llz[2])

# draw line
segments(p0$x,p0$y,p1$x,p1$y,lwd=2,col=2)


Although, the red line looks convincingly like a fit I'm not sure it is. If you rotate the plot, it doesn't look very good.

for(i in seq(from=30, to=60, by=1)){
s <- scatterplot3d(x,y,z, type="b", angle=i)
segments(p0$x,p0$y,p1$x,p1$y,lwd=2,col=2)
Sys.sleep(0.1)
}


Is this just due to 2d projection of the line?!? Can you somehow update the coordinates? I tried to give the $xyz.convert() function an "angle" attribute, without luck.

Also, when I use only two example points the fit fails.

x <- c(1,4)
y <- c(2,5)
z <- c(1,3)


I'd appreciate a confirmation whether I use lm() correctly.
Thanks!

[EDIT]
I learned that lm() fitted a plane to the data based on the model I gave it (z~x+y). This was not what I wanted. In fact, I misunderstood of lm() entirely. Also for 2d data lm(y~x) tries to minimize the vertical space between fit and data. But, I wanted the data to be treated as completely independent (spatial data) and minimize the perpendiculars between fit and data (as first paragraph here: http://mathpages.com/home/kmath110.htm).

The answer marked as correct does exactly this. The principle is called "principle component analysis".

Answer

lm(z ~ x + y)'s fit points make not a line but a plane. Your segment indeed belongs to the plane.

s <- scatterplot3d(x,y,z, type="b")
s$plane3d(ff)
segments(p0$x,p0$y,p1$x,p1$y,lwd=2,col=2) 

# rgl
library(rgl)
plot3d(x, y, z, type="s", rad=0.1)
planes3d(ff$coef[2], ff$coef[3], -1, ff$coef[1], col = 4, alpha = 0.3)
segments3d(llx, lly, llz, lwd=2, col=2) 

enter image description here

[EDITED]

What you want is to fit a line to 3-dimensional data, in other words, summarize 3-dim into 1-dim. I think the line consists of principal component analyze's 1st component (i.e., mean + t * PC1, this line minimizes total least squares). I referred to "R mailing help: Fit a 3-Dimensional Line to Data Points" and "MathWorks: Fitting an Orthogonal Regression Using Principal Components Analysis".

x <- c(1,4,3,6,2,5)
y <- c(2,2,4,3,5,9)
z <- c(1,3,5,9,2,2)

xyz <- data.frame(x = x, y = y, z = z)
N <- nrow(xyz) 

mean_xyz <- apply(xyz, 2, mean)
xyz_pca   <- princomp(xyz) 
dirVector <- xyz_pca$loadings[, 1]   # PC1

xyz_fit <- matrix(rep(mean_xyz, each = N), ncol=3) + xyz_pca$score[, 1] %*% t(dirVector) 

t_ends <- c(min(xyz_pca$score[,1]) - 0.2, max(xyz_pca$score[,1]) + 0.2)  # for both ends of line
endpts <- rbind(mean_xyz + t_ends[1]*dirVector, mean_xyz + t_ends[2]*dirVector)

library(scatterplot3d) 
s3d <- scatterplot3d(xyz, type="b")
s3d$points3d(endpts, type="l", col="blue", lwd=2)
for(i in 1:N) s3d$points3d(rbind(xyz[i,], xyz_fit[i,]), type="l", col="green3", lty=2)

library(rgl)
plot3d(xyz, type="s", rad=0.1)
abclines3d(mean_xyz, a = dirVector, col="blue", lwd=2)     # mean + t * direction_vector
for(i in 1:N) segments3d(rbind(xyz[i,], xyz_fit[i,]), col="green3")

enter image description here