Toni - 4 months ago 30

R Question

I have point clouds (xyz-coordinates) that I need to fit a linear model to. I thought I'd use lm() for that.

This is what I tried:

`library(scatterplot3d)`

# example points

x <- c(1,4,3,6,2,5)

y <- c(2,2,4,3,5,9)

z <- c(1,3,5,9,2,2)

# plot

s <- scatterplot3d(x,y,z, type="b")

# fit the model

ff = lm(z ~ x + y) ## in ff$coefficients are the line paramters z, mx, ny

# create coordinates for a short line (the fit) to plot

llx = c(min(x), max(x))

lly = c(min(y), max(y))

llz = c(

ff$coefficients[[1]] + llx[1] * ff$coefficients[[2]] + lly[1] * ff$coefficients[[3]],

ff$coefficients[[1]] + llx[2] * ff$coefficients[[2]] + lly[2] * ff$coefficients[[3]]

)

## create 2d coordinates to place in scatterplot

p0 <- s$xyz.convert(llx[1],lly[1],llz[1])

p1 <- s$xyz.convert(llx[2],lly[2],llz[2])

# draw line

segments(p0$x,p0$y,p1$x,p1$y,lwd=2,col=2)

Although, the red line looks convincingly like a fit I'm not sure it is. If you rotate the plot, it doesn't look very good.

`for(i in seq(from=30, to=60, by=1)){`

s <- scatterplot3d(x,y,z, type="b", angle=i)

segments(p0$x,p0$y,p1$x,p1$y,lwd=2,col=2)

Sys.sleep(0.1)

}

Is this just due to 2d projection of the line?!? Can you somehow update the coordinates? I tried to give the $xyz.convert() function an "angle" attribute, without luck.

Also, when I use only two example points the fit fails.

`x <- c(1,4)`

y <- c(2,5)

z <- c(1,3)

I'd appreciate a confirmation whether I use lm() correctly.

Thanks!

I learned that lm() fitted a plane to the data based on the model I gave it (z~x+y). This was not what I wanted. In fact, I misunderstood of lm() entirely. Also for 2d data lm(y~x) tries to minimize the vertical space between fit and data. But, I wanted the data to be treated as completely independent (spatial data) and minimize the perpendiculars between fit and data (as first paragraph here: http://mathpages.com/home/kmath110.htm).

The answer marked as correct does exactly this. The principle is called "principle component analysis".

Answer

`lm(z ~ x + y)`

's fit points make not a line but a plane. Your segment indeed belongs to the plane.

```
s <- scatterplot3d(x,y,z, type="b")
s$plane3d(ff)
segments(p0$x,p0$y,p1$x,p1$y,lwd=2,col=2)
# rgl
library(rgl)
plot3d(x, y, z, type="s", rad=0.1)
planes3d(ff$coef[2], ff$coef[3], -1, ff$coef[1], col = 4, alpha = 0.3)
segments3d(llx, lly, llz, lwd=2, col=2)
```

What you want is to fit a line to 3-dimensional data, in other words, summarize 3-dim into 1-dim. I think the line consists of principal component analyze's 1st component (i.e., **mean + t * PC1**, this line minimizes total least squares). I referred to "R mailing help: Fit a 3-Dimensional Line to Data Points" and "MathWorks: Fitting an Orthogonal Regression Using Principal Components Analysis".

```
x <- c(1,4,3,6,2,5)
y <- c(2,2,4,3,5,9)
z <- c(1,3,5,9,2,2)
xyz <- data.frame(x = x, y = y, z = z)
N <- nrow(xyz)
mean_xyz <- apply(xyz, 2, mean)
xyz_pca <- princomp(xyz)
dirVector <- xyz_pca$loadings[, 1] # PC1
xyz_fit <- matrix(rep(mean_xyz, each = N), ncol=3) + xyz_pca$score[, 1] %*% t(dirVector)
t_ends <- c(min(xyz_pca$score[,1]) - 0.2, max(xyz_pca$score[,1]) + 0.2) # for both ends of line
endpts <- rbind(mean_xyz + t_ends[1]*dirVector, mean_xyz + t_ends[2]*dirVector)
library(scatterplot3d)
s3d <- scatterplot3d(xyz, type="b")
s3d$points3d(endpts, type="l", col="blue", lwd=2)
for(i in 1:N) s3d$points3d(rbind(xyz[i,], xyz_fit[i,]), type="l", col="green3", lty=2)
library(rgl)
plot3d(xyz, type="s", rad=0.1)
abclines3d(mean_xyz, a = dirVector, col="blue", lwd=2) # mean + t * direction_vector
for(i in 1:N) segments3d(rbind(xyz[i,], xyz_fit[i,]), col="green3")
```