Jamil - 9 months ago 66

R Question

I'm trying to code online PCA in R, there is no existing implementation of this code available, thus, it may be useful for others as well. The pseudo-code can be found here (Algorithm 1). What I've done so far is as follows:

`PCA<-function(X,k,epsilon){`

X_f<-norm(as.matrix(X),"f")

d<-nrow(X)

n<-ncol(X)

l<-floor((8*k)/(epsilon^2))

U<-matrix(0,d,l)

C<-matrix(0,d,d)

Y<-matrix(0,n,l)

for(t in 1:n){

r<-X[,t]-(U%*%t(U)%*%X[,t])

n<-C + r%*%t(r)

while(norm(n,"2") >= 2*(X_f^2)/l){

lamb<-eigen(C)$values[1]

u<-eigen(C)$vectors[,1]

U<-cbind(U,u)

#U[,which(!apply(U==0,2,all))]

C<-C-(lamb*(u%*%t(u)))

r<-X[,t]-(U%*%t(U)%*%X[,t])

}

C<-C+(r%*%t(r))

y<-matrix(0,1,l)

y<-t(U)%*%x_t

Y[t,]<-y

}

return(Y)

}

To test the code I used the famous fisher iris data:

`log.ir <- log(iris[, 1:4])`

ir.species <- iris[, 5]

ir.pca <- PCA(log.ir,50,0.2)

There seems to be a bug in the code, which is not so obvious to me, the while loop never stops, can some one please help?

Answer Source

It's because `while(norm(n,"2") >= 2*(X_f^2)/l)`

never finishes, `2*(X_f^2)/l)`

is always smaller than `norm(n,"2")`

In fact if you print out the values of these, and `debug(PCA)`

you'll see that they never change

```
function(X,k,epsilon){
X_f<-norm(as.matrix(X),"f")
d<-nrow(X)
n<-ncol(X)
l<-floor((8*k)/(epsilon^2))
U<-matrix(0,d,l)
C<-matrix(0,d,d)
Y<-matrix(0,n,l)
for(t in 1:n){
r<-X[,t]-(U%*%t(U)%*%X[,t])
n<-C + r%*%t(r)
while(norm(n,"2") >= 2*(X_f^2)/l){
print(norm(n,"2") )
print(2*(X_f^2)/l)
lamb<-eigen(C)$values[1]
u<-eigen(C)$vectors[,1]
U<-cbind(U,u)
U[,which(!apply(U==0,2,all))]
C<-C-(lamb*(u%*%t(u)))
r<-X[,t]-(U%*%t(U)%*%X[,t])
}
C<-C+(r%*%t(r))
y<-matrix(0,1,l)
y<-t(U)%*%x_t
Y[t,]<-y
}
return(Y)
}
debug(PCA)
```

In general using `print`

statements inside of functions you want to debug is a good way to diagnose problems.