Jamil Jamil - 1 month ago 18
R Question

Online PCA in R

I'm trying to code online PCA in R, there is no existing implementation of this code available, thus, it may be useful for others as well. The pseudo-code can be found here (Algorithm 1). What I've done so far is as follows:

PCA<-function(X,k,epsilon){
X_f<-norm(as.matrix(X),"f")
d<-nrow(X)
n<-ncol(X)
l<-floor((8*k)/(epsilon^2))
U<-matrix(0,d,l)
C<-matrix(0,d,d)
Y<-matrix(0,n,l)
for(t in 1:n){
r<-X[,t]-(U%*%t(U)%*%X[,t])
n<-C + r%*%t(r)
while(norm(n,"2") >= 2*(X_f^2)/l){
lamb<-eigen(C)$values[1]
u<-eigen(C)$vectors[,1]
U<-cbind(U,u)
#U[,which(!apply(U==0,2,all))]
C<-C-(lamb*(u%*%t(u)))
r<-X[,t]-(U%*%t(U)%*%X[,t])
}
C<-C+(r%*%t(r))
y<-matrix(0,1,l)
y<-t(U)%*%x_t
Y[t,]<-y
}
return(Y)
}


To test the code I used the famous fisher iris data:

log.ir <- log(iris[, 1:4])
ir.species <- iris[, 5]

ir.pca <- PCA(log.ir,50,0.2)


There seems to be a bug in the code, which is not so obvious to me, the while loop never stops, can some one please help?

Answer

It's because while(norm(n,"2") >= 2*(X_f^2)/l) never finishes, 2*(X_f^2)/l) is always smaller than norm(n,"2")

In fact if you print out the values of these, and debug(PCA) you'll see that they never change

function(X,k,epsilon){
  X_f<-norm(as.matrix(X),"f")
  d<-nrow(X)
  n<-ncol(X)
  l<-floor((8*k)/(epsilon^2))
  U<-matrix(0,d,l)
  C<-matrix(0,d,d)
  Y<-matrix(0,n,l)
  for(t in 1:n){
    r<-X[,t]-(U%*%t(U)%*%X[,t])
    n<-C + r%*%t(r)
    while(norm(n,"2") >= 2*(X_f^2)/l){
      print(norm(n,"2") )
      print(2*(X_f^2)/l)
      lamb<-eigen(C)$values[1]
      u<-eigen(C)$vectors[,1]
      U<-cbind(U,u)
      U[,which(!apply(U==0,2,all))]
      C<-C-(lamb*(u%*%t(u)))
      r<-X[,t]-(U%*%t(U)%*%X[,t])
    }
    C<-C+(r%*%t(r))
    y<-matrix(0,1,l)    
    y<-t(U)%*%x_t
    Y[t,]<-y
  }
  return(Y)
}

debug(PCA)

In general using print statements inside of functions you want to debug is a good way to diagnose problems.

Comments