Jamil - 1 year ago 94
R Question

# Online PCA in R

I'm trying to code online PCA in R, there is no existing implementation of this code available, thus, it may be useful for others as well. The pseudo-code can be found here (Algorithm 1). What I've done so far is as follows:

PCA<-function(X,k,epsilon){
X_f<-norm(as.matrix(X),"f")
d<-nrow(X)
n<-ncol(X)
l<-floor((8*k)/(epsilon^2))
U<-matrix(0,d,l)
C<-matrix(0,d,d)
Y<-matrix(0,n,l)
for(t in 1:n){
r<-X[,t]-(U%*%t(U)%*%X[,t])
n<-C + r%*%t(r)
while(norm(n,"2") >= 2*(X_f^2)/l){
lamb<-eigen(C)\$values[1]
u<-eigen(C)\$vectors[,1]
U<-cbind(U,u)
#U[,which(!apply(U==0,2,all))]
C<-C-(lamb*(u%*%t(u)))
r<-X[,t]-(U%*%t(U)%*%X[,t])
}
C<-C+(r%*%t(r))
y<-matrix(0,1,l)
y<-t(U)%*%x_t
Y[t,]<-y
}
return(Y)
}

To test the code I used the famous fisher iris data:

log.ir <- log(iris[, 1:4])
ir.species <- iris[, 5]

ir.pca <- PCA(log.ir,50,0.2)

There seems to be a bug in the code, which is not so obvious to me, the while loop never stops, can some one please help?

It's because while(norm(n,"2") >= 2*(X_f^2)/l) never finishes, 2*(X_f^2)/l) is always smaller than norm(n,"2")

In fact if you print out the values of these, and debug(PCA) you'll see that they never change

function(X,k,epsilon){
X_f<-norm(as.matrix(X),"f")
d<-nrow(X)
n<-ncol(X)
l<-floor((8*k)/(epsilon^2))
U<-matrix(0,d,l)
C<-matrix(0,d,d)
Y<-matrix(0,n,l)
for(t in 1:n){
r<-X[,t]-(U%*%t(U)%*%X[,t])
n<-C + r%*%t(r)
while(norm(n,"2") >= 2*(X_f^2)/l){
print(norm(n,"2") )
print(2*(X_f^2)/l)
lamb<-eigen(C)\$values[1]
u<-eigen(C)\$vectors[,1]
U<-cbind(U,u)
U[,which(!apply(U==0,2,all))]
C<-C-(lamb*(u%*%t(u)))
r<-X[,t]-(U%*%t(U)%*%X[,t])
}
C<-C+(r%*%t(r))
y<-matrix(0,1,l)
y<-t(U)%*%x_t
Y[t,]<-y
}
return(Y)
}

debug(PCA)

In general using print statements inside of functions you want to debug is a good way to diagnose problems.

Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download