user4178184 - 1 month ago
R Question

R pca ggbiplot error : replacement has 36 rows, data has 35

I'm new to R. I am trying to run a PCA and display the result with ggbiplot, but I am stuck on an error I cannot solve. The problem may be in my data, since the same code works fine with other data.
The code and the data files are at the following link, in case you would like to reproduce the problem:

https://drive.google.com/drive/folders/0B2jQ7Vh3S3PaZkt3Y2ZyaV9XaXc

the code : pca-plot.R
data file 1 : dat1.rda (this one works fine)
data file 2 : dat2.rda (this one has the problem)

Appreciate any help.
The error I got is at the bottom.

Thank you,
--we

> g <- ggbiplot(tr.pca, obs.scale = 1, var.scale = 1,
+ groups = Ydfall,
+ ellipse = TRUE,
+ circle = TRUE)
Error in `$<-.data.frame`(`*tmp*`, "groups", value = c(1L, 1L, 1L, 1L, :
replacement has 36 rows, data has 35
> g <- g + scale_color_discrete(name = '')
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
> g <- g + theme(legend.direction = 'horizontal',
+ legend.position = 'top')
> g<- g+ geom_point(size=1, shape=1, color="black", stroke=2)
>
> print(g)
>

Answer

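Your dfall from dat2.rda contains NA values (try which(is.na(dfall), arr.ind = TRUE)), and that causes the problem. You used na.omit() when you called prcomp(), so the PCA was fitted on 35 rows, but you didn't use it when you made Ydfall, so the groups vector still has 36 entries.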
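As a rough illustration of how to locate the missing values, here is a minimal sketch on a made-up data frame. The small dfall below and its columns group, x1 and x2 are hypothetical stand-ins, not the contents of dat2.rda.

```r
# Hypothetical stand-in for dfall: a label column plus numeric columns.
dfall <- data.frame(
  group = factor(c("a", "a", "b", "b")),
  x1 = c(1.0, NA, 3.0, 4.0),
  x2 = c(2.0, 5.0, 6.0, 7.0)
)

# Row and column index of every NA cell.
na_pos <- which(is.na(dfall), arr.ind = TRUE)
print(na_pos)

# Number of NA values per column.
na_per_col <- colSums(is.na(dfall))
print(na_per_col)

# Number of rows that na.omit() would drop.
n_incomplete <- sum(!complete.cases(dfall))
print(n_incomplete)
```

If n_incomplete is greater than zero, anything built from na.omit(dfall) will have fewer rows than anything built from dfall itself.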

Ydfall <- na.omit(dfall)[,1]   # quick fix

# but if I were you, I would have done this first, before creating tr.pca and Ydfall:
dfall <- na.omit(dfall)
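To make the row-count mismatch concrete, the sketch below reproduces it with base R only and then applies the "clean first" approach. The simulated dfall (36 rows, one NA) and the column names group, x1, x2 and x3 are assumptions for illustration. The real pca-plot.R is not shown here, so the prcomp() arguments are a guess as well.

```r
# Hypothetical data: first column is the group label, the rest are numeric.
set.seed(1)
dfall <- data.frame(
  group = factor(rep(c("a", "b", "c"), each = 12)),
  x1 = rnorm(36),
  x2 = rnorm(36),
  x3 = rnorm(36)
)
dfall$x2[5] <- NA  # one incomplete row

# Mismatch: PCA on the cleaned rows, labels taken from the uncleaned data.
pca_bad <- prcomp(na.omit(dfall)[, -1], scale. = TRUE)
labels_bad <- dfall[, 1]
print(c(pca_rows = nrow(pca_bad$x), labels = length(labels_bad)))

# Fix: drop incomplete rows once, then derive both objects from the result.
dfall_clean <- na.omit(dfall)
tr.pca <- prcomp(dfall_clean[, -1], scale. = TRUE)
Ydfall <- dfall_clean[, 1]

# The scores and the labels now line up, so groups = Ydfall is safe to pass.
stopifnot(nrow(tr.pca$x) == length(Ydfall))
```

Cleaning the data frame once, before anything is derived from it, means the scores and the labels cannot drift apart later in the script.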