marianess marianess - 4 years ago 74
R Question

A self-written code for biplot in ggplot2

I would like to have my own script that plots loadings and scores of PCA.
The main problem is that the loadings and scores are not on the same scale (in my data), so I assume I need to rescale the loadings somehow in my code.
Below is my attempt at a PCA biplot of the iris data, but the code gives an error:


Error: Don't know how to add o to a plot


# mybiplot
# load data in
data <- (iris)
iris <- data[,1:4]
species <- data[,5]

# apply pca
pca <- prcomp(iris, center = TRUE,scale. = TRUE)

# extract scores and loadings
scores <- as.data.frame(pca$x)
loadings <- as.data.frame(pca$rotation)
label <- species

# make biplot
p = ggplot()+
geom_point(data = scores, aes(x=PC1, y=PC2, colour = factor(label)))+
geom_segment(data = loadings, aes(x=0,y=0,xend=PC1,yend=PC2),
arrow=arrow(length=unit(0.1,"cm")), color = "#DCDCDC")+
geom_text(data = loadings, aes(x=PC2, y=PC3, label=label),color="#006400")
p


I would like to get rid of this error (and understand why it happened and what is wrong with the code), and I would also like to know how to get scores and loadings into one biplot.
biplot(pca) obviously works, but I need self-written code that is more flexible. ggbiplot() and autoplot() did not work at all.

G5W G5W
Answer Source

The problem is with your geom_text layer:

geom_text(data = loadings, aes(x=PC2, y=PC3, label=label),color="#006400")

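Both loadings$PC2 and loadings$PC3 have length 4, but label has length 150, so ggplot2 cannot pair one label with each row of loadings. The label for this layer needs to be a length-4 vector, for example the variable names in rownames(loadings).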
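
A possible corrected version is sketched below. It is not the only way to do this. It labels the arrows with the variable names (one per row of loadings) and puts the text at the arrow tips on PC1/PC2, where the question's code used PC2/PC3. The column names variable, x_end and y_end and the arrow_scale factor are names made up for this example, and stretching the arrows to the extent of the scores is only one assumed convention for putting loadings and scores on a comparable scale.

```r
library(ggplot2)

# Work on copies so the built-in iris data frame is not overwritten
measurements <- iris[, 1:4]
species <- iris[, 5]

# PCA on centred, unit-variance variables
pca <- prcomp(measurements, center = TRUE, scale. = TRUE)

# Scores: one row per observation (150), with the matching species label
scores <- as.data.frame(pca$x)
scores$species <- species

# Loadings: one row per variable (4), with the matching variable name
loadings <- as.data.frame(pca$rotation)
loadings$variable <- rownames(loadings)

# Assumed scaling: stretch the arrows so the longest coordinate
# reaches the edge of the score cloud on PC1/PC2
arrow_scale <- max(abs(pca$x[, 1:2])) / max(abs(pca$rotation[, 1:2]))
loadings$x_end <- loadings$PC1 * arrow_scale
loadings$y_end <- loadings$PC2 * arrow_scale

p <- ggplot() +
  geom_point(data = scores, aes(x = PC1, y = PC2, colour = species)) +
  geom_segment(data = loadings,
               aes(x = 0, y = 0, xend = x_end, yend = y_end),
               arrow = grid::arrow(length = grid::unit(0.1, "cm")),
               colour = "grey40") +
  # label comes from the same 4-row data frame as the positions
  geom_text(data = loadings,
            aes(x = x_end, y = y_end, label = variable),
            colour = "#006400")
p
```

Each layer takes its label from the same data frame as its positions, so the lengths always agree: 150 species labels for the points, 4 variable names for the arrows.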
