Jakub Bochenski - 4 months ago 22

R Question

I have a set of observations with 23 variables.

When I use prcomp and biplot to plot the results I run into several problems:

- the actual plot only occupies half of the frame (x < 0), but the plot is centered on 0, so half of the space is wasted
- two variables clearly dominate the results, so all other arrows are clumped together and I can't read a thing

ad 1. I tried setting xlim and/or ylim, but I'm obviously doing something wrong since the plot is all messed up when I do

ad 2. Can I just somehow make the arrow labels placed more apart so that I can read them? Or maybe I could just plot the arrows without the two longest ones (kind of zoom-in)?

Addendum: is it possible to have biplot draw the labels in a different color than the arrows?

Also: is it problematic if the x and y axes are not proportional (the graph shows intervals of different length on x and y)?

I think this would skew the angles between arrows, since that kind of resizing is not a similarity transformation.

Is it possible to force biplot to keep a 1:1 aspect ratio, or to draw the plot as a rectangle and not a square?

Answer

I think you can use `xlim` and `ylim`. Also, have a look at the `expand` argument in `?biplot`. Unfortunately, you did not provide any data, so let's take some sample data:

```
a <- princomp(USArrests)
```

Below is the result of just calling `biplot`:

```
biplot(a)
```

And now one can "zoom in" to have a closer look at "Murder" and "Rape" using `xlim` and `ylim`, and also use the scaling argument `expand` from `?biplot`:

```
biplot(a, expand=10, xlim=c(-0.30, 0.0), ylim=c(-0.1, 0.1))
```

Please note the different scaling on the top and right axes due to the `expand` factor.
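The other option raised in the question, plotting without the two longest arrows, can be sketched by calling `biplot` on two matrices instead of on the PCA object. This is only a sketch: it uses `prcomp` on `USArrests` as stand-in data, and it rescales scores and loadings the way `biplot.prcomp` does with its default `scale = 1`, which is an assumption about what you want to reproduce. The names `lam`, `loads`, `drop_vars` and `kept` are made up for the illustration.

```r
# Sketch: redraw the biplot without the two longest arrows.
p <- prcomp(USArrests)

# Rescale scores and loadings as biplot.prcomp does for scale = 1
n      <- nrow(p$x)
lam    <- p$sdev[1:2] * sqrt(n)
scores <- t(t(p$x[, 1:2]) / lam)
loads  <- t(t(p$rotation[, 1:2]) * lam)

# Rank variables by arrow length in the PC1/PC2 plane and drop the top two
arrow_len <- sqrt(rowSums(loads^2))
drop_vars <- names(sort(arrow_len, decreasing = TRUE))[1:2]
kept      <- loads[!rownames(loads) %in% drop_vars, , drop = FALSE]

# A two-matrix call dispatches to biplot.default
biplot(scores, kept)
```

The remaining arrows are rescaled to fill the frame, so their lengths are no longer comparable with the full plot.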

Does this help to make your plot more readable?

**EDIT**

You also asked whether it is possible to have different colors for labels and arrows. `biplot` does not support this: its `col` argument takes one color per set (observations and variables), so each arrow shares the color of its label. What you could do is copy the code of `stats:::biplot.default` and change it according to your needs (change the `col` argument where `plot`, `axis` and `text` are used).
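If editing `biplot.default` feels too heavy, a rough hand-made biplot in base graphics gives the same control. The sketch below is not what `biplot` does internally: the 0.7 rescaling factor and the color choices are arbitrary assumptions, and `asp = 1` is used to force the 1:1 aspect ratio asked about in the question.

```r
# Sketch: hand-made biplot with separate arrow and label colors.
p      <- prcomp(USArrests, scale. = TRUE)
scores <- p$x[, 1:2]
loads  <- p$rotation[, 1:2]

# Shrink/stretch the loadings so the arrows fit inside the score cloud
# (0.7 is an arbitrary illustrative factor)
mult      <- 0.7 * min(apply(abs(scores), 2, max) / apply(abs(loads), 2, max))
arrows_xy <- loads * mult

# asp = 1 keeps one unit on x equal to one unit on y
plot(scores, type = "n", asp = 1, xlab = "PC1", ylab = "PC2")
text(scores, labels = rownames(scores), cex = 0.6, col = "grey40")

# Arrows in one color, their labels in another
arrows(0, 0, arrows_xy[, 1], arrows_xy[, 2], length = 0.1, col = "red")
text(arrows_xy * 1.1, labels = rownames(arrows_xy), col = "blue")
```

With `asp = 1` the angles between arrows are drawn undistorted, at the cost of some empty space in the frame.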

Alternatively, you could use `ggplot` for the biplot. In the post here, a simple biplot function is implemented. You could change the code as follows:

```
library(ggplot2)
library(grid)  # arrow() and unit()

PCbiplot <- function(PC, x="PC1", y="PC2", colors=c('black', 'black', 'red', 'red')) {
  # PC is a prcomp object.
  # colors: observation labels, axis lines, variable labels, arrows (in that order)
  data <- data.frame(obsnames=row.names(PC$x), PC$x)
  plot <- ggplot(data, aes(x=.data[[x]], y=.data[[y]])) +
    geom_text(alpha=.4, size=3, aes(label=obsnames), color=colors[1])
  # Reference lines through the origin
  plot <- plot + geom_hline(yintercept=0, linewidth=.2, color=colors[2]) +
    geom_vline(xintercept=0, linewidth=.2, color=colors[2])
  datapc <- data.frame(varnames=rownames(PC$rotation), PC$rotation)
  # Ratio of score range to loading range, so the arrows fit the score cloud
  mult <- min(
    (max(data[,y]) - min(data[,y])) / (max(datapc[,y]) - min(datapc[,y])),
    (max(data[,x]) - min(data[,x])) / (max(datapc[,x]) - min(datapc[,x]))
  )
  datapc$v1 <- .7 * mult * datapc[[x]]
  datapc$v2 <- .7 * mult * datapc[[y]]
  # coord_equal() keeps a 1:1 aspect ratio
  plot <- plot + coord_equal() + labs(x=x, y=y) +
    geom_text(data=datapc, aes(x=v1, y=v2, label=varnames), size=5, vjust=1, color=colors[3])
  plot <- plot + geom_segment(data=datapc, aes(x=0, y=0, xend=v1, yend=v2),
    arrow=arrow(length=unit(0.2,"cm")), alpha=0.75, color=colors[4])
  plot
}
```

Plot as follows:

```
fit <- prcomp(USArrests, scale. = TRUE)
PCbiplot(fit, colors=c("black", "black", "red", "yellow"))
```

If you play around a bit with this function, I am sure you can figure out how to set `xlim` and `ylim` values, etc.
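One way to set the limits, sketched here on a stripped-down score plot rather than on the full function, is to pass `xlim` and `ylim` to `coord_equal()`. The limit values are arbitrary and only meant for the `USArrests` example. Since `PCbiplot` returns an ordinary ggplot object, the same `+ coord_equal(...)` should work on its result too (ggplot2 prints a message that the existing coordinate system is being replaced).

```r
# Sketch: zoom a ggplot2 score plot while keeping the 1:1 aspect ratio.
library(ggplot2)

fit    <- prcomp(USArrests, scale. = TRUE)
scores <- data.frame(obsnames = rownames(fit$x), fit$x)

p <- ggplot(scores, aes(x = PC1, y = PC2, label = obsnames)) +
  geom_text(size = 3)

# Limits set on the coordinate system zoom the view without dropping points
zoomed <- p + coord_equal(xlim = c(-3, 0), ylim = c(-1.5, 1.5))
print(zoomed)
```

Setting the limits on the coordinate system, rather than with `xlim()` and `ylim()` added as scales, keeps points outside the window in the data, so anything computed from the data is unaffected by the zoom.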