Lauren boomer - 1 month ago
R Question

How to correlate non-numeric vectors in R

I have two non-numeric vectors that are as follows:
Each element of aluOrientation is one of two strings: Complementary or Direct
Each element of aluFamily is one of three strings: AluS, AluY, or AluJ
Both vectors have length 36.

I want to correlate these two; that is, I want to know how strong the association is between, for example, Complementary and AluS. I know how to run a correlation test on numeric variables, but those methods (for example, Kendall's tau) do not work for non-numeric variables.

I tried converting my vectors to factors in R, but that doesn't seem to work. Does anyone know a function or a package in R that can correlate two non-numeric vectors?

Answer
aluOrientation <- rep(c("Complementary", "Direct"), 18)
aluFamily      <- rep(c("AluS", "AluY", "AluJ"), 12)
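
Before modelling anything, it can help to cross-tabulate the two vectors to see how often each orientation occurs with each family. This is only a sketch using base R's table() and prop.table(); the name tab is an illustrative choice, not something from the question.

```r
# Example data, repeated so this snippet runs on its own
aluOrientation <- rep(c("Complementary", "Direct"), 18)
aluFamily      <- rep(c("AluS", "AluY", "AluJ"), 12)

# Counts for every orientation/family combination
# ('tab' is a hypothetical name used for illustration)
tab <- table(aluOrientation, aluFamily)
print(tab)

# Row proportions: share of each family within each orientation
print(prop.table(tab, margin = 1))
```

For these example vectors every combination occurs 6 times, so the table is perfectly balanced.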

One way to do this is factor analysis:

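df1 <- data.frame(aluOrientation, aluFamily)

library(psych)
# Expand both variables into 0/1 indicator columns (no intercept), then run a
# one-factor principal-axis analysis on the correlation matrix of those columns.
fa(r = cor(model.matrix(~ aluFamily + aluOrientation - 1, data = df1)),
   rotate = "none", fm = "pa")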
Standardized loadings (pattern matrix) based upon correlation matrix
                       PA1      h2    u2 com
aluFamilyAluJ         1.73 3.0e+00 -1.99   1
aluFamilyAluS        -0.24 5.9e-02  0.94   1
aluFamilyAluY        -0.24 5.9e-02  0.94   1
aluOrientationDirect  0.00 1.0e-30  1.00   1

                PA1
SS loadings    3.11
Proportion Var 0.78

Mean item complexity =  1
Test of the hypothesis that 1 factor is sufficient.

The degrees of freedom for the null model are  6  and the objective function was  5.5
The degrees of freedom for the model are 2  and the objective function was  NaN 

The root mean square of the residuals (RMSR) is  0.23 
The df corrected root mean square of the residuals is  0.4 

Fit based upon off diagonal values = 0.57

For more details:

http://www.ats.ucla.edu/stat/r/whatstat/whatstat.htm#factor
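
Another commonly used approach for two categorical vectors, offered here as a hedged sketch rather than as part of the answer above, is a chi-squared test of independence on their contingency table, with Cramér's V as a 0-to-1 measure of association strength. The code uses only base R (table, chisq.test, fisher.test); the names tab, test, cramers_v and fisher are illustrative assumptions.

```r
# Example data, repeated so this snippet runs on its own
aluOrientation <- rep(c("Complementary", "Direct"), 18)
aluFamily      <- rep(c("AluS", "AluY", "AluJ"), 12)

# 2 x 3 contingency table of counts
tab <- table(aluOrientation, aluFamily)

# Pearson chi-squared test of independence between the two variables
test <- chisq.test(tab)
print(test)

# Cramer's V = sqrt(chi^2 / (n * (min(rows, cols) - 1)))
# 0 means no association, 1 means perfect association
n <- sum(tab)
k <- min(dim(tab)) - 1
cramers_v <- sqrt(unname(test$statistic) / (n * k))
print(cramers_v)

# With only 36 observations, expected counts in real data may be small;
# Fisher's exact test is one alternative in that case
fisher <- fisher.test(tab)
print(fisher$p.value)
```

With the balanced example vectors the chi-squared statistic and Cramér's V are both 0, which reflects how the toy data were built and not the real aluOrientation and aluFamily data.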