T-T T-T - 5 days ago 6
R Question

R: Correlation calculation base on two conditions

I need to calculate the correlation for my data set based on two conditions. Below is an example:

df1 <- data.frame(
Main = c(0.0089, -0.050667, -0.030379, 0.066484, 0.006439, -0.026076),
B = c(NA, 0.0345, -0.0683, -0.052774, 0.014661, -0.040537),
C = c(0.0181, 0, -0.056197, 0.040794, 0.03516, -0.022662),
D = c(-0.0127, -0.025995, -0.04293, 0.057816, 0.033458, -0.058382)
)
df1
# Main B C D
# 1 0.008900 NA 0.018100 -0.012700
# 2 -0.050667 0.034500 0.000000 -0.025995
# 3 -0.030379 -0.068300 -0.056197 -0.042930
# 4 0.066484 -0.052774 0.040794 0.057816
# 5 0.006439 0.014661 0.035160 0.033458
# 6 -0.026076 -0.040537 -0.022662 -0.058382


I want to get the correlation between
Main
and each of the columns when 1)
Main<0
and 2) when 1) is met, columns
B
,
C
, or
D
do not equals
0
. In this example, for Main and B, row #2, 3 & 6 (df1) fit the rules; for Main and C, row #3 & 6 fit; for Main and D, row #2, 3 & 6 fit.

In Excel, this can be done by using two
if
functions. For example, to calculate the negative correlation between
Main
and
B
, I can use
{=CORREL(IF(A1:A6<0, A1:A6), IF(A1:A6<0, IF(B1:B6<>0, B1:B6)))}


My actual data set has close to 20 columns like this and will change from time to time. I'm pretty sure R can do it but I've been stuck with this problem for hours. Any advice will be appreciated.

Answer
a=as.matrix(df1)
ind=(a[,1]<0)
a[a==0]=NA
cor(a[ind,1],a[ind,-1],use="pairwise")
Comments