T-T - 1 year ago 99
R Question

# R: Correlation calculation based on two conditions

I need to calculate the correlation for my data set based on two conditions. Below is an example:

``````df1 <- data.frame(
Main = c(0.0089, -0.050667, -0.030379, 0.066484, 0.006439, -0.026076),
B = c(NA, 0.0345, -0.0683, -0.052774, 0.014661, -0.040537),
C = c(0.0181, 0, -0.056197, 0.040794, 0.03516, -0.022662),
D = c(-0.0127, -0.025995, -0.04293, 0.057816, 0.033458, -0.058382)
)
df1
#    Main       B          C          D
# 1  0.008900   NA         0.018100  -0.012700
# 2 -0.050667   0.034500   0.000000  -0.025995
# 3 -0.030379  -0.068300  -0.056197  -0.042930
# 4  0.066484  -0.052774   0.040794   0.057816
# 5  0.006439   0.014661   0.035160   0.033458
# 6 -0.026076  -0.040537  -0.022662  -0.058382
``````

I want to get the correlation between
`Main`
and each of the columns when 1)
`Main<0`
and 2) when 1) is met, columns
`B`
,
`C`
, or
`D`
do not equals
`0`
. In this example, for Main and B, row #2, 3 & 6 (df1) fit the rules; for Main and C, row #3 & 6 fit; for Main and D, row #2, 3 & 6 fit.

In Excel, this can be done by using two
`if`
functions. For example, to calculate the negative correlation between
`Main`
and
`B`
, I can use
`{=CORREL(IF(A1:A6<0, A1:A6), IF(A1:A6<0, IF(B1:B6<>0, B1:B6)))}`

My actual data set has close to 20 columns like this and will change from time to time. I'm pretty sure R can do it but I've been stuck with this problem for hours. Any advice will be appreciated.

``````a=as.matrix(df1)