T-T - 9 months ago 57

R Question

I need to calculate the correlation for my data set based on two conditions. Below is an example:

`df1 <- data.frame(`

Main = c(0.0089, -0.050667, -0.030379, 0.066484, 0.006439, -0.026076),

B = c(NA, 0.0345, -0.0683, -0.052774, 0.014661, -0.040537),

C = c(0.0181, 0, -0.056197, 0.040794, 0.03516, -0.022662),

D = c(-0.0127, -0.025995, -0.04293, 0.057816, 0.033458, -0.058382)

)

df1

# Main B C D

# 1 0.008900 NA 0.018100 -0.012700

# 2 -0.050667 0.034500 0.000000 -0.025995

# 3 -0.030379 -0.068300 -0.056197 -0.042930

# 4 0.066484 -0.052774 0.040794 0.057816

# 5 0.006439 0.014661 0.035160 0.033458

# 6 -0.026076 -0.040537 -0.022662 -0.058382

I want to get the correlation between

`Main`

`Main<0`

`B`

`C`

`D`

`0`

In Excel, this can be done by using two

`if`

`Main`

`B`

`{=CORREL(IF(A1:A6<0, A1:A6), IF(A1:A6<0, IF(B1:B6<>0, B1:B6)))}`

My actual data set has close to 20 columns like this and will change from time to time. I'm pretty sure R can do it but I've been stuck with this problem for hours. Any advice will be appreciated.

Answer Source

```
a=as.matrix(df1)
ind=(a[,1]<0)
a[a==0]=NA
cor(a[ind,1],a[ind,-1],use="pairwise")
```