wowdavers - 4 months ago 18

R Question

I'm in a situation where I need to find the correlation between two variables, `x, y`, in a `dataframe`, i.e. `cor(dataframe$x,dataframe$y)`.

I'm wondering how I can compare values of x and their corresponding values of y for two separate groups (0's and 1's). I'm new to R, so I guess I'm wondering if there's built-in functionality in `cor()` for restricting the `x's` and `y's` to one group at a time.

I guess that also leads to another question (which I've googled, but it's not very clear-cut to me yet): what's the difference between using a vector, an array and a data frame in R with these functions (i.e. `cor()`, `t.test()`)?

Answer

You could compute the correlation on the subset of rows specified by the indicator column. To select a subset, use `dataframe[logical_index,]`, where `logical_index` is a vector of booleans (called logical in R). To do this, first convert the indicators to booleans.

```
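# Assumes a numeric 0/1 column named `indicator`: 0 becomes FALSE, 1 becomes TRUE
logical_index <- as.logical(dataframe$indicator)
# Correlation within the 1's, then within the 0's
cor(dataframe[logical_index,]$x, dataframe[logical_index,]$y)
cor(dataframe[!logical_index,]$x, dataframe[!logical_index,]$y)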
```
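For something you can paste and run, here is a minimal sketch on made-up data. The data frame `df`, its values and the column name `indicator` are assumptions for illustration only, not taken from the question. It shows the logical-index approach next to a `split()`-based variant that handles any number of groups in one go.

```r
# Toy data (hypothetical): two numeric columns and a 0/1 group indicator
df <- data.frame(
  x = c(1, 2, 3, 4, 5, 6, 7, 8),
  y = c(2, 4, 6, 8, 8, 6, 4, 2),
  indicator = c(1, 1, 1, 1, 0, 0, 0, 0)
)

# Logical-index approach: TRUE for rows in group 1, FALSE for rows in group 0
in_group <- df$indicator == 1
cor_group1 <- cor(df$x[in_group], df$y[in_group])
cor_group0 <- cor(df$x[!in_group], df$y[!in_group])

# Alternative: split the rows by group, then compute one correlation per piece.
# The result is a named vector with one entry per group value.
cor_by_group <- sapply(split(df, df$indicator), function(g) cor(g$x, g$y))
```

In this toy data `y` rises with `x` in group 1 and falls with `x` in group 0, so the two correlations have opposite signs. The `split()` version is handy if you later end up with more than two groups, since you don't have to write one `cor()` call per group.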

Vectors, matrices, arrays, lists and data frames are all different basic data structures in R. A clear and relatively easy introduction to the differences is given by Hadley Wickham in Advanced R: http://adv-r.had.co.nz/Data-structures.html
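As a rough sketch of what that means for `cor()` and `t.test()` specifically (toy data again, all names below are made up for illustration): `cor()` takes either two numeric vectors or a whole matrix or data frame of numeric columns, while `t.test()` takes numeric vectors or a formula together with a data frame.

```r
# Toy data (hypothetical)
df <- data.frame(
  x = c(1.2, 2.3, 3.1, 4.8, 5.0, 6.7),
  y = c(2.1, 2.9, 4.2, 4.9, 6.3, 6.8),
  indicator = c(0, 0, 0, 1, 1, 1)
)

# A single column pulled out with $ is a plain numeric vector
x_is_vector <- is.vector(df$x)

# Two vectors in, a single correlation coefficient out
r <- cor(df$x, df$y)

# A data frame (or matrix) of numeric columns in, a correlation matrix out
r_mat <- cor(df[, c("x", "y")])

# as.matrix() turns the numeric columns into a 2-D structure of a single type
m <- as.matrix(df[, c("x", "y")])

# t.test() accepts two numeric vectors ...
tt_vec <- t.test(df$x[df$indicator == 1], df$x[df$indicator == 0])

# ... or a formula plus a data frame, which does the grouping for you
tt_formula <- t.test(x ~ indicator, data = df)
```

The off-diagonal entry `r_mat["x", "y"]` is the same number as `r`. The two `t.test()` calls compare the same two groups, just in opposite order, so the sign of the test statistic flips while the p-value stays the same.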