Jamie Leigh - 1 year ago 94

R Question

I am working on a homework assignment, and am not sure I understand the question. We are using the built-in `iris` dataset, and the question asks us to relate the distances computed using `dist()` to `(1-correlation)`.

`data <- iris[1:4]`

`scaled <- scale(data)`

I tried using `dist()`:

`dist(scaled)`

This prints out a massive output that I am not entirely sure what to do with. I don't know how else to approach this. I don't even know what it means when it asks for the value of the proportional factor. I am pretty sure that the correlations it wants me to compare against are

`cor(data)`

# Sepal.Length Sepal.Width Petal.Length Petal.Width

#Sepal.Length 1.0000000 -0.1175698 0.8717538 0.8179411

#Sepal.Width -0.1175698 1.0000000 -0.4284401 -0.3661259

#Petal.Length 0.8717538 -0.4284401 1.0000000 0.9628654

#Petal.Width 0.8179411 -0.3661259 0.9628654 1.0000000

But how do I compare the massive output from `dist()` with this?

I am just hoping someone can help explain the question, and point me in the correct direction.


Answer Source

> This prints out a massive output that I am not entirely sure what to do with.

> I am just hoping someone can help explain the question, and point me in the correct direction.

You want `dist(t(scaled))`, because `dist` computes distances between rows, not columns. Consider your scaled dataset:

```
x <- scale(data.matrix(iris[1:4]))
```
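To see why the transpose matters, here is a quick check using only base R. It is a sketch that compares the size of the two results; the names `d_rows` and `d_cols` are just illustrative:

```r
x <- scale(data.matrix(iris[1:4]))

# dist() treats each ROW of its input as one object
d_rows <- dist(x)      # distances between the 150 flowers
d_cols <- dist(t(x))   # distances between the 4 measurement columns

attr(d_rows, "Size")   # 150 objects compared
length(d_rows)         # 150 * 149 / 2 = 11175 pairwise distances
attr(d_cols, "Size")   # 4 objects compared
length(d_cols)         # 4 * 3 / 2 = 6 pairwise distances
```

The 11175 values are the "massive output" from `dist(scaled)`; after transposing, only 6 values remain, one per pair of columns.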

The squared Euclidean distance matrix between columns is

```
## I have used `c()` outside to coerce it into a plain vector
d <- c(dist(t(x)) ^ 2)
# [1] 333.03580 38.21737 54.25354 425.67515 407.10553 11.06610
```

The lower triangle of the correlation matrix is (we want the lower triangle because `dist` stores only the lower triangular part of the distance matrix):

```
cx <- cor(x)[lower.tri(diag(4))]
# [1] -0.1175698 0.8717538 0.8179411 -0.4284401 -0.3661259 0.9628654
```
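If you want to convince yourself that the two vectors line up pair by pair, a small sketch like the following may help. It expands the `dist` object into a full matrix and pulls out its lower triangle with the same `lower.tri()` indexing; the variable names are illustrative only:

```r
x <- scale(data.matrix(iris[1:4]))

dm <- as.matrix(dist(t(x)))          # full symmetric 4 x 4 distance matrix
from_matrix <- dm[lower.tri(dm)]     # lower triangle, column by column
from_dist   <- as.vector(dist(t(x))) # values as stored in the dist object

# Same values in the same order
all.equal(from_matrix, from_dist)

# Label each entry with the pair of columns it belongs to
idx <- which(lower.tri(dm), arr.ind = TRUE)
pairs <- data.frame(a = rownames(dm)[idx[, "row"]],
                    b = colnames(dm)[idx[, "col"]],
                    distance = from_matrix)
pairs
```

Because both vectors follow the same column-by-column ordering, element-wise arithmetic between distances and correlations compares matching pairs.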

We then just do what your question asks to compare:

```
d / (1 - cx)
# [1] 298 298 298 298 298 298
```
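The printed values are rounded, so it can be worth checking that the ratio really is constant up to floating-point error. The sketch below does that, and repeats the computation on `mtcars` (another built-in dataset, chosen here only as an illustration) to show that the constant depends on the data:

```r
# iris: ratio of squared column distances to (1 - correlation)
x <- scale(data.matrix(iris[1:4]))
ratio <- as.vector(dist(t(x))^2) / (1 - cor(x)[lower.tri(cor(x))])
all.equal(ratio, rep(ratio[1], length(ratio)))   # TRUE

# Same computation on a dataset with a different number of rows
y <- scale(data.matrix(mtcars))
ratio_y <- as.vector(dist(t(y))^2) / (1 - cor(y)[lower.tri(cor(y))])
all.equal(ratio_y, rep(ratio_y[1], length(ratio_y)))
range(ratio_y)   # a single constant again, but not the same one as for iris
```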

Since the `iris` dataset has 150 rows, you should realize that `298 = 2 * (150 - 1)`.
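One way to see where the factor `2 * (150 - 1)` comes from is sketched below. It assumes the default behaviour of `scale()`, which centers each column and divides by the sample standard deviation (denominator `n - 1`). The symbols $x_i$, $x_j$ (two scaled columns), $n$ (number of rows) and $r_{ij}$ (their correlation) are notation introduced here for illustration:

```latex
\sum_{k=1}^{n} x_{ki} = 0, \qquad
\sum_{k=1}^{n} x_{ki}^2 = n - 1, \qquad
\sum_{k=1}^{n} x_{ki}\, x_{kj} = (n - 1)\, r_{ij}

\lVert x_i - x_j \rVert^2
  = \sum_{k=1}^{n} x_{ki}^2 + \sum_{k=1}^{n} x_{kj}^2
    - 2 \sum_{k=1}^{n} x_{ki}\, x_{kj}
  = 2 (n - 1) - 2 (n - 1)\, r_{ij}
  = 2 (n - 1) (1 - r_{ij})
```

With $n = 150$ this gives $298\,(1 - r_{ij})$, matching the ratio computed above.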

**Update**

I had no intention to post theoretical justification here. But the down vote irritates me and I am going to do it now.
