Codex - 7 months ago 63

R Question

I've been trying for several hours to calculate the entropy, and I know I'm missing something. Hopefully someone here can give me an idea!

EDIT: I think my formula is wrong!

**CODE:**

```
info <- function(CLASS.FREQ){
  freq.class <- CLASS.FREQ
  info <- 0
  for(i in 1:length(freq.class)){
    if(freq.class[[i]] != 0){ # zero check in class
      entropy <- -sum(freq.class[[i]] * log2(freq.class[[i]])) # I calculate the entropy for each class i here
    }else{
      entropy <- 0
    }
    info <- info + entropy # sum up entropy from all classes
  }
  return(info)
}
```

I hope my post is clear, since this is the first time I've posted here.

```
buys <- c("no", "no", "yes", "yes", "yes", "no", "yes", "no", "yes", "yes", "yes", "yes", "yes", "no")
credit <- c("fair", "excellent", "fair", "fair", "fair", "excellent", "excellent", "fair", "fair", "fair", "excellent", "excellent", "fair", "excellent")
student <- c("no", "no", "no","no", "yes", "yes", "yes", "no", "yes", "yes", "yes", "no", "yes", "no")
income <- c("high", "high", "high", "medium", "low", "low", "low", "medium", "low", "medium", "medium", "medium", "high", "medium")
age <- c(25, 27, 35, 41, 48, 42, 36, 29, 26, 45, 23, 33, 37, 44) # we change the age from categorical to numeric
```

Answer

I find no error in your code; it runs as written. What I think you are missing is the calculation of the class frequencies: pass `info` the relative frequency of each class rather than the raw labels, and you will get your answer. Looking through the objects you provide, I suspect the class you are interested in is `buys`:

```
buys <- c("no", "no", "yes", "yes", "yes", "no", "yes", "no", "yes", "yes", "yes", "yes", "yes", "no")
freqs <- table(buys)/length(buys)
info(freqs)
[1] 0.940286
```
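As a sanity check, the same number can be reproduced by hand. In `buys` there are 9 "yes" and 5 "no" labels out of 14, so the two class probabilities are 9/14 and 5/14. The variable names below (`p_yes`, `p_no`, `h`) are only illustrative and not part of the original code:

```r
# Class probabilities counted by hand from `buys`: 9 "yes", 5 "no", 14 total
p_yes <- 9 / 14
p_no  <- 5 / 14

# Shannon entropy in bits: -sum over classes of p * log2(p)
h <- -(p_yes * log2(p_yes) + p_no * log2(p_no))
round(h, 6)
# [1] 0.940286
```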

To improve your code, you can simplify it considerably: given a vector of class frequencies, you don't need a loop, because `*` and `log2` already operate element-wise on the whole vector.

For example:

```
# calculate shannon-entropy
-sum(freqs * log2(freqs))
[1] 0.940286
```
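One caveat with the one-liner: it has no zero check, and in R `0 * log2(0)` evaluates to `NaN`, so an empty class (for example an unused factor level) would make the whole sum `NaN`. A minimal sketch of a loop-free version that keeps the zero check could look like the following; the helper name `shannon_entropy` is hypothetical and not from the original post:

```r
# Hypothetical helper (not from the original post): Shannon entropy in bits
# of a vector of class labels, without an explicit loop.
shannon_entropy <- function(labels) {
  counts <- as.numeric(table(labels))  # class counts (unused factor levels give 0)
  p <- counts / sum(counts)            # relative class frequencies
  p <- p[p > 0]                        # drop empty classes: 0 * log2(0) is NaN in R
  -sum(p * log2(p))
}

buys <- c("no", "no", "yes", "yes", "yes", "no", "yes", "no",
          "yes", "yes", "yes", "yes", "yes", "no")
shannon_entropy(buys)
```

For `buys` this should agree with the `info(freqs)` result above, since dropping zero-probability classes is equivalent to the `else` branch that adds 0 in the loop.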

As a side note, the `entropy` package provides the function `entropy.empirical`, whose `unit` argument lets you set the units to log2, allowing some more flexibility. Example:

```
entropy.empirical(freqs, unit="log2")
[1] 0.940286
```