TonyGW - 6 months ago 74

R Question

I'm using the `BreastCancer` dataset from the `mlbench` package.

I took the features in the first 10 columns and created a vector of parameters called `theta`:

`X <- BreastCancer[,1:10]`

`theta <- data.frame(rep(1,10))`

Then I did the following matrix multiplication:

`constant <- as.matrix(X) %*% as.vector(theta[,1])`

However, I got the following error:

`Error in as.matrix(X) %*% as.vector(theta[, 1]) :`

`requires numeric/complex matrix/vector arguments`

Do I need to cast the matrix to double using `as.numeric(X)`?

@Zheyuan Li:

My question is different from the one you are referring to, as that one does not involve the error I am getting:

`numeric/complex matrix/vector arguments`

Answer

No, I can't stand it... After quite a long-winded discussion and something of an argument under your question, I saw no better option than to reopen it and answer it.

```
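## load the data
library(mlbench)
data(BreastCancer)
## drop incomplete data with NA
dat <- na.omit(BreastCancer)
## convert variables other than `Id` and `Class` from factor to numeric,
## going through the level labels rather than the integer codes
dat[2:10] <- lapply(dat[2:10], function (x) as.numeric(levels(x))[x])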
## get the matrix
X <- data.matrix(dat[2:10])
## some possible matrix-vector multiplications
beta <- runif(9)
yhat <- X %*% beta
## add prediction back to data frame
dat$prediction <- yhat
```
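As a rough sketch of why the conversion step matters, the snippet below reproduces the failure on a small toy data frame rather than on `BreastCancer` itself. The data frame `toy` and its column names and values are made up for illustration; the only assumption carried over is that the columns are stored as character or factor, as they are in the original dataset.

```r
## Toy stand-in for BreastCancer (hypothetical names and values):
## a character ID column plus two factor columns
toy <- data.frame(Id    = c("a1", "a2", "a3"),
                  Thick = factor(c("5", "10", "3")),
                  Size  = factor(c("1", "4", "10")),
                  stringsAsFactors = FALSE)

## as.matrix() on non-numeric columns yields a character matrix ...
m_chr <- as.matrix(toy)
is.character(m_chr)   # TRUE

## ... which %*% rejects; capture the error message instead of stopping
res <- tryCatch(m_chr %*% rep(1, 3),
                error = function(e) conditionMessage(e))

## convert the factor columns through their level labels, then multiply
toy[2:3] <- lapply(toy[2:3], function(x) as.numeric(levels(x))[x])
X <- data.matrix(toy[2:3])
theta <- rep(1, ncol(X))
X %*% theta
```

Note that the ID column is left out of `X`, just as `Id` and `Class` are left out in the code above.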

There are several things I don't understand, though. Why don't you use `predict` if you have a regression model? You gave an explanation, but I don't get it at all. Anyway, the above should be comprehensive: if you want a data frame, there it is; if you want to use matrix-vector multiplication on legitimate numeric columns, go ahead; if you want to put the prediction back into the data frame, that is done too.
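For what it's worth, here is a minimal sketch of how `predict` relates to the manual product, assuming an ordinary linear model fitted with `lm`. It uses the built-in `mtcars` data and an arbitrary formula purely for illustration; none of this comes from the question, and a different model class (for example a `glm` with a non-identity link) would not match a plain matrix product in the same way.

```r
## Illustrative linear model on a built-in dataset (not the asker's model)
fit <- lm(mpg ~ wt + hp, data = mtcars)

## prediction via predict()
by_predict <- predict(fit, newdata = mtcars)

## the same prediction by hand: design matrix with an intercept column
## times the fitted coefficients
X <- cbind(1, data.matrix(mtcars[c("wt", "hp")]))
by_hand <- drop(X %*% coef(fit))

all.equal(unname(by_predict), unname(by_hand))
```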

This line also worked for me:

`as.matrix(sapply(dat, as.numeric))`

Looks like you were lucky. In this dataset the factor levels happen to coincide with the integer codes that `as.numeric` returns for a factor. In general, converting a factor to numeric should go through the levels, as I did above. Compare:

```
f <- gl(4, 2, labels = c(12.3, 0.5, 2.9, -11.1))
f
#[1] 12.3 12.3 0.5 0.5 2.9 2.9 -11.1 -11.1
#Levels: 12.3 0.5 2.9 -11.1
as.numeric(f)
#[1] 1 1 2 2 3 3 4 4
as.numeric(levels(f))[f]
#[1] 12.3 12.3 0.5 0.5 2.9 2.9 -11.1 -11.1
```

Please read `?factor` thoroughly.
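The same difference shows up when a whole data frame is converted at once. The sketch below uses a made-up data frame `df` and a hypothetical helper `factor_to_numeric` (neither is from the thread) to compare `sapply(df, as.numeric)` with a conversion through the level labels.

```r
## Hypothetical helper: turn a factor into the numbers its labels spell out
factor_to_numeric <- function(f) as.numeric(levels(f))[f]

## Made-up data whose level labels differ from the integer codes
df <- data.frame(a = factor(c("10", "20", "10")),
                 b = factor(c("0.5", "2.5", "1.5")))

codes  <- sapply(df, as.numeric)         # integer codes of each factor
values <- sapply(df, factor_to_numeric)  # the labels as numbers

## as.numeric(as.character(f)) gives the same values; ?factor notes that
## the levels-based form is slightly more efficient
alt <- sapply(df, function(f) as.numeric(as.character(f)))
```

Here `codes` holds 1, 2, 1 and 1, 3, 2, while `values` and `alt` hold the original numbers, so `sapply(dat, as.numeric)` is only safe when labels and codes happen to agree.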