Shaun Overton - 3 months ago 31

R Question

I have a problem controlling the object types feeding into the predict function. Here's my simplified function that generates the `glm`:

```
fitOneSample <- function(x,data,sampleSet)
{
  #how big of a set are we going to analyze? Pick a number between 5,000 & 30,000, then select that many rows to study
  sampleIndices <- 1:5000

  #now randomly pick which columns to study
  colIndices <- 1:10
  xnames <- paste(names(data[,colIndices]),sep = "")
  formula <- as.formula(paste("target ~ ", paste(xnames,collapse = "+")))
  glm(formula,family=binomial(link=logit),data[sampleIndices,])
}

myFit <- fitOneSample(1,data,sampleSet)
fits <- sapply(1:2,fitOneSample,data,sampleSet)
all.equal(myFit,fits[,1]) #different object types

#this works
probability <- predict(myFit,newdata = data)

#this doesn't
probability2 <- predict(fits[,1],newdata = data)
# Error in UseMethod("predict") :
# no applicable method for 'predict' applied to an object of class "list"
```

How do I access the column in `fits[,1]` so that it behaves like `myFit`?

Answer

I think I can now reproduce your situation.

```
fits <- sapply(names(trees),
function (y) do.call(lm, list(formula = paste0(y, " ~ ."), data = trees)))
```

This uses the built-in dataset `trees` as an example, fitting three linear models:

```
Girth ~ Height + Volume
Height ~ Girth + Volume
Volume ~ Height + Girth
```

Since we used `sapply`, and each iteration returns an `lm` object, which is a length-12 list, the results are simplified to a `12 * 3` matrix:

```
class(fits)
# "matrix"   (in R >= 4.0.0 this prints "matrix" "array")
dim(fits)
# 12 3
```

Matrix indexing `fits[, 1]` is valid.

If you check `str(fits[, 1])`, it almost looks like a normal `lm` object. But if you check further:

```
class(fits[, 1])
# "list"
```

**Hmm? It does not have the "lm" class!** As a result, S3 method dispatch fails when you call the generic function `predict`:

```
predict(fits[, 1])
#Error in UseMethod("predict") :
# no applicable method for 'predict' applied to an object of class "list"
```

**This is a good example of how `sapply` can be destructive.** We want `lapply`, or at least `sapply(..., simplify = FALSE)`:

```
fits <- lapply(names(trees),
               function (y) do.call(lm, list(formula = paste0(y, " ~ ."), data = trees)))
```

The result of `lapply` is easier to understand. It is a length-3 list, where each element is an `lm` object. We can access the first model via `fits[[1]]`. Now everything works:

```
class(fits[[1]])
# "lm"
predict(fits[[1]])
# 1 2 3 4 5 6 7 8
# 9.642878 9.870295 9.941744 10.742507 10.801587 10.886282 10.859264 10.957380
# 9 10 11 12 13 14 15 16
#11.588754 11.289186 11.946525 11.458400 11.536472 11.835338 11.133042 11.783583
# 17 18 19 20 21 22 23 24
#13.547349 12.252715 12.603162 12.765403 14.002360 13.364889 14.535617 15.016944
# 25 26 27 28 29 30 31
#15.628799 17.945166 17.958236 18.556671 17.229448 17.131858 21.888147
```
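The `sapply(..., simplify = FALSE)` alternative mentioned earlier behaves like `lapply` but also names the list elements after the input, which can make the models easier to pick out. A small sketch, again with illustrative names (`fit_one`, `fits_named`):

```r
# Hypothetical helper (illustration only): fit one lm per response column of `trees`
fit_one <- function(y) do.call(lm, list(formula = paste0(y, " ~ ."), data = trees))

# simplify = FALSE keeps a plain list; USE.NAMES (on by default) names it
# after the character input
fits_named <- sapply(names(trees), fit_one, simplify = FALSE)

names(fits_named)                         # one name per response variable
class(fits_named[["Girth"]])              # each element is still an lm object
head(predict(fits_named[["Girth"]]), 3)   # so S3 dispatch works again
```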

You can fix your code with:

```
fits <- lapply(1:2,fitOneSample,data,sampleSet)
probability2 <-predict(fits[[1]],newdata = data)
```
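To predict from every fitted model at once, the same `lapply` pattern can be applied to the list of fits. The sketch below runs on simulated data, since the original `data` is not shown. The data frame `sim`, its columns, and the helper `fit_subset` are assumptions standing in for the real data and `fitOneSample`. Note that `predict` on a `glm` returns values on the link (log-odds) scale by default; if probabilities are wanted, as the variable name `probability` suggests, `type = "response"` is probably what you are after.

```r
set.seed(1)

# Simulated stand-in for the asker's data (assumption: binary `target`,
# numeric predictors)
sim <- data.frame(target = rbinom(200, 1, 0.5),
                  x1 = rnorm(200),
                  x2 = rnorm(200))

# Hypothetical simplified version of fitOneSample: fit on a random half of the rows
fit_subset <- function(i, data) {
  rows <- sample(nrow(data), 100)
  glm(target ~ x1 + x2, family = binomial(link = "logit"), data = data[rows, ])
}

# lapply keeps each glm object intact
fits <- lapply(1:2, fit_subset, data = sim)

# One vector of predicted probabilities per model
probs <- lapply(fits, predict, newdata = sim, type = "response")
```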