Andreas Andreas - 1 month ago 11
R Question

R formula in survfit

I am having trouble with formulas, environments, and

survfit()
.

Things work fine for
lm()
but they fail for
survfit()
.

General problem statement:



I am fitting a series of formulas to some data. So, I call the
modeling function with the formula passed as a variable. Later,
I want to work with the formula from the fitted object.

(From my naive point of view, the trouble comes from survfit not
recording the environment.)

Detailed Example



Expected behaviour as seen in
lm()
:

library("plyr")

preds <- c("wt", "qsec")

f <- function() {
lm(mpg ~ wt, data = mtcars)
}

fits <- alply(preds, 1, function(pred)
{
modform <- reformulate(pred, response = "mpg")

lm(modform, data = mtcars)
})

fits[[1]]$call$formula
##modform
formula(fits[[1]])
## mpg ~ wt
## <environment: 0x1419d1a0>


Even though
fits[[1]]$call$formula
resolves to
modform
I can
still get the original formula with
formula(fits[[1]])
.

But things fail for
survfit()
:

library("plyr")
library("survival")

preds <- c("resid.ds", "rx", "ecog.ps")

fits <-
alply(preds, 1, function(pred)
{
modform <- paste("Surv(futime, fustat)", pred, sep = " ~ ")
modform <- as.formula(modform)
print(modform)

fit <- survfit(modform, data = ovarian)
})

fits[[1]]$call$formula
## modform
formula(fits[[1]])
## Error in eval(expr, envir, enclos) : object 'modform' not found


Here (and in contrast to lm-fits),
formula(fits[[1]])
does not
work!

So, my specific question is: How can I get back the formula used
to fit
fits[[1]]
?

Answer

The issue is that when x$formula is NULL, for an lm object there is a backup plan to get the formula; this doesn't exist for survfit objects

library("plyr")
library("survival")

preds <- c("wt", "qsec")
f <- function() lm(mpg ~ wt, data = mtcars)

fits <- alply(preds, 1, function(pred) {
  modform <- reformulate(pred, response = "mpg")
  lm(modform, data = mtcars)
})

fits[[1]]$formula
# NULL

The formula can be extracted with formula(fits[[1]]) which uses the formula generic. The lm S3 method for formula is

stats:::formula.lm

# function (x, ...) 
# {
#   form <- x$formula
#   if (!is.null(form)) {
#     form <- formula(x$terms)
#     environment(form) <- environment(x$formula)
#     form
#   }
#   else formula(x$terms)
# }

So when fits[[1]]$formula returns NULL, forumla.lm looks for a terms attribute in the object and finds the formula that way

fits[[1]]$terms

The survfit objects don't have x$formula or x$terms, so formula(x) givens an error

preds <- c("resid.ds", "rx", "ecog.ps")
fits <-  alply(preds, 1, function(pred) {
    modform <- paste("Surv(futime, fustat)", pred, sep = " ~ ")
    modform <- as.formula(modform)
    fit <- survfit(modform, data = ovarian)
  })

fits[[1]]$formula
# NULL

formula(fits[[1]]) ## error

formula(fits[[1]]$terms)
# list()

You can fix this by inserting the formula into the call and evaluating it

modform <- as.formula(paste("Surv(futime, fustat)", 'rx', sep = " ~ "))
substitute(survfit(modform, data = ovarian), list(modform = modform))
# survfit(Surv(futime, fustat) ~ rx, data = ovarian)

eval(substitute(survfit(modform, data = ovarian), list(modform = modform)))

# Surv(futime, fustat) ~ rx

# Call: survfit(formula = Surv(futime, fustat) ~ rx, data = ovarian)
# 
#      n events median 0.95LCL 0.95UCL
# rx=1 13      7    638     268      NA
# rx=2 13      5     NA     475      NA

Or by manually putting the formula into x$call$formula

fit <- survfit(modform, data = ovarian)
fit$call$formula
# modform
fit$call$formula <- modform
fit$call$formula
# Surv(futime, fustat) ~ rx

fit
# Call: survfit(formula = Surv(futime, fustat) ~ rx, data = ovarian)
# 
#      n events median 0.95LCL 0.95UCL
# rx=1 13      7    638     268      NA
# rx=2 13      5     NA     475      NA