user918967 - 9 months ago 57

R Question

I am befuddled by the format to perform a simple prediction using R's `survival` package.

`library(survival)`

`lung.surv <- survfit(Surv(time,status) ~ 1, data = lung)`

So fitting a simple exponential regression (for example purposes only) is:

`lung.reg <- survreg(Surv(time,status) ~ 1, data = lung, dist="exponential")`

How would I predict the percent survival at time=400?

When I use the following:

`myPredict400 <- predict(lung.reg, newdata=data.frame(time=400), type="response")`

I get the following:

`myPredict400`

1

421.7758

I was expecting something like 37%, so I am missing something pretty obvious.

Answer

The point of `survreg` is to fit a parametric distribution to the survival times, which associates each survival time with a probability. Once you have that distribution, you can read off the survival probability at a given time.

Try this:

```
library(survival)
# No dist argument: survreg() defaults to a Weibull distribution
lung.reg <- survreg(Surv(time, status) ~ 1, data = lung)
pct <- 1:99/100 # grid of cumulative probabilities at which to compute quantiles
# The model has no covariates, so newdata only sets the number of rows returned
myPredict400 <- predict(lung.reg, newdata = data.frame(time = 400), type = "quantile", p = pct)
indx <- which.min(abs(myPredict400 - 400)) # quantile closest to a survival time of 400
print(1 - pct[indx]) # 0.39
```

Straight from the help docs, here's a plot of it:

```
matplot(myPredict400, 1-pct, xlab="Days", ylab="Survival", type='l', lty=1, col=1) # lung$time is recorded in days
```
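The grid search above only gets as close to 400 as the spacing of `pct` allows. As a rough sketch, the same number can be read directly off the fitted Weibull distribution, assuming the usual `survreg` parameterisation in which the Weibull shape is `1 / scale` and the Weibull scale is `exp(intercept)`. The names `wb_shape`, `wb_scale`, and `surv400` are made up for this illustration.

```r
library(survival)

# Same intercept-only fit as above (default distribution: Weibull)
lung.reg <- survreg(Surv(time, status) ~ 1, data = lung)

# Translate survreg's location-scale parameters into pweibull()'s
wb_shape <- 1 / lung.reg$scale
wb_scale <- exp(unname(coef(lung.reg)))

# S(400) = P(T > 400) under the fitted Weibull
surv400 <- pweibull(400, shape = wb_shape, scale = wb_scale, lower.tail = FALSE)
print(surv400)
```

This should agree with the grid-based answer up to the grid resolution, and it avoids having to choose `pct` at all.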

**Edited**

You're basically fitting a regression to a distribution of probabilities (hence 1...99 out of 100). If you make it go to 100, then the last value of your prediction is `Inf` because the survival time at the 100th percentile is infinite. This is what `type='quantile'` and the `p` argument (set to `pct` here) control.

For example, setting `pct = 1:999/1000` gives much more precise values for the prediction (`myPredict400`). Also, if you set `pct` to some value that's not a proper probability (i.e. less than 0 or more than 1), you'll get an error. I suggest you play with these values and see how they impact your survival rates.
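As for the 421.7758 in the question: `type="response"` returns a predicted time on the original scale, not a probability. For the intercept-only exponential model that value is `exp(intercept)`, the fitted mean survival time in days. A minimal sketch of turning it into a survival probability at 400 days follows; `lung.exp`, `mean_time`, and `surv400` are names chosen here for illustration.

```r
library(survival)

# The exponential model from the question
lung.exp <- survreg(Surv(time, status) ~ 1, data = lung, dist = "exponential")

# exp(intercept) is the fitted mean survival time, which is what
# predict(..., type = "response") reports for this model
mean_time <- exp(unname(coef(lung.exp)))

# Exponential survival function: S(t) = exp(-t / mean)
surv400 <- exp(-400 / mean_time)
print(round(surv400, 3))
```

With the mean of 421.7758 reported in the question, this works out to exp(-400 / 421.7758), or roughly 0.387.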

Source (Stackoverflow)