GRS - 1 year ago 73

R Question

Link to dataset

Defined parameters:

```
M <- maximum.oxygen.uptake
m <- mass
a <- age
s <- sex
v <- as.numeric(vigorous.exercise > 0)
sv <- s * v
asv <- a * s * v
as <- a * s
av <- a * v
lnm <- log(m)
lnms <- log(m) * s
lnmv <- log(m) * v
lnmsv <- log(m) * s * v
y <- M / m^(2/3)
```

I fit an `nls` model:

```
nls.full <- nls(M ~ (m ^ (alpha0 + alpha1 * s + alpha2 * v + alpha3 * s * v)) *
                  (beta0 + beta1 * s + beta2 * v + beta3 * sv +
                   a * gamma0 + gamma1 * as + gamma2 * av + gamma3 * asv),
                trace = TRUE,
                start = list(alpha0 = 2/3, alpha1 = 0, alpha2 = 0, alpha3 = 0,
                             beta0 = est[1], beta1 = est[2], beta2 = est[3], beta3 = est[4],
                             gamma0 = est[5], gamma1 = est[6], gamma2 = est[7], gamma3 = est[8]))
```

Then I try to plot the predictions:

```
xpredict <- seq(10, 120, length.out = 300)
data1 <- data.frame(a = 35, s = 0, v = 1, m = seq(10, 120, length.out = 300))
ypredict <- predict(nls.full, newdata = data1, type = "response")
plot(log(maximum.oxygen.uptake) ~ log(mass), subset = (s == '0' & v == '1'))
lines(xpredict, ypredict)
```

The call to `lines` fails with an error saying that the lengths of y and x differ.

I don't see why they should. I defined a new data frame with 300 rows, so I should get only 300 results in `y`.


Answer Source

Your question adds an important case study on the use of `predict`, which is currently missing on this site (as far as I know), so I did not close it as a duplicate as I usually would.

This simple example is sufficient to illustrate what your problem is:

```
set.seed(0)
x <- runif(50)
y <- runif(50)
## true model
z <- exp(4 * x - x * y) + sin(0.5 * y) + rnorm(50)
```

We can fit a non-linear regression model by:

```
fit1 <- nls(z ~ exp(a * x + b * x * y) + sin(c * y),
            start = list(a = 3, b = 0, c = 1))
```

or

```
xy <- x * y
fit2 <- nls(z ~ exp(a * x + b * xy) + sin(c * y),
            start = list(a = 3, b = 0, c = 1))
```

The two fits are identical, but be careful when making predictions with `predict`: `newdat` below has only two rows, yet the second call returns 50 values.

```
newdat <- data.frame(x = runif(2), y = runif(2))
pred1 <- predict(fit1, newdat)
# [1] 19.476569 2.870397
pred2 <- predict(fit2, newdat)
#[1] 12.205215 2.900922 16.675160 2.588310 18.466907 3.221744 21.207958
#[8] 2.478375 16.294230 2.230084 22.675165 2.741694 22.053141 2.441442
#[15] 20.378554 2.069649 20.362845 2.380586 10.570350 3.168567 11.477691
#[22] 2.438041 19.336928 2.648129 22.282448 2.899636 16.264152 3.229857
#[29] 19.928498 1.779721 16.563424 2.688125 14.925190 2.718176 21.853093
#[36] 1.856641 20.213350 1.957830 22.960452 2.767944 21.890656 2.719899
#[43] 22.370200 2.066384 14.061771 2.237771 12.102094 3.232742 18.985547
#[50] 1.909355
```
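Why 50 values rather than 2? A plausible reading, assuming `predict.nls` looks up each variable in `newdat` first and falls back on the values captured at fit time, is that `xy` (length 50) comes from the fit while `x` and `y` (length 2) come from `newdat` and are silently recycled. The sketch below reproduces `pred2` by hand under that assumption; `cf` and `manual` are names introduced here for illustration only.

```r
set.seed(0)
x <- runif(50)
y <- runif(50)
z <- exp(4 * x - x * y) + sin(0.5 * y) + rnorm(50)
xy <- x * y
fit2 <- nls(z ~ exp(a * x + b * xy) + sin(c * y),
            start = list(a = 3, b = 0, c = 1))

newdat <- data.frame(x = runif(2), y = runif(2))
pred2 <- predict(fit2, newdat)   # newdat has no column named xy

cf <- coef(fit2)
# Assumption being checked: x and y are taken from newdat (length 2),
# xy is the length-50 vector used for fitting, and R recycles the
# short vectors to length 50 without any warning.
manual <- exp(cf[["a"]] * newdat$x + cf[["b"]] * xy) + sin(cf[["c"]] * newdat$y)

length(pred2)
isTRUE(all.equal(as.numeric(pred2), manual))
```

If the comparison holds, the extra values are simply a mix of new and old data, so none of them should be trusted as a prediction.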

Unlike `predict.lm` and `predict.glm`, `predict.nls` does not issue any warning in this situation (see "Getting Warning: “ 'newdata' had 1 row but variables found have 32 rows” on predict.lm in R"). Basically, `newdata` has to provide every variable used in your fitting formula. Be aware that `xy` is also a variable:

```
newdat$xy <- with(newdat, x * y)
pred2 <- predict(fit2, newdat)
# [1] 19.476569 2.870397
```
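Since `predict.nls` stays silent, it can help to check `newdata` yourself before predicting. The following is a minimal sketch, not part of any package: `missing_vars` is a hypothetical helper that compares the variables on the right-hand side of the fitted formula (excluding the parameter names) with the columns of the new data frame.

```r
set.seed(0)
x <- runif(50)
y <- runif(50)
z <- exp(4 * x - x * y) + sin(0.5 * y) + rnorm(50)
xy <- x * y
fit2 <- nls(z ~ exp(a * x + b * xy) + sin(c * y),
            start = list(a = 3, b = 0, c = 1))

# Hypothetical helper: predictors used by the model but absent from newdata
missing_vars <- function(fit, newdata) {
  rhs <- all.vars(formula(fit)[[3]])            # names on the right-hand side
  predictors <- setdiff(rhs, names(coef(fit)))  # drop the parameters a, b, c
  setdiff(predictors, names(newdata))
}

newdat <- data.frame(x = runif(2), y = runif(2))
missing_before <- missing_vars(fit2, newdat)    # xy is not a column yet

newdat$xy <- with(newdat, x * y)
missing_after <- missing_vars(fit2, newdat)     # nothing left to add

pred2 <- predict(fit2, newdat)
length(pred2) == nrow(newdat)
```

For the model in the question, the formula uses `sv`, `as`, `av` and `asv`, so `data1` would presumably need those columns as well before calling `predict`, for example via `data1 <- transform(data1, sv = s * v, as = a * s, av = a * v, asv = a * s * v)`.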
