Saul Garcia - 1 year ago 112
R Question

# `nls` fails to estimate parameters of my model

I am trying to estimate the constants for Heaps' law.
I have the following dataset, `novels_collection`:

``````
  Number of novels DistinctWords WordOccurrences
1                1         13575          117795
2                1         34224          947652
3                1         40353         1146953
4                1         55392         1661664
5                1         60656         1968274
``````
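
For anyone who wants to reproduce the problem, the five rows above can be rebuilt as a data frame. This is only a sketch: the column name `NumberOfNovels` is an assumption, since the printed header `Number of novels` contains spaces and the actual name in the original object is not shown.

```r
# Rebuild the five rows shown above.
# NOTE: `NumberOfNovels` is an assumed column name; the original object's
# first column is only known from its printed header "Number of novels".
novels_collection <- data.frame(
  NumberOfNovels  = c(1, 1, 1, 1, 1),
  DistinctWords   = c(13575, 34224, 40353, 55392, 60656),
  WordOccurrences = c(117795, 947652, 1146953, 1661664, 1968274)
)

# Columns 2 and 3 are the ones used later in the nls() call.
str(novels_collection[, 2:3])
```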

Then I build the next function:

``````
# Function for Heaps' law
heaps <- function(K, n, B) {
  K * n^B
}
heaps(2, 117795, 0.7)  # Just to test that it works
``````

So `n` is the number of word occurrences, and `K` and `B` are the constants I need to estimate in order to predict the number of distinct words.

I tried this but it gives me an error:

``````
fitHeaps <- nls(DistinctWords ~ heaps(K, WordOccurrences, B),
                data = novels_collection[, 2:3],
                start = list(K = .1, B = .1), trace = TRUE)
``````

The error is:

``````
Error in numericDeriv(form[[3L]], names(ind), env) :
  Missing value or an infinity produced when evaluating the model
``````

Any idea how I could fix this, or is there another method to fit the function and get the values for `K` and `B`?

If you take the log of both sides of `y = K * n ^ B`, you get `log(y) = log(K) + B * log(n)`. This is a linear relationship between `log(y)` and `log(n)`, so you can fit a linear regression model to estimate `log(K)` and `B`.
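
Spelled out step by step, with `a` as an assumed name for the fitted intercept (it does not appear in the original), and assuming `y`, `n` and `K` are all positive so the logs are defined:

```latex
\begin{align*}
y &= K\,n^{B} \\
\log y &= \log K + B \log n \\
a = \log K \;&\Longrightarrow\; K = e^{a}
\end{align*}
```

So the regression slope estimates `B` directly, while the intercept has to be exponentiated to recover `K`.
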
``````logy <- log(DistinctWords)