AwfulPersimmon - 1 year ago 47

R Question

I am trying to find the values which contribute the most and the least to a Gaussian kernel density estimator. I've written a function to find these, but I'm getting multiple values for a max when I run it. I thought this might be related to the number of significant digits, so I increased those, but nothing changed.

Can anyone offer some insight?

#Create a vector with Brainsize-Small Litter Size Data:

Y_i<-c(0.42, 0.86, 0.88, 1.11, 1.34, 1.38, 1.42, 1.47, 1.63,1.73, 2.17, 2.42, 2.48, 2.74, 2.74, 2.79, 2.90, 3.12,3.18, 3.27, 3.30, 3.61, 3.63, 4.13, 4.40, 5.00, 5.20,5.59, 7.04, 7.15, 7.25, 7.75, 8.00, 8.84, 9.30, 9.68,10.32, 10.41, 10.48, 11.29, 12.30, 12.53, 12.69, 14.14, 14.15,14.27, 14.56, 15.84, 18.55, 19.73, 20.00)

#Create a kernel density estimator function:

kern<-function(data=0,effOf=0,bw=0){

z<-rep(0,length(data))

k<-rep(0,length(data))

for(i in data){

z[i]=((effOf-data[i])/bw)

}

for(j in z){

k[j]=(1/sqrt(2*pi))*exp(-(z[j]^2/2))/(length(data)*bw)

}

min=min(k)

imin=which(k==min)

ymin=data[imin]

max=max(k)

imax=which(k==max)

ymax=data[imax]

print(paste("The minimum contributor is value",ymin))

print(paste("The maximum contributor is value",ymax))

estimate=sum(k)

return(estimate)

}

#(a.)Use KDE function to estimate f(16)

kern(Y_i,16,3)

This is the output. Keep in mind that I only want one maximum value:

[1] "The minimum contributor is value 0.42" "The minimum contributor is value 0.86"

[3] "The minimum contributor is value 0.88" "The minimum contributor is value 1.38"

[5] "The minimum contributor is value 1.42" "The minimum contributor is value 1.47"

[7] "The minimum contributor is value 1.63" "The minimum contributor is value 1.73"

[9] "The minimum contributor is value 2.17" "The minimum contributor is value 2.42"

[11] "The minimum contributor is value 2.48" "The minimum contributor is value 2.74"

[13] "The minimum contributor is value 2.74" "The minimum contributor is value 2.79"

[15] "The minimum contributor is value 2.9" "The minimum contributor is value 3.12"

[17] "The minimum contributor is value 3.18" "The minimum contributor is value 3.27"

[19] "The minimum contributor is value 3.3" "The minimum contributor is value 3.61"

[21] "The minimum contributor is value 3.63" "The minimum contributor is value 4.13"

[23] "The minimum contributor is value 4.4" "The minimum contributor is value 5"

[25] "The minimum contributor is value 5.2" "The minimum contributor is value 5.59"

[27] "The minimum contributor is value 7.04" "The minimum contributor is value 7.15"

[29] "The minimum contributor is value 7.25" "The minimum contributor is value 7.75"

[31] "The minimum contributor is value 8" "The minimum contributor is value 8.84"

[33] "The minimum contributor is value 9.3" "The minimum contributor is value 9.68"

[35] "The minimum contributor is value 10.32" "The minimum contributor is value 10.41"

[37] "The minimum contributor is value 10.48" "The minimum contributor is value 11.29"

[39] "The minimum contributor is value 12.3" "The minimum contributor is value 12.53"

[41] "The minimum contributor is value 12.69" "The minimum contributor is value 14.14"

[43] "The minimum contributor is value 14.15" "The minimum contributor is value 14.27"

[45] "The minimum contributor is value 14.56" "The minimum contributor is value 15.84"

[47] "The minimum contributor is value 18.55" "The minimum contributor is value 19.73"

[49] "The minimum contributor is value 20"

[1] "The maximum contributor is value 1.34"

[1] 2.868016e-08

Answer Source

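I think the correct code should be like the following, with 2 modifications. The indexing in both for loops was incorrect: `for(i in data)` iterates over the data values themselves rather than over the positions 1 to `length(data)`, so `z[i]` and `data[i]` were indexed by non-integer values, which R truncates toward zero. The same happened with `for(j in z)`. This resulted in wrong z and kernel values: almost every entry of `k` kept its initial value of 0, so all of those entries tied for the minimum and `which(k==min)` returned all of them: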

```
kern<-function(data=0,effOf=0,bw=0){
  z<-rep(0,length(data))
  k<-rep(0,length(data))
  for(i in 1:length(data)){ # iterate through all data points
    z[i]=((effOf-data[i])/bw)
  }
  for(j in 1:length(z)){ # iterate through all z values
    k[j]=(1/sqrt(2*pi))*exp(-(z[j]^2/2))/(length(data)*bw)
  }
  min=min(k)
  imin=which(k==min)
  ymin=data[imin]
  max=max(k)
  imax=which(k==max)
  ymax=data[imax]
  print(paste("The minimum contributor is value",ymin))
  print(paste("The maximum contributor is value",ymax))
  estimate=sum(k)
  return(estimate)
}
#(a.)Use KDE function to estimate f(16)
kern(Y_i,16,3)
# Output:
# [1] "The minimum contributor is value 0.42"
# [1] "The maximum contributor is value 15.84"
# [1] 0.02254657
```
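
As a side note, the same calculation can probably be written without explicit loops, which avoids the indexing problem altogether. The following is only a sketch: `kern_vec` is a made-up name (not from the question), it returns a list instead of printing, and it uses `dnorm()` for the standard normal density in place of the hand-written formula. `which.min()` and `which.max()` return only the first matching index, so each gives exactly one value even if several contributions are tied.

```r
# Sketch of a vectorized version; kern_vec is a hypothetical name.
kern_vec <- function(data, effOf, bw) {
  # Contribution of every observation to the estimate at effOf.
  # dnorm(z) is (1/sqrt(2*pi)) * exp(-z^2/2), the Gaussian kernel.
  k <- dnorm((effOf - data) / bw) / (length(data) * bw)
  list(
    estimate = sum(k),             # the density estimate at effOf
    ymin = data[which.min(k)],     # first observation with the smallest contribution
    ymax = data[which.max(k)]      # first observation with the largest contribution
  )
}

Y_i <- c(0.42, 0.86, 0.88, 1.11, 1.34, 1.38, 1.42, 1.47, 1.63, 1.73,
         2.17, 2.42, 2.48, 2.74, 2.74, 2.79, 2.90, 3.12, 3.18, 3.27,
         3.30, 3.61, 3.63, 4.13, 4.40, 5.00, 5.20, 5.59, 7.04, 7.15,
         7.25, 7.75, 8.00, 8.84, 9.30, 9.68, 10.32, 10.41, 10.48, 11.29,
         12.30, 12.53, 12.69, 14.14, 14.15, 14.27, 14.56, 15.84, 18.55,
         19.73, 20.00)

res <- kern_vec(Y_i, 16, 3)
print(res$ymin)
print(res$ymax)
print(res$estimate)
```

With the data from the question this should give the same minimum contributor, maximum contributor and estimate as the corrected loop version above.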