I found codes to calculate the sum of the area under a density curve in R. Unfortunately, I don't understand why there is always an extra ~"0.000976" at the area...
nb.data = 500000
y = rnorm(nb.data,10,2)
de = density(y)
This discrepancy is not just due to rounding errors or floating-point arithmetic. You are using linear interpolation (a.k.a. the trapzoidal rule), which means that you are overestimating the area in areas of the density that are concave up and underestimating it in areas that are concave down. Here's an example image from the Wikipedia article demonstrating the systematic error:
Image by Intégration_num_trapèzes.svg: Scalerderivative work: Cdang (talk) - Intégration_num_trapèzes.svg, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=8541370
Since the normal distribution has more concave up areas, the overall estimate is too high. As mentioned in another answer, using a higher resolution (i.e. increasing N) helps to minimize the error. You might also get better results using a different method for numerical integration (e.g. Simpson's rule).
However, there is no numerical integration method that is going to give you an exact answer, and even if there was, the return value of
density is only an approximation of the real distribution anyway. (And for real data, the true distribution is unknown.)
If all you want is the satisfaction of seeing a known density function integrating to 1, you can use
integrate on the normal density function:
> integrate(dnorm, lower=-Inf, upper=Inf, mean=10, sd=2) 1 with absolute error < 4.9e-06