Peta Hitchens Peta Hitchens - 3 months ago 11
R Question

Calculating trapezoidal integration from origin to each time point in R

I have a data frame

x
:

head(x)
# time Qfr
#1 1 0.004751271
#2 2 0.005405618
#3 3 0.005785781
#4 4 0.006028213
#5 5 0.006179973
#6 6 0.006263814


I am trying to calculate the numerical integration from
time = 0
up to each time point, i.e., the integral:

\integral_{u=0}^t Qfr du


My data set looks like

plot(x, type = "p", cex = 0.2)


enter image description here

So far I have only been able to calculate the integral in total, using the package
pracma
:

require(pracma)
trapz(x$time, x$Qfr)
# [1] 0.1536843


How do I code for integral from the origin to the time given in that row?

Any help greatly appreciated!




x <-
structure(list(time = 1:100, Qfr = c(0.00475127142639315, 0.00540561802578535,
0.00578578141896237, 0.00602821304872631, 0.00617997318815436,
0.00626381438010966, 0.0062930341038365, 0.00627650284793016,
0.00622076547748955, 0.00613104312485634, 0.00601175416200995,
0.00586680072681021, 0.00569973138194467, 0.00551383427584607,
0.00531218958660475, 0.00509769744944577, 0.00487309097312275,
0.00464094029551979, 0.0044036514994002, 0.00416346290979426,
0.00392244046575488, 0.00368247330791138, 0.00344527034180023,
0.00321235826358148, 0.00298508133306843, 0.00276460302703881,
0.00255190958997126, 0.0023478154110241, 0.00215297008955578,
0.00196786700285879, 0.00179285315617775, 0.00162814007427384,
0.00147381548391774, 0.00132985553610085, 0.00119613732394456,
0.00107245146585054, 0.000958514542040229, 0.000853981195025623,
0.000758455729566888, 0.000671503074231956, 0.000592658993812166,
0.000521439468716574, 0.000457349183339767, 0.000399889089676022,
0.000348563034666554, 0.00030288345957139, 0.000262376196826176,
0.000226584404259194, 0.000195071688184451, 0.000167424475817082,
0.000143253703811788, 0.000122195893694217, 0.000103913686771705,
8.80959110345906e-05, 7.44572508696844e-05, 6.27375873853488e-05,
5.27010730706422e-05, 4.4134999643845e-05, 3.68485125344791e-05,
3.06712197100413e-05, 2.54517366998868e-05, 2.1056203851463e-05,
1.73668062169167e-05, 1.42803211212964e-05, 1.17067134905708e-05,
9.56779447711296e-06, 7.79595484853561e-06, 6.33298101979921e-06,
5.1289585088234e-06, 4.14126496950611e-06, 3.33365277945052e-06,
2.67541940141655e-06, 2.14066236056745e-06, 1.70761464370064e-06,
1.35805559020556e-06, 1.07679186575234e-06, 8.51202848467075e-07,
6.7084467545985e-07, 5.27107259938671e-07, 4.12918764032332e-07,
3.22492271674051e-07, 2.51109725048683e-07, 1.94938546369442e-07,
1.50876746740867e-07, 1.16422711484835e-07, 8.95662353428025e-08,
6.86977528360132e-08, 5.25330624541746e-08, 4.00511738714265e-08,
3.0443212344273e-08, 2.30705923615172e-08, 1.74309232147824e-08,
1.31303328068787e-08, 9.86109389078818e-09, 7.38361045310242e-09,
5.51197296231586e-09, 4.102421614833e-09, 3.044168569724e-09,
2.25212541292965e-09, 1.66116273443706e-09)), .Names = c("time",
"Qfr"), class = "data.frame", row.names = c(NA, -100L))

Answer

Since the other answer shows you how to use pracma::trapz to achieve your purpose, I can't do that way. I had planned to wrote an answer that way, but since I spent a great deal of time editing your question, @shayaa took the first place. Luckily, I have a much better idea.

Trapezoidal numerical integral is nothing complicated. You already have time on a regular grid 1, 2, 3, ... 100 with bin size 1, as well as known function values Qfr on the grid. The numerical integral on each bin is just the area of the trapezoid. So, you could compute:

## integration on each bin cell
cell <- with(x, (Qfr[1:99] + Qfr[2:100]) / 2)

Then, the cumulative integral value you want is just:

cumsum(c(0, cell))

This method is supper fast! Suppose you have N data points, this method has computational costs of O(N), yet it is fully vectorized. The other answer using sapply is not vectorized, and will cost you O(N^2) computation.