Huanian Zhang - 3 months ago
Python Question

How to distribute data on a 2D map grid and take the average

I am trying to make a map in 2D coordinates with the color defined by a third variable. I have already defined the grid with the following commands:

b_step = np.linspace(-75,90,12)
l_step = np.linspace(0,360,25)
grid = [(x,y) for x in b_step for y in l_step]

There are three variables in my data set: two are the coordinates and the third is the actual data value. There are about 7 million data points. I first want to distribute the data over those grid cells and then take the average within each cell. Finally, I will use the per-cell averages to draw the map. Does anyone have an idea how to distribute the data over the grid efficiently and take the average?

I know that ROOT (a powerful software package used in the high-energy physics community) can handle this, but I would like to write it in a more Pythonic way. Thanks.


ROOT's TH2F is an efficient way to handle this. If you create two TH2F histograms, one accumulating the sum of the data values and the other counting the number of entries, then you can calculate the mean value in each grid cell by dividing the first by the second. The Python code for this is below:

import numpy as np
from ROOT import TH2F, gStyle, TCanvas

##### if you want equally distributed grid points.
#h1 = TH2F('h1','h1',l_num,0.0,360.0,b_num,-90.0,90.0)
#h2 = TH2F('h2','h2',l_num,0.0,360.0,b_num,-90.0,90.0)

##### if you want non-equally distributed grid points.
xBins = 37
yBins = 17
xEdges = np.linspace(-185,185,38)
yEdges = np.array([-105.0,-75.0,-60.0,-45.0,-30.0,-15.0,15.0,35.0,40.0,45.0,50.0,55.0,60.0,65.0,70.0,75.0,80.0,100.0])
h1 = TH2F('h1','h1',xBins,xEdges,yBins,yEdges)
h2 = TH2F('h2','h2',xBins,xEdges,yBins,yEdges)

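# Fill both histograms point by point. The names x, y and data are
# placeholders for the two coordinate arrays and the value array.
for i in range(data_size):
    h1.Fill(x[i], y[i], data[i])  # weighted fill: sum of the values per bin
    h2.Fill(x[i], y[i])           # unweighted fill: number of entries per bin

# ROOT bin indices start at 1 (index 0 is the underflow bin).
for ii in range(1,h1.GetNbinsX()+1):
    for jj in range(1,h1.GetNbinsY()+1):
        ss = h1.GetBinContent(ii,jj)  # sum of the values in this bin
        nn = h2.GetBinContent(ii,jj)  # number of entries in this bin
        xx = h1.GetXaxis().GetBinCenter(ii)
        yy = h1.GetYaxis().GetBinCenter(jj)
        if nn > 0:                    # skip empty bins to avoid dividing by zero
            mean = ss/nn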

Now you have the grid coordinates xx and yy and the mean value within each cell, mean, so you can make color plots.
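
If ROOT is not available, the same sum-and-count idea can probably be written with NumPy alone. The following is only a minimal sketch: it assumes the question's `b_step` and `l_step` values are meant as bin edges rather than bin centers, and the function name `binned_mean` and the arrays `b`, `l` and `values` are hypothetical names, with random toy data standing in for the real data set.

```python
import numpy as np

def binned_mean(b, l, values, b_edges, l_edges):
    """Average `values` in each (b, l) cell; cells with no data become NaN."""
    # Sum of the values falling in each cell (the role of h1 above).
    sums, _, _ = np.histogram2d(b, l, bins=[b_edges, l_edges], weights=values)
    # Number of points falling in each cell (the role of h2 above).
    counts, _, _ = np.histogram2d(b, l, bins=[b_edges, l_edges])
    # Divide only where a cell is populated, so empty cells stay NaN.
    mean = np.full(sums.shape, np.nan)
    np.divide(sums, counts, out=mean, where=counts > 0)
    return mean, counts

# Hypothetical toy data standing in for the real data set.
rng = np.random.default_rng(0)
b = rng.uniform(-75.0, 90.0, 10_000)
l = rng.uniform(0.0, 360.0, 10_000)
values = np.sin(np.radians(l)) + rng.normal(0.0, 0.1, 10_000)

# Assumption: the question's grid values are used as bin edges.
b_edges = np.linspace(-75, 90, 12)   # 12 edges give 11 bins in b
l_edges = np.linspace(0, 360, 25)    # 25 edges give 24 bins in l

mean, counts = binned_mean(b, l, values, b_edges, l_edges)
print(mean.shape)
```

The first axis of `mean` follows `b` and the second follows `l`, so for a color map it should be possible to pass it to matplotlib as `pcolormesh(l_edges, b_edges, mean)`. Both histogram calls are vectorized, which avoids a Python-level loop over the 7 million points.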