dindom dindom - 1 month ago 19
Python Question

Can two Python programs write and read the same HDF5 file at the same time?

I have two Python programs:


1) Data receiver: a TCP server written in Tornado. About 3,000 rows of data are sent to it every second. Here is the handler:


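import time
import pandas as pd

def _on_data_rev(data_list):
    # One HDF5 file per day; each batch is stored under a timestamp-based key.
    tickstore_file = r"d:\data\2016_01_11.h5"
    tempdf = pd.DataFrame(data_list)
    hdf_output = pd.HDFStore(tickstore_file, complib='blosc')
    hdf_output['_' + str(int(time.time()))] = tempdf
    hdf_output.flush()
    # Close the store so file handles do not accumulate across calls.
    hdf_output.close()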


The data arrives very fast, so this program only saves it and does nothing else.


2) Data analysis: the second program analyzes the latest data in the same file every second.


Can I read the HDF5 file while the first program is writing to it? Will this corrupt the HDF5 file?

Answer

What you're looking for is the Single Writer Multiple Reader (SWMR) feature of HDF5.

SWMR is listed as new in the HDF5 1.10 release and has a fair bit of documentation.

It is also available in h5py as of version 2.5.0.

As for support in pandas, I am not too sure as I don't use it.
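
For illustration, here is a minimal sketch of the SWMR pattern using h5py directly rather than pandas. It assumes h5py 2.5.0 or later built against HDF5 1.10 or later. The dataset name "ticks" and the helper names create_swmr_file, append_rows and read_latest are made up for this example and are not part of any library.

```python
import h5py
import numpy as np

def create_swmr_file(path):
    # Writer side: SWMR needs the latest file format, and all datasets
    # must be created before SWMR mode is switched on.
    f = h5py.File(path, "w", libver="latest")
    dset = f.create_dataset("ticks", shape=(0,), maxshape=(None,),
                            dtype="f8", chunks=(1024,))
    f.swmr_mode = True
    return f, dset

def append_rows(dset, rows):
    # Grow the dataset, write the new rows, then flush so that
    # readers can see them.
    rows = np.asarray(rows, dtype="f8")
    n = dset.shape[0]
    dset.resize((n + len(rows),))
    dset[n:] = rows
    dset.flush()

def read_latest(path, count):
    # Reader side: open read-only in SWMR mode and refresh the dataset
    # metadata before reading the last `count` values.
    with h5py.File(path, "r", libver="latest", swmr=True) as f:
        dset = f["ticks"]
        dset.refresh()
        n = dset.shape[0]
        return dset[max(0, n - count):n]
```

The writer process would call create_swmr_file once and append_rows for each incoming batch, while the analysis process calls read_latest once per second. Only one process may hold the file open for writing at any time.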
