taras.roshko taras.roshko - 5 months ago 12
Python Question

What is the way data is stored in *.npy?

I'm saving NumPy arrays using numpy.save function.
I want other developers to have capability to read data from those file using C language.
So I need to know,how numpy organizes binary data in file.OK, it's obvious when I'm saving array of 'i4' but what about array of arrays that contains some structures?Can't find any info in documentation

UPD :
lets say tha data is something like :

dt = np.dtype([('outer','(3,)<i4'),('outer2',[('inner','(10,)<i4'),('inner2','f8')])])


UPD2 : What about saving "dynamic" data (dtype - object)

import numpy as np
a = [0,0,0]
b = [0,0]
c = [a,b]
dtype = np.dtype([('Name', '|S2'), ('objValue', object)])
data = np.zeros(3, dtype)
data[0]['objValue'] = a
data[1]['objValue'] = b
data[2]['objValue'] = c
data[0]['Name'] = 'a'
data[1]['Name'] = 'b'
data[2]['Name'] = 'c'

np.save(r'D:\in.npy', data)


Is it real to read that thing from C?

Answer

The npy file format is documented in numpy's https://github.com/numpy/numpy/blob/master/doc/neps/npy-format.rst.

For instance, the code

>>> dt=numpy.dtype([('outer','(3,)<i4'),
...                 ('outer2',[('inner','(10,)<i4'),('inner2','f8')])])
>>> a=numpy.array([((1,2,3),((10,11,12,13,14,15,16,17,18,19),3.14)),
...                ((4,5,6),((-1,-2,-3,-4,-5,-6,-7,-8,-9,-20),6.28))],dt)
>>> numpy.save('1.npy', a)

results in the file:

93 4E 55 4D 50 59                      magic ("\x93NUMPY")
01                                     major version (1)
00                                     minor version (0)

96 00                                  HEADER_LEN (0x0096 = 150)
7B 27 64 65 73 63 72 27 
3A 20 5B 28 27 6F 75 74 
65 72 27 2C 20 27 3C 69 
34 27 2C 20 28 33 2C 29 
29 2C 20 28 27 6F 75 74 
65 72 32 27 2C 20 5B 28 
27 69 6E 6E 65 72 27 2C 
20 27 3C 69 34 27 2C 20 
28 31 30 2C 29 29 2C 20 
28 27 69 6E 6E 65 72 32                Header, describing the data structure
27 2C 20 27 3C 66 38 27                "{'descr': [('outer', '<i4', (3,)),
29 5D 29 5D 2C 20 27 66                            ('outer2', [
6F 72 74 72 61 6E 5F 6F                               ('inner', '<i4', (10,)), 
72 64 65 72 27 3A 20 46                               ('inner2', '<f8')]
61 6C 73 65 2C 20 27 73                            )],
68 61 70 65 27 3A 20 28                  'fortran_order': False,
32 2C 29 2C 20 7D 20 20                  'shape': (2,), }"
20 20 20 20 20 20 20 20 
20 20 20 20 20 0A 

01 00 00 00 02 00 00 00 03 00 00 00    (1,2,3)
0A 00 00 00 0B 00 00 00 0C 00 00 00
0D 00 00 00 0E 00 00 00 0F 00 00 00
10 00 00 00 11 00 00 00 12 00 00 00
13 00 00 00                            (10,11,12,13,14,15,16,17,18,19)
1F 85 EB 51 B8 1E 09 40                3.14

04 00 00 00 05 00 00 00 06 00 00 00    (4,5,6)
FF FF FF FF FE FF FF FF FD FF FF FF
FC FF FF FF FB FF FF FF FA FF FF FF
F9 FF FF FF F8 FF FF FF F7 FF FF FF 
EC FF FF FF                            (-1,-2,-3,-4,-5,-6,-7,-8,-9,-20)
1F 85 EB 51 B8 1E 19 40                6.28
Comments