user3084259 - 2 months ago
Python Question

numpy.fromfile differences between python 2.7.3 and 2.7.6

I'm having an issue running the same code on two computers, and I've narrowed it down to a difference between the versions of Python installed on them (2.7.3 and 2.7.6 respectively).

Here is the input file found on github (https://github.com/tkkanno/PhD_work/blob/master/1r).

With Python 2.7.3 and NumPy 1.11.1, the following code works as expected:

import numpy as np
s = 'directory/to/file'
f = open(s, 'rb')
y = np.fromfile(f,'<l')
y.shape


This gives a NumPy array of shape (16384,). However, when it is run on Python 2.7.6 with NumPy 1.11.1, it gives an array half the size, (8192,). This isn't acceptable for me.

I can't understand why NumPy behaves differently under different versions of Python. I would be grateful for any suggestions.

Answer

Converted from my comment:

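You're likely running on different Python/OS builds with different notions of how big a long is. The 'l' type code in '<l' means a C long. C doesn't require a specific long size, and in practice, Windows always treats it as 32 bits, while other common desktop OSes treat it as 32 bits if the OS & Python are built for 32 bit CPUs (ILP32), and 64 bits if built for 64 bit CPUs (LP64). That matches what you see: the same bytes are read as 16384 elements where a long is 4 bytes and as 8192 elements where it is 8 bytes, so the Python version itself is probably not the cause.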
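A quick way to check which case applies on each machine is to ask NumPy for the item size of each type code. This is only a sketch: the 65536-byte file size is an assumption inferred from the shapes reported in the question (16384 values of 4 bytes each), not something measured from the file.

```python
import numpy as np

# Size in bytes of a C long on this build: 4 or 8 depending on platform
long_size = np.dtype('<l').itemsize
# Size in bytes of an explicit 32-bit integer: 4 on every platform
fixed_size = np.dtype('<i4').itemsize

# Assumed file size, inferred from the (16384,) result with 4-byte values
file_bytes = 16384 * 4

print(long_size)                 # platform dependent
print(file_bytes // long_size)   # element count you would get with '<l'
print(file_bytes // fixed_size)  # element count you would get with '<i4'
```

If the first number printed differs between the two machines, the long size explains the different shapes.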

If you want a fixed-width type on all OSes, don't use the system-dependent-width types; use fixed-width types instead. From your comments, the expected behavior is to load 32-bit (4-byte) values. If you were just using native endianness, you could pass numpy.int32 (NumPy accepts the raw class as a dtype). Since you want to specify endianness explicitly (perhaps because this might run on a big-endian system), you can instead pass '<i4', which explicitly states that it's a little-endian (<) signed integer (i) four bytes in size (4):

import numpy as np
s  = 'directory/to/file'
with open(s, 'rb') as f:       # Use with statements when opening files
    y = np.fromfile(f, '<i4')  # Use explicit fixed width type
y.shape
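The halving can be reproduced on any single machine without the original file. The sketch below uses a synthetic temporary file as a stand-in for the real input (the file contents and the temporary path are made up for illustration) and reads it once with a 4-byte type and once with an 8-byte type.

```python
import os
import tempfile
import numpy as np

# Create a throwaway file holding 16384 little-endian 32-bit integers
# (a stand-in for the real input file, which is not used here)
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    np.arange(16384, dtype='<i4').tofile(path)

    as_i4 = np.fromfile(path, dtype='<i4')  # 4 bytes per element
    as_i8 = np.fromfile(path, dtype='<i8')  # 8 bytes per element, like a 64-bit long
finally:
    os.remove(path)

print(as_i4.shape)  # (16384,)
print(as_i8.shape)  # (8192,)
```

Reading with '<i8' pairs up adjacent 4-byte values, so the element count is halved and the values themselves are wrong. With '<i4' the result is the same on every platform.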