shjnlee shjnlee - 1 year ago 60
Python Question

Read binary file of logging data and output to new file with int (python)

I've been working on an embedded software project that writes sensor data to an SD card using the FATFS module. The data type is uint32_t (4 bytes) and the output is a binary file.

I'm trying to write a python script to read the binary file (and parse the data to int and write to a new file). My current code,

def read():
    with open("INPUT1.TXT", "rb") as binary_file:
        # Read the whole file at once
        data = binary_file.read()
        print(data)


and that gives me a chunk of values in hex:

b' \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \t \n \x0b
\x0c \r \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17
\x18 \x19 \x1a \x1b \x1c \x1d \x1e \x1f \x01 \x02 \x03
\x04 \x05 \x06 \x07 \x08 \t \n \x0b \x0c \r \x0e \x0f
\x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a
\x1b \x1c \x1d \x1e \x1f '


When I print 4 bytes at a time, some numbers even appear to be missing:

f = open("INPUT2.TXT", "rb")
try:
    bytes_read = f.read(4)
    while bytes_read:
        print(bytes_read)
        bytes_read = f.read(4)
finally:
    f.close()


gives this result:

b' ' #supposed to be \x00
b'\x01 '
b'\x02 '
b'\x03 '
b'\x04 '
b'\x05 '
b'\x06 '
b'\x07 '
b'\x08 '
b'\t ' #supposed to be \x09
b'\n ' #supposed to be \x0a
b'\x0b '
b'\x0c '
b'\r ' #supposed to be \x0d
b'\x0e '
b'\x0f '
b'\x10 '
b'\x11 '
b'\x12 '
b'\x13 '
b'\x14 '
b'\x15 '
b'\x16 '
b'\x17 '
b'\x18 '
b'\x19 '
b'\x1a '
b'\x1b '
b'\x1c '
b'\x1d '
b'\x1e '
b'\x1f '


But when I open the binary file in a hex editor, all the data appears to be correct?!

If I want to read 4 bytes at a time and write them to a new file as int, how could I achieve that?

Thanks,

Henry

Answer Source
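nums = []
with open("INPUT2.TXT", "rb") as file:
    # Read the first 4-byte chunk before the loop so `byte` is defined
    byte = file.read(4)
    while byte:
        nums.append(int.from_bytes(byte, byteorder="little"))
        # read() returns b"" at end of file, which ends the loop
        byte = file.read(4)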

This should do it for Python 3.

It looks like your bytes are flipped in your example, so I set the byte order to little. If they aren't flipped, change it to big.
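To make the byte-order point concrete, here is a minimal sketch with a made-up 4-byte sample (the value is illustrative, not taken from your file). It decodes the same bytes with both byte orders:

```python
# Hypothetical sample: the value 1 stored as a little-endian uint32.
sample = b"\x01\x00\x00\x00"

as_little = int.from_bytes(sample, byteorder="little")
as_big = int.from_bytes(sample, byteorder="big")

print(as_little)  # 1
print(as_big)     # 16777216 (0x01000000)
```

If your decoded numbers look far too large, the other byte order is probably the right one.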

Another weird thing: it looks like 0x00 is getting turned into b" " instead of b"\x00". If that's the case, then do this instead:

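nums = []
with open("INPUT2.TXT", "rb") as file:
    # Read the first 4-byte chunk before the loop so `byte` is defined
    byte = file.read(4)
    while byte:
        # Map the space bytes back to 0x00 before converting
        nums.append(int.from_bytes(byte.replace(b" ", b"\x00"), byteorder="little"))
        byte = file.read(4)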
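One caveat with the replace approach, sketched here with made-up samples: b" " is the byte 0x20, so the replace also zeroes any genuine 0x20 byte in the data (the value 32, for instance). Separately, b'\t', b'\n' and b'\r' in your output are just how Python displays the bytes 0x09, 0x0a and 0x0d, so those values are not actually missing.

```python
# b" " is the single byte 0x20, so replace() cannot tell a padding
# space apart from a real data byte with the value 32.
genuine_32 = b"\x20\x00\x00\x00"  # hypothetical sample: 32 as little-endian uint32
patched = genuine_32.replace(b" ", b"\x00")

plain_value = int.from_bytes(genuine_32, byteorder="little")
patched_value = int.from_bytes(patched, byteorder="little")
print(plain_value)    # 32
print(patched_value)  # 0

# Python prints some bytes with short escapes; the underlying values are the same.
escapes_match = (b"\x09" == b"\t", b"\x0a" == b"\n", b"\x0d" == b"\r")
print(escapes_match)  # (True, True, True)
```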

Here's an example with what you provided.

>>> test = [b'    ',
b'\x01   ',
b'\x02   ',
b'\x03   ',
b'\x04   ',
b'\x05   ',
b'\x06   ',
b'\x07   ',
b'\x08   ',
b'\t   ',
b'\n   ',
b'\x0b   ',
b'\x0c   ',
b'\r   ',
b'\x0e   ',
b'\x0f   ',
b'\x10   ',
b'\x11   ',
b'\x12   ',
b'\x13   ',
b'\x14   ',
b'\x15   ',
b'\x16   ',
b'\x17   ',
b'\x18   ',
b'\x19   ',
b'\x1a   ',
b'\x1b   ',
b'\x1c   ',
b'\x1d   ',
b'\x1e   ',
b'\x1f   ']

>>> for t in test:
...     print(int.from_bytes(t.replace(b" ", b"\x00"), byteorder="little"))
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
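
To cover the last part of the question (writing the integers to a new file), here is a minimal sketch using the standard struct module instead of a read loop. The function name convert_log and the output file name are hypothetical, and it assumes the file holds nothing but packed uint32 values, with any incomplete trailing bytes ignored:

```python
import struct

def convert_log(src_path, dst_path, byteorder="<"):
    """Read packed uint32 values from src_path and write one integer per line."""
    with open(src_path, "rb") as src:
        data = src.read()

    # Assumption: an incomplete record at the end of the file is junk, so drop it.
    usable = len(data) - (len(data) % 4)

    # "<I" is a little-endian uint32, ">I" is a big-endian uint32.
    values = [v for (v,) in struct.iter_unpack(byteorder + "I", data[:usable])]

    with open(dst_path, "w") as dst:
        for value in values:
            dst.write(f"{value}\n")
    return values
```

You would call it as convert_log("INPUT2.TXT", "OUTPUT.TXT"), passing ">" as the third argument if the data turns out to be big-endian.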