I am trying to open a file created by a piece of measurement equipment, find the bytes corresponding to the metadata, then write everything else to a new binary file. (The metadata part is not the problem: I know the headers and can find them easily. Let's not worry about that.)
The problem is: when I open the file and write the bytes into a new file, new bytes are added, which messes up the relevant data. Specifically, every time there is a '0A' byte in the original file, the new file has a '0D' byte before it.
I've gone through a few iterations of trimming down the code to find the issue. Here is the latest and simplest version, in three different ways that all produce the same result:
import mmap
import os

file_name = raw_input('Name of the file to be edited: ')
f = open(file_name, 'rb')
#1st try: using mmap, to make the metadata search easier
s = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
full_data = s.read(len(s))
with open(os.path.join('.', 'edited', ('[mmap data]' + file_name + '.bin')), 'a') as data_mmap:
    data_mmap.write(full_data)
#2nd try: using bytes, in case mmap was giving me trouble
f_byte = bytes(f.read())
with open(os.path.join('.', 'edited', ('[bytes data]' + file_name + '.bin')), 'a') as data_bytes:
    data_bytes.write(f_byte)
#3rd try: using os.read/write(file) instead of file.read() and file.write().
from os.path import getsize
o = os.open(file_name,os.O_BINARY) #only available on Windows
f_os = bytes(os.read(o,getsize(file_name)))
with open(os.path.join('.', 'edited', ('[os data]' + file_name + '.bin')), 'a') as data_os:
    data_os.write(f_os)
Welcome to the world of end-of-line markers! When a file is opened in text mode under Windows, any raw \n (hex 0x0a) will be written as \r\n (hex 0x0d 0x0a).
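If you want to see the translation happen without a Windows machine, here is a rough sketch. It assumes Python 3, where open() takes a newline argument; newline='\r\n' is used here to stand in for the Windows text-mode default. The helper name write_text_mode and the temporary file are made up for the illustration.

```python
import os
import tempfile


def write_text_mode(path, text):
    # Hypothetical helper: newline='\r\n' mimics the Windows text-mode
    # default, where every '\n' written is expanded to '\r\n'.
    with open(path, 'w', newline='\r\n') as out:
        out.write(text)


tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, 'demo.bin')
write_text_mode(demo_path, 'A\nB')

# Read the file back in binary mode to see the bytes actually on disk.
with open(demo_path, 'rb') as check:
    raw = check.read()

print(repr(raw))
```

Three characters went in, but four bytes end up on disk: that extra byte is the 0D showing up before each 0A.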
Fortunately it is easy to fix: just open the file in binary mode (note the b):
with open(..., 'ab') as data_...:
and the unwanted \r will no longer bother you :-)
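As a quick check, something along these lines should round-trip the bytes unchanged. It is only a sketch: copy_bytes, the temporary file names and the sample payload are invented for the example, and it uses 'wb' rather than 'ab' because it writes a fresh file each time.

```python
import os
import tempfile


def copy_bytes(src_path, dst_path):
    # Hypothetical helper: 'rb' and 'wb' turn off newline translation
    # on both the reading and the writing side.
    with open(src_path, 'rb') as src, open(dst_path, 'wb') as dst:
        dst.write(src.read())


tmp_dir = tempfile.mkdtemp()
src_path = os.path.join(tmp_dir, 'original.bin')
dst_path = os.path.join(tmp_dir, 'copy.bin')

# Sample data containing a lone 0x0a as well as a 0x0d 0x0a pair.
payload = b'\x01\x0a\x02\x0d\x0a\x03'
with open(src_path, 'wb') as f:
    f.write(payload)

copy_bytes(src_path, dst_path)

with open(dst_path, 'rb') as f:
    copied = f.read()

print(copied == payload)
```

The same idea applies to the three attempts in the question: the reading side was already binary in all of them, so only the mode of the output file needs to change.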