Aurelius Aurelius - 7 months ago 12
Python Question

Iterating over lines in a file yields wrong number of lines

I am slowly becoming more familiar with Python, but have hit a major roadblock in what I thought would be a simple script.

Programmable infrared remotes store their infrared codes in a file format called ProntoEdit HEX. This is essentially a long file in which each line holds the data for one IR code, written as hexadecimal numbers separated by spaces. I have edited the file so that each line contains only the hex numbers (no 0x prefix). There are 64 hexadecimal numbers per line, grouped into 32 pairs. To convert the data into a simple binary string, you divide the second number of each pair by the first; depending on the ratio, the pair represents either a 1 or a 0.
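As a rough illustration of the pairing rule, here is a sketch on a made-up four-number line. The sample values are hypothetical and not taken from the real file; the mapping of ratio 3 to 0 and ratio 7 to 1 is the one used in the script further down.

```python
# Hypothetical sample: two pairs of hex numbers, not taken from the real file.
sample = "0010 0030 0010 0070"

values = [int(token, 16) for token in sample.split()]
pairs = list(zip(values[0::2], values[1::2]))  # (first, second) of each pair

# Ratio of the second number to the first, per pair.
ratios = [second // first for first, second in pairs]

# Mapping assumed from the script in this question: 3 -> "0", 7 -> "1".
bits = "".join({3: "0", 7: "1"}[r] for r in ratios)

print(pairs)
print(ratios)
print(bits)
```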

Hopefully you've followed all that. I have managed to write a script in Python to automate this for me, as there are 768 lines in the file I'm starting with. The script appears to be working, but for some reason, it only does half of the file, and then stops. It also seems to be skipping the first line of the file. I have checked by hand, and it is correctly "decoding" from the second line until exactly halfway through the file, at line 384. I am at a loss as to why this is happening.

Here is the (relatively) simple script:

rawcode = open(r"stripped.txt", "r")
outputfile = open(r"output_codes.txt", "w")
currentline = 0
for lines in rawcode:
    output = [] #empty list for output
    line = rawcode.readline()
    splitline = line.split(" ") #turn the line into a list
    splitline.remove('\n')

    y = 0
    for x in list(range(32)): #go through each pair in the line
        num1 = int(splitline[y], 16)
        num2 = int(splitline[y+1], 16)
        if (num2 / num1) == 3:
            output.append("0")
        elif (num2 / num1) == 7:
            output.append("1")
        y += 2

    print(output)
    outstring = ''.join(output)
    outputfile.write(outstring)
    outputfile.write("\n")
    currentline += 1
    print(currentline)

outputfile.flush()
outputfile.close()
rawcode.close()


Also, here is a link to the input file, and the output file that I'm getting.

stripped.txt

output.txt

If anyone has experience working with files in this manner, your help is greatly appreciated! I am really not familiar with the intricacies of Python. As you can probably tell, I come from a C background, and am still struggling with the different philosophies of the two languages.

Answer

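You're doing a double-read here. The for loop already reads one line per iteration (into lines, which you never use), and rawcode.readline() then reads the next one. So you only ever process every second line, starting with line 2, which is why you get 384 results from 768 lines: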

for lines in rawcode:
    output = [] #empty list for output
    line = rawcode.readline()
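
To see the effect of the double-read in isolation, here is a small sketch that uses io.StringIO as a stand-in for the file. The six-line sample data is made up; only the loop structure mirrors the question's script.

```python
import io

# Stand-in for the real file: six short made-up lines.
fake_file = io.StringIO("line1\nline2\nline3\nline4\nline5\nline6\n")

processed = []
for lines in fake_file:           # the loop itself consumes one line...
    line = fake_file.readline()   # ...and readline() consumes the next one
    processed.append(line.strip())

print(processed)  # ['line2', 'line4', 'line6']
```

Only the even-numbered lines are processed, which matches the symptom of a skipped first line and half as many results as input lines.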

It's not clear what you're trying to accomplish, since your description of the process makes no sense. (It may well be accurate, but it still makes no sense: divide and then round to 1 or 0?)

Okay, this seems to work:

#!python3

with open('stripped.txt') as infile, open('output.txt', 'w') as outfile:
    for line in infile:
        line = line.strip()
        if not line:
            continue

        hexnums = [int(hn, 16) for hn in line.split()]
        for num1, num2 in zip(hexnums[0::2], hexnums[1::2]):
            digit = '0' if num2 // num1 == 3 else '1'
            outfile.write(digit)
        outfile.write('\n')

I get 768 lines of output, just like the input, with the first bunch being:

01000000000000010100011111111111
01000100000000010100001111111111
01000010000000010100010111111111
01000110000000010100000111111111
01000001000000010100011011111111
01000101000000010100001011111111
01000011000000010100010011111111
01000111000000010100000011111111
01000000100000010100011101111111
01000100100000010100001101111111
01000010100000010100010101111111
01000110100000010100000101111111
01000001100000010100011001111111
01000101100000010100001001111111
01000011100000010100010001111111
01000111100000010100000001111111
01000000010000010100011110111111
01000100010000010100001110111111
01000010010000010100010110111111
01000110010000010100000110111111
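
One thing to be aware of in the script above: the else branch turns any ratio other than 3 into a 1, so malformed data would pass through silently. If you would rather be told about it, a stricter variant might look like the sketch below. The names decode_line and RATIO_TO_BIT are made up for this example, and the 3/7 mapping is assumed from the question's script.

```python
RATIO_TO_BIT = {3: "0", 7: "1"}  # mapping assumed from the question's script


def decode_line(line):
    """Decode one line of space-separated hex pairs into a bit string."""
    hexnums = [int(hn, 16) for hn in line.split()]
    if len(hexnums) % 2:
        raise ValueError("odd number of values on line")

    bits = []
    for num1, num2 in zip(hexnums[0::2], hexnums[1::2]):
        if num1 == 0:
            raise ValueError("first number of a pair is zero")
        ratio = num2 // num1  # floor division, as in the script above
        if ratio not in RATIO_TO_BIT:
            raise ValueError(f"unexpected ratio {ratio} for pair {num1:x} {num2:x}")
        bits.append(RATIO_TO_BIT[ratio])
    return "".join(bits)


# Hypothetical sample line, not taken from the real file.
print(decode_line("0010 0030 0010 0070"))  # 01
```

Inside the file loop you would then write decode_line(line) followed by a newline, and a ValueError would point you at the first line that does not fit the expected pattern.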