bream bream - 25 days ago 9
Python Question

What causes my code to inflate text file size?

I've written a Python program to go through the text files in a directory and create new versions of each one with added line numbers. Here is the relevant function in the program:

def create_lined_ver(filename):
    new_text = []

    with open(filename + ".txt", "r+") as f:
        text = f.readlines()
        for (num, line) in enumerate(text):
            new_text.append("[{0}]: ".format(num) + line)

    with open(filename + "_lined" + ".txt", "a+") as f:
        for line in new_text:
            f.write(line)


To test it, I ran it on a batch of text files, and then, out of curiosity, ran it again (adding a second set of line numbers to the already numbered files). I noticed that each time I ran the program, the newly created files were much larger than they should have been for adding ~5-6 characters per line. The file sizes jumped from 150 KB (original) to 700, 1800, and then 3000 KB on successive runs.

What's causing the file sizes to increase so much?

Answer

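As pointed out in the comments, you are appending to the lined version every time you run the code: the "a+" mode adds to the end of an existing file instead of replacing its contents. Open the output file in "w" mode, which truncates it first. Instead try: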

def create_lined_ver(filename):

    with open(filename + ".txt", "r") as f:
        text = f.readlines()

    new_text = ["[{0}]: ".format(num) + line for (num, line) in enumerate(text)]

    with open(filename + "_lined" + ".txt", "w") as f:
        f.write(''.join(new_text))
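
To see the difference between the two modes for yourself, here is a small sketch that writes a numbered copy of the same input three times and records the output size after each run. The helper names (write_numbered, size_after_runs) and the sample file contents are made up for illustration; they are not part of the original program.

```python
import os
import tempfile


def write_numbered(src_path, dst_path, mode):
    """Write a line-numbered copy of src_path to dst_path using the given open mode."""
    with open(src_path, "r") as src:
        numbered = ["[{0}]: {1}".format(num, line) for num, line in enumerate(src)]
    with open(dst_path, mode) as dst:
        dst.writelines(numbered)


def size_after_runs(mode, runs=3):
    """Return the output file size in bytes after each of `runs` repeated calls."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical sample input, created only for this demonstration.
        src_path = os.path.join(tmp, "sample.txt")
        dst_path = os.path.join(tmp, "sample_lined.txt")
        with open(src_path, "w") as f:
            f.write("alpha\nbeta\ngamma\n")
        sizes = []
        for _ in range(runs):
            write_numbered(src_path, dst_path, mode)
            sizes.append(os.path.getsize(dst_path))
        return sizes


append_sizes = size_after_runs("a")  # grows on every run
write_sizes = size_after_runs("w")   # stays the same on every run
print("append:", append_sizes)
print("write: ", write_sizes)
```

With "a" the output gains another full numbered copy on each run, while with "w" it is rewritten from scratch and its size stays constant. If the program also picks up the _lined files as input on later runs, that would compound the growth further, which may be why the sizes in the question rise so quickly.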