ws_e_c421 - 4 years ago
Python Question

How are file objects cleaned up in Python when the process is killed?

What happens to a file object in Python when the process is terminated? Does it matter how Python is terminated, e.g. which signal it receives, or whether the machine is simply shut down?

I have some logging scripts that continually acquire data and write it to a file. I don't care about doing any extra cleanup, but I want to make sure the log file is not corrupted when Python is abruptly terminated (e.g. I could leave it running in the background and just shut down the computer). I made the following test scripts to try to see what happens:

for i in $(seq 1 10); do
    python write_test.py $i &   # write_test.py = the Python script below (name assumed)
    export pypid=$!
    sleep 0.3
    echo $pypid
    kill -SIGTERM $pypid
done

import csv
import os
import signal
import sys

end_loop = False

def handle_interrupt(*args):
    global end_loop
    end_loop = True

signal.signal(signal.SIGINT, handle_interrupt)

with open('test' + str(sys.argv[-1]) + '.txt', 'w') as csvfile:
    writer = csv.writer(csvfile)
    for idx in range(int(1e7)):
        writer.writerow((idx, 'a' * 60000))
        csvfile.flush()                # push Python's buffer to the OS
        os.fsync(csvfile.fileno())     # push the OS buffer to disk
        if end_loop:
            break
I ran the shell script with several different signals in the `kill` line (note: I registered an explicit handler for SIGINT, since Python does not handle that one other than by raising KeyboardInterrupt). In all cases, all of the output files had only complete rows (no partial writes) and did not appear corrupted. I added the calls that flush data out to disk to try to make sure the data was being written as aggressively as possible, so that the script had the greatest chance of being interrupted mid-write.

So can I conclude that Python always completes a write when it is terminated and does not leave a file in an intermediate state? Or does this depend on the operating system and file system (I was testing with Linux and an ext4 partition)?

Answer Source

It's not how files are "cleaned up" so much as how they are written to. It's possible that a program might perform multiple writes for a single "chunk" of data (row, or whatever) and you could interrupt in the middle of this process and end up with partial records written.
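To illustrate the distinction, here is a minimal sketch (the two helper functions are invented for this example): a record emitted as two write() calls can be torn if the process dies between them, while a record assembled first and handed over in one write() cannot be split that way by process termination.

```python
import io

def fragile_write(f, idx, payload):
    # Two separate write() calls: termination between them
    # leaves a torn row (index with no payload) in the file.
    f.write(str(idx) + ',')
    f.write(payload + '\n')

def safe_write(f, idx, payload):
    # Assemble the whole row first, then hand it to the OS in one write().
    f.write(f"{idx},{payload}\n")

buf = io.StringIO()
safe_write(buf, 0, 'a' * 10)
print(repr(buf.getvalue()))  # -> '0,aaaaaaaaaa\n'
```

Both functions produce identical output when they run to completion; the difference only matters at the moment the process is killed.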

Looking at the C source for the csv module, it assembles each row to a string buffer, then writes that using a single write() call. That should generally be safe; either the row is passed to the OS or it's not, and if it gets to the OS it's all going to get written or it's not (barring of course things like hardware issues where part of it could go into a bad sector).
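One way to check this behavior yourself is to hand csv.writer an object that counts the write() calls passed through it (CountingWriter here is an ad-hoc wrapper written just for this check):

```python
import csv
import io

class CountingWriter:
    """Wraps a file-like object and counts write() calls passed through."""
    def __init__(self, f):
        self.f = f
        self.calls = 0

    def write(self, data):
        self.calls += 1
        return self.f.write(data)

cw = CountingWriter(io.StringIO())
writer = csv.writer(cw)
writer.writerow((1, 'a' * 60000))
print(cw.calls)  # -> 1: the whole row arrives in a single write() call
```

csv.writer accepts any object with a write() method, which is what makes this kind of instrumentation easy.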

The object handed to csv.writer is just a Python object with a write() method, and a custom writer could do something weird in its write() that breaks this guarantee, but assuming it's a regular file object, it should be fine.
