user797963 - 1 year ago
Python Question

python - comparing a newly written file with filecmp.cmp() always returns False?

I must be making a stupid mistake here, because this should be working. I'm thinking the file is staying open or something, and it's driving me nuts.

This is for some regression test cases where I compare the generated output of a script run against mock files with known-good output files (key files).

Here is a simple example:

def run_and_compare(self, key_file, out_file, option):
    print filecmp.cmp(out_file, key_file) # always True (as long as I've run this before, so the out_file exists already)
    cmd = './ -f option'
    with open(out_file, 'wb') as out:
        subprocess.Popen(cmd.split(), stdout=out, stderr=subprocess.PIPE)
    print filecmp.cmp(out_file, key_file) # always False
    time.sleep(1) # pause before comparing again
    print filecmp.cmp(out_file, key_file) # always True

I really don't want to keep that sleep in the test! How can I be sure the out file is OK to compare without using the sleep? I've tried calling out.close(), but it doesn't help, and it shouldn't be needed as long as I'm using 'with'. I'm using Python 2.6.4, if that matters here.

Answer Source

I'd suggest you call wait() on the subprocess so that it completes before you compare the files:

with open(out_file, 'wb') as out:
    p = subprocess.Popen(cmd.split(), stdout=out, stderr=subprocess.PIPE)
    p.wait()  # block until the subprocess has finished writing

If you don't wait, Popen starts the subprocess with out as its standard output and returns immediately, leaving the subprocess running in the background. When you compare the two files right away, the output file is probably still empty or only partly written, hence the False.
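The race described above can be reproduced in isolation. The following is only a sketch: the child command is a hypothetical stand-in for the real script (it sleeps briefly, then writes four bytes), and the helper name is made up for illustration. It uses nothing beyond Popen, poll() and wait().

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical child process: pauses, then writes four bytes to its stdout.
CHILD = "import os, time; time.sleep(0.5); os.write(1, b'done')"


def sizes_before_and_after_wait(out_file):
    """Return (size right after Popen, still-running flag, size after wait)."""
    with open(out_file, 'wb') as out:
        proc = subprocess.Popen([sys.executable, '-c', CHILD], stdout=out)
        # Popen has already returned, but the child has not written anything yet.
        size_before = os.path.getsize(out_file)
        still_running = proc.poll() is None
        proc.wait()  # block until the child exits
    size_after = os.path.getsize(out_file)
    return size_before, still_running, size_after


if __name__ == '__main__':
    tmp_dir = tempfile.mkdtemp()
    print(sizes_before_and_after_wait(os.path.join(tmp_dir, 'out.txt')))
```

Comparing the file at the point where size_before is measured is what the question's code effectively does, which would explain the False result.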

After a while, the subprocess ends and its own copy of the file handle is closed (the 'with' block only closes the parent's handle), so the file is complete and the comparison after the sleep succeeds. (I'm not saying this is exactly what's going on here, but the lack of p.wait() is surely the issue.)
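As a rough sketch of how the whole helper might look without the sleep: the version below swaps wait() for communicate(), which also waits for the process to exit but drains the stderr pipe while doing so (the subprocess documentation warns that wait() can deadlock when a pipe fills up). The function signature and the RuntimeError on a non-zero exit code are assumptions made for illustration, not part of the original test.

```python
import filecmp
import os
import subprocess
import sys
import tempfile


def run_and_compare(cmd, out_file, key_file):
    """Run cmd with stdout redirected to out_file, then compare with key_file."""
    with open(out_file, 'wb') as out:
        proc = subprocess.Popen(cmd, stdout=out, stderr=subprocess.PIPE)
        # communicate() reads stderr to the end and waits for the process to exit.
        _, err = proc.communicate()
    if proc.returncode != 0:
        raise RuntimeError('command failed: %r' % (err,))
    # shallow=False forces a byte-by-byte comparison of the file contents.
    return filecmp.cmp(out_file, key_file, shallow=False)


if __name__ == '__main__':
    tmp_dir = tempfile.mkdtemp()
    key_file = os.path.join(tmp_dir, 'key.txt')
    with open(key_file, 'wb') as f:
        f.write(b'hello')
    # Hypothetical command standing in for the script under test.
    cmd = [sys.executable, '-c', "import os; os.write(1, b'hello')"]
    print(run_and_compare(cmd, os.path.join(tmp_dir, 'out.txt'), key_file))
```

Checking the return code first means a crashed script shows up as an error rather than as a confusing file mismatch.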

Aside from that, I have always wondered why people run Python scripts as subprocesses when it's so simple to import them and call their functions directly. That way you get proper exception propagation and a single process, and you avoid all of these inter-process communication issues.
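To illustrate the in-process alternative, here is a minimal sketch. It assumes the script's logic can be exposed as a function that writes to a file object; generate_report below is a hypothetical stand-in for that function, not anything from the original script.

```python
import filecmp
import os
import tempfile


def generate_report(lines, out):
    """Hypothetical stand-in for the script's main logic; writes to a file object."""
    for line in lines:
        out.write(line.upper() + '\n')


def run_in_process_and_compare(lines, out_file, key_file):
    """Call the logic directly, then compare the output with the key file."""
    with open(out_file, 'w') as out:
        generate_report(lines, out)
    # The 'with' block has flushed and closed the file, so it is safe to compare.
    return filecmp.cmp(out_file, key_file, shallow=False)


if __name__ == '__main__':
    tmp_dir = tempfile.mkdtemp()
    key_file = os.path.join(tmp_dir, 'key.txt')
    with open(key_file, 'w') as f:
        f.write('A\nB\n')
    print(run_in_process_and_compare(['a', 'b'],
                                     os.path.join(tmp_dir, 'out.txt'),
                                     key_file))
```

Because everything happens in one process, there is nothing to wait for, and any exception raised by the logic surfaces directly in the test.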