I am using
dummy.exe < file.txt > foo.txt
diff file.txt foo.txt
import subprocess

def main():
    exe_path = r'dummy.exe'
    file_path = r'file.txt'
    with open(file_path, 'r') as test_file:
        stdin = test_file.read().strip()
    p = subprocess.run([exe_path], input=stdin, stdout=subprocess.PIPE,
                       universal_newlines=True)
    out = p.stdout.strip()
    err = p.stderr
    if stdin != out:
        print('failed: ' + out)

if __name__ == "__main__":
    main()
#include <iostream>

int main() {
    int size, count, a, b;
    std::cin >> size;
    std::cin >> count;
    std::cout << size << " " << count << std::endl;
    for (int i = 0; i < count; ++i) {
        std::cin >> a >> b;
        std::cout << a << " " << b << std::endl;
    }
}
I'll start with a disclaimer: I don't have Python 3.5 (so I can't use the run function), and I wasn't able to reproduce your problem on Windows (Python 3.4.4) or Linux (3.1.6). That said...
The docs for communicate warn:

The data read is buffered in memory, so do not use this method if the data size is large or unlimited.
This sure sounds like your problem. Unfortunately, the docs don't say how much data is "large", nor what will happen after "too much" data is read. Just "don't do that, then".
The docs for subprocess.call go into a little more detail (emphasis mine)...

Do not use stderr=PIPE with this function. The child process will block if it generates enough output to a pipe to fill up the OS pipe buffer as the pipes are not being read from.
...as do the docs for Popen.wait...

This will deadlock when using stderr=PIPE and the child process generates enough output to a pipe such that it blocks waiting for the OS pipe buffer to accept more data. Use Popen.communicate() when using pipes to avoid that.
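For what it's worth, here's a minimal sketch of the pattern those docs are pushing you toward: pipe everything, then let communicate() drain stdout and stderr concurrently instead of wait()-ing yourself. (I'm using sort as a stand-in child process, since I don't have your dummy.exe.)

```python
import subprocess

# Sketch of the docs' recommended pattern: communicate() pumps both
# pipes at once, so neither can fill up and block the child.
# 'sort' is just a stand-in child process for illustration.
p = subprocess.Popen(
    ['sort'],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    universal_newlines=True,
)
out, err = p.communicate(input='b\na\n')
print(out)
```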
That sure sounds like Popen.communicate is the solution to this problem, but communicate's own docs say "do not use this method if the data size is large" --- exactly the situation where the wait docs tell you to use communicate. (Maybe it "avoid(s) that" by silently dropping data on the floor?)
Frustratingly, I don't see any way to use a subprocess.PIPE safely, unless you're sure you can read from it faster than your child process writes to it.
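(The one pattern I'd trust is piping only stdout and consuming it line by line while the child runs, so the OS buffer never fills --- but that falls apart as soon as you also want stderr through a pipe. A sketch, with seq standing in for a chatty child process:)

```python
import subprocess

# With a single piped stream, iterating over p.stdout reads while the
# child is still writing, so the OS pipe buffer can't fill up.  Only
# one line is held in memory at a time.
# 'seq' is a stand-in for a child that produces a lot of output.
p = subprocess.Popen(
    ['seq', '1', '100000'],
    stdout=subprocess.PIPE,
    universal_newlines=True,
)
count = 0
last = None
for line in p.stdout:   # consumes output as it is produced
    count += 1
    last = line.strip()
p.stdout.close()
p.wait()
print(count, last)
```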
On that note...
You're holding all your data in memory... twice, in fact. That can't be efficient, especially if it's already in a file.
If you're allowed to use a temporary file, you can compare the two files very easily, one line at a time. This avoids all the subprocess.PIPE mess, and it's much faster, because it only uses a little bit of RAM at a time. (The IO from your subprocess might be faster, too, depending on how your operating system handles output redirection.)
Again, I can't test run, so here's a slightly older communicate solution (minus main and the rest of your setup):
import io
import subprocess
import tempfile

def are_text_files_equal(file0, file1):
    '''
    Both files must be opened in "update" mode ('+' character), so
    they can be rewound to their beginnings.  Both files will be read
    until just past the first differing line, or to the end of the
    files if no differences were encountered.
    '''
    file0.seek(io.SEEK_SET)
    file1.seek(io.SEEK_SET)
    for line0, line1 in zip(file0, file1):
        if line0 != line1:
            return False
    # Both files were identical to this point.  See if either file
    # has more data.
    next0 = next(file0, '')
    next1 = next(file1, '')
    if next0 or next1:
        return False
    return True

def compare_subprocess_output(exe_path, input_path):
    with tempfile.TemporaryFile(mode='w+t', encoding='utf8') as temp_file:
        with open(input_path, 'r+t') as input_file:
            p = subprocess.Popen(
                [exe_path],
                stdin=input_file,
                stdout=temp_file,  # No more PIPE.
                stderr=subprocess.PIPE,  # <sigh>
                universal_newlines=True,
            )
            err = p.communicate()  # No need to store output.
            # Compare input and output files...  This must be inside
            # the `with` block, or the TemporaryFile will close before
            # we can use it.
            if are_text_files_equal(temp_file, input_file):
                print('OK')
            else:
                print('Failed: ' + str(err))
    return
Unfortunately, since I can't reproduce your problem, even with a million-line input, I can't tell if this works. If nothing else, it ought to give you wrong answers faster.