user248237dfsf - 7 months ago
Bash Question

Using Python subprocess to redirect stdout to stdin?

I'm making a call to a program from the shell using the subprocess module that outputs a binary file to STDOUT.

I use Popen() to call the program and then I want to pass the stream to a function in a Python package (called "pysam") that unfortunately cannot read Python file objects, but can read from STDIN. So what I'd like to do is have the output of the shell command go from STDOUT into STDIN.

How can this be done from within Popen/subprocess module? This is the way I'm calling the shell program:

p = subprocess.Popen(my_cmd, stdout=subprocess.PIPE, shell=True).stdout


This will read "my_cmd"'s STDOUT output and get a stream to it in p. Since my Python module cannot read from "p" directly, I am trying to redirect STDOUT of "my_cmd" back into STDIN using:

p = subprocess.Popen(my_cmd, stdout=subprocess.PIPE, stdin=subprocess.PIPE, shell=True).stdout


I then call my module, which uses "-" as a placeholder for STDIN:

s = pysam.Samfile("-", "rb")


The above call just means read from STDIN (denoted "-") and read it as a binary file ("rb").

When I try this, I just get binary output sent to the screen, and it doesn't look like the Samfile() function can read it. This occurs even if I remove the call to Samfile, so I think it's my call to Popen that is the problem and not downstream steps.

EDIT: In response to answers, I tried:

sys.stdin = subprocess.Popen(tagBam_cmd, stdout=subprocess.PIPE, shell=True).stdout
print "Opening SAM.."
s = pysam.Samfile("-","rb")
print "Done?"
sys.stdin = sys.__stdin__


This seems to hang. I get the output:

Opening SAM..


but it never gets past the Samfile("-", "rb") line. Any idea why?

Any idea how this can be fixed?

EDIT 2: I am adding a link to the Pysam documentation in case it helps, because I really cannot figure this out. The documentation page is:

http://wwwfgu.anat.ox.ac.uk/~andreas/documentation/samtools/usage.html

and the specific note about streams is here:

http://wwwfgu.anat.ox.ac.uk/~andreas/documentation/samtools/usage.html#using-streams

In particular:

"""
Pysam does not support reading and writing from true python file objects, but it does support reading and writing from stdin and stdout. The following example reads from stdin and writes to stdout:

infile = pysam.Samfile( "-", "r" )
outfile = pysam.Samfile( "-", "w", template = infile )
for s in infile: outfile.write(s)


It will also work with BAM files. The following script converts a BAM formatted file on stdin to a SAM formatted file on stdout:

infile = pysam.Samfile( "-", "rb" )
outfile = pysam.Samfile( "-", "w", template = infile )
for s in infile: outfile.write(s)


Note, only the file open mode needs to be changed from r to rb.
"""

So I simply want to take the stream coming from Popen, which reads stdout, and redirect that into stdin, so that I can use Samfile("-", "rb") as the above section states is possible.

thanks.

Answer

I'm a little confused that you see binary on stdout if you are using stdout=subprocess.PIPE. However, the overall problem is that you need to work with sys.stdin if you want to trick pysam into using it.

For instance:

sys.stdin = subprocess.Popen(my_cmd, stdout=subprocess.PIPE, shell=True).stdout
s = pysam.Samfile("-", "rb")
sys.stdin = sys.__stdin__ # restore original stdin

UPDATE: This assumed that pysam is running in the context of the Python interpreter and thus means the Python interpreter's stdin when "-" is specified. Unfortunately, it doesn't; when "-" is specified it reads directly from file descriptor 0.

In other words, it is not using Python's concept of stdin (sys.stdin) so replacing it has no effect on pysam.Samfile(). It also is not possible to take the output from the Popen call and somehow "push" it on to file descriptor 0; it's readonly and the other end of that is connected to your terminal.
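To illustrate the point about sys.stdin versus file descriptor 0, here is a minimal sketch in Python 3 (the question uses Python 2 syntax). It does not use pysam; the child script and the helper name read_fd0_after_swap are made up for this demonstration, and os.read(0, ...) stands in for any code that reads the descriptor directly.

```python
import subprocess
import sys

# Hypothetical child script: it replaces sys.stdin with a fake stream and
# then reads file descriptor 0 directly, bypassing sys.stdin entirely.
CHILD = (
    "import io, os, sys\n"
    "sys.stdin = io.StringIO('from sys.stdin')\n"
    "print(os.read(0, 1024).decode())\n"
)


def read_fd0_after_swap(payload):
    """Run the child with `payload` on its real stdin and return what it printed."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        input=payload,            # delivered on the child's file descriptor 0
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout.decode().strip()
```

Calling read_fd0_after_swap(b"from fd 0") returns the bytes that arrived on the real descriptor, not the text placed in the replacement sys.stdin, which is consistent with the hang seen in the question.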

The only real way to get that output onto file descriptor 0 is to move the pysam call into a second script and connect the two from the first. That ensures that the output from the Popen in the first script will end up on file descriptor 0 of the second one.

So, in this case, your best option is to split this into two scripts. The first one will invoke my_cmd and take the output of that and use it for the input to a second Popen of another Python script that invokes pysam.Samfile("-", "rb").
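A rough sketch of that two-process arrangement, in Python 3, might look like the following. Everything here is illustrative: PRODUCER is a stand-in for my_cmd, CONSUMER is a stand-in for the second script (where the real one would call pysam.Samfile("-", "rb")), and run_pipeline is a hypothetical helper name. The consumer reads file descriptor 0 with os.read to mimic a library that ignores sys.stdin.

```python
import subprocess
import sys

# Stand-in for my_cmd: writes six binary bytes to its stdout.
PRODUCER = [
    sys.executable, "-c",
    "import sys; sys.stdout.buffer.write(b'\\x1f\\x8b' + b'demo')",
]

# Stand-in for the second script: reads file descriptor 0 until EOF and
# prints the byte count. The real script would open pysam on "-" instead.
CONSUMER = [
    sys.executable, "-c",
    "import os\n"
    "data = b''\n"
    "while True:\n"
    "    chunk = os.read(0, 4096)\n"
    "    if not chunk:\n"
    "        break\n"
    "    data += chunk\n"
    "print(len(data))\n",
]


def run_pipeline(producer_cmd, consumer_cmd):
    """Connect the producer's stdout to the consumer's stdin (its fd 0)."""
    producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(
        consumer_cmd, stdin=producer.stdout, stdout=subprocess.PIPE
    )
    # Close the parent's copy of the pipe so the consumer sees EOF when the
    # producer finishes, and the producer is signalled if the consumer exits.
    producer.stdout.close()
    out, _ = consumer.communicate()
    producer.wait()
    return out.decode().strip()
```

In the first script you would call run_pipeline with your real command and the path to the second script. The argument-list form is used here instead of shell=True only to keep the sketch self-contained.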
