JiBBY - 3 months ago
Python Question

Piping in a string from a shell command and then parsing the string in Python 3.5

I'm trying to execute a shell command from Python and then either store its output in an array or parse the piped output directly.

I pipe the command's output through the subprocess module. I verified the output with a print call and it worked just fine.

a = subprocess.Popen('filepath/command', shell=True, stdout=subprocess.PIPE)
b = a.stdout.read()
print(b)


Now I am trying to parse data out of an unknown number of rows and 6 columns. Since b should be one long string, I tried to parse the string and store the salient characters in another array so that I can analyze the data however I want.

i = 0
a = subprocess.Popen('filepath/command', shell=True, stdout=subprocess.PIPE)
b = a.stdout.read()
for line in b.split("\n\n"): #to scan each row with a blank line separating each row
    salient_Chars[i, 0] = line.split(" ")[3] #stores the third set of characters and stops at the next blank space
    salient_Chars2[i, 0] = line.split(" ")[4] #stores the fourth set of characters and stops at the next blank space
    i = i + 1


I get an error [TypeError: a bytes-like object is required, not 'str']. I searched for this error, and it means that Popen gave me bytes and not a string. I am not sure why, since I verified it was a string with the print command. After searching for how to pipe shell commands into a string, I tried using check_output.

from subprocess import check_output
a = check_output('file/path/command')


This gives me a permission error so I would like to use Popen command if possible.

How do I get the piped shell command into a string and then how do I properly parse through a string that is divided into rows and columns with spaces in between columns and blank lines in between rows?

Answer

Quoting Aaron Maenpaa's answer:

You need to decode the bytes object to produce a string:

>>> b"abcde"
b'abcde'

# utf-8 is used here because it is a very common encoding, but you
# need to use the encoding your data is actually in.
>>> b"abcde".decode("utf-8") 
'abcde'
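The error from the question can be reproduced without running any command. The snippet below is only a sketch: `raw` is made-up sample data standing in for what `a.stdout.read()` returns, and `utf-8` is an assumption about the encoding.

```python
# Made-up bytes standing in for the value returned by a.stdout.read().
raw = b"a b c d e\n\nf g h i j\n"

try:
    # A str separator cannot be used to split a bytes object.
    raw.split("\n\n")
except TypeError as exc:
    error_message = str(exc)

# Decoding first turns the bytes into a str, so str separators work.
text = raw.decode("utf-8")
rows = text.split("\n\n")

print(error_message)
print(rows)
```

Splitting with a bytes separator, `raw.split(b"\n\n")`, would also avoid the error, but every piece would still be a bytes object.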

Therefore your code would look like:

import subprocess

i = 0
a = subprocess.Popen('filepath/command', shell=True, stdout=subprocess.PIPE)
b = a.stdout.read().decode("utf-8") # note the decode method
for line in b.split("\n\n"): # rows are separated by a blank line
    salient_Chars[i, 0] = line.split(" ")[3] # index 3 is the fourth space-separated field
    salient_Chars2[i, 0] = line.split(" ")[4] # index 4 is the fifth space-separated field
    i = i + 1

By the way, I don't really understand your parsing code. It will raise TypeError: list indices must be integers or slices, not tuple, because salient_Chars[i, 0] passes the tuple (i, 0) as the list index (assuming salient_Chars is a list).
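One way to sidestep the tuple-index problem is to collect the fields in plain lists with `append`. This is a minimal sketch, not the asker's actual program: the function name `parse_rows`, the sample text, and the choice of fields 3 and 4 (zero-based) are assumptions based on the code in the question.

```python
def parse_rows(text):
    """Return two lists holding fields 3 and 4 (zero-based) of every row.

    Rows are assumed to be separated by a blank line and columns by
    whitespace, as described in the question.
    """
    col3, col4 = [], []
    for block in text.split("\n\n"):
        fields = block.split()  # no argument: split on any run of whitespace
        if len(fields) < 5:
            continue  # skip empty or short rows instead of raising IndexError
        col3.append(fields[3])
        col4.append(fields[4])
    return col3, col4


# Hypothetical decoded command output with two rows of six columns.
sample = "a b c d e f\n\ng h i j k l\n"
col3, col4 = parse_rows(sample)
print(col3, col4)
```

Calling `split()` with no argument collapses repeated spaces, so a row with padded columns does not produce empty strings the way `split(" ")` would.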

Edit

Note that calling the built-in print function is not a reliable way to check whether a value is a plain string. From the question that the quoted answer responds to:

The communicate() method returns an array of bytes:

>>> command_stdout
b'total 0\n-rw-rw-r-- 1 thomas thomas 0 Mar  3 07:03 file1\n-rw-rw-r-- 1 thomas thomas 0 Mar  3 07:03 file2\n'

However, I'd like to work with the output as a normal Python string. So that I could print it like this:

>>> print(command_stdout)
-rw-rw-r-- 1 thomas thomas 0 Mar  3 07:03 file1
-rw-rw-r-- 1 thomas thomas 0 Mar  3 07:03 file2
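A more direct check than print is to look at the type of the value. The sketch below also shows that Popen can hand back a str directly when `universal_newlines=True` is passed, in which case the output is decoded with the locale's preferred encoding. The command here is a stand-in (the current Python interpreter printing a fixed line), not the command from the question.

```python
import subprocess
import sys

# Stand-in command: run the current interpreter and print one fixed row.
cmd = [sys.executable, "-c", "print('a b c')"]

# Default pipe: communicate() returns (stdout, stderr), with stdout as bytes.
raw = subprocess.Popen(cmd, stdout=subprocess.PIPE).communicate()[0]

# With universal_newlines=True the pipe is opened in text mode, so stdout is a str.
text = subprocess.Popen(
    cmd, stdout=subprocess.PIPE, universal_newlines=True
).communicate()[0]

print(type(raw), repr(raw))
print(type(text), repr(text))
```

If the command's output is not in the locale's encoding, decoding the bytes explicitly with `.decode(...)` and the right encoding name is the safer route.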