schaazzz, 4 years ago
Python Question

Search for substring inclusive of carriage return and newline characters

I have a Python script which is communicating with another application (a debugger connected to an embedded target to be exact) using a socket connection.

The responses from the debugger can vary in length and can span multiple lines, but they always end with either True\r\n or False\r\n. I want to capture True or False including the newline characters.

The regular expression I'm using (e.g. r'^[.]+|[\r]+|[\n]+(True\r\n)$' for True) seems to work when tested on regex101.com but only returns \r when run with Python.

Sample code with a sample response string:

import re
sample_response = 'var0 = 0x00000001\r\nTrue\r\n'
re_true = re.compile(r'^[.]+|[\r]+|[\n]+(True\r\n)$')
print(re_true.search(sample_response).group(0))  # Will print out '\r'


I know there is something fundamentally wrong with the regex I'm using. I've also tried a positive lookbehind, as shown below, and that seems to work, but I'm not sure if this is the correct way to do it:

import re
sample_response = 'var0 = 0x00000001\r\nTrue\r\n'
re_true = re.compile(r'(?<=(True\r\n))')
print(re_true.search(sample_response).group(0))  # Will print out ''
print(re_true.search(sample_response).group(1))  # Will print out 'True\r\n'

Answer Source

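Since you only need to match True\r\n or False\r\n at the end of the response, the ^[.]+|[\r]+| part of your pattern is redundant. It is also why you get \r: the | operators split the pattern into three separate alternatives (^[.]+, [\r]+ and [\n]+(True\r\n)$), [.] matches only a literal dot, and the first alternative to match anywhere in the string is [\r]+ at the first carriage return. Use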

re.search(r'[\r\n]*\b(?:True|False)[\r\n]*$', s)

Or leave out the initial [\r\n]* if you do not want to capture the line breaks before True or False.
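As a quick check of the difference between the two variants, here is a minimal sketch run against the sample response from the question. The variable names with_breaks and without_breaks are only illustrative.

```python
import re

sample_response = 'var0 = 0x00000001\r\nTrue\r\n'

# Pattern from above: also captures line breaks before True/False.
with_breaks = re.search(r'[\r\n]*\b(?:True|False)[\r\n]*$', sample_response)
# Variant without the leading [\r\n]*: the match starts at the word itself.
without_breaks = re.search(r'\b(?:True|False)[\r\n]*$', sample_response)

print(repr(with_breaks.group(0)))     # '\r\nTrue\r\n'
print(repr(without_breaks.group(0)))  # 'True\r\n'
```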

Details:

  • [\r\n]* - zero or more CR or LF symbols
  • \b - a word boundary
  • (?:True|False) - either True or False as whole words
  • [\r\n]* - as above
  • $ - end of string.
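Putting the pieces together, the following is a sketch of how the pattern might be applied to a received response. The function name parse_status and the capturing group around True|False are assumptions added for illustration and are not part of the pattern above. The last two lines run the question's original pattern for comparison.

```python
import re

# Pattern from the answer, with a capturing group added (an assumption for
# illustration) so the word can be read separately from the line breaks.
STATUS_RE = re.compile(r'[\r\n]*\b(True|False)[\r\n]*$')


def parse_status(response):
    """Return (matched_text, flag), or None if no trailing status is found."""
    match = STATUS_RE.search(response)
    if match is None:
        return None
    return match.group(0), match.group(1) == 'True'


sample_response = 'var0 = 0x00000001\r\nTrue\r\n'
print(parse_status(sample_response))  # ('\r\nTrue\r\n', True)

# The question's pattern is three alternatives separated by '|', so search()
# returns the leftmost match, which is [\r]+ at the first carriage return.
original = re.compile(r'^[.]+|[\r]+|[\n]+(True\r\n)$')
print(repr(original.search(sample_response).group(0)))  # '\r'
```

Returning None when nothing matches avoids the AttributeError that calling .group() directly on a failed search would raise.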