Sam Sam - 2 months ago 4x
Python Question

How do I capture the text between a certain character and a string in a multi-line string? (Python)

Let's say we have a string

string="This is a test code [asdf -wer -a2 asdf] >(ascd asdfas -were)\
test \
(testing test) test >asdf \
test"

I need to get the text between the character > and the string "test".

I tried

re.findall(r'>[^)](.*)test',string, re.MULTILINE )

However I get

(ascd asdfas -were)\ test \ (testing test) test >asdf.

What I need is:

(ascd asdfas -were)\

asdf

How can I get those two strings?


What about:

import re

s = """This is a test code [asdf -wer -a2 asdf] >(ascd asdfas -were)
test
(testing test) test >asdf
test"""

# Capture everything between each ">" and the next standalone word "test".
print(re.findall(r'>(.*?)\btest\b', s, re.DOTALL))


['(ascd asdfas -were)\n', 'asdf\n']

The only somewhat interesting parts of this pattern are:

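  • .*?, where the ? makes .* non-greedy (lazy); otherwise you'd get a single long match instead of two.
  • Using \btest\b as the "ending" identifier (see Jan's comment below) instead of plain test, so that a longer word such as "testing" is not mistaken for the end marker. From the re documentation: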

    \b Matches the empty string, but only at the beginning or end of a word....
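
To see what each of those two pieces contributes, here is a small sketch that runs the pattern with and without them. The variable names and the second sample string (s2) are made up for illustration and are not part of the original question.

```python
import re

s = """This is a test code [asdf -wer -a2 asdf] >(ascd asdfas -were)
test
(testing test) test >asdf
test"""

# Greedy: .* runs on to the LAST standalone "test", giving one long match.
greedy = re.findall(r'>(.*)\btest\b', s, re.DOTALL)

# Lazy: .*? stops at the first standalone "test" after each ">".
lazy = re.findall(r'>(.*?)\btest\b', s, re.DOTALL)

# Hypothetical input where "testing" appears before the real end marker.
s2 = ">keep testing until test"
without_boundary = re.findall(r'>(.*?)test', s2, re.DOTALL)
with_boundary = re.findall(r'>(.*?)\btest\b', s2, re.DOTALL)

print(len(greedy))        # number of greedy matches
print(lazy)               # the two captures the question asks for
print(without_boundary)   # cut short inside "testing"
print(with_boundary)      # runs up to the standalone "test"
```

If the trailing newline in each capture is unwanted, calling .strip() on every result is one simple way to drop it.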

Note, it may be worth reading up on re.DOTALL, as I think that's really what you want. DOTALL lets . match newlines too, while MULTILINE lets the anchors (^, $) match at the start and end of each line instead of only at the start and end of the entire string. Since you don't use anchors, I think DOTALL is more appropriate.
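
A quick way to convince yourself of the difference is to run the same pattern under each flag. This is only a sketch; the sample string text and the variable names are invented for the example.

```python
import re

text = "start >first line\nsecond line test"

# No flag: "." does not match "\n", so the capture cannot span two lines.
no_flag = re.findall(r'>(.*?)\btest\b', text)

# MULTILINE only affects ^ and $; this pattern has neither, so nothing changes.
multiline = re.findall(r'>(.*?)\btest\b', text, re.MULTILINE)

# DOTALL lets "." cross the newline, so the capture reaches "test".
dotall = re.findall(r'>(.*?)\btest\b', text, re.DOTALL)

# Where MULTILINE does matter: ^ now matches at the start of every line.
line_starts = re.findall(r'^\w+', text, re.MULTILINE)

print(no_flag)
print(multiline)
print(dotall)
print(line_starts)
```

The first two calls find nothing, which is why switching the flag from MULTILINE to DOTALL matters for this question.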