user2177047 - 5 months ago
Python Question

Split string at word boundaries without losing whitespaces or newlines

Suppose I have a string like this, which contains runs of spaces and newlines:

"\n\n\nmy  string \n"


I want this to be split into:

['\n', '\n', '\n', 'my', ' ', ' ', 'string', ' ', '\n']


How could I get this? I suppose I need a regular expression?

Answer

Use the regex \w+|\W and collect every match with findall:

>>> import re
>>> p = re.compile(r'\w+|\W')
>>> p.findall('\n\n\nmy  string \n')

['\n', '\n', '\n', 'my', ' ', ' ', 'string', ' ', '\n']

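Regex explanation: \w+ matches a run of one or more word characters, so each word comes back as a single item. \W matches exactly one non-word character, so every space and newline comes back as its own item.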
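As a rough sketch of how this could be packaged for reuse, the pattern can be wrapped in a small function. The names `TOKEN_RE` and `split_keep_whitespace` are made up for this illustration and do not come from the answer. The final check shows that no characters are lost: joining the pieces gives back the original string.

```python
import re

# Pattern from the answer: a run of word characters, or one non-word character.
# TOKEN_RE and split_keep_whitespace are hypothetical names for this sketch.
TOKEN_RE = re.compile(r'\w+|\W')


def split_keep_whitespace(text):
    """Return words and individual non-word characters, in their original order."""
    return TOKEN_RE.findall(text)


sample = '\n\n\nmy  string \n'
tokens = split_keep_whitespace(sample)
print(tokens)

# Every character is either a word or a non-word character,
# so the tokens always rebuild the input exactly.
assert ''.join(tokens) == sample
```

Note that \W also matches punctuation, so with this pattern 'a, b' would be split into 'a', ',', ' ' and 'b'. Underscores and digits count as word characters and stay attached to the surrounding word.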
