dahlia dahlia - 1 year ago 57
Python Question

Split on period without removing the period punctuation once split - Python

I have seen a lot of related questions to mine but I still can't seem to get my specific example to work.
I have some data in a file which is several sentences strung together. I am trying to split the sentences into a list with each sentence being an element of the list. But when I split on a period followed by a space I lose the period in all the elements of my list (except the last one).
I begin with this:

text = "This sentence. And this one. One more."

Desired output:

["This sentence.", "And this one.", "One more."]

Currently I am getting this by doing text.split('. '):

["This sentence","And this one","One more."]

Answer

Use a positive lookbehind:

import re
re.split(r'(?<=\.) ', text)

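The above assumes that every sentence ends with a period followed by a single space (except the last sentence, which ends with just a period). (?<=\.) is a positive lookbehind, so the regex matches only a space that comes immediately after a period. The lookbehind checks for the period without consuming it, so only the space is removed by the split and the period stays attached to the preceding sentence.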
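For reference, here is a minimal runnable sketch of the lookbehind split applied to the text from the question. The helper names split_sentences and split_sentences_loose are made up for illustration. The second variant, which also splits after "!" and "?" and tolerates runs of whitespace, goes beyond what the question asked and is only an assumption about what similar data might need.

```python
import re

text = "This sentence. And this one. One more."


def split_sentences(s):
    # Hypothetical helper: split on a single space that directly follows a period.
    # The lookbehind (?<=\.) only checks for the period; it is not consumed.
    return re.split(r'(?<=\.) ', s)


def split_sentences_loose(s):
    # Assumed extension, not part of the original question:
    # also split after "!" or "?", and accept one or more whitespace characters.
    return re.split(r'(?<=[.!?])\s+', s)


print(split_sentences(text))
# ['This sentence.', 'And this one.', 'One more.']
```

Neither version understands abbreviations, so something like "Dr. Smith" would still be split after "Dr.".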