Connor Blanck - 5 months ago
Python Question

Are there any guarantees about the splitting order of str.split()?

According to the Python 2.7 docs, using str.split() with maxsplit specified will split a string up to maxsplit times.

However, the docs never explicitly state that these splits are performed left to right. There is a related function, str.rsplit(), that guarantees right-to-left split ordering.
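To make the distinction concrete, here is a small illustration of the two methods with the same maxsplit. It shows the behaviour observed in CPython; the variable names are only for illustration.

```python
s = "a|b|c|d"

# With maxsplit=1, split() consumes the leftmost separator...
left = s.split("|", 1)
# ...while rsplit() consumes the rightmost one.
right = s.rsplit("|", 1)

print(left)   # ['a', 'b|c|d']
print(right)  # ['a|b|c', 'd']
```

Without maxsplit the two calls return the same list, so the ordering only matters when the number of splits is limited.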

Aside from reversing the string and then using str.rsplit(), is there any way to guarantee a left-to-right splitting order? Are there any situations where str.split() will NOT use a left-to-right order?

Answer

If you're looking for a guarantee that splitting with the maxsplit argument proceeds left to right, you only need to look at the test suite that ships with Python.

Here's an excerpt:

    self.checkequal(['a', 'b', 'c', 'd'], 'a|b|c|d', 'split', '|')
    self.checkequal(['a|b|c|d'], 'a|b|c|d', 'split', '|', 0)
    self.checkequal(['a', 'b|c|d'], 'a|b|c|d', 'split', '|', 1)
    self.checkequal(['a', 'b', 'c|d'], 'a|b|c|d', 'split', '|', 2)
    self.checkequal(['a', 'b', 'c', 'd'], 'a|b|c|d', 'split', '|', 3)
    self.checkequal(['a', 'b', 'c', 'd'], 'a|b|c|d', 'split', '|', 4)
    self.checkequal(['a', 'b', 'c', 'd'], 'a|b|c|d', 'split', '|',
                    sys.maxsize-2)
    self.checkequal(['a|b|c|d'], 'a|b|c|d', 'split', '|', 0)
    self.checkequal(['a', '', 'b||c||d'], 'a||b||c||d', 'split', '|', 2)
    self.checkequal(['abcd'], 'abcd', 'split', '|')
    self.checkequal([''], '', 'split', '|')
    self.checkequal(['endcase ', ''], 'endcase |', 'split', '|')
    self.checkequal(['', ' startcase'], '| startcase', 'split', '|')
    self.checkequal(['', 'bothcase', ''], '|bothcase|', 'split', '|')
    self.checkequal(['a', '', 'b\x00c\x00d'], 'a\x00\x00b\x00c\x00d', 'split', '\x00', 2)

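Any implementation that split in a different order would fail these tests. For example, 'a|b|c|d'.split('|', 1) must return ['a', 'b|c|d'], which is only possible if the leftmost separator is the one consumed.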
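As a rough cross-check outside the test harness, the sketch below models a strictly left-to-right splitter and compares it with str.split() on the same inputs as the excerpt. Note that split_left_to_right is a hypothetical helper written for this comparison, not CPython's implementation, and it only covers the case where an explicit separator is given.

```python
def split_left_to_right(text, sep, maxsplit=-1):
    """Hypothetical reference splitter that scans strictly left to right.

    Not CPython's code; it only models the behaviour the tests pin down.
    A negative maxsplit means "no limit", mirroring str.split().
    """
    if not sep:
        raise ValueError("empty separator")
    parts = []
    start = 0
    remaining = maxsplit
    while remaining != 0:
        idx = text.find(sep, start)  # next separator at or after start
        if idx == -1:
            break
        parts.append(text[start:idx])
        start = idx + len(sep)
        remaining -= 1
    parts.append(text[start:])  # whatever is left stays unsplit
    return parts


# Inputs taken from the test excerpt: (string, separator, maxsplit).
cases = [
    ("a|b|c|d", "|", -1),
    ("a|b|c|d", "|", 0),
    ("a|b|c|d", "|", 1),
    ("a|b|c|d", "|", 2),
    ("a|b|c|d", "|", 4),
    ("a||b||c||d", "|", 2),
    ("abcd", "|", -1),
    ("", "|", -1),
    ("|bothcase|", "|", -1),
    ("a\x00\x00b\x00c\x00d", "\x00", 2),
]

for text, sep, maxsplit in cases:
    assert split_left_to_right(text, sep, maxsplit) == text.split(sep, maxsplit)

print(split_left_to_right("a|b|c|d", "|", 2))  # ['a', 'b', 'c|d']
```

If str.split() consumed separators in any other order, at least one of the limited-maxsplit cases would disagree with the left-to-right model.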