Tim Harper - 6 months ago
Java Question

Sane implementation of java.lang.String.split for Scala

Java's string split method has behavior that is inconsistent and has bitten me quite a few times through the years:

scala> "/path".split("/")
res2: Array[String] = Array("", path)

scala> "/".split("/")
res4: Array[String] = Array()

scala> "".split("/")
res3: Array[String] = Array("")


Specifically, it is surprising that when there is a leading delimiter with some text following, I get an `""` followed by the text. However, when I remove the text, I get an empty array. I would expect `"/".split("/")` to return `Array("")`, consistent with `"/path".split("/")` returning `Array("", "path")`.
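For reference, the asymmetry seems to come from the default limit of 0, which the Javadoc says discards trailing empty strings (leading ones are kept). A minimal sketch, assuming you are happy to pass a negative limit; `splitKeepAll` is a hypothetical helper name, not a library method:

```scala
object SplitLimitDemo {
  // Hypothetical helper: a negative limit tells String.split to keep
  // trailing empty strings instead of discarding them.
  def splitKeepAll(s: String, regex: String): List[String] =
    s.split(regex, -1).toList

  def main(args: Array[String]): Unit = {
    // "/path" -> List("", "path")
    // "/"     -> List("", "")
    // ""      -> List("")
    Seq("/path", "/", "").foreach { s =>
      println(splitKeepAll(s, "/").map(t => "\"" + t + "\"").mkString("List(", ", ", ")"))
    }
  }
}
```

With the negative limit, `"/"` yields two empty tokens (one on each side of the delimiter), so the results are at least consistent with each other. Note that the second argument is still a regex here.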

Also, `java.lang.String.split` expects a regex instead of a string token. This is not immediately clear and has also caused confusion and surprising errors.
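To illustrate the regex surprise: a delimiter such as `.` is a metacharacter, so it matches every character and all resulting tokens are empty. One way around it is to quote the delimiter with `java.util.regex.Pattern.quote`. This is only a sketch; `splitLiteral` is a hypothetical helper name:

```scala
import java.util.regex.Pattern

object LiteralSplitDemo {
  // Hypothetical helper: treat `delimiter` as plain text, not as a regex.
  def splitLiteral(s: String, delimiter: String): List[String] =
    s.split(Pattern.quote(delimiter)).toList

  def main(args: Array[String]): Unit = {
    // As a regex, "." matches every character, so every token is empty
    // and all of them are discarded as trailing empty strings.
    println("a.b".split(".").length)   // 0
    // Quoted, the dot is matched literally.
    println(splitLiteral("a.b", "."))  // List(a, b)
  }
}
```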

Is there an alternative in some popular standard library that has more straightforward behavior?

(voted to close as, apparently, asking for library recommendations is off-topic)

Answer

The Scala standard library does provide an alternative to Java's String.split that takes a Char separator rather than a regex, which makes its behavior more straightforward:

def split(separator: Char): Array[String]
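A short usage sketch of the Char overload, showing that regex metacharacters are treated literally. As far as I can tell, this overload still drops trailing empty strings, so it addresses the regex surprise rather than the empty-array one:

```scala
object CharSplitDemo {
  def main(args: Array[String]): Unit = {
    // The separator is a Char, so '.' and '|' are matched literally.
    println("a.b|c".split('.').toList) // List(a, b|c)
    println("a|b".split('|').toList)   // List(a, b)
  }
}
```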

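Your other option would be StringUtils from Apache Commons Lang, whose split treats the second argument as a set of separator characters rather than a regex: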

StringUtils.split("ab.cd", "b.") // res0: Array[String] = Array(a, cd)