jbnunn - 5 months ago
Scala Question

What's Scala's idiomatic way to split a List by separator?

If I have a List of type String,

scala> val items = List("Apple","Banana","Orange","Tomato","Grapes","BREAK","Salt","Pepper","BREAK","Fish","Chicken","Beef")
items: List[java.lang.String] = List(Apple, Banana, Orange, Tomato, Grapes, BREAK, Salt, Pepper, BREAK, Fish, Chicken, Beef)


how can I split it into n separate lists based on a certain string/pattern ("BREAK", in this case)?

I've thought about finding the position of "BREAK" with indexOf and splitting up the list that way, or using a similar approach with takeWhile (i => i != "BREAK"), but I'm wondering if there's a better way?

If it helps, I know there will only ever be 3 sets of items in the items list (thus 2 "BREAK" markers).
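
For reference, the indexOf idea mentioned above could be sketched roughly as follows. This is only an illustration under the stated assumption of exactly two markers; `splitAtFirst` and the names `first`, `second`, and `third` are hypothetical and not from any library.

```scala
// Hypothetical helper: split a list around the first occurrence of `sep`.
def splitAtFirst[T](xs: List[T], sep: T): (List[T], List[T]) = {
  val i = xs.indexOf(sep)
  if (i < 0) (xs, Nil)               // no marker: everything stays in the first part
  else (xs.take(i), xs.drop(i + 1))  // drop(i + 1) skips the marker itself
}

val items = List("Apple","Banana","Orange","Tomato","Grapes","BREAK","Salt","Pepper","BREAK","Fish","Chicken","Beef")

// Two markers are assumed, so two calls yield the three groups.
val (first, rest)   = splitAtFirst(items, "BREAK")
val (second, third) = splitAtFirst(rest, "BREAK")
```

This works for a fixed number of markers, but each additional marker needs another call, which is why a general solution is preferable.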

Answer
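// Recursively split `l` at each occurrence of `sep`; the separators themselves are dropped.
def splitBySeparator[T]( l: List[T], sep: T ): List[List[T]] = {
  l.span( _ != sep ) match {
    // Separator found: keep the prefix and recurse on what follows the separator.
    case (hd, _ :: tl) => hd :: splitBySeparator( tl, sep )
    // No separator left: the prefix is the final group.
    case (hd, _) => List(hd)
  }
}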

val items = List("Apple","Banana","Orange","Tomato","Grapes","BREAK","Salt","Pepper","BREAK","Fish","Chicken","Beef")
splitBySeparator(items, "BREAK")

Result:

res1: List[List[String]] = List(List(Apple, Banana, Orange, Tomato, Grapes), List(Salt, Pepper), List(Fish, Chicken, Beef))
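
Since the question states that there are always exactly three groups, the result can be taken apart with a pattern match. The following is a minimal sketch that repeats the function above so it runs on its own; the names `first`, `second`, `third`, and `summary` are illustrative assumptions.

```scala
def splitBySeparator[T](l: List[T], sep: T): List[List[T]] =
  l.span(_ != sep) match {
    case (hd, _ :: tl) => hd :: splitBySeparator(tl, sep)
    case (hd, _)       => List(hd)
  }

val items = List("Apple","Banana","Orange","Tomato","Grapes","BREAK","Salt","Pepper","BREAK","Fish","Chicken","Beef")

// Match on exactly three groups; the fallback case avoids a MatchError
// if the input ever contains a different number of markers.
val summary = splitBySeparator(items, "BREAK") match {
  case List(first, second, third) =>
    s"first=${first.size}, second=${second.size}, third=${third.size}"
  case other =>
    s"expected 3 groups, got ${other.size}"
}
```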

UPDATE: The above version, while concise and effective, has two problems: it handles edge cases poorly (inputs such as List("BREAK") or List("BREAK", "Apple", "BREAK") produce empty sublists), and it is not tail recursive. So here is another (imperative) version that addresses both:

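import scala.collection.mutable.ListBuffer
def splitBySeparator[T]( l: Seq[T], sep: T ): Seq[Seq[T]] = {
  // b holds the groups built so far; b.last is the group currently being filled.
  val b = ListBuffer(ListBuffer[T]())
  l foreach { e =>
    if ( e == sep ) {
      // Open a new group only if the current one has elements,
      // so leading or repeated separators do not create empty groups.
      if ( b.last.nonEmpty ) b += ListBuffer[T]()
    }
    else b.last += e
  }
  // Drop the empty group left by a trailing separator (or by empty input),
  // and convert to immutable Lists so the result conforms to Seq[Seq[T]] on Scala 2.13 as well.
  b.filter(_.nonEmpty).map(_.toList).toList
}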

It internally uses a ListBuffer, much like the implementation of List.span that I used in the first version of splitBySeparator.
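
If a purely functional solution is preferred, the tail-recursion concern can also be addressed with an accumulator. The following is a sketch, not part of the original answer: `splitBySeparatorRec` is a hypothetical name, and it assumes that empty groups (from leading, trailing, or repeated separators) should be dropped.

```scala
import scala.annotation.tailrec

// Hypothetical tail-recursive variant; empty groups are dropped by assumption.
def splitBySeparatorRec[T](l: List[T], sep: T): List[List[T]] = {
  @tailrec
  def loop(rest: List[T], acc: List[List[T]]): List[List[T]] = {
    // group: elements before the next separator; remainder: separator onward (or Nil).
    val (group, remainder) = rest.span(_ != sep)
    val acc2 = if (group.isEmpty) acc else group :: acc
    remainder match {
      case _ :: tl => loop(tl, acc2)   // skip the separator and continue
      case Nil     => acc2.reverse     // groups were prepended, so restore the order
    }
  }
  loop(l, Nil)
}
```

The @tailrec annotation makes the compiler reject the definition if the recursive call is not in tail position, so the stack-safety claim is checked at compile time rather than taken on trust.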