stack0114106 stack0114106 - 7 months ago 22
Scala Question

Scala: parse a message into various fields

In Scala, I want to parse each message (length = 20) into its individual fields. Messages are concatenated: each one is appended to the end of the previous one with no newline separator. I tried the code below; suggestions for optimizing it or improving its performance are welcome.

/* Field lengths: id=3, name=5, city=8, port=3, indicator=1 (total 20) */

def layout(rec: String) = {
  val id   = rec.take(3)
  val name = rec.drop(3).take(5)
  val city = rec.drop(3 + 5).take(8)
  val port = rec.drop(3 + 5 + 8).take(3)
  val ind  = rec.drop(3 + 5 + 8 + 3).take(1)
  println((id, name, city, port, ind))  // print the parsed fields as a tuple
}

val messages = "101Jim  Portland990Y102JamesHouston 990X103John Boston  880Y"
messages grouped(20) foreach { x => layout(x) }


scala> :load work.scala
Loading work.scala...
layout: (rec: String)Unit
messages: String = 101Jim  Portland990Y102JamesHouston 990X103John Boston  880Y
(101,Jim  ,Portland,990,Y)
(102,James,Houston ,990,X)
(103,John ,Boston  ,880,Y)



You can do this quite nicely with a regular expression:

val messages = "101Jim  Portland990Y102JamesHouston 990X103John Boston  880Y"

val RecordPattern = """(\d{3})(.{5})(.{8})(\d{3})(.)""".r

val records = messages.grouped(20).map {
  case RecordPattern(id, name, city, port, ind) => (id, name, city, port, ind)
}

And then:

scala> records.foreach(println)
(101,Jim  ,Portland,990,Y)
(102,James,Houston ,990,X)
(103,John ,Boston  ,880,Y)
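One caveat with the pattern match above: a chunk that does not fit the layout (for example a truncated final record) would throw a MatchError. Here is a minimal sketch of a more defensive variant, assuming you want typed, trimmed fields. The `Record` case class and the `SafeParse` object are hypothetical names introduced for illustration, not part of the original code.

```scala
// Hypothetical record type; field names follow the layout comment in the question.
final case class Record(id: Int, name: String, city: String, port: Int, ind: Char)

object SafeParse {
  // Same pattern as above: 3 digits, 5 chars, 8 chars, 3 digits, 1 char.
  private val RecordPattern = """(\d{3})(.{5})(.{8})(\d{3})(.)""".r

  // Returns None instead of throwing MatchError when a chunk does not fit the layout.
  def parse(chunk: String): Option[Record] = chunk match {
    case RecordPattern(id, name, city, port, ind) =>
      Some(Record(id.toInt, name.trim, city.trim, port.toInt, ind.head))
    case _ => None
  }

  // Splits the concatenated input into 20-character chunks and keeps only valid records.
  def parseAll(messages: String): List[Record] =
    messages.grouped(20).flatMap(parse).toList
}
```

Calling `SafeParse.parseAll(messages)` silently drops malformed chunks; if you would rather report them, returning an `Either` with the offending chunk is an easy change.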

The regular-expression approach is also likely to perform better than slicing the string with collection operations like drop and take, but the difference will be small; the primary advantage is clarity.
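If you want to stay close to the original drop/take version, here is a rough sketch that computes the field offsets once and slices each record with `substring`. The `FixedWidth` object and its member names are assumptions made for this example, and I have not benchmarked it against the regex, so treat any speed difference as something to measure rather than assume.

```scala
object FixedWidth {
  // Field widths from the question's layout comment: id=3, name=5, city=8, port=3, indicator=1.
  private val widths = List(3, 5, 8, 3, 1)

  // Running offsets: 0, 3, 8, 16, 19, 20 (the last one is the record length).
  private val offsets = widths.scanLeft(0)(_ + _)
  val recordLength: Int = offsets.last

  // Slices one full-length record into its fields, without trimming.
  def fields(rec: String): List[String] =
    offsets.zip(offsets.tail).map { case (from, until) => rec.substring(from, until) }

  // Splits the concatenated input into records, skipping a truncated trailing chunk.
  def parseAll(messages: String): List[List[String]] =
    messages.grouped(recordLength).filter(_.length == recordLength).map(fields).toList
}
```

Keeping the widths in one list means a layout change only touches one line, instead of every `drop(3+5+...)` expression.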