Manuj Kathuria Manuj Kathuria - 3 months ago 25
Scala Question

scala io Exception handling

I am trying to read a comma separated file in scala and convert that into list of Json object. The code is working if all the records are valid. How do i catch the exception for the records which are not valid in my function below. If a record is not valid it should throw and exception and continue reading the file. but in my case once an invalid record comes the application stops.

def parseFile(file: String): List[JsObject] = {

val bufferedSource = Source.fromFile(file)
try {
bufferedSource.getLines().map(line => {
val cols = line.split(",").map(_.trim)
(Json.obj("Name" -> cols(0), "Media" -> cols(1), "Gender" -> cols(2), "Age" -> cols(3).toInt)) // Need to handle io exception here.
}).toList
}

finally {
bufferedSource.close()
}
}

Answer

I think you may benefit from using Option (http://danielwestheide.com/blog/2012/12/19/the-neophytes-guide-to-scala-part-5-the-option-type.html) and the Try object (http://danielwestheide.com/blog/2012/12/26/the-neophytes-guide-to-scala-part-6-error-handling-with-try.html)

What you are doing here is stopping all work when an error happens (ie you throw to outside the map) a better option is to isolate the failure and return some object that we can filer out. Below is a quick implementation I made

package csv

import play.api.libs.json.{JsObject, Json}

import scala.io.Source
import scala.util.Try

object CsvParser extends App {

  //Because a bad row can either != 4 columns or the age can not be a int we want to return an option that will be ignored by our caller
  def toTuple(array : Array[String]): Option[(String, String, String, Int)] = {
    array match {
      //if our array has exactly 4 columns  
      case Array(name, media, gender, age) => Try((name, media, gender, age.toInt)).toOption
      // any other array size will be ignored
      case _ => None
    }
  }

  def toJson(line: String): Option[JsObject] = {
    val cols = line.split(",").map(_.trim)
    toTuple(cols) match {
      case Some((name: String, media: String, gender: String, age: Int)) => Some(Json.obj("Name" -> name, "Media" -> media, "Gender" -> gender, "Age" -> age))
      case _ => None
    }
  }

  def parseFile(file: String): List[JsObject] = {

    val bufferedSource = Source.fromFile(file)
    bufferedSource.getLines().map(toJson).toList.flatten
  }

  parseFile("my/csv/file/path")

}

The above code will ignore any rows where there not exactly 4 columns. It will also contain the NumberFormatException from .toInt.

The idea is to isolate the failure and pass back some type that the caller can either work with when the row was parsed...or ignore when a failure happened.

Comments