mariop mariop - 2 months ago 17
Scala Question

Read CSV in Scala into case class instances with error handling

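I would like to read a CSV String/File in Scala such that, given a case class and an error type, the parser fills a collection of Either values, with the error on the left for each row that cannot be read and the case class instance on the right for each row that can. Is there any library that does this or something similar?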

For instance, given a class and error

case class Person(name: String, age: Int)

type Error = String

and the CSV String


the parser would output

Stream(Right(Person("Foo",1)), Left("Cannot read 'Ro'"), Right(Person("Bar", 24)))


I think my question wasn't clear, so let me clarify: is there a way to read CSV in Scala without defining boilerplate? Given any case class, is there a way to load it automatically? I would like to use it in this way:

val iter = csvParserFor[Person].parseLines(lines)


Here's a Shapeless implementation that takes a slightly different approach from the one in your proposed example. This is based on some code I've written in the past, and the main difference from your implementation is that this one is a little more general—for example the actual CSV parsing part is factored out so that it's easy to use a dedicated library.

First for an all-purpose Read type class (no Shapeless yet):

import scala.util.{ Failure, Success, Try }

trait Read[A] { def reads(s: String): Try[A] }

object Read {
  def apply[A](implicit readA: Read[A]): Read[A] = readA

  implicit object stringRead extends Read[String] {
    def reads(s: String): Try[String] = Success(s)
  }

  implicit object intRead extends Read[Int] {
    def reads(s: String): Try[Int] = Try(s.toInt)
  }

  // And so on...
}
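As a rough idea of what "and so on" could cover, here is a self-contained sketch with a few more instances. The `instance` helper and the `Long`, `Double`, `Boolean` and `Option` instances are my own additions for illustration, and treating an empty cell as `None` is an assumption about how you want missing values handled.

```scala
import scala.util.{Success, Try}

trait Read[A] { def reads(s: String): Try[A] }

object Read {
  def apply[A](implicit readA: Read[A]): Read[A] = readA

  // Hypothetical helper (not in the code above) that builds an instance from a function.
  def instance[A](f: String => Try[A]): Read[A] = new Read[A] {
    def reads(s: String): Try[A] = f(s)
  }

  implicit val stringRead: Read[String] = instance(s => Success(s))
  implicit val intRead: Read[Int] = instance(s => Try(s.toInt))
  implicit val longRead: Read[Long] = instance(s => Try(s.toLong))
  implicit val doubleRead: Read[Double] = instance(s => Try(s.toDouble))
  implicit val booleanRead: Read[Boolean] = instance(s => Try(s.toBoolean))

  // Assumption: an empty cell means "no value"; anything else must parse as an A.
  implicit def optionRead[A](implicit readA: Read[A]): Read[Option[A]] =
    instance(s => if (s.isEmpty) Success(None) else readA.reads(s).map(Some(_)))
}

object ReadDemo {
  val ok = Read[Int].reads("42")            // Success(42)
  val bad = Read[Int].reads("twelve")       // a Failure wrapping a NumberFormatException
  val empty = Read[Option[Int]].reads("")   // Success(None)
  val present = Read[Option[Int]].reads("7") // Success(Some(7))
}
```

Each instance only decides how a single cell is parsed, so adding support for a new column type never touches the row-level code.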

And then for the fun part: a type class that provides a conversion (that may fail) from a list of strings to an HList:

import shapeless._

trait FromRow[L <: HList] { def apply(row: List[String]): Try[L] }

object FromRow {
  // Restores List's :: extractor, which the shapeless._ import shadows.
  import HList.ListCompat._

  def apply[L <: HList](implicit fromRow: FromRow[L]): FromRow[L] = fromRow

  def fromFunc[L <: HList](f: List[String] => Try[L]) = new FromRow[L] {
    def apply(row: List[String]) = f(row)
  }

  implicit val hnilFromRow: FromRow[HNil] = fromFunc {
    case Nil => Success(HNil)
    case _ => Failure(new RuntimeException("No more rows expected"))
  }

  implicit def hconsFromRow[H: Read, T <: HList: FromRow]: FromRow[H :: T] =
    fromFunc {
      case h :: t => for {
        hv <- Read[H].reads(h)
        tv <- FromRow[T].apply(t)
      } yield hv :: tv
      case Nil => Failure(new RuntimeException("Expected more cells"))
    }
}

And finally to make it work with case classes:

trait RowParser[A] {
  def apply[L <: HList](row: List[String])(implicit
    gen: Generic.Aux[A, L],
    fromRow: FromRow[L]
  ): Try[A] = fromRow(row).map(gen.from)
}

def rowParserFor[A] = new RowParser[A] {}

Now we can write the following, for example, using OpenCSV:

case class Foo(s: String, i: Int)

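import scala.collection.JavaConverters._
import com.opencsv.CSVReader // au.com.bytecode.opencsv.CSVReader in OpenCSV 2.x

val reader = new CSVReader(new java.io.FileReader("foos.csv"))

// readAll returns a java.util.List[Array[String]]; convert each row to a List[String]
val foos = reader.readAll.asScala.map(row => rowParserFor[Foo](row.toList))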

And if we have an input file like this:


We'll get the following:

scala> foos.foreach(println)
Failure(java.lang.NumberFormatException: For input string: "twelve")
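The results above are `Try` values, while the question asked for `Either[Error, A]` with `type Error = String`. A minimal sketch of one way to bridge the two, assuming you want the offending row quoted in the error message; `toEither` and the message format are hypothetical choices of mine, not part of any library:

```scala
import scala.util.{Failure, Success, Try}

object TryToEither {
  type Error = String // as in the question

  // Hypothetical helper: successes go on the Right, and a message naming
  // the row that failed goes on the Left.
  def toEither[A](row: List[String], parsed: Try[A]): Either[Error, A] =
    parsed match {
      case Success(a) => Right(a)
      case Failure(e) => Left(s"Cannot read '${row.mkString(",")}': ${e.getMessage}")
    }
}

object TryToEitherDemo {
  val good = TryToEither.toEither(List("10"), Try("10".toInt))        // Right(10)
  val bad = TryToEither.toEither(List("twelve"), Try("twelve".toInt)) // a Left whose message starts with "Cannot read 'twelve'"
}
```

With the parser above, this would be applied per row, for example by mapping each `row` to `toEither(row.toList, rowParserFor[Foo](row.toList))`.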

(Note that this conjures up Generic and FromRow instances for every line, but it'd be pretty easy to change that if performance is a concern.)
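As a rough sketch of that change, one option is to resolve the instances once per case class and then reuse the resulting parser for every line. `CachedRowParser` is a hypothetical name of mine, not something Shapeless provides. `Read` and `FromRow` are repeated here in condensed form only so that the block compiles on its own, and the sketch assumes Shapeless 2.x.

```scala
import scala.util.{Failure, Success, Try}
import shapeless._

// Condensed copies of the type classes from above.
trait Read[A] { def reads(s: String): Try[A] }

object Read {
  def apply[A](implicit readA: Read[A]): Read[A] = readA
  implicit val stringRead: Read[String] = new Read[String] {
    def reads(s: String): Try[String] = Success(s)
  }
  implicit val intRead: Read[Int] = new Read[Int] {
    def reads(s: String): Try[Int] = Try(s.toInt)
  }
}

trait FromRow[L <: HList] { def apply(row: List[String]): Try[L] }

object FromRow {
  import HList.ListCompat._

  def apply[L <: HList](implicit fromRow: FromRow[L]): FromRow[L] = fromRow

  implicit val hnilFromRow: FromRow[HNil] = new FromRow[HNil] {
    def apply(row: List[String]): Try[HNil] = row match {
      case Nil => Success(HNil)
      case _ => Failure(new RuntimeException("No more cells expected"))
    }
  }

  implicit def hconsFromRow[H: Read, T <: HList: FromRow]: FromRow[H :: T] =
    new FromRow[H :: T] {
      def apply(row: List[String]): Try[H :: T] = row match {
        case h :: t => for {
          hv <- Read[H].reads(h)
          tv <- FromRow[T].apply(t)
        } yield hv :: tv
        case Nil => Failure(new RuntimeException("Expected more cells"))
      }
    }
}

// Hypothetical type class: Generic and FromRow are found when the instance
// is summoned, and the instance keeps hold of them afterwards.
trait CachedRowParser[A] { def apply(row: List[String]): Try[A] }

object CachedRowParser {
  def apply[A](implicit parser: CachedRowParser[A]): CachedRowParser[A] = parser

  implicit def fromGeneric[A, L <: HList](implicit
    gen: Generic.Aux[A, L],
    fromRow: FromRow[L]
  ): CachedRowParser[A] = new CachedRowParser[A] {
    def apply(row: List[String]): Try[A] = fromRow(row).map(gen.from)
  }
}

case class Foo(s: String, i: Int)

object CachedRowParserDemo {
  val parseFoo = CachedRowParser[Foo] // implicit resolution happens here, once
  val rows = List(List("first", "10"), List("third", "twelve"))
  val results = rows.map(row => parseFoo(row))
}
```

The trade-off is one more type class to define; in exchange, the per-line work is just the parsing itself.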