Surender Raja - 3 months ago
Scala Question

In Scala, how do we find the latest record for each customer?

My input file is below. It contains purchase details for each customer.

Input:

100,Surender,2015-01-23,PHONE,20000
100,Surender,2015-01-24,LAPTOP,25000
101,Ajay,2015-02-21,LAPTOP,40000
101,Ajay,2015-03-10,MUSIC_SYSTEM,50000
102,Vikram,2015-07-20,WATCH,60000


I would like to find the latest purchase details for each customer.

Expected output:

List(101,Ajay,2015-03-10,MUSIC_SYSTEM,50000)
List(100,Surender,2015-01-24,LAPTOP,25000)
List(102,Vikram,2015-07-20,WATCH,60000)


I tried the code below, and it gives the expected output.

However, this logic reads much like Java.

My Scala code:

package pack1

import scala.io.Source
import scala.collection.mutable.ListBuffer

object LatestObj {

  def main(args: Array[String]) = {
    var maxDate = "0001-01-01"
    var actualData: List[String] = List()
    var resultData: ListBuffer[String] = ListBuffer()

    val myList = Source.fromFile("D:\\Scala_inputfiles\\records.txt").getLines().toList
    val myGrped = myList.groupBy { x => x.substring(0, 3) }
    // println(myGrped)
    for (mappedIterator <- myGrped) {
      // println(mappedIterator._2)
      actualData = mappedIterator._2
      maxDate = findMaxDate(actualData)
      println(actualData.filter { x => x.contains(maxDate) })
    }
  }

  def findMaxDate(mytempList: List[String]): String = {
    var maxDate = "0001-01-01"
    for (m <- mytempList) {
      var transDate = m.split(",")(2)
      if (transDate > maxDate) {
        maxDate = transDate
      }
    }

    return maxDate
  }

}


Could someone help me do the same thing in a simpler way using Scala?

Or is the above code the only way to achieve this logic?

Answer

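An even simpler version, also using a case class that coincidentally has the same name. Unlike Tzach's, though, it doesn't remove bad records, and I leave everything as String. Comparing the dates as strings works because they are in yyyy-MM-dd format, which sorts lexicographically in chronological order.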

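case class Record(id: String, name: String, dateString: String, item: String, count: String)

myList
  .map { line =>
    // Throws a MatchError if a line does not split into exactly five fields
    val Array(id, name, dateString, item, count) = line.split(",")
    Record(id, name, dateString, item, count)
  }
  .groupBy(_.id)                  // one group of records per customer id
  .map(_._2.maxBy(_.dateString))  // keep the record with the latest date in each group
  .toList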
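
If you do want to skip bad records and compare real dates rather than strings, something along these lines might work. This is only a sketch under my own assumptions: the names `Purchase`, `parse`, `latestPerCustomer` and `sample` are made up for illustration, the last column is assumed to be a whole-number amount, and the input is inlined instead of being read from the file so the example is self-contained.

```scala
import java.time.LocalDate
import scala.util.Try

object LatestPurchase {
  // Hypothetical record type; the field names are illustrative, not from the original post.
  final case class Purchase(id: String, name: String, date: LocalDate, item: String, amount: Long)

  // Parse one CSV line. Returns None when the line does not have exactly
  // five fields, or when the date or amount cannot be parsed.
  def parse(line: String): Option[Purchase] =
    line.split(",").map(_.trim) match {
      case Array(id, name, date, item, amount) =>
        Try(Purchase(id, name, LocalDate.parse(date), item, amount.toLong)).toOption
      case _ => None
    }

  // Latest purchase per customer id; malformed lines are silently dropped.
  // The result is sorted by id so the output order is stable.
  def latestPerCustomer(lines: Seq[String]): List[Purchase] =
    lines
      .flatMap(parse)
      .groupBy(_.id)
      .values
      .map(_.maxBy(_.date.toEpochDay)) // compare dates as day counts
      .toList
      .sortBy(_.id)

  // Sample input copied from the question.
  val sample: List[String] = List(
    "100,Surender,2015-01-23,PHONE,20000",
    "100,Surender,2015-01-24,LAPTOP,25000",
    "101,Ajay,2015-02-21,LAPTOP,40000",
    "101,Ajay,2015-03-10,MUSIC_SYSTEM,50000",
    "102,Vikram,2015-07-20,WATCH,60000"
  )

  def main(args: Array[String]): Unit =
    latestPerCustomer(sample).foreach(println)
}
```

Grouping on the parsed id instead of `substring(0,3)` also means ids of any length are handled, and taking `maxBy` directly avoids the second pass with `contains(maxDate)`.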