Surender Raja - 3 months ago
Scala Question

Finding the TotalSpent for each Customer in Scala

I am working on a simple requirement: find the total amount spent by each customer. If a customer did not spend anything, I need to display a total spent amount of 0 for that customer.

custs.txt :

100,Surender
101,Raja
102,Vijay


txns.txt :

100,2015-01-29,20
100,2015-01-30,18
101,2015-01-14,30
101,2015-01-17,20


Scala Code :

import scala.io.Source

case class Txns(custId: Int, txn_dateString: String, spentAmount: Int)

object totalamounteachcustomer {

  def main(args: Array[String]): Unit = {

    val myCusts = Source.fromFile("C:\\inputfiles\\custs.txt").getLines().toList
    val custsTxns = Source.fromFile("C:\\inputfiles\\txns.txt").getLines().toList

    // Parse each transaction line and group the transactions by customer id
    val TxnsGrped = custsTxns.map { x =>
      val Array(custId, txn_dateString, spentAmount) = x.split(",")
      Txns(custId.toInt, txn_dateString, spentAmount.toInt)
    }.groupBy { txn => txn.custId }

    for (i <- myCusts) {
      val customer = i.split(",")(0).toInt
      val values = TxnsGrped.get(customer)

      // A customer with no transactions gets a total of 0
      val TotalSpentAmount = values match {
        case Some(a) => a.map { x => x.spentAmount }.sum
        case None => 0
      }

      println(s"$customer $TotalSpentAmount")
    }
  }

}


The above code works.

Output :

100 38
101 50
102 0


Does Scala have simple join keywords? If we need to get values based on a common key between two files, can't we use something like a join (inner join, left join) in Scala?

Here, I am using a Scala collection Map and looking up each customer in it, one by one.

Can we achieve the same result with fewer, simpler lines of Scala code?

Answer

Scala's standard collections have no built-in join operation, but implementing an (inner) join of maps is a piece of cake:

def join[K, A, B](a: Map[K, A], b: Map[K, B]): Map[K, (A,B)] =
    for((k,va) <- a; vb <- b.get(k)) yield k -> (va, vb)
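
To see why this behaves as an inner join, it may help to write the for comprehension above out by hand. The sketch below is roughly what it expands to; `JoinSketch`, `joinExplicit`, `names` and `scores` are names made up for this illustration only.

```scala
object JoinSketch {
  // Hypothetical name: the same inner join written with flatMap/map
  // instead of a for comprehension.
  def joinExplicit[K, A, B](a: Map[K, A], b: Map[K, B]): Map[K, (A, B)] =
    a.flatMap { case (k, va) =>
      // b.get(k) is None when k is missing from b, so that key
      // contributes nothing to the result.
      b.get(k).map(vb => k -> (va, vb))
    }

  val names  = Map(1 -> "a", 2 -> "b")
  val scores = Map(1 -> 10)

  // Key 2 has no entry in scores, so only key 1 survives the join.
  val joined = joinExplicit(names, scores)
}
```

Because the lookup returns an `Option`, dropping unmatched keys needs no explicit `if`: an empty `Option` simply yields no element.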

Usage example:

val customers = Map(
 100 -> "Surender",
 101 -> "Raja",
 102 -> "Vijay"
)

val purchases = Seq(
 (100,"2015-01-29",20),
 (100,"2015-01-30",18),
 (101,"2015-01-14",30),
 (101,"2015-01-17",20)
) groupBy(_._1)

join(customers, purchases) mapValues { case (_, l) => l.map(_._3).sum }
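
Evaluated on the sample data, the inner join drops customer 102, because 102 has no entry in `purchases`. A self-contained sketch of that behaviour follows; the `InnerJoinDemo` object and the `totals` name exist only for this illustration, and `.map` is used in place of `mapValues` so that the result is a strict `Map` on both Scala 2.12 and 2.13.

```scala
object InnerJoinDemo {
  def join[K, A, B](a: Map[K, A], b: Map[K, B]): Map[K, (A, B)] =
    for ((k, va) <- a; vb <- b.get(k)) yield k -> (va, vb)

  val customers = Map(100 -> "Surender", 101 -> "Raja", 102 -> "Vijay")

  val purchases = Seq(
    (100, "2015-01-29", 20),
    (100, "2015-01-30", 18),
    (101, "2015-01-14", 30),
    (101, "2015-01-17", 20)
  ).groupBy(_._1)

  // Sum the third tuple field (the amount) for each joined customer.
  val totals: Map[Int, Int] =
    join(customers, purchases).map { case (id, (_, txns)) => id -> txns.map(_._3).sum }
}
```

Here `InnerJoinDemo.totals` contains entries for 100 and 101 only, which is not yet what the question asks for.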

You can make join an infix operation by wrapping it in an implicit class:

implicit class C[K, A](a: Map[K, A]) {

    def join[B](b: Map[K, B]): Map[K, (A,B)] =
        for((k,va) <- a; vb <- b.get(k)) yield k -> (va, vb)

}

customers join purchases

Note that this join implementation (an inner join) is easily modified to behave as a left join:

implicit class C[K, A](a: Map[K, A]) {

    def join[B](b: Map[K, B]): Map[K, (A,B)] =
        for((k,va) <- a; vb <- b.get(k)) yield k -> (va, vb)

    def leftJoin[B](b: Map[K, B], default: B): Map[K, (A,B)] =
        for((k,va) <- a; vb = b.getOrElse(k, default)) yield k -> (va, vb)

}

You can then use it with your data to get exactly the result you are looking for:

customers.leftJoin(purchases, Seq()).mapValues {
    case (_, l) => l.map(_._3).sum
}

> res: scala.collection.immutable.Map[Int,Int] = Map(100 -> 38, 101 -> 50, 102 -> 0)
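
To tie this back to the input format from the question, here is a sketch that parses the raw lines into the two maps and then applies the left join. The lines are inlined as strings so the example runs without the files; with real files they would come from `Source.fromFile(...).getLines()` as in the question. `LeftJoinDemo`, `JoinOps`, `custLines`, `txnLines` and `amounts` are names assumed for this illustration, and `.map` is used instead of `mapValues`, which is deprecated as of Scala 2.13.

```scala
object LeftJoinDemo {
  implicit class JoinOps[K, A](a: Map[K, A]) {
    // Left join: every key of a is kept; missing keys in b get the default.
    def leftJoin[B](b: Map[K, B], default: B): Map[K, (A, B)] =
      for ((k, va) <- a) yield k -> (va, b.getOrElse(k, default))
  }

  // Stand-ins for the contents of custs.txt and txns.txt.
  val custLines = List("100,Surender", "101,Raja", "102,Vijay")
  val txnLines = List(
    "100,2015-01-29,20",
    "100,2015-01-30,18",
    "101,2015-01-14,30",
    "101,2015-01-17,20"
  )

  // custId -> customer name
  val customers: Map[Int, String] = custLines.map { line =>
    val Array(id, name) = line.split(",")
    id.toInt -> name
  }.toMap

  // custId -> amounts spent by that customer
  val amounts: Map[Int, List[Int]] = txnLines
    .map { line =>
      val Array(id, _, amount) = line.split(",")
      id.toInt -> amount.toInt
    }
    .groupBy(_._1)
    .map { case (id, pairs) => id -> pairs.map(_._2) }

  // Customers without transactions fall back to an empty list, which sums to 0.
  val totals: Map[Int, Int] =
    customers.leftJoin(amounts, List.empty[Int]).map { case (id, (_, spent)) => id -> spent.sum }
}
```

Using an empty list as the default keeps the summing step uniform: no special case is needed for customers who have no transactions.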