deepseas - 3 months ago
Scala Question

Traverse Set elements and get values using them as keys from a hashmap using Scala

I am trying to build a Set from the first column of all input files, and to create a hashmap for each file that uses the first column as the key and the second column as the value.

Using this Set, I want to look up every key in each file's hashmap. If a file's hashmap has no such key, the new hashmap should hold "0" as the value for that key. There should be one new hashmap per file.

import scala.io.Source

// Set for storing the keys (first column) of all files
var ids: Set[String] = collection.immutable.HashSet()
// Hashmap for storing key -> value pairs
var id: Map[String, String] = collection.immutable.Map()

for (arg <- args) {
  ids ++= Source.fromFile(arg)
    .getLines()
    .filterNot(_.trim.startsWith("#"))
    .map(_.split("\t")(0))
}

// Create hash map for each file
for (arg <- args) {
  id ++= Source.fromFile(arg).getLines()
    .filterNot(_.trim.startsWith("#"))
    .map { l =>
      val Array(k, v1, _*) = l.split("\t")
      k -> v1
    }.toMap
  val filtered = id.filter(i => ids.contains(i._1))
  println(filtered)
}


For example, file a:

#comments
ABC 2
ABN 7
CVF 9


File b:

#Comments
#
#
ABC 1
DFG 2
CVF 3


Output:

Map(ABC -> 2, ABN -> 7, CVF -> 9)
Map(ABC -> 1, ABN -> 7, CVF -> 3, DFG -> 2)


Desired output:

Map(ABC -> 2, ABN -> 7, CVF -> 9, DFG -> 0)
Map(ABC -> 1, ABN -> 0, CVF -> 3, DFG -> 2)

Answer

You are overcomplicating things by worrying too much about the file handling. To think in a 'functional' style, break that part off: all the file-reading code needs to do is produce two maps of key-value pairs that you can work with.
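As a rough sketch of what that separated file-reading part could look like, the helpers below turn tab-separated lines into a map. The names `FileMaps`, `parseLines` and `readFile` are made up for illustration, and the sketch assumes every non-comment line has at least two tab-separated columns with an integer in the second one.

```scala
import scala.io.Source

object FileMaps {
  // Hypothetical helper: turns tab-separated lines into a key -> value map,
  // skipping blank lines and comment lines that start with '#'.
  // Assumes the second column always parses as an Int.
  def parseLines(lines: Iterator[String]): Map[String, Int] =
    lines
      .map(_.trim)
      .filterNot(l => l.isEmpty || l.startsWith("#"))
      .map { l =>
        val cols = l.split("\t")
        cols(0) -> cols(1).trim.toInt
      }
      .toMap

  // Hypothetical helper: reads one file and closes it afterwards.
  // toMap is strict, so all lines are consumed before the source is closed.
  def readFile(path: String): Map[String, Int] = {
    val source = Source.fromFile(path)
    try parseLines(source.getLines()) finally source.close()
  }
}
```

Keeping the parsing in a function that takes an `Iterator[String]` means it can be tried out on in-memory lines without touching the file system.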

Now, assuming you already have those two maps, the rest of your program is simple:

// Start with these
val file1 = Map("ABC" -> 2, "ABN" -> 7, "CVF" -> 9)
val file2 = Map("ABC" -> 1, "DFG" -> 2, "CVF" -> 3)

// Get all their keys
val keys = file1.keySet ++ file2.keySet

// For each file, generate a map that has a value for all the keys
def produceMap(file: Map[String, Int], keyset: Set[String]): Map[String, Int] = {
  val keyValuePairs = for {
    key <- keyset  // Iterates through all the keys that were passed in
  } yield (key, file.getOrElse(key, 0))  // getOrElse fills in 0 for keys this file lacks
  keyValuePairs.toMap  // Converts the Set[(String, Int)] to a proper Map
}

val map1 = produceMap(file1, keys)  // Map(ABC -> 2, ABN -> 7, CVF -> 9, DFG -> 0)
val map2 = produceMap(file2, keys)  // Map(ABC -> 1, ABN -> 0, CVF -> 3, DFG -> 2)
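Since the question loops over an arbitrary number of files, the same idea can be stretched to any number of maps. This is only a sketch under that assumption; `FillMissing` and `fillAll` are illustrative names, not part of any library.

```scala
object FillMissing {
  // Pads every map so that it has an entry for each key seen in any of the maps.
  // Keys a given map lacks get the default value (0 unless stated otherwise).
  def fillAll(files: Seq[Map[String, Int]], default: Int = 0): Seq[Map[String, Int]] = {
    // Union of the keys of all maps
    val allKeys: Set[String] = files.flatMap(_.keySet).toSet
    // One padded map per input map, in the same order as the input
    files.map { file =>
      allKeys.map(key => key -> file.getOrElse(key, default)).toMap
    }
  }
}
```

With the two example maps, `FillMissing.fillAll(Seq(file1, file2))` would return the two padded maps in the same order as the input.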