user113531 user113531 -3 years ago 115
Scala Question

Difference between scala concurrent and non concurrent class under parallelsim

In Scala - functional programming course on coursera, the following snippet is described to fail in the concurrent case. The snippet will succeed if the

mutable.Set
is replaced with a concurrent class.

def intersection(a: GenSet[Int], b: GenSet[Int]): Set[Int] = {
val result = mutable.Set[Int]()
for (x <- a) if (b contains x) result += x
result
}
intersection((0 until 1000).toSet, (0 until 1000 by 4).toSet)
intersection((0 until 1000).par.toSet, (0 until 1000 by 4).par.toSet)



Rule: Avoid mutations to the same memory locations without proper
synchronization.


What is the difference between the concurrent and non-concurrent class, or what is the reason that the non-concurrent class can fail under parallelism?

Answer Source

When concurrently append element to Set without synchronization, the Set can't handle the collisions correctly or calculate position wrongly. so for your example, the res2 maybe will have duplicate fields or less some fields.

Explantation:

for:

for (x <- a) if (b contains x) result += x
result
}

There is a race condition for result += x. it equals to result.addEntry(x), but for this method it's not thread safe,

var h = index(newEntry.hashCode)
var curEntry = table(h)
while (null != curEntry) {
  if (curEntry == newEntry) return false
  h = (h + 1) % table.length
  curEntry = table(h)
  //Statistics.collisions += 1
}
table(h) = newEntry

In the above code, when try to concurrently append element to HashTable. it's maybe caused calculate wrong position or meet wrong collisions. For example, when try to add newEntry to the Set, actually it's not exist in the set, it will directly go to table(h) = newEntry, but at the same time, there is a new value, it has the same hashcode with the newEntry, but for the first newEntry still not finish table(h) = newEntry, so the newEntry will be override by the second value.

so for the synchronization maybe you can do it like:

for (x <- a) {
  if (b contains x) {
    this.synchronized {
      result += x
    }
  }
}
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download