chaotic3quilibrium - 3 months ago
Scala Question

Confused by foldLeft error (both in Eclipse and REPL)

The context for this is pretty simple. My assumptions are based on Odersky's book "Programming in Scala, 2nd Edition", section 8.5 describing "Placeholder Syntax".

I have a List[List[Boolean]] (i.e. a rectangular bit map) where I am attempting to count the total occurrences of the value "true". Here's the REPL line defining the data which executes fine:

val rowsByColumns =
  List( List(false, true, false)
      , List(true, true, true)
      , List(false, true, false)
      )


Next, I attempted to count the occurrences of "true" with the following line. And instead of executing, I receive an error:

val marks = (for(row <- rowsByColumns)
  yield {row.foldLeft[Int](0)(_ + (if (_) 1 else 0))}).sum

<console>:8: error: wrong number of parameters; expected = 2
val marks = (for(row <- rowsByColumns) yield {row.foldLeft[Int](0)(_ + (if (_) 1 else 0))}).sum
^


I did not understand the error as I have the two underscores representing the parameters to the function. So, I made the function more explicit by writing this which executes just fine:

val marks = (for(row <- rowsByColumns)
  yield {row.foldLeft[Int](0)((sum, marked) => sum + (if (marked) 1 else 0))}
).sum


My question is this: Why did I receive an error for the less explicit case, when spelling out the function without the placeholder shorthand executes correctly?

Thank you for any insight you can give me on this.

Answer

The limitations of Scala's placeholder syntax for anonymous functions can be extremely confusing (to me, at least). One rule of thumb is that underscores get bound to their nearest enclosing parentheses, but this is only an approximation; see section 6.23 of the Scala specification for the details:

An expression e of syntactic category Expr binds an underscore section u, if the following two conditions hold: (1) e properly contains u, and (2) there is no other expression of syntactic category Expr which is properly contained in e and which itself properly contains u.

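In this case the compiler doesn't see the second underscore as a second parameter of the same function. By the rule above, the if expression is the innermost Expr that properly contains the second underscore, so if (_) 1 else 0 becomes an anonymous function of its own, and the outer _ + (...) is left with a single parameter; hence "wrong number of parameters; expected = 2". This might seem odd, since _ + _ is properly seen as having two parameters, and if (_) x else y is equivalent to z => if (z) x else y (where z is a fresh identifier), but nesting the two doesn't work.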
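To make the binding concrete, here is a rough sketch of how the failing expression expands, next to a version that compiles. The parameter names acc and flag are only illustrative (the compiler generates fresh identifiers), and the expansion is written by hand from the rule quoted above rather than taken from compiler output.

```scala
val rowsByColumns = List(
  List(false, true, false),
  List(true, true, true),
  List(false, true, false)
)

// Hand-written expansion of `_ + (if (_) 1 else 0)`:
//   acc => acc + (flag => if (flag) 1 else 0)
// The `if` expression binds the inner underscore and the sum binds the
// outer one, so the result is a one-parameter function. foldLeft needs
// a two-parameter function, hence the error.

// Naming both parameters in a single lambda avoids the nesting problem.
val marks = rowsByColumns.map { row =>
  row.foldLeft(0)((sum, marked) => sum + (if (marked) 1 else 0))
}.sum
```

With the sample data from the question, marks comes out as 5 (one true in the first and last rows, three in the middle row).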

It's true that the compiler could in theory figure out that the two underscores should be parameters for the same anonymous function in your foldLeft, but not, for example, in the following, where the second underscore really does need to be bound separately:

rowsByColumns.map(_.map(!_))

This would require a lot of extra cleverness on the part of the compiler, though, and the Scala language designers have decided that it's not worth it—that placeholder syntax only needs to be provided for some fairly simple cases without nested expressions.
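For comparison, the nested example above, where each underscore is meant to belong to a different function, can be written out with explicit lambdas. This is a small sketch; row and cell are names chosen here for illustration.

```scala
val rowsByColumns = List(
  List(false, true, false),
  List(true, true, true),
  List(false, true, false)
)

// Placeholder form: the outer underscore is a row, the inner one a cell.
val viaPlaceholders = rowsByColumns.map(_.map(!_))

// The same thing with both anonymous functions spelled out.
val viaLambdas = rowsByColumns.map(row => row.map(cell => !cell))
```

Both values hold the bitmap with every cell negated, which is exactly the separate binding that the foldLeft attempt did not want.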


Luckily in this case you can just write rowsByColumns.flatten.count(identity) instead. flatten here concatenates the sublists to give a single List[Boolean]. We then want to know how many of the values in that list are true. count takes a predicate and tells you how many values in a collection satisfy that predicate. For example, here's one way to count the even numbers between 1 and 10 (inclusive):

val isEven: Int => Boolean = _ % 2 == 0    
(1 to 10) count isEven

In your case, though, we already have boolean values, so the predicate doesn't need to do any work—it can just be the identity function x => x. As dhg notes in a comment, Scala's Predef object provides this as a method named identity, which I'm using here. You could just as easily write rowsByColumns.flatten.count(x => x), though, if you find that clearer.
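Putting the pieces together on the data from the question, a minimal sketch might look like this (the intermediate value flat is introduced here only to show the flattening step):

```scala
val rowsByColumns = List(
  List(false, true, false),
  List(true, true, true),
  List(false, true, false)
)

// Concatenate the rows into a single List[Boolean].
val flat: List[Boolean] = rowsByColumns.flatten

// Count the elements for which the predicate holds; the elements are
// already booleans, so the identity function serves as the predicate.
val marks = flat.count(identity)

// Equivalent spelling with an explicit lambda.
val marksExplicit = rowsByColumns.flatten.count(x => x)
```

Both marks and marksExplicit evaluate to 5 for this bitmap.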