Scala Question

What is the von Neumann bottleneck?

What is the von Neumann bottleneck, and how does functional programming reduce its effect? Can someone explain it simply, with a practical and comprehensive example that shows, for instance, the advantage of using Scala over Java?

Answer

Using Scala will not necessarily fix your performance problems, even if you use functional programming.

More importantly, there are many causes of poor performance, and you don't know the right solution without profiling.

The von Neumann Bottleneck arises because, in a von Neumann architecture, the CPU and memory are separate, so the CPU often has to wait for memory. Modern CPUs mitigate this by caching memory. This isn't a perfect fix, since it requires the CPU to guess correctly about which memory it will need next. However, high-performance code makes it easy for the CPU to guess correctly by structuring data efficiently and iterating over it linearly (i.e. good data locality).
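
To make the data locality point concrete, here is a minimal Java sketch that sums the same 2D array in two traversal orders. The class and method names (`LocalityDemo`, `sumRowMajor`, `sumColumnMajor`, `makeGrid`) are made up for illustration. The timing in `main` is deliberately naive: it ignores JIT warm-up, so treat the numbers as a rough hint and use a proper harness such as JMH for real measurements.

```java
// Illustrative sketch only: same arithmetic, two memory access patterns.
final class LocalityDemo {

    // Walks each row from start to end. Each int[] row is a contiguous
    // block of memory, so consecutive reads are cache-friendly.
    static long sumRowMajor(int[][] grid) {
        long total = 0;
        for (int row = 0; row < grid.length; row++) {
            for (int col = 0; col < grid[row].length; col++) {
                total += grid[row][col];
            }
        }
        return total;
    }

    // Walks column by column, so every read lands in a different row array.
    // Assumes a rectangular grid (all rows have the same length).
    static long sumColumnMajor(int[][] grid) {
        long total = 0;
        int cols = grid.length == 0 ? 0 : grid[0].length;
        for (int col = 0; col < cols; col++) {
            for (int row = 0; row < grid.length; row++) {
                total += grid[row][col];
            }
        }
        return total;
    }

    // Builds a rows x cols grid where each cell holds row + col.
    static int[][] makeGrid(int rows, int cols) {
        int[][] grid = new int[rows][cols];
        for (int row = 0; row < rows; row++) {
            for (int col = 0; col < cols; col++) {
                grid[row][col] = row + col;
            }
        }
        return grid;
    }

    public static void main(String[] args) {
        int[][] grid = makeGrid(2048, 2048);

        long t0 = System.nanoTime();
        long rowMajor = sumRowMajor(grid);
        long t1 = System.nanoTime();
        long columnMajor = sumColumnMajor(grid);
        long t2 = System.nanoTime();

        // Both sums are identical; only the elapsed times may differ.
        System.out.println("row-major sum:    " + rowMajor + " (" + (t1 - t0) / 1_000 + " us)");
        System.out.println("column-major sum: " + columnMajor + " (" + (t2 - t1) / 1_000 + " us)");
    }
}
```

Both methods do the same work and return the same result. On typical hardware the row-major version tends to be faster on large grids because it reads memory in order, but the size of the gap depends on the machine and the JVM.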

Scala can simplify parallel programming, which is probably what you are looking for. This is not directly related to the von Neumann Bottleneck.

Even so, Scala is not automatically the answer if you want to do parallel programming. There are several reasons for this.

  1. Java is also capable of parallel programming, and has many types of concurrent collections (in java.util.concurrent) for that purpose.
  2. Java 8 Streams are Java's answer to Scala's parallel collections. They can be used for functional programming.
  3. Parallel programming is not guaranteed to improve performance, and can make a program slower on small data sets, due to setup costs.
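
As a rough illustration of points 2 and 3, the sketch below computes the same sum of squares with a sequential and a parallel Java 8 stream. The class and method names (`ParallelSumDemo`, `sumOfSquaresSequential`, `sumOfSquaresParallel`) are hypothetical. The only difference between the two methods is the call to `parallel()`; whether that call actually helps depends on the data size and the machine.

```java
import java.util.stream.LongStream;

// Illustrative sketch only: the same functional pipeline, run two ways.
final class ParallelSumDemo {

    // Sums 1^2 + 2^2 + ... + n^2 on the calling thread.
    static long sumOfSquaresSequential(long n) {
        return LongStream.rangeClosed(1, n)
                .map(x -> x * x)
                .sum();
    }

    // Same pipeline, but parallel() lets the runtime split the range
    // across worker threads (the common fork/join pool by default).
    static long sumOfSquaresParallel(long n) {
        return LongStream.rangeClosed(1, n)
                .parallel()
                .map(x -> x * x)
                .sum();
    }

    public static void main(String[] args) {
        long n = 1_000_000L; // small enough that the sum fits in a long
        System.out.println("sequential: " + sumOfSquaresSequential(n));
        System.out.println("parallel:   " + sumOfSquaresParallel(n));
    }
}
```

Because the lambda has no side effects, both versions return the same value. For a small `n`, the parallel version can easily be slower, since splitting the work and merging the results has a cost of its own.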

There is one case where you are correct that Scala overcomes the von Neumann Bottleneck, and that is with big data. When the data won't fit easily on a single machine, you can store it across many machines, such as a Hadoop cluster. Hadoop's distributed filesystem is designed to keep data and CPUs close together to avoid network traffic. The easiest way to program for Hadoop is currently with Apache Spark in Scala. The Spark project publishes examples in both languages; as of Spark 2.x, the Scala examples are much simpler than the Java examples.