Ben Morrow Ben Morrow - 3 months ago 16
Swift Question

Should I prefer joined() or flatMap(_:) in Swift 3?

Swift 3 recently added

joined()
. I'm curious about the performance characteristics of these two ways of flattening a multi dimensional array:

let array = [[1,2,3],[4,5,6],[7,8,9]]
let j = Array(array.joined())
let f = array.flatMap{$0}


They both flatten the nested
array
into
[1, 2, 3, 4, 5, 6, 7, 8, 9]
. Should I prefer one over the other for performance? Also, is there a more readable way to write the calls?

Answer

TL; DR

When it comes just to flattening 2D arrays (without any transformations or separators applied, see @dfri's answer for more info about that aspect), array.flatMap{$0} and Array(array.joined()) are both conceptually the same and have similar performance.


The main difference between flatMap(_:) and joined() (note that this isn't a new method, it has just been renamed from flatten()) is that joined() is always lazily applied (for arrays, it returns a special FlattenBidirectionalCollection<Base>).

Therefore in terms of performance, it makes sense to use joined() over flatMap(_:) in situations where you only want to iterate over part of a flattened sequence (without applying any transformations). For example:

let array2D = [[2, 3], [8, 10], [9, 5], [4, 8]]

if array2D.joined().contains(8) {
    print("contains 8")
} else {
    print("doesn't contain 8")
}

Because joined() is lazily applied & contains(_:) will stop iterating upon finding a match, only the first two inner arrays will have to be 'flattened' to find the element 8 from the 2D array. Although, as @dfri correctly notes below, you are also able to lazily apply flatMap(_:) through the use of a LazySequence/LazyCollection – which can be created through the lazy property. This would be ideal for lazily applying both a transformation & flattening a given 2D sequence.

In cases where joined() is iterated fully through, it is conceptually no different from using flatMap{$0}. Therefore, these are all valid (and conceptually identical) ways of flattening a 2D array:

array2D.joined().map{$0}

Array(array2D.joined())

array2D.flatMap{$0}

In terms of performance, flatMap(_:) is documented as having a time-complexity of:

O(m + n), where m is the length of this sequence and n is the length of the result

This is because its implementation is simply:

  public func flatMap<SegmentOfResult : Sequence>(
    _ transform: (${GElement}) throws -> SegmentOfResult
  ) rethrows -> [SegmentOfResult.${GElement}] {
    var result: [SegmentOfResult.${GElement}] = []
    for element in self {
      result.append(contentsOf: try transform(element))
    }
    return result
  }
}

As append(contentsOf:) has a time-complexity of O(n), where n is the length of sequence to append, we get an overall time-complexity of O(m + n), where m will be total length of all sequences appended, and n is the length of the 2D sequence.

When it comes to joined(), there is no documented time-complexity, as it is lazily applied. However, the main bit of source code to consider is the implementation of FlattenIterator, which is used to iterate over the flattened contents of a 2D sequence (which will occur upon using map(_:) or the Array(_:) initialiser with joined()).

  public mutating func next() -> Base.Element.Iterator.Element? {
    repeat {
      if _fastPath(_inner != nil) {
        let ret = _inner!.next()
        if _fastPath(ret != nil) {
          return ret
        }
      }
      let s = _base.next()
      if _slowPath(s == nil) {
        return nil
      }
      _inner = s!.makeIterator()
    }
    while true
  } 

Here _base is the base 2D sequence, _inner is the current iterator from one of the inner sequences, and _fastPath & _slowPath are hints to the compiler to aid with branch prediction.

Assuming I'm interpreting this code correctly & the full sequence is iterated through, this also has a time complexity of O(m + n), where m is the length of the sequence, and n is the length of the result. This is because it goes through each outer iterator and each inner iterator to get the flattened elements.

So, performance wise, Array(array.joined()) and array.flatMap{$0} both have the same time complexity.

Although what's interesting is that running a quick benchmark reveals that:

import QuartzCore

func benchmark(repeatCount:Int = 1, name:String? = nil, closure:() -> ()) {
    let d = CACurrentMediaTime()
    for _ in 0..<repeatCount {
        closure()
    }
    let d1 = CACurrentMediaTime()-d
    print("Benchmark of \(name ?? "closure") took \(d1) seconds")
}

let arr = [[Int]](repeating: [Int](repeating: 0, count: 1000), count: 1000)

benchmark {
    _ = arr.flatMap{$0} // 0.3377s
}

benchmark {
    _ = Array(arr.joined()) // 0.514s
}

benchmark {
    _ = arr.joined().map{$0} // 1.573s
}

flatMap(_:) appears to be the fastest by a fraction of a second. I suspect that joined() being slower could be due to the branching that occurs within the FlattenIterator (although the hints to the compiler minimise this cost) – although just why map(_:) is so slow, I'm not too sure. Would certainly be interested to know if anyone else knows more about this.