IridiumFX IridiumFX - 7 months ago 6
Swift Question

First steps in Swift, performance issue when allocating small objects in a BST

In trying to learn Swift 2.2, I am facing a serious performance drop when trying to allocate many small objects (fundamentally, a BST of 262144 elements). My current benchmark is a Java 1.8.0_74 compile of an old snip that I wrote some years ago, which, on my 2012 Retina Macbook Pro executes in 59 seconds (59036178 microsecond). The Issue I can observe via Instruments is that I get literally dozens of swift_retain_ and swift_release per iteration. Not sure how to avoid them:

import Foundation
import Darwin;

import Foundation

public class BinarySearchTree<T : Comparable> {
private var _value : T?;

private var _leftTree : BinarySearchTree<T>?;
private var _rightTree : BinarySearchTree<T>?;

public init(value : T) {
_value = value;
}

var value : T? {
get {
return self._value;
}
set {
self._value = newValue;
}
}

var leftTree : BinarySearchTree<T>? {
get {
return self._leftTree;
}
set {
self._leftTree = newValue;
}
}

var rightTree : BinarySearchTree<T>? {
get {
return self._rightTree;
}
set {
self._rightTree = newValue;
}
}

public func add(newValue : T) -> BinarySearchTree<T> {
var navigator : BinarySearchTree<T>?;
var subtree : BinarySearchTree<T>?;

var done : Bool?;

done = false;
navigator = self;

while (!done!) {
if (newValue < navigator?.value) {
subtree = navigator?.leftTree;
if (subtree != nil) {
navigator = subtree;
} else {
let newNode = BinarySearchTree<T>(value: newValue);
navigator!.leftTree = newNode;
done = true;
}
} else if (newValue > navigator?.value) {
subtree = navigator?.rightTree;
if (subtree != nil) {
navigator = subtree;
} else {
let newNode = BinarySearchTree<T>(value: newValue);
navigator?.rightTree = newNode;
done = true;
}
} else {
done = true;
}
}
return self;
}
} /* cut remove/search methods */


And this is the test code I wrote for the test run

let count : Int32 = 262144;
let base : Int32 = 65536;
let target : Int32 = count + 1;

var info = mach_timebase_info(numer:0, denom:0);
var timebase = mach_timebase_info(&info);
let numer = UInt64(info.numer);
let denom = UInt64(info.denom);
let norm = UInt64(numer/denom);

let check1 = (mach_absolute_time() * norm);

var root = BinarySearchTree<Int32>(value:base);

for var loop in 0 ... count-1 {
if (loop % 1000 == 0) {
print(loop);
}
root = root.add(loop);
}

let check2 = (mach_absolute_time() * norm);
print("Creation phase microseconds: [" + String((check2 - check1) / 1000) + "]");


I tried searching for the specific swift release/retain issue with no luck, I am not sure how to proceed. Thanks everyone

Answer

I simplified your code, removing some Optionals and your getter/setters because they were unnecessary and could contribute to slow code.

I profiled both your code and mine and got this result on the same data set of random elements:

1000 elements:

Yours: Creation phase microseconds: [28680771]

Mine : Creation phase microseconds: [8564279]

10000 elements:

Yours: Creation phase microseconds: [426233689]

Mine : Creation phase microseconds: [126725800]

Here is my code:

public class BinarySearchTree2<T : Comparable> {
  public init(value : T) {
    self.value = value
  }

  var value : T
  var leftTree : BinarySearchTree2<T>?
  var rightTree : BinarySearchTree2<T>?

  public func add(newValue : T) -> BinarySearchTree2<T> {
    var navigator = self

    while (true) {
      if (newValue < navigator.value) {
        guard let subtree = navigator.leftTree else {
          navigator.leftTree = BinarySearchTree2<T>(value: newValue)
          break
        }
        navigator = subtree
        continue
      }
      if (newValue > navigator.value) {
        guard let subtree = navigator.rightTree else {
          navigator.rightTree = BinarySearchTree2<T>(value: newValue)
          break
        }
        navigator = subtree
        continue
      }
      break
    }
    return self
  }
} /* cut remove/search methods */

Edit:

I also did a more optimum balanced tree test where I created a data set of 1001 sequential elements, removed the middle element, used a Fisher-Yates shuffle to randomize the order, initialized the root with the middle element, and ran both sets. Here are my results:

Yours: Creation phase microseconds: [27648219]

Mine: Creation phase microseconds: [8332361]

Comments