retrovius - 7 days ago
Swift Question

Which is more efficient: Creating a "var" and re-using it, or creating several "let"s?

Just curious which is more efficient/better in Swift:


  • Creating three temporary constants (using let) and using those constants to define other variables

  • Creating one temporary variable (using var) and using that variable to hold three different values which will then be used to define other variables



This is perhaps better explained through an example:

var one = Object()
var two = Object()
var three = Object()

func firstFunction() {
    let tempVar1 = //calculation1
    one = tempVar1

    let tempVar2 = //calculation2
    two = tempVar2

    let tempVar3 = //calculation3
    three = tempVar3
}

func secondFunction() {
    var tempVar = //calculation1
    one = tempVar

    tempVar = //calculation2
    two = tempVar

    tempVar = //calculation3
    three = tempVar
}


Which of the two functions is more efficient? Thank you for your time!

Answer

Not to be too cute about it, but the most efficient version of your code above is:

var one = Object()
var two = Object()
var three = Object()

That is logically equivalent to all the code you've written since you never use the results of the computations (assuming the computations have no side-effects). It is the job of the optimizer to get down to this simplest form. Technically the simplest form is:
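The "no side-effects" caveat matters. Here's a hedged sketch (the function names are invented for illustration) of the rule the optimizer follows: a call whose result is unused can be deleted only if the call has no observable effects.

```swift
// A mutable global gives us an observable effect to watch.
var log: [String] = []

func pureCalc() -> Int { 2 + 2 }   // pure: an unused result can be discarded
func noisyCalc() -> Int {
    log.append("ran")              // observable effect: the call must survive
    return 2 + 2
}

_ = pureCalc()    // candidate for complete removal under -O
_ = noisyCalc()   // cannot be removed; the append is visible afterwards
assert(log == ["ran"])
```

Both calls behave identically from the source's point of view, but only the first one is eligible for dead-code elimination.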

func main() {}

But the optimizer isn't quite that smart. It is, however, smart enough to get to my first example. Consider this program:

var one = 1
var two = 2
var three = 3

func calculation1() -> Int { return 1 }
func calculation2() -> Int { return 2 }
func calculation3() -> Int { return 3 }

func firstFunction() {
    let tempVar1 = calculation1()
    one = tempVar1

    let tempVar2 = calculation2()
    two = tempVar2

    let tempVar3 = calculation3()
    three = tempVar3

}

func secondFunction() {
    var tempVar = calculation1()
    one = tempVar

    tempVar = calculation2()
    two = tempVar

    tempVar = calculation3()
    three = tempVar
}

func main() {
    firstFunction()
    secondFunction()
}

Run it through the compiler with optimizations:

$ swiftc -O -wmo -emit-assembly x.swift

Here's the whole output:

    .section    __TEXT,__text,regular,pure_instructions
    .macosx_version_min 10, 9
    .globl  _main
    .p2align    4, 0x90
_main:
    pushq   %rbp
    movq    %rsp, %rbp
    movq    $1, __Tv1x3oneSi(%rip)
    movq    $2, __Tv1x3twoSi(%rip)
    movq    $3, __Tv1x5threeSi(%rip)
    xorl    %eax, %eax
    popq    %rbp
    retq

    .private_extern __Tv1x3oneSi
    .globl  __Tv1x3oneSi
.zerofill __DATA,__common,__Tv1x3oneSi,8,3
    .private_extern __Tv1x3twoSi
    .globl  __Tv1x3twoSi
.zerofill __DATA,__common,__Tv1x3twoSi,8,3
    .private_extern __Tv1x5threeSi
    .globl  __Tv1x5threeSi
.zerofill __DATA,__common,__Tv1x5threeSi,8,3
    .private_extern ___swift_reflection_version
    .section    __TEXT,__const
    .globl  ___swift_reflection_version
    .weak_definition    ___swift_reflection_version
    .p2align    1
___swift_reflection_version:
    .short  1

    .no_dead_strip  ___swift_reflection_version
    .linker_option "-lswiftCore"
    .linker_option "-lobjc"
    .section    __DATA,__objc_imageinfo,regular,no_dead_strip
L_OBJC_IMAGE_INFO:
    .long   0
    .long   1088

Your functions aren't even in the output because they don't do anything. main is simplified to:

_main:
    pushq   %rbp
    movq    %rsp, %rbp
    movq    $1, __Tv1x3oneSi(%rip)
    movq    $2, __Tv1x3twoSi(%rip)
    movq    $3, __Tv1x5threeSi(%rip)
    xorl    %eax, %eax
    popq    %rbp
    retq

This sticks the values 1, 2, and 3 into the globals, and then exits.

My point here is that if it's smart enough to do that, don't try to second-guess it with temporary variables. Its job is to figure that out. In fact, let's see how smart it is. We'll turn off Whole Module Optimization by dropping -wmo. Without that flag, it won't strip the functions, because it doesn't know whether something else will call them, and so we can see how it compiles each one:

$ swiftc -O -emit-assembly x.swift

Here's firstFunction():

__TF1x13firstFunctionFT_T_:
    pushq   %rbp
    movq    %rsp, %rbp
    movq    $1, __Tv1x3oneSi(%rip)
    movq    $2, __Tv1x3twoSi(%rip)
    movq    $3, __Tv1x5threeSi(%rip)
    popq    %rbp
    retq

Since it can see that the calculation methods just return constants, it inlines those results and writes them to the globals.

Now how about secondFunction():

__TF1x14secondFunctionFT_T_:
    pushq   %rbp
    movq    %rsp, %rbp
    popq    %rbp
    jmp __TF1x13firstFunctionFT_T_

Yes. It's that smart. It realized that secondFunction() is identical to firstFunction() and it just jumps to it. Your functions literally could not be more identical and the optimizer knows that.

So what's the most efficient? The one that is simplest to reason about. The one with the fewest side-effects. The one that is easiest to read and debug. That's the efficiency you should be focused on. Let the optimizer do its job. It's really quite smart. And the more you write in nice, clear, obvious Swift, the easier it is for the optimizer to do its job. Every time you do something clever "for performance," you're just making the optimizer work harder to figure out what you've done (and probably undo it).
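To see the "logically equivalent" claim hold at runtime (the calculation functions below are stand-ins, matching the constants used earlier, not real APIs), you can check that both styles leave the globals in exactly the same state:

```swift
func calculation1() -> Int { 1 }
func calculation2() -> Int { 2 }
func calculation3() -> Int { 3 }

var one = 0, two = 0, three = 0

// Style A: a fresh `let` per result.
func withLets() {
    let t1 = calculation1(); one = t1
    let t2 = calculation2(); two = t2
    let t3 = calculation3(); three = t3
}

// Style B: one reused `var`.
func withVar() {
    var t = calculation1(); one = t
    t = calculation2(); two = t
    t = calculation3(); three = t
}

withLets()
let afterLets = [one, two, three]
withVar()
let afterVar = [one, two, three]
assert(afterLets == afterVar)   // identical observable behavior
```

Since the observable behavior is identical, the only real difference between the two is which one you find easier to read.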


Just to finish the thought: the local variables you create are barely hints to the compiler. The compiler generates its own local variables when it converts your code to its internal representation (IR). IR is in static single assignment form (SSA), in which every variable can only be assigned one time. Because of this, your second function actually creates more local variables than your first function. Here's the first function (generated with swiftc -emit-ir x.swift):

define hidden void @_TF1x13firstFunctionFT_T_() #0 {
entry:
  %0 = call i64 @_TF1x12calculation1FT_Si()
  store i64 %0, i64* getelementptr inbounds (%Si, %Si* @_Tv1x3oneSi, i32 0, i32 0), align 8
  %1 = call i64 @_TF1x12calculation2FT_Si()
  store i64 %1, i64* getelementptr inbounds (%Si, %Si* @_Tv1x3twoSi, i32 0, i32 0), align 8
  %2 = call i64 @_TF1x12calculation3FT_Si()
  store i64 %2, i64* getelementptr inbounds (%Si, %Si* @_Tv1x5threeSi, i32 0, i32 0), align 8
  ret void
}

In this form, variables have a % prefix. As you can see, there are three of them: %0, %1, and %2.
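As a hedged illustration of what SSA renaming does (written out as Swift for readability; the real rewrite happens in IR, not in your source): each assignment to a mutable variable becomes a fresh, never-reassigned value, which has exactly the same shape as writing separate `let`s by hand.

```swift
// Source form: one `var`, three assignments.
var t = 10
t = 20
t = 30

// The SSA view the compiler works with, spelled out as Swift:
// every assignment gets its own single-assignment name.
let t0 = 10   // corresponds to: var t = 10
let t1 = 20   // corresponds to: t = 20
let t2 = 30   // corresponds to: t = 30

assert(t == t2)   // the final value is just the last SSA name
_ = (t0, t1)      // silence "unused" warnings for the earlier names
```

So by the time the optimizer sees your code, the `var`-reuse version and the three-`let` version have already converged on the same representation.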

Here's your second function:

define hidden void @_TF1x14secondFunctionFT_T_() #0 {
entry:
  %0 = alloca %Si, align 8
  %1 = bitcast %Si* %0 to i8*
  call void @llvm.lifetime.start(i64 8, i8* %1)
  %2 = call i64 @_TF1x12calculation1FT_Si()
  %._value = getelementptr inbounds %Si, %Si* %0, i32 0, i32 0
  store i64 %2, i64* %._value, align 8
  store i64 %2, i64* getelementptr inbounds (%Si, %Si* @_Tv1x3oneSi, i32 0, i32 0), align 8
  %3 = call i64 @_TF1x12calculation2FT_Si()
  %._value1 = getelementptr inbounds %Si, %Si* %0, i32 0, i32 0
  store i64 %3, i64* %._value1, align 8
  store i64 %3, i64* getelementptr inbounds (%Si, %Si* @_Tv1x3twoSi, i32 0, i32 0), align 8
  %4 = call i64 @_TF1x12calculation3FT_Si()
  %._value2 = getelementptr inbounds %Si, %Si* %0, i32 0, i32 0
  store i64 %4, i64* %._value2, align 8
  store i64 %4, i64* getelementptr inbounds (%Si, %Si* @_Tv1x5threeSi, i32 0, i32 0), align 8
  %5 = bitcast %Si* %0 to i8*
  call void @llvm.lifetime.end(i64 8, i8* %5)
  ret void
}

This one has 6 local variables! But, just like the local variables in the original source code, this tells us nothing about final performance. The compiler just creates this version because it's easier to reason about (and therefore optimize) than a version where variables can change their values.

(Even more dramatic is this code in SIL (-emit-sil), which creates 16 local variables for the first function and 17 for the second! If the compiler is happy to invent 16 local variables just to make 6 lines of code easier to reason about, you certainly shouldn't worry about the local variables you create. They're not even a minor concern; they're essentially free.)