Stéphane - 2 months ago
Perl Question

Optimization of the ternary operator in Perl

I have this loop:

for my $line (split /\n/, $content) {
    ($line !~ /^\-{2,}$/) ? ( $return .= "$line\n" )
                          : ( $return .= "\N{ZERO WIDTH SPACE}$line\n" );
}


Most lines will not match the regex (i.e. most of the time the condition will be true).

I first wrote the condition using the =~ operator (with the two branches swapped), but then the second branch would have been the one executed most of the time.

In other words: when you know that a test will choose one branch in 99% of cases, does it affect performance to write that branch first?

Answer

When you know that a test will choose one branch in 99% of cases, does it affect performance to write that branch first?

In the simple if/else case (which is what the ternary operator is), the answer is no. The order of the branches does not matter: the condition is evaluated every time, and its result picks which branch to go down.

In an if/elsif/else chain it would matter, because the conditions are evaluated in order until one of them is true. Putting the most common case first means fewer conditions are evaluated on average.
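As a rough sketch of that ordering (the classify function, its categories and the assumption that plain text lines are the most common are all made up for illustration, not taken from the question):

```perl
use strict;
use warnings;

# Hypothetical classifier: conditions are tested top to bottom, so the
# case assumed to be most frequent (plain text) is checked first.
sub classify {
    my ($line) = @_;
    if    ($line =~ /^[^#-]/)  { return 'text' }       # assumed most common
    elsif ($line =~ /^#/)      { return 'comment' }
    elsif ($line =~ /^-{2,}$/) { return 'separator' }  # assumed rare
    else                       { return 'other' }
}

print classify($_), "\n" for 'hello', '# note', '---', '';
```

A text line returns after a single regex match, while a separator line has to fail two matches before it is recognised. Whether that difference is measurable is something only a benchmark can tell.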

In an if/else, pick the order that makes the most sense for the reader, and that usually means avoiding negatives. $line =~ /^\-{2,}$/ is easier to read than $line !~ /^\-{2,}$/. $line =~ /^-{2,}$/ is even better (there's no need to escape - in a regex outside of a character class).
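A quick way to convince yourself that the escaped and unescaped patterns are interchangeable is to run both over a few inputs. The helper names and sample strings here are assumptions for illustration only:

```perl
use strict;
use warnings;

# Hypothetical helpers wrapping the two spellings of the same pattern.
sub is_separator_escaped { my ($line) = @_; return $line =~ /^\-{2,}$/ ? 1 : 0 }
sub is_separator         { my ($line) = @_; return $line =~ /^-{2,}$/  ? 1 : 0 }

# Made-up sample lines covering matches and non-matches.
for my $line ('--', '-----', '-', 'a--', '-- ', '') {
    die "patterns disagree on '$line'\n"
        unless is_separator_escaped($line) == is_separator($line);
}
print "both patterns agree\n";
```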

At least it shouldn't matter. As with anything as complicated as Perl, it's best to benchmark these things. It's a bit troublesome to come up with something that will exercise the CPU enough so as not to be lost in the normal benchmarking jitter. Be sure to run this multiple times with plenty of iterations before drawing conclusions.

use strict;
use warnings;
use v5.10;

use Benchmark qw(cmpthese);

# Iteration count from the command line; the default is an arbitrary choice
my $Iterations = shift // 1_000_000;

my $Threshhold = 100_000;

# I've picked something that isn't constant to avoid constant folding
sub a_then_b {
    my $num = shift;
    return $num > $Threshhold ? sqrt($num) + sqrt($num) ** 2
                              : $num + $num;
}

sub b_then_a {
    my $num = shift;
    return $num <= $Threshhold ? $num + $num
                               : sqrt($num) + sqrt($num) ** 2;
}

say "First one side";
cmpthese $Iterations, {
    a_then_b => sub { a_then_b($Threshhold - 1) },
    b_then_a => sub { b_then_a($Threshhold - 1) }
};

say "Then the other";
cmpthese $Iterations, {
    a_then_b => sub { a_then_b($Threshhold + 1) },
    b_then_a => sub { b_then_a($Threshhold + 1) }
};

As a final note, to take proper advantage of a ternary, the assignment should go on the left-hand side, outside the ternary. The ternary evaluates to the value of whichever branch was chosen, so that value can be appended directly.

$return .= $line =~ /^-{2,}$/ ? "\N{ZERO WIDTH SPACE}$line\n"
                               : "$line\n";