n0pe n0pe - 2 years ago 55
Perl Question

Trying to understand this perl script

It seems very simple and I figured most of it out. But seeing as Perl is loose with syntax, it's difficult for a newcomer to jump right in :)

my @unique = ();
my %seen = ();
foreach my $elem ( @array ) {
    next if $seen{ $elem }++;
    push @unique, $elem;
}

This is right from the perldoc website. If I understand correctly, it can also be written as:

my @unique = ();
my %seen = ();
my $elem;
foreach $elem ( @array ) {
    if ( $seen{ $elem }++ ) {
        next;
    }
    push ( @unique, $elem );
}

So my understanding at this point is:

  • Declare an array named unique

  • Declare a hash named seen

  • Declare a variable named elem

  • Iterate over @array, each iteration is stored in $elem

  • If $elem is a key in the hash %seen (I have no idea what the ++
    does), skip to the next iteration

  • Append $elem to the end of @unique

I'm missing 2 things:

  • When does anything get stored in %seen?

  • What does ++ do (in every other language it increments, but I don't see how that works)

I know that the issue lies with this part:

$seen{ $elem }++

which I suspect is doing a bunch of different stuff at once. Is there a simpler more verbose way of writing that line?

Thanks for the help

Answer Source

The ++ operator does essentially the same thing in Perl as it does in most other languages that have it: it increments a variable.

$seen{ $elem }++;

increments a value in the %seen hash, namely the one whose key is $elem.
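The postfix form of ++ also yields the value the element had before the increment, and that old value is what the "next if" test sees. A more verbose sketch of the loop from the question, assuming made-up sample data in @array and a hypothetical helper variable $count that is not in the original:

```perl
use strict;
use warnings;

my @array  = qw(a b a c b);   # sample input, made up for illustration
my @unique = ();
my %seen   = ();

foreach my $elem (@array) {
    # $count is a hypothetical helper: how often $elem was seen before now.
    # A key that is not in %seen yet counts as 0 occurrences.
    my $count = exists $seen{$elem} ? $seen{$elem} : 0;

    # Record this occurrence; this is the only place %seen is written.
    $seen{$elem} = $count + 1;

    # Same test as `next if $seen{ $elem }++`: skip anything seen earlier.
    next if $count > 0;

    push @unique, $elem;
}

print "@unique\n";   # prints "a b c"
```

The one-line idiom folds the lookup, the increment, and the test into a single expression; the behaviour should be the same.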

The "magic" is that if $seen{$elem} hasn't been defined yet, it's automatically created, as if it already existed and had the value 0; the ++ then sets it to 1. So it's equivalent to:

if (! exists $seen{$elem}) {
    $seen{$elem} = 0;
}
$seen{$elem}++;

This is called "autovivification". (No, really, that's what it's called.) (EDIT2: No, my mistake, it's not; as @ysth points out, "autovivification" actually refers to references springing into existence. See perldoc perlref.)
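For contrast, here is a minimal sketch of autovivification in the sense perlref describes: using an undefined value as a reference in a context that assigns through it makes Perl create the reference for you. The hash %tree and its keys are made up for illustration.

```perl
use strict;
use warnings;

my %tree;   # hypothetical example hash, starts out empty

# $tree{fruit} does not exist yet; assigning through it makes Perl create
# an anonymous hash and store a reference to it in $tree{fruit}.
$tree{fruit}{apple} = 1;

# Likewise, pushing through an undefined element creates an array reference.
push @{ $tree{list} }, 'first';

my $fruit_kind = ref($tree{fruit});
my $list_kind  = ref($tree{list});

print "$fruit_kind\n";   # prints "HASH"
print "$list_kind\n";    # prints "ARRAY"
```

In the $seen{ $elem }++ case no reference is created; an undefined value is simply treated as 0 by the increment.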

EDIT: Here's a revised version of your description:

  • Declare an array variable named @unique
  • Declare a hash variable named %seen
  • Declare a scalar variable named $elem
  • Iterate over @array, each iteration is stored in $elem
  • If $elem is a key in the hash %seen, skip to the next iteration
  • Append the value of $elem to the end of @unique
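The steps above can be sketched as a small runnable program. The subroutine name unique_in_order and the sample colours are made up for illustration; the loop body is the one from the question.

```perl
use strict;
use warnings;

# unique_in_order is a hypothetical name; the loop is the one from the question.
sub unique_in_order {
    my @array  = @_;
    my @unique = ();                 # array that collects the results
    my %seen   = ();                 # hash of counts, keyed by element
    foreach my $elem (@array) {      # $elem stands for each element in turn
        next if $seen{$elem}++;      # skip if seen before; count it either way
        push @unique, $elem;         # first occurrence, so keep it
    }
    return @unique;
}

my @result = unique_in_order(qw(red green red blue green));
print "@result\n";   # prints "red green blue"
```

The first occurrence of each element is kept, so the original order is preserved.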

@unique, %seen, and $elem are all variables. The punctuation character (known as the "sigil") indicates what kind of variable each of them is, and is best thought of as part of the name.
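As a small illustration, the sketch below declares a scalar, an array, and a hash that share the made-up identifier "thing"; they are three unrelated variables. A single element is accessed with the $ sigil, which is why the loop writes $seen{ $elem } for an element of %seen.

```perl
use strict;
use warnings;

my $thing = 'scalar';          # scalar variable
my @thing = (1, 2, 3);         # array variable, unrelated to $thing
my %thing = (key => 'value');  # hash variable, unrelated to both

my $size = @thing;             # an array in scalar context gives its length

print "$thing\n";              # prints "scalar"
print "$size\n";               # prints "3"
print "$thing[0]\n";           # prints "1": an element of @thing
print "$thing{key}\n";         # prints "value": an element of %thing
```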
