Rob Rob - 1 month ago 12
Perl Question

ARRAY(0x7ff4bbb0c7b8) error: perl hash of arrays

Although my code runs without throwing a fatal error, the output is clearly erroneous. I first create a hash of arrays. Then I search sequences in a file against the keys in the hash. If the sequence exists as a key in the hash, I print the key and the associated values. This should be simple enough and I am creating the hash of arrays correctly. However, when I print the associated values I get "ARRAY(0x7ff4bbb0c7b8)" in its place.

The file "INFILE" is tab-delimited and looks like this, for example:

AAAAA AAAAA
BBBBB BBBBB BBBBB


Here is my code:

use strict;
use warnings;

open(INFILE, '<', '/path/to/file') or die $!;
my $count = 0;

my %hash = (
AAAAA => [ "QWERT", "YUIOP" ],
BBBBB => [ "ASDFG", "HJKL", "ZXCVB" ],
);

while (my $line = <INFILE>){
chomp $line;
my $hash;
my @elements = split "\t", $line;
my $number = grep exists $hash{$_}, @elements;
open my $out, '>', "/path/out/Cluster__Number$._$number.txt" or die $!;
foreach my $sequence(@elements){
if (exists ($hash{$sequence})){
print $out ">$sequence\n$hash{$sequence}\n";
}
else
{
$count++;
print "Key Doesn't Exist ", $count, "\n";
}
}
}


Current output looks like:

>AAAAA
ARRAY(0x7fc52a805ce8)
>AAAAA
ARRAY(0x7fc52a805ce8)


Expected output will look like:

>AAAAA
QWERT
>AAAAA
YUIOP


Thank you very much for your help.

Answer

The key here is to work with the arrayref held by the hash rather than printing it directly; a reference interpolated into a string shows up as ARRAY(0x...). To hand out the values in order, take the first item off the array with the shift function. You can then either push the item back onto the end of the array, or delete the key from the hash once no items are left, depending on what you want to happen after every value has been used once. You could also choose a random element from the array with the rand function like this:

my $out_seq = $hash{$sequence}[rand @{ $hash{$sequence} }];
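
As a minimal sketch of why the original print showed ARRAY(0x...) and how shift plus push cycles through the values: the hash contents below are copied from the question, while the helper name next_value is hypothetical and only there for illustration.

```perl
use strict;
use warnings;

my %hash = (
    AAAAA => [ "QWERT", "YUIOP" ],
    BBBBB => [ "ASDFG", "HJKL", "ZXCVB" ],
);

# Interpolating the reference itself stringifies it as ARRAY(0x...).
my $as_string = "$hash{AAAAA}";

# Dereferencing with @{ ... } gives access to the elements.
my @all = @{ $hash{AAAAA} };

# Hypothetical helper: return the next value for a key, then move it
# to the back of the array so the values repeat in a cycle.
sub next_value {
    my ($hash_ref, $key) = @_;
    my $value = shift @{ $hash_ref->{$key} };
    push @{ $hash_ref->{$key} }, $value;
    return $value;
}

my @seen;
push @seen, next_value(\%hash, 'AAAAA') for 1 .. 3;

print "$as_string\n";   # the stringified reference, not the data
print "@all\n";         # QWERT YUIOP
print "@seen\n";        # QWERT YUIOP QWERT
```

With two values stored under AAAAA, the third lookup wraps around to the first value again.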

If you wanted the items to run out in the random case, you would need to remove the item from the array. The best way to do that is probably with splice (the generic form of shift, unshift, pop, and push):

my $out_seq = splice @{ $hash{$sequence} }, rand @{ $hash{$sequence} }, 1;
delete $hash{$sequence} unless @{ $hash{$sequence} };
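
The splice approach above could be wrapped up roughly as follows. This is only a sketch: the helper name take_random is hypothetical, and it assumes an exhausted key should simply disappear from the hash so that a later exists check fails.

```perl
use strict;
use warnings;

my %hash = ( BBBBB => [ "ASDFG", "HJKL", "ZXCVB" ] );

# Hypothetical helper: remove and return one randomly chosen value for
# a key, deleting the key once its array is empty.
sub take_random {
    my ($hash_ref, $key) = @_;
    return unless exists $hash_ref->{$key};    # unknown or used-up key
    my $values = $hash_ref->{$key};
    # rand @$values yields a fractional index in [0, count);
    # splice truncates it to an integer offset.
    my $picked = splice @$values, rand @$values, 1;
    delete $hash_ref->{$key} unless @$values;
    return $picked;
}

my @drawn;
while (defined(my $value = take_random(\%hash, 'BBBBB'))) {
    push @drawn, $value;
}

print "drawn: @drawn\n";    # all three values, in random order
print "key removed\n" unless exists $hash{BBBBB};
```

Checking exists before dereferencing matters here: dereferencing a deleted key as an array would quietly autovivify a new empty array under that key.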

Here is my version of your program:

#!/usr/bin/perl

use strict;
use warnings;

# open my $in, '<', '/path/to/file' or die $!;
my $in = \*DATA; #use internal data file instead for testing
my $count = 0;

my %hash = (
    AAAAA    => [ "QWERT", "YUIOP" ],
    BBBBB    => [ "ASDFG", "HJKL", "ZXCVB" ],
);

while (<$in>) {
    chomp;
    my @elements = split "\t";
    my $number = grep exists $hash{$_}, @elements;
    #open my $out, '>', "/path/out/Cluster__Number$._$number.txt" or die $!;
    my $out = \*STDOUT; # likewise use STDOUT for testing
    for my $sequence (@elements) {
        if (exists $hash{$sequence}) {
            my $out_seq = shift @{ $hash{$sequence} };
            # if you want to repeat
            push @{ $hash{$sequence} }, $out_seq;
            # if you want to remove $sequence when they run out
            # delete $hash{$sequence} unless @{ $hash{$sequence} };

            print $out ">$sequence\n$out_seq\n";
        } else {
            warn "Key [$sequence] Doesn't Exist ", ++$count, "\n";
        }
    }
}

__DATA__
AAAAA	AAAAA
CCCCC
BBBBB	BBBBB	BBBBB