Gyler Gyler - 7 months ago 21
Perl Question

Empty array in a Perl while loop, should have input

I was working on this script when I came across a weird anomaly. When I print @extract right after the loop that fills it, it correctly prints the following:

------MMMMMMMMMMMMMMMMMMMMMMMMMM-M-MMMMMMMM
------SSSSSSSSSSSSSSSSSSSSSSSSSS-S-SSSSSDTA
------TIIIIIIIIIIIIITIIIVVIIIIII-I-IIIIITTT


Now the weird part: when I then try to print or return @extract (or $column) inside the while loop, it comes up empty, which renders the rest of the script useless. I have never come across this before, and I have not been able to find any documentation or anyone with a similar problem. The code is below; I marked with #<------ where the problem does and does not occur. Does anyone have any idea what is going on? Thank you kindly.

P.S. I am using Perl version 5.12.2.

use strict;
use warnings;
#use diagnostics;
#use feature qw(say);

open (S, "Val nuc align.txt") || die "cannot open FASTA file to read: $!";
open (OUTPUT, ">output.txt");

my @extract;
my $sum = 0;
my @lines = <S>;
my @seq = ();
my $start = 0; #amino acid column start
my $end = 10; #amino acid column end

#Removing of the sequence tag until amino acid sequence composition (from >gi to )).

foreach my $line (@lines) {
    $line =~ s/\n//g;
    if ($line =~ />/g) {
        $line =~ s/>.*\]/>/g;
        push @seq, $line;
    }
    else {
        push @seq, $line;
    }
}

my $seq = join ('', @seq);
my @seq_prot = join "\n", split '>', $seq;
@seq_prot = grep {/[A-Z]/} @seq_prot;

#number of sequences
print OUTPUT "Number of sequences:", scalar (grep {defined} @seq_prot), "\n";

#selection of amino acid sequence. From $start to $end.

my @vertical_array;
while ( my $line = <@seq_prot> ) {
    chomp $line;
    my @split_line = split //, $line;
    for my $index ( $start..$end ) { #AA position, extracts whole columns
        $vertical_array[$index] .= $split_line[$index];
    }
}

# Print out your vertical lines

for my $line ( @vertical_array ) {
    my $extract = say OUTPUT for unpack "(a200)*", $line; #split at end of each column
    @extract = grep {defined} $extract;
}
print OUTPUT @extract; #<--------------- This prints correctly the input

#Count selected amino acids excluding '-'.
my %counter;
while (my $column = @extract) {
    print @extract; #<------------------------ Empty print, no input found
}


Update: I found the main problem to be with the unpack call. I thought I could use it to split the columns of my input every X elements (43 in this case). This works, but as soon as I change $start to a number other than 0 (say 200), the code raises errors. It probably has something to do with the number of column elements not matching the lines. I will keep this updated.
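One possible cause, offered as a guess rather than a diagnosis: if $start is raised to 200 (and $end with it), the slots of @vertical_array below $start are never assigned, so the later loop hands undef to unpack and `use warnings` complains. The sketch below uses made-up data and a hypothetical $width of 3 instead of 200 to show fixed-width chunking with unpack while skipping unassigned slots.

```perl
use strict;
use warnings;

# Hypothetical data: slot 0 was never filled, slot 1 holds one column string.
my @vertical_array = (undef, 'MSTMSTMS');
my $width = 3;    # illustrative chunk width; the question uses 200

my @chunks;
for my $line (@vertical_array) {
    next unless defined $line;    # skip slots that were never assigned
    # "(aN)*" repeats the aN template until the string is used up,
    # so the last chunk may be shorter than $width.
    push @chunks, unpack "(a$width)*", $line;
}

print "$_\n" for @chunks;
```

Whether skipping the empty slots is the right behaviour depends on what the rest of the script expects; iterating over @vertical_array[$start..$end] only would be another option.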

Answer

Write your last while loop the same way as your previous for loop. The assignment

my $column = @extract

is in scalar context, which does not give you the same result as:

for my $column (@extract)

The scalar assignment instead gives you the number of elements in the array. Use the for loop form and it should work.
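To make the difference concrete, here is a minimal sketch with made-up sample data (the three strings are placeholders, not the asker's real columns). It shows what each form actually puts into the variable.

```perl
use strict;
use warnings;

my @extract = ('MMM', 'SSS', 'TII');    # hypothetical sample data

# Scalar context: assigning an array to a scalar yields its element count.
my $count = @extract;

# List context: the for loop sets $column to each element in turn.
my @seen;
for my $column (@extract) {
    push @seen, $column;
}

print "count = $count\n";
print "columns = @seen\n";
```

So inside `while (my $column = @extract)`, $column would hold a count, never one of the strings.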

However, I still have a concern: if @extract had anything in it, the while condition would be true on every pass and you would get an infinite loop. Is there any code between your two commented lines that you did not include?
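If a while loop is really wanted, it has to shrink the array so the condition can become false. This is only a sketch with placeholder data, and note that it empties @extract as it goes, which may not be what the rest of the script needs.

```perl
use strict;
use warnings;

my @extract = ('MMM', 'SSS', 'TII');    # hypothetical sample data

# while (my $column = @extract) { ... } would see the count 3 on every
# pass, because nothing removes elements; that loop would never end.
# Taking one element off per pass lets the condition reach 0 (false).
my @visited;
while (@extract) {
    my $column = shift @extract;    # remove and return the first element
    push @visited, $column;
}

print "visited: @visited\n";
print "left: ", scalar(@extract), "\n";
```

Testing `while (@extract)` rather than `while (my $column = shift @extract)` avoids stopping early on an element that happens to be false, such as "0" or an empty string.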
