Rob Rob - 7 months ago 35
Perl Question

perl: sliding window search along mesh array

I am ultimately trying to combine three arrays of letters using mesh (from List::MoreUtils), so that I can then compare each position among the sequences.
For example, if I have three files that look like:


The first comparison would be between the first letter of each file, TTT (this would count as no substitution). If the first letters were TAA, this would count as a substitution. The first challenge is to get the three corresponding letters together so that they can be compared.

Here is my code so far:

use strict;
use warnings;
use List::MoreUtils qw{mesh};

open (SEQ_ONE, "<", "/path/to/file_1.txt") or die $!;
open (SEQ_TWO, "<", "/path/to/file_2.txt") or die $!;
open (REFERENCE, "<", "/path/to/reference_sequence.txt") or die $!;

my @first;
my @second;
my @reference;
my @combined;
my $sequence;
my $secondsequence;
my $thirdsequence;
my $windowsize = 3;
my $step = 3;

# Read each file, skipping FASTA header lines (those starting with ">")
while (my $line = <SEQ_ONE>){
    chomp $line;
    if ($line !~ /^>+/) {
        $sequence .= $line;
        @first = split //, $sequence;
    }
}

while (my $secondline = <SEQ_TWO>){
    chomp $secondline;
    if ($secondline !~ /^>+/){
        $secondsequence .= $secondline;
        @second = split //, $secondsequence;
    }
}

while (my $thirdline = <REFERENCE>){
    chomp $thirdline;
    if ($thirdline !~ /^>+/){
        $thirdsequence .= $thirdline;
        @reference = split //, $thirdsequence;
    }
}

# Interleave the three arrays: ref[0], first[0], second[0], ref[1], ...
@combined = mesh @reference, @first, @second;
my $list = "@combined";

for (my $windowstart = 0; $windowstart <= (length($list) - $windowsize); $windowstart += $step){
    my $windowSeq = substr($list, $windowstart, $windowsize);
    print $windowSeq, "\n";
}

This seems to break the letters into chunks that alternate between two letters and one letter. Output for the above code looks something like:


I have experimented with different window and step sizes, but I still can't get the desired output of separate three letters at a time. I am close, just not quite there. Thanks for the help.


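The statement my $list = "@combined"; produces a string containing the array elements with a space inserted between each pair: interpolating an array in double quotes joins its elements with the value of $", which is a single space by default. Those extra spaces throw off the substr processing that follows, because each three-character window then covers a mix of letters and spaces instead of three letters. Double-quoting an array ("@array") is a convenience that makes it easier to read when printed. Here you want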

my $list = join '', @combined;
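To see the difference concretely, here is a minimal sketch using made-up sequences (the strings 'TAGC', 'TAGG' and 'TCGC' are hypothetical sample data, not taken from the question's files). It assumes the three arrays have equal length and uses a plain map as a core-Perl stand-in for mesh, so it runs without List::MoreUtils. The substitution check at the end is only one possible way to compare the three letters.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the three input files.
my @reference = split //, 'TAGC';
my @first     = split //, 'TAGG';
my @second    = split //, 'TCGC';

# Stand-in for List::MoreUtils::mesh on equal-length arrays:
# take element $i from each array in turn.
my @combined = map { ($reference[$_], $first[$_], $second[$_]) } 0 .. $#reference;

my $spaced = "@combined";         # elements joined with $" (a space by default)
my $list   = join '', @combined;  # elements joined with nothing

print "interpolated: '$spaced'\n";
print "joined:       '$list'\n";

# Same sliding window as in the question, applied to the joined string.
my $windowsize = 3;
my $step       = 3;
my @windows;
for (my $start = 0; $start <= length($list) - $windowsize; $start += $step) {
    push @windows, substr($list, $start, $windowsize);
}

for my $window (@windows) {
    # A window counts as a substitution when its letters are not all identical.
    my $status = $window =~ /^(.)\1*$/ ? 'no substitution' : 'substitution';
    print "$window\t$status\n";
}
```

With the joined string, every window holds exactly one letter from each sequence, so windows such as TTT and AAC can be compared directly. With the interpolated string, the same loop would cut across the inserted spaces.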