SJPRO - 7 months ago
Perl Question

Perl: Matching 3 pairs of numbers from 4 consecutive numbers

I am writing some code and I need to do the following:

Given a 4-digit number like "1234", I need to get 3 overlapping pairs of digits (the first 2, the middle 2, and the last 2). In this example I need to get "12", "23" and "34".

I am new to Perl and don't know anything about regex. I am writing a script for personal use and started reading about Perl a few days ago, because I figured it would be a better language for the task at hand (I need to do some statistics on the numbers and find patterns).

I have the following code, but I tested it with 6-digit numbers because I "forgot" that the numbers I would be processing are 4 digits long, so of course it failed with the real data:

foreach $item (@totaldata)
{
    my $match;

    $match = ($item =~ m/(\d\d)(\d\d)(\d\d)/);

    if ($match)
    {
        ($heads[$i], $middles[$i], $tails[$i]) = ($item =~ m/(\d\d)(\d\d)(\d\d)/);
        $processednums++;
        $i++;
    }
}


Thank you.

------------------------- EDIT

This is where I am now, after applying the help from the users here. I have added some more detail at the beginning of the code, for reference (if needed):

#!/usr/bin/perl

use Switch;
use List::MoreUtils "uniq";
use Scalar::Util "openhandle";


my $argsize = $#ARGV + 1;
my $i, $processednums;
my $datasentinel = 0;
my @heads, @middles, @tails;


# Check the total number of files passed via command line (no more than 3), #
# open the files (when possible, or die), copy contents to arrays and close. #
# If there are < 1 or > 3 command-line arguments, print usage and exit.     #


switch ($argsize)
{
    case 3
    {
        open(NATDRAWING_DATA, "<$ARGV[2]") or die "No se puedo abrir el fichero, $!";
        chomp (@nnatdata = <NATDRAWING_DATA>);
        close (NATDRAWING_DATA);
        $datasentinel++;
        next;
    }
    case [3,2]
    {
        open(PDRAWING_DATA, "<$ARGV[1]") or die "No se pudo abrir el fichero, $!";
        chomp (@plusdata = <PDRAWING_DATA>);
        close (PDRAWING_DATA);
        $datasentinel++;
        next;
    }
    case [3,2,1]
    {
        open(NPDRAWING_DATA, "<$ARGV[0]") or die "No se pudo abrir el fichero, $!";
        chomp (@noplusdata = <NPDRAWING_DATA>);
        close (NPDRAWING_DATA);
        $datasentinel++;
    }
    else
    {
        print "\n";
        print "Uso: $0 archivo [archivo2] [archivo3]\n";
        exit;
    }
}

my $promsg = ($datasentinel == 1) ? "Primeros 3 sorteos" : ($datasentinel == 2) ? "Todos los sorteos" : "Todos los sorteos y addendum";

@totaldata = (@noplusdata, @plusdata, @nnatdata);

foreach $item (@totaldata)
{
    my $match;
    my @arr;

    $match = ($item =~ m/(\d\d)(\d\d)(\d\d)/);

    if ($match)
    {
        while ($item =~ /(\d\d)\G/)
        {
            push @arr, $1;
            pos($item)--;
        }

        ($heads[$i], $middles[$i], $tails[$i]) = @arr;
        $processednums++;
        $i++;
    }
}

if ($processednums) { ; } else { print "\nNo se han encontrado datos para processar.\n"; exit; }


Now the last statement runs and the script exits even when I provide data to process (the final message says "no data to process has been found").
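
Judging only from the code shown, one plausible cause is that the loop still guards on the six-digit pattern, so a four-digit line never reaches `$processednums++`. The sketch below checks that guard in isolation. The helper name `matches_six_digit_guard` and the sample values are illustrative assumptions, not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: applies the same guard pattern used in the loop above.
sub matches_six_digit_guard {
    my ($item) = @_;
    return $item =~ m/(\d\d)(\d\d)(\d\d)/ ? 1 : 0;
}

# Sample values are assumptions chosen for illustration.
for my $item ('1234', '123456') {
    printf "%s: %s\n", $item,
        matches_six_digit_guard($item) ? 'guard matches' : 'guard does not match';
}
```

With four-digit input the guard does not match, so the counter stays unset and the final check prints the "no data" message.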

Answer

Here's a fairly verbose example showing how you can use substr() to extract the parts of the number, while ensuring that what you're dealing with is in fact exactly a four-digit number.

use warnings;
use strict;

my ($one, $two, $three);

while (my $item = <DATA>){
    if ($item =~ /^\d{4}$/){
        $one   = substr $item, 0, 2;
        $two   = substr $item, 1, 2;
        $three = substr $item, 2, 2;
        print "one: $one, two: $two, three: $three\n";
    }
}

__DATA__
1234
abcd
a1b2c3
4567
891011

Output:

one: 12, two: 23, three: 34
one: 45, two: 56, three: 67
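
If a regex is preferred over substr(), a zero-width lookahead with a capture group can collect the overlapping pairs in one pass. This is only a sketch, assuming each input has already been chomped and should be exactly four digits. The function name `overlapping_pairs` and the sample inputs are hypothetical; `@heads`, `@middles` and `@tails` are the array names from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the overlapping two-digit pairs of a
# four-digit string, or an empty list for anything else.
# Assumes the caller has already removed any trailing newline.
sub overlapping_pairs {
    my ($item) = @_;
    return () unless defined $item && $item =~ /\A\d{4}\z/;
    # The lookahead consumes nothing, so the next attempt starts one
    # character later and the captured pairs overlap.
    return $item =~ /(?=(\d\d))/g;
}

my (@heads, @middles, @tails);

# Sample inputs are assumptions for illustration.
for my $item (qw(1234 abcd 4567 891011)) {
    my @pairs = overlapping_pairs($item) or next;   # skip non-matching lines
    push @heads,   $pairs[0];
    push @middles, $pairs[1];
    push @tails,   $pairs[2];
}

print "heads: @heads, middles: @middles, tails: @tails\n";
```

Using push instead of a shared index variable avoids having to keep `$i` in step with the three arrays.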