Zesa Rex - 2 months ago
Perl Question

how to make grep use a "from X to Y" syntax? (Using date as parameter)

I would like to write a script that scans through folders and files and copies certain lines of these files into another file.

How can I make the script match a specified range of dates instead of a single date?

The code I need to change looks like this:

$curdir = "$scriptdir\\$folder";
opendir my $dir_b, "$curdir" or die "Can't open directory: $!";
my @file = grep { /$timestamp/ } readdir $dir_b;
closedir $dir_b;


Line 3 (the grep) would need to work something like this:

my @file = grep { /$timestamp1 to $timestamp2/ } readdir $dir_b;


Does anyone know how to achieve this? For example, timestamp1 would be 20160820 and timestamp2 would be 20160903 or 20160830.

Thanks for the help.

Answer

You can use Regexp::Assemble to build one pattern out of all timestamps that are in the range of your dates.

use strict;
use warnings;
use Regexp::Assemble;

my $timestamp_late  = 20160830;
my $timestamp_early = 20160820;

my $ra = Regexp::Assemble->new;
$ra->add( $_ ) for $timestamp_early .. $timestamp_late;

print $ra->re;

The pattern for that case is: (?^:201608(?:2\d|30))

You can now use it like this:

my $pattern = $ra->re;
my @files = grep { /$pattern/ } readdir $dir_b;

It works by combining multiple patterns into a single one.

Regexp::Assemble takes an arbitrary number of regular expressions and assembles them into a single regular expression (or RE) that matches all that the individual REs match.

As a result, instead of having a large list of expressions to loop over, a target string only needs to be tested against one expression. This is interesting when you have several thousand patterns to deal with. Serious effort is made to produce the smallest pattern possible.
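To see the "one pattern instead of many" idea without installing anything, here is a minimal sketch that joins the same range into a plain alternation using core Perl only. It is not what Regexp::Assemble does internally (the module additionally factors out common prefixes to keep the pattern small), and the file names in @names are made up to stand in for the readdir result.

```perl
use strict;
use warnings;

my ( $timestamp_early, $timestamp_late ) = ( 20160820, 20160830 );

# One alternative per number in the range: 20160820|20160821|...|20160830
my $alternation = join '|', $timestamp_early .. $timestamp_late;
my $pattern     = qr/(?:$alternation)/;

# Hypothetical file names, standing in for what readdir would return
my @names = qw(
    order_20160819.txt
    order_20160820.txt
    order_20160825.log
    order_20160831.txt
);

# A single match per name instead of a loop over eleven separate patterns
my @files = grep { /$pattern/ } @names;
print "$_\n" for @files;
```

With these sample names, only order_20160820.txt and order_20160825.log fall inside the range. For a handful of dates the plain alternation is fine; the assembled pattern starts to pay off when the list grows long.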

Our patterns here are rather simple (they are just strings), but it works nonetheless. The resulting pattern works like this:

(?^:                ) # non-capture group w/o non-default flags for the sub pattern
    201608            # literal 201608
          (?:      )  # non-capture group
             2\d      # literal 2 followed by a digit (0-9)
                |     # or
                 30   # literal 30

The (?^:) construct is explained in the "Extended Patterns" section of perlre.
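A quick way to convince yourself of that breakdown is to write the pattern out by hand and try it on a few names. This is only a sketch: the file names are invented, and the pattern is typed in rather than produced by Regexp::Assemble.

```perl
use strict;
use warnings;

# The assembled pattern from above, written out by hand
my $pattern = qr/201608(?:2\d|30)/;

# On Perl 5.14 or newer, a compiled regex stringifies with the (?^: ... ) wrapper
print "$pattern\n";

# Hypothetical file names; only the embedded date matters here
my @names = qw(
    log_20160819.txt
    log_20160820.txt
    log_20160829.txt
    log_20160830.txt
    log_20160831.txt
);

for my $name (@names) {
    printf "%-18s %s\n", $name, ( $name =~ $pattern ? 'match' : 'no match' );
}
```

The names dated 20160820, 20160829 and 20160830 match; 20160819 and 20160831 do not. Note that the pattern is not anchored, so it would also match a longer run of digits that merely contains one of these dates. If your file names can contain such runs, consider adding anchors or separators around the pattern.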

If you pass in more numbers, the resulting regex will look different. Of course the range operator is not meant for dates: it simply produces every integer between its two end points. The .. is the range operator, and 1 .. 9, for example, returns the list (1, 2, 3, 4, 5, 6, 7, 8, 9).
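The effect is easy to see once a range crosses a month boundary. The following sketch uses two end points chosen purely for illustration and counts what the range operator produces.

```perl
use strict;
use warnings;

# Four calendar days (30 Aug to 2 Sep), but the range operator only sees integers
my @numbers = 20160830 .. 20160902;

printf "%d numbers in the range\n", scalar @numbers;
print "first few: @numbers[0 .. 3]\n";

# Everything with a "day" above 31 or equal to 00 cannot be a real date
my @impossible = grep { my $day = $_ % 100; $day > 31 || $day == 0 } @numbers;
printf "%d of them cannot be dates\n", scalar @impossible;
```

The list has 73 entries, starting 20160830, 20160831, 20160832, 20160833, and 69 of them have an impossible day part. Whether that matters depends on whether files with such names can exist in the directory at all.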

If you want to make sure that you only get valid dates, you could generate the dates with a date module instead of the range operator. Here's an example using DateTime.

use strict;
use warnings;
use Regexp::Assemble;
use DateTime;

my $timestamp_late  = DateTime->new( year => 2016, month => 9, day => 1 );
my $timestamp_early = DateTime->new( year => 2016, month => 8, day => 19 );    # one day before the real start (20160820), because the loop adds a day before comparing

my $ra = Regexp::Assemble->new;
while ( $timestamp_early->add( days => 1 ) <= $timestamp_late ) {
    $ra->add( $timestamp_early->ymd(q{}) );
}

print $ra->re;

This range crosses into the next month and gives

(?^:20160(?:8(?:3[01]|2\d)|901))

which only matches real dates, while the other, simpler solution includes all numbers between them, including the 99th of August:

(?^:20160(?:8(?:2\d|3\d|4\d|5\d|6\d|7\d|8\d|9\d)|90[01]))
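If installing DateTime is not an option, the same list of real dates can be produced with modules that ship with Perl. The sketch below swaps in Time::Piece and Time::Seconds for DateTime and a plain alternation for Regexp::Assemble, so it is a variation on the answer rather than the answer's own code. The end points are the same as in the DateTime example.

```perl
use strict;
use warnings;
use Time::Piece;
use Time::Seconds;    # provides the ONE_DAY constant

# Same range as in the DateTime example; strptime returns UTC objects,
# so adding whole days is not affected by daylight saving time
my $early = Time::Piece->strptime( '20160820', '%Y%m%d' );
my $late  = Time::Piece->strptime( '20160901', '%Y%m%d' );

# Walk the calendar one day at a time and collect YYYYMMDD strings
my @dates;
for ( my $day = $early; $day <= $late; $day = $day + ONE_DAY ) {
    push @dates, $day->strftime('%Y%m%d');
}

# Plain alternation of the real dates only
my $alternation = join '|', @dates;
my $pattern     = qr/(?:$alternation)/;

print scalar(@dates), " dates from $dates[0] to $dates[-1]\n";
```

This yields 13 dates, 20160820 through 20160901, and the resulting $pattern can be used in the grep exactly like the assembled one. A number such as 20160832 is never generated, so it cannot match.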