onlyf onlyf - 3 months ago 11
Perl Question

sed - Changing pattern with specific number of digits

I'm trying to perform a substitution on the following group of lines:

1AA20160817BBBBBDIGITS1NUMBER1STYLE59 00002200000220
1AA20160817BBBBBDIGITS2NUMBER1STYLE60 00000000000220
1AA20160817DDDDDDIGITS3NUMBER2STYLE60 00000000000486
1AA20160817DDDDDDIGITS4NUMBER2STYLE59 00004860000486
1AA20160817FFFFFDIGITS5NUMBER3STYLE602523111100000000000000
1AA20160817FFFFFDIGITS6NUMBER3STYLE59 00000820000000


I want the final output to look like this:

1AA20160817BBBBBDIGITS1NUMBER1STYLE59 00002200000220
1AA20160817BBBBBDIGITS1NUMBER1STYLE60 00000000000220
1AA20160817DDDDDDIGITS3NUMBER2STYLE60 00000000000486
1AA20160817DDDDDDIGITS3NUMBER2STYLE59 00004860000486
1AA20160817FFFFFDIGITS5NUMBER3STYLE602523111100000000000000
1AA20160817FFFFFDIGITS5NUMBER3STYLE59 00000820000000


The change is one digit, just before "NUMBER", on every second line. The patterns in the style of BBBBB/DDDDD are times, the last character being the seconds indicator.

I want it to check for a specific number of characters and perform the change there. I've written a sed command for that task, and it looks like this:

sed -i.bak "s/^\(.\{1\}\)$scenario$datein\(.\{6\}\)$pod/1$scenario$datein$timein$pod/g" $1


The rest of the code is in Perl. Could one of you help me do the same substitution in Perl? Or perhaps tell me how I can run this sed command from Perl code? My problem is that the files in question are huge, and bash takes too long to read every line and perform the substitutions. Thanks in advance.

Answer

You can tell odd lines from even ones by checking $., the number of the line last read from the most recently accessed filehandle. It is documented in perlvar.

use warnings;
use strict;

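my $set_num_to;

while (<DATA>) {
    if ($. % 2 != 0) {
        # Odd line: remember the digit that precedes NUMBER
        ($set_num_to) = m/(\d)NUMBER/;
    }
    elsif (defined $set_num_to) {
        # Even line: overwrite its digit with the remembered one
        s/\d(?=NUMBER)/$set_num_to/;
    }
    print;
}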

__DATA__
1AA20160817BBBBBDIGITS1NUMBER1STYLE59        00002200000220
1AA20160817BBBBBDIGITS2NUMBER1STYLE60        00000000000220
1AA20160817DDDDDDIGITS3NUMBER2STYLE60        00000000000486
1AA20160817DDDDDDIGITS4NUMBER2STYLE59        00004860000486
1AA20160817FFFFFDIGITS5NUMBER3STYLE602523111100000000000000
1AA20160817FFFFFDIGITS6NUMBER3STYLE59        00000820000000

The regex relies on the literal string NUMBER, as given in the example and for lack of more specifics, to locate the digit. On odd lines that digit is captured; on even lines the digit at the same position is replaced with it. The replacement uses a positive lookahead, (?=PATTERN), so NUMBER must follow the digit but is not itself part of what gets replaced.
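To make the role of the lookahead concrete, here is a small sketch comparing it with the equivalent capture-and-restore form. The variable names and the hard-coded replacement digit 1 are only for illustration.

```perl
use strict;
use warnings;

# One of the even lines from the question, used as sample input
my $line = '1AA20160817BBBBBDIGITS2NUMBER1STYLE60 00000000000220';

# Lookahead: NUMBER must follow the digit but is not consumed,
# so only the digit itself is replaced.
(my $with_lookahead = $line) =~ s/\d(?=NUMBER)/1/;

# Without lookahead: NUMBER is matched too, so it has to be
# captured and put back in the replacement.
(my $with_capture = $line) =~ s/\d(NUMBER)/1$1/;

print "$with_lookahead\n";
print "$with_capture\n";
```

Both print statements output the same line, with DIGITS1 in place of DIGITS2. The lookahead form simply avoids having to restore the text that follows the digit.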

If the position is fixed, substr can serve both purposes instead. Replace the regexes above with

# Odd lines: retrieve the character at position 22 (counted from zero)
$set_num_to = substr $_, 22, 1;
# Even lines: replace the character at that position
substr $_, 22, 1, $set_num_to;

The rest stays the same and prints lines as specified.
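Put together, the substr variant might look like the sketch below. The helper name copy_digit_to_even_lines, its arguments, and the in-memory demo are assumptions for illustration, not part of the original answer. It keeps its own line counter rather than relying on $., and it leaves a line alone if there is no digit at the expected offset.

```perl
use strict;
use warnings;

# Hypothetical helper: copy lines from $in to $out; on every even line,
# overwrite the digit at offset $pos with the digit found at the same
# offset on the preceding odd line.
sub copy_digit_to_even_lines {
    my ($in, $out, $pos) = @_;
    my $digit;
    my $line_no = 0;    # own counter, independent of $.

    while (my $line = <$in>) {
        $line_no++;
        # True only if the line is long enough and holds a digit at $pos
        my $has_digit = length($line) > $pos && substr($line, $pos, 1) =~ /\d/;

        if ($line_no % 2) {
            # Odd line: remember its digit (or forget the previous one)
            $digit = $has_digit ? substr($line, $pos, 1) : undef;
        }
        elsif ($has_digit && defined $digit) {
            # Even line: 4-argument substr replaces in place
            substr $line, $pos, 1, $digit;
        }
        print {$out} $line;
    }
    return;
}

# Demo on in-memory filehandles, using lines from the question
my $input = join '', map { "$_\n" } (
    '1AA20160817BBBBBDIGITS1NUMBER1STYLE59 00002200000220',
    '1AA20160817BBBBBDIGITS2NUMBER1STYLE60 00000000000220',
    '1AA20160817DDDDDDIGITS3NUMBER2STYLE60 00000000000486',
    '1AA20160817DDDDDDIGITS4NUMBER2STYLE59 00004860000486',
);
my $result = '';
open my $in,  '<', \$input  or die "Cannot open in-memory input: $!";
open my $out, '>', \$result or die "Cannot open in-memory output: $!";
copy_digit_to_even_lines($in, $out, 22);
close $out;
close $in;

print $result;
```

For real data, the two in-memory handles would be replaced by handles opened on the input file and on a new output file. The file is still read one line at a time, so memory use stays flat even for huge inputs.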
