Logan Logan - 3 months ago 10
Perl Question

Multiline lookback regex



I'm working with a script that outputs a bunch of memory values to a log file and I need to extract a few specific lines that are relative to each other.

Here is a sample of a chunk of the log output:

       Pool ID      Type      Term       User/Sys     Total Size           Free
       -------      ----      ----       --------     ----------           ----
0x7FC636000000   CONTROL      LONG           USER     1609564160      335224768

     Client ID      Memory Alloc'd      (Normal/Small)    Client Name
     ---------      --------------      --------------    -----------
0x7FC636001A90             7470051       (7469056/995)    DiskControl
0x7FC6360017D8             4067072         (4067072/0)    KJS
0x7FC636001520          1158242183 (1157640768/601415)    PLU
0x7FC636001268            68499632     (68498240/1392)    Splitter
0x7FC636000FB0            36665368     (36664256/1112)    BackView





I need to extract the PLU line:

0x7FC636001520 1158242183 (1157640768/601415) PLU


I also need to extract the Pool ID

0x7FC636000000 CONTROL LONG USER 1609564160 335224768


This chunk is one of many and there is no way to identify which Pool ID to grab without knowing where the client is held (so I need to find where the PLU is first before I find the pool).

Finding the PLU line was easy:

/(.*)PLU/


But finding the pool line has proven to be much more difficult.

I've found suggestions to use a multi-line regex search, but that hasn't worked. I have also tried lookbehinds, which don't seem to work either.

Ignoring for the moment the necessary relation to the specific client and pool, I tried this for just the pool line:

/(?<=----).*(?=Client)/gm


That doesn't highlight anything on regexr.

I'd appreciate some help if anyone can give it. I'm using Perl to write this script for extracting the info (the whole infrastructure is in Perl).

Answer

It's usually a bad idea to read an entire file into memory: most of the time you then have to split it into lines to process it, so you may as well read it line by line in the first place.
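For comparison, here is a minimal sketch of what the slurp-and-regex route would take. The pattern in the question matches nothing because, without the /s modifier, . never matches a newline, and no single line contains ---- followed by Client. The pattern below is one possible way to do it, not the only one. It assumes the pool row always sits directly under the dashed rule, and the first chunk in $log is made up for illustration (only the second chunk comes from the question).

```perl
use strict;
use warnings;

# The first chunk is hypothetical; the second is the sample from the question.
my $log = <<'END_LOG';
       Pool ID      Type      Term       User/Sys     Total Size           Free
       -------      ----      ----       --------     ----------           ----
0x7FC635000000   CONTROL      LONG           USER      104857600       52428800

     Client ID      Memory Alloc'd      (Normal/Small)    Client Name
     ---------      --------------      --------------    -----------
0x7FC635001A90             7470051       (7469056/995)    DiskControl

       Pool ID      Type      Term       User/Sys     Total Size           Free
       -------      ----      ----       --------     ----------           ----
0x7FC636000000   CONTROL      LONG           USER     1609564160      335224768

     Client ID      Memory Alloc'd      (Normal/Small)    Client Name
     ---------      --------------      --------------    -----------
0x7FC636001A90             7470051       (7469056/995)    DiskControl
0x7FC636001520          1158242183 (1157640768/601415)    PLU
0x7FC636001268            68499632     (68498240/1392)    Splitter
END_LOG

# /m: ^ and $ match at line boundaries; /s: . also matches newlines;
# /x: whitespace and comments inside the pattern are ignored.
my ($pool_line, $plu_line) = $log =~ m{
    ^ \s* Pool [ ] ID [^\n]* \n           # pool header row
    [^\n]* \n                             # dashed rule under the header
    ( 0x [^\n]* ) \n                      # capture 1: the pool row
    (?: (?! ^ \s* Pool [ ] ID ) . )*?     # skip ahead, never past the next pool header
    ^ ( 0x [^\n]* \s PLU ) \s* $          # capture 2: the PLU client row
}xms or die "No pool/PLU pair found\n";

print "Pool line: $pool_line\n";
print "PLU line:  $plu_line\n";
```

The negative lookahead in the middle is what ties the PLU row to its own pool: without it, the match could start at an earlier chunk's pool row and run across later pool headers to reach PLU. That bookkeeping is the main reason the line-by-line approach is simpler.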

If I understand you correctly, you just need to remember the most recent Pool ID you have seen. Then, when you find the PLU client, the relevant Pool ID is the last one you stored.

It would look something like this:

use strict;
use warnings 'all';

my ($pool_id, $client_id);

while ( <DATA> ) {
    if ( /Pool ID/ ) {
        # Skip ahead to the first data row under the pool header and keep its ID
        while ( <DATA> ) {
            last if ($pool_id) = /^0x(\p{hex}+)/;
        }
    }
    elsif ( /\sPLU\s*$/ ) {
        # The most recently stored pool is the one that holds this client
        ($client_id) = /^0x(\p{hex}+)/;
        last;
    }
}

print "Pool ID:       $pool_id\n";
print "PLU Client ID: $client_id\n";

__DATA__
       Pool ID      Type      Term       User/Sys     Total Size           Free
       -------      ----      ----       --------     ----------           ----
0x7FC636000000   CONTROL      LONG           USER     1609564160      335224768

     Client ID      Memory Alloc'd      (Normal/Small)    Client Name
     ---------      --------------      --------------    -----------
0x7FC636001A90             7470051       (7469056/995)    DiskControl
0x7FC6360017D8             4067072         (4067072/0)    KJS
0x7FC636001520          1158242183 (1157640768/601415)    PLU
0x7FC636001268            68499632     (68498240/1392)    Splitter
0x7FC636000FB0            36665368     (36664256/1112)    BackView

output

Pool ID:       7FC636000000
PLU Client ID: 7FC636001520
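
The question asks for the whole pool and PLU lines rather than just the IDs. A minimal variant along the same lines might look like the sketch below. The helper name find_client_with_pool is made up for illustration, and it assumes that the first 0x row after a Pool ID header is the pool row. An in-memory filehandle stands in for the real log file.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (pool line, client line) for the first client
# whose name is $name, or an empty list if no such client is found.
sub find_client_with_pool {
    my ($fh, $name) = @_;
    my ($pool_line, $expect_pool);

    while ( my $line = <$fh> ) {
        chomp $line;
        if ( $line =~ /^\s*Pool ID\b/ ) {
            $expect_pool = 1;                  # next 0x row is a pool row
        }
        elsif ( $line =~ /^0x\p{XDigit}+/ ) {
            if ($expect_pool) {
                $pool_line   = $line;          # remember the latest pool
                $expect_pool = 0;
            }
            elsif ( $line =~ /\s\Q$name\E\s*$/ ) {
                return ($pool_line, $line);    # client row in the current pool
            }
        }
    }
    return;
}

# Sample chunk from the question, read through an in-memory filehandle.
my $log = <<'END_LOG';
       Pool ID      Type      Term       User/Sys     Total Size           Free
       -------      ----      ----       --------     ----------           ----
0x7FC636000000   CONTROL      LONG           USER     1609564160      335224768

     Client ID      Memory Alloc'd      (Normal/Small)    Client Name
     ---------      --------------      --------------    -----------
0x7FC636001A90             7470051       (7469056/995)    DiskControl
0x7FC6360017D8             4067072         (4067072/0)    KJS
0x7FC636001520          1158242183 (1157640768/601415)    PLU
0x7FC636001268            68499632     (68498240/1392)    Splitter
0x7FC636000FB0            36665368     (36664256/1112)    BackView
END_LOG

open my $fh, '<', \$log or die "Cannot open in-memory log: $!";
my ($pool, $client) = find_client_with_pool($fh, 'PLU');
close $fh;

die "PLU client not found\n" unless defined $client;
print "$pool\n$client\n";
```

For a real log, open the file in place of \$log. Anchoring the name with \s before it and \s*$ after it keeps a client such as PLU from matching a longer name that merely contains those letters.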