paul-g - 2 months ago
Perl Question

Parsing and tabulating results

What is a simple and flexible way to parse and tabulate output like the one below on a Unix system?

The output has multiple entries in the following format:

=====================================================
====== SOLVING WITH MATRIX small_SPD ====
===================================================

sizes: 5,5,8

Solving with Sparse LU AND COLAMD ...
COMPUTE TIME : 8.9287e-05
SOLVE TIME : 1.0663e-05
TOTAL TIME : 9.995e-05
REL. ERROR : 2.30263e-18


Solving with BiCGSTAB ...
COMPUTE TIME : 4.113e-06
SOLVE TIME : 1.853e-05
TOTAL TIME : 2.2643e-05
REL. ERROR : 1.34364e-10

ITERATIONS : 2


This should be tabulated as (or similar):

Matrix     Sizes  Solver                Compute     Solve       Total       Rel Error
small_SPD  5,5,8  Sparse LU AND COLAMD  8.9287e-05  1.0663e-05  9.995e-05   2.30263e-18
small_SPD  5,5,8  BiCGSTAB              4.113e-06   1.853e-05   2.2643e-05  1.34364e-10

Answer

If you're just parsing an output, I'd tackle it like this:

#!/usr/bin/env perl
use strict;
use warnings;

#set paragraph mode - look for empty lines between records.
local $/ = '';

#init the matrix/size vars. 
my $matrix;
my $sizes;

#output order    
my @columns = qw ( solver COMPUTE SOLVE TOTAL REL );

#Column headings.
#join only the fields, then append the newline, so no stray tab ends the line.
print join( "\t", "matrix", "sizes", "solver", "Compute", "Solve", "Total",
   "Rel Error" ), "\n";

#iterate the data. 
#note - <> is a magic file handle that reads STDIN or 'files specified on command line'
#that's just like how sed/grep/awk do it. 
while (<>) {
   #find and set the matrix name
   #note conditional - this only appears in the 'first' record. 
   if (m/MATRIX (\w+)/) {
      $matrix = $1;
   }
   #find and set the sizes. 
   if (m/sizes: ([\d\,]+)/) {
      $sizes = $1;
   }
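   #multi-line pattern match to grab keys and values.
   #in list context, /g returns every (key, value) capture pair, which maps neatly into a hash.
   #the character class also accepts '+' and 'E' so values such as 1.2e+03 are not cut short.
   my %result_set = m/^(\w+).*: ([\d.eE+\-]+)/gm;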

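   #add the solver to the 'set',
   #and use this test to check if this 'record' is of interest:
   #records without a "Solving with" line (the banner, sizes and ITERATIONS) are skipped.
   #the dots are escaped so they match a literal "..." rather than any three characters.
   ( $result_set{'solver'} ) = m/Solving with (.*) \.\.\./ or next;
   #print it tab separated.
   print join( "\t", $matrix, $sizes, @result_set{@columns} ), "\n";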
}

Output:

matrix  sizes   solver  Compute Solve   Total   Rel Error   
small_SPD   5,5,8   Sparse LU AND COLAMD    8.9287e-05  1.0663e-05  9.995e-05   2.30263e-18 
small_SPD   5,5,8   BiCGSTAB    4.113e-06   1.853e-05   2.2643e-05  1.34364e-10 

The output is tab separated, which is handy for feeding into other tools, but if you want aligned columns for reading you might prefer printf or format instead.
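
If you go the printf route, something along these lines might do. It is only a sketch: the rows are hard-coded from the sample output above rather than parsed, and column_widths and format_row are helper names I made up for illustration. It pads every cell to the width of the widest entry in its column.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Sample rows copied from the output above (assumption: in a real script
# these would be collected inside the while loop instead of printed there).
my @rows = (
    [ 'Matrix', 'Sizes', 'Solver', 'Compute', 'Solve', 'Total', 'Rel Error' ],
    [ 'small_SPD', '5,5,8', 'Sparse LU AND COLAMD',
      '8.9287e-05', '1.0663e-05', '9.995e-05', '2.30263e-18' ],
    [ 'small_SPD', '5,5,8', 'BiCGSTAB',
      '4.113e-06', '1.853e-05', '2.2643e-05', '1.34364e-10' ],
);

# Hypothetical helper: width of the widest cell in each column.
sub column_widths {
    my @table = @_;
    my @widths;
    for my $row (@table) {
        for my $i ( 0 .. $#$row ) {
            my $len = length $row->[$i];
            $widths[$i] = $len if !defined $widths[$i] || $len > $widths[$i];
        }
    }
    return @widths;
}

# Hypothetical helper: left-justify each cell to its column width,
# separate columns with two spaces, and trim padding after the last cell.
sub format_row {
    my ( $row, @widths ) = @_;
    my $line = join '  ',
        map { sprintf '%-*s', $widths[$_], $row->[$_] } 0 .. $#$row;
    $line =~ s/\s+$//;
    return $line;
}

my @widths = column_widths(@rows);
print format_row( $_, @widths ), "\n" for @rows;
```

The catch is that the widths are only known once all rows have been read, so the rows have to be buffered first. For very large outputs, fixed widths in a plain printf format string would avoid that.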
