Arunesh Singh Arunesh Singh - 3 months ago 8
Perl Question

Perl: Formatting output into tabular form from complex hash

I have a structure Like below.

my %hash = (
'P' => {
'4' => [
'1/4/1 2',
'1/4/1 3'
],
'2' => [
'1/2/1 1',
'1/2/5 4'
]
},
'Q' => {
'4' => [
'1/4/1 3'
],
'3' => [
'1/3/1 1'
],
'5' => [
'1/5/1 1'
]
},
);


I want to format it like.

Node 2 3 4 5

P 1/2/1 1 1/4/1 2
P 1/2/5 4 1/4/1 3
Q 1/3/1 1 1/4/1 3 1/5/1 1


I am currently trying for it. Finally, I need to put it into excel. I guess something like this may work with some tweaks.

my %seen;
my @headers = sort { $a <=> $b } grep {!$seen{$_}++} map{ keys %{$hash{$_}} }keys %hash;
print "node", ' ' x 30, join(" " x 10, @headers),"\n";

foreach my $node (keys %hash) {
my @values;
foreach my $num (keys %{$hash{$node}}) {
foreach my $klm (@{$hash{$node}{$num}}) {
push (@values,$klm);
}
}
foreach my $i (0 .. $#values) {
print "$node $values[$i]\n";
}
}


Further suggestions are appreciated.

Answer

I've done it with an Excel file directly because I think it's easier to work with rows and cols and target individual fields than messing with output directly.

use strict;
use warnings 'all';
use Excel::Writer::XLSX;

my %hash = (
    'P' => {
        '4' => [ '1/4/1 2', '1/4/1 3' ],
        '2' => [ '1/2/1 1', '1/2/5 4' ]
    },
    'Q' => {
        '4' => ['1/4/1 3'],
        '3' => ['1/3/1 1'],
        '5' => ['1/5/1 1']
    },
);
my %seen;
my @headers = sort { $a <=> $b } 
    grep { !$seen{$_}++ } 
    map { keys %{ $hash{$_} } } 
    keys %hash;

# count how many rows each node has
my %number_of_rows;
foreach my $node ( keys %hash ) {
COUNT_ROWS: foreach my $header ( keys %{ $hash{$node} } ) {
        $number_of_rows{$node} = scalar @{ $hash{$node}->{$header} };
        last COUNT_ROWS;
    }
}

my $workbook  = Excel::Writer::XLSX->new('test.xlsx');
my $worksheet = $workbook->add_worksheet;

# write the headers and save the cols
my %col_headings;
$worksheet->write( 0, 0, 'Node' );
for ( my $i = 0; $i <= $#headers; $i++ ) {
    $col_headings{ $headers[$i] } = $i + 1;
    $worksheet->write( 0, $i + 1, $headers[$i] );
}

my $row      = 1; # overall row in the Excel file
my $node_row = 0; # current row for the current node
NODE: foreach my $node ( sort keys %hash ) {

    # write the node value (letter P, Q)
    $worksheet->write( $row, 0, $node );

    # iterate the columns ...
    foreach my $header (@headers) {
        # ... but only write one that has a value
        $worksheet->write( 
            $row, 
            $col_headings{$header}, 
            $hash{$node}->{$header}->[$node_row] 
        ) if exists $hash{$node}->{$header};
    }

    $row++; # always go to a new row

    if ( ++$node_row < $number_of_rows{$node} ) {
        # but if we have not done all the rows for the current node,
        # redo this node in a new row with the next node_row
        redo NODE;
    }
    else {
        # or reset the node_row
        $node_row = 0;
    }
}

This will use your code to find all the headers. It assigns which header has which column, and counts the rows per node. It then iterates the nodes, using redo to keep going with one node until all the rows of that node have been exhausted. It does that by keeping a counter of the current node-row that gets incremented. At the same time we always increment the overall row counter so it moves down in the table.

Here is what it looks like.

screenshot of open office

The code is a bit messy, but it gets the job done. It looks like this is a run-once-and-forget kind of task anyway.

Comments