saumya saumya - 5 months ago 11
Perl Question

How to print an XML tag's attribute data and tag value sequentially in Perl?



Say I have an example XML document:

<root>
<subnode1 att1="sn1att1" att2="sn1att2">Subnode 1</subnode1>
<subnode2 att1="sn2att1" att2="sn2att2">Subnode 2</subnode2>
<subnode3 att1="sn3att1" att2="sn3att2">
<subnode31 att1="sn31att1" att2="sn31att2">
<subnode311 att1="sn311att1" att2="sn311att2">
<subnode3111 att1="sn3111att1" att2="sn3111att2">Subnode 3-111</subnode3111>
</subnode311>
</subnode31>
<subnode32 att1="sn32att1" att2="sn32att2">Subnode 3-2</subnode32>
</subnode3>
</root>


I want to print something like this:

sn1att1 sn1att2 Subnode 1
sn2att1 sn2att2 Subnode 2
sn3att1 sn3att2 Subnode 3
sn31att1 sn31att2
sn311att1 sn311att2
sn3111att1 sn3111att2 Subnode 3-111


I have written the code below. It prints the attributes as described, but it does not print the tag value (for example "Subnode 1", "Subnode 2", etc.).

use XML::XPath;
use XML::XPath::XMLParser;

my $xp = XML::XPath->new( filename => 'raw1.xml' );

for my $node ( $xp->findnodes('*/*') ) {
    print "\n" . $node->getName . "\t";

    for my $attribute ( $node->getAttributes ) {
        print " " . $attribute->getData;
    }

    for my $property ( $node->findnodes('.//*') ) {
        print "\n" . $property->getName . "\t";
        for my $attributes ( $property->getAttributes ) {
            print " " . $attributes->getData;
        }
    }
}

Answer

I think the code below does what you want.

I'm not very familiar with XML::XPath, but I do know XPath.

It looks like, for each element in the XML, you want to print a line that contains the values of all of its attributes, followed by the values of its child text nodes if there are any.

That's not as simple as it may seem, because any element may contain multiple text children interspersed with multiple child elements.
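To illustrate the point about mixed content, here is a minimal sketch. The `<p>` document is a made-up example, not part of the question's data, and it is passed inline through the `xml` option instead of being read from a file. The `text()` step selects each text child separately, so a single element can yield more than one value.

```perl
use strict;
use warnings;

use XML::XPath;

# Hypothetical mixed-content document: <p> has a text child,
# then a <b> element, then another text child
my $xp = XML::XPath->new( xml => '<p>Hello <b>bold</b> world</p>' );

# text() matches only the direct text children of <p>,
# not the text inside <b>
my @text = map { $_->getData } $xp->findnodes('/p/text()');

print scalar(@text), " text children\n";
print "[$_]\n" for @text;
```

The brackets in the output make any leading or trailing whitespace in each text child visible.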

My solution accumulates the values of all attributes and all non-blank text children into the array @line, and prints the line if the result isn't empty.

I don't understand why your required output doesn't include this line from my output:

sn32att1 sn32att2 Subnode 3-2

Perhaps you could explain?

use strict;
use warnings 'all';

use XML::XPath;

my $xp = XML::XPath->new( filename => 'raw1.xml' );

for my $node ( $xp->findnodes('//*') ) {

    my @line;

    for my $attr ( $node->getAttributes ) {
        push @line, $attr->getData;
    }

    my @text = grep /\S/, map { $_->getData } $node->findnodes('text()');

    push @line, @text;

    print "@line\n" if @line;
}

output

sn1att1 sn1att2 Subnode 1
sn2att1 sn2att2 Subnode 2
sn3att1 sn3att2
sn31att1 sn31att2
sn311att1 sn311att2
sn3111att1 sn3111att2 Subnode 3-111
sn32att1 sn32att2 Subnode 3-2
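
Lines such as the third one carry no text, presumably because the only text children of those elements are the whitespace between their child elements, which the `grep /\S/` filter discards. A minimal sketch of that filter in isolation follows. The strings in `@children` are hypothetical stand-ins for the `getData` values of an element's text children, not real XML::XPath nodes.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the getData values of an element's text children
my @children = ( "\n", "Subnode 3-2", "\n    ", " " );

# Keep only the strings that contain at least one non-whitespace character
my @text = grep { /\S/ } @children;

print scalar(@text), " non-blank value(s): @text\n";
```

Whitespace-only strings never match `/\S/`, so an element whose text children are all indentation or newlines contributes only its attribute values to the line.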