smandape smandape - 3 days ago 5
Perl Question

XML parsing using Perl

I tried to research a simple question but could not find an answer. I am trying to fetch XML data from the web and parse it with Perl. I know how to loop over repeating elements: I put them in an array and read the data. But I am stuck when an element does not repeat (I know this might be silly). When there is only a single element, the code throws an error saying 'Not an array reference'. I want my code to handle both cases (single and multiple elements). The code I am using is as follows:

use LWP::Simple;
use XML::Simple;
use Data::Dumper;

open (FH, ">:utf8", "xmlparsed1.txt") or die "Cannot open xmlparsed1.txt: $!";

my $db1 = "pubmed";
my $query = "13054692";
my $q = 16354118; #for multiple MeSH terms
my $xml = XML::Simple->new;

my $urlxml = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=$db1&id=$query&retmode=xml&rettype=abstract";
my $dataxml = get($urlxml);
my $data = $xml->XMLin($dataxml);
#print FH Dumper($data);
foreach my $e (@{$data->{PubmedArticle}->{MedlineCitation}->{MeshHeadingList}->{MeshHeading}})
{
    print FH $e->{DescriptorName}{content}, ' $$ ';
}


Also, is there a way to keep the separator $$ from being printed after the last element?
I also tried the following code:

$mesh = $data->{PubmedArticle}->{MedlineCitation}->{MeshHeadingList}->{MeshHeading};
while (my ($key, $value) = each(%$mesh)) {
    print FH $value;
}


But this prints all the child nodes, and I only want the content node.

Answer

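XML::Simple returns an element that occurs once as a single value (a hash reference here, because MeshHeading has child elements) and an element that repeats as an array reference. So when there is only one MeshHeading, dereferencing it with @{...} fails with 'Not an array reference'. To make your code work, force MeshHeading to always come back as an array reference: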

$data = $xml->XMLin("$dataxml", ForceArray => [qw( MeshHeading )]);
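For the trailing separator, one option is to collect the values first and join them, so that ' $$ ' only ever appears between items. The sketch below is an illustration under assumptions, not tested against the real PubMed response: the data structures are hand-built stand-ins for what XML::Simple would return without ForceArray, and as_list and mesh_line are hypothetical helper names. It also shows an alternative to ForceArray, which is to check ref() yourself and normalize to a list.

```perl
use strict;
use warnings;

# Hand-built stand-ins (assumed shapes, not real PubMed data) for what
# XML::Simple returns without ForceArray: a single MeshHeading is a hash
# reference, several are an array reference of hash references.
my $single   = { DescriptorName => { content => 'Term A' } };
my $multiple = [
    { DescriptorName => { content => 'Term A' } },
    { DescriptorName => { content => 'Term B' } },
];

# Hypothetical helper: always hand back a plain list of MeshHeading hashes,
# whether the input is undef, one hash reference, or an array reference.
sub as_list {
    my ($node) = @_;
    return ()     unless defined $node;
    return @$node if ref $node eq 'ARRAY';
    return ($node);
}

# Hypothetical helper: pull out each content value, then join them so the
# separator is placed between items only, never after the last one.
sub mesh_line {
    my ($node) = @_;
    return join ' $$ ', map { $_->{DescriptorName}{content} } as_list($node);
}

print mesh_line($single),   "\n";
print mesh_line($multiple), "\n";
```

If you use the ForceArray option from above, MeshHeading is already an array reference, so the same join/map expression should work directly on @{ ... {MeshHeading} } without the as_list step.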