Shah Ismail - 6 months ago
Bash Question

Extract multiline attribute value from HTML file

I'm writing a shell script on Mac OS X in which I need to extract the multiline value attribute of a hidden field:

type="hidden" name="PaRes" value="eJzteVdv7MaW7rsA/QfD97FhM3eThqwBc2Y3c3i5YJNs5hybv/5SksP2Pp45c2aAwTzcBgSRi1Wr
VtUK31dVb/+21dUPSzKMedv8+iP0M/jjD0kTtXHepL/+aFvcT/iP//b+ZmVDkjBmEs1D8v6mJuMY
pskPefzrj9f/CyEQgUEEAv34/nYjjWT8lNdj+vM3334e87RJ4qPJb2O9H0P9DL8Bv78eSocoC5vp
/S2MekrU3lECJkAYfQN+e3+rk0Fk3iEYQbHzBSfegC/BG/Bn19v88TQeNm55/E41ybQkrn677OJ9
361eKBfenON5wX99Az5avMXhlLzDIHQGUfj8A3T+BUN+gS5vwKf8rftQR9btfOi+nDHwDfhW8nas
xnAs1vMdh89vwB9vb8nWtU1ytDjm98fzG/CncV3YvIPf/Q7dh/TN8t7fprz+xigQ/QXDf0EPoz7l
b+MUTvP47r8Bvz29ReGyvJMkSZGGgZUr+f3vmOxnk7ckyt9B7DDq+P/Zi6zSdsinrP4w9a+CN+DD
FODTo+9v5uG+Y7Ah+eGIl2b89cdsmrpfAGBd159X5Od2SAH4mAQAEsDRID68/X9+/OqVxGLzaP+l
bnTYtE0ehVW+h9MRHGoyZW38wx+2/Z0ay/jQBAEGS/90qPopgtDmpw8JiEDYoRP4e6XfzOw/M8r3
xg5j+NOYhR+hD3yn6P3NSB7JR0QkP9iG+B9kBJOnyTj9V4b/fehvNfyuzwmrOXmX0BYBczbgOvqM
XIxwmemrZ2+BrZW//t7vq+Ub8Ie9v03my3PfrNBXQxGeMOBiSpqcCeMuCeeTWvKppluzz/CiZgxb
Cuptd3v2VgDkiVrVBSIVy01N5RVwL1FKP50sTfS0TavXF/ER3LFLoj90Wy0yFini3GA4JC0kIjrN
mQLcpYqqogvOJP1i62dWPwMVZGqbwk5P27svii7IsV9fNDt7fclxi7gkDVbYRYQ+ATqm1TzLB4BS
xQwJ0KuLLSG/zqRRplitpoKmYWXpP4JMIRmeLgSvXf1Wnx3GxLnXFwe4ogy0KwoQKXV9y6bdu48W
bTZANYj3uBfgsxit9pQ0Hapuj8aDpkdhY9P5apasZu+72kpCVgNCcipfX4ga0KV1V+QwlahTmMJr
f8WvwuSNtI6x9pVQl6omf/31m0j6zTNy8vzyhIeBBBNO4dcTnQxT/jhC+ihVqihylkXT5O6m5CpS
ZHr87aRGpWWflTlPrCBF6jZHMjQ1FayikiVPQjZLZSptgOLGFqROpZpzdLReX2g7y+4etYd89Qws
9qqS61frTeWMeqsCT994hnS/erQWAxltYFLqHY47/0kJUU1AEcNaKsXyry8fPelNlW3YecZ1VQQ2
sUY7e1epLxvITTUddhOUiqLsJ2YHngYqlQb5pUapVrRqui/JbSBmS6Qd2vSS4vKsCDwJDN2g82EO
DExsP97h0NWqCKTAkLenO080oatOASItsUfOPkxMKoV6jMVCKqOumsUiry8qw64q1x5SEfxd+pts
bHDc04G07FlAMt51RNdkdpzMzfVvaA3w5y0J8MfNyZ93Kp/3xZ/31x9XnN/ea/8/K5YUKA=="></input></td>
</tr>


I need to extract the value of the hidden field named PaRes: everything after value=" up to "></input>


I tried sed and cut commands but my results aren't very accurate. Any suggestions?

Answer

I tried a couple of solutions using sed and awk, but I couldn't make them work.

Here's a solution with perl:

match.pl

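#!/usr/bin/perl
use strict;
use warnings;

# The path to the HTML file is the first command-line argument.
my $filename = $ARGV[0];
my $content;
open(my $fh, '<', $filename) or die "cannot open file $filename: $!";
{
  local $/;            # slurp mode: read the whole file as one string
  $content = <$fh>;
}
close($fh);

# Anchor on name="PaRes" and stop at the first closing quote.
# [^"] also matches newlines, so the multiline value is captured whole
# without running on to a later quote in the file.
if ($content =~ m/name="PaRes"\s+value="([^"]*)"/) {
    my $result = $1;
    print($result);
}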

Call it from your bash script like this:

MATCH="$(perl /path/to/match.pl /path/to/file.html)"
echo "${MATCH}"

It's not the ideal solution but it works, hope it helps!