Piodo - 7 months ago
Perl Question

How can I find sentences nested deeper than one bracket '()' set?

I want to print the sentences from a text file that are nested inside more than one pair of '()' brackets.

For example, for this text file:

blabla(nothing(print me)) nanana (nanan)
blablabla(aaaaaaa(eeee(bbbb(cccc)bbb))aa)
blabla (blabla(hhhhh))


The output should be:

print me
eeee(bbbb(cccc)bbb)
bbbb(cccc)bbb
cccc
hhhhh


This is what I've done so far:

#!/usr/bin/perl -w

if ( @ARGV ) # if there are args
{
    if ( -f $ARGV[0] ) # if it's a regular file
    {
        # open the file only after the argument has been checked
        open(FILE, "<", $ARGV[0]) or die "file open error: $!";

        while (<FILE>)
        {
            # split on ')' and print what follows the last '(' in each piece
            my @array = split('\)', $_);
            foreach (@array)
            {
                if ($_ =~ /.*\((.*)/)
                {
                    print "$1\n";
                }
            }
        }
        close(FILE);
    }
    else
    {
        print "Arg is not a file\n";
    }
}
else
{
    print "no args\n";
}


My code can't separate the sentences placed in deeper brackets.

Answer
use strict;
use warnings;

my @a;

while (<DATA>) {
    # the loop body is empty: the (?{...}) block inside the pattern does the work
    while (/\(([^()]*(?:\(((?1))\)[^()]*(?{push @a, $2}))*+)\)/g){}
}

print join("\n", @a), "\n";

__DATA__
blabla(nothing(print me)) nanana (nanan)
blablabla(aaaaaaa(eeee(bbbb(cccc)bb(xxxx)b))aa)
blabla (blabla(hhhhh))

It returns:

print me
cccc
xxxx
bbbb(cccc)bb(xxxx)b
eeee(bbbb(cccc)bb(xxxx)b)
hhhhh

The idea is to store the capture group 2 content after each recursion, using the (?{...}) construct to execute code in the pattern.

Note that the order of results isn't ideal since the innermost content appears first. Unfortunately, I didn't find a way to change the order of results.
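If the order matters, one possible workaround is to drop the recursive regex and track bracket positions with a stack instead. This is a different technique from the pattern above, and only a sketch: the helper name nested_in_order is made up for illustration, and it assumes the brackets are balanced on each line. It records the start offset of each nested pair and sorts by that offset, so outer contents come before the contents nested inside them.

```perl
use strict;
use warnings;

# nested_in_order is a hypothetical helper name used for this sketch.
# It returns the content of every bracket pair that sits inside another
# pair, ordered by the position of its opening bracket.
sub nested_in_order {
    my ($line) = @_;
    my @open;     # stack of offsets just past each unmatched '('
    my @spans;    # [start offset, content] for each nested pair

    while ($line =~ /([()])/g) {
        my $pos = pos($line);    # offset just past the bracket
        if ($1 eq '(') {
            push @open, $pos;
        }
        elsif (@open) {
            my $start = pop @open;
            # a non-empty stack after the pop means this pair is enclosed
            # by at least one other pair, so it is nested
            push @spans, [ $start, substr($line, $start, $pos - 1 - $start) ]
                if @open;
        }
    }

    # sort by start offset so outer contents come first
    return map { $_->[1] } sort { $a->[0] <=> $b->[0] } @spans;
}

my @lines = (
    'blabla(nothing(print me)) nanana (nanan)',
    'blablabla(aaaaaaa(eeee(bbbb(cccc)bbb))aa)',
    'blabla (blabla(hhhhh))',
);
print "$_\n" for map { nested_in_order($_) } @lines;
```

On the three sample lines from the question this prints print me, eeee(bbbb(cccc)bbb), bbbb(cccc)bbb, cccc and hhhhh, one per line, which is the order the question asks for. A stray ')' with no matching '(' is skipped.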

Pattern details:

\(  # opening bracket level 1
(   # open capture group 1
    [^()]*        # all that is not a bracket
    (?:
        \(        # opening bracket for level 2 (or more when a recursion occurs)
        (         # capture group 2: to store the result
            (?1)  # recursion
        )
        \)        # closing bracket for level 2 (or more ...)
        [^()]*    # all that is not a bracket
        (?{push @a, $2}) # store the capture group 2 content in @a
    )*+ # repeat when needed
)
\) # closing bracket level 1
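
The annotated layout above can be used more or less as written by adding the /x modifier, which makes Perl ignore whitespace and # comments inside the pattern. The following is a sketch of that, not a drop-in replacement: the wrapper name nested_contents and the package variable @found are assumptions introduced here. The comments inside the pattern avoid the characters $, @ and /, because Perl would otherwise try to interpolate variables or end the pattern early.

```perl
use strict;
use warnings;

our @found;    # filled by the code block inside the pattern

# nested_contents is a hypothetical wrapper name, not part of the answer.
# It returns the nested bracket contents in the order the pattern records
# them, innermost first.
sub nested_contents {
    my ($text) = @_;
    local @found = ();

    while ($text =~ /
        \(                  # opening bracket, level 1
        (                   # capture group 1
            [^()]*          # all that is not a bracket
            (?:
                \(          # opening bracket, level 2 or deeper
                ( (?1) )    # capture group 2 wraps the recursion into group 1
                \)          # matching closing bracket
                [^()]*      # all that is not a bracket
                (?{ push @found, $2 })
            )*+             # possessive repeat
        )
        \)                  # closing bracket, level 1
    /gx) { }

    return @found;
}

print "$_\n" for nested_contents('blabla (blabla(hhhhh))');
```

A package variable with local is used here rather than a lexical so that the code block also behaves on older Perls, where lexicals captured by (?{...}) inside a subroutine were unreliable.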