dellair dellair - 5 months ago 12
Perl Question

Perl regular expression to match embedded tag once

I have some text that I would like to match only when a tag appears exactly once.
The text is shown below (the "random chars" can be anything except tags):

<tag1><tag2><tag3>Some randome chars</tag3><tag3>Some randome chars</tag3><tag3>Some randome chars</tag3><tag3>Some randome chars</tag3></tag2></tag1>
<tag1><tag2><tag3>Some randome chars</tag3></tag2></tag1>
<tag1><tag2><tag3>Some randome chars</tag3><tag3>Some randome chars</tag3><tag3>Some randome chars</tag3></tag2></tag1>
<tag1><tag2><tag3>Some randome chars</tag3><tag3>Some randome chars</tag3></tag2></tag1>
<tag1><tag2><tag3>Some randome chars</tag3></tag2></tag1>
<tag1><tag2><tag3>Some randome chars</tag3><tag3>Some randome chars</tag3><tag3>Some randome chars</tag3><tag3>Some randome chars</tag3><tag3>Some randome chars</tag3><tag3>Some randome chars</tag3></tag2></tag1>
<tag1><tag2><tag3>Some randome chars</tag3><tag3>Some randome chars</tag3><tag3>Some randome chars</tag3><tag3>Some randome chars</tag3><tag3>Some randome chars</tag3></tag2></tag1>
<tag1><tag2><tag3>Some randome chars</tag3><tag3>Some randome chars</tag3><tag3>Some randome chars</tag3><tag3>Some randome chars</tag3></tag2></tag1>


What I want is to match a tag3 within tag2 only when it is the only tag3 inside that tag2.

For example:

<tag2><tag3>something</tag3></tag2> is matched
<tag2><tag3>something</tag3><tag3>something</tag3></tag2> isn't matched


Based on the text above, the expected output is lines 2 and 5.

The regexes I tried (neither worked):

<tag2><tag3>(.*)?</tag3></tag2>
<tag2><tag3>(.*){1}</tag3></tag2>

Answer

Your regexes didn't work because the capture group uses (.*), and . matches any character, including < and >. Since .* is greedy, it consumes as much as possible and only backs off as far as the last </tag3></tag2>, so a line with several tag3 elements still matches as one big capture. If you want to match only text that cannot include tags, match any character except the opening < of a tag.

m{<tag2><tag3>([^<]+)</tag3></tag2>}g

Try it on regex101.com.
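
To check the pattern against several lines at once, a minimal sketch could look like the following. The sample data is a shortened stand-in for the text in the question, and the helper name single_tag3_lines is hypothetical, used only for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample input, shortened from the data in the question.
my @lines = (
    '<tag1><tag2><tag3>aaa</tag3><tag3>bbb</tag3></tag2></tag1>',
    '<tag1><tag2><tag3>ccc</tag3></tag2></tag1>',
    '<tag1><tag2><tag3>ddd</tag3><tag3>eee</tag3><tag3>fff</tag3></tag2></tag1>',
    '<tag1><tag2><tag3>ggg</tag3></tag2></tag1>',
);

# Return the 1-based numbers of the lines whose <tag2> holds exactly one <tag3>.
sub single_tag3_lines {
    my @input = @_;
    my @hits;
    for my $i (0 .. $#input) {
        # [^<]+ cannot cross a tag boundary, so a second <tag3> makes the match fail.
        push @hits, $i + 1
            if $input[$i] =~ m{<tag2><tag3>([^<]+)</tag3></tag2>};
    }
    return @hits;
}

my @matched = single_tag3_lines(@lines);
print "matched lines: @matched\n";    # prints "matched lines: 2 4"
```

Note that [^<]+ requires at least one character between the tags; if an empty <tag3></tag3> should also count, [^<]* would be the variant to try.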
