cpiock cpiock - 11 months ago 82
Groovy Question

Regex find all \n in xml tags

I need to find every \n inside the XML tags in my XML structure. There are many different tags, and the text in any of them can contain a \n.
How can I find all the \n matches?


Example: http://regex101.com/r/8hWhAX/2.
I need the regex in a Groovy script.


I need only the \n itself, not the whole string that contains it.

Edit 3

I only want to look inside the tags, not in the whole string.

Answer Source


The problem, restated: in any imperative language that can handle an XML string and regular expressions, how do you find every '\n'?

The function receives the full XML content and returns a vector of indexes of the characters found.


XML requires a parser of at least LL type (possibly LL(1); to be checked). Regular expressions are based on finite-state machines, which cannot parse an LL grammar.

You therefore need to parse the XML somehow (e.g. with a DOM library) and then run the regular expression on the text of each tag.
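
A minimal Groovy sketch of that parse-then-match approach, using the JDK's built-in DOM parser. The helper names `findNewlinesInTags` and `collect` are hypothetical, and so is the result shape (one map per hit, holding the tag name and the offset inside that tag's text). It assumes `\n` means a real newline character and that whitespace-only text between tags (indentation) should be ignored; adjust both if your data differs.

```groovy
import javax.xml.parsers.DocumentBuilderFactory
import org.w3c.dom.Node
import org.xml.sax.InputSource

// Hypothetical helper: parse the XML, then look for newlines in each tag's text.
// Returns one map per newline: [tag: <element name>, offset: <index in that text node>].
List<Map> findNewlinesInTags(String xml) {
    def builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    def doc = builder.parse(new InputSource(new StringReader(xml)))
    def hits = []
    collect(doc.documentElement, hits)
    return hits
}

// Walk the DOM depth-first and apply the regex to every text node.
void collect(Node element, List<Map> hits) {
    def children = element.childNodes
    for (int i = 0; i < children.length; i++) {
        Node child = children.item(i)
        if (child.nodeType == Node.ELEMENT_NODE) {
            collect(child, hits)
        } else if (child.nodeType == Node.TEXT_NODE || child.nodeType == Node.CDATA_SECTION_NODE) {
            String text = child.nodeValue
            // Assumption: skip indentation-only text between tags.
            if (text.trim().isEmpty()) continue
            // /\n/ matches a real newline; use /\\n/ for a literal backslash followed by n.
            def matcher = (text =~ /\n/)
            while (matcher.find()) {
                hits << [tag: element.nodeName, offset: matcher.start()]
            }
        }
    }
}

def sample = '<root>\n  <a>x\ny</a>\n  <b>no newline</b>\n  <c>1\n2\n3</c>\n</root>'
findNewlinesInTags(sample).each { println it }
```

The offsets here are relative to each tag's text, not to the full XML string, because a DOM parser does not keep the original character positions. If you need indexes into the raw input, you would need a parser that reports locations instead.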

References: https://en.wikipedia.org/wiki/Chomsky_hierarchy