goodman - 5 months ago
Java Question

remove numeric xml tag using java

I have the following xml:

<?xml version=\"1.0\"?>
<TITLE>A Sample Article</TITLE>
<SECT>The First Major Section <PARA>This section will introduce a subsection.</PARA>
<SECT>The Subsection Heading <PARA>This is the text of the subsection. </PARA>

I want to remove the numeric tags "<1>" and "<2>" using Java.

Parsers won't work because this is invalid XML. I need another solution, such as a regular expression, or any other idea.


You can just use the replaceAll method.

String str = "YOUR XML HERE";
str = str.replaceAll("<[12]>", "");

Or, as Boheamian pointed out in his comment, you can use the \d shorthand for digits:

str = str.replaceAll("<\\d>", "");

Btw, if you have more than <1> and <2>, that is, <n> where n is any number, then you could use:

str = str.replaceAll("<\\d+>", "");
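
For completeness, here is a minimal, self-contained sketch of the last variant. The class name, the method name, and the sample input (including where the numeric tags sit) are assumptions for illustration, not taken from the question. It precompiles the pattern, which may be worth doing if you strip many strings:

```java
import java.util.regex.Pattern;

class NumericTagStripper {
    // Matches "<", one or more digits, then ">" (e.g. <1>, <2>, <42>).
    private static final Pattern NUMERIC_TAG = Pattern.compile("<\\d+>");

    // Returns the input with every purely numeric tag removed.
    static String stripNumericTags(String xml) {
        return NUMERIC_TAG.matcher(xml).replaceAll("");
    }

    public static void main(String[] args) {
        // Hypothetical input: the placement of <1> is assumed.
        String input = "<SECT><1>The First Major Section <PARA>Intro.</PARA>";
        System.out.println(stripNumericTags(input));
        // prints: <SECT>The First Major Section <PARA>Intro.</PARA>
    }
}
```

Tags such as <SECT> or <PARA1> are left alone, because the pattern only matches when everything between the angle brackets is a digit. If closing forms like </1> can also occur, "</?\\d+>" should cover both.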