The application has an XML string, and I want to remove the <product_desc></product_desc> elements from it with Regex.
The string:
<orderlines>
<orderline>
<id>1000001</id>
<product_id>2004</product_id>
<product_desc>ITEM2004
Color: red
Size: 150x10x10
Material: iron
</product_desc>
<qnt>2</qnt>
</orderline>
<orderline>
<id>1000002</id>
<product_id>2012</product_id>
<product_desc>ITEM2012</product_desc>
<qnt>4</qnt>
</orderline>
<orderline>
<id>1000003</id>
<product_id>3000</product_id>
<product_desc>DELIVERY</product_desc>
<qnt>1</qnt>
</orderline>
</orderlines>
The code:
Dim pattern As String = "(<product_desc>[\s\S]*</product_desc>)"
Dim newvalue As String = Regex.Replace(originvalue, pattern, "")
The result:
<orderlines>
<orderline>
<id>1000001</id>
<product_id>2004</product_id>
<qnt>1</qnt>
</orderline>
</orderlines>
Regex removes everything from the first <product_desc> to the last </product_desc>, including the <orderline> and <qnt> elements in between.
The problem: [\s\S]* is greedy
It matches every single char to the end of the string, then the engine backtracks to allow </product_desc> to match. Therefore, there is one single match, from the first opening tag to the last closing tag.
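To see the difference in match counts, here is a minimal sketch. The shortened input string and the module and variable names are hypothetical, made up for illustration; only Regex.Matches and the two patterns come from the discussion above.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module GreedyVsLazyDemo
    Sub Main()
        ' Hypothetical shortened input with two product_desc elements.
        Dim input As String =
            "<a><product_desc>X</product_desc><qnt>2</qnt><product_desc>Y</product_desc></a>"

        ' Greedy: [\s\S]* runs to the end of the string, then backtracks
        ' to the LAST closing tag.
        Dim greedyMatches As MatchCollection =
            Regex.Matches(input, "<product_desc>[\s\S]*</product_desc>")

        ' Lazy: [\s\S]*? stops at the FIRST closing tag it can reach.
        Dim lazyMatches As MatchCollection =
            Regex.Matches(input, "<product_desc>[\s\S]*?</product_desc>")

        Console.WriteLine(greedyMatches.Count) ' prints 1
        Console.WriteLine(lazyMatches.Count)   ' prints 2
    End Sub
End Module
```

The single greedy match also swallows the <qnt>2</qnt> element sitting between the two descriptions, which is the same effect seen in the question's result.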
The solution (if we're doing regex): a lazy quantifier
With all the warnings and disclaimers about using regex to parse XML, you can do this: adding a ? to a quantifier makes it "lazy", so that it matches only as many chars as necessary. Use .*? in DOTALL mode (as in the sample code below) or [\s\S]*? (but there is no point).
Sample code
Dim ResultString As String
Try
ResultString = Regex.Replace(SubjectString, "(?s)<product_desc>.*?</product_desc>", "")
Catch ex As ArgumentException
'Syntax error in the regular expression
End Try
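Given the warnings about parsing XML with regex, the same removal can also be sketched with an XML parser. This is an alternative technique, not the regex approach above; it assumes the input is well-formed XML with a single root element, and the module name and shortened input are hypothetical.

```vbnet
Imports System
Imports System.Xml.Linq

Module RemoveProductDescDemo
    Sub Main()
        ' Hypothetical shortened input; assumed to be well-formed XML.
        Dim xml As String =
            "<orderlines><orderline><id>1000001</id>" &
            "<product_desc>ITEM2004</product_desc><qnt>2</qnt></orderline></orderlines>"

        Dim doc As XDocument = XDocument.Parse(xml)

        ' Remove every product_desc element, wherever it appears in the tree.
        doc.Descendants("product_desc").Remove()

        ' prints <orderlines><orderline><id>1000001</id><qnt>2</qnt></orderline></orderlines>
        Console.WriteLine(doc.ToString(SaveOptions.DisableFormatting))
    End Sub
End Module
```

Unlike the regex, XDocument.Parse throws an XmlException on malformed input, so this route only fits if the string is reliably valid XML.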