bademeister - 1 year ago
C++ Question

Regex backreference not working

I want to match this html-like pattern:

<12>Some content with \n in it<12>

It is important that only complete items are marked (the numbers MUST match): when one tag is missing, the content should not be marked.
<12>Some content with \n in it<13>test<13>

This is what I've got so far:


This is what I expected to work, but it does not:


I tried it in this editor, but the backreference does not work as I expect. Why does the backreference to the first capture group not work? The solution should work in C++.

Answer Source

Try this:

<(\d+)>.*?<\1>



< matches the character < literally
(\d+) is the 1st capturing group
    \d matches a digit (equal to [0-9])
    + is a quantifier: matches between one and unlimited times (greedy)
> matches the character > literally
. matches any character (except for line terminators)
    *? is a quantifier: matches between zero and unlimited times (lazy)
< matches the character < literally
\1 matches the same text as most recently matched by the 1st capturing group
> matches the character > literally

C++14 Code Sample:

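#include <regex>
#include <string>
#include <iostream>
using namespace std;

int main()
{
    // \1 must repeat the number captured by (\d+); group 2 is the content.
    const regex regx(R"(<\s*(\d+)\s*>(.*?)<\s*\1\s*>)");
    string input = "<1>test1<1><2>Test2<2>sfsaf<3><4>test4<4>";
    smatch matches;
    while (regex_search(input, matches, regx))
    {
        cout << matches[2] << endl;       // content between the matching tags
        input = matches.suffix().str();   // continue after this match
    }
    return 0;
}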
