DanielH - 1 month ago
C++ Question

Regex to match against legal citation convention

I have scoured this site and others for the solution to the following problem to no avail.

I am working with the following code in Xcode 8:

#include <iostream>
#include <string>
#include <regex>

int main ()
{
std::string s ("This string contains a neutral citation of [2015] EWCA Civ 123");
std::smatch m;
std::regex e ("(\[\d\d\d\d\] EWCA(.*?)(Civ|Crim) (\d+)) (\(\d+(.*?)\))");

std::cout << "Target sequence: " << s << std::endl;

std::cout << "The following matches and submatches were found:" << std::endl;

while (std::regex_search (s,m,e)) {
for (auto x:m) std::cout << x << " ";
std::cout << std::endl;
s = m.suffix().str();
}

return 0;
}


Essentially, all I'm looking to do is match against anything that looks like [nnnn] EWCA Civ nnn.

Xcode is telling me that \d is an unknown escape sequence, which seems a bit odd. I have taken a look at an earlier post, Regex for Commonwealth Legal Citations, and am none the wiser.

A nudge in the right direction here would be gratefully received.

Answer

The compiler sees \d inside the string literal "(\[\d\d\d\d\] ..." and has no way of knowing that the text is meant to be a regular expression. It tries to interpret \d as a C++ escape sequence (like \n or \t), and since no such escape sequence exists, it reports the error you are seeing.

In order to have \d in your string, you must escape the backslashes, e.g.

std::regex e ("(\\[\\d\\d\\d\\d\\] ...");
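To make the two layers of escaping concrete, here is a minimal sketch. The helper names year_pattern and is_bracketed_year are made up for illustration and do not come from the question. It only covers the bracketed year part of the citation: the compiler collapses each \\ in the literal into a single backslash, so the regex engine receives \[\d\d\d\d\].

```cpp
#include <regex>
#include <string>

// Hypothetical helper: the pattern text as the regex engine receives it.
// Each "\\" in the source becomes one backslash in the resulting string.
std::string year_pattern() {
    return "\\[\\d\\d\\d\\d\\]";  // 12 characters: \[\d\d\d\d\]
}

// Hypothetical helper: true if `text` is exactly a bracketed four-digit year.
bool is_bracketed_year(const std::string& text) {
    // std::regex uses the ECMAScript grammar by default, where \d is a digit.
    return std::regex_match(text, std::regex(year_pattern()));
}
```

With this, is_bracketed_year("[2015]") should hold, while "2015" without the brackets should be rejected.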

An alternative, following @NathanOliver's suggestion, is to use a raw string literal, in which backslashes are not treated as escape characters:

std::regex e (R"re((\[\d\d\d\d\] ...)re");
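Putting this together, the following is a rough sketch of how the search could look with a raw string literal. The function name find_citations is invented for this example, and the pattern is a simplified assumption rather than the asker's original. It drops the trailing (\(\d+(.*?)\)) group, which demands a parenthesised number after the citation and so would not match the sample string even once the escaping is fixed.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects every neutral citation of the form
// "[nnnn] EWCA Civ nnn" or "[nnnn] EWCA Crim nnn" found in `text`.
std::vector<std::string> find_citations(const std::string& text) {
    // Raw string literal: backslashes reach the regex engine unchanged.
    // Simplified pattern (an assumption): no trailing parenthesised group.
    static const std::regex pattern(R"(\[\d{4}\] EWCA (Civ|Crim) \d+)");

    std::vector<std::string> found;
    // sregex_iterator visits each non-overlapping match without modifying `text`.
    for (std::sregex_iterator it(text.begin(), text.end(), pattern), end;
         it != end; ++it) {
        found.push_back(it->str());
    }
    return found;
}
```

Called on the string from the question, this should return a single element, [2015] EWCA Civ 123. Using std::sregex_iterator instead of repeatedly reassigning s to m.suffix() is a design choice, not a requirement. It keeps the input string intact while stepping through the matches.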