Infogeek - 2 months ago
C++ Question

POSIX extended Regex - contain not X but Y (std::regex c++11)

Question explanation

I have been trying to write a regex that passes lines in exactly this format:

"bob likes poo - whatever(&T(R)*HP#"
" \t \t bob likes poo - *^RFVOG(IBHUO)B"


but fail on:

"//bob likes poo - GV*(GF*("
"# \t bob likes poo - OHG(G(*"
"bob does not like poo G&((HOUIHBO:"


The key bit being:


The line does NOT start with comment characters (# or //), can have
leading blanks (space or tab), and has to have something followed by
the delimiter (" - "), followed by whatever.


The corner cases are:

1) " \t //this is still a comment - YGV^FV*"


should still fail.

2) " /i_am//_no_/comment - FG&*G*&G"


should pass.
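
For reference, the rules and corner cases above can be sketched without any regex. This is only a minimal sketch under the stated rules; the function name `is_valid_line` is hypothetical and not part of the original post. It may be handy as a baseline to compare a regex against.

```cpp
#include <string>

// Hypothetical helper (not from the original post): returns true when `line`
// satisfies the rules listed above.
bool is_valid_line(const std::string& line) {
    // Skip leading blanks (spaces and tabs).
    const std::size_t start = line.find_first_not_of(" \t");
    if (start == std::string::npos) return false;  // empty or blank-only line

    // Reject comment lines: '#' or "//" as the first non-blank characters.
    if (line[start] == '#') return false;
    if (line.compare(start, 2, "//") == 0) return false;

    // Require at least one character of content before the " - " delimiter.
    return line.find(" - ", start + 1) != std::string::npos;
}
```

With this sketch, `is_valid_line(" /i_am//_no_/comment - FG&*G*&G")` returns true, while `is_valid_line(" \t //this is still a comment - YGV^FV*")` returns false.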

Random reasoning

Well, I have failed, which made me ask whether we can somehow specify that a match should contain some characters but not others. For example,

[^abc]


just means any character that is not a, b or c. But how would we say "not abc, but 123"? We can't just put

[^abc123]


because that will exclude them too, and we can't do

[^abc]123


because that means 123 has to follow some character that is not a, b or c, which is a total of 4 characters instead of the 1 we want. I have no idea if it is even possible. So there are 2 questions here, in a sense.
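
A hedged note on the two questions: for a single character, a positive class such as `[123]` already excludes a, b and c, because a bracket expression only accepts what it lists. Excluding a two-character sequence like `//` is a different problem. POSIX extended syntax has no lookahead, so one common workaround is to spell out the allowed cases with alternation. The sketch below assumes `std::regex::extended` and `std::regex_match`; the names `full_match`, `one_of_123` and `no_leading_double_slash` are hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when the whole of `text` matches the POSIX
// extended pattern `pattern`.
bool full_match(const std::string& text, const std::string& pattern) {
    const std::regex re(pattern, std::regex::extended);
    return std::regex_match(text, re);
}

// One character that is 1, 2 or 3. The characters a, b and c are excluded
// simply because they are not listed.
const std::string one_of_123 = "[123]";

// A non-empty string that does not begin with "//": either the first
// character is not '/', or it is '/' followed by nothing or by a character
// other than '/'.
const std::string no_leading_double_slash = "([^/].*|/([^/].*)?)";
```

For instance, `full_match("/i_am//_no_/comment", no_leading_double_slash)` is true, and `full_match("//comment", no_leading_double_slash)` is false.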

My best bet so far is:

"[[:blank:]]*[^[:blank:]]+( - ).*"


This makes the format matching correct but does not account for the comments.

EDIT

I have found the working solution. It works but it's ugly as hell:

"[[:blank:]]*([^[:blank:]#]([^/].*)?|[^[:blank:]#/].*)( - ).*"


If anyone knows how to make it nicer, please tell me.
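
As a rough way to check the pattern from the edit against the examples in this question, it can be wrapped in a small function. This is a sketch that assumes the pattern is compiled with `std::regex::extended` and applied with `std::regex_match`; the post does not say which call was actually used, and the name `matches_line_format` is hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical wrapper around the pattern from the edit above.
bool matches_line_format(const std::string& line) {
    // std::regex::extended selects POSIX extended (ERE) syntax.
    static const std::regex re(
        "[[:blank:]]*([^[:blank:]#]([^/].*)?|[^[:blank:]#/].*)( - ).*",
        std::regex::extended);
    // regex_match requires the whole line to match, so no ^ or $ anchors
    // are needed.
    return std::regex_match(line, re);
}
```

The first alternative accepts a first non-blank character that is not `#`, followed either by nothing or by something that does not start with `/`. That is what rejects `//` while still allowing a single leading `/`.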

Answer

After understanding more about the requirements regarding comments, I came up with this regex:

^[[:blank:]]*(\/[^\/][^-]*|([[:blank:]]|^)[^[:blank:]\/#][^-]*) - .*

By the way, I really don't know why bob likes p**!