krakig krakig - 4 months ago 24
Javascript Question

Finding characters with spaces

I was trying last week to find parts of a text containing specific words delimited by punctuation characters. That works well.


On the following sentence:
"How did you do it? bla bla bla! why did you do it?"
it gives me the following output:

"How did you do it?"
"why did you do it?"

Now I am trying to add the hyphen character: I want to detect whether there is a hyphen with spaces around it (a new sentence delimiter):

"The man went walking upstairs - why was he there?"

That would return:
"why was he there?"

It would follow the following rules:

hello - bye -> this would be the only one to be matched
hello-bye -> not matched
hello -bye -> not matched
hello- bye -> not matched

Using negation, I tried to add this part:

[^.?!:\\s\\-\\s] => ignore everything that ends with a "." or a "?" or a "!" or a ":" or a " - "

It doesn't work, but as I am pretty bad at regex, I am probably missing something obvious.

var regex = /[^.?!:\\s\\-\\s]*\b(why|how)\b[^.?!]*[.?!]/igm
var text = "Here I am - why did you want to see me?"

var match;

while ((match = regex.exec(text)) != null) {
  console.log(match[0]);
}

Output :

Here I am - why did you want to see me?

Expected output :

why did you want to see me?


There are two issues that I see:

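  • Backslashes: use a single backslash inside a regex literal, and a double backslash only in a string passed to the RegExp constructor, and
  • A sequence is used inside a character class: a character class matches exactly one character, so the three-character sequence \s-\s cannot be negated there (replace [^.?!:\s\-\s] with (?:(?!\s-\s)[^.?!:])*).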
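
To make the backslash point concrete, here is a minimal sketch. The variable names are only illustrative. It compares the literal and constructor forms, then probes what the doubled-backslash character class from the question really excludes.

```javascript
// A regex literal needs one backslash per escape; a string passed to the
// RegExp constructor needs two, because the string parser consumes one.
const literal = /\s-\s/;
const fromString = new RegExp("\\s-\\s");
const samePattern = literal.source === fromString.source;
console.log(samePattern); // true

// Doubling the backslashes inside a literal changes the meaning: each
// doubled backslash is a literal backslash character, so this class
// excludes ".", "?", "!", ":", the backslash and the letter "s".
// Whitespace is not excluded, and the "-" sitting between two backslashes
// is read as a range operator, not as a hyphen.
const doubled = /[^.?!:\\s\\-\\s]/;
console.log(doubled.test("s")); // false
console.log(doubled.test(" ")); // true
console.log(doubled.test("-")); // true
```

This is why the original pattern happily ran across " - ": neither the space nor the hyphen was in the excluded set.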

You may use

var regex = /(?:(?!\s-\s)[^.?!:])*\b(why|how)\b[^.?!]*[.?!]/igm

where (?:(?!\s-\s)[^.?!:])* is a tempered greedy token: it matches, zero or more times, any character other than ., ?, ! and : as long as that character does not start a whitespace + hyphen + whitespace sequence.
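
As a rough illustration of what the token consumes, the sketch below anchors it at the start of the string and runs it against the four hyphen cases from the question. The `token` constant and the `consumed` helper are names made up for this example.

```javascript
// Anchored copy of the tempered greedy token, used only to show how far
// it gets before the lookahead (?!\s-\s) stops it.
const token = /^(?:(?!\s-\s)[^.?!:])*/;

// The token can match zero characters, so match() never returns null here.
function consumed(text) {
  return text.match(token)[0];
}

console.log(consumed("hello - bye")); // hello
console.log(consumed("hello-bye"));   // hello-bye
console.log(consumed("hello -bye"));  // hello -bye
console.log(consumed("hello- bye"));  // hello- bye
```

Only the first case is cut short, at the whitespace that begins " - ", which is the single delimiter form the question asks for.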

var regex = /(?:(?!\s-\s)[^.?!:])*\b(why|where|pourquoi|how)\b[^.?!]*[.?!]/ig;
var text = "L'Inde a déjà acheté nos rafales, pourquoi la France ne le -dirait-elle pas ?";
var match;
while ((match = regex.exec(text)) != null) {
  console.log(match[0]);
}
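
One thing to watch for: the lookahead only blocks the whitespace that begins " - ", so on the question's sample sentence the match can start at the hyphen itself ("- why did ..."), and matches after other punctuation can carry a leading space. The sketch below is one possible way to tidy that up, not part of the pattern above. `findQuestions` is a hypothetical helper name, and the alternation is reduced to (why|how) for brevity.

```javascript
// Hypothetical helper: collects every match, then strips a leftover "- "
// from the delimiter and any surrounding whitespace.
function findQuestions(text) {
  const regex = /(?:(?!\s-\s)[^.?!:])*\b(why|how)\b[^.?!]*[.?!]/gi;
  const results = [];
  let match;
  while ((match = regex.exec(text)) !== null) {
    results.push(match[0].replace(/^-\s/, "").trim());
  }
  return results;
}

console.log(findQuestions("Here I am - why did you want to see me?"));
// [ 'why did you want to see me?' ]
console.log(findQuestions("How did you do it? bla bla bla! why did you do it?"));
// [ 'How did you do it?', 'why did you do it?' ]
```

Stripping after the fact keeps the pattern unchanged. Consuming the delimiter inside the regex and reading a capture group instead would be another option.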