OscarWilde - 1 month ago
jQuery Question

How can I use a regular expression to replace everything except specific words in a string with JavaScript?

Imagine you have a string like this: "This is a sentence with words."

I have an array of words like

$wordList = ["sentence", "words"];


I want to highlight the words that aren't on the list, which means I need to find and replace everything else. I can't seem to crack how to do that with a regex (if it's even possible).

If I want to match the words I can do something like:

text = text.replace(/(sentence|words)\b/g, '<mark>$&</mark>');


(This wraps the matching words in <mark> tags and, assuming I have some CSS for <mark>, highlights them.) That works perfectly, but I need the opposite! I need it to select the entire string and then exclude the words listed. I've tried

/^((?!sentence|words)*)*$/gm

but this gives me a strange infinity issue, I think because it's too open-ended.

Taking that original sentence, what I would hope to end up with is
"<mark> This is a </mark> sentence <mark> with </mark> words."


Basically wrapping (via replace) everything except the words listed.

The closest I can get is something like

/^(?!sentence|words).*\b/igm

which does work when a line starts with one of the words, but only by ignoring that entire line.
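
A small illustration of why that last pattern only helps at the start of a line (the sample text is made up for this sketch). Because of the ^ anchor, the lookahead is tested once per line, at its first character. If it passes, .* consumes the rest of the line, listed words included.

```javascript
// The pattern from the question, unchanged.
var attempt = /^(?!sentence|words).*\b/igm;

// Hypothetical two-line sample: line 1 contains the listed words,
// line 2 starts with one of them.
var sample = "This is a sentence with words.\nwords come first on this line.";

var result = sample.replace(attempt, '<mark>$&</mark>');
console.log(result);
// Line 1 is wrapped almost entirely (the trailing \b stops before the final "."),
// so "sentence" and "words" end up inside the <mark>.
// Line 2 is skipped completely because the lookahead fails at its first character.
```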

So to summarize: 1) Take a string 2) take a list of words 3) replace everything in the string except the list of words.

Possible? (jQuery is loaded for something else already, so raw JS or jQuery are both acceptable).

sln
Answer

Create the regex from the word list.
Then do a string replace with the regex.
(It's a tricky regex)

var wordList = ["sentence", "words"];

// Join the array into a regex alternation: "sentence|words".
var str = wordList.join('|');
// Finalize the string with a negative assertion: match runs of words that are
// NOT on the list, together with any surrounding non-word characters.
str = '\\W*(?:\\b(?!(?:' + str + ')\\b)\\w+\\W*|\\W+)+';

// Create a global regex from the string.
var Rx = new RegExp( str, 'g' );
console.log( Rx );

var text = "%%%555This is a sentence with words, but not sentences ?!??!!...";
// $& inserts the whole matched run into the replacement.
text = text.replace( Rx, '<mark>$&</mark>');

console.log( text );

Output

/\W*(?:\b(?!(?:sentence|words)\b)\w+\W*|\W+)+/g
<mark>%%%555This is a </mark>sentence<mark> with </mark>words<mark>, but not sentences ?!??!!...</mark>
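
The same approach could be wrapped in a reusable function. This is only a sketch, not part of the original answer: the name markExcept is hypothetical, and the validation step is an added assumption. The pattern picks out words with \w+, so a listed word is only skipped reliably when it consists of letters, digits, or underscores; the sketch rejects anything else (and an empty list) rather than silently producing odd output.

```javascript
// Hypothetical wrapper around the pattern shown above.
// Wraps everything except the listed words in <mark> tags.
function markExcept(text, wordList) {
  if (wordList.length === 0) {
    throw new Error('wordList must contain at least one word');
  }
  wordList.forEach(function (word) {
    // Assumption of this sketch: only \w characters are supported,
    // because the pattern itself tokenizes the text with \w+.
    if (!/^\w+$/.test(word)) {
      throw new Error('Unsupported word: ' + word);
    }
  });

  var alternation = wordList.join('|');
  var rx = new RegExp(
    '\\W*(?:\\b(?!(?:' + alternation + ')\\b)\\w+\\W*|\\W+)+',
    'g'
  );
  return text.replace(rx, '<mark>$&</mark>');
}

console.log(markExcept("This is a sentence with words.", ["sentence", "words"]));
// prints: <mark>This is a </mark>sentence<mark> with </mark>words<mark>.</mark>
```

Note that the trailing period is wrapped as well, since the pattern treats punctuation and whitespace as part of "everything else".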