user1437328 - 1 year ago
JavaScript Question

Remove zero-width space characters from a JavaScript string

I take user input (JS code) and execute (process) it in real time to show some output.

Sometimes the code contains a zero-width space, which is really weird. I don't know how users are entering it. Example:

"(​$".length === 3

I need to be able to remove that character from the code in JS. How do I do that? Or is there some other way to execute the JS code so that the browser doesn't take the zero-width space characters into account?

Answer

Unicode has the following zero-width characters, among others:

  • U+200B zero width space
  • U+200C zero width non-joiner
  • U+200D zero width joiner
  • U+FEFF zero width no-break space

To remove them from a string in JavaScript, you can use a simple regular expression:

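// One of each zero-width character, placed between the letters a to e
var userInput = 'a\u200Bb\u200Cc\u200Dd\uFEFFe';
console.log(userInput.length); // 9
// \u200B-\u200D is a range covering U+200B, U+200C and U+200D
var result = userInput.replace(/[\u200B-\u200D\uFEFF]/g, '');
console.log(result.length); // 5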
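
Applied to the situation in the question, the same regular expression could be wrapped in a small helper that cleans the user's code before it is executed. This is only a sketch: the name stripZeroWidth is made up for illustration, and new Function stands in for whatever mechanism the application actually uses to run the code.

```javascript
// Hypothetical helper (the name is an assumption, not from the question):
// removes the four zero-width characters listed above.
function stripZeroWidth(source) {
  return source.replace(/[\u200B-\u200D\uFEFF]/g, '');
}

// The example from the question, with the zero-width space written as an
// escape sequence so that it is visible here.
var dirty = '"(\u200B$".length';
var clean = stripZeroWidth(dirty);

console.log(dirty.length); // 12
console.log(clean.length); // 11

// new Function is used here only for illustration.
console.log(new Function('return ' + clean)()); // 2
```

Keep in mind that this removes the characters everywhere, including inside string literals where the user may have wanted them. U+200C and U+200D are also allowed inside JavaScript identifiers, and U+FEFF counts as whitespace in JavaScript source, so U+200B is the one most likely to cause a syntax error outside a string.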

Note that many other characters may also be invisible, such as some of ASCII's control characters.
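
To see which invisible characters users are actually sending, something like the following could report their positions and code points. It is a rough sketch: the function name findInvisible is made up, and the character class is an assumption (ASCII control characters except tab, line feed and carriage return, plus DEL and the four zero-width characters above), not a complete list.

```javascript
// Hypothetical helper: lists invisible characters found in a string.
function findInvisible(text) {
  // Assumed character class: C0 controls except \t, \n and \r, then DEL,
  // then the four zero-width characters. The regex literal is evaluated on
  // each call, so lastIndex always starts at 0.
  var pattern = /[\u0000-\u0008\u000B\u000C\u000E-\u001F\u007F\u200B-\u200D\uFEFF]/g;
  var found = [];
  var match;
  while ((match = pattern.exec(text)) !== null) {
    var hex = match[0].charCodeAt(0).toString(16).toUpperCase();
    found.push({
      index: match.index,
      codePoint: 'U+' + hex.padStart(4, '0')
    });
  }
  return found;
}

// Reports U+200B at index 1 and U+0007 at index 3.
console.log(findInvisible('a\u200Bb\u0007c'));
```

Logging this for real input may help decide whether the class needs to be widened or narrowed before anything is removed.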