Dominik - 1 month ago
Javascript Question

(JavaScript) Regex to match specific characters while excluding some of them

I'm struggling with a regex.
I found separate solutions for parts of my problem, but they don't work together.
Now I'm not even sure whether this is possible at all.

I have a string like:

"ÿÿÿÿÿÿBla bla äöüß!ÿÿÿÿÿ\nÿÿÿстрокаÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿQ\u001f\u0001\u0001"


I want to remove all characters


  • between 0x00 and 0x1F (everything below space, non-printable)

  • and 0xFF ("ÿ")

  • but not 0x0A and 0x0D. (line breaks)



I have both cases working separately:

// Works great but removes linebreaks.
str = str.replace(/[\x00-\x1F\xFF]+/g, '');


I want to exclude line breaks.

// This removes everything except line feeds, so only the line breaks remain.
str = str.replace(/[^\x0A]/g, '');


But I want the two merged, something like this (pseudo-regex):

// Incorrect regex, but the intended logic.
str = str.replace(/[\x00-\x1F\xFF^\x0A^\x0D]+/g, '');

I have no idea.
I would be really grateful for constructive help.

Desired resulting string:

"Bla bla äöüß!\nстрокаQ"


The string must stay compatible with UTF-8.
I know there are regexes for removing non-printable characters,
but they also remove umlauts (äöü), the Cyrillic alphabet, and other non-ASCII letters.

Answer

You may use

/[\x00-\x09\x0B\x0C\x0E-\x1F\xFF]+/g
       ^^^^^^^^^^^^^^^^   

The point is that a character class cannot subtract characters from a range, so you need to split \x00-\x1F into sub-ranges that skip \x0A (line feed) and \x0D (carriage return): \x00-\x09, then \x0B and \x0C, then \x0E-\x1F.
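To make the re-organized ranges concrete, here is a small sketch (not part of the original answer; the variable names `cls` and `kept` are only illustrative) that walks over the code points 0x00 to 0x1F and collects the ones the character class leaves alone. Under these assumptions only 10 (\x0A) and 13 (\x0D) should survive.

```javascript
// Illustrative check, not from the original answer.
// No "g" flag here, so test() keeps no state between calls.
var cls = /[\x00-\x09\x0B\x0C\x0E-\x1F\xFF]/;

// Collect the control code points that the class does NOT match.
var kept = [];
for (var code = 0x00; code <= 0x1F; code++) {
  if (!cls.test(String.fromCharCode(code))) {
    kept.push(code);
  }
}

console.log(kept);             // [ 10, 13 ]  (\x0A and \x0D)
console.log(cls.test("\xFF")); // true:  "ÿ" is matched
console.log(cls.test("ä"));    // false: umlauts are left alone
```
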

See demo below:

var s = "ÿÿÿÿÿÿBla bla äöüß!ÿÿÿÿÿ\nÿÿÿстрокаÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿQ\u001f\u0001\u0001";
console.log(s);
var res = s.replace(/[\x00-\x09\x0B\x0C\x0E-\x1F\xFF]+/g,'');
console.log(res);
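
If you would rather keep the original \x00-\x1F range readable, a possible alternative (a sketch, not from the original answer) is to put a negative lookahead in front of the class so that \n and \r are vetoed before the class gets a chance to match. The function name `stripControlChars` is hypothetical.

```javascript
// Alternative sketch: keep the full range and exclude \n and \r
// with a negative lookahead instead of splitting the range.
function stripControlChars(input) {
  return input.replace(/(?![\n\r])[\x00-\x1F\xFF]/g, '');
}

var sample = "ÿÿÿÿÿÿBla bla äöüß!ÿÿÿÿÿ\nÿÿÿстрокаÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿQ\u001f\u0001\u0001";
var cleaned = stripControlChars(sample);

console.log(cleaned === "Bla bla äöüß!\nстрокаQ"); // true
console.log(stripControlChars("a\r\nb\tc") === "a\r\nbc"); // true: CR/LF kept, tab removed
```

Both forms should give the same result on the sample string; the split-range version avoids the lookahead, while this one stays closer to the "range minus exceptions" logic from the question. Note that in a JavaScript regex \xFF refers to the character U+00FF ("ÿ") in the string, not to a raw 0xFF byte.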