Grzegorz Grzegorz - 15 days ago 6
PHP Question

How to replace decoded Non-breakable space (nbsp)

Assuming I have a string "a s d d" that contains non-breaking spaces, and htmlentities turns them into &nbsp; entities.

How can I replace them (using preg_replace) without encoding them to entities?

I tried

preg_replace('/[\xa0]/', '', $string);

but it's not working. I'm trying to remove those special characters from my string because I don't need them.

What are the possibilities beyond regular expressions?

Edit
The string I want to parse: http://pastebin.com/raw/7eNT9sZr

I process it with
preg_replace('/[\r\n]+/', "[##]", $text)

so that I can later run
implode("</p><p>", explode("[##]", $text))


My question is not exactly "how" to do this (since I could encode entities, remove the entities I don't need, and decode them again), but how to remove those characters with just str_replace or preg_replace.

Answer

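The problem is that you are specifying the non-breaking space incorrectly. In UTF-8 the character U+00A0 is encoded as the two bytes 0xC2 0xA0, so a pattern that matches only the byte \xa0 strips the second byte and leaves a stray 0xC2 behind.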
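To see why matching the single byte \xa0 fails, it can help to inspect the bytes directly. The following is a minimal sketch, assuming PHP 7 or later (for the "\u{...}" escape) and a short sample string invented for illustration:

```php
<?php
// "\u{00A0}" (PHP 7+) produces the UTF-8 encoding of U+00A0.
$nbsp = "\u{00A0}";
echo bin2hex($nbsp), "\n"; // c2a0: two bytes, not one

// Hypothetical sample input: "a", a non-breaking space, "s".
$string = "a" . $nbsp . "s";

// The attempt from the question removes only the 0xA0 byte,
// leaving an orphaned 0xC2 and therefore invalid UTF-8.
$broken = preg_replace('/[\xa0]/', '', $string);
echo bin2hex($broken), "\n"; // 61c273
```

The leftover 0xC2 byte is why the result looks unchanged or corrupted rather than clean.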

So you can replace it with regular spaces using the following code:

// faster solution
$regular_spaces = str_replace("\xc2\xa0", ' ', $original_string);

// more flexible solution
$regular_spaces = preg_replace('/\xc2\xa0/', ' ', $original_string);

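Note that in the case of str_replace, you have to use double quotes for the search string, because PHP only interprets escape sequences such as \xc2 inside double-quoted strings. With preg_replace, single quotes work because the regex engine interprets the escapes itself.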
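The difference between double and single quotes can be checked directly. This is a small sketch with an invented sample string, not part of the original answer:

```php
<?php
$double = "\xc2\xa0"; // escapes are interpreted: 2 bytes
$single = '\xc2\xa0'; // escapes are kept literally: 8 characters
echo strlen($double), "\n"; // 2
echo strlen($single), "\n"; // 8

// Hypothetical sample input containing one non-breaking space.
$original_string = "a\xc2\xa0s";

// The single-quoted search string never matches, so nothing changes.
$unchanged = str_replace('\xc2\xa0', ' ', $original_string);

// The double-quoted search string matches the real two-byte sequence.
$regular_spaces = str_replace("\xc2\xa0", ' ', $original_string);
echo $regular_spaces, "\n"; // a s
```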

Also note how the UTF-8 encoding of the character is specified as two separate bytes.
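As an alternative to spelling out the bytes, the pattern can name the code point itself when the u (UTF-8) modifier is used. The sketch below assumes the input is valid UTF-8 (with the u modifier, preg_replace returns null on invalid UTF-8) and uses an invented sample string:

```php
<?php
// Hypothetical sample input with two non-breaking spaces (PHP 7+ escape).
$original_string = "a\u{00A0}s\u{00A0}d";

// With the u modifier, \x{00A0} matches the code point U+00A0
// rather than a single byte.
$regular_spaces = preg_replace('/\x{00A0}/u', ' ', $original_string);
echo $regular_spaces, "\n"; // a s d

// Removing the characters entirely, as the question asks.
$removed = preg_replace('/\x{00A0}+/u', '', $original_string);
echo $removed, "\n"; // asd
```

For a plain one-to-one replacement, str_replace remains the simpler choice; the regex form is mainly useful when the non-breaking space is part of a larger pattern.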