jitendrapurohit - 1 year ago
Javascript Question

Unicode characters give "unterminated string literal" in JS

This error occurs when my HTML contains some odd characters that look like whitespace.

<html xmlns="http://www.w3.org/1999/xhtml">
<head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

Note that there is a character between
, but it is not visible here. I need to pass this HTML to angular.toJson(), but it fails with the error
unterminated string literal

Everything works fine when I use plain text instead, like:

works fine.

I've tried every replace approach I found while searching for this problem:

1) var re = /(?![\x00-\x7F]|[\xC0-\xDF][\x80-\xBF]|[\xE0-\xEF][\x80-\xBF]{2}|[\xF0-\xF7][\x80-\xBF]{3})./g;
params.body_html = html.replace(re, '');
angular.toJson(params); // gives error

2) params.body_html = params.body_html.replace(/\uFFFD/g, ''); // replace() returns a new string, so assign it back
angular.toJson(params); // gives error

I don't know what character this is (maybe Unicode). When I paste it into an Emacs buffer, it is shown as

Answer Source

Got this working with:

params.body_html = params.body_html.replace(/\u2028/g, '');
angular.toJson(params); //works fine.
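
A likely reason this works: U+2028 (LINE SEPARATOR) and U+2029 (PARAGRAPH SEPARATOR) are valid inside JSON strings, but JavaScript engines older than ES2019 treat them as line terminators, so a string literal that contains one unescaped gets cut off. That is only my reading of the error, since the code that consumes the JSON is not shown here. If you would rather keep the characters than delete them, a minimal sketch is to escape both after serializing. Note that `escapeLineSeparators` is a made-up helper name, and `JSON.stringify` stands in for `angular.toJson` so that the snippet runs on its own.

```javascript
// Hypothetical helper (not part of Angular): escape the two separators so the
// JSON text can be embedded in JavaScript source without ending a string early.
function escapeLineSeparators(json) {
  return json
    .replace(/\u2028/g, '\\u2028')  // LINE SEPARATOR
    .replace(/\u2029/g, '\\u2029'); // PARAGRAPH SEPARATOR
}

// Sample data (assumed): some HTML with a raw U+2028 in it.
var params = { body_html: '<head>\u2028<meta charset="UTF-8" /></head>' };

// JSON.stringify is used here in place of angular.toJson.
var safeJson = escapeLineSeparators(JSON.stringify(params));
```

Parsing `safeJson` again gives back the original HTML, including the separator, so nothing is lost.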

Thanks to @Gothdo for providing the character link.

The problem is that this only removes that one particular Unicode character. Is there any function that replaces or trims all such Unicode characters?
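
As far as I know there is no built-in function for this, but a character class can cover the usual invisible characters in one pass. The sketch below is only one possible choice: the helper names `stripInvisible` and `stripNonAscii` are made up, and the set of code points treated as unwanted is my own assumption, so adjust it to your data.

```javascript
// Hypothetical helper: remove characters that are normally invisible.
// Covered ranges (an assumption about what counts as unwanted):
//   - C0 controls except tab (\x09), LF (\x0A) and CR (\x0D)
//   - DEL and the C1 controls (\x7F-\x9F)
//   - zero-width and directional marks (U+200B-U+200F)
//   - line and paragraph separators (U+2028, U+2029)
//   - word joiner (U+2060) and byte order mark (U+FEFF)
function stripInvisible(str) {
  var invisible = /[\x00-\x08\x0B\x0C\x0E-\x1F\x7F-\x9F\u200B-\u200F\u2028\u2029\u2060\uFEFF]/g;
  return str.replace(invisible, '');
}

// Blunter alternative: drop everything outside ASCII. This also deletes
// legitimate text such as accented letters, so use it with care.
function stripNonAscii(str) {
  return str.replace(/[^\x00-\x7F]/g, '');
}
```

Usage would look like `params.body_html = stripInvisible(params.body_html);` before calling `angular.toJson(params)`. The first helper keeps ordinary non-ASCII text intact, which is usually what you want for HTML content.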