Startec · 4 years ago · 70
JavaScript Question

Is there a way to force Node to write a file using surrogate pairs for Unicode characters (in JSON)?

According to this question JSON is automatically written using surrogate pairs.

However, this is not my experience.

Using Node and the following code, my file still shows characters not encoded using surrogate pairs.

const fs = require('fs')

fs.readFile('raw.json', 'utf8', (err, data) => {
  if (err) {
    throw err
  }
  data = JSON.stringify(data)

  fs.writeFile('final.json', data, 'utf8', (err) => {
    if (err) {
      throw err
    }
  })
})

In my editor, which must have good Unicode support and a font with glyphs for these characters, the contents of the file contain characters such as 題.

That character still appears unchanged in final.json (no conversion is made).

Additionally, I tried switching the encoding for the file being written, but nothing changed.

Is there a way to force using surrogate pairs?

Answer

The quoted question is misleading if one concludes that JSON.stringify will convert Unicode characters outside the Basic Multilingual Plane into sequences of \u-escaped surrogate-pair values. This answer better explains that JSON.stringify only needs to escape the backslash (\), the double quote (") and control characters.

Hence, if the input data contains a character occupying more than one octet (such as the '題' used in the example), it will be written to the output file as that character. If the file is successfully written and then read back using the same UTF-8 encoding, the character will, hopefully, be the one you see.
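For instance, 題 (U+984C) occupies three octets when encoded as UTF-8, which can be inspected directly with a Buffer:

```javascript
// 題 (U+984C) is stored as three UTF-8 octets: e9 a1 8c.
const buf = Buffer.from('題', 'utf8')
console.log(buf)                  // <Buffer e9 a1 8c>
console.log(buf.toString('hex'))  // e9a18c
```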

If the goal is to convert JSON text into ASCII, using \u-escaped characters for non-ASCII values and surrogate pairs for characters outside the BMP, then the JSON-formatted string can be processed using simple character inspection (JSON.stringify has already escaped the quote, backslash and control characters):

var jsonComponent = '"2®π≤題"'