Maximus Maximus - 1 month ago 10
Node.js Question

Why use `Buffer.concat(body).toString();` instead of `Uint8Array/Buffer.toString()`

I'm reading this article about gathering request data and it gives the following example:

var body = [];
request.on('data', function(chunk) {
  body.push(chunk);
}).on('end', function() {
  body = Buffer.concat(body).toString();
  // at this point, `body` has the entire request body stored in it as a string
});


Other tutorials suggest this way:

var total = [];
request.on('data', function(chunk) {
  total += chunk;
}).on('end', function() {
  body = total.toString();
  // at this point, `body` has the entire request body stored in it as a string
});


They seem to be equivalent. Why use the more elaborate `Buffer.concat(body).toString();` then?

Answer

Why use Buffer.concat(body).toString(); instead of Uint8Array.toString()?

Because they're doing totally different things. But that's not your real question; `chunk` is a Buffer in both snippets, not a plain Uint8Array.

The two ways of gathering request data seem to be equivalent. What's the difference?

The second snippet is absolutely horrible code. Don't use it. First of all, it should have been written like this:

var total = "";
request.on('data', function(chunk) {
  total += chunk.toString();
}).on('end', function() {
  // at this point, `total` has the entire request body stored in it as a string
});

Starting with an array is absolute nonsense if you're doing string concatenation on it: `total += chunk` turns `total` into a string on the first data event, so the final `total.toString()` only matters when there were no data events at all. `total` should be a string right from the beginning. In `chunk.toString()`, the explicit method call is unnecessary (omitting it would lead to it being called implicitly), but I wanted to show what happens here.
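To make the implicit conversion visible, here is a minimal sketch. The two buffers are made-up stand-ins for what two data events might deliver; no real request is involved.

```javascript
// Hypothetical chunks standing in for the payloads of two 'data' events.
const chunks = [Buffer.from('hello '), Buffer.from('world')];

// Same starting point as the question's second snippet: an empty array.
let total = [];
for (const chunk of chunks) {
  // `+=` converts the array to '' and the chunk to a string
  // (the chunk's toString() is called implicitly), so `total`
  // is a plain string after the first iteration.
  total += chunk;
}

const typeAfterLoop = typeof total; // 'string', no longer an array
console.log(typeAfterLoop, JSON.stringify(total));

// With no data events at all, `total` stays an array, which is the
// only case where the final total.toString() changes anything.
const emptyCase = [].toString(); // ''
```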

Now, how is converting the chunk buffers to strings and concatenating them different from collecting the buffers in an array, concatenating them to a big buffer and converting that to a string?

The answer is multi-byte characters. Depending on the encoding and the body text, some characters are represented by more than one byte. Those bytes can end up split across the boundary between two chunks (delivered in consecutive data events). Code that decodes each chunk separately will then produce an invalid result, because neither chunk contains the complete character.
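The effect can be reproduced without a server. The sketch below assumes UTF-8 and splits a buffer by hand in the middle of a three-byte character to imitate an unlucky chunk boundary; the sample text and the split position are chosen purely for illustration. It also shows Node's `StringDecoder`, which holds back incomplete trailing bytes, as one way to decode chunk by chunk safely.

```javascript
const { StringDecoder } = require('string_decoder');

// '€' takes three bytes in UTF-8 (0xE2 0x82 0xAC).
const whole = Buffer.from('price: 5€');

// Cut off the last byte so that '€' is split across two chunks.
const chunks = [
  whole.subarray(0, whole.length - 1),
  whole.subarray(whole.length - 1),
];

// Decoding each chunk on its own: the partial bytes cannot be decoded
// and are replaced with U+FFFD replacement characters.
const perChunk = chunks.map((c) => c.toString()).join('');

// Concatenating the raw bytes first, then decoding once.
const joined = Buffer.concat(chunks).toString();

// StringDecoder keeps incomplete trailing bytes until the rest arrives.
const decoder = new StringDecoder('utf8');
let streamed = '';
for (const c of chunks) {
  streamed += decoder.write(c);
}
streamed += decoder.end();

console.log(perChunk === 'price: 5€'); // false
console.log(joined === 'price: 5€');   // true
console.log(streamed === 'price: 5€'); // true
```

If the body is wanted as a string anyway, calling `request.setEncoding('utf8')` before attaching the listeners should have a similar effect, since the stream then emits strings that were decoded with multi-byte boundaries taken into account.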