Vinod Vinod - 11 days ago 5
Javascript Question

Parsing to an array based on multiple delimiters

I need to parse the following string. (I am parsing a PDF and would like to avoid third-party packages.)


/Type /Pages /MediaBox [0 0 612 792] /Count 9 /Kids [ 5 0 R 355 0 R ]


I am using JavaScript:

String.split(' ');


The output I would like to get is:

[
'/Type',
'/Pages',
'/MediaBox',
'[0 0 612 792]',
'/Count',
'9',
'/Kids',
'[ 5 0 R 355 0 R]'
]

Instead, split(' ') results in the following output:

[ '<<',
'/Type',
'/Pages',
'/MediaBox',
'[0',
'0',
'612',
'792]',

Specifically, I would like to treat '[' and ']' as delimiters, so that the bracketed part would read '[ 5, 0, R, 355, 0, R]'.

The final result expected is the array shown above.

I am trying to see whether I can solve this with a regular expression, but I am currently stuck.

Answer

This regex should take care of it:

var input = "/Type /Pages /MediaBox [0 0 612 792] /Count 9 /Kids [ 5 0 R 355 0 R ]";
// Match either a bracketed group "[...]" or a run of non-whitespace characters
var result = input.match(/(\[[^\]]+\]|\S+)/g);
console.log(result);

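As an explanation, the regex matches either a [ followed by one or more characters that are not ] and then a closing ] (\[[^\]]+\]), OR a sequence of characters that are not whitespace (\S+). The bracketed alternative is listed first, so a [...] group is kept together as a single element instead of being split at its inner spaces.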
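If the numbers inside the brackets are also needed as separate items, the same idea can be taken one step further. The following is only a minimal sketch, not part of the answer above: the helper names tokenize and expandArrays are made up for illustration, and it assumes that tokens are separated by whitespace and that brackets are not nested (real PDF content can violate both assumptions).

```javascript
// Hypothetical helper: split the text into tokens, keeping each
// bracketed group "[...]" together as a single token.
function tokenize(text) {
  // \[[^\]]*\]  a bracketed run with no nested brackets, or
  // \S+         any run of non-whitespace characters
  return text.match(/\[[^\]]*\]|\S+/g) || [];
}

// Hypothetical helper: replace each bracketed token such as "[ 5 0 R ]"
// with an array of its inner tokens; other tokens pass through unchanged.
function expandArrays(tokens) {
  return tokens.map(function (token) {
    var bracketed = token.charAt(0) === "[" &&
                    token.charAt(token.length - 1) === "]";
    if (!bracketed) {
      return token;
    }
    // Drop the surrounding brackets, then split on runs of whitespace.
    var inner = token.slice(1, -1).trim();
    return inner === "" ? [] : inner.split(/\s+/);
  });
}

var sample = "/Type /Pages /MediaBox [0 0 612 792] /Count 9 /Kids [ 5 0 R 355 0 R ]";
var tokens = tokenize(sample);
console.log(tokens);               // flat tokens, brackets kept together
console.log(expandArrays(tokens)); // bracketed tokens become nested arrays
```

Keeping the two steps separate means the flat token list asked for in the question is still available, and the nested form is an optional second pass.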
