gsamaras - 1 month ago
HTML Question

Intelligent regex to understand input

Following up on "Split string that used to be a list", I am doing this:

var regex = /(-?\d{1,})/g;
var cluster = lines[line].match(regex);

which turns the first line below into the second:

((3158), (737))
["3158", "737"]

where 3158 will later be treated as the ID in my program and 737 as the associated data.

I am wondering if there is a way to handle inputs of this kind too:

((3158, 1024), (737))

where the ID will be a pair, and do something like this:

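var single_regex = regex_for_single_ID; // placeholder: pattern for a single ID
var pair_regex = regex_for_pair_ID;     // placeholder: pattern for a pair ID
if (single_regex.test(lines[line])) {
  // do my logic
} else if (pair_regex.test(lines[line])) {
  // do my other logic
} else {
  // bad input
}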

Is that possible?


What I am interested in is treating the two cases differently. For example, one solution would be to keep the current behavior for single IDs:

((3158), (737))
["3158", "737"]

and for pairs, concatenate the ID:

((3158, 1024), (737))
["31581024", "737"]


A simple approach is to use .replace(/(\d+)\s*,\s*/g, '$1') to concatenate the two numbers of a pair, and then apply the same simple regex match that you are already using.


var v1 = "((3158), (737))"; // singular string

var v2 = "((3158, 1024), (737))"; // paired number string

var arr1 = v1.replace(/(\d+)\s*,\s*/g, '$1').match(/-?\d+/g);
//=> ["3158", "737"]

var arr2 = v2.replace(/(\d+)\s*,\s*/g, '$1').match(/-?\d+/g);
//=> ["31581024", "737"]
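To make the two steps visible, here is a minimal sketch that wraps them in a helper and logs the intermediate string. The name extractNumbers is hypothetical, not from the original code, and the `|| []` fallback is an added assumption so that a line with no numbers yields an empty array instead of null. It also assumes the second number of a pair is not negative; a minus sign there would keep the digits from joining.

```javascript
// extractNumbers is an illustrative helper name, not part of the original answer.
function extractNumbers(line) {
  // Step 1: drop the comma (and surrounding spaces) that directly follows a run of digits.
  var merged = line.replace(/(\d+)\s*,\s*/g, '$1');
  console.log(merged);
  // Step 2: the same match the question already uses; fall back to [] when nothing matches.
  return merged.match(/-?\d+/g) || [];
}

extractNumbers("((3158, 1024), (737))"); // logs "((31581024), (737))"
```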

We use this regex in .replace: /(\d+)\s*,\s*/g

  • It matches and captures one or more digits, followed by optional whitespace, a comma, and more optional whitespace.
  • In the replacement we use $1, the back-reference to the captured number, so the comma and the surrounding whitespace after the number are removed.
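If the two cases really need separate branches, as in the if / else if outline in the question, one option is a pair of anchored patterns. This is only a sketch under assumptions: the names singleRegex, pairRegex and parseCluster are hypothetical, and the patterns assume each line contains exactly one ID group followed by one data group.

```javascript
// Hypothetical patterns: the whole line must match, so anything else counts as bad input.
var singleRegex = /^\(\(\s*(-?\d+)\s*\)\s*,\s*\(\s*(-?\d+)\s*\)\)$/;
var pairRegex = /^\(\(\s*(-?\d+)\s*,\s*(-?\d+)\s*\)\s*,\s*\(\s*(-?\d+)\s*\)\)$/;

// parseCluster is an illustrative name, not part of the original code.
function parseCluster(line) {
  var text = line.trim();
  var m = singleRegex.exec(text);
  if (m) {
    return { id: [m[1]], data: m[2] };        // single ID
  }
  m = pairRegex.exec(text);
  if (m) {
    return { id: [m[1], m[2]], data: m[3] };  // pair ID
  }
  return null;                                // bad input
}

console.log(parseCluster("((3158), (737))"));       // id: ["3158"], data: "737"
console.log(parseCluster("((3158, 1024), (737))")); // id: ["3158", "1024"], data: "737"
console.log(parseCluster("(3158, 737)"));           // null
```

Keeping the pair as two separate strings avoids the ambiguity of concatenation (for example, 31 and 58 would merge to the same ID as 3 and 158); joining them with id.join("") gives the concatenated form if that is what is wanted.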