Vitaly - 9 months ago
PHP Question

Regex for extracting comma delimited numbers in brackets


lorem ipsum 999

Block in brackets may contain a lot of numbers.



What I'd like to see:


Solution using PHP:

preg_match_all('/\[id:(.*)\]/', $input, $ids);
if (strpos($ids[1][0], ',')) {
    $ids = explode(',', $ids[1][0]);
    foreach ($ids as $id) {
        echo $id . "\n";
    }
} else {
    echo $ids[1][0];
}

But is it possible using regex without explode()?


Using explode() is probably the best approach. PCRE keeps only the last value captured by a repeated group, so you either do it in two steps (match, then explode) or use a \G-based regex. Here is a safer regex than the one you are using (it assumes there are no spaces between the numbers):

$input = "lorem ipsum 999 [id:284,286] [id:28]"; 
preg_match_all('~\[id:([\d,]*)]~', $input, $ids);
foreach ($ids[1] as $id) {
    print_r(explode(',', $id));
}

See the IDEONE demo

The '~\[id:([\d,]*)]~' regex matches the literal [id:, then captures into Group 1 zero or more (the * quantifier) digits (\d) or commas, and finally matches the closing ].
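
To illustrate the repeated-group limitation and the two-step approach together, here is a minimal sketch. The (?:(\d+),?)+ pattern and the $blocks and $ids names are only illustrative and do not come from the answer above; preg_split() with PREG_SPLIT_NO_EMPTY is swapped in for explode() so that a stray comma does not produce empty entries.

```php
<?php
$input = "lorem ipsum 999 [id:284,286] [id:28]";

// A repeated capture group keeps only its last iteration:
// group 1 first holds "284", then is overwritten by "286".
preg_match('~\[id:(?:(\d+),?)+]~', $input, $m);
echo $m[1], PHP_EOL; // 286

// Two-step approach: capture each comma-separated list, then split it.
preg_match_all('~\[id:([\d,]*)]~', $input, $blocks);
$ids = [];
foreach ($blocks[1] as $list) {
    // PREG_SPLIT_NO_EMPTY skips empty pieces (e.g. from "[id:]" or "1,,2").
    foreach (preg_split('~,~', $list, -1, PREG_SPLIT_NO_EMPTY) as $id) {
        $ids[] = $id;
    }
}
print_r($ids); // flat list: 284, 286, 28
```

The flat $ids array is often easier to work with than one array per bracketed block.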

If you need a single-regex solution in PHP and you process the strings one at a time, you can use a \G-based regex: it sets up the leading boundary ([id:) and then matches the consecutive numbers.


See the regex demo and this IDEONE demo:

$re = '~(?:\[id:|(?!^)\G,)\K\d+~'; 
$strs = array("lorem ipsum 999", "[id:284,286]", "[id:28]"); 
foreach ($strs as $s) {
    preg_match_all($re, $s, $matches);
    print_r($matches[0]); // the numbers found in this string, if any
}

Pattern details:

  • (?:\[id:|(?!^)\G,) - match either the literal [id: sequence, or the end of the previous successful match ((?!^)\G; the lookahead keeps \G from matching at the start of the string) followed by a comma
  • \K - discard the text matched so far from the reported match
  • \d+ - only match 1+ digits
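
As a rough check of the pattern details above, the same regex can also be run once over the whole sample string instead of over separate strings. This is only a sketch: the $open variable and the unterminated test string are made up for illustration.

```php
<?php
$re = '~(?:\[id:|(?!^)\G,)\K\d+~';

// The whole sample input in one pass: "999" is skipped because it is
// neither preceded by "[id:" nor chained to a previous match by a comma.
$input = "lorem ipsum 999 [id:284,286] [id:28]";
preg_match_all($re, $input, $matches);
print_r($matches[0]); // 284, 286, 28

// Caveat: the pattern never looks for the closing "]".
preg_match_all($re, "[id:1,2 and 3", $open);
print_r($open[0]); // 1, 2
```

If an unterminated block must be rejected, the two-step regex with the explicit closing ] is the stricter choice.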

If there can be whitespace between the numbers, add \s* after (and perhaps before) the comma.
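
A minimal sketch of that whitespace-tolerant variant, with \s* on both sides of the comma. The sample string here is invented for illustration, and whitespace directly after [id: or before ] is assumed not to occur.

```php
<?php
// Same \G-based pattern, but optional whitespace is allowed around the comma.
$re = '~(?:\[id:|(?!^)\G\s*,\s*)\K\d+~';

$input = "lorem ipsum 999 [id:284, 286 ,28]";
preg_match_all($re, $input, $matches);
print_r($matches[0]); // 284, 286, 28
```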