Graham - 8 months ago
Java Question

Regex to parse strings that might or might not be delimited by ; into several groups

I have a case where I need to parse a string into several groups depending on certain criteria.

For example, the string below:

01%3A%35r%07%01P%88%00;WAP_GPRS


should be 2 groups:

%3A%35r%07%01P%88%00
WAP_GPRS


Note that I don't care about the 01 at the beginning. There can be 0 or more substrings delimited by ; and I need each of them in its own group.

Another example:

01%3A%35r%07%01P%88%00;KPN;A23B


should be 3 groups:

%3A%35r%07%01P%88%00
KPN
A23B


Basically, I don't care whether alphabetic or numeric characters come first. The issue is capturing each expression in its own group when there can be 0 or more of them. For instance, the string below

01%3A%35r%07%01P%88%00


should also produce one group:
%3A%35r%07%01P%88%00

Answer Source

So I guess you need a regexp analog of split. That would require a repeated capturing group.
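If a plain split turns out to be acceptable, a minimal sketch could look like the one below. It assumes the unwanted prefix is always the literal 01 (the question only shows that case), and the class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

class SplitSketch {
    // Assumption: the prefix to discard is the literal "01"; adjust if it can vary.
    static List<String> parse(String input) {
        String body = input.startsWith("01") ? input.substring(2) : input;
        // A negative limit keeps empty trailing fields instead of silently dropping them.
        return Arrays.asList(body.split(";", -1));
    }

    public static void main(String[] args) {
        // prints [%3A%35r%07%01P%88%00, KPN, A23B]
        System.out.println(parse("01%3A%35r%07%01P%88%00;KPN;A23B"));
    }
}
```

With no ; in the input, split returns a single element, which covers the one-group case from the question.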

Bad news: some people have looked into a similar problem and haven't found the right answer: http://stackoverflow.com/a/6836024/1665128

Good news: if you can live with a reasonable limit on the number of groups and can add some code to identify the empty trailing ones, this may help:
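To illustrate the limitation: in Java, a capturing group inside a repetition only keeps what its last iteration matched, so the earlier fields are lost. One common workaround, shown here as a sketch rather than a definitive answer, is to call find() in a loop and collect one field per match. The class and method names are hypothetical, and the literal 01 prefix is again an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindLoopSketch {
    private static final Pattern FIELD = Pattern.compile("[^;]+");

    // Shows the problem: group(1) holds only the last repetition's capture.
    static String lastCaptureOnly(String input) {
        Matcher m = Pattern.compile("(?:([^;]+);?)+").matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    // Workaround: one find() call per field. Note that empty fields are skipped.
    static List<String> parse(String input) {
        // Assumption: the prefix to discard is the literal "01".
        String body = input.startsWith("01") ? input.substring(2) : input;
        List<String> fields = new ArrayList<>();
        Matcher m = FIELD.matcher(body);
        while (m.find()) {
            fields.add(m.group());
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(lastCaptureOnly("a;b;c")); // prints c
        System.out.println(parse("01%3A%35r%07%01P%88%00;KPN;A23B"));
    }
}
```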

([^;]*);?([^;]*)?;?([^;]*)?;?([^;]*)?;?([^;]*)?;?([^;]*)?;?([^;]*)?;?([^;]*)?;?([^;]*)?;?([^;]*)?;?([^;]*)?;?([^;]*)?;?([^;]*)?;?([^;]*)?;?([^;]*)?
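Here is a rough sketch of how a pattern of this shape might be used from Java. It builds the regex in a loop instead of typing it out, then drops the empty trailing groups. MAX_GROUPS is an arbitrary illustrative limit and the class and method names are hypothetical. Note that group 1 still includes the leading 01, so the sketch strips it afterwards, assuming the prefix is always that literal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FixedGroupsSketch {
    // Hypothetical upper bound on the number of fields; pick what fits your data.
    static final int MAX_GROUPS = 15;

    // Builds ([^;]*) followed by repeated ;?([^;]*)? pieces, the same shape as above.
    static Pattern build(int maxGroups) {
        StringBuilder sb = new StringBuilder("([^;]*)");
        for (int i = 1; i < maxGroups; i++) {
            sb.append(";?([^;]*)?");
        }
        return Pattern.compile(sb.toString());
    }

    static List<String> parse(String input) {
        List<String> groups = new ArrayList<>();
        Matcher m = build(MAX_GROUPS).matcher(input);
        if (!m.matches()) {
            return groups; // more fields than MAX_GROUPS: nothing is returned
        }
        for (int i = 1; i <= m.groupCount(); i++) {
            String g = m.group(i);
            groups.add(g == null ? "" : g);
        }
        // Identify and drop the empty trailing groups.
        int end = groups.size();
        while (end > 0 && groups.get(end - 1).isEmpty()) {
            end--;
        }
        List<String> result = new ArrayList<>(groups.subList(0, end));
        // Assumption: the prefix to discard is the literal "01".
        if (!result.isEmpty() && result.get(0).startsWith("01")) {
            result.set(0, result.get(0).substring(2));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("01%3A%35r%07%01P%88%00;WAP_GPRS"));
    }
}
```

A genuinely empty last field cannot be told apart from an unused group this way, which is the trade-off of the fixed-group approach.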