ssr1012 - 4 months ago
Perl Question

How to abbreviate page-number ranges in an index using Perl

In an index file we have primary, secondary and tertiary lines. These lines contain page-number ranges such as:

nutrients in, 223-234
reproductive phase of, 115-116,

They should become:

nutrients in, 223-34
reproductive phase of, 115-16,

The page numbers may have three or more digits. Could anyone please help me with this?


We start by finding a digits-digits string in which both sets of digits have the same length, without consuming any of it. This takes a lookahead that matches a balanced set of digits (see for a good explanation), followed by a negative lookahead that makes sure no more digits follow (so we don't simplify 120-1234 into 120-34) and that the range isn't part of something like 11-12-3, which we don't want to try to handle. Note that extra digits before the balanced digits are fine; this lets us further simplify partially simplified ranges like 123-24.
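The balanced-digits check described above can be sketched on its own. This is only an illustration under stated assumptions, not the answerer's original pattern: the helper name has_balanced_range is hypothetical, the extra leading digits are skipped just before the lookahead rather than inside it, and the (?<!\d-) guard is an added assumption so that the tail of a chain like 11-12-3 is not tested as a range of its own.

```perl
use strict;
use warnings;
use v5.14;    # /a needs Perl 5.14; (?1) needs 5.10

# Hypothetical helper: returns 1 when the text contains a range whose
# second number is no longer than the first and that is not part of a
# longer hyphenated chain, 0 otherwise.
sub has_balanced_range {
    my ($text) = @_;
    return $text =~ m{
        (?<! \d- ) \b                  # start of a number, not the tail of a range
        \d*?                           # extra leading digits are allowed
        (?=
            ( \d (?: (?1) | - ) \d )   # n digits, a hyphen, n digits (recursive)
            (?! \d | -\d )             # no further digit and no "-digit" after it
        )
    }xa ? 1 : 0;
}

say has_balanced_range('223-234');     # 1
say has_balanced_range('123-24');      # 1
say has_balanced_range('120-1234');    # 0
say has_balanced_range('11-12-3');     # 0
```

The recursive group (?1) is what forces the two digit runs to have equal length: every digit taken on the left of the hyphen must be paired with one on the right.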

Once we've done that, we capture as many digits from the first number as possible, such that at least some digits remain after them and the second number starts with the same digits (checked with the backreference \2). \K resets where the match starts, so only the text after it is replaced and the replacement can stay empty. The /a modifier makes \d mean just 0-9 rather than any other kind of digit.
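Putting the two steps together, a substitution along these lines might look like the sketch below. It is a reconstruction from the description, not the answerer's original code. The helper name shrink_ranges and the $keep parameter (the minimum number of digits left in the second number) are assumptions introduced here. With $keep = 1 the shared leading digits are stripped as far as possible, which matches the 123-24 remark above; $keep = 2 reproduces the 115-16 output asked for in the question.

```perl
use strict;
use warnings;
use v5.14;    # /a needs Perl 5.14; \K and (?1) need 5.10

# Hypothetical helper, not the answerer's original code.
# The second argument is an assumed parameter: the minimum number of
# digits that must be left in the second number (default 1).
sub shrink_ranges {
    my ($text, $keep) = @_;
    $keep //= 1;
    $text =~ s{
        (?<! \d- ) \b                  # start of a number, not the tail of a range
        \d*?                           # extra leading digits of a longer first number
        (?=
            ( \d (?: (?1) | - ) \d )   # group 1: n digits, a hyphen, n digits
            (?! \d | -\d )             # nothing numeric may follow
        )
        ( \d+ )                        # group 2: longest shared leading digits
        \d{$keep,} -                   # digits left in the first number, then the hyphen
        \K                             # keep everything matched so far
        \2                             # the repeated digits, replaced by nothing
    }{}xag;
    return $text;
}

say shrink_ranges('nutrients in, 223-234');                 # nutrients in, 223-34
say shrink_ranges('reproductive phase of, 115-116,', 2);    # reproductive phase of, 115-16,
say shrink_ranges('123-24');                                # 123-4
say shrink_ranges('120-1234');                              # 120-1234
```

Because group 2 is greedy, the regex engine tries the longest shared prefix first and backs off one digit at a time until the second number starts with the same digits.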