GlassGhost - 7 months ago
Perl Question

How to use a variable as a recursive regex in Perl?

I'm writing a simple translator from John Tromp's Binary Lambda Calculus to De Bruijn-notation lambda calculus, so that I can understand how his lambda files work in his 2012 "Most Functional" International Obfuscated C Code Contest winner.

Here is an example of the language before translation, primes.blc:

00010001100110010100011010000000010110000010010001010111110111101001000110100001110011010000000000101101110011100111111101111000000001111100110111000000101100000110110


I'm having trouble with the nested regex in the commented-out line just before the primes.txt file-save section of Bruijn.pl:

#!/usr/bin/env perl
#use strict;
use warnings;
use IO::File;
use Cwd; my $originalCwd = getcwd()."/";
#primes.blc as argument for test conversion
#______________________________________________________________________open file
my ($name) = @ARGV;
$FILE = new IO::File;
$FILE->open("< ".$originalCwd."primes.blc") || die("Could not open file!");
#$FILE->open("< ".$name) || die("Could not open file!");
while (<$FILE>){ $field .= $_; }
$FILE->close;
#______________________________________________________________________Translate
$field =~ s/(00|01|(1+0))/$1 /gsm;
$field =~ s/00 /\\ /gsm;
$field =~ s/01 /(a /gsm;
$field =~ s/(1+)0 /length($1)." "/gsme;

$RecursParenthesesRegex = m/\(([^()]+|(??{$RecursParenthesesRegex}))*\)/;
#$field =~ 1 while s/(\(a){1}(([\s\\]+?(\d+|$RecursParenthesesRegex)){2})/\($2\)/sm;
#______________________________________________________________________save file
#$fh = new IO::File "> ".$name;
$fh = new IO::File "> ".$originalCwd."primes.txt";
if (defined $fh) { print $fh $field; $fh->close; }


And this is what the translated file, primes.txt, should be:

\ (\ (1 (1 ((\ (1 1) \ \ \ ((1 \ \ 1) (\ (((4 4) 1) (\ (1 1) \ (2 (1 1)))) \ \ \ \ ((1 3) (2 (6 4)))))) \ \ \ (4 (1 3))))) \ \ ((1 \ \ 2) 2))


Currently, with that line commented out, it translates to an almost readable format that looks like:

\ (a \ (a 1 (a 1 (a (a \ (a 1 1 \ \ \ (a (a 1 \ \ 1 (a \ (a (a (a 4 4 1 (a \ (a 1 1 \ (a 2 (a 1 1 \ \ \ \ (a (a 1 3 (a 2 (a 6 4 \ \ \ (a 4 (a 1 3 \ \ (a (a 1 \ \ 2 2


What remains is to find each innermost application, marked by (a and followed by two operands (each either a number or a matching pair of parentheses with all their contents), insert a trailing ) after the second operand, and remove the a, repeating all the way up to the outermost application.
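
For comparison, the same result can be reached without any regex loop by reading the bit string with a small recursive-descent parser. This is a different technique from the one asked about, shown only as a minimal sketch: the names parse_term and blc_to_debruijn_rd are made up for illustration, and it assumes the input is one complete term with no padding bits.

```perl
use strict;
use warnings;

# Hypothetical helper: consume one term from the front of the string
# referenced by $bits_ref and return its De Bruijn text.
sub parse_term {
    my ($bits_ref) = @_;
    if ($$bits_ref =~ s/\A00//) {            # 00: abstraction, followed by its body
        return '\\ ' . parse_term($bits_ref);
    }
    if ($$bits_ref =~ s/\A01//) {            # 01: application of exactly two terms
        my $left  = parse_term($bits_ref);
        my $right = parse_term($bits_ref);
        return "($left $right)";
    }
    if ($$bits_ref =~ s/\A(1+)0//) {         # n ones then a zero: variable index n
        return length $1;
    }
    die "Malformed or truncated term\n";
}

# Hypothetical wrapper: strip whitespace (for example a trailing newline
# read from a file) and parse a single term.
sub blc_to_debruijn_rd {
    my ($bits) = @_;
    $bits =~ s/\s+//g;
    return parse_term(\$bits);
}

print blc_to_debruijn_rd("0100100010"), "\n";   # prints (\ 1 \ 1)
```

Because the closing parenthesis is emitted as soon as both operands of an application have been read, no second pass over the text is needed.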

sln
Answer

You probably need a regex like this

 # (\(a)(([\s\\]*?(?:\d+|(?&RecursParens))){2})(?(DEFINE)(?<RecursParens>(?>\((?>(?>[^()]+)|(?:(?=.)(?&RecursParens)|))+\))))

 ( \(a )                       # (1)
 (                             # (2 start)
      (                             # (3 start)
           [\s\\]*? 
           (?:
                \d+ 
             |  
                (?&RecursParens) 
           )
      ){2}                          # (3 end)
 )                             # (2 end)

 (?(DEFINE)

      (?<RecursParens>              # (4 start)
           (?>
                \(
                (?>
                     (?> [^()]+ )
                  |  (?:
                          (?= . )
                          (?&RecursParens) 
                       |  
                     )
                )+
                \)
           )
      )                             # (4 end)
 )
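
To see the recursive part in isolation, here is a minimal sketch of a named subpattern declared inside (?(DEFINE)...) and called with (?&NAME). The names Balanced and is_balanced are made up for this example, and the subpattern is a simplified form of RecursParens above (it uses * where the regex above uses + with an empty alternative). It needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# The DEFINE block matches nothing by itself; the subpattern inside it
# only runs when it is called through (?&Balanced).
my $balanced = qr{
    \A (?&Balanced) \z
    (?(DEFINE)
        (?<Balanced>
            \(                   # opening parenthesis
            (?: [^()]++          # a run of non-parenthesis characters
              | (?&Balanced)     # or a nested balanced group (the recursion)
            )*
            \)                   # matching closing parenthesis
        )
    )
}x;

# Hypothetical helper: 1 if the whole string is one balanced group, else 0.
sub is_balanced { return $_[0] =~ $balanced ? 1 : 0 }

print is_balanced("( 1 ( 1 1))"), "\n";   # prints 1
print is_balanced("( 1 ( 1 1)"), "\n";    # prints 0
```

Since the recursion is expressed by name inside a single pattern, no variable has to refer to itself.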

The regex can be used in Perl code like this:

use strict;
use warnings;
use feature qw{say};

my $field = "00010001100110010100011010000000010110000010010001010111110111101001000110100001110011010000000000101101110011100111111101111000000001111100110111000000101100000110110";

$field =~ s/(00|01|(1+0))/$1 /g;
$field =~ s/00 /\\ /g;
$field =~ s/01 /(a /g;
$field =~ s/(1+)0 /length($1)." "/ge;

1 while $field =~ s/(\(a)(([\s\\]*?(?:\d+|(?&RecursParens))){2})(?(DEFINE)(?<RecursParens>(?>\((?>(?>[^()]+)|(?:(?=.)(?&RecursParens)|))+\))))/\($2\)/g;

$field =~ s/\( /\(/g;

say $field;

That will give you an output like this

\ (\ (1 (1 ((\ (1 1) \ \ \ ((1 \ \ 1) (\ (((4 4) 1) (\ (1 1) \ (2 (1 1)))) \ \ \ \ ((1 3) (2 (6 4)))))) \ \ \ (4 (1 3))))) \ \ ((1 \ \ 2) 2))

The output can be formatted to look like this:

 \ 
 (                             # (1 start)
      \ 
      (                             # (2 start)
           1 
           (                             # (3 start)
                1 
                (                             # (4 start)
                     (                             # (5 start)
                          \ 
                          ( 1 1 )                       # (6)
                          \ \ \ 
                          (                             # (7 start)
                               ( 1 \ \ 1 )                   # (8)
                               (                             # (9 start)
                                    \ 
                                    (                             # (10 start)
                                         (                             # (11 start)
                                              ( 4 4 )                       # (12)
                                              1
                                         )                             # (11 end)
                                         (                             # (13 start)
                                              \ 
                                              ( 1 1 )                       # (14)
                                              \ 
                                              (                             # (15 start)
                                                   2 
                                                   ( 1 1 )                       # (16)
                                              )                             # (15 end)
                                         )                             # (13 end)
                                    )                             # (10 end)
                                    \ \ \ \ 
                                    (                             # (17 start)
                                         ( 1 3 )                       # (18)
                                         (                             # (19 start)
                                              2 
                                              ( 6 4 )                       # (20)
                                         )                             # (19 end)
                                    )                             # (17 end)
                               )                             # (9 end)
                          )                             # (7 end)
                     )                             # (5 end)
                     \ \ \ 
                     (                             # (21 start)
                          4 
                          ( 1 3 )                       # (22)
                     )                             # (21 end)
                )                             # (4 end)
           )                             # (3 end)
      )                             # (2 end)
      \ \ 
      (                             # (23 start)
           ( 1 \ \ 2 )                   # (24)
           2
      )                             # (23 end)
 )                             # (1 end)
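
If you would rather keep the recursive pattern in a variable, as in the question's original attempt, a sketch along these lines should also work. The main change is to build the pattern with qr// instead of m//, because m// runs a match against $_ immediately and stores the result rather than a pattern. The (??{ ... }) block is evaluated at match time, so the variable can refer to itself. The function name blc_to_debruijn is made up for this example, and the use re 'eval' line is a precaution that I would only expect to matter on perls older than 5.18.

```perl
use strict;
use warnings;
use re 'eval';   # precaution for older perls; assumed harmless on newer ones

# Balanced parentheses. (??{ $parens }) is looked up while matching,
# after $parens has been assigned, so the pattern can call itself.
my $parens;
$parens = qr/\((?:[^()]++|(??{ $parens }))*\)/;

# Hypothetical wrapper around the translation steps shown above.
sub blc_to_debruijn {
    my ($field) = @_;
    $field =~ s/\s+//g;                       # drop newlines from file input
    $field =~ s/(00|01|1+0)/$1 /g;            # split into tokens
    $field =~ s/00 /\\ /g;                    # 00 becomes a lambda
    $field =~ s/01 /(a /g;                    # 01 becomes an open application
    $field =~ s/(1+)0 /length($1)." "/ge;     # n ones and a zero become index n

    # Close each application once both operands are a number or a balanced group.
    1 while $field =~ s/\(a((?:[\s\\]*?(?:\d+|$parens)){2})/($1)/g;

    $field =~ s/\( /(/g;                      # tidy the space after each "("
    $field =~ s/\s+\z//;                      # drop the trailing space
    return $field;
}

print blc_to_debruijn("0100100010"), "\n";    # prints (\ 1 \ 1)
```

Like the regex above, this matches indices with \d+, so it assumes neighbouring indices are always separated by whitespace.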