IMarks - 5 months ago
PHP Question

PHP autodetect translatables / detect piece of code by regex

I have a multilanguage site that stores its translatable strings in a default.php file, which contains an array with all the keys.

I would prefer to generate this file automatically.
I already have a (singleton) class that can detect all my files by type (controller, action, view, model, etc.).

I would like to detect any piece of code that has one of these formats:

$this->translate('[a-zA-Z]');
$view->translate('[a-zA-Z]');
getView()->translate('[a-zA-Z]');
throw new Exception('[a-zA-Z]');
addMessage(array('message' => '[a-zA-Z]'));


However, a match must be filtered out when the key starts with or contains one of the following:

$this->translate('([0-9]+_)[a-zA-Z]');
$this->translate('[a-zA-Z]' . $* . '[a-zA-Z]'); // Only a variable in the middle must be filtered; a variable at the beginning or end is still allowed


Of course, [a-zA-Z] is just a regex example.

As I said, I already have a class that detects certain files. This class also makes use of Reflection (or in this case Zend Reflection, as I'm using Zend). However, I could not see a way to inspect a function body using a regex.

The action will run from a cronjob and as a manually called action, so it is not a big issue if the memory usage is a bit 'too' large.

Answer

Description

[$]this->translate[(]'((?:[^'\\]|\\.|'')*)'[)];

This regular expression does the following:

  • matches code blocks starting with $this->translate(' through the closing ');
  • places the value inside the ' quotes into capture group 1
  • avoids messy edge cases where the substring contains what looks like a closing '); but the quote is actually escaped
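
To make the escaped-quote edge case concrete, here is a minimal PHP sketch that compares the pattern above with a naive lazy pattern. The naive pattern, the ~ delimiter and the input line are assumptions chosen for illustration; they are not part of the original answer.

```php
<?php
// Hypothetical input: the string literal contains an escaped quote followed
// by ");", which looks like the end of the call but is not.
$source = <<<'PHP'
$this->translate('Droids\'); here');
PHP;

// Naive pattern for comparison (an assumption): stops at the first ');
$naive = <<<'REGEX'
~[$]this->translate[(]'(.*?)'[)];~
REGEX;

// The pattern from the answer, wrapped in ~ delimiters. A nowdoc keeps the
// backslashes exactly as written.
$robust = <<<'REGEX'
~[$]this->translate[(]'((?:[^'\\]|\\.|'')*)'[)];~
REGEX;

preg_match($naive, $source, $n);
preg_match($robust, $source, $r);

echo $n[1], PHP_EOL; // Droids\
echo $r[1], PHP_EOL; // Droids\'); here
```

The captured value is the raw source text, so escape sequences such as \' are still present in group 1.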

Example

Live Demo

https://regex101.com/r/eC5xQ6/

Sample text

$This->Translate('(?:Droids\');{2}');
$NotTranalate('fdasad');
$this->translate('[a-zA-Z]');

Sample Matches

MATCH 1
1.  [17-33] `(?:Droids\');{2}`

MATCH 2
1.  [79-87] `[a-zA-Z]`
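
The sample can be reproduced in PHP with preg_match_all. Because the first sample line is written as $This->Translate, this sketch assumes the demo ran with the case-insensitive i modifier; the modifier and the ~ delimiter are assumptions, as the answer does not show them.

```php
<?php
// The answer's pattern with the assumed "i" modifier, so that
// $This->Translate matches as well as $this->translate.
$pattern = <<<'REGEX'
~[$]this->translate[(]'((?:[^'\\]|\\.|'')*)'[)];~i
REGEX;

// The sample text from above, copied verbatim.
$sample = <<<'TXT'
$This->Translate('(?:Droids\');{2}');
$NotTranalate('fdasad');
$this->translate('[a-zA-Z]');
TXT;

// $matches[1] holds the contents of capture group 1 for every match.
preg_match_all($pattern, $sample, $matches);
$captures = $matches[1];

print_r($captures);
```

This should yield two captures, the text between the quotes on the first and third lines; the second line is skipped because it is not a translate call.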

Explanation

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  [$]                     any character of: '$'
----------------------------------------------------------------------
  this->translate           'this->translate'
----------------------------------------------------------------------
  [(]                      any character of: '('
----------------------------------------------------------------------
  '                        '\''
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    (?:                      group, but do not capture (0 or more
                             times (matching the most amount
                             possible)):
----------------------------------------------------------------------
      [^'\\]                   any character except: ''', '\\'
----------------------------------------------------------------------
     |                        OR
----------------------------------------------------------------------
      \\                       '\'
----------------------------------------------------------------------
      .                        any character except \n
----------------------------------------------------------------------
     |                        OR
----------------------------------------------------------------------
      ''                       '\'\''
----------------------------------------------------------------------
    )*                       end of grouping
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  '                        '\''
----------------------------------------------------------------------
  [)]                      any character of: ')'
----------------------------------------------------------------------
  ;                        ';'
----------------------------------------------------------------------
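
The expression above only covers $this->translate(...). As a rough sketch of how the same idea might be extended to the other call forms and the two filters from the question, something like the following could work. The function name collectTranslatables, the broadened prefixes (any ->translate( call, new Exception(, and a 'message' => array key) and the sample input are all assumptions. The '' alternative is left out because PHP string literals do not escape quotes by doubling them. Calls whose argument starts with a variable are not handled here.

```php
<?php
// Extended-mode (x) pattern: whitespace and # comments inside it are ignored.
$pattern = <<<'REGEX'
~
(?: ->translate \s* \(              # $this->, $view->, getView()->translate(
  | new \s+ Exception \s* \(        # throw new Exception(
  | 'message' \s* =>                # addMessage(array('message' => ...
)
\s* ' ( (?: [^'\\] | \\. )* ) '     # group 1: the single-quoted key
( \s* \. \s* \$\w+ \s* \. \s* ' )?  # group 2: variable in the middle
~x
REGEX;

/**
 * Collects unique translatable keys, skipping keys with a numeric prefix
 * ("12_...") and strings with a variable concatenated in the middle.
 */
function collectTranslatables(string $source, string $pattern): array
{
    preg_match_all($pattern, $source, $matches, PREG_SET_ORDER);
    $keys = [];
    foreach ($matches as $m) {
        $variableInMiddle = !empty($m[2]);
        $numericPrefix = preg_match('~^[0-9]+_~', $m[1]) === 1;
        if (!$variableInMiddle && !$numericPrefix) {
            $keys[] = $m[1];
        }
    }
    // De-duplicate while keeping first-seen order.
    return array_values(array_unique($keys));
}

// Hypothetical source text covering each case from the question.
$source = <<<'PHP'
$this->translate('Welcome');
$view->translate('Goodbye');
$this->getView()->translate('Profile');
throw new Exception('Not allowed');
$form->addMessage(array('message' => 'Saved'));
$this->translate('12_internal');
$this->translate('Hello ' . $name . ', welcome');
$this->translate('Total: ' . $count);
PHP;

print_r(collectTranslatables($source, $pattern));
```

With this input the sketch keeps Welcome, Goodbye, Profile, Not allowed, Saved and "Total: " (a trailing variable is allowed), and drops 12_internal and the string with $name in the middle. In a real run, $source would come from the files the existing detection class finds, for example via file_get_contents.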