Alan Alan - 1 year ago 97
PHP Question

variable length masking with preg_replace

I am masking all characters between single quotes (inclusive) within a string using preg_replace_callback(). I would prefer to use only preg_replace() if possible, but I haven't been able to figure out how. Any help would be appreciated.

This is what I have using preg_replace_callback(), which produces the correct output:

function maskCallback( $matches ) {
    // Replace the whole quoted run with dashes of the same length
    return str_repeat( '-', strlen( $matches[0] ) );
}

function maskString( $str ) {
    return preg_replace_callback( "('.*?')", 'maskCallback', $str );
}

$str = "TEST 'replace''me' ok 'me too'";
echo $str,"\n";
echo maskString( $str ),"\n";

Output is:

TEST 'replace''me' ok 'me too'
TEST ------------- ok --------

I have tried using:

preg_replace( "/('.*?')/", '-', $str );

but the dashes get consumed, e.g.:

TEST -- ok -

Everything else I have tried doesn't work either. (I'm obviously not a regex expert.) Is this possible to do? If so, how?

Answer Source

Yes, you can do it (assuming that the quotes are balanced). For example:

$str = "TEST 'replace''me' ok 'me too'";
$pattern = "~[^'](?=[^']*(?:'[^']*'[^']*)*+'[^']*\z)|'~";    
$result = preg_replace($pattern, '-', $str);

The idea is: you can replace a character if it is a quote or if it is followed by an odd number of quotes.
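As a minimal sketch of how the balanced-quotes assumption could be made explicit, the pattern above can be wrapped in a small helper. The function name maskQuotedInclusive() and the exception it throws are illustrative assumptions, not part of the original answer:

```php
<?php
// Hypothetical helper: rejects input with an odd number of single quotes,
// because the lookahead only works as intended when quotes are balanced.
function maskQuotedInclusive(string $str): ?string
{
    if (substr_count($str, "'") % 2 !== 0) {
        throw new InvalidArgumentException('Unbalanced single quotes.');
    }

    // A non-quote character that is followed by an odd number of quotes
    // (counted to the end of the string) lies inside a quoted run.
    // The second alternative masks the quotes themselves.
    $pattern = "~[^'](?=[^']*(?:'[^']*'[^']*)*+'[^']*\z)|'~";

    return preg_replace($pattern, '-', $str);
}

echo maskQuotedInclusive("TEST 'replace''me' ok 'me too'"), "\n";
// prints: TEST ------------- ok --------
```

Without the guard, an odd number of quotes would make the lookahead treat text before the first quote as quoted, so checking up front seems the safer choice.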

To mask only the characters between the quotes and leave the quotes themselves untouched:

$pattern = "~(?:(?!\A)\G|(?:(?!\G)|\A)'\K)[^']~";
$result = preg_replace($pattern, '-', $str);

The pattern matches a character only when it is contiguous to the previous match (in other words, when it comes immediately after the last match) or when it is preceded by a quote that is not contiguous to the previous match.

\G is the position after the last match (on the first attempt, it is the start of the string).
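To see the contiguity rule in action, the following sketch lists which characters the pattern matches and at which byte offsets. The sample subject string is an assumption chosen for illustration:

```php
<?php
$pattern = "~(?:(?!\A)\G|(?:(?!\G)|\A)'\K)[^']~";
$subject = "a 'bc' d 'e'";   // illustrative input, not from the question

// With PREG_OFFSET_CAPTURE each entry of $matches[0] is [text, byte offset].
preg_match_all($pattern, $subject, $matches, PREG_OFFSET_CAPTURE);

$chars   = array_column($matches[0], 0);
$offsets = array_column($matches[0], 1);

echo implode(',', $chars), ' at offsets ', implode(',', $offsets), "\n";
// prints: b,c,e at offsets 3,4,10
```

The characters "b" and "c" form one contiguous run started by the opening quote at offset 2. The closing quote at offset 5 breaks the run, so " d " is skipped until the next opening quote at offset 9.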

pattern details:

~             # pattern delimiter

(?: # non capturing group: describes the two possibilities
    # before the target character

    (?!\A)\G  # at the position in the string after the last match
              # the negative lookahead ensures that this is not the start
              # of the string

  |           # OR

    (?:       # (to ensure that the quote is not a closing quote)
        (?!\G)   # not contiguous to a previous match
      |          # OR
        \A       # at the start of the string
    )
    '         # the opening quote

    \K        # remove all previous characters from the match result
              # (only one quote here)
)

[^']          # a character that is not a quote


Note that since the closing quote is not matched by the pattern, the non-quote characters that follow it can't be matched, because there is no previous match for them to be contiguous to.
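The commented breakdown can also be kept in the code itself by using the x (extended) modifier, which ignores whitespace and # comments inside the pattern. This is only a sketch; the nowdoc label REGEX is an arbitrary choice:

```php
<?php
// Same pattern, written with the x modifier. The comments must not contain
// the delimiter character, or PHP would treat it as the end of the pattern.
$pattern = <<<'REGEX'
~
(?:
    (?!\A)\G            # right after the previous match, not at the string start
  |
    (?: (?!\G) | \A )   # not contiguous to a previous match, or at the string start
    '                   # the opening quote
    \K                  # keep the quote out of the reported match
)
[^']                    # one character that is not a quote
~x
REGEX;

$str = "TEST 'replace''me' ok 'me too'";
$result = preg_replace($pattern, '-', $str);

echo $result, "\n";
// prints: TEST '-------''--' ok '------'
```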


The (*SKIP)(*FAIL) way:

Instead of testing whether a single quote is not a closing quote with (?:(?!\G)|\A)' as in the previous pattern, you can break the match contiguity on closing quotes using the backtracking control verbs (*SKIP) and (*FAIL) (which can be shortened to (*F)).

$pattern = "~(?:(?!\A)\G|')(?:'(*SKIP)(*F)|\K[^'])~";
$result = preg_replace($pattern, '-', $str);

Since the pattern fails on each closing quote, the following characters will not be matched until the next opening quote.

The pattern may be more efficient when written like this:

$pattern = "~(?:\G(?!\A)(?:'(*SKIP)(*F))?|'\K)[^']~";

(You can also use (*PRUNE) in place of (*SKIP).)
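A quick way to check that these variants behave the same on the question's input is to run them side by side. The array keys below are illustrative labels only, and the (*PRUNE) pattern is simply the first (*SKIP) pattern with the verb swapped:

```php
<?php
$str = "TEST 'replace''me' ok 'me too'";

$patterns = [
    'skip'    => "~(?:(?!\A)\G|')(?:'(*SKIP)(*F)|\K[^'])~",
    'compact' => "~(?:\G(?!\A)(?:'(*SKIP)(*F))?|'\K)[^']~",
    'prune'   => "~(?:(?!\A)\G|')(?:'(*PRUNE)(*F)|\K[^'])~",
];

$results = [];
foreach ($patterns as $name => $pattern) {
    // Each variant masks only the characters between the quotes.
    $results[$name] = preg_replace($pattern, '-', $str);
    echo str_pad($name, 8), $results[$name], "\n";
}
// each line ends with: TEST '-------''--' ok '------'
```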