abkrim abkrim - 2 months ago 11
PHP Question

Regex to extract the numbers and extension from a string

I use the code below to extract the numbers and the file name from strings that are not named consistently:

30183308__90_.jpeg
30193253-(100).jpg
30193253__100__.jpg
30193253_ _100_ _.jpg


I use this function:

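// Declared as a plain function (no "public") so the standalone call below works.
function refactorFileName($filename)
{
    // Split on every non-alphanumeric character; consecutive separators yield empty strings.
    $array = preg_split("/[^A-Za-z0-9]/", $filename);
    foreach ($array as $key => $value) {
        if ($value == "") {
            unset($array[$key]);
        }
    }
    // Reindex so the parts sit at 0 (name), 1 (index), 2 (extension).
    $array = array_values($array);
    $standardFilename = $array[0].'.'.$array[2];
    $indexFile = $array[1];

    return compact("indexFile","standardFilename");
}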

$filename = '30193253_ _100_ _.jpg';
extract(refactorFileName($filename));
echo "New File name -> ".$standardFilename.PHP_EOL;
echo "Index for file -> ".$indexFile.PHP_EOL;


This shows the correct output:

New File name -> 30193253.jpg
Index for file -> 100


I think there is a better way to write the regular expression.

EDIT:
Is there a better pattern for preg_split, or a better approach in general, for this problem?

Answer

Two things: 1) It is easier if you put a quantifier in your pattern, which avoids the useless foreach afterwards. (Note that preg_split also has the PREG_SPLIT_NO_EMPTY option to drop empty items.)
2) Sometimes too much verbosity hurts readability.
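
To illustrate the note about PREG_SPLIT_NO_EMPTY, here is a minimal sketch that keeps the pattern from the question unchanged and only adds the flag. The helper name splitKeepingPattern is made up for this example.

```php
<?php
// Hypothetical helper: same character class as in the question, no quantifier.
// PREG_SPLIT_NO_EMPTY drops the empty strings that preg_split would otherwise
// return between consecutive separators, so no cleanup loop is needed.
function splitKeepingPattern($filename) {
    // -1 means "no limit"; the flag goes in the fourth argument.
    return preg_split('/[^A-Za-z0-9]/', $filename, -1, PREG_SPLIT_NO_EMPTY);
}

print_r(splitKeepingPattern('30193253_ _100_ _.jpg'));
```

The returned array is already indexed from 0, so the foreach and array_values steps from the question should no longer be necessary.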

You can rewrite it this way:

function refactorFileName($filename) {
    $p = preg_split('~[\W_]+~', $filename, 3);

    return [ 'indexFile' => $p[1], 'standardFilename' => "$p[0].$p[2]" ];
}

Or if you want to be more verbose:

function refactorFileName($filename) {
    list($name, $index, $ext) = preg_split('~[\W_]+~', $filename, 3);

    return [ 'indexFile' => $index, 'standardFilename' => "$name.$ext" ];
}
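
As a quick check, the list() version can be run over the four sample names from the question. This is only a sketch: the $samples array and the loop are added here for illustration, and it assumes every name contains all three parts (a name with fewer parts would leave some of the list() variables unset).

```php
<?php
function refactorFileName($filename) {
    // One or more non-word characters or underscores act as a single separator;
    // the limit of 3 keeps everything after the second separator as the extension.
    list($name, $index, $ext) = preg_split('~[\W_]+~', $filename, 3);

    return [ 'indexFile' => $index, 'standardFilename' => "$name.$ext" ];
}

// The four sample names from the question.
$samples = [
    '30183308__90_.jpeg',
    '30193253-(100).jpg',
    '30193253__100__.jpg',
    '30193253_ _100_ _.jpg',
];

foreach ($samples as $sample) {
    $r = refactorFileName($sample);
    echo $sample, ' -> ', $r['standardFilename'], ' (index ', $r['indexFile'], ')', PHP_EOL;
}
```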

(As an aside, when you already have working code, ask your question on Code Review instead of SO.)