Jokab - 7 months ago
Python Question

Regex capturing inserted variables in PHP SQL string statements

I'm having some regex troubles, using Python 2.7 if that matters.

Basically what I'm trying to do is to capture inserted variables in a PHP SQL query string declaration, for example:

$query = "SELECT * FROM `users` WHERE user='$user' AND password='$pass';";


This should return $user when I get the second group from the match.

Here's my regex as it stands right now:

r'.*?\s*=\s*\(\".*?\'(\$[^\']+)\'.*?\"\);'


Example showing that this works and captures $user, but not the one above (yes, I know it doesn't capture $pass as it ideally should; that seems to be a limitation of Python's implementation and of regex in general. I use some hacks to get around this in my actual program).


The above works for the example I used. However, when I introduce another case where the inserted variable uses the syntax '{$foo['bar']}', my other regex below doesn't work, even though it is meant to account for the fact that the variable contains an apostrophe which doesn't close it:

r'.*?\s*=\s*[\(]?\".*?(?:(?:\'(\$[^\']+)\')|(?:\'(\$\{[^\}]+\})\'))?.*?\"[\)]?;'


So basically I want to capture either the '$user' syntax or the one with { }, for example '{$foo['bar']}'. Note that these are not exclusive; an inserted variable may be of either kind and I want to account for both.

Here's a link to test this out, showing that it doesn't work. Using the second regex also breaks capturing the simple $user; I'm not sure why.

AKS
Answer

I am not sure what you mean by a limitation in Python, because the following works as it should:

>>> import re
>>> query = "SELECT * FROM `users` WHERE user='$user' AND password='$pass';"
>>> re.findall(r"='(\$\w+)'", query)
['$user', '$pass']
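My guess is that the limitation comes from matching the whole line with one pattern: a single match object keeps only one value per capture group, whereas re.findall scans the text repeatedly and returns every hit. A minimal sketch of the difference, assuming a hypothetical PHP line wrapped in parentheses so that the first regex from the question applies:

```python
import re

# Hypothetical PHP line (not from the question), parenthesised so the
# question's first regex can match it.
line = "$query = (\"SELECT * FROM users WHERE user='$user' AND password='$pass'\");"

# First regex from the question: one match over the whole line, one capture group.
whole_line = re.compile(r'.*?\s*=\s*\(\".*?\'(\$[^\']+)\'.*?\"\);')
single = whole_line.match(line)
first_only = single.group(1) if single else None

# Short pattern applied repeatedly over the same text.
all_vars = re.findall(r"='(\$\w+)'", line)

print(first_only)  # $user
print(all_vars)    # ['$user', '$pass']
```

So no hacks should be needed for the second variable as long as the pattern describes one variable and findall (or finditer) does the repetition.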

To match the other query, have a look at this regex:

='(\{?\$.+?)(?:'(?:\s|;))
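Read piece by piece, the pattern above can be written with re.VERBOSE so that each part carries its own comment. This is only a sketch; the name INTERPOLATED and the sample fragment are made up for illustration:

```python
import re

# Same pattern as above, spelled out in verbose mode.
INTERPOLATED = re.compile(r"""
    ='              # an equals sign and the opening apostrophe
    (\{?\$.+?)      # group 1: optional {, then $, then as few characters as possible
    (?:'(?:\s|;))   # closing apostrophe, accepted only before whitespace or ;
""", re.VERBOSE)

# Hypothetical fragment containing both kinds of inserted variable.
sample = "WHERE a='$x' AND b='{$row['id']}';"
found = INTERPOLATED.findall(sample)
print(found)  # ['$x', "{$row['id']}"]
```

The last part is what lets the apostrophes inside ['id'] pass: they are followed by a letter or a bracket, so they are not taken as the closing apostrophe. The flip side is that a value followed directly by ) or , is not handled by the pattern as written.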

And a code example:

>>> query1 = "(\"SELECT table_schema, table_name, create_time FROM information_schema.tables WHERE table_schema='{$_DVWA['db_database']}' AND table_name='users' LIMIT 1\");"
>>> re.findall(r"='(\{?\$.+?)(?:'(?:\s|;))", query1)
["{$_DVWA['db_database']}"]

# it works on the other query as well
>>> re.findall(r"='(\{?\$.+?)(?:'(?:\s|;))", query)
['$user', '$pass']
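As for why the second regex in the question stops capturing even the simple $user, the likely reason is that the capturing part is wrapped in an optional group that follows a lazy .*?. The lazy part first matches nothing, the optional group fails at that position and is skipped, and the trailing .*? then consumes the rest of the string, so the match succeeds without ever filling a group. The sketch below reproduces this with the query from the question:

```python
import re

# The PHP line from the question, as a Python string.
line = "$query = \"SELECT * FROM `users` WHERE user='$user' AND password='$pass';\";"

# Second regex from the question, unchanged.
second = re.compile(
    r'.*?\s*=\s*[\(]?\".*?(?:(?:\'(\$[^\']+)\')|(?:\'(\$\{[^\}]+\})\'))?.*?\"[\)]?;'
)

m = second.match(line)
matched = m is not None
captured = m.groups() if m else None

print(matched)   # True
print(captured)  # (None, None)
```

Note also that the second alternative looks for ${ while the syntax in the question starts with {$, so that branch would not match '{$foo['bar']}' even if it were reached.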