AkaSh AkaSh - 7 months ago 30
Python Question

Extract variables using python regex

The input file contains the following lines:

a=b*c;
d=a+2;
c=0;
b=a;


Now, for each line, I want to extract the variables that are used. For example, for line 1, the output should be [a,b,c]. Currently I am doing it as follows:

var = ['a', 'b', 'c', 'd']  # list of variables
for line in file_ptr:
    if '=' in line:
        temp = line.split('=')
        ans = list(temp[0])
        if '+' in temp[1]:
            pass  # do something
        elif '*' in temp[1]:
            pass  # do something
        else:
            pass  # single variable as in line 4 OR constant as in line 3


Is it possible to do this using regex?

EDIT:

Expected output for the above file:

[a,b,c]
[d,a]
[c]
[a,b]

Answer

I would use re.findall() with whatever pattern matches variable names in the example's programming language. Assuming a typical language, this might work for you:

import re

lines = '''a=b*c;
d=a+2;
c=0;
b=a;'''

for line in lines.splitlines():
    print(re.findall(r'[_a-z][_a-z0-9]*', line, re.I))
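
Note that re.findall() returns names in order of appearance and repeats a name each time it occurs, so the last line gives ['b', 'a'] rather than [a,b]. If you want each variable only once per line, a minimal sketch along these lines might help. The helper name extract_variables and the extra line 'a=a+a1;' are my own illustrations, not part of the question, and the pattern assumes identifiers start with a letter or underscore:

```python
import re

# Assumed identifier syntax: a letter or underscore, followed by
# letters, digits, or underscores. Adjust for your actual language.
IDENTIFIER = re.compile(r'[A-Za-z_][A-Za-z0-9_]*')


def extract_variables(line):
    """Return the distinct identifiers in `line`, in order of first appearance."""
    seen = []
    for name in IDENTIFIER.findall(line):
        if name not in seen:  # skip names already collected for this line
            seen.append(name)
    return seen


# The four lines from the question, plus one hypothetical line with a repeat.
lines = ['a=b*c;', 'd=a+2;', 'c=0;', 'b=a;', 'a=a+a1;']
for line in lines:
    print(extract_variables(line))
```

Constants such as 0 and 2 are skipped because the pattern cannot start on a digit. If the input language has keywords or function names, you would need to filter those out separately, since a regex alone cannot tell them apart from variables.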