technazi - 1 year ago 94
Python Question

Python regex iteration for all combinations

I am new to regex. I am using Python 2.7 and BeautifulSoup4. I want to iterate over a particular regular expression.

Required ouput :

length : 5 , expression : [a-zA-Z0-9!&#%@]

It should try all possible combinations e.g:

['aaaaa','aaaab','aaaac',...,'aaaaz','aaaaA',...,'aaaaZ','aaaa0','aaaa9','aaaa!','AAA!!']

Moreover this should be possible too. If the expression is orange\d{1}

['orangea','oranges']]

I tried this:

`````` regexInput = "a-z0-9"
#regexInput = "[email protected]#\$%^&"
comb = itertools.permutations(regexInput,passLength)
for x in comb:
''.join(x)
``````

Itertools functions for permutations or combinaisons takes a series of elements as first parameter. It cannot generate the serie for you (from `a-z` to `abc...xyz`). Fortunatly `string` offer some constants like `ascii_letters` that contain `a-zA-Z`.

If your goal is to interprete the regex and generate every case, ... It's pretty hard and you should explain the why? before we go further.

If you just want to get combinaisons for alphabetical letters:

``````import string
from itertools import combinations_with_replacement

result = combinations_with_replacement(string.ascii_letters, 5)

#comb = [''.join(n) for n in result] # warning, heavy processing

print [''.join(result.next()) for _ in range(10)]
# > ['aaaaa', 'aaaab', 'aaaac', 'aaaad', 'aaaae', 'aaaaf', 'aaaag', 'aaaah', 'aaaai', 'aaaaj']
``````

You can replace `string.ascii_letters` with any serie of characters.

Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download