Nicolas NOEL - 4 months ago 25

Python Question

I'm looking for a python library for finding the longest common substring from a set of python strings.

I'have read that it exist to way to solve this problem :

- one using suffix trees

- the other using dynamic programming.

The method implemented is not important. Otherwise, it is important to have a implementation that can be use for a set of strings and not only two strings

Answer

These paired functions will find the longest common string in any arbitrary array of strings:

```
def long_substr(data):
substr = ''
if len(data) > 1 and len(data[0]) > 0:
for i in range(len(data[0])):
for j in range(len(data[0])-i+1):
if j > len(substr) and is_substr(data[0][i:i+j], data):
substr = data[0][i:i+j]
return substr
def is_substr(find, data):
if len(data) < 1 and len(find) < 1:
return False
for i in range(len(data)):
if find not in data[i]:
return False
return True
print long_substr(['Oh, hello, my friend.',
'I prefer Jelly Belly beans.',
'When hell freezes over!'])
```

No doubt the algorithm could be improved and I've not had a lot of exposure to Python, so maybe it could be more efficient syntactically as well, but it should do the job.

**EDIT:** in-lined the second is_substr function as demonstrated by J.F. Sebastian. Usage remains the same. Note: no change to algorithm.

```
def long_substr(data):
substr = ''
if len(data) > 1 and len(data[0]) > 0:
for i in range(len(data[0])):
for j in range(len(data[0])-i+1):
if j > len(substr) and all(data[0][i:i+j] in x for x in data):
substr = data[0][i:i+j]
return substr
```

Hope this helps,

Jason.

Source (Stackoverflow)

Comments