biogeek - 1 year ago

Python Question

My sample data looks like:

`list1 = ['AAAABBBBCCCC','DDDDEEEEFFFF','GGGGHHHHIIII','JJJJKKKKLLLL']`

I want to make a list `list1b` in which each element is split into groups of four characters:

`list1b = [['AAAA','BBBB','CCCC'],['DDDD','EEEE','FFFF'],['GGGG','HHHH','IIII'],['JJJJ','KKKK','LLLL']]`

I tried to write generalisable code for elements of any length:

`list1a =[]`

list1b =[]

for sublist in list1:

n = 4

quad = [input[i:i+n] for i in range(0, len(sublist[0]), n)]

list1a.append(quadruplets)

quad =[] #Setting it back to empty list

list1b.append(list1a)

print list1b

#Error Message:

quad = [input[i:i+n] for i in range(0, len(sublist[0]), n)]

TypeError: 'builtin_function_or_method' object has no attribute '__getitem__'

Can anyone please recognize where I may be going wrong and how I can correct it? Is there a simpler way of doing the same?

Answer Source

If you want to group by the same character, you can use groupby to do this:

```
>>> from itertools import groupby
>>> list1 = ['AAAABBBBCCCC','DDDDEEEEFFFF','GGGGHHHHIIII','JJJJKKKKLLLL']
>>> [[''.join(g) for k,g in groupby(sl)] for sl in list1]
[['AAAA', 'BBBB', 'CCCC'], ['DDDD', 'EEEE', 'FFFF'], ['GGGG', 'HHHH', 'IIII'], ['JJJJ', 'KKKK', 'LLLL']]
```
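One caveat worth keeping in mind: `groupby` splits on runs of consecutive identical characters, not on a fixed width, so it only matches the desired output when every group of four happens to be a single repeated character. A small sketch of where the two diverge, using a made-up input string `sample` and a hypothetical helper name `split_by_run`:

```python
from itertools import groupby

def split_by_run(s):
    """Split s into runs of consecutive identical characters."""
    return [''.join(g) for _, g in groupby(s)]

sample = 'AAAAAAAABBBB'  # made-up input: eight A's followed by four B's
runs = split_by_run(sample)
print(runs)  # ['AAAAAAAA', 'BBBB'] -- one run of eight, not two groups of four
```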

If your partitioning is by length rather than by character, you can do:

```
>>> n=4
>>> [[s[i:i+n] for i in range(0, len(s), n)] for s in list1]
[['AAAA', 'BBBB', 'CCCC'], ['DDDD', 'EEEE', 'FFFF'], ['GGGG', 'HHHH', 'IIII'], ['JJJJ', 'KKKK', 'LLLL']]
```
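As for the error in the question: `input` is Python's built-in function, not the string being processed, so slicing it with `input[i:i+n]` raises the `TypeError`. A few further issues would surface once that is fixed: `len(sublist[0])` is the length of a single character (always 1), `quadruplets` is never defined, and appending `list1a` to `list1b` adds an extra level of nesting. A minimal corrected version of the loop, keeping the question's variable names where possible (the function name `split_fixed` is only an illustrative choice):

```python
def split_fixed(s, n):
    """Split s into consecutive chunks of length n (the last chunk may be shorter)."""
    return [s[i:i + n] for i in range(0, len(s), n)]

list1 = ['AAAABBBBCCCC', 'DDDDEEEEFFFF', 'GGGGHHHHIIII', 'JJJJKKKKLLLL']
n = 4

list1b = []
for sublist in list1:
    # Slice the current string itself, not the built-in `input`
    quad = split_fixed(sublist, n)
    list1b.append(quad)

print(list1b)
```

There is no need to reset `quad` to an empty list, since it is rebound to a fresh list on every iteration.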