Biotechgeek - 2 months ago
Python Question

How to create tuples from a single list with alphanumeric characters?

I have the following list with 2 elements:

['AGCTT 6 6 35 25 10', 'AGGGT 7 7 28 29 2']


I need to make a list or zip such that each letter corresponds to the number at the same position later in the string. For example, for list[0] the list/zip should read

{"A":"6", "G":"6", "C":"35","T":"25","T":"10"}


Can I make a list of such lists/zips that stores the corresponding values for list[0], list[1], ... list[n]?

Note: The letters can only be A, G, C or T, and the numbers can take any value.

Edit 1: Previously, I thought I could use a dictionary, but several members pointed out that this cannot be done. So I just want to make a list or zip, or anything else recommended, that pairs each letter with its corresponding number.

Answer

Use tuples: split each string once to separate the letters from the numbers, then split the numbers and zip the two together:

l = ['AGCTT 6 6 35 25 10', 'AGGGT 7 7 28 29 2']

# list() is needed on Python 3, where zip returns an iterator
pairs = [list(zip(a, b.split())) for a, b in (sub.split(None, 1) for sub in l)]

Which would give you:

[[('A', '6'), ('G', '6'), ('C', '35'), ('T', '25'), ('T', '10')], [('A', '7'), ('G', '7'), ('G', '28'), ('G', '29'), ('T', '2')]]
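The result is a list of tuples rather than a dict because a dict keeps only one value per key, and the letters repeat. A minimal sketch of the difference, using the first string from the question (the variable names are only illustrative):

```python
letters = 'AGCTT'
numbers = '6 6 35 25 10'.split()

# A dict keeps one entry per key, so the second 'T' overwrites the first.
as_dict = dict(zip(letters, numbers))

# A list of tuples keeps every position, duplicates included.
as_pairs = list(zip(letters, numbers))

print(len(as_dict))   # 4
print(len(as_pairs))  # 5
```

Here the value for 'T' ends up as '10', and the '25' is lost.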

Or using a for loop with list.append:

l = ['AGCTT 6 6 35 25 10', 'AGGGT 7 7 28 29 2']
out = []
for a, b in (sub.split(None, 1) for sub in l):
    # b is still one string here, so split it into the individual numbers
    out.append(list(zip(a, b.split())))
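One thing to watch: zip stops at the shorter input, so a string with a missing number would silently lose a letter. A minimal sketch that checks the counts and also converts the numbers to int, assuming every string has the same "letters, then numbers" layout as in the question (the helper name pair_letters is made up for this example):

```python
def pair_letters(record):
    """Pair each letter with the integer at the same position in the record."""
    letters, numbers = record.split(None, 1)    # 'AGCTT', '6 6 35 25 10'
    values = [int(n) for n in numbers.split()]  # [6, 6, 35, 25, 10]
    if len(letters) != len(values):
        # zip() would silently drop the unmatched tail, so fail loudly instead
        raise ValueError("letter/number count mismatch in %r" % record)
    return list(zip(letters, values))


l = ['AGCTT 6 6 35 25 10', 'AGGGT 7 7 28 29 2']
pairs = [pair_letters(sub) for sub in l]
print(pairs[0])
# [('A', 6), ('G', 6), ('C', 35), ('T', 25), ('T', 10)]
```

Skip the int conversion if you want to keep the numbers as strings, as in the rest of this answer.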

If you want to replace a letter with Z wherever its number is < 10, you just need an inner loop that checks the number in each pair:

pairs = [[("Z", i ) if int(i) < 10 else (c, i) for c,i in zip(a, b.split())] 
         for a,b in (sub.split(None, 1) for sub in l)]
print(pairs)

Which would give you:

[[('Z', '6'), ('Z', '6'), ('C', '35'), ('T', '25'), ('T', '10')], [('Z', '7'), ('Z', '7'), ('G', '28'), ('G', '29'), ('Z', '2')]]

To break it into a regular loop:

pairs = []
for a, b in (sub.split(None, 1) for sub in l):
    pairs.append([("Z", i) if int(i) < 10 else (c, i) for c, i in zip(a, b.split())])
print(pairs)

[("Z", i) if int(i) < 10 else (c, i) for c, i in zip(a, b.split())] sets the letter to Z if the corresponding number i is < 10; otherwise the letter is left as is.

If you want to get back from the pairs to the letter and number strings afterwards, you just need to transpose with zip:

In [13]: l = ['AGCTT 6 6 35 25 10', 'AGGGT 7 7 28 29 2']

In [14]: pairs = [[("Z", i) if int(i) < 10 else (c, i) for c, i in zip(a, b.split())] for a, b in
   ....:          (sub.split(None, 1) for sub in l)]

In [15]: pairs
Out[15]: 
[[('Z', '6'), ('Z', '6'), ('C', '35'), ('T', '25'), ('T', '10')],
 [('Z', '7'), ('Z', '7'), ('G', '28'), ('G', '29'), ('Z', '2')]]

In [16]: unzipped = [["".join(a), " ".join(b)] for a, b in (zip(*tup) for tup in pairs)]

In [17]: unzipped
Out[17]: [['ZZCTT', '6 6 35 25 10'], ['ZZGGZ', '7 7 28 29 2']]

zip(*tup) transposes each list of pairs into one tuple of letters and one tuple of numbers; we then just need to join the strings back together. If you want to get all the way back to the original single-string format, you can just join again:

In [18]: [" ".join(["".join(a), " ".join(b)]) for a, b in (zip(*tup) for tup in pairs)]
Out[18]: ['ZZCTT 6 6 35 25 10', 'ZZGGZ 7 7 28 29 2']
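To see what the transpose step produces on its own, here is a small sketch that unpacks zip(*...) for the first list of pairs before joining. The pairs are copied from the output above, and the variable names are only illustrative:

```python
pairs = [[('Z', '6'), ('Z', '6'), ('C', '35'), ('T', '25'), ('T', '10')],
         [('Z', '7'), ('Z', '7'), ('G', '28'), ('G', '29'), ('Z', '2')]]

# zip(*rows) turns a list of (letter, number) rows into two columns.
letters, numbers = zip(*pairs[0])
print(letters)  # ('Z', 'Z', 'C', 'T', 'T')
print(numbers)  # ('6', '6', '35', '25', '10')

# Join each column, then join the two columns with a space between them.
restored = ["".join(ls) + " " + " ".join(ns)
            for ls, ns in (zip(*tup) for tup in pairs)]
print(restored)
# ['ZZCTT 6 6 35 25 10', 'ZZGGZ 7 7 28 29 2']
```

Note that the two-name unpacking fails on an empty list of pairs, since zip(*[]) yields nothing to unpack.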