Mayjunejuly - 1 month ago
Python Question

Python: how to split an input DNA sequence string into a list of nucleotide triplets as single-element tuples

def s_seq(dna_seq):
    '''
    parses an input sequence in string format to a list of nucleotide triplets/codons as single-valued tuples
    '''
    codons = []

    # arrange codons as list of single element tuples
    if len(dna_seq) % 3 == 0:
        for i in range(0, len(dna_seq), 3):
            codons = dna_seq[i:i + 3]

    return codons

dna_seq01 = 'ATATTAAAGAATAATTTTATAAAAATATGT'
codons01 = s_seq(dna_seq01)


It keeps returning only the last codon ('TGT'), but what I want is the whole sequence split into triplets: 'ATA', 'TTA' and so on. I don't know what I am doing wrong here.

Answer Source

You just need to append each codon, wrapped in a single-element tuple, to the list you created above:

codons = []
if len(dna_seq) % 3 == 0:
    for i in range(0, len(dna_seq), 3):
        codons.append((dna_seq[i:i + 3],))

For dna_seq01 this gives:

[('ATA',), ('TTA',), ('AAG',), ('AAT',), ('AAT',), ('TTT',), ('ATA',), ('AAA',), ('ATA',), ('TGT',)]

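With codons = dna_seq[i:i + 3] you rebind codons to a new string on every loop iteration, so only the last triplet survives. The trailing comma in (dna_seq[i:i + 3],) is what makes each element a single-element tuple.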
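For completeness, here is a minimal sketch of the whole function with that fix applied. It keeps the name s_seq and the length check from the question, so a sequence whose length is not a multiple of 3 still returns an empty list. The second function, s_seq_comprehension, is a hypothetical name of mine, not something from the question; it is the same logic written as a list comprehension.

```python
def s_seq(dna_seq):
    """Split a DNA string into codons, each wrapped in a one-element tuple."""
    codons = []
    # As in the question, only split when the length is a multiple of 3;
    # otherwise the empty list is returned unchanged.
    if len(dna_seq) % 3 == 0:
        for i in range(0, len(dna_seq), 3):
            # append() keeps the earlier codons; the trailing comma makes a tuple.
            codons.append((dna_seq[i:i + 3],))
    return codons


def s_seq_comprehension(dna_seq):
    """Same behavior as s_seq, written as a list comprehension (hypothetical variant)."""
    if len(dna_seq) % 3 != 0:
        return []
    return [(dna_seq[i:i + 3],) for i in range(0, len(dna_seq), 3)]


dna_seq01 = 'ATATTAAAGAATAATTTTATAAAAATATGT'
codons01 = s_seq(dna_seq01)
print(codons01)
```

If you would rather be told about a sequence with leftover nucleotides than silently get an empty list, you could raise a ValueError in that branch instead, but that changes the behavior of the original function.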