pythonbeginner2506 - 5 months ago 10x
Python Question

# Counting the number of times a letter occurs at a certain position using python

I'm a python beginner and I've come across this problem and I'm not sure how I'd go about tackling it.

If I have the following sequence/strings:

GATCCG

GTACGC

How to I count the frequency each letter occurs at each position. ie) G occurs at position one twice in the two sequences, A occurs at position 1 zero times etc.

Any help would be appreciated, thank you!

You can use a combination of `defaultdict` and `enumerate` like so:

``````from  collections import defaultdict

sequences = ['GATCCG', 'GTACGC']
d = defaultdict(lambda: defaultdict(int))  # d[char][position] = count
for seq in sequences:
for i, char in enumerate(seq):  # enum('abc'): [(0,'a'),(1,'b'),(2,'c')]
d[char][i] += 1

d['C'][3]  # 2
d['C'][4]  # 1
d['C'][5]  # 1
``````

This builds a nested `defaultdict` that takes the character as first and the position as second key and provides the count of occurrences of said character in said position.

If you want lists of position-counts:

``````max_len = max(map(len, sequences))
d = defaultdict(lambda: [0]*max_len)  # d[char] = [pos0, pos12, ...]
for seq in sequences:
for i, char in enumerate(seq):
d[char][i] += 1

d['G']  # [2, 0, 0, 0, 1, 1]
``````