I'm a Python beginner and I've come across this problem, and I'm not sure how to go about tackling it.
If I have the following sequence/strings:
How do I count how often each letter occurs at each position? For example, G occurs at position 1 twice across the two sequences, A occurs at position 1 zero times, etc.
Any help would be appreciated, thank you!
    from collections import defaultdict

    sequences = ['GATCCG', 'GTACGC']

    # d[char][position] = count
    d = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        # enumerate('abc') yields (0, 'a'), (1, 'b'), (2, 'c')
        for i, char in enumerate(seq):
            d[char][i] += 1

    d['C'][3]  # 2
    d['C'][4]  # 1
    d['C'][5]  # 1
This builds a nested defaultdict that takes the character as the first key and the (zero-based) position as the second key, and gives the number of times that character occurs at that position.
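Since the question counts positions from 1 while `enumerate` starts at 0, a small lookup helper may make the result easier to read. The following is only a sketch: `counts` and `count_at` are names made up for illustration, and it uses `.get` so that querying a missing character or position does not insert new keys into the defaultdicts.

```python
from collections import defaultdict

sequences = ['GATCCG', 'GTACGC']

# counts[char][position] = count, with zero-based positions
counts = defaultdict(lambda: defaultdict(int))
for seq in sequences:
    for i, char in enumerate(seq):
        counts[char][i] += 1


def count_at(counts, char, position):
    """Hypothetical helper: count of `char` at a 1-based `position`."""
    # dict.get does not trigger the default factory, so nothing is inserted
    return counts.get(char, {}).get(position - 1, 0)


print(count_at(counts, 'G', 1))  # 2
print(count_at(counts, 'A', 1))  # 0
```

Indexing with `counts['A'][0]` would also return 0, but as a side effect it would add that entry to the dictionary.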
If you want lists of position-counts:
    # reuses defaultdict and sequences from the first snippet
    max_len = max(map(len, sequences))

    # d[char] = [count at pos 0, count at pos 1, count at pos 2, ...]
    d = defaultdict(lambda: [0] * max_len)
    for seq in sequences:
        for i, char in enumerate(seq):
            d[char][i] += 1

    d['G']  # [2, 0, 0, 0, 1, 1]
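If it is more convenient to look things up by position first, one alternative (a different technique from the defaultdict approach above, shown only as a sketch) is to build one `collections.Counter` per column. Here `position_counts` is a made-up name, and `itertools.zip_longest` is used on the assumption that the sequences might differ in length.

```python
from collections import Counter
from itertools import zip_longest

sequences = ['GATCCG', 'GTACGC']

# zip_longest(*sequences) yields one tuple per position (column);
# shorter sequences are padded with None, which is skipped below
position_counts = [
    Counter(ch for ch in column if ch is not None)
    for column in zip_longest(*sequences)
]

print(position_counts[0]['G'])  # 2
print(position_counts[0]['A'])  # 0, a Counter returns 0 for missing keys
print(position_counts[3])       # Counter({'C': 2})
```

The result is a list indexed by zero-based position, so `position_counts[i][char]` plays the same role as `d[char][i]` above.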