Tony Zhang - 20 days ago
Python Question

Using python for frequency analysis

I am trying to use Python to help me crack Vigenère ciphers. I am fairly new to programming, but I've managed to write an algorithm to analyse single-letter frequencies. This is what I have so far:

Ciphertext = input("What is the cipher text? ")
Letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def LetterFrequency():
    # Start every letter at zero so letters that never occur still show up
    LetterFrequency = {'A': 0, 'B': 0, 'C': 0, 'D': 0, 'E': 0, 'F': 0, 'G': 0, 'H': 0, 'I': 0, 'J': 0, 'K': 0, 'L': 0, 'M': 0, 'N': 0, 'O': 0, 'P': 0, 'Q': 0, 'R': 0, 'S': 0, 'T': 0, 'U': 0, 'V': 0, 'W': 0, 'X': 0, 'Y': 0, 'Z': 0}
    # Count letters case-insensitively; spaces and punctuation are skipped
    for letter in Ciphertext.upper():
        if letter in Letters:
            LetterFrequency[letter] += 1
    return LetterFrequency

print(LetterFrequency())


But is there a way for me to print the counts in descending order, starting from the most frequent letter? Right now they come out in an arbitrary order no matter what I do.

Also, does anyone know how to extract specific letters from a large block of text to perform frequency analysis? For instance, if I wanted to put every third letter from the text “THISISARATHERBORINGEXAMPLE” together to analyse, I would need to get:

T H I
S I S
A R A
T H E
R B O
R I N
G E X
A M P
L E


Normally I would have to do this by hand in either Notepad or Excel, which takes ages. Is there a way to get around this in Python?

Thanks in advance,

Tony

Answer

For the descending order you could use Counter from the collections module, which is displayed from most to least common:

>>> x = "this is a rather boring example"
>>> from collections import Counter
>>> Counter(x)
Counter({' ': 5, 'a': 3, 'e': 3, 'i': 3, 'r': 3, 'h': 2, 's': 2, 't': 2, 'b': 1, 'g': 1, 'm': 1, 'l': 1, 'o': 1, 'n': 1, 'p': 1, 'x': 1})
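If you want the sorted counts as data you can loop over, rather than relying on how the Counter is displayed, `most_common()` returns (character, count) pairs with the highest count first. A minimal sketch using the same example string; the variable names are only illustrative:

```python
from collections import Counter

text = "this is a rather boring example"
counts = Counter(text)

# most_common() returns (character, count) pairs, highest count first.
# Passing a number limits the result to that many entries.
top_three = counts.most_common(3)

for char, count in top_three:
    # repr() makes the space character visible in the output
    print(repr(char), count)
```

Calling `counts.most_common()` with no argument gives every character in descending order of frequency.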

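As for the second question, you could step through the text three letters at a time, for example with a slice such as text[i::3], which takes every third character starting at index i.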
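Here is one way that could look, as a rough sketch rather than the only approach. It assumes the text has already been stripped of spaces, as in the question's example, and the helper name `split_into_columns` is made up for illustration:

```python
from collections import Counter

def split_into_columns(text, key_length):
    """Return key_length strings; string i holds text[i], text[i + key_length], ..."""
    # text[i::key_length] is a slice: start at index i, step by key_length
    return [text[i::key_length] for i in range(key_length)]

columns = split_into_columns("THISISARATHERBORINGEXAMPLE", 3)

for index, column in enumerate(columns):
    # Each column can be analysed separately, most frequent letter first
    print(index, column, Counter(column).most_common())
```

Each string corresponds to one column of the three-letter rows in the question, so the first one collects the 1st, 4th, 7th, ... letters.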

To exclude spaces you can try what @not_a_robot suggests in the comments, or delete the space entry manually:

>>> y = Counter(x)
>>> del y[' ']
>>> y
Counter({'a': 3, 'e': 3, 'i': 3, 'r': 3, 'h': 2, 's': 2, 't': 2, 'b': 1, 'g': 1, 'm': 1, 'l': 1, 'o': 1, 'n': 1, 'p': 1, 'x': 1})
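Another option is to filter the text before counting, so spaces and punctuation never enter the Counter in the first place. The comment mentioned above is not shown here, so this is not necessarily what it suggested, just a sketch that assumes only the letters A-Z matter:

```python
from collections import Counter
import string

text = "this is a rather boring example"

# Keep only A-Z (after upper-casing); everything else is skipped
letters_only = Counter(
    ch for ch in text.upper() if ch in string.ascii_uppercase
)

for letter, count in letters_only.most_common():
    print(letter, count)
```

This also drops digits and punctuation, which `del y[' ']` would leave in place.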