shravster shravster - 1 year ago 136
Python Question

Python - Auto Detect Email Content Encoding

I am writing a script to process emails, and I have access to the raw string content of the emails.

I am currently searching for the string "Content-Transfer-Encoding:" and reading the characters that immediately follow it to determine the encoding. Example encodings: base64, 7bit, or quoted-printable.

Is there a better (or at least more Pythonic) way to determine the email encoding automatically?

Thank you.

Answer Source

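You can use the email package from the Python standard library. It parses the raw message and exposes its headers through dictionary-style lookups, so you do not have to scan the text yourself.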

For example:

import email

raw = """From: John Doe <>
MIME-Version: 1.0
Content-Type: text/plain
Content-Transfer-Encoding: quoted-printable

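Hi there!
"""

# Parse the raw text into a Message object; headers are looked up case-insensitively.
my_email = email.message_from_string(raw)
print(my_email["Content-Transfer-Encoding"])  # prints: quoted-printable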

Other examples are available in the documentation for the email package.
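If the goal is the decoded text and not just the header value, the same package can also undo the transfer encoding. The following is a minimal sketch, assuming Python 3 and a multipart message. The sample message, its boundary string, and the helper name describe_parts are made up for illustration and are not part of the original answer. A part with no Content-Transfer-Encoding header is treated as 7bit, which is the default defined by the MIME standard.

```python
import email

# Hypothetical sample message: the address, boundary, and bodies are invented.
raw = (
    "From: sender@example.com\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    'Content-Type: text/plain; charset="utf-8"\n'
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "Caf=C3=A9 at noon\n"
    "--XYZ\n"
    'Content-Type: text/plain; charset="utf-8"\n'
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "SGkgdGhlcmUh\n"
    "--XYZ--\n"
)


def describe_parts(raw_message):
    """Return (transfer_encoding, decoded_text) for each non-container part."""
    msg = email.message_from_string(raw_message)
    results = []
    for part in msg.walk():
        # Multipart containers only hold other parts; they have no body to decode.
        if part.is_multipart():
            continue
        # Fall back to 7bit when the header is absent.
        encoding = (part.get("Content-Transfer-Encoding") or "7bit").strip().lower()
        # decode=True reverses base64 / quoted-printable and returns bytes.
        payload = part.get_payload(decode=True)
        charset = part.get_content_charset() or "ascii"
        results.append((encoding, payload.decode(charset, errors="replace")))
    return results


parts = describe_parts(raw)
for encoding, text in parts:
    print(encoding, repr(text))
```

Calling get_payload(decode=True) lets the library choose the decoder from the header, so the script never needs its own branch per encoding.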

