How to Parse Emails in Python

June 09, 2020

I am implementing a new Mailbrew feature that lets you receive newsletters inside your brews. To get the job done, I need to parse emails received at <username> and extract their content.

You can do this with the Python standard library without external dependencies (unsurprisingly):

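import email

# raw_content is the raw message as bytes (headers and body)
msg = email.message_from_bytes(raw_content)

# For a multipart message, get_payload() returns a list of sub-messages
for part in msg.get_payload():
  content_type = part.get("Content-Type")
  # decode=True undoes the transfer encoding (base64, quoted-printable)
  content = part.get_payload(decode=True)

  print("Content-Type:", content_type)
  print("Content:", content)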

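Most emails are multipart messages that carry both a plain-text and an HTML alternative. For those, this prints the plain-text version first (get_payload(decode=True) returns bytes; the content is shown as text here for readability):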

Content-Type: text/plain; charset="UTF-8"
Content: hello world

and the HTML version after that:

Content-Type: text/html; charset="UTF-8"
Content: <p>hello world</p>
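
The loop above assumes a multipart message whose parts sit directly at the top level. Not every email looks like that: some are single-part, and some nest the text alternatives inside another container. Here is a minimal sketch of one way to handle those cases, still using only the standard library. The helper name extract_bodies and the sample message are made up for illustration and are not part of Mailbrew.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def extract_bodies(raw_content: bytes) -> dict:
    """Return the decoded text/plain and text/html bodies of a raw email.

    Hypothetical helper, written for illustration only.
    """
    # policy.default makes the parser return EmailMessage objects,
    # which provide get_content() for charset-aware decoding to str.
    msg = message_from_bytes(raw_content, policy=policy.default)

    bodies = {}
    # walk() yields the message itself and every nested part, so it also
    # covers single-part emails and nested multipart containers.
    for part in msg.walk():
        if part.is_multipart() or part.is_attachment():
            continue
        content_type = part.get_content_type()
        if content_type in ("text/plain", "text/html"):
            # Keep only the first part found for each content type.
            bodies.setdefault(content_type, part.get_content())
    return bodies


# Build a sample message to stand in for a received newsletter.
sample = EmailMessage()
sample["Subject"] = "Hello"
sample.set_content("hello world")
sample.add_alternative("<p>hello world</p>", subtype="html")

bodies = extract_bodies(sample.as_bytes())
print(bodies["text/plain"].strip())
print(bodies["text/html"].strip())
```

With the sample message, this should print the plain-text body followed by the HTML body, both as str rather than bytes. Whether to keep the first or the last part of each type is a judgment call; the sketch keeps the first.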