Donato Donato - 2 months ago 18
Ruby Question

Remove header characters from email body

I am using Net::IMAP and the Ruby mail gem. I want to get the text of the email body without the headers. Unfortunately, the body comes back with header information mixed in:

require 'net/imap'
require 'mail'

imap = Net::IMAP.new('imap.gmail.com', 993, true)
imap.login('user@gmail.com', 'password')
imap.select('INBOX')
# IMAP search dates take a four-digit year, e.g. 01-Jan-2017
ids = imap.search(['SINCE', Time.now.strftime('%d-%b-%Y')])
# Fetch the first message the search matched
raw_message = imap.fetch(ids.first, 'RFC822').first.attr['RFC822']
message = Mail.read_from_string(raw_message)

message.body
=> #<Mail::Body:0x007fa3f4354e00 @boundary="94eb2c13d952954cbc053d82a0e7", @preamble="", @epilogue="", @charset="US-ASCII", @part_sort_order=["text/plain", "text/enriched", "text/html"], @parts=[#<Mail::Part:70171076755380, Multipart: false, Headers: <Content-Type: text/plain; charset=UTF-8>>, #<Mail::Part:70171076752140, Multipart: false, Headers: <Content-Type: text/html; charset=UTF-8>>], @raw_source="--94eb2c13d952954cbc053d82a0e7\r\nContent-Type: text/plain; charset=UTF-8\r\n\r\ntest body\r\n\r\n--94eb2c13d952954cbc053d82a0e7\r\nContent-Type: text/html; charset=UTF-8\r\n\r\n<div dir=\"ltr\">test body</div>\r\n\r\n--94eb2c13d952954cbc053d82a0e7--", @encoding="7bit">
message.body.raw_source
=> "--94eb2c13d952954cbc053d82a0e7\r\nContent-Type: text/plain; charset=UTF-8\r\n\r\ntest body\r\n\r\n--94eb2c13d952954cbc053d82a0e7\r\nContent-Type: text/html; charset=UTF-8\r\n\r\n<div dir=\"ltr\">test body</div>\r\n\r\n--94eb2c13d952954cbc053d82a0e7--"


How do I get just the body text?

Answer

Those aren't mail headers. They are MIME headers inside the body of the message. You have a multipart email, a format usually used to provide both plain-text and HTML versions of the same content.
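To make that structure concrete, here is a rough sketch in plain Ruby (no gems) that splits the raw body from the question on its boundary and separates each part's MIME headers from its content. The split_parts helper is hypothetical and only copes with a simple, non-nested body like this one; for real mail, let the mail gem do the parsing.

```ruby
# Hypothetical helper, for illustration only: splits a simple multipart body
# into [headers, content] pairs. Nested or encoded parts need a real MIME parser.
def split_parts(raw_body, boundary)
  delimiter = "--#{boundary}"
  raw_body.split(delimiter)
          .map(&:strip)
          .reject { |chunk| chunk.empty? || chunk == '--' } # drop preamble and closing marker
          .map do |chunk|
            # A blank line separates a part's MIME headers from its content
            headers, content = chunk.split(/\r?\n\r?\n/, 2)
            [headers, content.to_s.strip]
          end
end

# The boundary and body shown in the question
boundary = '94eb2c13d952954cbc053d82a0e7'
raw_body = "--#{boundary}\r\nContent-Type: text/plain; charset=UTF-8\r\n\r\ntest body\r\n\r\n" \
           "--#{boundary}\r\nContent-Type: text/html; charset=UTF-8\r\n\r\n<div dir=\"ltr\">test body</div>\r\n\r\n" \
           "--#{boundary}--"

parts = split_parts(raw_body, boundary)
parts.each { |headers, content| puts "#{headers} => #{content}" }
```

Each pair is one part: the "Content-Type" line you were seeing belongs to the part, not to the message as a whole.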

Each part is treated as another Mail::Message (in practice a Mail::Part, which subclasses Mail::Message) to be processed separately. See "Reading a Multipart Email" in the mail gem's README:

mail = Mail.read('multipart_email')

mail.multipart?          #=> true
mail.parts.length        #=> 2
mail.body.preamble       #=> "Text before the first part"
mail.body.epilogue       #=> "Text after the last part"
mail.parts.map { |p| p.content_type }  #=> ['text/plain', 'application/pdf']
mail.parts.map { |p| p.class }         #=> [Mail::Message, Mail::Message]
mail.parts[0].content_type_parameters  #=> {'charset' => 'ISO-8859-1'}
mail.parts[1].content_type_parameters  #=> {'name' => 'my.pdf'}