The mail_processor plugin allows your Roda application to process mail using a routing tree. Quick example:
class MailProcessor < Roda
  plugin :mail_processor

  route do |r|
    # Match based on the To header, extracting the ticket_id
    r.to /ticket\+(\d+)@example.com/ do |ticket_id|
      if ticket = Ticket[ticket_id.to_i]
        # Mark the mail as handled if there is a valid ticket associated
        r.handle do
          ticket.add_note(text: mail_text, from: from)
        end
      end
    end

    # Match based on the To or CC header
    r.rcpt "post@example.com" do
      # Match based on the body, capturing the post id and tag
      r.body(/^Post: (\d+)-(\w+)/) do |post_id, tag|
        unhandled_mail("no matching post") unless post = Post[post_id.to_i]
        unhandled_mail("tag doesn't match for post") unless post.tag == tag

        # Match based on APPROVE somewhere in the mail text,
        # marking the mail as handled
        r.handle_text /\bAPPROVE\b/i do
          post.approve!(from)
        end

        # Match based on DENY somewhere in the mail text,
        # marking the mail as handled
        r.handle_text /\bDENY\b/i do
          post.deny!(from)
        end
      end
    end
  end
end
Processing Mail
To submit a mail for processing via the mail_processor routing tree, call the process_mail method with a Mail instance:
MailProcessor.process_mail(Mail.new do
  # ...
end)
You can use this to process mail messages from the filesystem:
MailProcessor.process_mail(Mail.read('/path/to/message.eml'))
If you have a service that delivers mail via an HTTP POST request (for realtime processing), you can have your web routes convert the web request into a Mail instance and then call process_mail:
r.post "email" do
  # check request is submitted by trusted sender

  # If request body is the raw mail body
  r.body.rewind
  MailProcessor.process_mail(Mail.new(r.body.read))

  # If request body is in a parameter named content
  MailProcessor.process_mail(Mail.new(r.params['content']))

  # If the HTTP request requires a specific response status code (such as 204)
  response.status = 204

  nil
end
Note that when receiving messages via HTTP, you need to make sure you check that the request is trusted. How to do this depends on the delivery service, but could involve using HTTP basic authentication, checking for valid API tokens, or checking that a message includes a signature/hash that matches the expected value.
If you have set up a default retriever_method for Mail, you can call process_mailbox, which will process all mail in the given mailbox (using Mail.find_and_delete):
MailProcessor.process_mailbox
You can also use a :retriever option to provide a specific retriever:
MailProcessor.process_mailbox(retriever: Mail::POP3.new)
Routing Mail
The mail_processor plugin handles routing similar to Roda’s default routing for web requests, but because mail processing may not return a result, the mail_processor plugin uses a more explicit approach to determine whether the message has been handled. If the r.handle method is called during routing, the mail is considered handled; otherwise, the mail is considered not handled. The unhandled_mail method can be called at any point to stop routing and consider the mail as not handled (even if inside an r.handle block).
Here are the mail routing methods and what they use for matching:
from | match on the mail From address
to | match on the mail To address
cc | match on the mail CC address
rcpt | match on the mail recipients (To and CC addresses by default)
subject | match on the mail subject
body | match on the mail body
text | match on text extracted from the message (same as mail body by default)
header | match on a mail header
All of these routing methods accept a single argument, except for r.header, which can take two arguments.
Each of these routing methods also has an r.handle_* method (e.g. r.handle_from), which will call r.handle implicitly to mark the mail as handled if the routing method matches and control is passed to the block.
The address matchers (from, to, cc, rcpt) perform a case-insensitive match if given a string or array of strings, and a regular regexp match if given a regexp.
The content matchers (subject, body, text) perform a case-sensitive substring search if given a string or array of strings, and a regular regexp match if given a regexp.
The header matcher should be called with a key and an optional value. If the matcher is called with a key and not a value, it matches if a header matching the key is present in the message, yielding the header value. If the matcher is called with a key and a value, it matches if a header matching the key is present and the header value matches the value given, using the same criteria as the content matchers.
In all cases for matchers, if a string is given and matches, the match block is called without arguments. If an array of strings is given, and one of the strings matches, the match block is called with the matching string as its argument. If a regexp is given, the match block is called with the regexp captures. This is the same behavior as Roda’s general string, array, and regexp matchers.
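The matching rules above can be illustrated with a small plain-Ruby sketch. It is not the plugin's implementation: the sketch_match helper and its address: keyword are hypothetical names used only for illustration, and it compares against a single value rather than a list of addresses. It returns the arguments that would be passed to the match block, or nil when nothing matches.

```ruby
# Hypothetical helper (not part of Roda): returns the arguments the match
# block would receive, or nil when there is no match.
def sketch_match(matcher, value, address: false)
  # Address matchers compare whole strings case-insensitively;
  # content matchers do a case-sensitive substring search.
  string_hit = lambda do |s|
    address ? value.casecmp?(s) : value.include?(s)
  end

  case matcher
  when String
    # A matching string yields no block arguments.
    string_hit.call(matcher) ? [] : nil
  when Array
    # A matching array yields the string that matched.
    found = matcher.find(&string_hit)
    found ? [found] : nil
  when Regexp
    # A matching regexp yields its captures.
    md = matcher.match(value)
    md ? md.captures : nil
  end
end

sketch_match('POST@example.com', 'post@example.com', address: true) # => []
sketch_match('Approve', 'please APPROVE')                           # => nil
sketch_match(%w[APPROVE DENY], 'I DENY this')                       # => ["DENY"]
sketch_match(/^Post: (\d+)-(\w+)/, 'Post: 12-ruby')                 # => ["12", "ruby"]
```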
Recipient-Specific Routing
To allow splitting up the mail processor routing tree based on recipients, you can use the rcpt class method, which takes any number of string or regexp arguments for recipient addresses, and a block to handle the routing for those addresses instead of using the default routing.
MailProcessor.rcpt('a@example.com') do |r|
  r.text /Post: (\d+)-(\h+)/ do |post_id, hmac|
    unhandled_mail("no matching Post") unless post = Post[post_id.to_i]
    unhandled_mail("HMAC doesn't match for post") unless hmac == post.hmac_for_address(from.first)

    r.handle_text 'APPROVE' do
      post.approved_by(from)
    end

    r.handle_text 'DENY' do
      post.denied_by(from)
    end
  end
end
The rcpt class method does not mark the messages as handled, because in most cases you will need to do additional matching to extract the information necessary to handle the mail. You will need to call r.handle or a similar method inside the block to mark the mail as handled.
Matching on strings provided to the rcpt class method is an O(1) operation, as the strings are stored lowercase in a hash. Matching on regexps provided to the rcpt class method is an O(n) operation on the number of regexps.
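The difference in lookup cost can be sketched with a small registry that keeps string recipients in a hash and regexp recipients in an array. This is an illustration under assumptions, not Roda's code: the RcptRegistrySketch class, its method names, and the choice to check strings before regexps are all hypothetical.

```ruby
# Hypothetical registry (not Roda's implementation) showing why string
# recipients cost O(1) to match and regexp recipients cost O(n).
class RcptRegistrySketch
  def initialize
    @by_string = {} # lowercased address => handler
    @by_regexp = [] # [[regexp, handler], ...] checked in order
  end

  # Registers one handler block for any number of strings or regexps.
  def rcpt(*addresses, &handler)
    addresses.each do |address|
      case address
      when String then @by_string[address.downcase] = handler
      when Regexp then @by_regexp << [address, handler]
      else raise ArgumentError, "unsupported address: #{address.inspect}"
      end
    end
  end

  # Returns the handler registered for the recipient, or nil.
  def handler_for(recipient)
    # Single hash lookup, regardless of how many strings are registered.
    exact = @by_string[recipient.downcase]
    return exact if exact

    # Linear scan over the registered regexps.
    pair = @by_regexp.find { |regexp, _| regexp.match?(recipient) }
    pair && pair[1]
  end
end
```

With many fixed recipient addresses, registering them as strings keeps dispatch cheap. Regexps are better reserved for addresses that carry variable parts.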
If you would like to break up the routing tree using something other than the recipient address, you can use the multi_route plugin.
Hooks
The mail_processor plugin offers hooks for processing mail.
For mail that is handled successfully, you can use the handled_mail hook:
MailProcessor.handled_mail do
  # nothing by default
end
For mail that is not handled successfully, either because r.handle was not called during routing or because the unhandled_mail method was called explicitly, you can use the unhandled_mail hook.
The default is to reraise the UnhandledMail exception that was raised during routing, so that calling code will not be able to ignore errors when processing mail. However, you may want to save such mails to a special location or forward them as attachments for manual review, and the unhandled_mail hook allows you to do that:
MailProcessor.unhandled_mail do
  # raise by default

  # Forward the mail as an attachment to an admin
  m = Mail.new
  m.to 'admin@example.com'
  m.subject '[APP] Unhandled Received Email'
  m.add_file(filename: 'message.eml', content: mail.encoded)
  m.deliver
end
Finally, for all processed mail, regardless of whether it was handled or not, there is an after_mail hook, which can be used to archive all processed mail:
MailProcessor.after_mail do
  # nothing by default

  # Add it to a received_mail table using Sequel
  DB[:received_mail].insert(:message=>mail.encoded)
end
The after_mail hook is called after the handled_mail or unhandled_mail hook is called, even if routing, the handled_mail hook, or the unhandled_mail hook raises an exception. The handled_mail and unhandled_mail hooks are not called if an exception is raised during routing (other than for UnhandledMail exceptions).
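The hook ordering described above maps naturally onto Ruby's rescue/else/ensure. The following sketch records which hooks would fire for a given routing outcome. It is a simplified model, not the plugin's code: SketchUnhandledMail and sketch_process are hypothetical names, and the hooks here only append to a log instead of raising or archiving.

```ruby
# Hypothetical stand-in for the plugin's UnhandledMail exception.
class SketchUnhandledMail < StandardError; end

# Runs the given routing block and records which hooks would fire.
def sketch_process(log)
  yield
rescue SketchUnhandledMail
  log << :unhandled_mail # routing reported the mail as not handled
else
  log << :handled_mail   # routing finished without raising
ensure
  log << :after_mail     # always runs, even if routing or a hook raised
end

handled = []
sketch_process(handled) { :ok }
handled # => [:handled_mail, :after_mail]

unhandled = []
sketch_process(unhandled) { raise SketchUnhandledMail }
unhandled # => [:unhandled_mail, :after_mail]
```

Any other exception raised inside the block skips both the handled and unhandled branches, still reaches the ensure branch, and then propagates to the caller.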
Extracting Text from Mail
The most common use of the mail_processor plugin is to handle replies to mails sent out by the application, so that recipients can reply to mail to make changes without having to access the application directly. When handling replies, it is common to want to extract only the text of the reply, and ignore the text of the message that was replied to. Because there is no consistent way to format replies in mail, various approaches have evolved to do this, with some gems devoted to extracting the reply text from a message.
The mail_processor plugin does not choose any particular approach for extracting text from mail, but it includes the ability to configure how to do that via the mail_text class method. This method affects the r.text match method, as well as the mail_text instance method. By default, the decoded body of the mail is used as the mail text.
MailProcessor.mail_text do
  # mail.body.decoded by default

  # https://github.com/github/email_reply_parser
  EmailReplyParser.parse_reply(mail.body.decoded)

  # https://github.com/fiedl/extended_email_reply_parser
  mail.parse
end
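To give a sense of what such extraction involves, here is a deliberately naive plain-Ruby sketch that keeps the text before the first quoted line or "On ... wrote:" attribution line. The naive_reply_text helper is hypothetical, and real messages vary far more than this handles (other languages, top-posted signatures, HTML parts), which is why the gems mentioned above exist.

```ruby
# Hypothetical, deliberately naive reply extractor: keeps the text up to
# the first quoted line or "On ... wrote:" attribution line.
def naive_reply_text(body)
  kept = []
  body.each_line do |line|
    break if line.start_with?('>')                  # quoted original message
    break if line.match?(/\AOn .+ wrote:\s*\z/)      # common attribution line
    kept << line
  end
  kept.join.strip
end

reply = "APPROVE\n\nOn Mon, Jan 1, 2024, app@example.com wrote:\n> Post: 12-ruby\n> Please reply.\n"
naive_reply_text(reply) # => "APPROVE"
```

A helper like this could be called on the decoded body inside the mail_text block, in place of the gem calls shown above.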
Security
Note that due to the way mail delivery works via SMTP, the actual sender and recipient of the mail (the SMTP envelope MAIL FROM and RCPT TO addresses) may not match the sender and receiver embedded in the message. Because mail_processor routing relies on parsing the mail, it does not have access to the actual sender and recipient used at the SMTP level, unless a mail server adds that information as a header to the mail (and clears any existing header to prevent spoofing). Keep that in mind when you are setting up your mail routes. If you have set up your mail server to add the SMTP RCPT TO information to a header, you may want to only consider that header when looking for the recipients of the message, instead of looking at the To and CC headers. You can override the default behavior for determining the recipients (this will affect the rcpt class method, r.rcpt match method, and mail_recipients instance method):
MailProcessor.mail_recipients do
  # Assuming the information is in the X-SMTP-To header
  Array(header['X-SMTP-To'].decoded)
end
Also note that unlike when handling web requests where you can rely on storing authentication information in the session, when processing mail, you should manually authenticate each message, as email is trivially forged. One way to do this is assigning and storing a unique identifier when sending each message, and checking for a matching identifier when receiving a response. Another option is including a computable authentication code (e.g. HMAC) in the message, and then when receiving a response, recomputing the authentication code and seeing if it matches the authentication code in the message. The unique identifier approach requires storing a large number of identifiers, but allows you to remove the identifier after a reply is received (to ensure only one response is handled). The authentication code approach does not require additional storage, but does not allow you to ensure only a single response is handled.
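The authentication code approach can be sketched with Ruby's standard OpenSSL library. The helper names (reply_hmac, valid_reply_hmac?), the REPLY_SECRET constant, and the choice to bind the code to the post id and lowercased address are assumptions for illustration. The application decides what data the code should cover.

```ruby
require 'openssl'

# Hypothetical secret; in practice, load it from configuration.
REPLY_SECRET = 'replace-with-a-long-random-secret'

# Hypothetical helper: computes a hex HMAC binding a post id to an address.
# The result can be embedded in outgoing mail, e.g. "Post: 12-<code>".
def reply_hmac(post_id, address, secret = REPLY_SECRET)
  digest = OpenSSL::Digest.new('SHA256')
  OpenSSL::HMAC.hexdigest(digest, secret, "#{post_id}:#{address.downcase}")
end

# Recomputes the code for a received reply and compares it with the
# supplied one without stopping at the first differing byte.
def valid_reply_hmac?(post_id, address, supplied)
  expected = reply_hmac(post_id, address)
  return false unless supplied.is_a?(String) && supplied.bytesize == expected.bytesize

  diff = 0
  expected.bytes.zip(supplied.bytes) { |x, y| diff |= x ^ y }
  diff.zero?
end
```

Because the output is hexadecimal, it can be captured by a pattern such as (\h+). As noted above, this approach alone cannot prevent the same reply from being processed twice.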
Avoiding Mail Loops
If processing the mail results in sending out additional mail, be careful not to send a response to the sender of the email. Otherwise, if the sender has an auto-responder, you can end up with a mail loop, where every mail you send results in a response, which you then process and respond to in turn.
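If a response to the sender cannot be avoided, one common precaution is to skip messages that look automated. The sketch below checks the Auto-Submitted header defined in RFC 3834 and the conventional (non-standard) Precedence header. The automated_message? helper is hypothetical and takes headers as a plain Hash. A check like this reduces the risk of loops but does not eliminate it, since auto-responders are not required to set these headers.

```ruby
# Hypothetical guard: returns true if the headers suggest the message was
# sent by an auto-responder or mailing-list software, in which case
# replying risks a mail loop.
def automated_message?(headers)
  # Header names are case-insensitive, so normalize names and values.
  normalized = headers.each_with_object({}) do |(name, value), h|
    h[name.to_s.downcase] = value.to_s.strip.downcase
  end

  # RFC 3834: any Auto-Submitted value other than "no" marks automation.
  auto_submitted = normalized['auto-submitted']
  return true if auto_submitted && auto_submitted != 'no'

  # Conventional values used by bulk senders and mailing lists.
  %w[bulk junk list].include?(normalized['precedence'])
end
```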