Email processing

Revision as of 12:51, 29 September 2010 by chris collman (talk | contribs) (Features: expand MTA and HMAC into words :))

Note: You are currently viewing documentation for Moodle 1.9. Up-to-date documentation for the latest stable version is available here: Email processing.

Features

Moodle sends email over SMTP and uses a technique known as Variable Envelope Return Path (VERP) to process bounces and replies.

  • Works with most modern Mail Transfer Agents (MTAs), at least those on Unix systems.
  • All processing of bounces and replies is secured with a Hash-based Message Authentication Code (HMAC), here HMAC-MD5-8: an MD5-based code truncated to 8 bytes (16 hexadecimal characters).
  • Bounces are handled correctly and increase a "bad email" score for the user.
  • The noreply@host address is now placed in the Reply-To field, which avoids accidentally polluting users' address books.
  • noreply@host has an autoresponder.
  • Makes it easy for modules to send emails with a signed VERP reply-to.
  • Handles incoming VERP replies: validates the HMAC-MD5-8 signature and dispatches the encoded request data to the relevant module.

Moodle configuration

PHP default email settings

Unix and some other platforms may use PHP's default mail program. That program may use the settings in config.php to build the From line of all outgoing mail. Spam filters tend to distrust mail that appears to come from a bare IP address and may send it to the spam folder. Change $CFG->wwwroot to use the registered DNS name. For example:

$CFG->wwwroot = 'http://192.168.0.1/Moodle';

should be changed to:

$CFG->wwwroot = 'http://www.mydomain.org/Moodle';

PHP 5 handler passthrough

Some servers support PHP 4 as well as PHP 5. To tell such a server which version to use for your installation, you have to specify it in an .htaccess file. The lines for PHP 5 read:

AddType x-mapp-php5 .php

AddHandler x-mapp-php5 .php

If you place this file in your topmost Moodle folder, the server may not pass the AddType and AddHandler directives through to subdirectories. In that case phpmailer will not work, and you have to put the same .htaccess file into the lib/phpmailer folder as well.


Enable bounce handling

Edit config.php to enable bounce handling and set up Moodle to match your MTA configuration. To do so, uncomment these lines in config.php (if you cannot find them, copy them from config-dist.php):

 // once handlebounces is true, we will be using VERP for the return address of every sent email
 $CFG->handlebounces = true;
 // minimum bounces allowed per user
 $CFG->minbounces = 10;
 // ratio of bad emails to sent emails
 // if we get more than 20% bounces 
 // for a given user, his/her email is marked bad
 $CFG->bounceratio = .20;
 // Prefix to identify your site MUST BE EXACTLY 3 characters
 $CFG->mailprefix = 'mdl';
 // Domain which accepts email for processing
 $CFG->maildomain = 'bounces.my.domain';

Edit the $CFG->maildomain line and the $CFG->mailprefix line. If your config-dist.php offers more than one $CFG->mailprefix line, uncomment only the one that matches your MTA.
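How the two thresholds might combine can be sketched as follows. This is only an illustration: the function name sketch_over_bounce_threshold() and its plain-integer arguments are hypothetical, and the exact comparison Moodle performs may differ. It assumes an address is treated as bad only once the minimum bounce count is reached and more than the configured ratio of sent mail has bounced.

```php
<?php
// Hypothetical helper, not part of Moodle: decide whether an address looks bad.
// $minbounces and $bounceratio mirror $CFG->minbounces and $CFG->bounceratio.
function sketch_over_bounce_threshold(int $sent, int $bounced,
                                      int $minbounces = 10, float $bounceratio = 0.20): bool {
    if ($sent <= 0 || $bounced < $minbounces) {
        // Too little data to judge the address.
        return false;
    }
    // "More than 20% bounces" with the default ratio.
    return ($bounced / $sent) > $bounceratio;
}

var_dump(sketch_over_bounce_threshold(100, 9));   // below minbounces
var_dump(sketch_over_bounce_threshold(100, 25));  // enough bounces and above the ratio
```

The minimum count keeps a single early bounce (1 bounce out of 1 email, a 100% ratio) from marking a new user's address as bad.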

Make sure your server has a command-line PHP interpreter and that it can connect to MySQL (or PostgreSQL, if relevant). If you can run cron.php from the command line or from crontab, PHP is set up correctly.

Edit the process_email.php script to point to the location of your php binary. It will usually be /usr/bin/php.

Email confirmation and registration text, and how to edit it

  • Students receive a confirmation email when they create a user account. This text can be found in the lang/?/moodle.php file as the emailconfirmation "variable".
  • Students receive a welcome email when they enroll in a course. This text can be found in the lang/?/moodle.php file as the welcometocoursetext "variable".
  • If you feel the need to edit this text, the best way is to go to Site Administration -> Language editing and click on the string that you wish to edit. Once you have done this, hit the "Switch lang directory" button and do your editing.
  • If you decide to get your hands dirty and edit moodle.php directly, be careful to open the file only in a simple text editor such as Notepad or WordPad.
  • Tip: Be careful with markup. Hyperlinks, titles and targets use double quotes (" "), and you MUST 'escape' each one by putting a \ before it. If you don't, the resulting page will appear blank. For example:
    • The normal way to mark up a hyperlink is <a href="http://www.blah_blah">click here</a>. If you do this in the moodle.php file, your edited text won't appear.
    • Use <a href=\"http://www.blah_blah\">click here</a> instead.
  • Use the \ before each " when marking up things like target=\"_blank\" and title=\"Blah blah\" and other similar attributes.
  • If you really must edit the moodle.php file directly, first save a copy of the file (copy and paste it in the same folder). That way, if it all goes wrong, all you have to do is delete the botched file and rename the 'copy of moodle.php' back to 'moodle.php', and you are back to square one, no harm done!
  • Having said all that, it is best to do all your editing in the web interface, through Site Administration -> Language editing.
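The escaping rule from the tip can be illustrated with a short sketch. It assumes the string sits inside a double-quoted PHP literal; the key sketchwelcome, the URL and the text are made up for the example and do not exist in moodle.php.

```php
<?php
// Hypothetical entry, for illustration only.
$string = array();

// Every double quote inside the markup is escaped with a backslash,
// so it does not end the surrounding PHP string early.
$string['sketchwelcome'] = "Welcome! <a href=\"http://www.example.org/\" target=\"_blank\" title=\"Course page\">click here</a>";

// PHP removes the backslashes again when it reads the file.
echo $string['sketchwelcome'], "\n";
```

Leaving out a single backslash makes the whole file a PHP syntax error, which is consistent with the blank page described above.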

[Figure: Language string edit example]

Setup under Postfix

Add a line to your aliases file. The line should list a 3-letter prefix, to which we'll add a '|', and the path of the script. For example for a prefix of 'mdl' and moodle installed under /var/www/moodle we have in aliases:

   mdl:     |/var/www/moodle/admin/process_email.php
   noreply: |/var/www/moodle/admin/process_email.php

If you are using virtual domains, consult your server administrator for the correct configuration. It probably involves editing transports and mapping your address to a "pipe" transport. One such transport can be configured with the following lines in "master.cf":

  moodle  unix  -       n       n       -       -       pipe
     flags=FR user=www-data argv=/usr/bin/env SENDER=${sender} RECIPIENT=${recipient} /srv/www/moodle/admin/process_email.php

Because process_email.php uses environment variables to get the recipient and sender addresses, you also need to make sure the $_ENV variable is available within PHP; otherwise Moodle will not process any messages. See: http://www.php.net/manual/en/reserved.variables.environment.php#98113
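A quick way to check whether PHP can see the variables set by the pipe transport is sketched below. The helper name sketch_env_value() is hypothetical and not part of Moodle. $_ENV is only filled in when the variables_order setting in php.ini contains "E", so the sketch falls back to getenv(), which works regardless of that setting.

```php
<?php
// Hypothetical helper: read a variable set by the MTA's pipe transport.
function sketch_env_value(string $name): ?string {
    // $_ENV is empty unless php.ini's variables_order includes "E".
    if (isset($_ENV[$name]) && $_ENV[$name] !== '') {
        return $_ENV[$name];
    }
    // getenv() reads the process environment directly.
    $value = getenv($name);
    return ($value === false || $value === '') ? null : $value;
}

foreach (array('SENDER', 'RECIPIENT') as $name) {
    $value = sketch_env_value($name);
    echo $name, ': ', ($value === null ? 'not visible to PHP' : $value), "\n";
}
```

Running this from the same transport definition (in place of process_email.php) shows whether the variables arrive at all; it does not change how process_email.php itself reads them.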

Setup under Qmail

Depending on your setup, your aliases will be controlled by one or more of

  • /etc/aliases
  • /var/qmail/alias/.qmail-PREFIX

If you edit /etc/aliases add a line like this (for a prefix of 'mdl'):

   mdl:     |/var/www/moodle/admin/process_email.php
   noreply: |/var/www/moodle/admin/process_email.php

If you create /var/qmail/alias/.qmail-PREFIX, run:

 echo "|/var/www/moodle/admin/process_email.php" > /var/qmail/alias/.qmail-mdl
 echo "|/var/www/moodle/admin/process_email.php" > /var/qmail/alias/.qmail-noreply

To this three-letter prefix, we will add a '-' when sending and receiving messages. For more information, see the dot-qmail man page.

Setup under Exim

Open /etc/exim/exim.conf and add to trusted_users the user Apache and cron.php run as (usually "www-data" or "nobody").

Add a line to your aliases file. The line should list a 3-letter prefix, to which we'll add a '+', and the path of the script. For example for a prefix of 'mdl' we have in aliases:

   mdl:     |/var/www/moodle/admin/process_email.php
   noreply: |/var/www/moodle/admin/process_email.php

If you are using virtual domains, consult your server administrator for the correct configuration. It probably involves editing transports and mapping your address to a "pipe" transport.

More information can be found in the Exim documentation.

You will probably have to tell Exim not to lowercase the local-part in your exim configuration. This can be done in the router which handles the mail for your aliases file with: caseful_local_part = true

Developer info

Changed functions:

  • email_to_user() will set the envelope sender to a special bounce processing address (based on $CFG settings)
  • email_to_user() will accept (and set) a reply-to header, to be generated by the module calling the function.
  • associated string changes/additions

New functions:

  • generate_email_processing_address() - ALWAYS use this to generate the reply-to header. The reply-to header will look like this (LIMIT: 64 chars total):
    • prefix - EXACTLY four chars
    • encoded, packed moduleid (0 for core) - 2 chars
    • up to 42 chars for the modules to put anything they want in (can contain a userid or, e.g. for forum, the post ids to reply to, or anything really; 42 chars is the ABSOLUTE LIMIT)
    • 16 char hash (half an md5) of the first part of the address, together with a site "secret"
  • moodle_process_email() - any non-module email processing goes here (currently used for processing bounces)

New files:

admin/process_email.php

This script needs to be called from your MTA for anything starting with the 3 char prefix described above (and optionally, the noreply address).

How does it work? It breaks down and decodes the email address into the moduleid, validates the half-md5 hash, and calls $modname_process_email (if it exists). The arguments to these functions are $modargs (any part of the email address that isn't the prefix, modid or the hash) and the contents of the email (read from STDIN).
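The steps just described can be sketched roughly as follows. This is not the code in admin/process_email.php: every function name is hypothetical, the module id is assumed to be one byte, base64-encoded with the padding dropped, and the signature is computed as a plain md5() over the header plus a secret (a true HMAC would use hash_hmac() instead).

```php
<?php
// Sketch only. "Half an md5": the first 16 hex characters.
function sketch_sign(string $header, string $secret): string {
    return substr(md5($header . $secret), 0, 16);
}

// Split a local part into prefix, module id and module data, or return null
// when the length or the signature is wrong.
function sketch_parse_local_part(string $local, string $secret): ?array {
    if (strlen($local) < 22 || strlen($local) > 64) {
        return null;
    }
    $header = substr($local, 0, -16);
    $hash   = substr($local, -16);
    if (!hash_equals(sketch_sign($header, $secret), $hash)) {
        return null; // signature mismatch: ignore the message
    }
    $packed = base64_decode(substr($header, 4, 2) . '==', true);
    if ($packed === false || strlen($packed) !== 1) {
        return null;
    }
    return array(
        'prefix'  => substr($header, 0, 4),
        'modid'   => ord($packed),
        'modargs' => (string) substr($header, 6),
    );
}

// Call <modname>_process_email() if such a function exists.
function sketch_dispatch(array $parts, array $modnames, string $body) {
    $modname  = $modnames[$parts['modid']] ?? null;
    $function = $modname . '_process_email';
    if ($modname !== null && function_exists($function)) {
        return $function($parts['modargs'], $body);
    }
    return null;
}

// A made-up module handler and a made-up address to exercise the sketch.
function sketchmod_process_email($modargs, $body) {
    return "args=$modargs, bytes=" . strlen($body);
}

$secret = 'not-a-real-secret';
$header = 'mdl+' . 'BQ' . 'AQAAAA';              // prefix, module id 5, one encoded ID
$local  = $header . sketch_sign($header, $secret);
$parts  = sketch_parse_local_part($local, $secret);
echo sketch_dispatch($parts, array(5 => 'sketchmod'), "Hello"), "\n";
```

In the real script the body would come from STDIN; a literal string is used here so the sketch runs without an MTA.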

It doubles up as the noreply address autoresponder if you configure it with that address as well, replying with a friendly "this is not a real email address" message.

Module Authors

Take a look at the new functions moodle_process_email() and generate_email_processing_address() in moodlelib.php for ideas about how to:

  • encode and decode the arguments your module needs to do the processing
  • deal with multiple "actions" for any given module.

In a nutshell, users can send emails to Moodle using special dynamic addresses. These emails can trigger a call to a module function of the form modulename_process_email($str, $bodyofemail). The $str part will be up to 42 characters of data generated by Moodle (presumably by your own module), and the $bodyofemail part is the contents of the email (obtained by reading STDIN, and usually generated by the user's MUA).

The 42 characters come from the "local part" of an email address (the part before the @ sign) which can have up to 64 chars. Out of those 64 chars, Moodle uses 22 characters, leaving you with 42 characters to encode data.

What do we do with the 22 chars? Four go to the prefix, which we need to let the MTA know to pass the message to our script. Two go to identify the module ID so we know which module has generated the message (and so we dispatch the request to that module). The remaining 16 are a signature (HMAC-MD5-8) we use to authenticate the message.
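The character budget can be checked with a small sketch. The function sketch_build_local_part() is hypothetical and is not generate_email_processing_address(); it assumes the module id is a single byte, base64-encoded and cut to 2 characters, and that the signature is the first half of an md5 over the header and a site secret.

```php
<?php
// Sketch only: build the local part of a signed address and enforce the limits.
function sketch_build_local_part(string $prefix, int $modid, string $modargs, string $secret): string {
    if (strlen($prefix) !== 4) {
        throw new InvalidArgumentException('prefix must be exactly 4 characters');
    }
    if ($modid < 0 || $modid > 255) {
        throw new InvalidArgumentException('module id must fit in one byte');
    }
    if (strlen($modargs) > 42) {
        throw new InvalidArgumentException('module data is limited to 42 characters');
    }
    // 4 chars of prefix + 2 chars of module id + the module's own data ...
    $header = $prefix . substr(base64_encode(pack('C', $modid)), 0, 2) . $modargs;
    // ... + 16 chars of signature.
    return $header . substr(md5($header . $secret), 0, 16);
}

$local = sketch_build_local_part('mdl+', 0, str_repeat('x', 42), 'not-a-real-secret');
echo strlen($local), "\n"; // 4 + 2 + 42 + 16 = 64
```

With the full 42 characters of module data the local part is exactly at the 64-character limit, which is why the sketch rejects anything longer instead of truncating it.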

Forty-two characters isn't a lot (although it could be the answer to life, the universe, and everything!) so make sure you use those characters wisely.

The most efficient way we have found to encode database IDs in their full range (so that they can be placed in an email address) is base64_encode(pack('V',2147483647)), which returns "////fw==". The two trailing "==" are redundant and you can remove them (you'll need to re-append them when retrieving your data). Join your parameters as encoded IDs in positional slots for efficiency.

To retrieve your data, use substr() to separate your parameters, and then unpack('V',base64_decode($str)). Note that it'll return a one-element array.

Note:

Using 'V' reaches 2147483647, half the range of MySQL's unsigned INT. Additionally, 'V' behaves like a signed value rather than an unsigned one, so I suspect there's a bug in PHP's documentation of pack(). (A likely explanation is that PHP integers are themselves signed, so on a 32-bit build larger unpacked values come back negative.)

With each ID taking 6 chars (8 chars if we find a way to use the full range of 'V'), you have a limited number of parameters: at most seven 6-char IDs fit in 42 characters. If you need to encode more information, store it in the DB and send emails that point to your stored data. Remember to clean up this temporary data after a safe period of time.
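The encoding and decoding steps described above can be sketched with two small helpers. The names sketch_encode_id() and sketch_decode_id() and the two example IDs are made up; only pack(), unpack(), base64_encode(), base64_decode() and substr() come from the text.

```php
<?php
// Hypothetical helpers for fixed-width, 6-character ID slots.
function sketch_encode_id(int $id): string {
    // 4 little-endian bytes become 8 base64 chars, the last two being "==".
    return substr(base64_encode(pack('V', $id)), 0, 6);
}

function sketch_decode_id(string $slot): int {
    // Re-append the padding that was dropped when encoding.
    $unpacked = unpack('V', base64_decode($slot . '=='));
    return $unpacked[1]; // unpack() returns a one-element array, indexed from 1
}

// Two IDs in positional slots: chars 0-5 and chars 6-11.
$modargs = sketch_encode_id(42) . sketch_encode_id(1234567);
$userid  = sketch_decode_id(substr($modargs, 0, 6));
$postid  = sketch_decode_id(substr($modargs, 6, 6));
echo "$modargs -> $userid, $postid\n";
```

Because every slot has the same width, the receiving side needs no separators and can cut the string with substr() alone.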

Note:

Do not try to use variable-width encoding for IDs: it will work in small installations and break in larger ones.

Security issues

Any code in modulename_process_email() _must_ assume it will see repeat replies and handle them gracefully. The definition of 'gracefully' depends on what the code does.

Email servers (MTAs) will sometimes re-transmit a message if they are unsure that the receiving MTA got it -- and sysadmins may sometimes replay a whole email queue if something's gone wrong. In that case, the email body will be identical and the headers slightly different.

In a different case, the user may 'reply' to the message twice. Perhaps in error, perhaps purposefully. What to do depends on the specific scenario.

We /could/ support better protection at the framework level by keeping track of every reply-to address we send out. We decided against that because (a) the performance impact would be significant and (b) we want the first cut to be lightweight and simple to change in case we need to.

With this initial implementation, modules should expose functions that handle these "replay" cases correctly. If we later want to expose additional functions, we can add such tracking as an optional feature. It would be awful to have it across all emails sent from Moodle.
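One way a module might recognise repeated replies is sketched below. It is only an illustration: the class and function names are hypothetical, the in-memory array stands in for whatever store a module would really use (a database table in practice), and ignoring duplicates is just one possible meaning of 'gracefully'.

```php
<?php
// Sketch only: remembers which (address data, message text) pairs were seen.
class SketchReplayGuard {
    private $seen = array();

    // Returns true the first time a pair is seen and false on every repeat.
    public function first_time(string $modargs, string $bodytext): bool {
        $key = md5($modargs . "\n" . $bodytext);
        if (isset($this->seen[$key])) {
            return false;
        }
        $this->seen[$key] = time();
        return true;
    }
}

function sketchmod_process_email_once(SketchReplayGuard $guard, string $modargs, string $fullemail): string {
    // Retransmitted copies differ in their headers, so only the text after
    // the first blank line is compared.
    $parts = preg_split('/\r?\n\r?\n/', $fullemail, 2);
    $bodytext = isset($parts[1]) ? $parts[1] : '';
    return $guard->first_time($modargs, $bodytext) ? 'processed' : 'ignored duplicate';
}

$guard  = new SketchReplayGuard();
$first  = "Received: from mta-a\nSubject: Re: post\n\nI agree.";
$replay = "Received: from mta-b\nSubject: Re: post\n\nI agree.";
echo sketchmod_process_email_once($guard, 'AQAAAA', $first), "\n";
echo sketchmod_process_email_once($guard, 'AQAAAA', $replay), "\n";
```

A user who deliberately sends the same text twice is also treated as a duplicate here; whether that is the right behaviour depends on the module.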

Experimental email processing, pending merge into 1.9/2.0

Module Authors

The email interface lets users respond to emails sent via your module and so make changes in it. The basic idea is that each email is uniquely identified by an email session that carries a simple payload. This payload is stored in the database and cannot be changed by the user. Replies are handled by module code in your lib.php library, modulename_process_email($payload, $body), which is passed the session payload and the body (with full headers) of the email received, to be parsed as you like. Some utility functions are also available to help with this.

Here is an outline of the intended use:

  • Before sending an email to a user via the email_to_user() function, the module should check to see if the config variable $CFG->emailinterface is enabled. (This is off by default)
  • Assuming the email interface is enabled, the function generate_email_verp_address($moduleid, $payload, $userid) should be called to create an "email session". The payload should include any data you wish to use in your handling routine - such as the id of the user, or what part of the interface he is interacting with. The function will return a string containing the email session id.
  • Call the email_to_user() function as normal, appending the email session id as the 11th parameter. Keep in mind that you will probably also need to change the text or html of the email to point out that the email interface is active.
  • A function named [modulename]_process_email($payload, $body) should be written in the lib.php file of your module. This is the handler function, which is called when an email reply is received. It can inspect the contents of the payload received, then parse the email body for data. The $body parameter contains the full email, including the headers.

Note that each email can only be replied to once: the email session is destroyed once a reply has been received. If you wish to have multiple replies from a user, you will have to create a new session (and a new email) every time. The email session table is also pruned regularly; with the default settings, sessions persist for one month.
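The one-reply-per-session behaviour can be modelled with a short sketch. This is not the experimental Moodle code: the class SketchEmailSessions is hypothetical, it keeps sessions in memory rather than in a database table, it uses a random hex key, and it approximates "one month" as 30 days.

```php
<?php
// Sketch only: an in-memory model of one-shot email sessions.
class SketchEmailSessions {
    private $rows = array();
    private $lifetime;

    public function __construct(int $lifetime = 2592000) { // 30 days in seconds
        $this->lifetime = $lifetime;
    }

    // Store a payload and return the key that would travel in the address.
    public function create(string $payload, int $now): string {
        $key = bin2hex(random_bytes(8));
        $this->rows[$key] = array('payload' => $payload, 'created' => $now);
        return $key;
    }

    // Return the payload once, then forget the session.
    public function consume(string $key, int $now): ?string {
        $this->prune($now);
        if (!isset($this->rows[$key])) {
            return null;
        }
        $payload = $this->rows[$key]['payload'];
        unset($this->rows[$key]);
        return $payload;
    }

    // Drop sessions older than the configured lifetime.
    public function prune(int $now): void {
        foreach ($this->rows as $key => $row) {
            if ($now - $row['created'] > $this->lifetime) {
                unset($this->rows[$key]);
            }
        }
    }
}

$sessions = new SketchEmailSessions();
$key = $sessions->create('userid=7;postid=42', time());
var_dump($sessions->consume($key, time())); // the payload, on the first reply
var_dump($sessions->consume($key, time())); // NULL, the session is gone
```

Keeping the payload on the server side is what stops the user from changing it: only the key ever appears in the email.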

The following utility functions are also defined in admin/process_email.php, which can help:

  • get_email_subject($fullemail) - return the subject of an email
  • strip_email_headers($fullemail) - strip all headers off an email
  • email_is_multipart($fullemail) - determine if an email is a multipart message
  • seperate_multipart($multipartmsg) - separate a multipart email message into an array of parts, each of which has headers and a body.
  • get_multipart_content_type($part) - determine the content-type of a part, generally text/plain or text/html.
  • multipart_is_quoted_printable($part) - determine if an email part has been encoded with the quoted-printable encoding; some email clients tend to do this to HTML.
  • decode_quoted_printable($partdata) - decode an email part that has been encoded with quoted-printable encoding. This needs to be run on the data section of a part (strip_email_headers can be used to separate the headers of a part).
  • strip_email_reply($body, $ishtml, $identifier) - strip all irrelevant lines of an email message based on a unique identifier embedded within the original message. This is used to determine what is user input, and what is irrelevant data the email client has added.
  • strip_email_reply_multi($body, $ishtml, $identifier) - as above, but allowing for multiple sections of possible user input in a single email message.

Examples of the use of these functions can be seen in the handler for the forum module. The idea behind the strip reply functions is that the original email contains a unique identifier, such as a random 10-character string. This is embedded in marker lines that instruct the user to write between them, e.g.:

 \/ Please put your response here [ABCDEFGHIJ] \/
 <User Data>
 /\ Please put your response here [ABCDEFGHIJ] /\

This is necessary because of the wide differences between webmail and email clients in handling quotes and prefixing dates/times to replied emails. Keep in mind that text emails have an 80-character line width restriction, which must also cover the quote prefix added by the email client, so any line of a text email carrying this identifier should be kept to at most 77 characters to avoid wrapping by an email client. This unique identifier can easily be stored in the payload. Multiple input sections are handled similarly: a number is appended to the end of the identifier, e.g. [ABCDEFGHIJ1], to signify the first section of input. strip_email_reply_multi returns an array that maps each numbered section it finds to a string containing all data in that section.
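The marker idea can be illustrated with a simplified stand-in. The function sketch_extract_reply() below is hypothetical and is not the strip_email_reply() listed above: it handles plain text only, takes just the body and the identifier, and assumes the user's text lies between the first and the last line that carries the identifier.

```php
<?php
// Sketch only: return the text between the two marker lines, or null when
// the markers are missing or damaged.
function sketch_extract_reply(string $body, string $identifier): ?string {
    $marker = '[' . $identifier . ']';
    $lines = preg_split('/\r\n|\r|\n/', $body);
    $markers = array();
    foreach ($lines as $i => $line) {
        if (strpos($line, $marker) !== false) {
            $markers[] = $i;
        }
    }
    if (count($markers) < 2) {
        return null;
    }
    $first = $markers[0];
    $last  = $markers[count($markers) - 1];
    $reply = array_slice($lines, $first + 1, $last - $first - 1);
    // Drop quote prefixes such as "> " that the mail client may have added.
    $reply = preg_replace('/^(?:>\s?)+/', '', $reply);
    return trim(implode("\n", $reply));
}

$body = implode("\n", array(
    'On Monday, Moodle wrote:',
    '> \/ Please put your response here [ABCDEFGHIJ] \/',
    'My answer is 42.',
    '> /\ Please put your response here [ABCDEFGHIJ] /\\',
    '> Original message text',
));
echo sketch_extract_reply($body, 'ABCDEFGHIJ'), "\n";
```

Matching on the identifier alone, rather than on the whole marker line, keeps the sketch tolerant of the quote characters and date lines that clients add around it.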

Developer Info

The reply-to header is now encoded in base32 instead of base64 to allow for case-insensitive MTAs. generate_email_processing_address() was left as a backwards-compatibility wrapper; modules should now use generate_email_verp_address() instead.
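Why base32 survives case-insensitive handling can be seen in a small sketch. PHP has no built-in base32 function, so the encoder below is written out by hand; it uses the RFC 4648 alphabet without padding, which is an assumption, since the alphabet Moodle actually uses is not stated here. The function name is hypothetical.

```php
<?php
// Sketch only: unpadded base32 with the RFC 4648 alphabet (an assumption).
function sketch_base32_encode(string $bytes): string {
    if ($bytes === '') {
        return '';
    }
    $alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ234567';
    // Turn the input into one long string of bits.
    $bits = '';
    foreach (str_split($bytes) as $char) {
        $bits .= str_pad(decbin(ord($char)), 8, '0', STR_PAD_LEFT);
    }
    // Map every group of 5 bits to one character; the last group is
    // zero-padded on the right.
    $out = '';
    foreach (str_split($bits, 5) as $chunk) {
        $out .= $alphabet[bindec(str_pad($chunk, 5, '0', STR_PAD_RIGHT))];
    }
    return $out;
}

$encoded = sketch_base32_encode('foobar');
// An MTA that lowercases the local part loses nothing: one letter case only.
echo $encoded, ' ', strtoupper(strtolower($encoded)) === $encoded ? 'survives' : 'breaks', "\n";
```

The price is length: base32 needs 8 characters for every 5 bytes where base64 needs 8 for every 6, which leaves less room in the 64-character local part.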

Email sessions have been created to deal with the lack of space available in the header under base32. Payloads are stored in a new table, mdl_email_sessions, and are identified by a unique base32 key which is embedded in the reply-to header. Each row has a timestamp, and the table is pruned during the cleanup cron in admin/cron.php.

Bounce detection is now linked in with the email interface and is included in admin/process_email.php. Two new configuration variables, $CFG->emailinterface and $CFG->emailsessiontime, have been created to help configure these changes, and both have been added to the server settings admin page.