Implementation of the Daemons

1. Directory Organisation

SMTP daemon code is in the smtpd directory. IMAP daemon code is in the imapd directory. Shared code is in the common directory.

2. Language

The Decimail SMTP and IMAP daemons are both implemented in C++. I use the STL container types and STL strings throughout. Total size is about 6000 lines of code.

3. The Message Class

Typically an IMAP client will show the user a listing of all, or at least several of the most recent, of the messages in a mailbox. The user then selects one message to read, which is displayed. To support this efficiently the IMAP protocol allows the client to fetch only certain attributes of a message or set of messages. When it receives such a request Decimail avoids fetching entire messages from the database when possible. This is achieved using the Message class and its subclasses.

A Message object represents a single message in an essentially read-only form. It provides accessor functions to get attributes such as the subject, date, etc. as well as larger portions such as all of the headers or the entire message. IMAP is aware of the MIME multipart structure of a message and can request, for example, the body of a message without the attachments. To support this the mimetic MIME-parsing library is used (http://mime.codesink.org/mimetic_mime_library.html); the Message class has accessor functions to get mimetic Header and root MimeEntity objects.

Message is an abstract base class. It has three implementations: MessageStr, MessageFile and MessageDb. As the names suggest these are initialised with a message from a string, from a file, and from the database respectively.

To work efficiently, these classes operate lazily. That is, they postpone loading or parsing the message until some part of it is requested. This is most important in the case of MessageDb. When the object is created it simply notes the requested message number and does nothing. When an accessor is called it will fetch either the message header or the message body or both, as needed.
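The lazy behaviour described above can be sketched as follows. This is a minimal illustration, not the Decimail source: the class name LazyDbMessage, the two accessor names and the Loader callables (which stand in for the database queries) are all assumptions made for the example.

```cpp
#include <functional>
#include <string>
#include <utility>

// Abstract interface: callers see accessors, never the loading strategy.
class Message {
public:
    virtual ~Message() = default;
    virtual const std::string& header() = 0;
    virtual const std::string& body() = 0;
};

// Hypothetical stand-in for MessageDb. The Loader callables represent the
// database queries; the real class talks to PostgreSQL instead.
class LazyDbMessage : public Message {
public:
    using Loader = std::function<std::string(long)>;

    // Construction only records the message ID; nothing is fetched yet.
    LazyDbMessage(long msgid, Loader load_header, Loader load_body)
        : msgid_(msgid),
          load_header_(std::move(load_header)),
          load_body_(std::move(load_body)) {}

    // Each part is fetched at most once, on first use.
    const std::string& header() override {
        if (!have_header_) {
            header_ = load_header_(msgid_);
            have_header_ = true;
        }
        return header_;
    }

    const std::string& body() override {
        if (!have_body_) {
            body_ = load_body_(msgid_);
            have_body_ = true;
        }
        return body_;
    }

private:
    long msgid_;
    Loader load_header_;
    Loader load_body_;
    bool have_header_ = false;
    bool have_body_ = false;
    std::string header_;
    std::string body_;
};
```

A client that only lists subjects would call header() and never trigger the body query, which is the saving the paragraph above is after.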

4. The libpbe Daemon Class

Both the SMTP daemon and the IMAP daemon are based on a general-purpose daemon base-class in libpbe. (libpbe is a personal collection of utility functions supplied with Decimail.)

The libpbe Daemon class provides the basis for a multi-threaded daemon process. It looks after all the TCP/IP and thread handling. There is an example of how to use the class called inc_daemon in the libpbe examples directory.

The SMTP and IMAP daemons create subclasses of Daemon that implement its session() virtual method. This is invoked with input and output file descriptors as parameters within a new thread when a new connection is established.
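The shape of such a subclass might look like the sketch below. It is not the libpbe interface: DaemonSketch and EchoDaemon are hypothetical names, the exact signature of session() is assumed from the description above, and the listening socket and thread creation that the real base class handles are left out entirely. It uses the POSIX read() and write() calls.

```cpp
#include <cstddef>
#include <string>
#include <unistd.h>

// Hypothetical stand-in for the libpbe base class. The real class also owns
// the listening socket and starts one thread per connection.
class DaemonSketch {
public:
    virtual ~DaemonSketch() = default;
    // Called once per connection with that connection's descriptors.
    virtual void session(int in_fd, int out_fd) = 0;
};

// A toy protocol: greet the client, then echo its input until end of file.
class EchoDaemon : public DaemonSketch {
public:
    void session(int in_fd, int out_fd) override {
        const std::string greeting = "* OK ready\r\n";
        write_all(out_fd, greeting.data(), greeting.size());
        char buf[256];
        ssize_t n;
        while ((n = read(in_fd, buf, sizeof buf)) > 0) {
            write_all(out_fd, buf, static_cast<std::size_t>(n));
        }
    }

private:
    // write() may accept fewer bytes than requested, so loop until done.
    static void write_all(int fd, const char* data, std::size_t len) {
        while (len > 0) {
            ssize_t w = write(fd, data, len);
            if (w <= 0) return;  // this sketch simply gives up on error
            data += w;
            len -= static_cast<std::size_t>(w);
        }
    }
};
```

Because session() sees only a pair of descriptors, the protocol code can be exercised with pipes or files as easily as with a socket.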

We are fortunate that essentially no communication is required between the different threads in these daemons. Even so it is sobering to think that these are long-running processes where things like memory leaks will be unacceptable, and that threads provide no memory-space separation between one connection's code and another's. Care is required, and I think that, for example, the STL string and container classes help.

5. The IMAP Daemon

The IMAP Daemon is the more complex of the two due to the complexities of the IMAP protocol and the inherently more difficult task involved. It also requires more effort because its performance is perceived by the user in the responsiveness of their mail client.

The IMAP specification RFCs 3501 and/or 2060 are essential reading if you want to understand what this code is trying to achieve.

An Imapd::Session object is created for each connection. This stores all of the per-connection state such as the name of the logged-in user and the selected mailbox. There is one connection to the database, implemented by a DmDatabase object, per session.

A per-session flex/bison parser is used to recognise IMAP commands from the client. (Using flex and bison in C++ with threads is painful. The details are hidden within a CommandParser object.) The bison rules create an object of a subclass of Command representing the command that they have parsed. There is one subclass for each IMAP command, and for those with more complex parameters such as FETCH and SEARCH there are additional classes to describe those parameters. The Command objects store no state other than the parameter values from the IMAP command string.

Once a complete command has been parsed it is executed by invoking Command::run(). This invokes the virtual method runbody() which is implemented by the command-specific subclasses. The session is passed as a parameter.
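A reduced sketch of this run()/runbody() arrangement is shown below. The SessionSketch type, the tag handling and the "OK completed" line are assumptions made for illustration; the text above does not say what Command::run() does around the call to runbody().

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical, much-reduced session: just enough state for the example.
struct SessionSketch {
    std::string selected_mailbox;
    std::vector<std::string> output;  // lines that would go to the client
};

class Command {
public:
    explicit Command(std::string tag) : tag_(std::move(tag)) {}
    virtual ~Command() = default;

    // Common wrapper: run the command-specific part, then (as an assumed
    // example of shared behaviour) emit the tagged completion line.
    void run(SessionSketch& s) {
        runbody(s);
        s.output.push_back(tag_ + " OK completed");
    }

protected:
    // Implemented once per IMAP command.
    virtual void runbody(SessionSketch& s) = 0;

private:
    std::string tag_;
};

// The only state held is the parameter parsed from the command string.
class SelectCmd : public Command {
public:
    SelectCmd(std::string tag, std::string mailbox)
        : Command(std::move(tag)), mailbox_(std::move(mailbox)) {}

protected:
    void runbody(SessionSketch& s) override { s.selected_mailbox = mailbox_; }

private:
    std::string mailbox_;
};

class NoopCmd : public Command {
public:
    explicit NoopCmd(std::string tag) : Command(std::move(tag)) {}

protected:
    void runbody(SessionSketch&) override {}
};
```

Keeping run() non-virtual means anything common to all commands lives in one place, while each subclass only describes what is specific to it.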

Once a user has logged in and selected an initial mailbox, exactly one mailbox is selected at any time. (If a client seems to have more than one mailbox open it may have more than one IMAP connection open.) The selected mailbox is changed using the IMAP SELECT command (not to be confused with the SQL command of the same name!). The Session class stores information about this currently-selected mailbox.

When a new mailbox is selected, Session::set_mailbox looks up the SQL query that gets the message IDs for the messages in the new mailbox. (This is the basic idea underlying Decimail's use of PostgreSQL; see Using the Database for more about this.) Session::get_msg_ids() then runs this query.

Once a mailbox has been selected, IMAP has two ways of referring to messages within the mailbox. It can either use message sequence numbers, which simply number the messages from 1 to however many there are in the mailbox, or it can use unique IDs (which I call "message IDs" or commonly msgid in the source code). IMAP only requires that unique IDs are unique within the mailbox, but Decimail chooses to make them globally unique and uses them as the primary key to the messages table in the database. Unique IDs must be strictly ascending within a mailbox but need not be contiguous.

To retrieve a message from the database a message ID is needed (it is used to create a MessageDb object, see above). The Session object keeps track of the sequence numbers and corresponding message IDs so that it can convert back and forth when necessary in seqnum_to_msgid[] (a vector) and msgid_to_seqnum[] (a map). These are initialised in get_msg_ids().

The FETCH command is particularly complicated yet is important for efficiency reasons. It first has to find the message IDs for the messages being fetched. MessageSetCmd::get_msg_ids() does this; what is involved depends on whether this is a FETCH or UID FETCH command. It then asks the Session object to prefetch all of these messages. The idea here is that since the same attributes of all of these messages will be wanted, it may be possible to optimise the fetch, perhaps using a single database query rather than one for each message. At present Session::prefetch_messages() will read the headers for all of the requested messages in a single database query, which gets some but not all of the potential speedup. It also avoids re-fetching messages that it already has. It then takes each message in turn and builds up the response that it will return to the client, which is in a complex format dependent on the particular attributes requested. The attributes are represented by a list of FetchAtt subclass objects built by the parser, for example FetchUid which returns the message's ID or FetchRfc822Size which returns the size of the message. Each of these classes has an extract() method that is applied to the Message object and obtains and suitably formats the required data.
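The last step, applying a list of attribute objects to a message, can be sketched as below. FetchAtt, FetchUid, FetchRfc822Size and extract() are named in the text above; the MessageSketch struct, the exact signature of extract() and the fetch_response() helper are assumptions for the example.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical, eager stand-in for the Message class.
struct MessageSketch {
    long uid;
    std::string text;  // the complete message
};

// One subclass per attribute that a FETCH can ask for.
class FetchAtt {
public:
    virtual ~FetchAtt() = default;
    // Returns the attribute formatted as it appears in the FETCH response.
    virtual std::string extract(const MessageSketch& m) const = 0;
};

struct FetchUid : FetchAtt {
    std::string extract(const MessageSketch& m) const override {
        return "UID " + std::to_string(m.uid);
    }
};

struct FetchRfc822Size : FetchAtt {
    std::string extract(const MessageSketch& m) const override {
        return "RFC822.SIZE " + std::to_string(m.text.size());
    }
};

// Builds the untagged FETCH response for one message by applying each
// requested attribute in turn.
std::string fetch_response(std::size_t seqnum, const MessageSketch& m,
                           const std::vector<std::unique_ptr<FetchAtt>>& atts) {
    std::string out = "* " + std::to_string(seqnum) + " FETCH (";
    for (std::size_t i = 0; i < atts.size(); ++i) {
        if (i > 0) out += ' ';
        out += atts[i]->extract(m);
    }
    out += ')';
    return out;
}
```

The parser only has to build the list once per command; the same list is then applied to every message in the set.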

5.1. Notification of New Messages

The IMAP spec allows servers to send "unilateral data" to clients to notify them of new messages or of other changes to the state of the mailbox. These other changes result from other sessions accessing the same mailbox and, for example, modifying the seen/unseen state of messages.

Although the spec requires that clients respond to such unilateral data it seems that most don't; they poll for new mail by sending NOOP commands and get confused by data from the server that they weren't expecting. To further complicate things there is a protocol extension called IDLE that essentially defines the same capability but with extra overhead. Decimail tries to support both mechanisms.

The basic mechanism that Decimail uses is PostgreSQL's asynchronous notification system. When new messages arrive or flags are changed [actually at the moment flag changes are not propagated, but they could be] a server trigger is fired which sends a notification to listening IMAP daemons. A separate thread, started from Session::run(), waits for data on the PostgreSQL socket, indicating a notification. If new messages are detected, send_any_unilateral_updates() attempts to notify the IMAP client.

A mutex is needed so that the main IMAP-command-parsing thread and the notification monitoring thread do not try to communicate with the database, or send output to the IMAP client, simultaneously.
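The locking pattern can be sketched with the standard library as below. This is an illustration only: ClientChannel is a hypothetical name, the output is collected in a vector rather than written to a socket, and I am not claiming that Decimail uses std::mutex rather than some other mutex type.

```cpp
#include <mutex>
#include <string>
#include <vector>

// Hypothetical wrapper around the output side of one IMAP connection.
class ClientChannel {
public:
    // Sends every line of one response while holding the lock, so a
    // response from the command thread and an update from the notification
    // thread can never be interleaved line by line.
    void send_response(const std::vector<std::string>& lines) {
        std::lock_guard<std::mutex> guard(mutex_);
        for (const std::string& line : lines) {
            sent_.push_back(line);  // stands in for writing to the client
        }
    }

    // Returns a copy taken under the lock.
    std::vector<std::string> sent() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return sent_;
    }

private:
    mutable std::mutex mutex_;
    std::vector<std::string> sent_;
};
```

Both threads would call send_response() on the same object; the lock is released automatically when guard goes out of scope, including when an exception is thrown.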

Note that the possibility of changes to the mailbox from a concurrent connection impacts on the way that messages are stored by the Session and FetchCmd code described above. We rely on the fact that the actual content of messages cannot change, only their flags can. Values of flags are not cached, in fact they are not stored in the Message objects at all; each time a fetch requests flag values they are read from the database.

5.2. Deleting and Expunging

IMAP defines a two-stage delete mechanism. Decimail adds a third stage.

The first stage is that messages are marked as deleted (or rather pending deletion) by having their Deleted flag set. This is reversible. Decimail handles this like any other flag and it all works.

The second stage is the IMAP EXPUNGE command which is supposed to actually remove the pending-delete messages. This is not a highlight of the IMAP protocol. The problem is that after an EXPUNGE the message sequence numbers have changed. This presents difficulties for a second connection to the same mailbox, which may be sending a command when the expunge happens and so suffers a race condition, not knowing whether its commands are interpreted in the context of the old message numbers or the new ones. As a consequence of this, with any IMAP server I would suggest never performing an EXPUNGE when there may be more than one client connected.
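The renumbering can be made concrete with a small sketch. The function below is not Decimail code; it only models a mailbox as a vector of message IDs indexed by sequence number, and returns the sequence numbers that would be announced to the client, following the IMAP rule that each removal immediately renumbers the messages after it.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Removes every message whose ID is in `deleted` from `mailbox` (a list of
// message IDs in sequence-number order). Returns the sequence numbers that
// would be reported in the untagged EXPUNGE responses, in order.
std::vector<std::size_t> expunge(std::vector<long>& mailbox,
                                 const std::set<long>& deleted) {
    std::vector<std::size_t> announced;
    std::size_t i = 0;
    while (i < mailbox.size()) {
        if (deleted.count(mailbox[i]) > 0) {
            announced.push_back(i + 1);  // sequence numbers start at 1
            mailbox.erase(mailbox.begin() + static_cast<std::ptrdiff_t>(i));
            // Do not advance: the next message has moved into slot i.
        } else {
            ++i;
        }
    }
    return announced;
}
```

With five messages and the second and fourth marked deleted, the announced numbers are 2 and then 3, and the message that was number 5 ends up as number 3. A second connection that still believes in the old numbering would now be addressing the wrong messages.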

It is slightly worse for Decimail as each message may appear in more than one mailbox. So expunging in one mailbox may cause messages in another mailbox to be expunged. Some clients keep more than one connection open in order to monitor multiple mailboxes. In this case the inconsistencies that would normally only be seen with multiple clients could be seen within a single client. I would therefore suggest that EXPUNGE should only be performed when closing down the client. ("Expunge when closing" is a common option.)

The EXPUNGE command causes Decimail to remove the messages from the database, but the backup files remain. You may decide that you want to retain these backup files indefinitely. If you want to remove them when the message is expunged, you need to run an additional daemon, the "delete daemon". A database trigger fires each time a message is expunged and writes a row in the table delete_log. The delete daemon monitors this log and removes the corresponding backup files.

5.3. Supported IMAP Commands

The IMAP daemon tries to be compliant with the IMAP specification(s), but isn't quite there. The areas of non-compliance are described here. Basically, some IMAP features are not applicable to the Decimail environment, and aren't implemented, and a few others aren't implemented because they're hard and don't seem to be important for the clients I have used. The vast majority of the spec is implemented. Feedback about where improvements would be useful would be appreciated.

"The IMAP Specification" is a vague term. This implementation was based on RFC2060, which defines "IMAP4rev1". Unfortunately there is another specification, RFC3501, which also defines "IMAP4rev1". Why this isn't called "IMAP4rev2" I don't know - there are differences, albeit small ones. I have made some changes based on RFC3501.

Command Status
CAPABILITY Implemented.
NOOP Implemented.
LOGOUT Implemented.
AUTHENTICATE No authentication mechanisms other than login with cleartext password are implemented. I did try to implement PREAUTH for localhost using identd but it confused the clients. Libraries are available to implement the authentication; all that is needed is to understand how they work and to patch them in.
LOGIN Implemented.
SELECT Implemented.
EXAMINE Implemented.
CREATE Implemented as part of the Actions framework.
DELETE Implemented as part of the Actions framework.
RENAME Implemented as part of the Actions framework.
SUBSCRIBE Implemented.
UNSUBSCRIBE Implemented.
LIST Implemented.
LSUB Implemented.
STATUS Implemented.
APPEND Not implemented. Append is supposed to store a new message, supplied by the client, at the end of a mailbox. This could be used for storing messages in a "sent mail" mailbox or for storing drafts. The difficulty in Decimail is that the mailbox(es) in which a message is visible is not determined by where you "put it", but by which mailbox queries it matches. I work around this by bcc:ing messages to myself in order to keep copies of outgoing messages. This leaves the question of saving drafts, which needs some solution.
CHECK Implemented.
CLOSE Implemented.
EXPUNGE Implemented.
SEARCH Mostly implemented. A basic implementation of the previously-missing SEARCH HEADER is now included, but it will not operate correctly in some cases and is inefficient. Some other searches use queries that are not as efficient as they could be, and in particular queries with a number of search terms do not merge those search terms to form a combined query in a form that PostgreSQL can execute efficiently. Issues with RECENT (see below) mean that SEARCH OLD and SEARCH NEW can't work. Explicitly user-initiated searching is not important to Decimail as the smart use of mailboxes should make it unnecessary; however, it seems that some clients make use of search commands for their own internal purposes, such as "UID SEARCH UID" which does nothing useful as far as I can see. Use cases and profiling would be interesting.
FETCH Implemented. Support for message threading may be broken in some clients because FETCH currently does not return in-reply-to information in the ENVELOPE response.
STORE Implemented.
COPY Implemented as part of the Actions framework.
UID Implemented.
IDLE Implemented. (IDLE is an extension specified in RFC 2177.)

Other areas of less-than-complete support are:

The IMAP daemon has been tested and seen to be at least superficially functional with the following mail clients. Additional compatibility information would be appreciated.

Of the GUI clients, Thunderbird seems to be the best, though they all have their oddities. Thunderbird 0.6 and above is the only GUI client I've used in which unilateral data, using the IDLE extension, is supported.

Here is a short list of mail clients that don't work:

5.4. IMAP Daemon Efficiency

As noted above the performance of the IMAP daemon is directly perceived by the user in terms of the responsiveness of their mail client when browsing and viewing mail.

My experience so far is that the area where efficiency must first be considered is ensuring that the SQL queries that are being executed are efficient. In particular I have encountered forms of SQL syntax that are less efficient by a factor of O(n) than an exactly equivalent query. This is where I will be focusing my attention.

Apart from this, we must look for big-O improvements in the C++ code. This could be O(message size), O(mailbox size), or O(database size). Profiling is required.

One question to ask is whether the rope type could be more appropriate in some parts of the code than string. My initial experience is that it is necessary to convert to or from string in too many places for this to be appealing.

Proving the rule of thumb that your estimates of what parts of your code will be performance-critical are always wrong, I was surprised to find that initial client start-up time is poor due to the time taken to list all available mailboxes. But this is probably an SQL problem.

6. The SMTP Daemon

The SMTP daemon performs three tasks:

The libpbe Daemon class is the base class for the SmtpDaemon class. It creates an Smtpd::Session object for each connection. The Session object implements the server side of the SMTP protocol and receives a message and list of recipients. For each recipient Session::lookup_local_user() determines whether it is a local or remote address by looking up the address in the database's incoming_addresses table. If the connection is not from localhost, only incoming addresses are accepted (no relaying).
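The per-recipient decision can be sketched as below. The Route enum and classify_recipient() are hypothetical names, and the exact-match lookup in a std::set is a simplification standing in for the query against the incoming_addresses table, which may well match more loosely.

```cpp
#include <set>
#include <string>

enum class Route { Incoming, Outgoing, Rejected };

// Decides what to do with one recipient.
//   incoming_addresses: stands in for the incoming_addresses table
//   from_localhost:     whether the SMTP connection came from localhost
Route classify_recipient(const std::set<std::string>& incoming_addresses,
                         const std::string& recipient,
                         bool from_localhost) {
    if (incoming_addresses.count(recipient) > 0) {
        return Route::Incoming;  // local user: store in the database
    }
    if (from_localhost) {
        return Route::Outgoing;  // remote address, trusted sender: forward
    }
    return Route::Rejected;      // remote address, remote sender: no relaying
}
```

The order of the tests matters: a local recipient is accepted regardless of where the connection comes from, and only the remaining remote addresses are subject to the localhost check.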

6.1. Incoming Messages

Session::process_incoming() is called for each recipient of an incoming message. It first gets a new message ID for the message from the database. It then calls Session::save_backup() to create a backup file whose name is based on the message ID. Having done this it creates a MessageStr object based on the message and inserts it into the database. The tsearch2 message contents index is updated as the message is inserted.

6.2. Outgoing Messages

Session::process_outgoing() is called for each recipient of an outgoing message. It first performs various transformations on the message as described in the next section. The Session object contains an SmtpClient object (from libpbe; the libpbe examples directory contains mailform.cc which shows how it can be used). The SmtpClient is connected to the smart host defined in the database configuration table the first time it is needed. process_outgoing() calls SmtpClient::send_msg() to send the message.

6.3. Message Transformations

The first portion of Session::process_outgoing() is responsible for transforming the message as an anti-spam tactic. This logic should really be abstracted somehow, rather than being hard-coded here. I'd like for there to be some way in which each user could define their own rules and have them applied here. But at the moment I'm not sure what is needed, and would welcome comments. Here is a summary of what this code does.

I configure my email client(s) to set the From: address to "auto@MY-DOMAIN". This address is recognised as special because it matches a row in the auto_users table. Messages that don't have a match bypass this code.

The aim is to replace "auto" with something else. Since the domain is a personal one, mail coming to any address in the domain is for me. So "auto" can be replaced with anything and I will get the replies.

When I fill in web forms or similar, I give unique addresses, typically something like "spam_from_safeway". This has two benefits. First, I could use it to filter incoming mail. Second, if one of these addresses starts sending me spam, I can just block that address - and I know who was responsible for giving my address to a spammer.

So this logic finds a suitable address to use as follows. First, it tries to work out if the message is a reply. If it is, it looks up the original message in the database and finds the address that it was sent to, and uses that as the address.

If it isn't a reply, it looks to see if a specific custom address has been recorded for this correspondent in the custom_addrs table (actually a view). This is useful for mailing lists; when I subscribe to a list I give a unique address such as "spam_from_xyz_list".

If these both fail, a random address is generated based on the template from auto_users. The random address used is recorded in the random_addrs table which influences the custom_addrs view, so the same address will be used next time I write to that recipient.
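The three-step choice described above can be sketched as follows. Everything here is illustrative: AddressBook and choose_sender() are hypothetical names, the maps stand in for database lookups, detecting a reply through an In-Reply-To value is my assumption (the text only says the code "tries to work out" whether the message is a reply), and the random generator is passed in as a callable so that the template handling is left out.

```cpp
#include <functional>
#include <map>
#include <string>

// Hypothetical in-memory stand-in for the database lookups.
struct AddressBook {
    // Message-ID of a received message -> the address it was sent to.
    std::map<std::string, std::string> received_at;
    // Correspondent -> address to use when writing to them
    // (stands in for the custom_addrs view).
    std::map<std::string, std::string> custom_addrs;
};

// Picks the address that replaces "auto" for one outgoing message.
std::string choose_sender(AddressBook& book,
                          const std::string& in_reply_to,
                          const std::string& recipient,
                          const std::function<std::string()>& make_random) {
    // 1. A reply reuses the address that the original was sent to.
    if (!in_reply_to.empty()) {
        auto original = book.received_at.find(in_reply_to);
        if (original != book.received_at.end()) return original->second;
    }
    // 2. Otherwise use an address already recorded for this correspondent.
    auto custom = book.custom_addrs.find(recipient);
    if (custom != book.custom_addrs.end()) return custom->second;
    // 3. Otherwise generate a new address and record it, so that the same
    //    address is used the next time this recipient is written to.
    std::string fresh = make_random();
    book.custom_addrs[recipient] = fresh;
    return fresh;
}
```

Written this way the policy is a single function of the message and a few lookups, which may be one route towards abstracting it out of the SMTP daemon.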

Comments about these techniques, and how they could be abstracted out of the SMTP daemon code in a useful way, would be helpful.

7. Logging

The IMAP and SMTP daemons can log all communication with clients to syslog. You can enable and disable the IMAP logging using the imapd_log_imap configuration setting (see configuration.sql). To control SMTP logging use the smptd_log_conversation configuration setting.

Other syslog calls are in the uncaught-exception-handling code in Daemon.

Note that when enabled the log is complete AND INCLUDES PLAINTEXT PASSWORDS.

When multiple sessions are active at the same time their activity will be interleaved in the log. To help distinguish the sessions a session number is shown between { }.

It can also be helpful to log database queries. PostgreSQL has various options to enable this; adding "log_statement = true" to postgresql.conf just logs all statements. You can also log only statements that take more than a certain amount of time to execute. See the PostgreSQL documentation for details.