Cyrus IMAP E-Mail Server Evaluation
By Dianna Laakso, UNIX Systems Administrator
As ACS has
continued to move student E-Mail to a fully client-server
based system, we have begun evaluating several IMAP
servers. This article compares the two most popular and
widely used IMAP server implementations, the Cyrus and
the UW IMAP servers, both of which are available as
freeware.
Cyrus is an implementation of the IMAP server that was
developed at Carnegie Mellon University. The UW IMAP
server originated at the University of Washington. The
implementation that we currently use for student E-Mail
is the UW IMAP server. The most basic difference between
the two implementations lies in the mailbox format. When
the decision was made several years ago to begin migrating
student E-Mail to IMAP, one reason UW was chosen was that
it stores E-Mail in the traditional UNIX (Berkeley)
mailbox format. Additionally, the UW IMAP
mailboxes are stored in the location where traditional
UNIX Berkeley-style E-Mail programs expect to find them.
This backward compatibility with Berkeley-style mailbox
format allowed us to provide IMAP access to mailboxes
without requiring a full migration from Berkeley to IMAP,
and supported access to mailboxes via both IMAP clients
(Simeon, Netscape) and standard UNIX mail reading
programs (Elm, Pine). Since IMAP clients communicate
directly with the IMAP server, the format of the mail
folders is invisible to the client, and any of the many
IMAP clients may be used to access both UW and Cyrus
mailboxes.
Let's review a few facts about our current E-Mail
system:
- Incoming mailboxes (estimated 20,000) all reside
on a single disk partition, /var/spool/, on the
machine that serves as our IMAP server (Jove).
- Personal mail folders reside in user home
directory space (~user/mail/).
- There are no quotas on incoming mail.
- All mail is stored in Berkeley mailbox format.
There are definite disadvantages to our E-Mail
paradigm:
- Storing all inboxes on a single disk partition
severely limits scalability.
- There is no mechanism for controlling the size of
inboxes or folders on a per-folder basis.
- Storing personal folders in home directories
makes administration of the folders awkward.
- The UW IMAP server does not support multiple
simultaneous write accesses to a single mailbox
(multiple read accesses are supported, however).
- The UW IMAP server does not directly support
access control lists (ACLs) on mailboxes, but
relies on a separate protocol (IMSP) to provide
access control.
The Cyrus IMAP implementation has features that solve
many of the problems cited above. Some of Cyrus's most
notable features and advantages over UW are described
below.
Features and Advantages of the Cyrus IMAP
Implementation
- Cross-partition mailbox storage.
There's a definite advantage to having the
capability of spreading the mailbox database
across disks and/or machines. As the mail
database grows (a rough estimate of our growth is
20% per semester), it can be restructured and
distributed across resources in some logical way,
rather than solving the problem by simply moving
the mail database to a larger disk.
If, for example, we were to fill our user home
directory disk partition on Jove, then under our
current scheme, we would have to copy all user
directories and files, not just mail folders, to
a new, larger disk. Our user home directory
partition is not a single disk, but rather, an
array of disks that is nearly 120 GB in size. To
copy 120 GB of data from one disk array to
another could require a full day of downtime (and
perhaps even more if all did not go well).
However, expanding a Cyrus mail database would
require no downtime, other than the time required
to install the new disk(s). All that would be
required would be to add the new disk to the
server, decide upon which part of the mailbox
database would be moved to the new disk, copy the
mailboxes to the new disk, and finally, issue an
administrative command to rename the partition
associated with the mailboxes. Mailbox names do
not change when mailboxes are moved across
partitions, so the move would be transparent to
end users.
A feature of Cyrus that is closely related to
the cross-partitioning ability is the Cyrus
mailbox namespace. The Cyrus namespace is based
on the "netnews" convention. Inboxes
are all named user.username (e.g. my inbox would
be named "user.dianna"), and other
mailboxes are named
user.username.folder.subfolder (some of my
folders might be "user.dianna.work" and
"user.dianna.lists.sun").
Mailboxes are known to Cyrus system-wide, and
access is controlled entirely by the mailbox's
ACL. A byproduct of the namespace design is that
every mailbox on the entire system has a unique
name, making personal mailbox folders easier to
manage from an administrator's point of view. In
our current system, it is possible, for example,
for my coworker and me to have identically named
mail folders (e.g. ~dianna/mail/work,
~mstgil/mail/work). This makes it very difficult
(not impossible, but definitely troublesome) to
manage personal mailboxes out of the context of
home directory space.
- Support for simultaneous read-write
connections to a single mailbox.
Having
multiple read-write accesses to a single mailbox
allows groups to share a mailbox more
efficiently. A case in point is the Computing
Center Helpdesk. Currently, only one of the
several Helpdesk consultants may answer Helpdesk
E-Mail at a time; the other consultants may read
E-Mail, but must wait until the working
consultant is finished before they can answer a
message.
- Mailbox ACLs.
ACLs on
mailboxes let the administrator or mailbox owner
grant other users, or the special identifiers
"anyone" or "anonymous," access to a mailbox. This
capability would be used to set up a shared
mailbox, such as the example of the Helpdesk
mentioned above. A fine granularity of access
control is possible: some users could be granted
write access, while others are allowed only
read-access.
- Storage quotas on mailbox hierarchies.
A
big motivational factor in our consideration of
Cyrus is the ability to set quotas on mailbox
usage. Cyrus storage quotas are completely
independent of UNIX disk quotas. The quota may be
applied to any or all levels of a mailbox
hierarchy. For example, suppose the following
quotas were applied to three of my mailboxes:
- user.dianna (quota of 5 MB)
- user.dianna.saved-messages (quota of 2 MB)
- user.dianna.list (quota of 1 MB)
- user.dianna.list.ssa (no quota set)
- user.dianna.list.sun (no quota set)
- user.dianna.work (no quota set)
Then user.dianna and user.dianna.work would have a
combined limit of 5 MB; user.dianna.saved-messages
would have a limit of 2 MB; and user.dianna.list.ssa
and user.dianna.list.sun would have a combined limit
of 1 MB. In all cases, the combined limit of all
mailboxes in the user.dianna hierarchy is that of the
top level mailbox (5 MB in the example above).
Only messages count towards the quota. Overhead
such as mail indexes and cache is not counted. If
delivering a mail message would put the mailbox over
its quota, then the message is delivered, but a
warning message is also delivered requesting that
the user free some space in the mailbox. If a mailbox
is already over quota at the time a message arrives,
then the message is not delivered, but several
delivery re-attempts are made at later times.
- Black box mail system.
Cyrus
is intended to run on a "sealed" server
machine. The advantages of this are mainly
administrative: increased scalability, decreased
administrative overhead, and increased
robustness.
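
The storage-quota example earlier in this list can be
sketched in a few lines of Python. This is only an
illustration of the lookup described in the example, under
the assumption that a mailbox is charged against the nearest
enclosing mailbox name that has a quota set. It is not code
from Cyrus, the function name governing_quota_root is
hypothetical, and the sketch does not model delivery,
warnings, or any additional top-level cap.

```python
from typing import Dict, Optional

# Quotas from the article's example, in megabytes. Mailboxes that
# have no quota of their own are simply absent from this table.
QUOTAS_MB: Dict[str, int] = {
    "user.dianna": 5,
    "user.dianna.saved-messages": 2,
    "user.dianna.list": 1,
}


def governing_quota_root(mailbox: str, quotas: Dict[str, int]) -> Optional[str]:
    """Return the nearest enclosing mailbox name that has a quota.

    Hypothetical helper: it tries the mailbox itself first, then each
    successively shorter parent name, and returns None if no quota
    applies anywhere along the way.
    """
    parts = mailbox.split(".")
    for end in range(len(parts), 0, -1):
        candidate = ".".join(parts[:end])
        if candidate in quotas:
            return candidate
    return None


if __name__ == "__main__":
    for name in (
        "user.dianna",
        "user.dianna.work",
        "user.dianna.saved-messages",
        "user.dianna.list.ssa",
        "user.dianna.list.sun",
    ):
        root = governing_quota_root(name, QUOTAS_MB)
        limit = QUOTAS_MB[root] if root is not None else None
        print(f"{name}: charged against {root} (limit {limit} MB)")
```

Running the script lists, for each mailbox in the example,
which quota it shares: user.dianna.work falls under
user.dianna, and the two list sub-folders fall under
user.dianna.list.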
Disadvantages in the Cyrus Implementation
- Conversion from Berkeley to Cyrus format
mailboxes.
A conversion program is
available in the public domain to migrate from
Berkeley-style mailboxes to Cyrus mailboxes, but
it was not designed for mass migration of the
magnitude of our current student E-Mail system.
Because Berkeley-style mailboxes are not
supported by Cyrus, all users' inboxes and
personal mailboxes would have to be migrated at
the time of the switch from UW to Cyrus.
- Cyrus doesn't scale well vertically.
Cyrus stores a listing of every mailbox name
on the system in a single, ASCII, memory-mapped
file known as the "mailboxes" file. On
a large system such as ours,
"mailboxes" would be, at the very
minimum, as many lines long as there are IMAP
inboxes (at least 20,000 lines long). Because the
file is memory mapped, each IMAP process would
also be at minimum as large as the memory space
that the "mailboxes" file occupies.
- Mail would no longer be stored in
Berkeley format.
As a result, users
would no longer be able to log in to a shell
account and manipulate mail folders using
standard UNIX utilities such as grep, cat, vi, or
gzip. Additionally, traditional Berkeley mail
reader programs such as elm could no longer be
used, unless the program supports the IMAP
protocol.
- A separate machine would be required for
shell login and other utilities.
Cyrus
servers are black box mail servers. UNIX shell
access would not be granted to any user on our
Cyrus server, meaning that if users required
shell access, it would have to be granted on
another machine.
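
To give a feel for the bookkeeping that the conversion
disadvantage above implies, here is a minimal sketch in
Python. It is not the public-domain conversion program
mentioned above and it does not write Cyrus mailboxes; it
only maps a folder path under ~user/mail/ to a name in the
user.username.folder.subfolder style, and counts the
messages in a Berkeley-format file using Python's standard
mailbox module. The helper names cyrus_name and
count_messages, the sample addresses, and the sample
messages are all hypothetical.

```python
import mailbox
import os
import tempfile


def cyrus_name(username: str, folder_path: str = "") -> str:
    """Map a folder path relative to ~user/mail/ to a dotted name.

    An empty folder_path stands for the user's inbox. Folder names
    that themselves contain a dot are not handled by this sketch.
    """
    parts = [p for p in folder_path.split("/") if p]
    return ".".join(["user", username] + parts)


def count_messages(mbox_path: str) -> int:
    """Count the messages in one Berkeley (mbox) format file."""
    box = mailbox.mbox(mbox_path, create=False)
    try:
        return len(box)
    finally:
        box.close()


# Two tiny made-up messages in Berkeley format, for the demo only.
SAMPLE_MBOX = (
    "From dianna@example.edu Mon Jan  4 09:00:00 1999\n"
    "Subject: first\n"
    "\n"
    "body one\n"
    "\n"
    "From mstgil@example.edu Mon Jan  4 09:05:00 1999\n"
    "Subject: second\n"
    "\n"
    "body two\n"
)

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "work")
        with open(path, "w", newline="\n") as handle:
            handle.write(SAMPLE_MBOX)
        print(cyrus_name("dianna", "work"), count_messages(path))
```

With names built this way, the two folders ~dianna/mail/work
and ~mstgil/mail/work from the earlier example would become
user.dianna.work and user.mstgil.work, which no longer
collide.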
Future plans
Cyrus has not been ruled out as a possibility, but the
advantages of storing E-Mail in Berkeley format make IMAP
servers based on the UW implementation more attractive
than Cyrus. We are in the process of evaluating the Sun
Internet Mail Server (SIMS), a UW-based server that has
been favorably reviewed in recent journals. Results of
our evaluation and decision will be published in an
upcoming issue of Benchmarks and in our UNIX
System News, which can be read at http://people.unt.edu/manage/.
The list of currently available IMAP client software
packages can be found at http://www.imap.org/products/longlist.html