Amavis is an open-source content filter for electronic mail, implementing mail message transfer, decoding, some processing and checking, and interfacing with external content filters to provide protection against spam, viruses and other malware. It can be considered an interface between a mailer (MTA, Mail Transfer Agent) and one or more content filters.
''Amavis'' can be used to:
* detect viruses, spam, banned content types or syntax errors in mail messages
* block, tag, redirect (using sub-addressing), or forward mail depending on its content, origin or size
* quarantine (and release), or archive mail messages to files, to mailboxes, or to a relational database
* sanitize passed messages using an external sanitizer
* generate DKIM signatures
* verify DKIM signatures and provide DKIM-based whitelisting
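Redirection by sub-addressing can be illustrated with a minimal Python sketch. The function name `add_subaddress` and the `+` delimiter are assumptions made for illustration; the delimiter actually in use depends on how the receiving MTA is configured.

```python
def add_subaddress(address: str, tag: str, delimiter: str = "+") -> str:
    """Return the address with a tag appended to its local part.

    Hypothetical helper: the "+" delimiter is only a common convention.
    """
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not a mailbox address: {address!r}")
    return f"{local}{delimiter}{tag}@{domain}"


print(add_subaddress("alice@example.com", "spam"))  # alice+spam@example.com
```

A delivery agent that understands the same delimiter can then file such mail into a separate folder without changing who receives it.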
Notable features:
* provides SNMP statistics and status monitoring using an extensive MIB with more than 300 variables
* provides a structured event log in JSON format
* supports the IPv6 protocol in interfacing, and IPv6 address forms in the mail header section
* properly honors per-recipient settings even in multi-recipient messages, while scanning a message only once
* supports international email (RFC 6530, SMTPUTF8, EAI, IDN)
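The idea of scanning a message once while still honoring per-recipient settings can be sketched in Python as follows. This is only an illustration under stated assumptions: `RecipientPolicy`, `scan_once`, `decide` and the toy scoring rule are hypothetical and do not reflect the actual Amavis code or configuration names.

```python
from dataclasses import dataclass


@dataclass
class RecipientPolicy:
    tag_level: float   # score at or above which the message is tagged
    kill_level: float  # score at or above which the message is blocked


def scan_once(message: bytes) -> float:
    """Stand-in for an expensive content scan; returns a toy spam score."""
    return 3.0 * message.lower().count(b"buy now")


def decide(message: bytes, policies: dict) -> dict:
    """Scan the message a single time, then apply each recipient's policy."""
    score = scan_once(message)  # the only scan, shared by all recipients
    actions = {}
    for recipient, policy in policies.items():
        if score >= policy.kill_level:
            actions[recipient] = "block"
        elif score >= policy.tag_level:
            actions[recipient] = "tag"
        else:
            actions[recipient] = "pass"
    return actions


policies = {
    "strict@example.com": RecipientPolicy(tag_level=2.0, kill_level=5.0),
    "lenient@example.com": RecipientPolicy(tag_level=5.0, kill_level=10.0),
}
print(decide(b"BUY NOW! Really, buy now.", policies))
```

The expensive step runs once per message; only the cheap threshold comparison is repeated per recipient.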
A common mail filtering installation with ''Amavis'' consists of Postfix as an MTA, SpamAssassin as a spam classifier, and ClamAV for anti-virus protection, all running under a Unix-like operating system. Many other virus scanners (about 30) and some other spam scanners (CRM114, DSPAM, Bogofilter) are supported too, as well as some other MTAs.
Interfacing topology
Three topologies for interfacing with an MTA are supported. The ''amavisd'' process can be sandwiched between two instances of an MTA, yielding a classical after-queue mail filtering setup. Alternatively, ''amavisd'' can be used as an SMTP proxy filter in a before-queue filtering setup. Finally, the ''amavisd'' process can be consulted to provide mail classification without forwarding the mail message itself, in which case the consulting client remains in charge of mail forwarding. This last approach is used in a Milter setup (with some limitations), or with a historical client program ''amavisd-submit''.
Since version 2.7.0 a before-queue setup is preferred, as it allows a mail message transfer to be rejected during the SMTP session with the sending client. In an after-queue setup, filtering takes place after a mail message has already been received and enqueued by an MTA, in which case a mail filter can no longer reject a message, but can only deliver it (possibly tagged), discard it, or generate a non-delivery notification, which can cause unwanted backscatter when bouncing a message with a fake sender address.
A disadvantage of a before-queue setup is that it requires resources (CPU, memory) proportional to the current (peak) mail transfer rate, unlike an after-queue setup, where some delay is acceptable and resource usage corresponds to the average mail transfer rate. With the introduction of the option ''smtpd_proxy_options=speed_adjust'' in Postfix 2.7.0, the resource requirements for a before-queue content filter have been much reduced.
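The difference between sizing for the peak rate and sizing for the average rate can be made concrete with a rough Python estimate. It assumes that the number of concurrently busy filter processes is approximately the arrival rate multiplied by the per-message processing time (Little's law); the function names and sample figures are hypothetical, and the effect of ''speed_adjust'' is not modelled.

```python
import math


def processes_needed(rate_per_second: float, seconds_per_message: float) -> int:
    """Approximate number of concurrently busy filter processes."""
    return max(1, math.ceil(rate_per_second * seconds_per_message))


def size_filter(rates_per_second: list, seconds_per_message: float) -> dict:
    """Compare sizing for the peak rate with sizing for the average rate."""
    peak = max(rates_per_second)
    average = sum(rates_per_second) / len(rates_per_second)
    return {
        "before_queue": processes_needed(peak, seconds_per_message),
        "after_queue": processes_needed(average, seconds_per_message),
    }


# Hypothetical load samples (messages per second) with one burst.
print(size_filter([2, 3, 20, 3], seconds_per_message=0.5))
# {'before_queue': 10, 'after_queue': 4}
```

In this simplified model the after-queue filter rides out the burst by letting mail wait in the MTA queue, which a before-queue filter cannot do while the sending client is still connected.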
In some countries the legislation does not permit mail filtering to discard a mail message once it has been accepted by an MTA, so this rules out an after-queue filtering setup with discarding or quarantining of messages, but leaves a possibility of delivering (possibly tagged) messages, or rejecting them in a before-queue setup (SMTP proxy or milter).
Interfacing protocols
''Amavis'' can receive mail messages from an MTA over one or more sockets of protocol families PF_INET (IPv4), PF_INET6 (IPv6) or PF_LOCAL (Unix domain socket), via the protocols SMTP or LMTP. Alternatively, a simple private protocol AM.PDP can be used with a helper program like ''amavisd-milter'' to interface with milters. On the output side, the protocols SMTP or LMTP can be used to pass a message to a back-end MTA instance or to an LDA, or a message can be passed to a spawned process over a Unix pipe. When SMTP or LMTP is used, a session can optionally be encrypted using the TLS STARTTLS (RFC 3207) extension to the protocol. SMTP Command Pipelining (RFC 2920) is supported in client and server code.
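The three socket families can be illustrated with a short Python sketch that maps an endpoint string to a family and an address. Python exposes the families as `AF_INET`, `AF_INET6` and `AF_UNIX`. The endpoint syntax, the function name `socket_family`, the port number and the socket path are assumptions for illustration, not Amavis configuration syntax.

```python
import socket


def socket_family(endpoint: str):
    """Map a (hypothetical) endpoint string to a socket family and address."""
    if endpoint.startswith("/"):
        # A filesystem path denotes a Unix domain socket (PF_LOCAL).
        return socket.AF_UNIX, endpoint
    if endpoint.startswith("["):
        # Bracketed form "[host]:port" denotes an IPv6 endpoint (PF_INET6).
        host, _, port = endpoint[1:].partition("]:")
        return socket.AF_INET6, (host, int(port))
    # Anything else is treated as "host:port" over IPv4 (PF_INET).
    host, _, port = endpoint.rpartition(":")
    return socket.AF_INET, (host, int(port))


for spec in ("127.0.0.1:10024", "[::1]:10024", "/var/run/filter.sock"):
    print(spec, "->", socket_family(spec))
```

The returned pair can be passed to `socket.socket(family, socket.SOCK_STREAM)` and `connect(address)` on platforms that provide all three families.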
Interfacing with SpamAssassin
When spam scanning is enabled, a daemon process ''amavisd'' is conceptually very similar to a ''spamd'' process of the SpamAssassin project. In both cases forked child processes call SpamAssassin Perl modules directly, hence their performance is similar.
The main difference is in the protocols used: ''Amavis'' typically speaks the standard SMTP protocol to an MTA, while in the spamc/spamd case an MTA typically spawns a ''spamc'' program and passes a message to it over a Unix pipe; the ''spamc'' process then transfers the message to a ''spamd'' daemon using a private protocol, and ''spamd'' then calls SpamAssassin Perl modules.
Design priorities
Design priorities of ''amavisd-new'' (from here on just called ''Amavis'') are: reliability, security, adherence to standards, performance, and functionality.
Reliability
With the intention that no mail message could be lost due to unexpected events like I/O failures, resource depletion and unexpected program terminations, the ''amavisd'' program meticulously checks the completion status of every system call and I/O operation. Unexpected events are logged if at all possible, and handled with several layers of event handling. Amavis never takes responsibility for a mail message delivery away from an MTA: the final success status is reported to an MTA only after the message has been passed on to the back-end MTA instance and reception was confirmed. In case of any fatal failure during processing or transferring of a message, the message being processed just stays in the queue of the front-end MTA instance, to be retried later. This approach also covers potential unexpected host failures and crashes of the amavisd process or one of its components.
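The rule that success is reported only after the back end has confirmed reception can be sketched in Python. This is a simplified illustration: the `forward` callable, the `relay` function and the exact reply texts are hypothetical, and a real filter would also distinguish permanent rejections from temporary failures.

```python
from typing import Callable


def relay(message: bytes, forward: Callable[[bytes], bool]) -> str:
    """Return the SMTP reply to give to the front-end MTA.

    `forward` is a hypothetical callable that hands the message to the
    back-end MTA and returns True only once reception is confirmed.
    """
    try:
        accepted = forward(message)
    except Exception:
        # Any failure means responsibility stays with the front-end MTA.
        accepted = False
    if accepted:
        return "250 2.0.0 Ok"
    # A temporary failure keeps the message in the front-end queue for retry.
    return "451 4.3.0 Temporary failure, try again later"


def broken_backend(message: bytes) -> bool:
    raise ConnectionError("back-end MTA unreachable")


print(relay(b"Subject: test\r\n\r\nhello\r\n", lambda m: True))
print(relay(b"Subject: test\r\n\r\nhello\r\n", broken_backend))
```

Because the positive reply is produced last, a crash at any earlier point leaves the front-end MTA without a confirmation, so it keeps the message and retries.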
The use of program resources like memory size, file descriptors, disk usage and creation of subprocesses is controlled. Large mail messages are not kept in memory, so the available memory size does not impose a limit on the size of mail messages that can be processed, and memory resources are not wasted unnecessarily.
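One way to keep memory use bounded regardless of message size is to stream incoming data in chunks to a spool that overflows to disk. The following Python sketch uses the standard `tempfile.SpooledTemporaryFile` for this; it illustrates the general technique only, and the function name and size limits are assumptions rather than a description of how Amavis itself stores messages.

```python
import io
import tempfile


def receive_to_spool(stream, max_in_memory: int = 64 * 1024,
                     chunk_size: int = 16 * 1024):
    """Copy a message stream into a spool that moves to disk when large."""
    spool = tempfile.SpooledTemporaryFile(max_size=max_in_memory)
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        spool.write(chunk)  # beyond max_in_memory the data lives in a file
    spool.seek(0)
    return spool


# Usage: a 200 kB message never occupies more than about one chunk
# plus the in-memory limit at any time.
message = io.BytesIO(b"x" * 200_000)
with receive_to_spool(message) as spool:
    print(len(spool.read()))  # 200000
```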
Security
A great deal of attention is given to security aspects, required by handling potentially malicious, nonstandard or just garbled data in mail messages coming from untrusted sources.
The process which is handling mail messages runs with reduced privileges under a dedicated user ID. Optionally it can run chroot-ed. Risks of buffer overflows and memory allocation bugs are largely avoided by implementing all protocol handling and mail processing in Perl, which handles dynamic memory management transparently. Care is taken that content of processed messages does not inadvertently propagate to the system. Perl provides an additional security safety net with its marking of tainted data originating from the wild, and Amavis is careful to put this Perl feature to good use by avoiding automatic untainting of data (''use re "taint"'') and only untainting it explicitly at strategic points, late in a data flow.
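Taint tracking is a feature of Perl itself and has no direct equivalent in Python, so the following sketch is only an analogy for the principle of explicit, late untainting. The `Tainted` class and its `untaint` method are hypothetical: untrusted text is wrapped so that it cannot be used as a plain string by accident, and it is released only after matching a strict pattern.

```python
import re


class Tainted:
    """Wraps untrusted text so it cannot be used as a string by accident."""

    def __init__(self, value: str):
        self._value = value

    def __str__(self) -> str:
        raise TypeError("tainted data must be untainted explicitly")

    def untaint(self, pattern: str) -> str:
        """Release the value only if it fully matches the given pattern."""
        match = re.fullmatch(pattern, self._value)
        if match is None:
            raise ValueError("value does not match the expected pattern")
        return match.group(0)


SAFE_NAME = r"[A-Za-z0-9._-]+"  # assumed whitelist for a file name

name = Tainted("report-01.txt")
print(name.untaint(SAFE_NAME))  # report-01.txt
```

A value such as `../etc/passwd` would fail the pattern and raise `ValueError`, so it never reaches a file system call.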
''Amavis'' can use several external programs to enhance its functionality. These are de-archivers, de-compressors, virus scanners and spam scanners. As these programs are often implemented in languages like C or C++, there is a potential risk that a mail message passed to one of these programs can cause its failure or even open a security hole. The risk is limited by running these programs under an unprivileged user ID, and possibly chroot-ed. Nevertheless, external programs like unmaintained de-archivers should be avoided. The use of these external programs is configurable, and they can be disabled selectively or as a group (like all decoders or all virus scanners).
Performance
Despite being implemented in an interpreted programming language, Perl, Amavis itself is not slow. The good performance of the functionality implemented by Amavis itself (not speaking of external components) is achieved by dealing with data in large chunks (e.g. not line-by-line), by avoiding unnecessary data copying, by optimizing frequently traversed code paths, by using suitable data structures and algorithms, as well as by some low-level optimizations. Bottlenecks are detected during development by profiling code and by benchmarking. A detailed timing report in the log can help recognize bottlenecks in a particular installation.
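Processing data in large chunks rather than line by line can be illustrated in Python with a digest computed over fixed-size blocks. The function name and the 256 KiB chunk size are arbitrary choices for this sketch and are not taken from Amavis.

```python
import hashlib
import io


def digest_in_chunks(stream, chunk_size: int = 256 * 1024) -> str:
    """Compute a SHA-256 digest by reading large fixed-size chunks."""
    digest = hashlib.sha256()
    # iter() with a sentinel calls stream.read(chunk_size) until it returns b"".
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


body = b"A line of mail text\r\n" * 50_000
print(digest_in_chunks(io.BytesIO(body)) == hashlib.sha256(body).hexdigest())
```

With this input the loop performs a handful of read and update calls, where a line-by-line loop would perform 50,000 of each.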
Certain external modules or programs like SpamAssassin or some command-line virus scanners can be very slow, and using these would constitute a vast majority of elapsed time and processing resources, making resources used by Amavis itself proportionally quite small.
Components like external mail decoders, virus scanners and spam scanners can each be selectively disabled if they are not needed. What remains is functionality implemented by Amavis itself, like transferring mail messages from and to an MTA using the SMTP or LMTP protocol, checking mail header section validity, checking for banned mail content types, and verifying and generating DKIM signatures.
As a consequence, mail processing tasks like DKIM signing and verification (with other mail checking disabled) can be exceptionally fast and can rival implementations in compiled languages.
Even full checks using a fast virus scanner but with spam scanning disabled can be surprisingly fast.
Adherence to standards
Implementation of protocols and message structures closely follows a set of applicable standards such as RFC 5322, RFC 5321, RFC 2033, RFC 3207, RFC 2045, RFC 2046, RFC 2047, RFC 3461, RFC 3462, RFC 3463, RFC 3464, RFC 4155, RFC 5965, RFC 6376, RFC 5451, RFC 6008, and RFC 4291. In several cases some functionality was re-implemented in the ''Amavis'' code where a public (CPAN) Perl module exists but lacks attention to detail in following a standard, or lacks sufficient checking and handling of errors.
License
Amavis is licensed under the GPLv2 license. This applies to the current code, as well as to historical branches. Exceptions are some of the supporting programs (like monitoring and statistics reporting), which are covered by the New BSD License.
The project
The project started in 1997 as a Unix shell script to detect and block e-mail messages containing a virus. It was intended to block viruses at the MTA (mail transfer agent) or LDA (local delivery) stage, running on a Unix-like platform, complementing other virus protection mechanisms running on end-user personal computers.
Next the tool was re-implemented as a Perl program, which later evolved into a daemonized process. A dozen developers took turns during the first five years of the project, developing several variants while keeping a common goal, the project name and some of the development infrastructure.
Since December 2008 (until 2018-10-09) the only active branch was officially ''amavisd-new'', which was being developed and maintained by Mark Martinec since March 2002. This was agreed between the developers at the time in a private correspondence: Christian Bricart, Lars Hecking, Hilko Bengen, Rainer Link and Mark Martinec. The project name ''Amavis'' is largely interchangeable with the name of the ''amavisd-new'' branch.
Much functionality has been added through the years, like adding protection against spam and other unwanted content, besides the original virus protection. The focus is kept on reliability, security, adherence to standards and performance.
A domain ''amavis.org'' in use by the project was registered in 1998 by Christian Bricart, one of the early developers, who is still maintaining the domain name registration. The domain is now entirely dedicated to the only active branch. The project mailing list was moved from SourceForge to amavis.org in March 2011, and is hosted by Ralf Hildebrandt and Patrick Ben Koetter. The project web page and the main distribution site were located at the Jožef Stefan Institute, Ljubljana, Slovenia (until the handover in 2018), where most of the development took place between 2002 and 2018.
Change of Project Leaders Announcement
On October 9, 2018 Mark Martinec announced on the general support and discussion mailing list his retirement from the project, and also that Patrick Ben Koetter would continue as the new project leader. After that, Patrick announced the migration of the source code to a public GitLab repository and his plans for the next steps regarding the project development.
Branches and the project name
Through the history of the project the name of the project or its branches varied somewhat. Initially the spelling of the project name was ''AMaViS'' (A Mail Virus Scanner), introduced by Christian Bricart. With a rewrite to Perl the name of the program was ''Amavis-perl''.
Daemonized versions were initially distributed under a name ''amavisd-snapshot'' and then as ''amavisd''. A modular rewrite by Hilko Bengen was called ''Amavis-ng''.
In March 2002 the ''amavisd-new'' branch was introduced by Mark Martinec, initially as a patch against ''amavisd-snapshot-20020300''. This later evolved into a self-contained project, which is now the only surviving and actively maintained branch. Nowadays the project name is preferably spelled ''Amavis'' (while the name of the program itself is ''amavisd''). The name ''Amavis'' is now mostly interchangeable with ''amavisd-new''.
See also
* List of antivirus software
* SpamAssassin, a popular open source spam classifier