packetmischief.ca

Postfix


Introduction

Postfix is a widely used Mail Transfer Agent written by Wietse Venema. It's written to be fast, secure, and compatible with Sendmail.

The goal here is to monitor servers running Postfix to determine the number of email messages delivered locally and abroad per unit time and to graph that data. The tools used will be SNMP and MRTG.

The methods outlined below are based on the work of Craig Sanders <http://taz.net.au/postfix/mrtg/>. Some things are done differently here:

Code

update-mailstats.pl
Watches the maillog and updates the database file. This script requires the File::Tail module from CPAN.
postfix-stats-get.c
Retrieves data out of the database file.

Compile postfix-stats-get.c and move it to /var/net-snmp.

# gcc -o postfix-stats-get postfix-stats-get.c
# mv postfix-stats-get /var/net-snmp

Linux users: Make sure you have a copy of Berkeley DB installed and then compile using this command:

# gcc -DLINUX -o postfix-stats-get postfix-stats-get.c -ldb

Next, add update-mailstats.pl to the system start-up scripts or to a cronjob so that it runs when the system boots. The script will automatically daemonize itself.

Getting Stats from Postfix

Unfortunately, Postfix doesn't store internal statistics due to the way it's designed. Because of this, statistics have to be culled from the log file.

The update-mailstats.pl script runs in the background, watching the Postfix mail log. It records the number of sent and received email messages based on whether the message is to or from a local address or a remote one. It also records the number of 4XX and 5XX error codes Postfix returns for incoming mail and the number seen when trying to deliver mail. The data is recorded in a hash table, /tmp/postfix-mailstats.db.
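To make the classification idea concrete, here is a minimal Python sketch of how log lines might be sorted into counters. It is not the code of update-mailstats.pl. It assumes the usual Postfix log shapes (a "status=sent" line carrying a "relay=" field, an smtpd line carrying "client=", a pickup line carrying "uid="), and the function names classify and count are made up for illustration. The 4XX and 5XX error counting is left out.

```python
import re
from collections import Counter

# Assumption: delivery lines carry a "relay=<value>," field.
RELAY_RE = re.compile(r"relay=([^,\s]+)")


def classify(line):
    """Return a counter key such as 'sent:smtp' for one log line, or None."""
    if "postfix/" not in line:
        return None
    if "status=sent" in line:
        match = RELAY_RE.search(line)
        if match is None:
            return None
        relay = match.group(1)
        # A relay like "mx.example.org[192.0.2.1]:25" is a remote SMTP hop;
        # a bare word like "local" is treated as the delivery method name.
        return "sent:smtp" if "[" in relay else "sent:" + relay
    if "postfix/smtpd[" in line and "client=" in line:
        return "recv:smtp"   # message accepted over SMTP
    if "postfix/pickup[" in line and "uid=" in line:
        return "recv:local"  # message submitted on the local machine
    return None


def count(lines):
    """Tally the keys produced by classify() over an iterable of log lines."""
    stats = Counter()
    for line in lines:
        key = classify(line)
        if key is not None:
            stats[key] += 1
    return stats
```

Because the key for a non-SMTP delivery is built from the relay name, a delivery method with any other name would get its own counter without extra code in this sketch.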

The update-mailstats.pl script should be added to the system start-up scripts or to a cronjob so that it runs when the system boots.

Serving Stats via SNMP

Since the goal is to use SNMP to monitor the mail server, the data in the statistics file must be made available via SNMP. The Net-SNMP (formerly UCD-SNMP) SNMP daemon allows data to be retrieved using local shell scripts or programs. The data retrieved from these scripts is made available under the .1.3.6.1.4.1.2021.8.1 MIB table. More information on how this works is available in the snmpd.conf manpage (look for the exec keyword).

The postfix-stats-get program retrieves data from the database and passes it back to Net-SNMP. The program takes one command line argument which indicates the datapoint to retrieve, for example sent:smtp, recv:smtp, sent:local, or recv:local.

If you have defined custom delivery methods in Postfix's master.cf file (for example, I've defined a method called "legal" which inserts a legal disclaimer at the bottom of outgoing emails) then these tools should automatically create statistics for them too. In my case I can query sent:legal and it just works.
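The lookup that postfix-stats-get performs can be pictured with a short Python sketch. This is only an illustration, not a port of the C program. It uses Python's standard dbm module, which may not read the on-disk format the real tools write, and the function name get_stat is hypothetical. A missing key or missing file is reported as 0 here, which is an assumption rather than documented behaviour.

```python
import dbm


def get_stat(db_path, key):
    """Return the integer stored under key, or 0 if it cannot be read."""
    try:
        with dbm.open(db_path, "r") as db:
            raw = db.get(key.encode())
    except dbm.error:
        # Database missing or unreadable: report zero instead of failing.
        return 0
    # Values are assumed to be stored as decimal text, e.g. b"215".
    return int(raw) if raw is not None else 0
```

A wrapper script could print get_stat(path, "sent:smtp") on a single line. That is the kind of output an snmpd exec entry can expose.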

Edit snmpd.conf and add exec statements for each datapoint you want to query. Note that the argument after "exec" is arbitrary; it's just displayed in the extNames OID.

exec postfix-sent-smtp /var/net-snmp/postfix-stats-get sent:smtp
exec postfix-recv-smtp /var/net-snmp/postfix-stats-get recv:smtp
exec postfix-sent-local /var/net-snmp/postfix-stats-get sent:local
exec postfix-recv-local /var/net-snmp/postfix-stats-get recv:local

Once snmpd is restarted, a walk of the .1.3.6.1.4.1.2021.8.1 MIB will show the data from the hash table.

# snmpwalk -c community host .1.3.6.1.4.1.2021.8.1
enterprises.ucdavis.extTable.extEntry.extIndex.1 = 1
enterprises.ucdavis.extTable.extEntry.extIndex.2 = 2
enterprises.ucdavis.extTable.extEntry.extIndex.3 = 3
enterprises.ucdavis.extTable.extEntry.extIndex.4 = 4
enterprises.ucdavis.extTable.extEntry.extNames.1 = postfix-sent-smtp
enterprises.ucdavis.extTable.extEntry.extNames.2 = postfix-recv-smtp
enterprises.ucdavis.extTable.extEntry.extNames.3 = postfix-sent-local
enterprises.ucdavis.extTable.extEntry.extNames.4 = postfix-recv-local
enterprises.ucdavis.extTable.extEntry.extCommand.1 = /var/net-snmp/postfix-stats-get sent:smtp
enterprises.ucdavis.extTable.extEntry.extCommand.2 = /var/net-snmp/postfix-stats-get recv:smtp
enterprises.ucdavis.extTable.extEntry.extCommand.3 = /var/net-snmp/postfix-stats-get sent:local
enterprises.ucdavis.extTable.extEntry.extCommand.4 = /var/net-snmp/postfix-stats-get recv:local
enterprises.ucdavis.extTable.extEntry.extResult.1 = 0
enterprises.ucdavis.extTable.extEntry.extResult.2 = 0
enterprises.ucdavis.extTable.extEntry.extResult.3 = 0
enterprises.ucdavis.extTable.extEntry.extResult.4 = 0
enterprises.ucdavis.extTable.extEntry.extOutput.1 = 0
enterprises.ucdavis.extTable.extEntry.extOutput.2 = 215
enterprises.ucdavis.extTable.extEntry.extOutput.3 = 219
enterprises.ucdavis.extTable.extEntry.extOutput.4 = 4
enterprises.ucdavis.extTable.extEntry.extErrFix.1 = 0
enterprises.ucdavis.extTable.extEntry.extErrFix.2 = 0
enterprises.ucdavis.extTable.extEntry.extErrFix.3 = 0
enterprises.ucdavis.extTable.extEntry.extErrFix.4 = 0
enterprises.ucdavis.extTable.extEntry.extErrFixCmd.1 =
enterprises.ucdavis.extTable.extEntry.extErrFixCmd.2 =
enterprises.ucdavis.extTable.extEntry.extErrFixCmd.3 =
enterprises.ucdavis.extTable.extEntry.extErrFixCmd.4 =

Of interest are the extOutput lines which correspond to messages sent via SMTP, received via SMTP, sent locally, and received locally, respectively.

Collecting Stats with MRTG

Now that Postfix's statistics are available via SNMP, they can be graphed just the same as anything else.

Create two new targets in mrtg.cfg. Point the first one at .1.3.6.1.4.1.2021.8.1.101.1 and .1.3.6.1.4.1.2021.8.1.101.2 to get the SMTP deliveries and the second at .1.3.6.1.4.1.2021.8.1.101.3 and .1.3.6.1.4.1.2021.8.1.101.4 to get the local deliveries.

[Graph: Postfix SMTP Deliveries]

[Graph: Postfix Local Deliveries]

# mail delivered via smtp
Target[mail_smtp]: .1.3.6.1.4.1.2021.8.1.101.1&.1.3.6.1.4.1.2021.8.1.101.2:public@host
Title[mail_smtp]: SMTP Deliveries
LegendI[mail_smtp]: Sent
LegendO[mail_smtp]: Received
Legend1[mail_smtp]: Sent
Legend2[mail_smtp]: Received
YLegend[mail_smtp]: deliveries/min
ShortLegend[mail_smtp]: d/m
Options[mail_smtp]: nopercent, growright, integer, perminute
MaxBytes[mail_smtp]: 100000
PageTop[mail_smtp]: <h2>SMTP Deliveries</h2>

# mail delivered locally
Target[mail_local]: .1.3.6.1.4.1.2021.8.1.101.3&.1.3.6.1.4.1.2021.8.1.101.4:public@host
Title[mail_local]: Local Deliveries
LegendI[mail_local]: Sent
LegendO[mail_local]: Received
Legend1[mail_local]: Sent
Legend2[mail_local]: Received
YLegend[mail_local]: deliveries/min
ShortLegend[mail_local]: d/m
Options[mail_local]: nopercent, growright, integer, perminute
MaxBytes[mail_local]: 100000
PageTop[mail_local]: <h2>Local Deliveries</h2>
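
The targets above read ever-increasing counters and, with the perminute option, graph them as deliveries per minute. The arithmetic can be sketched in Python as follows. This is a simplified illustration rather than MRTG's own code, and the function name per_minute is made up. Skipping a sample when the counter goes backwards (for example after the database file is recreated) is one simple policy chosen for this sketch.

```python
def per_minute(prev_count, curr_count, elapsed_seconds):
    """Convert two counter readings into a deliveries-per-minute rate.

    Returns None when the counter went backwards, so that the sample
    can be skipped instead of producing a bogus spike.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    delta = curr_count - prev_count
    if delta < 0:
        return None
    return delta * 60.0 / elapsed_seconds
```

For instance, readings of 215 and 265 taken 300 seconds apart give 50 deliveries in five minutes, or 10 per minute.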

Notes

The hash file /tmp/postfix-stats.db has a fixed size; it won't increase in size over time. If the file is deleted for some reason (e.g., if the system reboots and clears out /tmp on start-up), update-mailstats.pl will recreate it. MRTG will also compensate when this happens as the counters it sees via SNMP will start from zero once the file is recreated.

As explained in the snmpd.conf manpage, when snmpd runs external commands such as postfix-stats-get, it caches the results in the file /var/net-snmp/.snmp-exec-cache. This file must be writeable by the user that snmpd is running as, or else it will not return the output from the external script being run.

The File::Tail module does not read the maillog in real time, so the database is not updated in real time either. There may be up to 60 seconds between database updates.
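
The delay comes from polling: the log is checked at intervals instead of being streamed. A minimal Python sketch of one polling step is shown below. It is not File::Tail's algorithm, only an illustration of the idea, and the function name read_new_lines is hypothetical. It returns the complete lines appended since the last check and starts over when the file has shrunk, as it would after log rotation.

```python
import os


def read_new_lines(path, offset):
    """Return (lines, new_offset) for complete lines appended after offset."""
    with open(path, "rb") as fh:
        fh.seek(0, os.SEEK_END)
        if fh.tell() < offset:
            offset = 0  # file shrank (truncated or rotated): start over
        fh.seek(offset)
        data = fh.read()
    # Consume only up to the last newline so a half-written line is
    # picked up whole on the next poll.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end
```

A caller would invoke this in a loop, sleeping between calls. Anything written during the sleep is only seen on the next pass, which is where the lag in the database comes from.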

Ongoing Issues

There are a few issues I'm having that I've either not been able to figure out or haven't had the time to try.

Duplicate entries showing up in the enterprises.ucdavis.extTable.extEntry table

I use Net-SNMP v5.1.3 on OpenBSD. My snmpd.conf file has these lines in it:

exec bind9-ok /var/net-snmp/bind9.sh ok
exec bind9-fail /var/net-snmp/bind9.sh fail
exec postfix-sent-smtp /var/net-snmp/postfix-stats-get sent:smtp
exec postfix-recv-smtp /var/net-snmp/postfix-stats-get recv:smtp
exec postfix-sent-local /var/net-snmp/postfix-stats-get sent:local
exec postfix-recv-local /var/net-snmp/postfix-stats-get recv:local

When doing an SNMP walk of the enterprises.ucdavis.extTable.extEntry table, this is the output (cut for brevity):

UCD-SNMP-MIB::extNames.1 = STRING: postfix-recv-local
UCD-SNMP-MIB::extNames.2 = STRING: bind9-fail
UCD-SNMP-MIB::extNames.3 = STRING: postfix-sent-smtp
UCD-SNMP-MIB::extNames.4 = STRING: bind9-ok
UCD-SNMP-MIB::extNames.5 = STRING: postfix-sent-local
UCD-SNMP-MIB::extNames.6 = STRING: postfix-recv-local
UCD-SNMP-MIB::extNames.7 = STRING: bind9-ok
UCD-SNMP-MIB::extNames.8 = STRING: postfix-recv-smtp
UCD-SNMP-MIB::extNames.9 = STRING: bind9-fail
UCD-SNMP-MIB::extNames.10 = STRING: postfix-sent-smtp
UCD-SNMP-MIB::extNames.11 = STRING: postfix-recv-smtp
UCD-SNMP-MIB::extNames.12 = STRING: postfix-sent-local

Each external command listed in the config file is duplicated in a seemingly random way within the MIB. I haven't done very extensive testing to see if it's a Net-SNMP version thing or if it's somehow related to the OS I'm running on. My workaround is to simply snmpwalk this MIB to see which OIDs I need to plug into the MRTG config file.
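
The manual lookup described above can be scripted. The Python sketch below takes the text of an snmpwalk over the extNames column and maps each name to the numeric extOutput OIDs that share its index. The function name output_oids is made up for illustration. The parser assumes output lines shaped like the ones shown on this page, and it keeps every index at which a name appears, so duplicates stay visible.

```python
import re
from collections import defaultdict

# extOutput column of the table under .1.3.6.1.4.1.2021.8.1
EXT_OUTPUT_BASE = ".1.3.6.1.4.1.2021.8.1.101"

# Matches "extNames.3 = STRING: name" as well as "extNames.3 = name".
NAME_RE = re.compile(r"extNames\.(\d+)\s*=\s*(?:STRING:\s*)?(\S+)")


def output_oids(walk_text):
    """Map each extNames value to the list of matching extOutput OIDs."""
    oids = defaultdict(list)
    for line in walk_text.splitlines():
        match = NAME_RE.search(line)
        if match:
            index, name = match.group(1), match.group(2)
            oids[name].append(EXT_OUTPUT_BASE + "." + index)
    return dict(oids)
```

Any of the OIDs returned for a name can then be pasted into a Target line in mrtg.cfg.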

MRTG Error "error status: noSuchName" / Non-numeric OIDs

When using non-numeric OIDs in the mrtg.cfg file (such as enterprises.ucdavis.extTable.extEntry.extOutput.3) I get the error "error status: noSuchName" every time MRTG does its thing. My mrtg.cfg file contains these lines:

LoadMIBs: /usr/local/share/snmp/mibs/SNMPv2-SMI.txt
LoadMIBs: /usr/local/share/snmp/mibs/SNMPv2-CONF.txt
LoadMIBs: /usr/local/share/snmp/mibs/SNMPv2-TC.txt
LoadMIBs: /usr/local/share/snmp/mibs/OPENBSD-BASE-MIB.txt
LoadMIBs: /usr/local/share/snmp/mibs/OPENBSD-PF-MIB.txt
LoadMIBs: /usr/local/share/snmp/mibs/UCD-SNMP-MIB.txt

I've toyed with this somewhat, in that I've verified this behavior happens with any MIB (UCD-SNMP-MIB, OPENBSD-PF-MIB, etc.) and that the error is always the same. I work around this problem by using numeric OIDs in mrtg.cfg, which is why all the examples on this site do likewise.

I'd actually be interested to know if anyone out there sees something I'm doing wrong or if they've overcome this issue themselves and how they did it.