Network Monitoring: BIND 9
- Introduction
- Getting Stats from BIND
- Serving Stats via SNMP
- Collecting Stats with MRTG
- Download
- Notes
- Ongoing Issues
Introduction
BIND is the Berkeley Internet Name Domain. It is one of the most popular DNS servers around, if not _the_ most popular.
The goal here is to monitor DNS servers running BIND version 9 to determine the number of successful and failed queries per unit time and to graph that data. The tools used will be SNMP and MRTG.
- BIND homepage: http://www.isc.org/index.pl?/sw/bind/
Getting Stats from BIND
BIND stores a number of statistics internally, including the number of successful and failed queries. To retrieve those stats, issue the rndc stats command. This will instruct BIND to dump the stats to the statistics-file as configured in named.conf.
# rndc stats
# cat /var/named/tmp/named.stats
+++ Statistics Dump +++ (1090945658)
success 268351
referral 0
nxrrset 1056
nxdomain 21191
recursion 5255
failure 78
--- Statistics Dump --- (1090945658)
The success and failure lines are of interest here.
Serving Stats via SNMP
Since the goal is to use SNMP to monitor the DNS server, the data in the statistics file must be made available via SNMP. The Net-SNMP SNMP daemon allows for data to be retrieved using local shell scripts or programs. The data retrieved from these scripts is made available under the .1.3.6.1.4.1.2021.8.1 MIB table. More information on how this works is available in the snmpd.conf manpage (look for the exec keyword).
The following lines are added to snmpd.conf:
exec bind9-ok /var/net-snmp/bind9.sh ok
exec bind9-fail /var/net-snmp/bind9.sh fail
The first line will return the number of successful queries, the second the number of failed. The /var/net-snmp/bind9.sh shell script has the task of taking data from BIND's statistics file and passing it to the SNMP daemon. The script is available here: bind9.sh.
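For readers who want a feel for what such a script has to do, here is a minimal sketch. It is not the bind9.sh linked above; the function names last_stat and bind9_stat and the STATS_FILE variable are hypothetical, and it assumes the statistics file uses the "label count" format shown earlier. Because the file may hold several dumps, only the values from the most recent one are reported.

```shell
# Hypothetical sketch, not the actual bind9.sh.

# Print the counter named $2 from the most recent dump in stats file $1.
last_stat() {
    awk -v key="$2" '
        /^\+\+\+ Statistics Dump/ { value = "" }   # a new dump starts: forget older values
        $1 == key { value = $2 }                   # remember the latest value for this label
        END { if (value == "") exit 1; print value }
    ' "$1"
}

# Ask BIND for a fresh dump, then print the requested counter.
# $1 is the argument used in snmpd.conf: ok or fail.
bind9_stat() {
    case "$1" in
        ok)   key=success ;;
        fail) key=failure ;;
        *)    echo "usage: bind9_stat ok|fail" >&2; return 1 ;;
    esac
    rndc stats || return 1
    # STATS_FILE is an assumed override; the default is the path used on this page.
    last_stat "${STATS_FILE:-/var/named/tmp/named.stats}" "$key"
}
```

A script built this way would end with a line such as bind9_stat "$1", so that snmpd receives a single number on standard output.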
A fellow network person who read this page contributed a second version of the script which will also return stats for "referral", "nxrrset", "nxdomain", and "recursion" queries. That script is here: bind9-new.sh. This script is called with the desired query type as its argument, e.g.:
exec bind9-success /var/net-snmp/bind9-new.sh success
exec bind9-failure /var/net-snmp/bind9-new.sh failure
exec bind9-nxdomain /var/net-snmp/bind9-new.sh nxdomain
exec bind9-recursion /var/net-snmp/bind9-new.sh recursion
Once snmpd is restarted, a walk of the .1.3.6.1.4.1.2021.8.1 MIB will show the script in action.
# snmpwalk -c community host .1.3.6.1.4.1.2021.8.1
enterprises.ucdavis.extTable.extEntry.extIndex.1 = 1
enterprises.ucdavis.extTable.extEntry.extIndex.2 = 2
enterprises.ucdavis.extTable.extEntry.extNames.1 = bind9-ok
enterprises.ucdavis.extTable.extEntry.extNames.2 = bind9-fail
enterprises.ucdavis.extTable.extEntry.extCommand.1 = /var/net-snmp/bind9.sh ok
enterprises.ucdavis.extTable.extEntry.extCommand.2 = /var/net-snmp/bind9.sh fail
enterprises.ucdavis.extTable.extEntry.extResult.1 = 0
enterprises.ucdavis.extTable.extEntry.extResult.2 = 0
enterprises.ucdavis.extTable.extEntry.extOutput.1 = 268814
enterprises.ucdavis.extTable.extEntry.extOutput.2 = 78
enterprises.ucdavis.extTable.extEntry.extErrFix.1 = 0
enterprises.ucdavis.extTable.extEntry.extErrFix.2 = 0
enterprises.ucdavis.extTable.extEntry.extErrFixCmd.1 =
enterprises.ucdavis.extTable.extEntry.extErrFixCmd.2 =
Of interest are the extOutput.1 and extOutput.2 results, which correspond to successful and failed queries, respectively.
Collecting Stats with MRTG
Now that BIND's statistics are available via SNMP, they can be graphed just the same as anything else.
Create a new target in mrtg.cfg and point it to the appropriate OIDs under .1.3.6.1.4.1.2021.8.1.101 (the extOutput column).
MRTG Graph of BIND9 Queries
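The target described above could be generated with a small helper like the following. This is only a sketch: the function name mrtg_bind9_target, the MaxBytes value, and the legend texts are assumptions chosen for illustration, and the row indexes should be confirmed with snmpwalk on the host in question. Without the gauge option MRTG treats the values as counters and graphs their rate of change, which matches the "queries per unit time" goal.

```shell
# Print a minimal mrtg.cfg target stanza that graphs two extOutput rows.
# $1 = target name
# $2 = community@host
# $3 = extTable row index for the first value (e.g. successful queries)
# $4 = extTable row index for the second value (e.g. failed queries)
mrtg_bind9_target() {
    base=1.3.6.1.4.1.2021.8.1.101    # extOutput column of the UCD extTable
    cat <<EOF
Target[$1]: $base.$3&$base.$4:$2
MaxBytes[$1]: 100000
Title[$1]: BIND9 Queries
Options[$1]: growright, nopercent
YLegend[$1]: Queries/sec
LegendI[$1]: Success
LegendO[$1]: Failure
EOF
}
```

For example, mrtg_bind9_target bind9 community@host 1 2 >> mrtg.cfg would append a stanza for the two rows shown in the walk above.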
Download
- bind9.sh
- Author: Joel Knight
Returns number of successful or failed queries. Usage:
bind9.sh ok|fail
- bind9-new.sh (recommended)
- Author: Evgeny Zislis (aka Kesor)
<evgeny.zislis..gmail.com>
Returns stats for all query types. Usage:
bind9-new.sh success|referral|nxrrset|nxdomain|recursion|failure
Notes
Be aware that when rndc stats is run, the statistics file isn't overwritten; it's appended to. This means the file grows every time the shell script is run (i.e., every time either of the two SNMP OIDs is queried). A good idea may be to add a weekly cron job to delete the file so that its size stays in check.
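One way to keep the file in check is sketched below. The function name trim_stats_file and the 1 MB default limit are assumptions, not part of the scripts offered on this page. It empties the file rather than deleting it, so ownership and permissions stay as they were; the counters themselves are kept by BIND, so the next rndc stats should simply write a fresh dump.

```shell
# Empty the statistics file once it grows past a size limit.
# $1 = path to the statistics file
# $2 = size limit in bytes (assumed default: 1048576)
trim_stats_file() {
    file=$1
    limit=${2:-1048576}
    [ -f "$file" ] || return 0            # nothing to do if the file does not exist
    size=$(($(wc -c < "$file")))          # arithmetic expansion strips any padding from wc
    if [ "$size" -gt "$limit" ]; then
        : > "$file"                       # truncate in place
    fi
}
```

Saved in a script and called as trim_stats_file /var/named/tmp/named.stats, this could be run from the weekly cron job suggested above.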
As explained in the snmpd.conf manpage, when snmpd runs external commands such as bind9.sh, it caches the results in the file /var/net-snmp/.snmp-exec-cache. This file must be writable by the user that snmpd runs as, or else snmpd will not return the output from the external script.
Ongoing Issues
There are a few issues I'm having that I've either not been able to figure out or haven't had the time to try.
Duplicate entries showing up in the enterprises.ucdavis.extTable.extEntry table
I use Net-SNMP v5.1.3 on OpenBSD. My snmpd.conf file has these lines in it:
exec bind9-ok /var/net-snmp/bind9.sh ok
exec bind9-fail /var/net-snmp/bind9.sh fail
exec postfix-sent-smtp /var/net-snmp/postfix-stats-get sent:smtp
exec postfix-recv-smtp /var/net-snmp/postfix-stats-get recv:smtp
exec postfix-sent-local /var/net-snmp/postfix-stats-get sent:local
exec postfix-recv-local /var/net-snmp/postfix-stats-get recv:local
When doing an SNMP walk of the enterprises.ucdavis.extTable.extEntry table, this is the output (cut for brevity):
UCD-SNMP-MIB::extNames.1 = STRING: postfix-recv-local
UCD-SNMP-MIB::extNames.2 = STRING: bind9-fail
UCD-SNMP-MIB::extNames.3 = STRING: postfix-sent-smtp
UCD-SNMP-MIB::extNames.4 = STRING: bind9-ok
UCD-SNMP-MIB::extNames.5 = STRING: postfix-sent-local
UCD-SNMP-MIB::extNames.6 = STRING: postfix-recv-local
UCD-SNMP-MIB::extNames.7 = STRING: bind9-ok
UCD-SNMP-MIB::extNames.8 = STRING: postfix-recv-smtp
UCD-SNMP-MIB::extNames.9 = STRING: bind9-fail
UCD-SNMP-MIB::extNames.10 = STRING: postfix-sent-smtp
UCD-SNMP-MIB::extNames.11 = STRING: postfix-recv-smtp
UCD-SNMP-MIB::extNames.12 = STRING: postfix-sent-local
Each external command listed in the config file is duplicated in a seemingly random way within the MIB. I haven't done very extensive testing to see if it's a Net-SNMP version thing or if it's somehow related to the OS I'm running on. My workaround is to simply snmpwalk this MIB to see which OIDs I need to plug into the MRTG config file.
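The lookup in that workaround can be scripted. The sketch below uses a hypothetical helper named ext_indexes, which reads a walk of the extNames column on standard input and prints every row index registered under a given name. It assumes the name is the last whitespace-separated field on each line and is not wrapped in quotes, as in the output shown above.

```shell
# Read snmpwalk output for the extNames column on stdin and print
# every row index whose name equals $1, one index per line.
ext_indexes() {
    awk -v name="$1" '
        $NF == name {
            n = split($1, parts, "[.]")   # the OID is the first field; its last component is the row index
            print parts[n]
        }
    '
}
```

It could be used as snmpwalk -v 1 -c community host .1.3.6.1.4.1.2021.8.1.2 | ext_indexes bind9-ok. With the duplicated table above this prints 4 and 7, and either index can then be appended to .1.3.6.1.4.1.2021.8.1.101 for the MRTG target.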
MRTG Error "error status: noSuchName" / Non-numeric OIDs
When using non-numeric OIDs in the mrtg.cfg file (such as enterprises.ucdavis.extTable.extEntry.extOutput.3) I get the error "error status: noSuchName" every time MRTG does its thing. My mrtg.cfg file contains these lines:
LoadMIBs: /usr/local/share/snmp/mibs/SNMPv2-SMI.txt
LoadMIBs: /usr/local/share/snmp/mibs/SNMPv2-CONF.txt
LoadMIBs: /usr/local/share/snmp/mibs/SNMPv2-TC.txt
LoadMIBs: /usr/local/share/snmp/mibs/OPENBSD-BASE-MIB.txt
LoadMIBs: /usr/local/share/snmp/mibs/OPENBSD-PF-MIB.txt
LoadMIBs: /usr/local/share/snmp/mibs/UCD-SNMP-MIB.txt
I've toyed with this somewhat: I've verified that this behavior happens with any MIB (UCD-SNMP-MIB, OPENBSD-PF-MIB, etc.) and that the error is always the same. I work around the problem by using numeric OIDs in mrtg.cfg, which is why all the examples on this site do likewise.
I'd actually be interested to know if anyone out there sees something I'm doing wrong or if they've overcome this issue themselves and how they did it.