UPDATE: Did you know there’s an official Cacti guide? Find it at
Cacti 0.8 Beginner’s
Guide.
For more info about SNMP please don’t hesitate to take a look at
Essential SNMP, Second
Edition.
To take a look at existing MIBs, free tools are available on the net,
IMHO the best one being
MibBrowser. This
multiplatform [Java] MIB browser has a free version which should be more
than enough for our basic task. The screen capture shown here depicts a
“Get Subtree” operation on the ‘.1.3.6.1.4.1.2021.11’ MIB; the result is
a list of single-value MIBs, such as
‘.1.3.6.1.4.1.2021.11.11.0’, which has the alias ‘ssCpuIdle.0’ and value
97 [meaning that the CPU is 97% idle]. You can see the alias by loading
the corresponding MIB file [select File/Load MIB then choose
‘UCD-SNMP-MIB.txt’ from the list of predefined MIBs].
From the command line, you can display existing MIB values with snmpwalk^3^:

    snmpwalk -Os -c [community_name] -v 1 [hostname] .1.3.6.1.4.1.2021.11

For the .1.3.6.1.4.1.2021.11 OID (.iso.org.dod.internet.private.enterprises.ucdavis.systemStats), the result is:

    snmpwalk -v 1 -c sncq localhost .1.3.6.1.4.1.2021.11
    UCD-SNMP-MIB::ssIndex.0 = INTEGER: 1
    UCD-SNMP-MIB::ssErrorName.0 = STRING: systemStats
    UCD-SNMP-MIB::ssSwapIn.0 = INTEGER: 0
    UCD-SNMP-MIB::ssSwapOut.0 = INTEGER: 0
    UCD-SNMP-MIB::ssIOSent.0 = INTEGER: 4
    UCD-SNMP-MIB::ssIOReceive.0 = INTEGER: 2
    UCD-SNMP-MIB::ssSysInterrupts.0 = INTEGER: 4
    UCD-SNMP-MIB::ssSysContext.0 = INTEGER: 1
    UCD-SNMP-MIB::ssCpuUser.0 = INTEGER: 2
    UCD-SNMP-MIB::ssCpuSystem.0 = INTEGER: 1
    UCD-SNMP-MIB::ssCpuIdle.0 = INTEGER: 96
    UCD-SNMP-MIB::ssCpuRawUser.0 = Counter32: 17096084
    UCD-SNMP-MIB::ssCpuRawNice.0 = Counter32: 24079
    UCD-SNMP-MIB::ssCpuRawSystem.0 = Counter32: 6778580
    UCD-SNMP-MIB::ssCpuRawIdle.0 = Counter32: 599169454
    UCD-SNMP-MIB::ssCpuRawKernel.0 = Counter32: 6778580
    UCD-SNMP-MIB::ssIORawSent.0 = Counter32: 998257634
    UCD-SNMP-MIB::ssIORawReceived.0 = Counter32: 799700984
    UCD-SNMP-MIB::ssRawInterrupts.0 = Counter32: 711143737
    UCD-SNMP-MIB::ssRawContexts.0 = Counter32: 1163331309
    UCD-SNMP-MIB::ssRawSwapIn.0 = Counter32: 23015
    UCD-SNMP-MIB::ssRawSwapOut.0 = Counter32: 13730
Each of these values has its own significance; for instance,
‘ssCpuIdle.0’ reports that the CPU is 96% idle.
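If you want to pick one value out of a walk from a script, a few lines of awk are enough. What follows is only a sketch: `walk_value` is a made-up helper name, the sample lines are copied from the output above rather than read from a live daemon, and it only handles single-word values such as integers and counters.

```shell
#!/bin/sh
# Sample lines copied from the snmpwalk output above; on a live system
# this text would come from the snmpwalk command itself.
sample_walk='UCD-SNMP-MIB::ssCpuUser.0 = INTEGER: 2
UCD-SNMP-MIB::ssCpuSystem.0 = INTEGER: 1
UCD-SNMP-MIB::ssCpuIdle.0 = INTEGER: 96'

# walk_value NAME: read snmpwalk output on stdin and print the value of
# the object whose alias is NAME (hypothetical helper, single-word values only).
walk_value() {
    awk -v name="$1" '
        # Strip any "MODULE::" prefix so short and full names both match.
        { key = $1; sub(/^.*::/, "", key) }
        key == name { print $NF; exit }
    '
}

idle=$(printf '%s\n' "$sample_walk" | walk_value ssCpuIdle.0)
echo "CPU idle: ${idle}%, busy: $((100 - idle))%"
```

On a real server you would pipe the snmpwalk command into `walk_value` instead of the sample text.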
To retrieve just a single value from the list, pass its alias as a parameter to the snmpget command, for instance:

    snmpget -Os -c [community_name] -v 1 [hostname] UCD-SNMP-MIB::ssCpuIdle.0

Sometimes you want to monitor something which you cannot find in the list of MIBs. Say, for instance, the performance of a MySQL database that you’re pounding pretty hard with your webapp^4^. The easiest way of doing this is to pass through a script: SNMP implementations can take the output of any script and expose it through the protocol, line by line. Suppose you want to keep track of the values obtained with the following script:

    #!/bin/sh
    /usr/bin/mysqladmin -uroot status | /usr/bin/awk '{printf("%f\n%d\n%d\n",$4/10,$6/1000,$9)}'

The mysqladmin command and a bit of simple awk magic display the following three values, each on a separate line:

- number of open connections / 10
- number of queries / 1000
- number of slow queries

It is interesting to note that, while the first value is an instantaneous, gauge-like one, the following two are incremental, growing as long as new queries and new slow queries are recorded. We’ll keep this in mind for later, when we track these values. But for now, let’s see how these three values are exposed through SNMP.

The first step is to tell the SNMP daemon that the script has an associated MIB. This is done in the configuration file, usually located at /etc/snmp/snmpd.conf. The following line attaches the execution of the script [for example /home/user/myscript.sh] to a certain OID:

    exec .1.3.6.1.4.1.111111.1 MySQLParameters /home/user/myscript.sh

The ‘.1.3.6.1.4.1.111111.1’ OID is a branch of ‘.1.3.6.1.4.1’ [meaning ‘.iso.org.dod.internet.private.enterprises’]. We tried to make it look ‘legitimate’, but obviously you can use any sequence you want here.
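To check the awk formatting from the wrapper script without a running MySQL server, you can feed it a canned status line. This is a sketch under assumptions: the sample line follows the usual `mysqladmin status` layout but its numbers are made up, and `format_status` is just a name picked for this example.

```shell
#!/bin/sh
# Made-up sample in the layout printed by "mysqladmin status";
# the numbers are illustrative, not taken from a real server.
sample_status='Uptime: 86400  Threads: 9  Questions: 18551000  Slow queries: 108  Opens: 50  Flush tables: 1  Open tables: 40  Queries per second avg: 214.7'

# Same awk program as the wrapper script: field 4 is the thread count,
# field 6 the query count, field 9 the slow query count.
# LC_ALL=C keeps the decimal separator a dot whatever the locale.
format_status() {
    LC_ALL=C awk '{printf("%f\n%d\n%d\n", $4/10, $6/1000, $9)}'
}

printf '%s\n' "$sample_status" | format_status
```

With this sample the three lines printed are 0.900000, 18551 and 108, one per line, which is the shape the SNMP daemon will expose.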
After restarting the daemon, let’s interrogate the freshly created OID, either in MibBrowser [see the following image] or with snmpwalk:

    snmpwalk -Os -c [community_name] -v 1 [hostname] .1.3.6.1.4.1.111111.1

The result is:

    enterprises.111111.1.1.1 = INTEGER: 1
    enterprises.111111.1.2.1 = STRING: "MySQLParameters"
    enterprises.111111.1.3.1 = STRING: "/etc/snmp/mysql_params.sh"
    enterprises.111111.1.100.1 = INTEGER: 0
    enterprises.111111.1.101.1 = STRING: "0.900000"
    enterprises.111111.1.101.2 = STRING: "18551"
    enterprises.111111.1.101.3 = STRING: "108"
    enterprises.111111.1.102.1 = INTEGER: 0
    enterprises.111111.1.103.1 = ""

Great! Now we have the proof that it really works and that our specific values, extracted with a custom script, are visible through SNMP. Let’s go back to Cacti and see how we can make some nice charts out of them^5^.

Cacti has this nice feature of defining ‘templates’ that you can reuse afterwards. My strategy is to define a data template for each of the 3 parameters I want to chart, using the ‘Duplicate’ function applied to the ‘SNMP – Generic OID Template’.
On the duplicated data source template, you have to change the data source
title, the name displayed in charts, the data source type [use DERIVE for
incremental counters and GAUGE for instantaneous values], the specific OID
and the SNMP community. Do it for each of the three values.
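The OID to enter in each template is the one holding the matching output line of the script. As a rough sketch, assuming the ‘.101.N’ layout seen in the snmpwalk result holds for your daemon, the three rows can be printed like this; `exec_output_oid` and the titles are placeholders made up for the example.

```shell
#!/bin/sh
# Base OID from the exec line in the SNMP daemon configuration.
base_oid='.1.3.6.1.4.1.111111.1'

# exec_output_oid LINE: OID holding output line LINE of the script,
# following the ".101.N" layout seen in the snmpwalk result
# (hypothetical helper).
exec_output_oid() {
    printf '%s.101.%s\n' "$base_oid" "$1"
}

# One "title, type, OID" row per data template; titles are placeholders.
n=1
for entry in 'connections/10:GAUGE' 'thousands of queries:DERIVE' 'slow queries:DERIVE'; do
    title=${entry%%:*}
    ds_type=${entry##*:}
    printf '%-22s %-7s %s\n' "$title" "$ds_type" "$(exec_output_oid "$n")"
    n=$((n + 1))
done
```

The first value is the gauge-like one, the other two are the incremental counters, hence the GAUGE and DERIVE types.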
Using the three new datasource templates, create a chart template for
‘MySQL Activity’. That’s a bit more complicated, but it boils down to
the following procedure, repeated for each of the 3 data sources:
- add a data source and associate a graph [I always use AREA for the
first graph as a background and LINE3 for the others, but it’s just a
matter of taste]
- associate labels with current or computed values: CURRENT, AVERAGE,
MAX in this example
All the rest is really fine tuning: choosing better colors, whether
to use autoscale or fixed scale and so on. By now, your graph template
should be ready to use.
Note that for the incremental values [‘DERIVE’ type data sources] I’ve
used titles such as ‘Thousands queries/5 min’; the 5 minutes come from
the Cacti poller, which is set to query for data every 5 minutes. The end
result is something like this:
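The arithmetic behind an incremental counter can be sketched in a few lines: the growth between two consecutive poller readings, and the same growth expressed per second. The readings below are made up and the helper names are hypothetical; how the result is scaled on the final chart depends on your data source and graph settings, so take the numbers as an illustration only.

```shell
#!/bin/sh
# counter_delta PREVIOUS CURRENT: growth of an incremental counter
# between two consecutive polls (hypothetical helper).
counter_delta() {
    echo $(( $2 - $1 ))
}

# rate_per_second DELTA INTERVAL_SECONDS: the same growth per second.
rate_per_second() {
    LC_ALL=C awk -v d="$1" -v s="$2" 'BEGIN { printf("%.3f\n", d / s) }'
}

# Made-up readings of the "queries / 1000" value, 300 seconds apart.
previous=18551
current=18581
delta=$(counter_delta "$previous" "$current")
echo "thousands of queries in 5 min: $delta"
echo "thousands of queries per second: $(rate_per_second "$delta" 300)"
```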
On this real production chart you’ll see a few interesting patterns. For
instance, at 3 o’clock in the morning, there is a huge spike in all the
charted parameters – indeed, a cron’ed script was provoking this spike.
From time to time, a small burst of slow queries is recorded – still
under investigation. What is interesting here is that these spikes were
previously undetectable on the load average chart, which looks clean and innocuous:
To conclude, SNMP is a valuable resource for server performance
monitoring. Often, investigating specific parameters and displaying them
in tools such as Cacti can bring interesting insights into the behavior
of servers.
Some SNMP implementations in different programming languages:
- Java: Westhawk’s Java SNMP stack
[free, with commercial support], AdventNet SNMP
API [commercial, with a
feature-restricted non-expiring free version], iREASONING SNMP
API [commercial
implementation], SNMP4J [free and
feature-rich implementation - thank you Mathias for the tip]
- PHP: client-only supported by the php-snmp extension, part
of the PHP distribution
[free]
- Python: PySNMP is a Python SNMP
framework, client+agents [free].
- Ruby: client-only implementation Ruby
SNMP [free]
^1^ If you’re running Debian,
Cacti is available through apt, so it’s a breeze to install and run [apt-get
install cacti]
^2^ a bit out of the scope of
this article, SNMP also allows writing values on remote servers, not
only retrieving monitored values.
^3^ Replace [hostname] with
the server hostname and [community_name] with the SNMP community –
default being ‘public’. The SNMP community is a way of authenticating a
client to an SNMP server; although the system can be used for pretty
sophisticated stuff, most of the time the servers have a read-only
passwordless community, visible only in the internal network for
monitoring purposes.
^4^ In fact, a commercial
implementation of SNMP for MySQL does exist.
^5^ The procedure described
here applies to Cacti v0.8.6.c