Jonathan
Jonathan Author of Robopenguins

Trying to Justify SNMP

Simple Network Management Protocol (SNMP) is a protocol for collecting info from devices like resource usage or configuration. It’s come up a few times when I researched how to gather monitoring data especially from routers. It was always too much of a rabbit hole to bother with, but this time I finally spent the time to understand and use it. Was it worth it?

Why Understand SNMP?

This was brought on since I figured SNMP might be a good fit for gathering data for my LAN Pets project.

The first time I remember actually making any real use of it, was when I was working on using checkmk as a home network dashboard:

SNMP allows the checkmk server to automatically gather data from devices that support it without them needing to run the checkmk agent that usually gathers data from a monitored device. I was amazed that without any configuration I was able to see that my printer was low on ink, and how many pages it had printed in its lifetime.

In my current effort to better understand SNMP, I was amused to find a whole series of articles from checkmk from the more straight forward reference articles, to some blog articles mostly just griping about how bad it can be:

A quote from one of the reference articles was actually what spurred me to write this:

We have discussed SNMP before and how it is not the right choice in most use cases. Despite the issues, lack of performance improvements of the protocol, and its growing list of alternatives, SNMP is leaving us not just yet. There are various reasons for this that go beyond the scope of this article. Suffice to say, SNMP is well-established and present on many networks to this day. Support from vendors is not dropping anytime soon either, forcing administrators to face configuring SNMP sooner or later (or rather, willingly or not).

It’s interesting that this is a simple, widely adopted protocol, but you very rarely hear about it. Most articles about it are either very cursory, or assume you’re intimately familiar with its details. I wanted to make this write up as a jumping off point for people hearing about it and considering integrating it into a modern project.

What Makes SNMP “Hard”

SNMP is pretty simple at both a high and low level. Its complexity comes from the amount of context you need to hold in your head at once to get started using it. To illustrate this lets compare one of the more basic and advanced use cases.

Case 1: Monitor a Router Using a Commercial Monitoring Tool

Let’s say you’re using a monitoring tool that supports collecting SNMP data and your router supports running an SNMP server. You’ll need to:

  1. Enable the SNMP server on the router, possibly setting some credentials
  2. Point the monitoring tool to the routers IP along with any credentials

This should “just work” and enable the monitoring tool to get a wide range of information about the network and the routers status.

Really, the only thing you need to know for this use case, is that SNMP exists, and how to set it up on the monitor and the device.

Case 2: You Want to Query SNMP Data Yourself

Let’s say you’re making a tool that wants to gather data from SNMP devices.

Using snmpwalk

While other tools exist (mostly very 80’s style windows apps), the command line tool snmpwalk that is pretty much the de-facto way to query SNMP results.

Here’s an example usage:

1
2
3
4
5
6
7
8
9
10
# Install command for Ubuntu.
apt-get install snmp
# Get up time.
# -v sets the SNMP version to 1
# -c sets the SNMP community to public
# 192.168.1.1 is the IP of the device to query
# .1.3.6.1.2.1.1.3.0 is the OID tree to walk.
snmpwalk -v 1 -c public 192.168.1.1 .1.3.6.1.2.1.1.3.0 
# Outputs: iso.3.6.1.2.1.1.3.0 = Timeticks: (41170513) 4 days, 18:21:45.13
# This indicates that the value corresponding to .1.3.6.1.2.1.1.3.0  was a time tick value of 411705.13 seconds.

The main thing to understand here is the community, and the OID.

The community is used as a super insecure plaintext password. “public” is the default used by many devices.

The OID is a hierarchal address of for referencing data. In the case of .1.3.6.1.2.1.1.3.0, the first 7 digits (.1.3.6.1.2.1.1) identifies a set of system management data this device reports. The next two digit (3.0) specifies the specific uptime value in that set. See https://checkmk.com/blog/snmp-a-necessary-evil for the full explanation of the OID hierarchy.

One feature of the hierarchical OID specification, is that it can be “walked”. The protocol has a GetNextRequest command that allows the client to discover what OIDs are supported by the device. snmpwalk, will use this to discover all of the values “under” the value you pass in the CLI:

1
2
3
4
5
6
7
8
9
10
11
12
13
# Query values in the "SNMP MIB-2 System"
snmpwalk -v 1 -c public 192.168.1.1 .1.3.6.1.2.1.1
# .1.3.6.1.2.1.1.1.0 = STRING: Omada Gigabit Multi-WAN VPN Router
# .1.3.6.1.2.1.1.2.0 = OID: .1.3.6.1.4.1.8072.3.2.10
# .1.3.6.1.2.1.1.3.0 = Timeticks: (41641547) 4 days, 19:40:15.47
# .1.3.6.1.2.1.1.4.0 = STRING: www.tp-link.com
# .1.3.6.1.2.1.1.5.0 = STRING: ER605
# .1.3.6.1.2.1.1.6.0 = STRING: TP-Link
# .1.3.6.1.2.1.1.8.0 = Timeticks: (3) 0:00:00.03
# .1.3.6.1.2.1.1.9.1.2.1 = OID: .1.3.6.1.2.1.4
# .1.3.6.1.2.1.1.9.1.2.2 = OID: .1.3.6.1.6.3.1
# .1.3.6.1.2.1.1.9.1.2.3 = OID: .1.3.6.1.2.1.49
# ...

This is used both as a discovery mechanism, and to iterate over arrays and tables of data that don’t have predefined sizes.

Adding MIB Files

The main complexities of querying with SNMP is that the device and the monitor need to have a common understanding of what OIDs are supported and what the values actually mean. The mapping of the OIDs to their interpretation are called management information bases (MIBs). There is an human format that encodes these descriptions form that can be passed into tools like snmpwalk.

If the device supports it, there is the sysORTable (.1.3.6.1.2.1.1.9.1) which lists the MIBs supported by the device.

You can download these txt files and directly pass them to snmpwalk, or you can download/install the set of common MIBs. This lets the tools resolve the OIDs to text names.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
# Install standard MIBs
sudo apt-get install snmp-mibs-downloader
# Also needed to edit /etc/snmp/snmp.conf as described in:
# https://unix.stackexchange.com/questions/590636/what-is-oid-mib-check-transfer-amount-by-check-snmp#590778
# Request the sysName from the SNMPv2-MIB MIB.
snmpwalk -v 1 -c public 192.168.1.1 SNMPv2-MIB::sysName
# Outputs: SNMPv2-MIB::sysName.0 = STRING: ER605
# With the MIB loaded, you can properly load tables of data defined in the MIB:
snmptable -v1 -c public 192.168.1.1 SNMPv2-MIB::sysORTable
#                                        sysORID                                                                     sysORDescr  sysORUpTime
#                                RFC1213-MIB::ip                        The MIB module for managing IP and ICMP implementations 0:0:00:00.01
#                            SNMPv2-MIB::snmpMIB                                             The MIB module for SNMPv2 entities 0:0:00:00.01
#                                TCP-MIB::tcpMIB                                The MIB module for managing TCP implementations 0:0:00:00.01
#                                UDP-MIB::udpMIB                                The MIB module for managing UDP implementations 0:0:00:00.01
#        SNMP-VIEW-BASED-ACM-MIB::vacmBasicGroup                                      View-based Access Control Model for SNMP. 0:0:00:00.01
# SNMP-FRAMEWORK-MIB::snmpFrameworkMIBCompliance                                          The SNMP Management Architecture MIB. 0:0:00:00.01
#                SNMP-MPD-MIB::snmpMPDCompliance                                The MIB for Message Processing and Dispatching. 0:0:00:00.01
#       SNMP-USER-BASED-SM-MIB::usmMIBCompliance The management information definitions for the SNMP User-based Security Model. 0:0:00:00.01
#                          TUNNEL-MIB::tunnelMIB                    RFC 2667 TUNNEL-MIB implementation for Linux 2.2.x kernels. 0:0:00:00.03

The snmptable example above shows how the MIBs also specify how data is interpreted. Here it establishes the definition of the OIDs, and how some of them are indexes that are used to define a table of values.

The last thing to note is that while many metrics are in the “standard” MIBs, there are also many proprietary MIBs used by different manufacturers. You may be able to download these MIBs from the manufacturer, but otherwise you’ll have no way to know what this data describes.

An example is the https://github.com/bieniu/brother python project that maps the Brother printer MIB to a python script https://github.com/bieniu/brother/blob/master/brother/const.py.

Programmatically Gathering Data

Using snmpwalk is probably the most straightforward way to get data over SNMP. Even if it means calling out to a subprocess. snmpwalk is handling the transport, protocol and deserialization that you would otherwise need to implement. Even fairly high level libraries can only abstract this so much and have a steep learning curve. Here’s my attempt to make the simplest possible Python script for querying processor utilization using PySNMP:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
import socket
import sys
import time
from typing import Any

# ASN.1 is the serialization scheme SNMP uses
from pyasn1.codec.ber import encoder, decoder
# This defines the structure of the request and response messages.
# PySNMP also can handle the transport, but it is asyncio based and significantly complicates things for a small example.
from pysnmp.proto import api

def send_requests(host: str, community: str, oids: list[str]) -> dict[str, Any]:
    results = {oid: None for oid in oids}
    # Protocol version to use
    pMod = api.PROTOCOL_MODULES[api.SNMP_VERSION_1]
    # Build PDU for requesting OIDs
    reqPDU = pMod.GetRequestPDU()
    pMod.apiPDU.set_defaults(reqPDU)
    pMod.apiPDU.set_varbinds(
        # The request message has an empty (Null) body
        reqPDU, tuple((oid, pMod.Null('')) for oid in oids)
    )

    # Build message
    reqMsg = pMod.Message()
    pMod.apiMessage.set_defaults(reqMsg)
    pMod.apiMessage.set_community(reqMsg, community)
    pMod.apiMessage.set_pdu(reqMsg, reqPDU)

    # Handle the networking directly instead of using PySNMP
    # Create a UDP socket
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Bind the socket to the client address
    sock.bind(("0.0.0.0", 1161))

    try:
        # Send the message
        sock.sendto(encoder.encode(reqMsg), (host, 161))
        # Wait for a response (with a timeout)
        sock.settimeout(1)  # Set a 1-second timeout
        data, addr = sock.recvfrom(1024) 

        # Decode the message and extract the response PDU
        rspMsg, wholeMsg = decoder.decode(data, asn1Spec=pMod.Message())
        rspPDU = pMod.apiMessage.get_pdu(rspMsg)

        # Check for SNMP errors reported
        errorStatus = pMod.apiPDU.get_error_status(rspPDU)
        if errorStatus:
            print(errorStatus.prettyPrint())
        else:
            # Extract the response values
            for oid, val in pMod.apiPDU.get_varbinds(rspPDU):
                results[str(oid)] = val
    except socket.timeout:
        print("No response received")
    finally:
        sock.close()

    return results


def get_load_averages(host: str, community: str) -> tuple[float, float, float]:
    indexes = range(1, 4)
    # The 1,5 and 15 minute load averages DisplayString (UCD-SNMP-MIB::laLoad).
    base_oid = '1.3.6.1.4.1.2021.10.1.3.'
    # Queries 1.3.6.1.4.1.2021.10.1.3.1, 1.3.6.1.4.1.2021.10.1.3.2, 1.3.6.1.4.1.2021.10.1.3.3
    response = send_requests(sys.argv[1], sys.argv[2], [base_oid + str(i) for i in indexes])
    
    results: list[float] = []
    for i in indexes:
        oid = base_oid + str(i)
        value = response.get(oid)
        if value is None:
            results.append(float('NaN'))
        else:
            results.append(float(value))
    return tuple(results) # type: ignore


if __name__ == '__main__':
    while True:
        one_min_load, five_min_load, fifteen_min_load = get_load_averages(sys.argv[1], sys.argv[2])
        print(f' 1: {one_min_load}')
        print(f' 5: {five_min_load}')
        print(f'15: {fifteen_min_load}\n')
        time.sleep(10)

This shows that you need to have at least a basic understanding of the SNMP PDU structure: http://www.tcpipguide.com/free/t_SNMPVersion1SNMPv1MessageFormat.htm

As well as at least some concept of how the data is being packed as per the ASN.1 standard: https://www.ranecommercial.com/legacy/note161.html

Case 2: You Want to Add SNMP to Your Own Device

Let’s say you’re making a new device and you want to leverage SNMP to get data off of it.

You’ll pretty much need to already understand all the background discussed for the previous case to even know where to start.

Since you’re not a huge device manufacturer, you’d need to implement the metrics for an existing MIB for this to make any sense, but if you’re already in an environment that uses SNMP, it could make sense to support the MIB that’s already being monitored.

Linux Setup

Enabling SNMP on a Linux device isn’t terribly complicated. There’s a Docker container https://hub.docker.com/r/polinux/snmpd or you can install snmpd with a package manager. https://checkmk.com/blog/how-configure-snmp-linux goes into more details on this setup process.

The two main gotchas are:

  1. The default port 161 is usually restricted, so the server either needs to run as root or use iptables to map the server’s actual port to 161.
  2. Since the security scheme is so bad, snmpd will not return any information by default. You’ll need to customize the snmpd.conf file to set the authentication, and the information that’s exposed.

Custom Setup

For a custom server, PySNMP could be used. It has a few examples that are reasonably high level like https://github.com/etingof/pysnmp/blob/master/examples/v1arch/asyncore/agent/cmdrsp/implementing-scalar-mib-objects-over-ipv4-and-ipv6.py.

For an embedded application, https://github.com/patricklaf/SNMP actually seems like a pretty straightforward Arduino library.

Penetration Testing

As mentioned, SNMP is by default not a very secure protocol. The tool nmap even has a whole series of scripts that probe snmp servers:

Used like: sudo nmap -sU -p 161 --script=snmp-info 192.168.1.1

This is mostly to not totally wave away the security concerns of having SNMP servers running. I would certainly never expose an SNMP server to the internet.

Conclusion

So was it worth spending the time to research this?

One thing that may not come across from exploring these use cases, is that this is not really any more complicated then the alternative ways to collect data. Two obvious alternatives would be implementing an HTTP API (e.x. a Prometheus or OpenTelemetry client) or use MQTT for a more embedded friendly approach.

These approaches would likely be much more complicated if you were approaching it at the same level as these SNMP examples. The thing that they has going for them is that since they’re is built off of immensely more popular technologies. This lets the tooling and documentation reduce the mental burden for getting started. There are 100’s of relatively well supported tools instead of a 1 or 2. This means you’re much more likely to find a library or tool that provides the right level of abstraction and defaults. This lets you get things working without needing to understand a mountain of details for basic tasks.

That lack of support makes it hard to generally recommend SNMP. For me that takes precedence over the technical issues with SNMP with its efficiency, security, and implementation inconsistencies which are covered in https://checkmk.com/blog/snmp-a-necessary-evil.

The only situation where I think it’s worth digging into SNMP is where you have devices like (like routers or printers) you want to collect data from and SNMP is available. Using a SNMP standard MIB beats out scraping a web UI or a proprietary interface.