Detail python concurrent access to snmp information and performance tests

  • 2020-05-27 06:10:02
  • OfStack

python & snmp

Getting snmp information with python there are several libraries available, the more common of which is netsnmp and pysnmp Two libraries. There are many examples of two libraries on the web.

This article focuses on how to obtain snmp data concurrently, that is, obtain snmp information of multiple machines at the same time.

netsnmp

Said netsnmp first. The netsnmp of python is actually from the net-snmp package.

python gets the data by calling net-snmp's interface through an c file.

Therefore, when multiple machines are acquired concurrently, coroutine retrieval cannot be used. Because of the use of coroutines, when get data is used, the coroutine will wait for the net-snmp interface to return data, instead of switching CPU to other coroutines while waiting for data, as socket used to do. At this point, there is no difference between using coroutine and serial fetching.

So how do you solve the problem of concurrent fetching? You can use threads, multithreaded fetching (and of course you can use multiple processes). Multiple threads call net-snmp's interface to get data at the same time, and cpu switches back and forth between multiple threads. After 1 thread gets 1 result, it can continue to call the interface to get the next 1 snmp data.

Here I wrote a sample program. First, all the host and oid are put into the queue, and then multiple threads are started to perform the fetch task. For example:


import threading
import time
import netsnmp
import Queue

start_time = time.time()
hosts = ["192.20.150.109", "192.20.150.110", "192.20.150.111", "192.20.150.112", "192.20.150.113", "192.20.150.114",
     "192.20.150.115", "192.20.150.116", "192.20.150.117", "192.20.150.118", "192.20.150.119", "192.20.150.120",
     "192.20.150.121", "192.20.80.148", "192.20.80.149", "192.20.96.59", "192.20.82.14", "192.20.82.15",
     "192.20.82.17", "192.20.82.19", "192.20.82.12", "192.20.80.139", "192.20.80.137", "192.20.80.136",
     "192.20.80.134", "192.20.80.133", "192.20.80.131", "192.20.80.130", "192.20.81.141", "192.20.81.140",
     "192.20.82.26", "192.20.82.28", "192.20.82.23", "192.20.82.21", "192.20.80.128", "192.20.80.127",
     "192.20.80.122", "192.20.81.159", "192.20.80.121", "192.20.80.124", "192.20.81.151", "192.20.80.118",
     "192.20.80.119", "192.20.80.113", "192.20.80.112", "192.20.80.116", "192.20.80.115", "192.20.78.62",
     "192.20.81.124", "192.20.81.125", "192.20.81.122", "192.20.81.121", "192.20.82.33", "192.20.82.31",
     "192.20.82.32", "192.20.82.30", "192.20.81.128", "192.20.82.39", "192.20.82.37", "192.20.82.35",
     "192.20.81.130", "192.20.80.200", "192.20.81.136", "192.20.81.137", "192.20.81.131", "192.20.81.133",
     "192.20.81.134", "192.20.82.43", "192.20.82.45", "192.20.82.41", "192.20.79.152", "192.20.79.155",
     "192.20.79.154", "192.25.76.235", "192.25.76.234", "192.25.76.233", "192.25.76.232", "192.25.76.231",
     "192.25.76.228", "192.25.20.96", "192.25.20.95", "192.25.20.94", "192.25.20.93", "192.24.163.14",
     "192.24.163.21", "192.24.163.29", "192.24.163.6", "192.18.136.22", "192.18.136.23", "192.24.193.2",
     "192.24.193.19", "192.24.193.18", "192.24.193.11", "192.20.157.132", "192.20.157.133", "192.24.212.232",
     "192.24.212.231", "192.24.212.230"]
oids = [".1.3.6.1.4.1.2021.11.9.0",".1.3.6.1.4.1.2021.11.10.0",".1.3.6.1.4.1.2021.11.11.0",".1.3.6.1.4.1.2021.10.1.3.1",
    ".1.3.6.1.4.1.2021.10.1.3.2",".1.3.6.1.4.1.2021.10.1.3.3",".1.3.6.1.4.1.2021.4.6.0",".1.3.6.1.4.1.2021.4.14.0",
    ".1.3.6.1.4.1.2021.4.15.0"]
myq = Queue.Queue()
rq = Queue.Queue()

# the host and oid Of the task 
for host in hosts:
  for oid in oids:
    myq.put((host,oid))

def poll_one_host():
  while True:
    try:
      # The loop fetches tasks from the queue until the queue task is empty 
      host, oid = myq.get(block=False)
      session = netsnmp.Session(Version=2, DestHost=host, Community="cluster",Timeout=3000000,Retries=0)
      var_list = netsnmp.VarList()
      var_list.append(netsnmp.Varbind(oid))
      ret = session.get(var_list)
      rq.put((host, oid, ret, (time.time() - start_time)))
    except Queue.Empty:
      break

thread_arr = []

# Enable multithreading 
num_thread = 50
for i in range(num_thread):
  t = threading.Thread(target=poll_one_host, kwargs={})
  t.setDaemon(True)
  t.start()
  thread_arr.append(t)

# Wait for the mission to be completed 
for i in range(num_thread):
  thread_arr[i].join()

while True:
  try:
    info = rq.get(block=False)
    print info
  except Queue.Empty:
    print time.time() - start_time
    break

In addition to get operations, netsnmp also supports walk operations, that is, traversing an oid.

However, care should be taken when using walk to avoid problems such as high latency. For details, please refer to the previous blog post on the analysis of snmpwalk's high latency problems.

pysnmp

pysnmp is a library of snmp protocol implemented with python. It itself provides support for asynchrony.


import time
import Queue
from pysnmp.hlapi.asyncore import *
t = time.time()
myq = Queue.Queue()

# The callback function. Triggered when data is returned 
def cbFun(snmpEngine, sendRequestHandle, errorIndication, errorStatus, errorIndex, varBinds, cbCtx):
   myq.put((time.time()-t, varBinds))
hosts = ["192.20.150.109", "192.20.150.110", "192.20.150.111", "192.20.150.112", "192.20.150.113", "192.20.150.114",
     "192.20.150.115", "192.20.150.116", "192.20.150.117", "192.20.150.118", "192.20.150.119", "192.20.150.120",
     "192.20.150.121", "192.20.80.148", "192.20.80.149", "192.20.96.59", "192.20.82.14", "192.20.82.15",
     "192.20.82.17", "192.20.82.19", "192.20.82.12", "192.20.80.139", "192.20.80.137", "192.20.80.136",
     "192.20.80.134", "192.20.80.133", "192.20.80.131", "192.20.80.130", "192.20.81.141", "192.20.81.140",
     "192.20.82.26", "192.20.82.28", "192.20.82.23", "192.20.82.21", "192.20.80.128", "192.20.80.127",
     "192.20.80.122", "192.20.81.159", "192.20.80.121", "192.20.80.124", "192.20.81.151", "192.20.80.118",
     "192.20.80.119", "192.20.80.113", "192.20.80.112", "192.20.80.116", "192.20.80.115", "192.20.78.62",
     "192.20.81.124", "192.20.81.125", "192.20.81.122", "192.20.81.121", "192.20.82.33", "192.20.82.31",
     "192.20.82.32", "192.20.82.30", "192.20.81.128", "192.20.82.39", "192.20.82.37", "192.20.82.35",
     "192.20.81.130", "192.20.80.200", "192.20.81.136", "192.20.81.137", "192.20.81.131", "192.20.81.133",
     "192.20.81.134", "192.20.82.43", "192.20.82.45", "192.20.82.41", "192.20.79.152", "192.20.79.155",
     "192.20.79.154", "192.25.76.235", "192.25.76.234", "192.25.76.233", "192.25.76.232", "192.25.76.231",
     "192.25.76.228", "192.25.20.96", "192.25.20.95", "192.25.20.94", "192.25.20.93", "192.24.163.14",
     "192.24.163.21", "192.24.163.29", "192.24.163.6", "192.18.136.22", "192.18.136.23", "192.24.193.2",
     "192.24.193.19", "192.24.193.18", "192.24.193.11", "192.20.157.132", "192.20.157.133", "192.24.212.232",
     "192.24.212.231", "192.24.212.230"]

oids = [".1.3.6.1.4.1.2021.11.9.0",".1.3.6.1.4.1.2021.11.10.0",".1.3.6.1.4.1.2021.11.11.0",".1.3.6.1.4.1.2021.10.1.3.1",
    ".1.3.6.1.4.1.2021.10.1.3.2",".1.3.6.1.4.1.2021.10.1.3.3",".1.3.6.1.4.1.2021.4.6.0",".1.3.6.1.4.1.2021.4.14.0",
    ".1.3.6.1.4.1.2021.4.15.0"]
    
snmpEngine = SnmpEngine()

# Add tasks 
for oid in oids:
  for h in hosts:
    getCmd(snmpEngine,
      CommunityData('cluster'),
      UdpTransportTarget((h, 161), timeout=3, retries=0,),
      ContextData(),
      ObjectType(ObjectIdentity(oid)),
      cbFun=cbFun)
time1 = time.time() - t

# Performing asynchronous fetch snmp
snmpEngine.transportDispatcher.runDispatcher()

# Print the result 
while True:
  try:
    info = myq.get(block=False)
    print info
  except Queue.Empty:
    print time1
    print time.time() - t
    break

pysnmp itself only supports the most basic get and getnext commands, so if you want to use walk, you need to implement it yourself.

The performance test

Both were tested for performance in the same environment. They collected 198 host and 10 oid.

测试组 耗时(sec)
netsnmp(20线程) 6.252
netsnmp(50线程) 3.269
netsnmp(200线程) 3.265
pysnmp 4.812

You can see that the acquisition speed of netsnmp is related to the number of threads. When the number of threads increases to 1, the acquisition time is no longer shortened. Because threading also takes time. The existing thread is sufficient for processing.

The performance of pysnmp is slightly less than 1. A detailed analysis of pysnmp consumed about 1.2s in the addition task (when getCmd was executed) and about 3.3 seconds in the subsequent acquisition.

After increasing the number of oid, the experiment was carried out. host is still 198, oid is 42.

测试组 耗时(sec)
netsnmp(20线程) 30.935
netsnmp(50线程) 12.914
netsnmp(200线程) 4.044
pysnmp 11.043

You can see the gap being widened by one step. With enough threads, netsnmp is significantly more efficient than pysnmp.

Since both of them support the parallel acquisition of multiple host, netsnmp is simpler in terms of ease of use, and netsnmp supports walk. This article recommends netsnmp even more.

Installing netsnmp requires installing net-snmp. If centos is used, then yum is more convenient.


Related articles: