Tutorial: Using Python as a tool to inspect Linux system information

2020-04-02 14:46:59
OfStack

In this article, we will explore how to use Python as a tool to inspect various kinds of runtime information about a Linux system. Let's get started.

Which Python?

When I refer to Python, I generally mean CPython 2 (http://python.org/), 2.7 to be exact. When the same code does not run in CPython 3 (3.3), we point it out explicitly and give alternative code, explaining the differences between them. Make sure you have CPython installed: if you type python or python3 into a terminal, you should see the Python prompt appear.

Note that all the scripts begin with #!/usr/bin/env python as the first line, which means that we want the Python interpreter to run them. So, if you use the chmod +x your-script.py command to make your script executable, you can run it directly with the ./your-script.py command (you'll see this later in the article).

Exploring the platform module

The platform module in the standard library has a number of functions that let us examine various kinds of system information. Let's open the Python interpreter and explore some of them, starting with the platform.uname() function:


>>> import platform
>>> platform.uname()
('Linux', 'fedora.echorand', '3.7.4-204.fc18.x86_64', '#1 SMP Wed Jan 23 16:44:29 UTC 2013', 'x86_64')

If you know the uname command on Linux, you will recognize this function as an interface to the same information. In Python 2, this function returns a tuple of system type (or kernel type), hostname, release, version, machine hardware architecture, and processor type. You can use an index to get a single attribute, like this:


>>> platform.uname()[0]
'Linux'

In Python 3, this function returns a named tuple:


>>> platform.uname()
 
uname_result(system='Linux', node='fedora.echorand',
release='3.7.4-204.fc18.x86_64', version='#1 SMP Wed Jan 23 16:44:29
UTC 2013', machine='x86_64', processor='x86_64')

Since the return value is a named tuple, we can easily get a single property by its attribute name without having to remember the index of each property, like this:


>>> platform.uname().system
'Linux'

The platform module also provides some direct interfaces for getting the above property values, like these:


>>> platform.system()
'Linux'
 
>>> platform.release()
'3.7.4-204.fc18.x86_64'
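
The calls above can also be combined in a small helper. The following is only a minimal sketch: the function name platform_summary is made up for illustration, and it uses index access so that it behaves the same with the Python 2 tuple and the Python 3 named tuple.

```python
from __future__ import print_function
import platform


def platform_summary():
    """Collect a few platform properties into a plain dictionary."""
    info = platform.uname()
    # Index access works for both the Python 2 tuple and the
    # Python 3 named tuple.
    return {
        'system': info[0],
        'node': info[1],
        'release': info[2],
        'version': info[3],
        'machine': info[4],
    }


if __name__ == '__main__':
    for key, value in sorted(platform_summary().items()):
        print('{0}: {1}'.format(key, value))
```

Returning a dictionary rather than printing directly makes it easier to reuse the values in other scripts.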

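The function linux_distribution() returns the details of the Linux distribution you are using. Note that this function was deprecated in Python 3.5 and removed in Python 3.8, so it is only available on older interpreters. For example, on Fedora 18, it returns the following information: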


>>> platform.linux_distribution()
('Fedora', '18', 'Spherical Cow')

The return value is a tuple consisting of the distribution name, version number, and code name. You can see which distributions your version of Python supports by printing the _supported_dists attribute:


>>> platform._supported_dists
('SuSE', 'debian', 'fedora', 'redhat', 'centos', 'mandrake',
'mandriva', 'rocks', 'slackware', 'yellowdog', 'gentoo',
'UnitedLinux', 'turbolinux')

If your Linux distribution isn't one of those (or a derivative of one of those), you won't see any useful information when you call the above functions.
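
On Python versions where linux_distribution() is not available, one possible alternative is to read the /etc/os-release file that most current distributions ship. The sketch below is an assumption-laden illustration rather than a complete parser: the function name parse_os_release and the sample text are made up, and only simple KEY=value lines with optional surrounding quotes are handled.

```python
from __future__ import print_function


def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dictionary.

    Simplified: strips one pair of surrounding quotes and ignores
    shell escape sequences.
    """
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#') or '=' not in line:
            continue
        key, _, value = line.partition('=')
        value = value.strip()
        if len(value) >= 2 and value[0] == value[-1] and value[0] in '"\'':
            value = value[1:-1]
        result[key.strip()] = value
    return result


# Made-up sample content for illustration only
SAMPLE = 'NAME=Fedora\nVERSION="18 (Spherical Cow)"\nID=fedora\nVERSION_ID=18\n'

if __name__ == '__main__':
    info = parse_os_release(SAMPLE)
    print(info['NAME'], info['VERSION_ID'])
```

To use it on a real system, you could pass the result of open('/etc/os-release').read() instead of SAMPLE, assuming the file exists.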

The last platform function we'll explore is the architecture() function. When you call this function without any arguments, it returns a tuple consisting of the bit architecture and the executable format of the Python interpreter. For example:


>>> platform.architecture()
('64bit', 'ELF')

On a 32-bit Linux system, you'll see:


>>> platform.architecture()
('32bit', 'ELF')

If you specify any other system executable as a parameter, you will get a similar result:


>>> platform.architecture(executable='/usr/bin/ls')
('64bit', 'ELF')

We encourage you to explore the other functions in the platform module, such as those that let you find the version of Python you are currently running. If you're curious about how this module gets its information, you can look at the Lib/platform.py file in the Python source directory.

The os and sys modules are also useful for obtaining general properties such as the native byte order. Next, we will explore some common ways to get Linux system information that go beyond these standard library modules, this time by reading the proc and sys file systems. Note that the information exposed through these file systems varies from one hardware architecture to another, so keep this in mind as you read this article and write scripts that get system information from these files.
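
As a brief illustration of the os and sys modules mentioned above, here is a minimal sketch. The function name general_properties is hypothetical, and os.uname() is guarded because it only exists on Unix-like systems.

```python
from __future__ import print_function
import os
import sys


def general_properties():
    """Return a few general properties exposed by os and sys."""
    return {
        # 'little' or 'big', depending on the native byte order
        'byteorder': sys.byteorder,
        # Platform identifier; starts with 'linux' on Linux
        'platform': sys.platform,
        # Kernel release, if os.uname() is available on this system
        'kernel_release': os.uname()[2] if hasattr(os, 'uname') else None,
    }


if __name__ == '__main__':
    for key, value in sorted(general_properties().items()):
        print('{0}: {1}'.format(key, value))
```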

CPU information

The /proc/cpuinfo file contains information about the processing units in your system. For example, here is a Python script that does the same thing as typing cat /proc/cpuinfo on the command line:


#! /usr/bin/env python
""" print out the /proc/cpuinfo
  file
"""
 
from __future__ import print_function
 
with open('/proc/cpuinfo') as f:
  for line in f:
    print(line.rstrip('\n'))

When you run this script in Python 2 or Python 3, you will see everything in the /proc/cpuinfo file displayed on your screen. In the above script, the rstrip() method removes the newline character from each line.

The next code listing uses the string method startswith() to display the model of each processing unit on your computer:


#! /usr/bin/env python
 
""" Print the model of your
  processing units
 
"""
 
from __future__ import print_function
 
with open('/proc/cpuinfo') as f:
  for line in f:
    # Ignore the blank line separating the information between
    # details about two processing units
    if line.strip():
      if line.rstrip('\n').startswith('model name'):
        model_name = line.rstrip('\n').split(':')[1].strip()
        print(model_name)

When you run this script, you will see the models of all the processing units on your machine. For example, here's what I saw on my computer:


Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz
Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz
Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz
Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz

So far we've seen several ways to determine the architecture of our computer system. Technically, all of these methods report the architecture of the kernel you are running. So, if your computer is actually a 64-bit machine but is running a 32-bit kernel, the methods above will report it as 32-bit. To find the true architecture of your computer, you can look for the lm flag in the list of flags in /proc/cpuinfo. The lm flag stands for long mode and is present only on 64-bit computers. The following script shows you how:


#! /usr/bin/env python
 
""" Find the real bit architecture
"""
 
from __future__ import print_function
 
with open('/proc/cpuinfo') as f:
  for line in f:
    # Ignore the blank line separating the information between
    # details about two processing units
    if line.strip():
      if (line.rstrip('\n').startswith('flags')
          or line.rstrip('\n').startswith('Features')):
        if 'lm' in line.rstrip('\n').split():
          print('64-bit')
        else:
          print('32-bit')
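
The same check can be written as a function that takes the file contents as a string, which makes it easy to try out without reading /proc. This is only a sketch: the function name cpu_bitness and the abbreviated sample lines are made up for illustration, and unlike the script above it returns a single answer instead of printing one line per processing unit.

```python
from __future__ import print_function


def cpu_bitness(cpuinfo_text):
    """Return '64-bit' if any flags/Features line lists 'lm', else '32-bit'."""
    for line in cpuinfo_text.splitlines():
        if line.startswith('flags') or line.startswith('Features'):
            # Everything after the first colon is the space-separated flag list
            flags = line.partition(':')[2].split()
            if 'lm' in flags:
                return '64-bit'
    return '32-bit'


# Abbreviated, made-up sample lines for illustration
SAMPLE_64 = 'processor\t: 0\nflags\t\t: fpu vme de pse lm sse2\n'
SAMPLE_32 = 'processor\t: 0\nflags\t\t: fpu vme de pse sse2\n'

if __name__ == '__main__':
    print(cpu_bitness(SAMPLE_64))
    print(cpu_bitness(SAMPLE_32))
```

Splitting the flag list into words before testing for 'lm' avoids false matches on flags that merely contain those two letters.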

As we've seen so far, we can read the /proc/cpuinfo file and use simple text-processing techniques to extract the information we're looking for. To make the data available to other programs in a friendly way, it is probably best to convert the content of /proc/cpuinfo to a standard data structure, such as a dictionary. The approach is simple: if you look at the file, you will see that each processing unit is described by a set of key-value pairs (in the previous example, when we printed the processor model name, model name was a key). The information for each processing unit is separated from the next by a blank line. This makes it easy to build a dictionary with one key per processing unit. Each of these keys has a value, and that value holds all the information in the /proc/cpuinfo file for the corresponding processing unit. The next code listing shows you how to do this:


#!/usr/bin/env python
 
"""
/proc/cpuinfo as a Python dict
"""
from __future__ import print_function
from collections import OrderedDict
import pprint
 
def cpuinfo():
  ''' Return the information in /proc/cpuinfo
  as a dictionary in the following format:
  cpu_info['proc0']={...}
  cpu_info['proc1']={...}
 
  '''
 
  cpuinfo=OrderedDict()
  procinfo=OrderedDict()
 
  nprocs = 0
  with open('/proc/cpuinfo') as f:
    for line in f:
      if not line.strip():
        # end of one processor
        cpuinfo['proc%s' % nprocs] = procinfo
        nprocs=nprocs+1
        # Reset
        procinfo=OrderedDict()
      else:
        if len(line.split(':')) == 2:
          procinfo[line.split(':')[0].strip()] = line.split(':')[1].strip()
        else:
          procinfo[line.split(':')[0].strip()] = ''
 
  return cpuinfo
 
if __name__=='__main__':
  cpuinfo = cpuinfo()
  for processor in cpuinfo.keys():
    print(cpuinfo[processor]['model name'])

This code uses an OrderedDict (ordered dictionary) instead of the usual dictionary type so that the key-value pairs are stored in the order in which they are found in the file. Therefore, the information for the first processing unit comes first, followed by the second, and so on. When you call this function, it returns a dictionary in which each key is a processing unit. You can then use a specific key to pick out the information you are looking for (as shown in the if __name__=='__main__' block). When the above script runs, the model name of each processing unit is printed again (by the print(cpuinfo[processor]['model name']) statement):


Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz
Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz
Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz
Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz
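
One possible variation on this idea is sketched below. It takes the text as an argument so that it can be tried without /proc, and it uses str.partition() so that values which themselves contain a colon are kept intact. The function name parse_cpuinfo and the sample text are assumptions made for illustration.

```python
from __future__ import print_function
from collections import OrderedDict


def parse_cpuinfo(text):
    """Parse /proc/cpuinfo-style text into {'proc0': {...}, 'proc1': {...}}."""
    cpus = OrderedDict()
    current = OrderedDict()
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends one processor's block
            if current:
                cpus['proc%d' % len(cpus)] = current
                current = OrderedDict()
            continue
        # Split on the first colon only; the rest is the value
        key, _, value = line.partition(':')
        current[key.strip()] = value.strip()
    if current:
        # Handle input that does not end with a blank line
        cpus['proc%d' % len(cpus)] = current
    return cpus


# Made-up sample content for illustration only
SAMPLE = (
    'processor\t: 0\n'
    'model name\t: Example CPU @ 2.90GHz\n'
    '\n'
    'processor\t: 1\n'
    'model name\t: Example CPU @ 2.90GHz\n'
)

if __name__ == '__main__':
    for name, fields in parse_cpuinfo(SAMPLE).items():
        print(name, fields['model name'])
```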

Memory information

Similar to /proc/cpuinfo, /proc/meminfo contains the main memory information of your computer. The next script generates a dictionary containing the contents of the file and outputs it.


#!/usr/bin/env python
 
from __future__ import print_function
from collections import OrderedDict
 
def meminfo():
  ''' Return the information in /proc/meminfo
  as a dictionary '''
  meminfo=OrderedDict()
 
  with open('/proc/meminfo') as f:
    for line in f:
      meminfo[line.split(':')[0]] = line.split(':')[1].strip()
  return meminfo
 
if __name__=='__main__':
  #print(meminfo())
 
  meminfo = meminfo()
  print('Total memory: {0}'.format(meminfo['MemTotal']))
  print('Free memory: {0}'.format(meminfo['MemFree']))

As seen above, you can also use a specific key to get just the information you want (as shown in the if __name__=='__main__' block). When you run this script, you will see output like the following:


Total memory: 7897012 kB
Free memory: 249508 kB
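
The values in the dictionary above are strings such as '7897012 kB'. If you want to do arithmetic on them, you could convert them to integers first. The following is a minimal sketch with a hypothetical function name, parse_meminfo_kb; the "used" figure it prints is deliberately simplistic, since it ignores buffers and cache.

```python
from __future__ import print_function


def parse_meminfo_kb(text):
    """Return {field: integer value} for /proc/meminfo-style text.

    Most fields carry a 'kB' suffix; fields without a unit
    (such as HugePages_Total) are returned as plain counts.
    """
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(':')
        parts = rest.split()
        if sep and parts:
            values[key.strip()] = int(parts[0])
    return values


# Sample lines in the format shown in the output above
SAMPLE = 'MemTotal:        7897012 kB\nMemFree:          249508 kB\n'

if __name__ == '__main__':
    mem = parse_meminfo_kb(SAMPLE)
    used_fraction = 1.0 - float(mem['MemFree']) / mem['MemTotal']
    print('Used: {0:.1f}%'.format(used_fraction * 100))
```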

Network statistics

Next, we'll explore the network devices of our computer system. We will retrieve the system's network interfaces and the number of bytes each has sent and received since the system booted. This information is available in the /proc/net/dev file. If you examine the contents of this file, you will see that the first two lines contain header information, i.e., the first column in the file is the name of the network interface, and the second and third columns show information about bytes received and transmitted (e.g., total bytes, number of packets, error statistics, and so on). We are interested in the total data sent and received by the different network devices. The next code listing shows how we can extract this information from /proc/net/dev:


#!/usr/bin/env python
from __future__ import print_function
from collections import namedtuple
 
def netdevs():
  ''' RX and TX bytes for each of the network devices '''
 
  with open('/proc/net/dev') as f:
    net_dump = f.readlines()
 
  device_data={}
  data = namedtuple('data',['rx','tx'])
  for line in net_dump[2:]:
    line = line.split(':')
    if line[0].strip() != 'lo':
      device_data[line[0].strip()] = data(float(line[1].split()[0])/(1024.0*1024.0),
                        float(line[1].split()[8])/(1024.0*1024.0))
 
  return device_data
 
if __name__=='__main__':
 
  netdevs = netdevs()
  for dev in netdevs.keys():
    print('{0}: {1} MiB {2} MiB'.format(dev, netdevs[dev].rx, netdevs[dev].tx))

When you run the script above, the amount of data received and sent by each of your network devices since your last reboot is printed in MiB, as shown below:


em1: 0.0 MiB 0.0 MiB
wlan0: 2651.40951061 MiB 183.173976898 MiB

You could combine this script with a persistent storage mechanism to write your own data usage monitor.
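
As a rough sketch of what such a monitor might look like, the code below stores one sample as JSON and computes the bytes transferred between two samples. All function names here are hypothetical, and JSON is just one possible storage choice. The counters restart from zero after a reboot, so a real monitor would need to handle negative differences.

```python
from __future__ import print_function
import json


def parse_net_dev(text):
    """Return {interface: (rx_bytes, tx_bytes)} from /proc/net/dev-style text."""
    counters = {}
    for line in text.splitlines()[2:]:      # skip the two header lines
        name, sep, rest = line.partition(':')
        fields = rest.split()
        if sep and len(fields) >= 9:
            # Field 0 is received bytes, field 8 is transmitted bytes
            counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def usage_since(previous, current):
    """Bytes transferred between two samples, per interface."""
    delta = {}
    for name, (rx, tx) in current.items():
        old_rx, old_tx = previous.get(name, (0, 0))
        delta[name] = (rx - old_rx, tx - old_tx)
    return delta


def save_sample(counters, path):
    """Persist a sample as JSON (tuples are stored as lists)."""
    with open(path, 'w') as f:
        json.dump(counters, f)


def load_sample(path):
    """Load a sample written by save_sample()."""
    with open(path) as f:
        return dict((k, tuple(v)) for k, v in json.load(f).items())


# Two placeholder header lines; the real file has two header lines as well
HEADER = 'Inter-|   Receive   |  Transmit\n face |bytes ...   |bytes ...\n'
```

A periodic job could call parse_net_dev() on the contents of /proc/net/dev, compare the result with the last saved sample using usage_since(), and then save the new sample.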

Processes

The /proc directory also contains a directory for each running process. These directories are named after the corresponding process IDs. So, if you walk through all the directories in /proc that have numeric names, you'll get a list of the IDs of all currently running processes. The process_list() function in the code listing below returns a list of all currently running process IDs. The length of this list is equal to the total number of processes running on the system, as you can see by running this script:


#!/usr/bin/env python
"""
 List of all process IDs currently active
"""
 
from __future__ import print_function
import os
def process_list():
 
  pids = []
  for subdir in os.listdir('/proc'):
    if subdir.isdigit():
      pids.append(subdir)
 
  return pids
 
if __name__=='__main__':
 
  pids = process_list()
  print('Total number of running processes:: {0}'.format(len(pids)))

When you run the script above, it prints the total number of running processes on a single line.

Each process directory contains a large number of other files and directories that hold various information about the process, such as the command used to invoke it, the shared libraries it uses, and more.
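
For example, the command used to invoke a process is stored in /proc/<pid>/cmdline, with the arguments separated by NUL bytes. The following is a minimal sketch with made-up function names; it drops empty arguments for simplicity and returns an empty list when the file cannot be read.

```python
from __future__ import print_function
import os


def parse_cmdline(raw):
    """Split the NUL-separated contents of /proc/<pid>/cmdline into arguments."""
    return [arg.decode('utf-8', 'replace') for arg in raw.split(b'\0') if arg]


def process_command(pid):
    """Return the argument list of a process, or [] if it cannot be read."""
    try:
        with open('/proc/{0}/cmdline'.format(pid), 'rb') as f:
            return parse_cmdline(f.read())
    except (IOError, OSError):
        # The process may have exited, or we may lack permission
        return []


if __name__ == '__main__':
    # Show the command line of this very script
    print(process_command(os.getpid()))
```

Reading the file in binary mode avoids decoding problems when arguments contain bytes that are not valid UTF-8.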

Block devices

The following script lists all the block devices by accessing the sysfs virtual file system. You can find all the block devices on your system in the /sys/block directory. Therefore, your system will have /sys/block/sda, /sys/block/sdb, and other similar directories. To find these devices, we can walk through the /sys/block directory and use simple regular expressions to match what we are looking for.


#!/usr/bin/env python
 
"""
Read block device data from sysfs
"""
 
from __future__ import print_function
import glob
import re
import os
 
# Add any other device pattern to read from
dev_pattern = ['sd.*','mmcblk*']
 
def size(device):
  # Use with-blocks so that the sysfs files are closed promptly
  with open(device+'/size') as f:
    nr_sectors = f.read().rstrip('\n')
  with open(device+'/queue/hw_sector_size') as f:
    sect_size = f.read().rstrip('\n')

  # sect_size is in bytes, so convert the total to GiB before returning it
  return (float(nr_sectors)*float(sect_size))/(1024.0*1024.0*1024.0)
 
def detect_devs():
  for device in glob.glob('/sys/block/*'):
    for pattern in dev_pattern:
      if re.compile(pattern).match(os.path.basename(device)):
        print('Device:: {0}, Size:: {1} GiB'.format(device, size(device)))
 
if __name__=='__main__':
  detect_devs()

If you run the script, you will see output similar to the following:


Device:: /sys/block/sda, Size:: 465.761741638 GiB
Device:: /sys/block/mmcblk0, Size:: 3.70703125 GiB

When I ran this script, I had an extra SD card inserted, so you can see that the script detected it. You can also extend the script to recognize other block devices (such as virtual hard drives).
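
Extending the script mostly comes down to adding patterns. The sketch below separates the name matching and the size conversion into small functions that can be tried without sysfs. The function names are hypothetical, and 'vd.*' is an assumed pattern for virtio disks with names such as vda.

```python
from __future__ import print_function
import re

# 'vd.*' is an assumed pattern for virtio disks such as vda
DEV_PATTERNS = ['sd.*', 'mmcblk.*', 'vd.*']


def is_wanted(name, patterns=DEV_PATTERNS):
    """True if the device name matches any pattern from its start."""
    return any(re.match(pattern, name) for pattern in patterns)


def size_gib(nr_sectors, sector_size):
    """Convert a sector count and a sector size in bytes to GiB."""
    return float(nr_sectors) * float(sector_size) / (1024.0 ** 3)


if __name__ == '__main__':
    for name in ['sda', 'vda', 'loop0']:
        print(name, is_wanted(name))
```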

Building command-line tools

Letting users specify command-line arguments to customize the default behavior of a program is a feature common to Linux command-line tools. The argparse module lets you give your program an interface similar to that of the built-in system tools. The next code listing shows a program that retrieves all the users on your system and prints their corresponding login shells.


#!/usr/bin/env python
 
"""
Print all the users and their login shells
"""
 
from __future__ import print_function
import pwd
 
# Get the users from /etc/passwd
def getusers():
  users = pwd.getpwall()
  for user in users:
    print('{0}:{1}'.format(user.pw_name, user.pw_shell))
 
if __name__=='__main__':
  getusers()

When you run the script above, it prints out all the users on your system and their login shells.

Now, let's say you want the users of the script to be able to choose whether or not they want to see the system users (e.g., daemon, apache). We can do this by extending the previous code with the argparse module, as follows.


#!/usr/bin/env python
 
"""
Utility to play around with users and passwords on a Linux system
"""
 
from __future__ import print_function
import pwd
import argparse
import os
 
def read_login_defs():
 
  uid_min = None
  uid_max = None
 
  if os.path.exists('/etc/login.defs'):
    with open('/etc/login.defs') as f:
      login_data = f.readlines()
 
    for line in login_data:
      if line.startswith('UID_MIN'):
        uid_min = int(line.split()[1].strip())
 
      if line.startswith('UID_MAX'):
        uid_max = int(line.split()[1].strip())
 
  return uid_min, uid_max
 
# Get the users from /etc/passwd
def getusers(no_system=False):
 
  uid_min, uid_max = read_login_defs()
 
  if uid_min is None:
    uid_min = 1000
  if uid_max is None:
    uid_max = 60000
 
  users = pwd.getpwall()
  for user in users:
    if no_system:
      if user.pw_uid >= uid_min and user.pw_uid <= uid_max:
        print('{0}:{1}'.format(user.pw_name, user.pw_shell))
    else:
      print('{0}:{1}'.format(user.pw_name, user.pw_shell))
 
if __name__=='__main__':
 
  parser = argparse.ArgumentParser(description='User/Password Utility')
 
  parser.add_argument('--no-system', action='store_true',dest='no_system',
            default = False, help='Specify to omit system users')
 
  args = parser.parse_args()
  getusers(args.no_system)

Run the script above with the --help option, and you'll see a friendly help message listing the options (and what they do):


$ ./getusers.py --help
usage: getusers.py [-h] [--no-system]
 
User/Password Utility
 
optional arguments:
 -h, --help  show this help message and exit
 --no-system Specify to omit system users

An example invocation of the above script is as follows:


$ ./getusers.py --no-system
gene:/bin/bash

When you pass an invalid parameter, the script will report an error:


$ ./getusers.py --param
usage: getusers.py [-h] [--no-system]
getusers.py: error: unrecognized arguments: --param

Let's take a quick look at how we used the argparse module in the above script. The line parser = argparse.ArgumentParser(description='User/Password Utility') creates a new ArgumentParser object with an optional description of what the script does.

Then, in the next line of code, parser.add_argument('--no-system', action='store_true', dest='no_system', default = False, help='Specify to omit system users'), we use the add_argument() method to make the script aware of a command-line option. The first argument to this method is the name of the option that the user supplies when invoking the script. The next parameter, action='store_true', indicates that this is a Boolean option: its presence or absence is what affects the program's behavior. The dest parameter names the variable that holds the option's value and makes it available to the script. If the user does not supply the option, the parameter default=False sets its value to False. The last parameter is the help text that the script displays for this option. Finally, the parse_args() method processes the arguments: args = parser.parse_args(). Once they are processed, the value of an option supplied by the user is available as args.option_dest, where option_dest is the dest variable specified when you set up the argument. The line getusers(args.no_system) calls getusers() with the value of the no_system option supplied by the user.
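
The behavior of a store_true option can be tried out in isolation by passing an explicit argument list to parse_args() instead of letting it read the command line. The sketch below reuses the add_argument() call from the script above; the helper name build_parser is made up for illustration.

```python
from __future__ import print_function
import argparse


def build_parser():
    """Create a parser with the same --no-system option as the script above."""
    parser = argparse.ArgumentParser(description='User/Password Utility')
    parser.add_argument('--no-system', action='store_true', dest='no_system',
                        default=False, help='Specify to omit system users')
    return parser


if __name__ == '__main__':
    parser = build_parser()
    # Passing an explicit list instead of reading sys.argv
    print(parser.parse_args([]).no_system)               # False
    print(parser.parse_args(['--no-system']).no_system)  # True
```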

The following script shows how you can provide the user with a non-Boolean option in your script. It rewrites the network statistics script shown earlier, adding an option that lets you specify which network device you are interested in.


#!/usr/bin/env python
from __future__ import print_function
from collections import namedtuple
import argparse
 
def netdevs(iface=None):
  ''' RX and TX bytes for each of the network devices '''
 
  with open('/proc/net/dev') as f:
    net_dump = f.readlines()
 
  device_data={}
  data = namedtuple('data',['rx','tx'])
  for line in net_dump[2:]:
    line = line.split(':')
    if not iface:
      if line[0].strip() != 'lo':
        device_data[line[0].strip()] = data(float(line[1].split()[0])/(1024.0*1024.0),
                          float(line[1].split()[8])/(1024.0*1024.0))
    else:
      if line[0].strip() == iface:
        device_data[line[0].strip()] = data(float(line[1].split()[0])/(1024.0*1024.0),
                          float(line[1].split()[8])/(1024.0*1024.0))  
  return device_data
 
if __name__=='__main__':
 
  parser = argparse.ArgumentParser(description='Network Interface Usage Monitor')
  parser.add_argument('-i','--interface', dest='iface',
            help='Network interface')
 
  args = parser.parse_args()
 
  netdevs = netdevs(iface = args.iface)
  for dev in netdevs.keys():
    print('{0}: {1} MiB {2} MiB'.format(dev, netdevs[dev].rx, netdevs[dev].tx))

When you run the script with no arguments, it behaves exactly like the previous version. However, you can also specify a network device that you are interested in. For example:


$ ./net_devs_2.py
 
em1: 0.0 MiB 0.0 MiB
wlan0: 146.099492073 MiB 12.9737148285 MiB
virbr1: 0.0 MiB 0.0 MiB
virbr1-nic: 0.0 MiB 0.0 MiB
 
$ ./net_devs_2.py --help
usage: net_devs_2.py [-h] [-i IFACE]
 
Network Interface Usage Monitor
 
optional arguments:
 -h, --help      show this help message and exit
 -i IFACE, --interface IFACE
            Network interface
 
$ ./net_devs_2.py -i wlan0
wlan0: 146.100307465 MiB 12.9777050018 MiB

Enable your script to run anywhere

With the help of this article, you may have written one or more useful scripts that you want to use every day, just like any other Linux command. The easiest way to do this is to make these scripts executable and to set up Bash aliases for them. You can also remove the .py suffix and put the file in a standard location like /usr/local/sbin.

Other useful standard library modules

In addition to the standard library modules we've seen so far in this article, there are many others that might be useful: subprocess, ConfigParser (named configparser in Python 3), readline, and curses.

The next step?

At this stage, depending on your own Python experience and how deeply you have explored Linux, one of the following will apply to you. If you've been writing a lot of shell scripts or command pipelines to explore Linux internals, try Python. If you want an easier way to write your own utility scripts for various tasks, try Python. Finally, if you've already been using Python for other kinds of programming on Linux, have fun using it to explore Linux internals.