A simple guide to extending Python programs with the C language

  • 2020-05-09 18:47:33
  • OfStack

1. Introduction

Python is a powerful high-level scripting language. Its strength lies not only in its built-in functionality but also in its extensibility. For this reason, Python has won a growing number of users and has been applied successfully in the development of many large-scale software systems.

Unlike many other scripting languages, Python provides an API that lets programmers extend it in C or C++, so that they can keep Python's convenient and flexible syntax while getting performance close to that of C or C++ for the extended parts. Slow execution is a common weakness of scripting languages and a frequent source of criticism. Python addresses it by combining with C, which greatly widens the range of applications for which a scripting language is suitable.

When real software systems are developed in Python, C or C++ is often used to extend it. The most common case is that a library written in C already exists and some of its functionality is needed from Python; the extension mechanism makes that library callable. In addition, because Python is a scripting language, some functions implemented in pure Python may not meet the system's requirements for execution efficiency. Such critical code segments can be implemented in C or C++ through the extension mechanism to improve the program's execution performance.

This article introduces the C language extension interface provided by Python and shows, with concrete examples, how to use it together with C/C++ to extend Python's functionality.

2. C language interface of Python

The standard Python interpreter is implemented in C. It is open and extensible, and provides a convenient and flexible application programming interface (API) that lets C/C++ programmers extend the interpreter at every level. Before you can use C/C++ to extend Python, you need to be familiar with the C interface that the Python interpreter provides. The examples in this article use the Python 2 C API; several of the names that appear, such as PyInt_*, PyString_* and Py_InitModule(), were renamed or replaced in Python 3.

2.1 Python objects (PyObject)

Python is an object-oriented scripting language, and every object is represented inside the interpreter as a PyObject. The PyObject structure carries the information the interpreter needs to manage an object, including its type information and its reference count. When you write a Python extension, handling a Python object in C or C++ means working with a PyObject structure.

In Python's C extension interface, most functions take one or more arguments of type PyObject* and most of them return a PyObject*.

2.2 Reference counting

To simplify memory management, Python implements automatic garbage collection through reference counting. Every Python object has a reference count that records how many references to it exist. Each time a new reference to the object is created, the count is increased by 1; each time a reference is destroyed, the count is decreased by 1. Only when the count reaches zero is the object actually deleted from memory.

The following example shows how the Python interpreter uses reference counting to manage Python objects:

Example 1: refcount.py


class refcount:
    pass            # class body omitted
r1 = refcount()     # the reference count is 1
r2 = r1             # the reference count is 2
del r1              # the reference count is 1
del r2              # the reference count is 0, so the object is deleted

Maintaining reference counts correctly is a key issue when handling Python objects from C/C++, and mistakes easily lead to memory leaks. Python's C interface provides macros for this purpose, most commonly Py_INCREF(), which increases an object's reference count by 1, and Py_DECREF(), which decreases it by 1.
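The bookkeeping behind these two macros can be illustrated with a small toy model in plain C. This is only a sketch of the idea, not Python's actual implementation: ToyObject, toy_new(), toy_incref(), toy_decref() and the toy_live_objects counter are hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for PyObject: a reference count plus a payload. */
typedef struct {
    int refcount;
    int value;
} ToyObject;

/* Number of ToyObjects currently allocated, so that leaks are observable. */
static int toy_live_objects = 0;

/* Create an object; the caller owns the single initial reference. */
ToyObject *toy_new(int value)
{
    ToyObject *obj = malloc(sizeof *obj);
    if (obj == NULL)
        return NULL;
    obj->refcount = 1;
    obj->value = value;
    ++toy_live_objects;
    return obj;
}

/* Analogue of Py_INCREF(): record one more owner. */
void toy_incref(ToyObject *obj)
{
    assert(obj != NULL && obj->refcount > 0);
    ++obj->refcount;
}

/* Analogue of Py_DECREF(): drop one owner and free on the last release. */
void toy_decref(ToyObject *obj)
{
    assert(obj != NULL && obj->refcount > 0);
    if (--obj->refcount == 0) {
        --toy_live_objects;
        free(obj);
    }
}
```

Every toy_new() or toy_incref() has to be matched by exactly one toy_decref(). A missing release leaves toy_live_objects above zero, which is the toy equivalent of the memory leak described above.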

2.3 Data types

This article covers six of Python's data types: integer, floating point, string, tuple, list, and dictionary. When extending Python with C, you first need to understand how to convert between the data types of C and those of Python.

2.3.1 Integer, floating point and string

Using the integer, floating point, and string types in a C extension is relatively simple: you only need to know how to create them and read their values back. The following example shows how to use these three Python data types from C:

Example 2: typeifs.c


// build an integer
PyObject* pInt = Py_BuildValue("i", 2003);
assert(PyInt_Check(pInt));
long i = PyInt_AsLong(pInt);          // PyInt_AsLong() returns a C long
Py_DECREF(pInt);
// build a float
PyObject* pFloat = Py_BuildValue("f", 3.14f);
assert(PyFloat_Check(pFloat));
double f = PyFloat_AsDouble(pFloat);  // Python floats are C doubles
Py_DECREF(pFloat);
// build a string
PyObject* pString = Py_BuildValue("s", "Python");
assert(PyString_Check(pString));
Py_ssize_t nLen = PyString_Size(pString);
char* s = PyString_AsString(pString); // points into pString; do not free it
// s must not be used after pString has been released
Py_DECREF(pString);
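How a one-character format code such as "i", "f" or "s" can select the type of the value being built may be easier to see in a toy model. The sketch below is written in plain C and is not how Py_BuildValue() is implemented; ToyValue, ToyKind and toy_build_value() are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

typedef enum { TOY_NONE, TOY_INT, TOY_FLOAT, TOY_STRING } ToyKind;

/* A tagged union: kind says which member of "as" is valid. */
typedef struct {
    ToyKind kind;
    union {
        long i;
        double f;
        const char *s;
    } as;
} ToyValue;

/* Build one value from a one-character format code: "i", "f" or "s". */
ToyValue toy_build_value(const char *format, ...)
{
    ToyValue v;
    va_list ap;

    assert(format != NULL);
    v.kind = TOY_NONE;
    v.as.i = 0;
    va_start(ap, format);
    switch (format[0]) {
    case 'i':
        v.kind = TOY_INT;
        v.as.i = va_arg(ap, int);
        break;
    case 'f':
        /* float arguments are promoted to double in a variadic call */
        v.kind = TOY_FLOAT;
        v.as.f = va_arg(ap, double);
        break;
    case 's':
        v.kind = TOY_STRING;
        v.as.s = va_arg(ap, const char *);
        break;
    default:
        break; /* unknown code: the value stays TOY_NONE */
    }
    va_end(ap);
    return v;
}
```

The 'f' case also shows why a C float passed through a variadic function arrives as a double, which is consistent with reading a Python float back through PyFloat_AsDouble().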

2.3.2 Tuples

A tuple in Python is a fixed-length array. When the Python interpreter calls a method in a C extension, all non-keyword arguments are passed as a tuple. The following example demonstrates how to use Python's tuple type from C:

Example 3: typetuple.c


// create the tuple
PyObject* pTuple = PyTuple_New(3); // new reference
assert(PyTuple_Check(pTuple));
assert(PyTuple_Size(pTuple) == 3);
// set the items; PyTuple_SetItem() steals the reference to each new value
PyTuple_SetItem(pTuple, 0, Py_BuildValue("i", 2003));
PyTuple_SetItem(pTuple, 1, Py_BuildValue("f", 3.14f));
PyTuple_SetItem(pTuple, 2, Py_BuildValue("s", "Python"));
// parse tuple items
int i;
float f;
const char *s; // points into the tuple's string item; do not free it
if (!PyArg_ParseTuple(pTuple, "ifs", &i, &f, &s))
  PyErr_SetString(PyExc_TypeError, "invalid parameter");
// cleanup; s must not be used after this point
Py_DECREF(pTuple);
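The unpacking step performed by PyArg_ParseTuple(), one format code per tuple item with a pointer to the matching C variable, can be sketched in plain C as follows. This is a simplified model under stated assumptions, not Python's implementation: ToyItem, ToyKind and toy_parse_items() are hypothetical names, and only the codes 'i' and 's' are handled.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

typedef enum { TOY_INT, TOY_STRING } ToyKind;

/* One item of a fixed-length, tuple-like array. */
typedef struct {
    ToyKind kind;
    long i;
    const char *s;
} ToyItem;

/* Unpack a fixed-length array of items into C variables.
   format holds one code per item: 'i' expects a long *, 's' a const char **.
   Returns 1 on success and 0 on a length or type mismatch. */
int toy_parse_items(const ToyItem *items, size_t count, const char *format, ...)
{
    va_list ap;
    size_t k;
    int ok = 1;

    assert(items != NULL && format != NULL);
    va_start(ap, format);
    for (k = 0; ok && format[k] != '\0'; ++k) {
        if (k >= count) {
            ok = 0;                       /* more codes than items */
        } else if (format[k] == 'i' && items[k].kind == TOY_INT) {
            *va_arg(ap, long *) = items[k].i;
        } else if (format[k] == 's' && items[k].kind == TOY_STRING) {
            *va_arg(ap, const char **) = items[k].s;
        } else {
            ok = 0;                       /* unknown code or wrong type */
        }
    }
    if (ok && k != count)
        ok = 0;                           /* fewer codes than items */
    va_end(ap);
    return ok;
}
```

As with the real function, the caller checks the return value before using the output variables, because a failed parse may leave some of them unset.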

2.3.3 List

A list in Python is a variable-length array. Lists are more flexible than tuples: their contents can change after creation, and the Python objects they store can be accessed at random by index. The following example demonstrates how to use Python's list type from C:

Example 4: typelist.c


// create the list
PyObject* pList = PyList_New(3); // new reference
assert(PyList_Check(pList));
// set some initial values; PyList_SetItem() steals the new reference
for(int i = 0; i < 3; ++i)
  PyList_SetItem(pList, i, Py_BuildValue("i", i));
// insert an item; PyList_Insert() does not steal, so release our reference
PyObject* pItem = Py_BuildValue("s", "inserted");
PyList_Insert(pList, 2, pItem);
Py_DECREF(pItem);
// append an item; PyList_Append() does not steal either
pItem = Py_BuildValue("s", "appended");
PyList_Append(pList, pItem);
Py_DECREF(pItem);
// sort the list
PyList_Sort(pList);
// reverse the list
PyList_Reverse(pList);
// fetch and manipulate a list slice
PyObject* pSlice = PyList_GetSlice(pList, 2, 4); // new reference
for(Py_ssize_t j = 0; j < PyList_Size(pSlice); ++j) {
 PyObject *pValue = PyList_GetItem(pSlice, j); // borrowed reference
 assert(pValue);
}
Py_DECREF(pSlice);
// cleanup
Py_DECREF(pList);

2.3.4 Dictionary

A dictionary in Python is a data type whose values are accessed by key. The following example demonstrates how to use Python's dictionary type from C:

Example 5: typedic.c


// create the dictionary
PyObject* pDict = PyDict_New(); // new reference
assert(PyDict_Check(pDict));
// add a few named values; PyDict_SetItemString() does not steal the
// reference to the value, so release it after each call
PyObject* pValue = Py_BuildValue("i", 2003);
PyDict_SetItemString(pDict, "first", pValue);
Py_DECREF(pValue);
pValue = Py_BuildValue("f", 3.14f);
PyDict_SetItemString(pDict, "second", pValue);
Py_DECREF(pValue);
// enumerate all named values
PyObject* pKeys = PyDict_Keys(pDict); // new reference
for(Py_ssize_t i = 0; i < PyList_Size(pKeys); ++i) {
 PyObject *pKey = PyList_GetItem(pKeys, i);     // borrowed reference
 PyObject *pItem = PyDict_GetItem(pDict, pKey); // borrowed reference
 assert(pItem);
}
Py_DECREF(pKeys);
// remove a named value
PyDict_DelItemString(pDict, "second");
// cleanup
Py_DECREF(pDict);

3. C language extensions for Python

3.1 Module encapsulation

Once you understand Python's C interface, you can use it to write C extensions for Python. Suppose you have the following C function:

Example 6: example.c


int fact(int n)
{
 if (n <= 1) 
  return 1;
 else 
  return n * fact(n - 1);
}
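One detail worth keeping in mind is that fact() returns an int, so the result overflows quickly: with a 32-bit int, 12! = 479001600 still fits but 13! does not. A hedged variant that reports overflow instead of returning a wrong value might look like the sketch below. fact_checked() is a hypothetical helper, not part of the original example, and like fact() it yields 1 for any n <= 1.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Compute n! into *result.
   Returns 0 on success and -1 if the value would not fit in an int. */
int fact_checked(int n, int *result)
{
    int acc = 1;
    int k;

    assert(result != NULL);
    for (k = 2; k <= n; ++k) {
        if (acc > INT_MAX / k)
            return -1;      /* acc * k would overflow */
        acc *= k;
    }
    *result = acc;
    return 0;
}
```

A wrapper built on such a function could turn the -1 return value into a Python exception rather than handing a meaningless number back to the caller.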

If you want to call this function from the Python interpreter, you first need to expose it as a Python module by writing a wrapper interface, as shown below:

Example 7: wrap.c


#include <Python.h>
int fact(int n); // defined in example.c
PyObject* wrap_fact(PyObject* self, PyObject* args) 
{
 int n, result;
 
 if (! PyArg_ParseTuple(args, "i:fact", &n))
  return NULL;
 result = fact(n);
 return Py_BuildValue("i", result);
}
static PyMethodDef exampleMethods[] = 
{
 {"fact", wrap_fact, METH_VARARGS, "Calculate N!"},
 {NULL, NULL, 0, NULL} // sentinel that ends the list
};
void initexample(void) 
{
 PyObject* m;
 m = Py_InitModule("example", exampleMethods);
}

A typical Python extension module contains at least three parts: an export function, a method list, and an initialization function.

3.2 Export function

To use a C function from the Python interpreter, you first write an export function for it, such as wrap_fact in the example above. In a C extension, all export functions share the same prototype:


PyObject* method(PyObject* self, PyObject* args);

This function is the interface between the Python interpreter and the C function. It takes two parameters, self and args. The self parameter is only used when the C function is implemented as a built-in method, and is usually NULL. The args parameter holds all the arguments that the interpreter passes to the C function; they are usually extracted with PyArg_ParseTuple(), provided by Python's C extension interface.

Every export function returns a PyObject pointer. If the corresponding C function has no real return value (that is, its return type is void), the export function should return the global None object (Py_None) after increasing its reference count by 1, as shown below. The Py_RETURN_NONE macro, available since Python 2.4, performs both steps.


PyObject* method(PyObject *self, PyObject *args) 
{
 Py_INCREF(Py_None);
 return Py_None;
}


3.3 Method list

The method list tells the Python interpreter which methods the module provides. The method list for the example above is:


static PyMethodDef exampleMethods[] = 
{
 {"fact", wrap_fact, METH_VARARGS, "Calculate N!"},
 {NULL, NULL, 0, NULL} // sentinel that ends the list
};

Each entry in the method list has four parts: the method name, the export function, the argument-passing convention, and a description of the method. The method name is the name used to call the method from the Python interpreter. The argument-passing convention specifies how Python passes arguments to the C function; the two options are METH_VARARGS and METH_KEYWORDS. METH_VARARGS is the standard convention: arguments are passed from the interpreter to the C function as a Python tuple. With METH_KEYWORDS, which is combined with METH_VARARGS, keyword arguments are also passed, as a Python dictionary, and the export function takes a third PyObject* parameter to receive it.
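The role of the terminating {NULL, ...} entry becomes clearer when you consider how such a table can be searched without knowing its length. The following plain C sketch models the idea only; ToyMethodDef, toy_methods and toy_find_method() are hypothetical names, and the real PyMethodDef has different fields and function signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*ToyFunction)(int);

/* Hypothetical, simplified counterpart of PyMethodDef. */
typedef struct {
    const char *name;   /* name visible to the caller */
    ToyFunction func;   /* C function that implements it */
    const char *doc;    /* short description */
} ToyMethodDef;

static int toy_fact(int n)
{
    return (n <= 1) ? 1 : n * toy_fact(n - 1);
}

static int toy_square(int n)
{
    return n * n;
}

/* The entry whose name is NULL is the sentinel that ends the table. */
static const ToyMethodDef toy_methods[] = {
    {"fact",   toy_fact,   "Calculate N!"},
    {"square", toy_square, "Calculate N * N"},
    {NULL, NULL, NULL}
};

/* Walk the table until the sentinel; return NULL if no name matches. */
const ToyMethodDef *toy_find_method(const ToyMethodDef *table, const char *name)
{
    assert(table != NULL && name != NULL);
    for (; table->name != NULL; ++table) {
        if (strcmp(table->name, name) == 0)
            return table;
    }
    return NULL;
}
```

Because the loop stops only at the sentinel, leaving it out of a table would make the search run past the end of the array, which is why every method list ends with a NULL entry.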

3.4 Initialization function

Every Python extension module must have an initialization function so that the Python interpreter can initialize the module correctly. The interpreter requires the function's name to be init followed by the module name. For the module example, the corresponding initialization function is:


void initexample(void) 
{
 PyObject* m;
 m = Py_InitModule("example", exampleMethods);
}

When the Python interpreter needs to import the module, it looks up the initialization function by the module's name and, once it finds the function, calls it to initialize the module. The initialization function in turn calls Py_InitModule(), provided by Python's C extension interface, to register with the interpreter all the methods that the module makes available.
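The naming rule, init followed by the module name, can be written down as a small helper in plain C. This is only an illustration of how such a name is formed, not code taken from the interpreter; toy_init_name() is a hypothetical function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Write "init" + module_name into buffer.
   Returns 0 on success and -1 if the buffer is too small. */
int toy_init_name(const char *module_name, char *buffer, size_t size)
{
    static const char prefix[] = "init";
    size_t needed;

    assert(module_name != NULL && buffer != NULL);
    needed = strlen(prefix) + strlen(module_name) + 1; /* +1 for '\0' */
    if (needed > size)
        return -1;
    strcpy(buffer, prefix);
    strcat(buffer, module_name);
    return 0;
}
```

For the module example this produces the string initexample, which matches the name of the initialization function in the wrapper code.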

3.5 Compiling and linking

To use an extension module written in C from the Python interpreter, you must compile it into a dynamically linked library. On RedHat Linux 8.0, for example, this takes two steps with gcc: compile the C sources into position-independent object files (-fpic), with the Python header directory on the include path (-I), and then link the object files with -shared into a dynamic library such as example.so.

3.6 Importing into the Python interpreter

Once the dynamic link library for the extension module has been generated, the Python interpreter can use the module. Like Python's native modules, it is loaded with the import command; for example, after import example, calling example.fact(4) returns 24.

4. Conclusion

As a powerful scripting language, Python is likely to be used ever more widely in many fields. To overcome the slow execution speed typical of scripting languages, Python provides a C language extension interface. By implementing the performance-critical code in C, programs written in Python can run much faster and so meet practical requirements.

