Use ctypes to speed up the execution of Python

  • 2020-05-10 18:23:44
  • OfStack

Preface

ctypes is a foreign function library in Python's standard library. It provides C-compatible data types and lets you call functions in dynamic link libraries (DLLs) and shared libraries, so those libraries can be wrapped and used from pure Python. This interface to C is useful in many situations, for example when a small hot spot needs to be moved into C code to improve performance. With ctypes you can access kernel32.dll and msvcrt.dll on Windows and libc.so.6 on Linux, and you can also load shared libraries that you compile yourself.
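As a quick taste of the interface, here is a minimal sketch that loads the C standard library and calls two of its functions. It is written for Python 3 and assumes a POSIX system such as Linux; the variable names absolute and length are only illustrative.

```python
import ctypes
import ctypes.util

# find_library returns a name such as "libc.so.6" on Linux. If it returns
# None, CDLL(None) falls back to the symbols already loaded in the process
# (a POSIX-only behaviour, so this sketch is not meant for Windows).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declaring argument and return types lets ctypes check and convert values.
libc.abs.argtypes = [ctypes.c_int]
libc.abs.restype = ctypes.c_int

libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

absolute = libc.abs(-42)         # C int in, Python int out
length = libc.strlen(b"ctypes")  # bytes are passed as a C char pointer

print(absolute, length)
```

Setting argtypes and restype is optional for functions that take and return int, but it makes mistakes surface as Python exceptions instead of undefined behaviour in C.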

Let's start with a simple example: we use Python to find all prime numbers below 1 million, repeat the process 10 times, and measure the total running time. The listings in this article target Python 2 (note xrange and the print statement).


import math
from timeit import timeit


def check_prime(x):
  values = xrange(2, int(math.sqrt(x)) + 1)
  for i in values:
    if x % i == 0:
      return False
  return True


def get_prime(n):
  return [x for x in xrange(2, n) if check_prime(x)]

print timeit(stmt='get_prime(1000000)', setup='from __main__ import get_prime',
       number=10)

The output


42.8259568214

Now let's write the check_prime function in C, save it as prime.c, and load it as a shared library (dynamically linked library).


#include <stdio.h>
#include <math.h>
int check_prime(int a)
{
  int c;
  for ( c = 2 ; c <= sqrt(a) ; c++ ) {
    if ( a%c == 0 )
      return 0;
  }
  return 1;
}

Use the following command to generate the .so (shared object) file:


gcc -shared -o prime.so -fPIC prime.c

Then load the library with ctypes and time both versions:

import ctypes
import math
from timeit import timeit
check_prime_in_c = ctypes.CDLL('./prime.so').check_prime


def check_prime_in_py(x):
  values = xrange(2, int(math.sqrt(x)) + 1)
  for i in values:
    if x % i == 0:
      return False
  return True


def get_prime_in_c(n):
  return [x for x in xrange(2, n) if check_prime_in_c(x)]


def get_prime_in_py(n):
  return [x for x in xrange(2, n) if check_prime_in_py(x)]


py_time = timeit(stmt='get_prime_in_py(1000000)', setup='from __main__ import get_prime_in_py',
         number=10)
c_time = timeit(stmt='get_prime_in_c(1000000)', setup='from __main__ import get_prime_in_c',
        number=10)
print "Python version: {} seconds".format(py_time)

print "C version: {} seconds".format(c_time)

The output


Python version: 43.4539749622 seconds
C version: 8.56250786781 seconds

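The C version is about five times faster (roughly 8.6 seconds versus 43.5 seconds), even though both versions use the same trial-division algorithm. There are also better algorithms for deciding whether a number is prime, but they are not the point here.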
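To make the remark about other approaches concrete, here is a minimal sketch of one well-known alternative, the sieve of Eratosthenes. It is not the method used in this article: it is plain Python 3 with no C code, and the function name primes_below is made up for illustration.

```python
import math


def primes_below(n):
    """Return all primes p with 2 <= p < n using the sieve of Eratosthenes."""
    if n < 3:
        return []
    # One flag per integer in [0, n); 1 means "still considered prime".
    is_prime = bytearray([1]) * n
    is_prime[0] = is_prime[1] = 0
    for p in range(2, math.isqrt(n) + 1):
        if is_prime[p]:
            # Every multiple of p from p*p upward is composite.
            count = len(range(p * p, n, p))
            is_prime[p * p::p] = bytearray(count)
    return [i for i in range(n) if is_prime[i]]


print(len(primes_below(1000000)))
```

A sieve answers "which numbers below n are prime" in one pass instead of testing each number separately, so a change of algorithm and a move to C are independent ways to gain speed.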

Let's look at a more complicated example: quicksort.

mylib.c


#include <stdio.h>

typedef struct _Range {
  int start, end;
} Range;

Range new_Range(int s, int e) {
  Range r;
  r.start = s;
  r.end = e;
  return r;
}

void swap(int *x, int *y) {
  int t = *x;
  *x = *y;
  *y = t;
}

void quick_sort(int arr[], const int len) {
  if (len <= 0)
    return;
  Range r[len];
  int p = 0;
  r[p++] = new_Range(0, len - 1);
  while (p) {
    Range range = r[--p];
    if (range.start >= range.end)
      continue;
    int mid = arr[range.end];
    int left = range.start, right = range.end - 1;
    while (left < right) {
      while (arr[left] < mid && left < right)
        left++;
      while (arr[right] >= mid && left < right)
        right--;
      swap(&arr[left], &arr[right]);
    }
    if (arr[left] >= arr[range.end])
      swap(&arr[left], &arr[range.end]);
    else
      left++;
    r[p++] = new_Range(range.start, left - 1);
    r[p++] = new_Range(left + 1, range.end);
  }
}

gcc -shared -o mylib.so -fPIC mylib.c

One problem with using ctypes is that the types used by native C code may not map directly onto Python types. For example, what corresponds to a C array in Python: a list, or an array from the array module? Neither can be passed to the C function as it is, so we need to convert the data into a ctypes array first.
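The conversion step can be sketched in isolation. The snippet below is a Python 3 illustration with made-up names (values, IntArray, shared); it shows a list being copied into a C int array and back, and an array.array being wrapped without a copy.

```python
import array
import ctypes

values = [5, 3, 9, 1]

# A ctypes array type is built by multiplying an element type by a length;
# calling the type with the unpacked list copies the values into C memory.
IntArray = ctypes.c_int * len(values)
c_values = IntArray(*values)

# The ctypes array supports iteration, so list() copies it back out.
round_trip = list(c_values)

# An array.array with typecode "i" already stores raw C ints. from_buffer
# wraps that memory without copying, so a write through one object is
# visible through the other.
py_array = array.array("i", values)
shared = IntArray.from_buffer(py_array)
shared[0] = 42

print(round_trip, py_array.tolist())
```

The copying form is the one used in test.py; the from_buffer form may be preferable when the data is large and already lives in an array.array.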

test.py


import ctypes
import time
import random

quick_sort = ctypes.CDLL('./mylib.so').quick_sort
nums = []
for _ in range(100):
  r = [random.randrange(1, 100000000) for x in xrange(100000)]
  arr = (ctypes.c_int * len(r))(*r)
  nums.append((arr, len(r)))

init = time.clock()
for i in range(100):
  quick_sort(nums[i][0], nums[i][1])
print "%s" % (time.clock() - init)

The output


1.874907

Compare this with the sort method of a Python list:


[code listing lost in the source]

The output


[output lost in the source]
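The listing and timing for this comparison did not survive in this copy of the article. The following is only a sketch of how list.sort could be timed in the same way as the quick_sort loop above. It is written for Python 3, uses time.perf_counter because time.clock was removed in Python 3.8, and uses smaller sizes (LIST_COUNT and LIST_LENGTH are illustrative names) so that it runs quickly; it does not reproduce the article's original numbers.

```python
import random
import time

# Reduced from the 100 lists of 100000 items used in test.py; raise these
# two values to match that workload.
LIST_COUNT = 10
LIST_LENGTH = 10000

nums = [[random.randrange(1, 100000000) for _ in range(LIST_LENGTH)]
        for _ in range(LIST_COUNT)]

start = time.perf_counter()
for values in nums:
    values.sort()  # in-place sort, implemented in C inside CPython
elapsed = time.perf_counter() - start

print("%s" % elapsed)
```

As in test.py, the random data is generated before the timer starts, so only the sorting itself is measured.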

For a C struct, you need to define a Python class derived from ctypes.Structure whose _fields_ attribute lists the corresponding field names and types:


[code listing lost in the source]
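The original listing is not available here, so the following is a sketch of what such a class might look like for the Range struct in mylib.c. It is written for Python 3 and is not necessarily the author's code.

```python
import ctypes


class Range(ctypes.Structure):
    # Field order and types must match the C declaration:
    #   typedef struct _Range { int start, end; } Range;
    _fields_ = [("start", ctypes.c_int),
                ("end", ctypes.c_int)]


r = Range(0, 9)  # positional arguments fill the fields in declaration order
r.end = 99       # fields are readable and writable as attributes
size = ctypes.sizeof(Range)

print(r.start, r.end, size)
```

To receive such a struct from C, the function's restype can be set to the class, for example restype = Range on the new_Range function loaded from mylib.so, assuming that library has been built as shown earlier.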

Besides loading our own compiled C libraries, we can also load libraries provided by the system directly, such as glibc, the implementation of the C standard library on Linux:


import time
import random
from ctypes import cdll
libc = cdll.LoadLibrary('libc.so.6') # Linux system 
# libc = cdll.msvcrt # Windows system 
init = time.clock()
randoms = [random.randrange(1, 100) for x in xrange(1000000)]
print "Python version: %s seconds" % (time.clock() - init)
init = time.clock()
randoms = [(libc.rand() % 100) for x in xrange(1000000)]
print "C version : %s seconds" % (time.clock() - init)

The output


[output lost in the source]
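Because time.clock and xrange do not exist in current Python 3 releases, here is a hedged Python 3 sketch of the same comparison. It assumes a POSIX system, uses a smaller COUNT (an illustrative name) than the 1000000 values in the listing above, and makes no claim about which version is faster on a given machine.

```python
import ctypes
import ctypes.util
import random
import time

# POSIX assumption: on Windows the article's cdll.msvcrt would be used instead.
libc = ctypes.CDLL(ctypes.util.find_library("c"))
libc.rand.argtypes = []
libc.rand.restype = ctypes.c_int

COUNT = 100000  # reduced from 1000000 to keep the sketch quick

start = time.perf_counter()
py_randoms = [random.randrange(1, 100) for _ in range(COUNT)]
py_elapsed = time.perf_counter() - start

start = time.perf_counter()
# Note the ranges differ slightly: rand() % 100 can yield 0,
# while randrange(1, 100) cannot.
c_randoms = [libc.rand() % 100 for _ in range(COUNT)]
c_elapsed = time.perf_counter() - start

print("Python version: %s seconds" % py_elapsed)
print("C version : %s seconds" % c_elapsed)
```

Each libc.rand() call still crosses the Python to C boundary once per value, so the measured time includes ctypes call overhead as well as the work done inside C.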

Conclusion

That is the whole content of this article. I hope it is of some help to you in learning or using Python. If you have any questions, feel free to leave a message.

