Use Psyco to speed up Python

  • 2020-04-02 14:26:14
  • OfStack

Psyco operates strictly at Python run time. That is, Python source code is compiled to bytecode by the python command in exactly the same way as before (except for the few import statements and function calls added to invoke Psyco). But while the Python interpreter is running an application, Psyco checks from time to time whether it can substitute specialized machine code for regular Python bytecode operations. This specialized compilation is very similar to what Java just-in-time compilers do (broadly speaking, at least), and it is architecture-specific. So far, Psyco is available only for i386 CPU architectures. The beauty of Psyco is that you can use the very same Python code you have been writing all along, and it simply runs faster.

How Psyco works

To fully understand Psyco, you probably need a good grasp of both the Python interpreter's eval_frame() function and i386 assembly language. Unfortunately, I cannot claim expertise in either of them myself, but I think I can summarize Psyco roughly.
In regular Python, the eval_frame() function is the inner loop of the Python interpreter. It looks at the current bytecode in the execution context and hands control to a function that implements that bytecode. Exactly what that supporting function does usually depends on the state of the various Python objects held in memory. Simply put, adding the Python objects "2" and "3" produces a different result than adding the objects "5" and "6", but both operations are dispatched in the same way.
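As a rough illustration of that dispatch (using the standard dis module on a current CPython, nothing Psyco-specific), the sketch below lists the bytecode of a small hypothetical function add(). The same instructions serve both calls; only the operand objects differ. Opcode names vary between CPython versions, so the exact name of the addition opcode is not assumed here.

```python
import dis

def add(a, b):
    # Hypothetical helper used only for this illustration.
    return a + b

# Names of the bytecode instructions the interpreter loop dispatches on.
opnames = [ins.opname for ins in dis.get_instructions(add)]

# Same bytecode, different operand objects, different results.
results = (add(2, 3), add(5, 6))

print(opnames)
print(results)
```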
Psyco replaces the eval_frame() function with a compound evaluation unit. It improves on what Python does in several ways. First, Psyco compiles operations into somewhat optimized machine code; because that machine code has to do the same things as Python's dispatch functions, this alone yields only a slight improvement. More importantly, what is "specialized" in Psyco compilation is not just the choice of Python bytecodes: Psyco also specializes on the values of variables that are known in the execution context. For example, in code like the following, the variable x is known for the duration of the loop:


x = 5
l = []
for i in range(1000):
    l.append(x*i)

The optimized version of this code does not need to multiply each i by "the contents of the x variable/object"; it is cheaper to simply multiply each i by 5, which skips the lookup/dereference step.
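One way to picture this specialization in plain Python terms is shown below. It is only an analogy: Psyco works at the machine-code level, and the function names generic_loop and loop_specialized_for_5 are made up for this sketch.

```python
def generic_loop(x, n):
    # General form: x is an arbitrary object that must be
    # looked up and inspected on every multiplication.
    out = []
    for i in range(n):
        out.append(x * i)
    return out

def loop_specialized_for_5(n):
    # Specialized form: the known value 5 is baked in,
    # so no lookup of x is needed inside the loop.
    out = []
    for i in range(n):
        out.append(5 * i)
    return out

same = generic_loop(5, 1000) == loop_specialized_for_5(1000)
```

Both functions produce the same list; the specialized one simply has less work to do per iteration.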
In addition to creating i386-specific code for small operations, Psyco caches the compiled machine code for later reuse. If Psyco recognizes that a particular operation is the same as one that was performed (and "specialized") earlier, it can rely on the cached code rather than recompiling the segment. This saves some time.
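The caching idea can be sketched with an ordinary dictionary. This is an analogy, not Psyco's internals: the names specialize_multiplier, _specialized_cache, and build_count are hypothetical, and a Python function stands in for compiled machine code.

```python
_specialized_cache = {}   # known value -> specialized function
build_count = 0           # how many times something was "compiled"

def specialize_multiplier(x):
    """Return a function that multiplies by the known value x."""
    global build_count
    if x not in _specialized_cache:
        build_count += 1  # stands in for the cost of emitting machine code
        def multiply(i, _x=x):
            return _x * i
        _specialized_cache[x] = multiply
    return _specialized_cache[x]

first = specialize_multiplier(5)
second = specialize_multiplier(5)   # served from the cache, no rebuild
```

The second request for the same known value reuses the earlier result, which is the saving the paragraph above describes.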
But the real savings in Psyco come from the way it sorts operations into three different levels. For Psyco, there are run-time, compile-time, and virtual-time variables, and Psyco promotes and demotes variables between levels as needed. Run-time variables are just the raw bytecode and object structures that the regular Python interpreter handles. Compile-time variables are represented in machine registers and directly accessible memory locations once Psyco has compiled the operations to machine code.
The most interesting level is the virtual-time variable. Internally, a Python variable is a complete structure with many members, even when the object represents only an integer. Psyco virtual-time variables represent Python objects that could be built if the need arose, but whose details are omitted until then. For example, consider the following assignment:
x = 15 * (14 + (13 - (12 / 11)))
Standard Python builds and destroys a number of objects to compute this value. It builds a complete integer object to hold the value of (12/11); then it "pulls" a value out of that temporary object's structure and uses it to compute a new temporary object (13 - PyInt). Psyco, on the other hand, skips these objects and just computes the values, knowing that it can create an object from a value "if needed."
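To make the intermediate objects concrete, the sketch below spells out the temporaries, reading the assignment above as x = 15 * (14 + (13 - (12 / 11))) (the nesting is an assumption inferred from the description). It uses // so that the division is integer division on current Python versions, as / was for integers when this was written; the names t1 to t3 are illustrative.

```python
# Each line creates one temporary integer object in standard Python.
t1 = 12 // 11     # 1
t2 = 13 - t1      # 12
t3 = 14 + t2      # 26
x = 15 * t3       # 390
```

In standard Python, t1, t2, and t3 are each full integer objects; the virtual-time idea is to carry only the raw values forward and build an object for x at the end, if one is needed at all.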

Using Psyco

Explaining Psyco is relatively difficult, but using Psyco is very easy. Basically, it comes down to telling the Psyco module which functions/methods to "specialize". The code of your Python functions and classes themselves does not need to change.
There are several ways to specify what Psyco should do. The "shotgun" approach enables Psyco's just-in-time operation everywhere. To do this, place the following lines at the top of the module:


import psyco ; psyco.jit()
from psyco.classes import *

The first line tells Psyco to "work its magic" on all global functions. The second line (in Python 2.2 and above) tells Psyco to do the same for class methods. To control Psyco's behavior more precisely, use the following calls:
psyco.bind(somefunc) # or a method, or a class
newname = psyco.proxy(func)
The second form leaves func as a standard Python function, but optimizes calls that go through newname. In almost all cases other than testing and debugging, the psyco.bind() form is the one you will use.
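A common way to apply selective binding while keeping a script usable on machines without Psyco is to guard the import. The sketch below assumes a hypothetical function sum_of_squares() as the hot loop; psyco.bind() is the call described above, and everything else is plain Python.

```python
def sum_of_squares(n):
    # Hypothetical hot loop: integer arithmetic repeated many times.
    total = 0
    for i in range(n):
        total += i * i
    return total

try:
    import psyco
    psyco.bind(sum_of_squares)   # specialize just this function
except ImportError:
    pass                         # Psyco not installed: run as ordinary Python

print(sum_of_squares(1000))
```

The function body is untouched either way; only the binding call decides whether Psyco gets involved.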

Psyco performance

As magical as Psyco is, using it still requires a little thought and testing. The main thing to understand is that Psyco is useful for blocks of code that loop many times, and that it knows how to optimize operations involving integers and floating-point numbers. For non-looping functions and for operations on other types of objects, Psyco mostly just adds the overhead of its analysis and internal compilation. Moreover, for applications with large numbers of functions and classes, enabling Psyco application-wide adds a significant burden in machine-code compilation and in the memory used for this cache. It is much better to selectively bind the functions that stand to benefit most from Psyco's optimizations.
I started my testing in a completely naive way. I simply considered which applications I had run recently that I had not yet thought about speeding up. The first example that came to mind was a text-manipulation program that I use to convert my forthcoming book, Text Processing in Python, to LaTeX format. The application uses string methods, regular expressions, and program logic driven mostly by regular-expression and string matches. It was actually a bad candidate for a Psyco test, but I was using it anyway, so that is where I started.
In the first test, all I did was add psyco.jit() to the top of the script, which took no effort at all. Unfortunately, the results were (as expected) disappointing. The original script took 8.5 seconds to run; with Psyco's "acceleration" it took about 12 seconds. So lame! My guess was that the startup overhead of just-in-time compilation was dragging down the running time. So next I tried a much larger input file (made up of multiple copies of the original input file). This was a modest success, cutting the running time from about 120 seconds to about 110 seconds. The speedup was consistent across several runs, but it was not significant.
For the second test of the same text-processing candidate, I added only the line psyco.bind(main) in place of the blanket psyco.jit() call, because the main() function does loop many times (though it makes only minimal use of integer arithmetic). The results here were nominally better: this approach shaved a few tenths of a second off the normal running time, and a few seconds off the run with the larger input. But there was still nothing dramatic (and no harm done either).
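Comparisons like these are easy to repeat with a small timing helper. The sketch below is an assumption about how one might measure, not the harness used for the figures above; timed() and work() are hypothetical names, and time.perf_counter() requires Python 3.3 or later.

```python
import time

def timed(func, *args):
    """Run func(*args) once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed

def work(n):
    # Stand-in workload; replace with the script's real entry point.
    return sum(i % 7 for i in range(n))

result, seconds = timed(work, 100000)
print("work() took %.3f seconds" % seconds)
```

Timing the same call with and without the Psyco binding, on both small and large inputs, helps separate start-up overhead from steady-state gains.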

For a more appropriate Psyco test, I dug up some neural network code that I wrote about in an earlier article (see resources). The code_recognizer application can be "trained" to recognize the likely distribution of ASCII values in text written in different programming languages. Something like this might be useful for guessing file types (for example, of lost network packets); but the code is actually completely generic about what it is "trained" on, and it could just as easily learn to recognize faces, sounds, or tidal patterns. In any case, code_recognizer is based on the Python library bpnn, which the Psyco 0.4 distribution includes (in modified form) as a test case. The important thing for this article is that code_recognizer does a lot of floating-point looping and takes a long time to run. Here we have a good candidate for Psyco testing.
After playing around for a while, I worked out some details about how to use Psyco. For an application like this one, with few classes and functions, it makes little difference whether you use just-in-time binding or targeted binding. The best result, by a few percentage points, still comes from selectively binding the classes worth optimizing. More important, however, is understanding the scope of Psyco binding.
The code_recognizer.py script contains lines like this:

from bpnn import NN
class NN2(NN):
    # customized output methods, math core inherited
That is, from Psyco's point of view, the interesting work happens in the class bpnn.NN. Adding psyco.jit() or psyco.bind(NN2) to the code_recognizer.py script accomplishes little. For Psyco to perform the desired optimization, you need to add either psyco.bind(NN) to code_recognizer.py or psyco.jit() to bpnn.py. Contrary to what you might assume, just-in-time optimization does not take effect where an instance is created or where a method is run, but in the scope where the class is defined. In addition, binding a derived class does not specialize the methods it inherits from elsewhere.
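The point about inherited methods follows from where Python stores them. In the sketch below, NN and NN2 are small stand-ins (not the real bpnn classes), and train() and report() are hypothetical method names; the inherited method lives only in the base class's namespace, which is consistent with binding the derived class not reaching it.

```python
class NN(object):
    def train(self, iterations):
        # Stand-in for the math core: a numeric loop.
        total = 0.0
        for i in range(iterations):
            total += i * 0.5
        return total

class NN2(NN):
    def report(self):
        # Customized output method; the math core is inherited.
        return "trained: %s" % self.train(10)

defined_in_base = "train" in vars(NN)        # True
defined_in_derived = "train" in vars(NN2)    # False
```

An instance of NN2 can still call train(), but the function object itself belongs to NN, so NN is the class to bind.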
Once the fine details of the appropriate Psyco binding are worked out, the speedup is quite noticeable. Using the same test cases and training regime presented in the referenced article (500 training patterns, 1000 training iterations), neural network training time dropped from about 2000 seconds to about 600 seconds, more than three times faster. Reducing the number of iterations to as few as 10 showed proportional speedups (though the resulting network is useless for recognition), as did intermediate numbers of iterations.
I found that with two lines of new code, I could cut the running time from more than half an hour to about 10 minutes. This speedup probably still leaves the program slower than a comparable application written in C, and it is certainly well short of the 100-fold speedups that a few isolated Psyco test cases show. But the application is fairly "real-life", and in many contexts the improvement is significant enough.

