Take a look at ways to make Python code more efficient

  • 2020-04-02 13:49:04
  • OfStack

The first move: strike the snake's seven-inch spot: locating the bottleneck

The first step is to locate the bottleneck. Suppose one function can be optimized from 1 second down to 0.9 seconds, and another from 1 minute down to 30 seconds. By the weakest-link principle, you would of course pick the second one.

An experienced programmer would hesitate here. Wait, functions? Shouldn't the number of calls matter too? If the first function is called 100,000 times over the whole program and the second is called only once, the choice is no longer obvious. The point of this example is that a program's bottleneck is not always visible at a glance. Still, you should be aware that most of the time a function that can go from one minute to 30 seconds catches our attention more easily than one that can go from one second to 0.9 seconds, simply because the apparent room for improvement is larger.

So, enough preamble: the first move is profiling, Python's built-in bottleneck locator! The standard library actually offers three options: profile (pure Python), cProfile (a faster C implementation), and the older hotshot. In my view one of them is enough: cProfile. The incantation is as follows:


python -m cProfile your_program.py

The effect of this trick is to print a table of statistics: how many times each function was called, how much total time it took, how much of that was spent in the function's children, the average time per call, and so on. The columns mean:

filename:lineno(function): file name, line number, and function name
ncalls: how many times the function was called
tottime: total time spent in the function itself, excluding time spent in the functions it calls
percall: average time per call, i.e. tottime divided by ncalls
cumtime: total time spent in the function and everything it calls
percall: like the percall above, but cumtime divided by ncalls
Find the most promising spot to optimize and go for it.
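The same statistics can also be collected from inside a program with the cProfile and pstats modules. A minimal sketch (slow_sum is just a toy function invented for the demonstration):

```python
import cProfile
import io
import pstats

def slow_sum(n):
    # A deliberately plain loop so the profiler has something to measure
    total = 0
    for i in range(n):
        total += i
    return total

profiler = cProfile.Profile()
profiler.enable()
slow_sum(100000)
profiler.disable()

# Sort by cumulative time, the same cumtime column described above
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats()
report = stream.getvalue()
print(report)
```

The report is the same table `python -m cProfile` prints, so you can grep it or log it instead of reading a console dump.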

The second move: snake Zen: only one way to do it

I remember when I first started learning Python, an upperclassman told me that Python has a grand ideal: everyone who uses it should end up writing exactly the same program. The Zen of Python puts it this way:

There should be one -- and preferably only one -- obvious way to do it

So the Zen masters of the Python school provide exactly one preferred way to do many common things. I went through the legendary Python wiki page, PerformanceTips, and summed up a few "don't do this, do that" rules.

Don't do this when merging strings:


s = ""
for substring in list:
    s += substring

Be brief:


s = "".join(slist)
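You can check the difference yourself with the standard timeit module; this is just a sketch with made-up data:

```python
import timeit

parts = ["spam"] * 1000

def concat_plus():
    # Repeated += builds many intermediate strings
    s = ""
    for substring in parts:
        s += substring
    return s

def concat_join():
    # join computes the final size once and copies each piece once
    return "".join(parts)

assert concat_plus() == concat_join()
t_plus = timeit.timeit(concat_plus, number=1000)
t_join = timeit.timeit(concat_join, number=1000)
print(t_plus, t_join)
```

The exact ratio depends on the interpreter (CPython has an in-place optimization for `+=` on strings), which is exactly why measuring beats guessing.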

Don't do this when formatting strings:


out = "<html>" + head + prologue + query + tail + "</html>"

Be brief:


out = "<html>%s%s%s%s</html>" % (head, prologue, query, tail)
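On Python 3, str.format and (from 3.6) f-strings are equivalent alternatives to the % operator; the values below are placeholders for illustration:

```python
head, prologue, query, tail = "<head/>", "<body>", "q=1", "</body>"

# Three equivalent ways to format the same string
out_percent = "<html>%s%s%s%s</html>" % (head, prologue, query, tail)
out_format = "<html>{}{}{}{}</html>".format(head, prologue, query, tail)
out_fstring = f"<html>{head}{prologue}{query}{tail}</html>"

assert out_percent == out_format == out_fstring
print(out_fstring)
```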

Don't use an explicit loop when you don't have to. For example, avoid:


newlist = []
for word in oldlist:
    newlist.append(word.upper())

Be brief:


newlist = map(str.upper, oldlist)

Or use a list comprehension:


newlist = [s.upper() for s in oldlist]

For counting words into a dictionary, the common pattern is:


wdict = {}
for word in words:
    if word not in wdict:
        wdict[word] = 0
    wdict[word] += 1

If most words repeat, the try/except pattern saves a lot of membership tests:


wdict = {}
for word in words:
    try:
        wdict[word] += 1
    except KeyError:
        wdict[word] = 1
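For what it's worth, the standard library also covers this counting pattern directly with collections.Counter and collections.defaultdict; a sketch with made-up words:

```python
from collections import Counter, defaultdict

words = ["spam", "eggs", "spam", "spam", "ham"]

# The try/except pattern from the text
wdict = {}
for word in words:
    try:
        wdict[word] += 1
    except KeyError:
        wdict[word] = 1

# Counter does the whole job in one call
counted = Counter(words)

# defaultdict(int) supplies the missing 0 automatically
dd = defaultdict(int)
for word in words:
    dd[word] += 1

assert wdict == dict(counted) == dict(dd)
```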

Minimize the number of function calls; move the loop inside the function instead of calling it inside the loop.


import time

x = 0
def doit1(i):
    global x
    x = x + i

numbers = range(100000)
t = time.time()
for i in numbers:
    doit1(i)

Be brief:


import time

x = 0
def doit2(numbers):
    global x
    for i in numbers:
        x = x + i

numbers = range(100000)
t = time.time()
doit2(numbers)
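To actually measure the gap, the two styles can be timed side by side. This sketch avoids globals so each run starts fresh:

```python
import timeit

def loop_with_calls(numbers):
    total = 0
    def doit(i):
        nonlocal total
        total += i
    for i in numbers:
        doit(i)      # one function call per element
    return total

def loop_inlined(numbers):
    total = 0
    for i in numbers:
        total += i   # no call overhead inside the loop
    return total

numbers = list(range(100000))
assert loop_with_calls(numbers) == loop_inlined(numbers) == sum(numbers)
t_calls = timeit.timeit(lambda: loop_with_calls(numbers), number=10)
t_inline = timeit.timeit(lambda: loop_inlined(numbers), number=10)
print(t_calls, t_inline)
```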

The third move: snake strike: high-speed lookups

This one comes partly from an IBM article on Python code performance optimization techniques. The best lookup complexity you can get is O(1), and that means a hash table. I learned some data structures as an undergraduate: a membership test on a Python list scans the elements one by one, so if the list is large, searching with if X in list_a is a very inefficient O(n) operation.

I use Python's tuples very little, so no comment there. The other two I use a lot, set and dict, are both implemented as hash tables.

So try not to do this:


k = [10,20,30,40,50,60,70,80,90]
for i in xrange(10000):
    if i in k:
        # Do something
        continue

Be brief:


k = [10,20,30,40,50,60,70,80,90]
# Convert the list to a dictionary first
k_dict = {i: 0 for i in k}
for i in xrange(10000):
    if i in k_dict:
        # Do something
        continue
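A set gives the same average O(1) membership test as the dict-of-zeros trick, without the dummy values; on Python 3, range replaces xrange:

```python
k = [10, 20, 30, 40, 50, 60, 70, 80, 90]
k_set = set(k)   # hash-based, average O(1) membership tests

# Same scan with list vs set membership; the results are identical,
# but the set version does no linear scanning
hits_list = [i for i in range(10000) if i in k]
hits_set = [i for i in range(10000) if i in k_set]
assert hits_list == hits_set == k
```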

To find the intersection of two lists, don't say:


list_a = [1,2,3,4,5]
list_b = [4,5,6,7,8]
list_common = [a for a in list_a if a in list_b]

Be brief:


list_a = [1,2,3,4,5]
list_b = [4,5,6,7,8]
list_common = set(list_a) & set(list_b)
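One caveat worth knowing: the & operator returns a set, not a list, and the original element order is lost; convert back if you need a list:

```python
list_a = [1, 2, 3, 4, 5]
list_b = [4, 5, 6, 7, 8]

common = set(list_a) & set(list_b)   # a set, not a list
common_list = sorted(common)         # back to an ordered list

assert common == {4, 5}
assert common_list == [4, 5]
```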

The fourth move: little snake... I can't think of a name for this one, so: just Tips

Swapping variables needs no temporary: a, b = b, a.
If you use Python 2.x, replace range with xrange; in Python 3.x, range already behaves like xrange, and xrange is gone. Unlike range, which builds a whole list, xrange yields values lazily and so saves memory.
You can write x > y > z instead of x > y and y > z. It is both more efficient and more readable.
Is add(x, y) really faster than x + y, as is often claimed? First, add cannot be used directly; you have to import operator. Second, my experiments show add(x, y) is no faster than x + y, not to mention the readability sacrifice.
while 1 is actually a little faster than while True in Python 2, where True is an ordinary name that must be looked up each iteration; in two experiments of mine it was about 15 percent faster. (In Python 3, True is a keyword and the two compile identically.)
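The first three tips can be verified in a few lines:

```python
# Swap without a temporary variable
a, b = 1, 2
a, b = b, a
assert (a, b) == (2, 1)

# Chained comparison: one expression, y evaluated only once
x, y, z = 3, 2, 1
assert (x > y > z) == (x > y and y > z)

# range in Python 3 is lazy like Python 2's xrange:
# indexing works without materializing a billion-element list
r = range(10 ** 9)
assert r[5] == 5
```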
The last move: no snake beats having a snake: performance beyond the code

Beyond the code and the hardware, there is the interpreter, and here PyPy is highly recommended. PyPy uses a just-in-time (JIT) compiler: rather than compiling the whole program ahead of time like a static compiler, it compiles code as it runs. I saw a vivid metaphor for the difference on Zhihu:

If you're a director, static compilation is like having the actors memorize the entire script and then perform it for an hour straight. Dynamic compilation is having an actor perform for two minutes, stop and think, check the script, then act for another two minutes...

Dynamic and static compilation each have their merits; it depends on whether you are shooting a film or staging a play.

There is also Cython, which lets you compile some of your code into C extensions for Python. I use it very little, but it really delivers at critical moments.
