Python collections module example

  • 2020-04-02 13:34:00
  • OfStack

Basic introduction to the collections module

As we all know, Python has some built-in data types, such as str, int, list, tuple, and dict. The collections module provides several additional data types on top of these built-in types:

1. namedtuple(): generates a subclass of tuple whose elements can be accessed by name
2. deque: double-ended queue, which can quickly append and pop objects at either end
3. Counter: a counter, mainly used for counting
4. OrderedDict: ordered dictionary
5. defaultdict: dictionary with default values

namedtuple()

namedtuple is primarily used to produce data objects whose elements can be accessed by name. It is often used to make code more readable, especially code that works with tuple data.

For example:


# -*- coding: utf-8 -*-
"""
Say we have a data structure in which every object is a three-element tuple.
With namedtuple we can easily turn those tuples into a more readable and
more convenient data structure.
"""
from collections import namedtuple
websites = [
    ('Sohu', 'http://www.google.com/', u'张朝阳'),
    ('Sina', 'http://www.sina.com.cn/', u'王志东'),
    ('163', 'http://www.163.com/', u'丁磊')
]
Website = namedtuple('Website', ['name', 'url', 'founder'])
for website in websites:
    # _make builds a Website from an existing sequence
    website = Website._make(website)
    print website
# Result:
# Website(name='Sohu', url='http://www.google.com/', founder=u'\u5f20\u671d\u9633')
# Website(name='Sina', url='http://www.sina.com.cn/', founder=u'\u738b\u5fd7\u4e1c')
# Website(name='163', url='http://www.163.com/', founder=u'\u4e01\u78ca')
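The loop above only prints whole records, so here is a minimal sketch of the by-name access that namedtuple is meant for. It uses print() calls so that it runs on Python 3; the founder value 'Ding Lei' and the https URL are illustrative assumptions, not data from the article.

```python
from collections import namedtuple

Website = namedtuple('Website', ['name', 'url', 'founder'])

# One record; the field values are illustrative only.
site = Website(name='163', url='http://www.163.com/', founder='Ding Lei')

# Fields can be read by name as well as by position.
print(site.name)   # 163
print(site[1])     # http://www.163.com/

# It is still a tuple, so positional unpacking works.
name, url, founder = site

# Instances are immutable; _replace returns a modified copy.
moved = site._replace(url='https://www.163.com/')

# _asdict maps field names to values.
as_dict = site._asdict()
print(as_dict['founder'])  # Ding Lei
```

Because the result is a real tuple subclass, existing code that indexes or unpacks the plain tuples keeps working after the switch.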

deque

deque is short for double-ended queue. Its greatest benefit is that it can quickly add and remove objects at the head of the queue: .appendleft(), .popleft().

You might say that a native list can also add and remove objects at the head, right? Like this:


l.insert(0, v)
l.pop(0)

It is worth noting, however, that both of these list operations have O(n) time complexity, which means the time they take grows linearly with the number of elements. The corresponding deque operations are O(1), so remember to use deque when your code has this kind of requirement.
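As a rough illustration of the comparison above, the following sketch performs the same head operations on a list and on a deque. The variable names are made up for this example, and it shows only that the results match, not the timing difference.

```python
from collections import deque

items = [1, 2, 3]

# list: both calls shift every remaining element, so each is O(n).
l = list(items)
l.insert(0, 0)
first_from_list = l.pop(0)

# deque: the equivalent calls run in O(1).
d = deque(items)
d.appendleft(0)
first_from_deque = d.popleft()

print(first_from_list, first_from_deque)  # 0 0
print(l, list(d))                         # [1, 2, 3] [1, 2, 3]
```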

As a double-ended queue, deque also provides some other handy methods, such as rotate.

For example:


# -*- coding: utf-8 -*-
"""
An interesting example that uses deque's rotate method to implement
an endlessly looping loading animation.
"""
import sys
import time
from collections import deque
fancy_loading = deque('>--------------------')
while True:
    # '\r' moves the cursor back to the start of the line,
    # so each frame overwrites the previous one
    print '\r%s' % ''.join(fancy_loading),
    fancy_loading.rotate(1)
    sys.stdout.flush()
    time.sleep(0.08)
# Result:
# An endless loop of racing lights
# ------------->-------
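Since the animation never terminates, a bounded sketch may make the effect of rotate easier to inspect. The short deques and variable names here are assumptions chosen for illustration.

```python
from collections import deque

d = deque([1, 2, 3, 4, 5])

d.rotate(1)       # positive n rotates right: the last item moves to the front
right = list(d)   # [5, 1, 2, 3, 4]

d.rotate(-2)      # negative n rotates left, here by two steps
left = list(d)    # [2, 3, 4, 5, 1]

# The first four frames of a shortened loading bar.
bar = deque('>---')
frames = []
for _ in range(4):
    frames.append(''.join(bar))
    bar.rotate(1)

print(right, left)
print(frames)     # ['>---', '->--', '-->-', '--->']
```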


Counter

Counting is a very common requirement, and collections is kind enough to provide a type for it.

For example:


# -*- coding: utf-8 -*-
"""
The following example uses Counter to count the occurrences of
every character in a passage of text.
"""
from collections import Counter
s = '''A Counter is a dict subclass for counting hashable objects. It is an unordered collection where elements are stored as dictionary keys and their counts are stored as dictionary values. Counts are allowed to be any integer value including zero or negative counts. The Counter class is similar to bags or multisets in other languages.'''.lower()
c = Counter(s)
# Get the 5 most frequent characters
print c.most_common(5)
# Result:
# [(' ', 54), ('e', 32), ('s', 25), ('a', 24), ('t', 24)]
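Counter works on any iterable of hashable items, not only on characters. The sketch below counts words instead and touches on the zero and negative counts mentioned in the quoted text; the sample sentence is made up for illustration.

```python
from collections import Counter

words = 'the quick brown fox jumps over the lazy dog the end'.split()
c = Counter(words)

print(c['the'])          # 3
print(c['cat'])          # 0: a missing key counts as zero, no KeyError
print(c.most_common(1))  # [('the', 3)]

# update() adds counts; subtract() removes them and may go negative.
c.update(['fox', 'fox'])
c.subtract(['dog', 'dog'])
print(c['fox'], c['dog'])  # 3 -1
```

Looking up a missing key such as 'cat' returns 0 without adding the key to the counter.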

OrderedDict

In Python 2, which the examples here use, dict is an unordered data structure because of the way hashing works, and this sometimes gets us into trouble (since Python 3.7, the built-in dict preserves insertion order). Fortunately, the collections module provides OrderedDict, which is the right tool when you need a dictionary that remembers the order in which keys were inserted.

For example:


# -*- coding: utf-8 -*-
from collections import OrderedDict
items = (
    ('A', 1),
    ('B', 2),
    ('C', 3)
)
regular_dict = dict(items)
ordered_dict = OrderedDict(items)
print 'Regular Dict:'
for k, v in regular_dict.items():
    print k, v
print 'Ordered Dict:'
for k, v in ordered_dict.items():
    print k, v
# Result:
# Regular Dict:
# A 1
# C 3
# B 2
# Ordered Dict:
# A 1
# B 2
# C 3
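Beyond iteration order, a few OrderedDict behaviours follow from the fact that it tracks insertion order. This is a minimal sketch with made-up keys, written with print() calls so that it runs on Python 3.

```python
from collections import OrderedDict

od = OrderedDict()
od['first'] = 1
od['second'] = 2
od['third'] = 3

# Updating an existing key keeps its original position.
od['first'] = 10
keys_after_update = list(od.keys())    # ['first', 'second', 'third']

# Deleting and re-inserting a key moves it to the end.
del od['first']
od['first'] = 10
keys_after_reinsert = list(od.keys())  # ['second', 'third', 'first']

# popitem(last=False) removes the oldest entry (FIFO order).
oldest = od.popitem(last=False)        # ('second', 2)

# Equality between two OrderedDicts is order-sensitive.
a = OrderedDict([('x', 1), ('y', 2)])
b = OrderedDict([('y', 2), ('x', 1)])
same_order = (a == b)                  # False
same_content = (dict(a) == dict(b))    # True

print(keys_after_update, keys_after_reinsert, oldest)
print(same_order, same_content)
```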

defaultdict

We all know that with Python's native dict, accessing a key as d[key] raises a KeyError when that key does not exist.

With defaultdict, however, you pass in a default factory, and accessing a nonexistent key calls that factory and stores its result as the value for the key.


# -*- coding: utf-8 -*-
from collections import defaultdict
members = [
    # Sex, name
    ['male', 'John'],
    ['male', 'Jack'],
    ['female', 'Lily'],
    ['male', 'Pony'],
    ['female', 'Lucy'],
]
result = defaultdict(list)
for sex, name in members:
    result[sex].append(name)
print result
# Result:
# defaultdict(<type 'list'>, {'male': ['John', 'Jack', 'Pony'], 'female': ['Lily', 'Lucy']})
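To make the contrast with a plain dict concrete, here is a small sketch using int as the factory, so missing keys start at 0. The sample string and variable names are assumptions for illustration.

```python
from collections import defaultdict

# A plain dict raises KeyError for a missing key.
plain = {}
try:
    plain['missing']
    raised = False
except KeyError:
    raised = True

# With int as the factory, a missing key starts at int(), which is 0.
counts = defaultdict(int)
for ch in 'abracadabra':
    counts[ch] += 1

# Merely reading a missing key with [] inserts it with the default value.
probe = defaultdict(list)
_ = probe['unseen']

# .get() does not call the factory and does not insert anything.
got = probe.get('other')

print(raised)                  # True
print(counts['a'])             # 5
print('unseen' in probe, got)  # True None
```

The insert-on-read behaviour is worth keeping in mind: use `key in d` or `d.get(key)` when you only want to check for a key without creating it.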

References

The above is a very brief overview of the main contents of the collections module. Its purpose is to help you remember these types so that, when you come across a good place for them, you can use them and get twice the result with half the effort.

If you want a more comprehensive and in-depth understanding of them, it is recommended that you read the official documentation and the module source code.

https://docs.python.org/2/library/collections.html#module-collections

