Introduction to Python serialization with the pickle and cPickle modules

  • 2020-04-02 14:27:36
  • OfStack

The concept of serialization in Python is simple. You have a data structure in memory that you want to save, reuse later, or send to someone else. How do you do that? It depends on how you want to save it, how you want to reuse it, and who you want to send it to. Many games let you save your progress when you exit and pick up where you left off when you restart. In that case, a data structure that captures the current progress has to be written to the hard drive when you exit and loaded from the hard drive when you restart.

The Python standard library (Python 2, which the examples in this article use) provides the pickle and cPickle modules. cPickle is implemented in C, which makes it faster than pickle, but the types defined in the cPickle module cannot be subclassed (in practice we rarely need to subclass them, so cPickle is recommended). The serialization/deserialization rules of cPickle are the same as those of pickle: an object serialized with pickle can be deserialized with cPickle, and vice versa. Both modules also handle self-referential structures sensibly: they do not recurse into a self-referential object indefinitely, and when the same object is referenced several times it is serialized only once.
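A minimal sketch of the shared-reference and self-reference behaviour described above. The variable names are illustrative, and the import fallback is an assumption added so that the snippet also runs on Python 3, where cPickle no longer exists as a separate module.

```python
try:
    import cPickle as pickle  # Python 2: C implementation
except ImportError:
    import pickle  # Python 3: no separate cPickle module

# Two dictionary values that point at the very same list object.
shared = ['x']
container = {'a': shared, 'b': shared}

# A list that contains itself.
loop = []
loop.append(loop)

container_copy = pickle.loads(pickle.dumps(container))
loop_copy = pickle.loads(pickle.dumps(loop))

# After the round trip, the two values are still one object,
# and the restored list still contains itself.
same_object_preserved = container_copy['a'] is container_copy['b']
cycle_preserved = loop_copy[0] is loop_copy
print(same_object_preserved, cycle_preserved)
```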

The two main functions in the pickle module are dump() and load(). The dump() function takes a data object and a file object as arguments and saves the data object to that file in a particular format. When we use the load() function to read the saved object back from the file, pickle knows how to restore it to its original form.
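As a rough illustration of dump() and load(), here is a sketch of the game-save scenario from the introduction. The progress dictionary and the file name savegame.pkl are hypothetical, and a temporary directory is assumed so that nothing is written into the current directory.

```python
import os
import tempfile

try:
    import cPickle as pickle  # Python 2: C implementation
except ImportError:
    import pickle  # Python 3

# Hypothetical game state to be saved on exit.
progress = {'level': 3, 'score': 1250, 'inventory': ['key', 'map']}

save_path = os.path.join(tempfile.mkdtemp(), 'savegame.pkl')

# On exit: write the object to disk.
with open(save_path, 'wb') as f:
    pickle.dump(progress, f)

# On restart: read it back.
with open(save_path, 'rb') as f:
    restored = pickle.load(f)

print(restored == progress)
```

Opening the file in binary mode ('wb' and 'rb') is the safer choice, since it works for every protocol and on every platform.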

The dumps() function performs the same serialization as the dump() function. Instead of taking a stream object and writing the serialized data to a disk file, it simply returns the serialized data.
The loads() function performs the same deserialization as the load() function. Instead of taking a stream object and reading the serialized data from a file, it takes a str object containing the serialized data and returns the reconstructed object directly.
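The in-memory variants can be sketched as follows. The dictionary is the one used in the example later in this article. Note as an assumption of this sketch that on Python 3 the serialized data is a bytes object rather than a str.

```python
try:
    import cPickle as pickle  # Python 2: C implementation
except ImportError:
    import pickle  # Python 3

info_dict = {'name': 'yeho', 'age': 100, 'Lang': 'Python'}

# Serialize to an in-memory string instead of a file.
data = pickle.dumps(info_dict)

# Rebuild an equal, but separate, object from that string.
info_copy = pickle.loads(data)

print(info_copy == info_dict)
```

This form is convenient when the serialized data is going into a database field or over a network connection rather than into a file.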

cPickle.dump(obj, file, protocol=0)
Serializes obj and writes the resulting data stream to the file object file. The protocol parameter selects the serialization format. Its default value is 0, which means a text (ASCII) format; the values 1 and 2 select binary formats.
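The effect of the protocol parameter can be seen in a small sketch like the one below. The protocol is passed explicitly in both calls because the default differs between Python versions. The checks rely on two facts: protocol 0 output for this dictionary is plain ASCII, and protocol 2 output begins with the marker bytes \x80\x02.

```python
try:
    import cPickle as pickle  # Python 2: C implementation
except ImportError:
    import pickle  # Python 3

record = {'name': 'yeho', 'age': 100, 'Lang': 'Python'}

text_form = pickle.dumps(record, 0)    # protocol 0: text format
binary_form = pickle.dumps(record, 2)  # protocol 2: binary format

# bytearray() yields integers on both Python 2 and Python 3.
is_ascii = all(byte < 128 for byte in bytearray(text_form))
has_proto_marker = bytearray(binary_form)[:2] == bytearray(b'\x80\x02')

# load()/loads() detect the protocol automatically.
same_result = pickle.loads(text_form) == pickle.loads(binary_form) == record

print(is_ascii, has_proto_marker, same_result)
```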

cPickle.load(file)
Deserializes an object: reads the pickled data from the file object file and reconstructs the Python object it describes.

Here is a simple example to illustrate the use of the above two methods:


>>> import pickle,cPickle
>>> info_dict = {'name':'yeho','age':100,'Lang':'Python'}
>>> f = open('info.pkl','wb')
>>> pickle.dump(info_dict,f)
>>> f.close()
>>> exit()

# cat info.pkl
(dp0
S'Lang'
p1
S'Python'
p2
sS'age'
p3
I100
sS'name'
p4
S'yeho'
p5
s.

>>> import cPickle
>>> info_dict
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'info_dict' is not defined
>>> f = open('info.pkl','r+')
>>> info2_dict = cPickle.load(f)
>>> info2_dict
{'Lang': 'Python', 'age': 100, 'name': 'yeho'}
>>> info2_dict['age'] = 110
>>> cPickle.dump(info2_dict,f)
>>> f.close()
>>> exit()

>>> import pickle
>>> f = open('info.pkl','r+')
>>> info_dict = pickle.load(f)
>>> info_dict
{'Lang': 'Python', 'age': 100, 'name': 'yeho'}
>>> info2_dict = pickle.load(f)
>>> info2_dict
{'Lang': 'Python', 'age': 110, 'name': 'yeho'}
>>> info3_dict = pickle.load(f)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.6/pickle.py", line 1370, in load
    return Unpickler(file).load()
  File "/usr/lib64/python2.6/pickle.py", line 858, in load
    dispatch[key](self)
  File "/usr/lib64/python2.6/pickle.py", line 880, in load_eof
    raise EOFError
EOFError
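The session above shows that each dump() call appends one pickled object to the file, that each load() call reads one back, and that a load() past the last object raises EOFError. One possible way to read everything from such a file is to loop until that exception occurs. The helper name load_all, the file name, and the temporary directory are assumptions of this sketch, not part of the pickle API.

```python
import os
import tempfile

try:
    import cPickle as pickle  # Python 2: C implementation
except ImportError:
    import pickle  # Python 3


def load_all(path):
    """Return every object pickled into the file at path, in order."""
    objects = []
    with open(path, 'rb') as f:
        while True:
            try:
                objects.append(pickle.load(f))
            except EOFError:
                # No more pickled objects in the file.
                break
    return objects


path = os.path.join(tempfile.mkdtemp(), 'info.pkl')
first = {'name': 'yeho', 'age': 100, 'Lang': 'Python'}
second = dict(first, age=110)

# Append two pickles to the same file, one dump() call each.
for snapshot in (first, second):
    with open(path, 'ab') as f:
        pickle.dump(snapshot, f)

snapshots = load_all(path)
print(len(snapshots))
```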
