Detailed explanation of the difference between Python's copy and deepcopy

  • 2021-07-18 08:17:44
  • OfStack

Recently, during my internship, my boss gave me a small Python task. While learning, I found that copy() and deepcopy() look so alike that at first I could not tell them apart. Wanting to get to the bottom of it, I looked up the details to work out the difference between the two.

In fact, telling copy() and deepcopy() apart requires understanding how Python stores data.

First, the conclusion:

- What we usually mean by copying is a deep copy: the object is duplicated completely, and the result exists as a new, independent object. Changing the original therefore does not affect the copy.

- A shallow copy is not fully independent. It creates a new outer object, but for the contents it only puts new labels on the same data. When that shared data is changed through one label, the change is visible through the other label too. This differs from copying in the ordinary sense.

For simple objects, such as a flat list of numbers or strings, there is no visible difference between a shallow copy and a deep copy.
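The flat case can be checked with a minimal sketch. The list contents and the names flat, shallow and deep are only illustrative; they do not come from the original example.

```python
import copy

# A flat list holding only immutable values (illustrative data).
flat = [1, 2, "three"]

shallow = copy.copy(flat)
deep = copy.deepcopy(flat)

# Both copies are new list objects, distinct from the original.
assert shallow is not flat
assert deep is not flat

# Replacing an element changes a slot in `flat` only.
flat[0] = "changed"

print(flat)     # ['changed', 2, 'three']
print(shallow)  # [1, 2, 'three']
print(deep)     # [1, 2, 'three']
```

With no nested mutable objects there is nothing left to share, so the two kinds of copy behave the same way here.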

For complex objects, such as a list inside a list, the inner list of a shallow copy is not really "independent" of the original. That is, if you change an element of the inner list, which is a child object, the shallow copy changes with it. This is different from our intuitive understanding of "copying".

If the description is hard to follow, let's look at the code:


>>> import copy
>>> origin = [1, 2, [3, 4]]
# origin has 3 elements: 1, 2, [3, 4]
>>> cop1 = copy.copy(origin)
>>> cop2 = copy.deepcopy(origin)
>>> cop1 == cop2
True
>>> cop1 is cop2
False
# cop1 and cop2 look the same, but they are not the same object
>>> origin[2][0] = "hey!"
>>> origin
[1, 2, ['hey!', 4]]
>>> cop1
[1, 2, ['hey!', 4]]
>>> cop2
[1, 2, [3, 4]]
# One element of the nested list [3, 4] inside origin was changed; compare cop1 and cop2

You can see that cop1, the shallow copy, changed along with origin, while cop2, the deep copy, did not.

It seems that a deep copy is more in line with our intuitive definition of "copying": once something is copied, it should be independent. If we want a "copy" in the literal sense, just use copy.deepcopy().

So why is there a "fake" copy like the shallow copy? That is where it gets interesting.

How Python stores data

Python handles variables differently from languages in which a variable is a box that holds a value. Rather than storing a value in the variable, Python makes the variable a reference to a specific object.

When you write a = something in Python, think of it as sticking a label named a onto something. When you later assign to a again, it is like peeling the label a off the original something and sticking it onto another object, which creates a new reference. This explains some behavior in Python that may look strange at first:


>>> a = [1, 2, 3]
>>> b = a
>>> a = [4, 5, 6]  # assign a new value to a
>>> a
[4, 5, 6]
>>> b
[1, 2, 3]
# After a is given a new value, b does not change with it

>>> a = [1, 2, 3]
>>> b = a
>>> a[0], a[1], a[2] = 4, 5, 6  # change the elements of the original list
>>> a
[4, 5, 6]
>>> b
[4, 5, 6]
# After the value of a changes, b changes with it

In both snippets above, the value of a changes. The difference is that in the first snippet a new value is assigned to a directly (from [1, 2, 3] to [4, 5, 6]), while in the second each element of the list is changed individually.

The effect on b differs: in one case b keeps its value, and in the other it changes. How does the label model above explain this?

Start by treating [1, 2, 3] as an item. a = [1, 2, 3] sticks the label a on that item, and b = a sticks a second label, b, on the same item.

Case 1:

a = [4, 5, 6] is equivalent to tearing the a label from [1, 2, 3] and pasting it on [4, 5, 6].

In the process, [1, 2, 3] does not disappear. The label b stays on [1, 2, 3] the whole time, since that reference never changed. So the value of b is unchanged.

Case 2:

a[0], a[1], a[2] = 4, 5, 6 alters the list [1, 2, 3] itself. Every element inside it is replaced. After this internal modification, the list that was [1, 2, 3] has become [4, 5, 6].

In the process, neither a nor b moved; both labels are still attached to the same item. So the values of both a and b naturally become [4, 5, 6].
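The two cases can also be checked with the is operator, which tests whether two names are attached to the same object. This is a small sketch that reuses the values from the snippets above; the helper name b_after_case1 is only illustrative.

```python
# Case 1: rebinding moves the label `a` to a different list.
a = [1, 2, 3]
b = a                      # second label on the same list
assert a is b
a = [4, 5, 6]              # `a` now labels a new list
assert a is not b
b_after_case1 = list(b)    # snapshot of b, still the old contents

# Case 2: mutating the list in place keeps both labels on it.
a = [1, 2, 3]
b = a
a[0], a[1], a[2] = 4, 5, 6
assert a is b              # still one shared list

print(b_after_case1)  # [1, 2, 3]
print(b)              # [4, 5, 6]
```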

With this understood, we have to ask: when a complex object is shallow-copied, what actually happens during the copy?
Look at the earlier code once more:


>>> import copy
>>> origin = [1, 2, [3, 4]]
# origin has 3 elements: 1, 2, [3, 4]
>>> cop1 = copy.copy(origin)
>>> cop2 = copy.deepcopy(origin)
>>> cop1 == cop2
True
>>> cop1 is cop2
False
# cop1 and cop2 look the same, but they are not the same object
>>> origin[2][0] = "hey!"
>>> origin
[1, 2, ['hey!', 4]]
>>> cop1
[1, 2, ['hey!', 4]]
>>> cop2
[1, 2, [3, 4]]
# One element of the nested list [3, 4] inside origin was changed; compare cop1 and cop2

Those who have used Docker should be familiar with the concept of an image. We can apply the idea of an image to copying.

copy does not duplicate the child objects of a complex object. What is a child object? For example, a sequence nested in a sequence, or a sequence nested in a dictionary, is a child object of a complex object. A shallow copy treats each child object like a shared image: the original and every shallow copy hold only a reference to it. So when the image is changed through one reference, everyone else using that image sees the change.

Now look at origin[2] here, the list [3, 4]. By the definition of a shallow copy, cop1[2] refers to the same list [3, 4]. So changing that list changes what both origin and cop1 show. That is why, after origin[2][0] = "hey!", cop1 becomes [1, 2, ['hey!', 4]].

deepcopy makes a separate, independent copy at every level of a complex object.

After a deep copy, origin[2] and cop2[2] are both equal to [3, 4], but they are no longer the same list. That is copying in our ordinary sense.
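The sharing described here can be made visible with is. The sketch below reuses origin, cop1 and cop2 from the example; the append call and the three flag names are additions for illustration, included to show that only the inner list is shared while the outer list of the shallow copy is its own object.

```python
import copy

origin = [1, 2, [3, 4]]
cop1 = copy.copy(origin)      # shallow copy
cop2 = copy.deepcopy(origin)  # deep copy

outer_shared = cop1 is origin                # False: new outer list
inner_shared = cop1[2] is origin[2]          # True: same inner list
deep_inner_shared = cop2[2] is origin[2]     # False: inner list was copied

origin[2][0] = "hey!"  # mutate the shared inner list
origin.append(5)       # mutate only origin's outer list

print(origin)  # [1, 2, ['hey!', 4], 5]
print(cop1)    # [1, 2, ['hey!', 4]]
print(cop2)    # [1, 2, [3, 4]]
```

Note that cop1 does not gain the appended 5: a shallow copy still gets its own outer container and shares only the children.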
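As a rough check that this holds at every level, here is a sketch with deeper nesting, including a list stored in a dictionary. The structure and the name cop are made up for illustration.

```python
import copy

# Illustrative structure: a list in a list in a list, plus a list in a dict.
origin = [1, [2, [3, 4]], {"k": [5]}]
cop = copy.deepcopy(origin)

# Equal in value, but a distinct object at every nesting level.
assert cop == origin
assert cop is not origin
assert cop[1] is not origin[1]
assert cop[1][1] is not origin[1][1]
assert cop[2] is not origin[2]
assert cop[2]["k"] is not origin[2]["k"]

# Changes deep inside the original do not reach the deep copy.
origin[1][1][0] = "hey!"
origin[2]["k"].append(6)

print(cop)  # [1, [2, [3, 4]], {'k': [5]}]
```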

