Python version pitfall: an example where md5 in python2 differs from md5 in python3
- 2020-06-07 04:41:30
- OfStack
Getting started
For some characters, md5 produces a different digest in python2 than in python3.
# python2.7
import hashlib
pwd = "xxx" + chr(163) + "fj"
checkcode = hashlib.md5(pwd).hexdigest()
print checkcode # ea25a328180680aab82b2ef8c456b4ce
# python3.6
import hashlib
pwd = "xxx" + chr(163) + "fj"
checkcode = hashlib.md5(pwd.encode("utf-8")).hexdigest()
print(checkcode) # b517e074034d1913b706829a1b9d1b67
The difference in the code is that in python3 the string must be encoded with encode before hashing; otherwise an error is raised:
checkcode = hashlib.md5(pwd).hexdigest()
TypeError: Unicode-objects must be encoded before hashing
This is because md5 operates on the bytes type, so the string has to be converted to bytes first. The default encoding in python3 is utf-8, so utf-8 was used to encode it.
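A minimal python3 sketch of this behaviour, using the same sample string. The helper name md5_hex is hypothetical and only there for illustration:

```python
import hashlib

def md5_hex(text, encoding="utf-8"):
    """Hypothetical helper: encode text to bytes, then return the md5 hex digest."""
    return hashlib.md5(text.encode(encoding)).hexdigest()

pwd = "xxx" + chr(163) + "fj"

try:
    hashlib.md5(pwd)      # a str is rejected: md5 only accepts bytes-like objects
    raised = False
except TypeError:
    raised = True

print(raised)                 # True on python3
print(pwd.encode("utf-8"))    # the bytes that actually get hashed
print(md5_hex(pwd))           # digest of the utf-8 bytes
```

The digest depends entirely on which bytes are produced, so the choice of encoding is part of the input to md5.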
Analysis
If the string does not contain chr(163), both versions give the same result, so the problem lies in chr(163):
# python2.7
>>> chr(163)
'\xa3'
# python3.6
>>> chr(163)
'\xa3'
So chr itself returns the same result in both versions. Next, convert it to the bytes type and compare:
# python2.7
>>> bytes(chr(163))
'\xa3'
# python3.6
>>> chr(163).encode()
b'\xc2\xa3'
In python3, when num < 128, chr(num).encode('utf-8') yields a single byte, the character's ASCII value. When num >= 128 (up to 0x7ff), chr(num).encode('utf-8') yields two bytes; larger code points take three or four bytes.
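The byte counts can be checked directly in python3. This is a small sketch; the sample code points are chosen only for illustration:

```python
# Number of UTF-8 bytes produced for a few code points in python3
samples = (65, 127, 128, 163, 0x7FF, 0x800)

sizes = {}
for num in samples:
    encoded = chr(num).encode("utf-8")
    sizes[num] = len(encoded)
    print(num, len(encoded), encoded)

# chr(163) becomes a two-byte sequence, which is what changes the md5 input
print(chr(163).encode("utf-8"))  # b'\xc2\xa3'
```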
Solution
Switch to the latin1 encoding:
# python3.6
import hashlib
pwd = "xxx" + chr(163) + "fj"
checkcode = hashlib.md5(pwd.encode("latin1")).hexdigest()
print(checkcode) # ea25a328180680aab82b2ef8c456b4ce
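To see why this works, it may help to compare the bytes that each encoding hands to md5. A sketch in python3, with variable names chosen for illustration:

```python
import hashlib

pwd = "xxx" + chr(163) + "fj"

latin1_bytes = pwd.encode("latin1")   # one byte per character, like a python2 str
utf8_bytes = pwd.encode("utf-8")      # chr(163) becomes two bytes

print(latin1_bytes)  # b'xxx\xa3fj'
print(utf8_bytes)    # b'xxx\xc2\xa3fj'

# Same bytes as the python2 string, so the same digest; utf-8 gives different bytes
same = hashlib.md5(latin1_bytes).hexdigest() == hashlib.md5(b"xxx\xa3fj").hexdigest()
differs = hashlib.md5(latin1_bytes).hexdigest() != hashlib.md5(utf8_bytes).hexdigest()
print(same, differs)  # True True
```

Note that latin1 can only encode code points 0-255; a string containing anything above that raises UnicodeEncodeError, so this fix applies only to strings built from such characters.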
Additional notes
Why the latin1 encoding? The answer is interesting.
Start with the chr function. Running help(chr) shows:
chr(...)
chr(i) -> Unicode character
Return a Unicode string of one character with ordinal i; 0 <= i <= 0x10ffff.
That is, it returns the character at the given code point in Unicode. Calling encode then converts it to the bytes type.
In ASCII, each character is encoded as 1 byte, but only code points 0-127 are covered. The range 128-255 belongs to Extended ASCII, which the ascii codec in python3 does not include, so chr(163).encode("ascii") raises an error. For the pwd string above, pwd.encode("ascii") reports: 'ascii' codec can't encode character '\xa3' in position 3: ordinal not in range(128)
Therefore, an encoding is needed that contains the characters in 128-255 and uses a fixed size of one byte per character, such as ISO 8859-1, that is, latin1. Of course, other encodings such as cp1252 also contain these characters.
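These claims can be checked in python3 with a short sketch; the variable names are for illustration only:

```python
ch = chr(163)

# The ascii codec only covers 0-127, so this fails
try:
    ch.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError as exc:
    ascii_failed = True
    print(exc)

# Both single-byte encodings map chr(163) to the byte 0xa3
print(ch.encode("latin1"))  # b'\xa3'
print(ch.encode("cp1252"))  # b'\xa3'

# latin1 maps every code point 0-255 to the byte with the same value
round_trip = all(chr(i).encode("latin1") == bytes([i]) for i in range(256))
print(round_trip)  # True
```

The last check is the reason latin1 reproduces the python2 result: for code points 0-255 the byte value equals the code point, which matches how a python2 str stores chr(i).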