Python 3: Checking and Comparing MD5 Data Fingerprints

  • 2021-06-28 09:25:59
  • OfStack

MD5 (Message-Digest Algorithm 5) is a widely used cryptographic hash function that produces a 128-bit (16-byte) hash value, which is used to check that information has been transmitted completely and consistently. It was designed by the American cryptographer Ronald Linn Rivest and published in 1992 as a replacement for the MD4 algorithm.

Summary

An MD5 checksum is computed by a hash function and acts as a "fingerprint" of arbitrary data: MD5 reduces a message or file of any size to a fixed-length 128-bit digest, which is much easier to compare than the data itself when verifying integrity and correctness. Because it is extremely unlikely for two different files to have the same MD5 hash by accident, any non-malicious change to a file will almost certainly change its MD5 hash. This does not hold against deliberate tampering, since colliding inputs can be constructed on purpose. MD5 hashes are therefore used to check file integrity, particularly to detect corruption caused by file transfers, disk errors, and similar problems.
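The fingerprint idea can be sketched with Python's standard hashlib module, which the next section introduces. This is a minimal illustration, not a definitive implementation: the helper name md5_hex and the two sample byte strings are assumptions made for the example. It shows that the digest has a fixed length regardless of input, that the same input always gives the same digest, and that changing a single character gives a different one.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as 32 hexadecimal characters."""
    return hashlib.md5(data).hexdigest()


# Hypothetical sample data: the second string differs by one character.
original = b"I like Python"
modified = b"I like python"

digest_a = md5_hex(original)
digest_b = md5_hex(modified)

print(len(digest_a), len(digest_b))      # both digests are 32 hex characters
print(md5_hex(original) == digest_a)     # same input, same digest
print(digest_a == digest_b)              # different input, different digest
```

Comparing the two short digests stands in for comparing the full data, which is what makes the checksum convenient for large files.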

MD5

In Python, the standard-library module hashlib provides an MD5 implementation.


import hashlib

m = hashlib.md5()
# Stand-in for the contents of a file
src = 'I like Python'
m.update(src.encode('utf-8'))
print(m.hexdigest())

Example results:

17008b7417701b0c233b999d20c13f1d
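As a small follow-up sketch, the same digest can be obtained in one call or by feeding the data to update() in several pieces, which is what makes it possible to hash data that arrives in parts. The split point used here is arbitrary and chosen only for illustration.

```python
import hashlib

text = 'I like Python'
data = text.encode('utf-8')

# One-shot: pass all the bytes to the constructor.
one_shot = hashlib.md5(data).hexdigest()

# Incremental: repeated update() calls behave like one call
# on the concatenation of the pieces.
incremental = hashlib.md5()
incremental.update(data[:6])
incremental.update(data[6:])

print(one_shot == incremental.hexdigest())  # True
print(incremental.digest_size)              # 16 bytes, i.e. 128 bits
print(len(one_shot))                        # 32 hex characters
```

hexdigest() returns the 16-byte digest as 32 hexadecimal characters; digest() returns the raw bytes instead.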

File Verification

Suppose we have two files and need to verify that their contents are identical:


import hashlib


def out_md5(src):
  # Small wrapper: return the MD5 hex digest of a string
  m = hashlib.md5()
  m.update(src.encode('utf-8'))
  return m.hexdigest()


with open('1.txt', 'r') as f:
  src = f.read()
  m1 = out_md5(src)
  print(m1)

with open('2.txt', 'r') as f:
  src = f.read()
  m2 = out_md5(src)
  print(m2)

# Identical digests mean the file contents match
print(m1 == m2)

Example results:

bb0c1b519a0a2b8e6c74703e44538c60
43cb091db43a710d85ce45fb202438cd
False
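The example above reads each file as text and loads it into memory in one go, which suits small text files. For binary or large files, one possible variant is to open the file in binary mode and hash it in chunks. The sketch below is an assumption-laden illustration rather than the article's method: the names file_md5 and same_content, the 64 KiB chunk size, and the temporary sample files are all hypothetical.

```python
import hashlib
import os
import tempfile


def file_md5(path, chunk_size=65536):
    """Return the MD5 hex digest of a file, read in binary chunks."""
    m = hashlib.md5()
    with open(path, 'rb') as f:
        # f.read() returns b'' at end of file, which stops the loop.
        for chunk in iter(lambda: f.read(chunk_size), b''):
            m.update(chunk)
    return m.hexdigest()


def same_content(path_a, path_b):
    """Compare two files by their MD5 digests."""
    return file_md5(path_a) == file_md5(path_b)


# Demonstration with throwaway files so the sketch runs on its own.
with tempfile.TemporaryDirectory() as tmp:
    path_a = os.path.join(tmp, 'a.bin')
    path_b = os.path.join(tmp, 'b.bin')
    path_c = os.path.join(tmp, 'c.bin')
    samples = ((path_a, b'same bytes\n'),
               (path_b, b'same bytes\n'),
               (path_c, b'other bytes\n'))
    for path, payload in samples:
        with open(path, 'wb') as f:
            f.write(payload)
    result_same = same_content(path_a, path_b)
    result_diff = same_content(path_a, path_c)

print(result_same)  # True
print(result_diff)  # False
```

Reading in binary mode hashes the exact bytes on disk, so the result does not depend on the platform's text encoding or newline translation.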

