Full and differential backups of website files with Python

  • 2020-04-02 14:27:53
  • OfStack

An earlier version of this script used MD5 checksums to drive differential backups, but that approach has the following problems:

  • md5sum has trouble computing an MD5 value for some symbolic links
  • Empty directories cannot be backed up, because md5sum cannot compute an MD5 value for a directory
  • md5sum cannot detect permission changes

Solution:

Use the file's mtime and ctime instead.

mtime (modification time) changes whenever the file's contents are written.
ctime (change time, not creation time) changes whenever the inode changes: when the file is written, or when its owner, permissions, or link count change.
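
To see the difference between the two timestamps in practice, here is a minimal sketch. It is written for Python 3, unlike the Python 2 script in this article, and the helper names `stat_times` and `chmod_effect` are illustrative, not part of the original script. It changes a file's permissions and reports which timestamp moved. On Linux only ctime is expected to change; on Windows `st_ctime` has historically meant creation time, so the result may differ there.

```python
import os
import stat
import tempfile
import time


def stat_times(path):
    """Return (mtime_ns, ctime_ns) for path."""
    st = os.stat(path)
    return st.st_mtime_ns, st.st_ctime_ns


def chmod_effect(path):
    """Change permissions on path and report which timestamps moved."""
    before = stat_times(path)
    # Wait briefly so a timestamp update is visible despite coarse clock granularity.
    time.sleep(0.05)
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR | stat.S_IXUSR)
    after = stat_times(path)
    return {
        "mtime_changed": after[0] != before[0],
        "ctime_changed": after[1] != before[1],
    }


if __name__ == "__main__":
    # Hypothetical demo file; any writable path would do.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello")
    try:
        print(chmod_effect(tmp.name))
    finally:
        os.remove(tmp.name)
```

This is why the backup script records both values: a permission change leaves the contents and mtime alone, so comparing mtime alone would miss it.
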
Without further ado, here is the code:


#!/usr/bin/env python
import time,os,sys,cPickle

fileInfo = {}

def logger(time,fileName,status,fileNum):
  # Append one tab-separated record per backup run to backup.log
  f = open('backup.log','a')
  f.write("%s\t%s\t%s\t\t%s\n" % (time,fileName,status,fileNum))
  f.close()

def tar(sDir,dDir,fileNum):
  command = "tar zcf %s %s >/dev/null 2>&1" % (dDir + ".tar.gz",sDir)
  if os.system(command) == 0:
    logger(time.strftime('%F %X'),dDir + ".tar.gz",'success',fileNum)
  else:
    logger(time.strftime('%F %X'),dDir + ".tar.gz",'failed',fileNum)

def fullBak(path):
  fileNum = 0
  for root,dirs,files in os.walk(path):
    for name in files:
      file = os.path.join(root, name)
      mtime = os.path.getmtime(file)
      ctime = os.path.getctime(file)
      fileInfo[file] = (mtime,ctime)
      fileNum += 1
  f = open(P,'w')
  cPickle.dump(fileInfo,f)
  f.close()
  tar(S,D,fileNum)

def diffBak(path):
  for root,dirs,files in os.walk(path):
    for name in files:
      file = os.path.join(root,name)
      mtime = os.path.getmtime(file)
      ctime = os.path.getctime(file)
      fileInfo[file] = (mtime,ctime)

  if os.path.isfile(P) == 0:
    f = open(P,'w')
    f.close()

  if os.stat(P).st_size == 0:
    f = open(P,'w')
    cPickle.dump(fileInfo,f)
    fileNum = len(fileInfo.keys())
    f.close()
    print fileNum
    tar(S,D,fileNum)
  else:
    f = open(P)
    old_fileInfo = cPickle.load(f)
    f.close()
    difference = dict(set(fileInfo.items())^set(old_fileInfo.items()))
    fileNum = len(difference)
    print fileNum

    difference_file = ' '.join(difference.keys())
    print difference_file

    tar(difference_file,D,fileNum)
    f = open(P,'w')
    cPickle.dump(fileInfo,f)
    f.close()

def Usage():
  print '''
    Syntax: python file_backup.py pickle_file mode source_dir filename_bk
      mode: 1: Full backup  2: Differential backup

    example: python file_backup.py fileinfo.pk 2 /etc etc_$(date +%F)
      note: the '.tar.gz' suffix is added automatically
  '''
  sys.exit()

if len(sys.argv) != 5:
  Usage()

P = sys.argv[1]
M = int(sys.argv[2])
S = sys.argv[3]
D = sys.argv[4]

if M == 1:
  fullBak(S)
elif M == 2:
  diffBak(S)
else:
  print "\033[;31mDoes not support this mode\033[0m"
  Usage()
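
The script builds its tar command by joining paths with spaces and passing the string to the shell, so a file name containing a space or a shell metacharacter would likely be split or misread. As a hedged alternative, not the author's method, the standard-library `tarfile` module can take the paths as a list. This sketch assumes Python 3, and `archive_paths` is a hypothetical helper name.

```python
import os
import tarfile


def archive_paths(paths, dest_base):
    """Write the given paths into dest_base + '.tar.gz' and return the archive name."""
    archive = dest_base + ".tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        for path in paths:
            # recursive=False adds exactly this entry, not a directory's contents,
            # which suits a list of individually changed files.
            tar.add(path, recursive=False)
    return archive
```

Passing the paths as a list avoids shell quoting altogether, at the cost of no longer matching the `tar zcf` command line used in the script.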

Testing:


$ python file_backup.py data.pk 1 data data_$(date +%F)    # full backup
$ > data/www.jb51.net                                      # create a new empty file
$ chmod 777 data/py/eshop_bk/data.db                       # change permissions on an existing file
$ python file_backup.py data.pk 2 data data_$(date +%F)_1  # back up only the changed files
2
data/py/eshop_bk/data.db data/www.jb51.net

I found the blogger's code very instructive, but it has one problem. If a file is deleted after a full backup and a differential backup is then run, the deleted file is detected as a difference, but tar fails because the file no longer exists. Before running tar, it is better to check each differing path with os.path.exists(). Paths that no longer exist should be left out of the tar command and reported as deletions instead.
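
The check described above could be sketched as follows. This is a minimal illustration in Python 3, not the blogger's code: `split_changes` is a hypothetical helper, and it assumes the same `{path: (mtime, ctime)}` snapshot dictionaries that the script pickles.

```python
import os


def split_changes(old_info, new_info):
    """Compare two {path: (mtime, ctime)} snapshots.

    Returns (to_archive, deleted): paths that are new or changed and still
    exist on disk, and paths recorded earlier that are now gone.
    """
    # New or modified entries: absent from the old snapshot, or timestamps differ.
    changed = [p for p, times in new_info.items() if old_info.get(p) != times]
    # Guard against files that vanished between the scan and the archive step.
    to_archive = sorted(p for p in changed if os.path.exists(p))
    # Entries present in the old snapshot only were deleted since the last run.
    deleted = sorted(p for p in old_info if p not in new_info)
    return to_archive, deleted
```

Only `to_archive` would be handed to tar, while `deleted` could be written to the log so that the removal is still reported.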

