Detailed Explanation of Five Methods for Decompressing and Compressing Archives in Python
- 2021-07-10 20:09:48
- OfStack
Here we discuss how to use Python to extract the following five kinds of compressed files:
.gz .tar .tgz .zip .rar
Brief introduction
gz: gzip. It usually compresses only a single file. Combined with tar, files can be packaged first and then compressed.
tar: The packaging tool on Linux systems. It only packages files; it does not compress them.
tgz: tar.gz. The files are packaged with tar and then compressed with gzip.
zip: Different from gzip. Although it uses a similar algorithm, it can package and compress multiple files. Each file is compressed separately, so the compression ratio is usually lower than that of tar.gz.
rar: An archive format that packages and compresses files. It originated under DOS and is used mainly on Windows.
Its compression ratio is higher than that of zip, but it is slower. Random access is also slow.
For a comparison of zip and rar compression ratios, see:
http://www.comicer.com/stronghorse/water/software/ziprar.htm
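The difference between zip (each file compressed on its own) and tar.gz (the whole package compressed as one stream) can be checked with a small sketch. The helper name archive_sizes and the sample data are assumptions made for illustration; the result depends on the data, but many small, similar files tend to favour tar.gz.

```python
import io
import tarfile
import zipfile

def archive_sizes(files):
    """Return (zip_size, tar_gz_size) in bytes for a dict of {name: bytes}.

    Hypothetical helper for illustration; everything is built in memory.
    """
    zip_buf = io.BytesIO()
    with zipfile.ZipFile(zip_buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)  # each member is deflated on its own

    tgz_buf = io.BytesIO()
    with tarfile.open(fileobj=tgz_buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))  # one gzip stream covers all members

    return len(zip_buf.getvalue()), len(tgz_buf.getvalue())
```

Calling archive_sizes with, say, a hundred small text files that share most of their content shows the effect: the gzip stream can reuse repetition across files, while zip cannot.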
gz
Because gz generally compresses only one file, it is often used together with a packaging tool. For example, tar first packages the files into XXX.tar, which is then compressed into XXX.tar.gz.
Extracting a gz file simply means reading out the single file inside it. The Python method is as follows:
import gzip
import os

def un_gz(file_name):
    """Decompress a gz file."""
    # Get the output file name by removing the .gz suffix
    f_name = file_name.replace(".gz", "")
    # Create the gzip object
    g_file = gzip.GzipFile(file_name)
    # read() returns the decompressed bytes, so the output file
    # must be opened in binary mode ("wb"), not text mode
    with open(f_name, "wb") as out_file:
        out_file.write(g_file.read())
    # Close the gzip object
    g_file.close()
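un_gz reads the whole decompressed file into memory in one call. For large files, a streaming variant may be preferable. The following is a minimal sketch using gzip.open and shutil.copyfileobj from the standard library; the function name un_gz_stream and the ".out" fallback name are assumptions, not part of the original code.

```python
import gzip
import shutil

def un_gz_stream(file_name, out_name=None):
    """Decompress file_name to out_name chunk by chunk (hypothetical helper)."""
    if out_name is None:
        # Strip only a trailing ".gz"; the ".out" fallback is an assumption
        # so that the input file is never overwritten.
        out_name = file_name[:-3] if file_name.endswith(".gz") else file_name + ".out"
    with gzip.open(file_name, "rb") as src, open(out_name, "wb") as dst:
        shutil.copyfileobj(src, dst)  # copies in fixed-size chunks
    return out_name
```

The with statement closes both files even if an error occurs halfway through.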
tar
After XXX.tar.gz is decompressed, XXX.tar is obtained, which still has to be unpacked in a further step.
* Note: tgz is the same format as tar.gz. Older versions of DOS allowed file extensions of at most three characters, so tgz was used instead.
Because a tar file contains multiple files, we first read all the file names and then extract them, as follows:
import os
import tarfile

def un_tar(file_name):
    """Unpack a tar file."""
    tar = tarfile.open(file_name)
    names = tar.getnames()
    # Because unpacking produces many files, create a directory
    # named after the archive in advance
    if not os.path.isdir(file_name + "_files"):
        os.mkdir(file_name + "_files")
    for name in names:
        tar.extract(name, file_name + "_files/")
    tar.close()
* Note: tgz files are extracted in the same way as tar files.
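Since tar, tar.gz and tgz are all opened the same way, one function can handle them. The sketch below is one possible variant, not the article's method: it opens the archive with mode "r:*" so that tarfile detects the compression itself, and it rejects member names that would be written outside the target directory. The name un_tar_auto and the path check are assumptions added for illustration.

```python
import os
import tarfile

def un_tar_auto(file_name, out_dir=None):
    """Unpack .tar, .tar.gz/.tgz or .tar.bz2 and return the member names."""
    if out_dir is None:
        out_dir = file_name + "_files"  # same naming as un_tar above
    os.makedirs(out_dir, exist_ok=True)
    root = os.path.realpath(out_dir)
    with tarfile.open(file_name, "r:*") as tar:  # "r:*" detects the compression
        for member in tar.getmembers():
            target = os.path.realpath(os.path.join(root, member.name))
            # Refuse entries such as "../x" that would land outside out_dir.
            if os.path.commonpath([root, target]) != root:
                raise ValueError("unsafe path in archive: " + member.name)
        names = tar.getnames()
        tar.extractall(out_dir)
    return names
```

For example, un_tar_auto("XXX.tgz") would unpack into XXX.tgz_files, assuming such a file exists.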
zip
As with tar, the file names are read first and then each file is extracted, as follows:
import os
import zipfile

def un_zip(file_name):
    """Unzip a zip file."""
    zip_file = zipfile.ZipFile(file_name)
    if not os.path.isdir(file_name + "_files"):
        os.mkdir(file_name + "_files")
    for name in zip_file.namelist():
        zip_file.extract(name, file_name + "_files/")
    zip_file.close()
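The zip section above only covers extraction. As a hedged sketch of the opposite direction, the following creates a zip archive from a directory and extracts it again with extractall. The function names make_zip and un_zip_all are assumptions for illustration; the archive should be written outside src_dir so that it does not include itself.

```python
import os
import zipfile

def make_zip(src_dir, zip_name):
    """Compress every file under src_dir into zip_name, keeping relative paths."""
    with zipfile.ZipFile(zip_name, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                full_path = os.path.join(root, name)
                # Store the path relative to src_dir, not the absolute path.
                zf.write(full_path, arcname=os.path.relpath(full_path, src_dir))

def un_zip_all(zip_name, out_dir):
    """Extract all members of zip_name into out_dir and return their names."""
    with zipfile.ZipFile(zip_name) as zf:
        zf.extractall(out_dir)
        return zf.namelist()
```

ZIP_DEFLATED is passed explicitly because ZipFile stores files uncompressed by default.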
rar
Since rar is usually used on Windows, an additional Python package, rarfile, is required.
Available at: http://sourceforge.net/projects/rarfile.berlios/files/rarfile-2.4.tar.gz/download
Unpack it into the /Scripts/ folder of the Python installation directory, open a command prompt in that folder,
and enter
python setup.py install
to complete the installation.
import rarfile
import os

def un_rar(file_name):
    """Extract a rar file."""
    rar = rarfile.RarFile(file_name)
    if not os.path.isdir(file_name + "_files"):
        os.mkdir(file_name + "_files")
    # Extract straight into the target directory instead of calling
    # os.chdir(), which would change the working directory of the process
    rar.extractall(file_name + "_files")
    rar.close()
tar Packaging
When you add a file with tar.add(), the file's own path is stored in the archive by default; pass arcname to store the file in the tar package under a name of your choosing.
Packaging code:
#!/usr/bin/env /usr/local/bin/python
# encoding: utf-8
import tarfile
import os
import time

start = time.time()
tar = tarfile.open('/path/to/your.tar', 'w')
for root, dirs, files in os.walk('/path/to/dir/'):
    for file in files:
        fullpath = os.path.join(root, file)
        # arcname=file stores only the base name, so the directory
        # structure is not kept inside the archive
        tar.add(fullpath, arcname=file)
tar.close()
print(time.time() - start)
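With arcname=file, two files that share a name in different subdirectories end up as members with the same name. One possible alternative, sketched here under the assumption that the directory layout should be preserved, is to use the path relative to the packaged directory as arcname. The function name pack_dir is hypothetical.

```python
import os
import tarfile

def pack_dir(src_dir, tar_name, mode="w"):
    """Pack src_dir into tar_name, naming each member by its path relative to src_dir."""
    with tarfile.open(tar_name, mode) as tar:
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                full_path = os.path.join(root, name)
                # e.g. /path/to/dir/a/x.txt is stored as a/x.txt
                tar.add(full_path, arcname=os.path.relpath(full_path, src_dir))
```

The mode argument is handed to tarfile.open unchanged, so passing "w:gz" produces a gzip-compressed package.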
You can specify a compression format when packaging, for example gz compression:
tar=tarfile.open('/path/to/your.tar.gz','w:gz')
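Other formats are listed in the following table.
tarfile.open supports many modes:
mode            action
'r' or 'r:*'    Open for reading with transparent compression detection.
'r:'            Open for reading, without compression only.
'r:gz'          Open for reading with gzip compression.
'r:bz2'         Open for reading with bzip2 compression.
'a' or 'a:'     Open for appending without compression.
'w' or 'w:'     Open for uncompressed writing.
'w:gz'          Open for gzip-compressed writing.
'w:bz2'         Open for bzip2-compressed writing.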
tar Unpack
When unpacking, tar can likewise decompress according to the compression format.
#!/usr/bin/env /usr/local/bin/python
# encoding: utf-8
import tarfile
import time

start = time.time()
t = tarfile.open("/path/to/your.tar", "r:")
t.extractall(path='/path/to/extractdir/')
t.close()
print(time.time() - start)
The code above extracts everything at once. The members can also be processed one by one, but if the tar package contains a very large number of files, watch your memory usage:
tar = tarfile.open(filename, 'r:gz')
for tar_info in tar:
    file = tar.extractfile(tar_info)
    # extractfile() returns None for members that are not regular files
    if file is not None:
        do_something_with(file)
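As one concrete stand-in for do_something_with, the sketch below computes a SHA-256 digest of every regular file in the package, reading each member in chunks rather than all at once. The function name member_digests and the 64 KiB chunk size are assumptions chosen for illustration.

```python
import hashlib
import tarfile

def member_digests(tar_name):
    """Return {member name: SHA-256 hex digest} for the regular files in tar_name."""
    digests = {}
    with tarfile.open(tar_name, "r:*") as tar:
        for tar_info in tar:
            if not tar_info.isfile():
                continue  # skip directories, links and other special members
            digest = hashlib.sha256()
            with tar.extractfile(tar_info) as fh:
                # Read 64 KiB at a time so large members are not held in memory.
                for chunk in iter(lambda: fh.read(64 * 1024), b""):
                    digest.update(chunk)
            digests[tar_info.name] = digest.hexdigest()
    return digests
```

Nothing is written to disk here; each member is only read through the file object returned by extractfile().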
PS: Extracting rar files with Python
1. pip3 install rarfile
Install the rarfile library
(Note that decompression is not supported.)
#coding=utf-8
import rarfile
path = "E:\\New\\New.rar"
path2 = "E:\\New"
rf = rarfile.RarFile(path)  # The rar file to extract
rf.extractall(path2)  # Extract into the specified directory
Summary