Summary of Python file handling considerations

  • 2020-05-30 20:25:42
  • OfStack

File handling is one of the most common tasks in programming. Opening, closing, renaming, deleting, appending to, copying, and randomly reading and writing files are all straightforward in Python. The main thing to watch is that files are closed safely. The with statement handles this for you: the file is closed when the block exits, even if an exception is raised inside it:


with open(pathname, "r") as myfile:
    do_some_with(myfile)
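
As a rough illustration of the operations listed above, the following sketch writes, appends to, copies, renames, randomly reads, and deletes a file using only the standard library. The file names and the temporary directory are made up for the example and are not part of the original notes.

```python
import os
import shutil
import tempfile

# Hypothetical scratch directory and file names, used only for this demo.
workdir = tempfile.mkdtemp()
original = os.path.join(workdir, "notes.txt")

# Write, then append. Each with block closes the file when it exits.
# newline="\n" keeps line endings identical on every platform.
with open(original, "w", encoding="utf-8", newline="\n") as f:
    f.write("first line\n")
with open(original, "a", encoding="utf-8", newline="\n") as f:
    f.write("second line\n")

# Copy, then rename.
backup = os.path.join(workdir, "notes.bak")
shutil.copy(original, backup)
renamed = os.path.join(workdir, "renamed.txt")
os.rename(original, renamed)

# Random access: jump to a byte offset and read from there.
with open(renamed, "rb") as f:
    f.seek(len(b"first line\n"))
    tail = f.read().decode("utf-8")

# The file object still exists here, but it has been closed.
closed_after_with = f.closed

# Delete the copy and list what is left.
os.remove(backup)
remaining = sorted(os.listdir(workdir))
```

Binary mode is used for the seek because byte offsets are only well defined there; in text mode, seek() should only be given values returned by tell().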

1. CSV file processing

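The csv module in the standard library handles ordinary CSV files well. For large CSV files, the Pandas library is a better fit: it can also read other formats such as HTML tables, and it supports reading a file in chunks so that the whole file never has to be loaded into memory at once.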
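
A minimal sketch of chunked reading with the csv module alone might look like this. The helper read_in_chunks, the file name, and the sample data are assumptions made for the example.

```python
import csv
import itertools
import os
import tempfile

# Hypothetical sample file, created only so the example is self-contained.
csv_path = os.path.join(tempfile.mkdtemp(), "scores.csv")
rows = [{"name": "user%d" % i, "score": i * 10} for i in range(7)]

# newline="" lets the csv module control line endings itself.
with open(csv_path, "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "score"])
    writer.writeheader()
    writer.writerows(rows)


def read_in_chunks(path, chunk_size):
    """Yield lists of at most chunk_size row dicts from a CSV file."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        while True:
            chunk = list(itertools.islice(reader, chunk_size))
            if not chunk:
                break
            yield chunk


chunk_sizes = []
total_score = 0
for chunk in read_in_chunks(csv_path, chunk_size=3):
    chunk_sizes.append(len(chunk))
    total_score += sum(int(row["score"]) for row in chunk)
```

With Pandas, the equivalent idea is to pass chunksize to pandas.read_csv, which then returns an iterator of DataFrames instead of a single DataFrame.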

2. XML file processing

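For smaller XML files, use ElementTree (xml.etree.ElementTree). In Python 2, cElementTree was the faster choice; in Python 3, ElementTree uses the C implementation automatically, and cElementTree was removed in Python 3.9. For larger files, use lxml.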
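
The sketch below shows two ways of reading the same document with ElementTree: parsing it whole, which suits small files, and streaming it with iterparse, which is one way to keep memory use down on larger ones. The XML content is invented for the example.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample document.
xml_text = """<library>
  <book id="1"><title>First</title></book>
  <book id="2"><title>Second</title></book>
</library>"""

# Small document: parse everything into an in-memory tree.
root = ET.fromstring(xml_text)
titles = [book.findtext("title") for book in root.findall("book")]

# Larger document: iterparse streams the input element by element.
# Clearing each element after use releases its children.
ids = []
source = io.BytesIO(xml_text.encode("utf-8"))
for event, elem in ET.iterparse(source, events=("end",)):
    if elem.tag == "book":
        ids.append(elem.get("id"))
        elem.clear()
```

lxml.etree offers a largely compatible API, so code written like this usually needs few changes to move over.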

3. Serialization and deserialization of file contents

Serializing and deserializing with pickle is very simple: use dump() and load(). Note, however, that pickle does not guarantee atomic operations, and it is sensitive to where the data comes from: unpickling data from an untrusted source can execute arbitrary code, so it is a security risk. Another option is json, with dumps() and loads() (or dump() and load() for file objects). It is extensible and lets you specify a custom decoder, although its performance is generally a little worse than pickle's.
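
As a small sketch of both approaches, the example below round-trips the same record through pickle and through json. The record, the file name, and the helper functions encode_extra and decode_extra are assumptions made for illustration; the "__date__" marker is just one possible convention for representing a date in JSON.

```python
import json
import os
import pickle
import tempfile
from datetime import date

record = {"name": "report", "tags": ["a", "b"], "created": date(2020, 5, 30)}

# pickle: dump() and load() work on binary file objects.
# Only unpickle data that comes from a source you trust.
pkl_path = os.path.join(tempfile.mkdtemp(), "record.pkl")
with open(pkl_path, "wb") as f:
    pickle.dump(record, f)
with open(pkl_path, "rb") as f:
    restored = pickle.load(f)


def encode_extra(obj):
    """Called by json for objects it cannot serialize on its own."""
    if isinstance(obj, date):
        return {"__date__": obj.isoformat()}
    raise TypeError("not JSON serializable: %r" % (obj,))


def decode_extra(d):
    """Called by json for every decoded object (dict)."""
    if "__date__" in d:
        return date.fromisoformat(d["__date__"])
    return d


# json: the default and object_hook parameters extend what can be handled.
text = json.dumps(record, default=encode_extra, sort_keys=True)
decoded = json.loads(text, object_hook=decode_extra)
```

The json output is plain text that other languages can read, which is often a stronger reason to choose it than raw speed.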

4. Log file processing

When writing log files with the logging module, note that logging is thread-safe but not process-safe. Several threads in one process can write to the same log file, but you should avoid having multiple processes write to the same log file at the same time.
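
One simple way to respect this, sketched below, is to give each process its own log file (here by putting the process ID in the file name) while letting several threads share that file. The build_logger and worker helpers and the naming scheme are assumptions for the example, not something the logging module prescribes.

```python
import logging
import os
import tempfile
import threading


def build_logger(log_dir):
    """Return (logger, path) for a logger that writes to a per-process file."""
    path = os.path.join(log_dir, "app.%d.log" % os.getpid())
    logger = logging.getLogger("demo.%d" % os.getpid())
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep demo output out of the root logger
    handler = logging.FileHandler(path, encoding="utf-8")
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(threadName)s %(message)s")
    )
    logger.addHandler(handler)
    return logger, path


def worker(logger, count):
    """Log count messages; safe to run from several threads at once."""
    for i in range(count):
        logger.info("message %d", i)


log_dir = tempfile.mkdtemp()
logger, log_path = build_logger(log_dir)

threads = [threading.Thread(target=worker, args=(logger, 50)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Close the handler so everything is on disk before counting lines.
for handler in list(logger.handlers):
    handler.close()
    logger.removeHandler(handler)

with open(log_path, encoding="utf-8") as f:
    line_count = sum(1 for _ in f)
```

When several processes really do need a single log, another common approach is to send records through a queue (logging.handlers.QueueHandler) to one process that does the writing.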

5. Image file processing

For everyday image processing, the PIL module (today usually installed as its maintained fork, Pillow) is adequate. On Linux, make sure the relevant image libraries are installed; the ImageMagick library is generally recommended. For more advanced work such as image recognition, you will need OpenCV.
