Six Common Ways of Reading and Writing excel Data Files in python Summary of of

  • 2021-10-27 07:56:14
  • OfStack

Directory 1. python built-in methods (read, readline, readlines) 2. Built-in modules (csv)
3. Use numpy libraries (loadtxt, load, fromfile) 4. Use pandas libraries (read_csv, read_excel, etc.)
5. Read and write excel files (xlrd, xlwt, openpyxl, etc.)
6. Operational databases (pymysql, cx_Oracle, etc.)

What ways can python read data files?

1. python built-in methods (read, readline, readlines)

read (): Reads the entire file content once. The read (size) method is recommended, and the larger the size, the longer the runtime readline (): Read 1 line at a time. Use when there is not enough memory, 1 is not used very much readlines (): Read the whole file once and return to list in line, which is convenient for us to traverse

2. Built-in Module (csv)

python has built-in csv module for reading and writing csv files. csv is a comma separator file, which is one of the most common data storage formats in data science.
csv module can easily complete the reading and writing operations of various volume data. Of course, large data volume needs optimization at the code level.

csv module reads files


#  Read csv Documents 
import csv  
with open('test.csv','r') as myFile:  
    lines=csv.reader(myFile)  
    for line in lines:  
        print (line)  

csv module writes to file


import csv  
with open('test.csv','w+') as myFile:      
    myWriter=csv.writer(myFile)  
    # writerrow1 Row 1 Row writing 
    myWriter.writerow([7,8,9])  
    myWriter.writerow([8,'h','f'])  
    # writerow Multiline write 
    myList=[[1,2,3],[4,5,6]]  
    myWriter.writerows(myList)  

3. Use the numpy library (loadtxt, load, fromfile)

loadtxt method

loadtxt is used to read text files (including txt, csv, etc.) and compressed files in. gz or. bz2 format, provided that the file data must have the same number of values every 1 line.


import numpy as np
# loadtxt() In dtype Parameter is set by default to float
#  Here set to str String for easy display 
np.loadtxt('test.csv',dtype=str)
# out : array(['1,2,3', '4,5,6', '7,8,9'], dtype='<U5')

load method

load is used to read. npy,. npz, or pickled persistence files dedicated to numpy.


import numpy as np
#  Mr. Cheng npy Documents 
np.save('test.npy', np.array([[1, 2, 3], [4, 5, 6]]))
#  Use load Loading npy Documents 
np.load('test.npy')
'''
out:array([[1, 2, 3],
       [4, 5, 6]])
'''

fromfile method

The fromfile method can read simple text data or binary data, and the data comes from binary data saved by tofile method. When reading data, the user needs to specify the element type and modify the shape of the array appropriately.


import numpy as np
x = np.arange(9).reshape(3,3)
x.tofile('test.bin')
np.fromfile('test.bin',dtype=np.int)
# out:array([0, 1, 2, 3, 4, 5, 6, 7, 8])

4. Use the pandas library (read_csv, read_excel, etc.)

pandas is one of the most commonly used analysis libraries for data processing. It can read data files in various formats and output dataframe format.
Such as: txt, csv, excel, json, clipboard, database, html, hdf, parquet, pickled file, sas, stata, etc

read_csv method

The read_csv method is used to read the csv format file and output the dataframe format.


import pandas as pd
pd.read_csv('test.csv')

read_excel method

Read excel files, including xlsx, xls, xlsm formats


import pandas as pd
pd.read_excel('test.xlsx')

read_table method
Read any text file by controlling sep parameters (separators)

read_json method

Read json format file


df = pd.DataFrame([['a', 'b'], ['c', 'd']],index=['row 1', 'row 2'],columns=['col 1', 'col 2'])
j = df.to_json(orient='split')
pd.read_json(j,orient='split')

read_html method

Read html form

read_clipboard method

Read the contents of the clipboard

read_pickle method

Read plckled persistence file

read_sql method

Read the database data, connect the database, and pass in sql statement

read_dhf method

Read hdf5 file, suitable for large file reading

read_parquet method

Read the parquet file

read_sas method

Read the sas file

read_stata method

Read the stata file

read_gbq method

Read google bigquery data

5. Read and write excel files (xlrd, xlwt, openpyxl, etc.)

There are many libraries for python to read and write excel files, including xlrd, xlwt, openpyxl, xlwings and so on.

Main modules:

xlrd Library: Read data from excel, support xls, xlsx xlwt Library: Modify excel. Modification of xlsx format is not supported xlutils Library: Modify 1 existing file in xlw and xlrd openpyxl: Mainly for reading and editing excel in xlsx format xlwings: Read, write and modify xlsx, xls and xlsm format files xlsxwriter: Used to generate excel tables, insert data, insert icons and other table operations, does not support reading Microsoft Excel API: pywin32 needs to be installed, communicates directly with Excel process, and can do anything that can be done in Excel, but it is slow

6. Operational databases (pymysql, cx_Oracle, etc.)

python almost supports the interaction of all databases. After connecting to the database, you can use sql statements for addition, deletion and modification.
Main modules:

pymysql: Used for interaction with the mysql database sqlalchemy: Used for interaction with the mysql database cx_Oracle: Used for interaction with the oracle database sqlite3: Built-in library for interaction with sqlite database pymssql: Used to interact with the sql server database pymongo: Used for interaction with mongodb non-relational databases redis, pyredis: Used for interaction with redis non-relational databases

Related articles: