Sample code for reading and writing csv files in Python
- 2020-06-15 09:46:35
- OfStack
Data analysis often involves reading data from csv files and writing results back to them. It is convenient to read the contents of a csv file directly as dicts or as a DataFrame. The code below uses the iris data set as an example.
Reading a csv file as dicts
code
# -*- coding: utf-8 -*-
import csv

with open('E:/iris.csv') as csvfile:
    reader = csv.DictReader(csvfile, fieldnames=None)  # fieldnames defaults to None; specify it when the csv file has no header row
    list_1 = [e for e in reader]  # collect each row into a list as a dict
print(list_1[0])
Output:
{'Petal.Length': '1.4', 'Sepal.Length': '5.1', 'Petal.Width': '0.2', 'Sepal.Width': '3.5', 'Species': 'setosa'}
If each row needs separate processing and the data set is large, it is better to process each row before appending it to the list:
list_1 = list()
for e in reader:
    list_1.append(your_func(e))  # your_func is the processing function applied to each row
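Since csv.DictReader yields every value as a string, a typical your_func might cast the numeric fields. The sketch below is only illustrative: your_func and the in-memory sample file are assumptions, not part of the original article.

```python
import csv
import io

# Hypothetical processing function: convert the numeric fields of one
# row-dict from str to float, leaving Species as-is.
def your_func(row):
    numeric = ('Sepal.Length', 'Sepal.Width', 'Petal.Length', 'Petal.Width')
    return {k: (float(v) if k in numeric else v) for k, v in row.items()}

# A small in-memory csv stands in for E:/iris.csv here.
sample = io.StringIO(
    "Sepal.Length,Sepal.Width,Petal.Length,Petal.Width,Species\n"
    "5.1,3.5,1.4,0.2,setosa\n"
)
reader = csv.DictReader(sample)
list_1 = [your_func(e) for e in reader]
print(list_1[0]['Sepal.Length'] + 1)  # the values now support arithmetic
```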
Writing a list of dicts to a csv file
code
# data
data = [
{'Petal.Length': '1.4', 'Sepal.Length': '5.1', 'Petal.Width': '0.2', 'Sepal.Width': '3.5', 'Species': 'setosa'},
{'Petal.Length': '1.4', 'Sepal.Length': '4.9', 'Petal.Width': '0.2', 'Sepal.Width': '3', 'Species': 'setosa'},
{'Petal.Length': '1.3', 'Sepal.Length': '4.7', 'Petal.Width': '0.2', 'Sepal.Width': '3.2', 'Species': 'setosa'},
{'Petal.Length': '1.5', 'Sepal.Length': '4.6', 'Petal.Width': '0.2', 'Sepal.Width': '3.1', 'Species': 'setosa'}
]
# header
header = ['Petal.Length', 'Sepal.Length', 'Petal.Width', 'Sepal.Width', 'Species']
print(len(data))
with open('E:/dst.csv', 'w', newline='') as dstfile:  # newline='' keeps the csv module from writing blank lines between rows on Windows
    writer = csv.DictWriter(dstfile, fieldnames=header)
    writer.writeheader()  # write the header row
    writer.writerows(data)  # write all rows in one batch
The above code writes the data in one batch. If you have a lot of data and want to see in real time how many rows have been written, call writerow row by row instead of writerows.
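A row-by-row version might look like the sketch below; an in-memory buffer stands in for E:/dst.csv so it runs anywhere, and the progress message is just one possible choice.

```python
import csv
import io

header = ['Petal.Length', 'Sepal.Length', 'Petal.Width', 'Sepal.Width', 'Species']
data = [
    {'Petal.Length': '1.4', 'Sepal.Length': '5.1', 'Petal.Width': '0.2',
     'Sepal.Width': '3.5', 'Species': 'setosa'},
    {'Petal.Length': '1.3', 'Sepal.Length': '4.7', 'Petal.Width': '0.2',
     'Sepal.Width': '3.2', 'Species': 'setosa'},
]

buf = io.StringIO()  # stands in for open('E:/dst.csv', 'w', newline='')
writer = csv.DictWriter(buf, fieldnames=header)
writer.writeheader()
for i, row in enumerate(data, 1):
    writer.writerow(row)  # one row at a time
    print('%d of %d rows written' % (i, len(data)))
```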
Reading a csv file as a DataFrame
code
# read a csv file into a DataFrame
import pandas as pd

dframe = pd.read_csv('E:/iris.csv')
You can also take a slightly more roundabout route:
import csv
import pandas as pd

with open('E:/iris.csv') as csvfile:
    reader = csv.DictReader(csvfile, fieldnames=None)  # specify fieldnames when the file has no header row
    list_1 = [e for e in reader]  # collect each row into a list as a dict
dframe = pd.DataFrame.from_records(list_1)
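One caveat with this route: csv.DictReader yields strings, so every column of the resulting DataFrame has object dtype until you convert it. A minimal sketch, with an inline list standing in for the rows read above:

```python
import pandas as pd

# Rows as produced by csv.DictReader (all values are strings).
list_1 = [
    {'Sepal.Length': '5.1', 'Sepal.Width': '3.5', 'Petal.Length': '1.4',
     'Petal.Width': '0.2', 'Species': 'setosa'},
    {'Sepal.Length': '4.9', 'Sepal.Width': '3', 'Petal.Length': '1.4',
     'Petal.Width': '0.2', 'Species': 'setosa'},
]
dframe = pd.DataFrame.from_records(list_1)

# Convert the numeric columns explicitly; Species stays as object.
numeric_cols = ['Sepal.Length', 'Sepal.Width', 'Petal.Length', 'Petal.Width']
dframe[numeric_cols] = dframe[numeric_cols].astype(float)
print(dframe.dtypes)
```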
Reading a specified csv file from a zip archive as a DataFrame
dst.zip contains dst.csv among other files; read dst.csv as a DataFrame without extracting the archive first.
import pandas as pd
import zipfile
z_file = zipfile.ZipFile('E:/dst.zip')
dframe = pd.read_csv(z_file.open('dst.csv'))
z_file.close()
print(dframe)
Writing a DataFrame to a csv file
dframe.to_csv('E:/dst.csv', index=False)  # index=False omits the row index column
Reading a txt file as a DataFrame
import pandas as pd
# `path` is a file path or file handle; `header` tells whether the first row is a header; `delimiter` is the field separator; `dtype` is the data type to store the values as after reading.
frame = pd.read_table(path, header=None, index_col=False, delimiter='\t', dtype=str)
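pd.read_csv with sep='\t' does the same job and is the more common spelling today. A runnable sketch, with an in-memory buffer standing in for `path` and made-up sample rows:

```python
import io
import pandas as pd

# In-memory tab-separated data standing in for the file at `path`.
tsv = io.StringIO("5.1\t3.5\t1.4\t0.2\tsetosa\n4.9\t3\t1.4\t0.2\tsetosa\n")

# header=None: no header row; dtype=str keeps every value as a string.
frame = pd.read_csv(tsv, header=None, index_col=False, sep='\t', dtype=str)
print(frame.shape)
```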