Python USES openpyxl to read and write excel files
- 2020-06-07 04:43:44
- OfStack
This is a third square library, and we can handle it
xlsx
Excel file in format.
pip install openpyxl
The installation. If you're using Aanconda, you should already have it.
Read the Excel file
You need to import the relevant functions.
from openpyxl import load_workbook
# The default is read-write and can be specified if necessary write_only and read_only for True
wb = load_workbook('mainbuilding33.xlsx')
The default open file is read-write, and you can specify parameters if necessary
read_only
for
True
.
Get the worksheet --Sheet
# Get all sheet The name of the
print(wb.get_sheet_names())
# According to the sheet The name for sheet
a_sheet = wb.get_sheet_by_name('Sheet1')
# To obtain sheet The name
print(a_sheet.title)
# Gets what is currently displayed sheet, You can also use wb.get_active_sheet()
sheet = wb.active
Get cell
# Get the value of a cell and observe excel Discovery is also a sequence of letters followed by Numbers, that is, columns followed by rows
b4 = sheet['B4']
# Return, respectively,
print(f'({b4.column}, {b4.row}) is {b4.value}') # The number returned is int type
# So instead of just using subscripts, you can use subscripts cell function , I'm going to put a number in here B2
b4_too = sheet.cell(row=4, column=2)
print(b4_too.value)
b4.column
return
B
,
b4.row
Returns 4, and value is the value of that cell. cell also has 1 attribute
coordinate
Cells like b4 return coordinates
B4
.
Get the largest row and the largest column
# Get the largest column and the largest row
print(sheet.max_row)
print(sheet.max_column)
Gets the rows and columns
sheet.rows is a generator that contains data for each row, which is then wrapped by one tuple. sheet.columns is similar, but inside each tuple is a cell per column.
# Returns because it presses a row A1, B1, C1 In this order
for row in sheet.rows:
for cell in row:
print(cell.value)
# A1, A2, A3 In this order
for column in sheet.columns:
for cell in column:
print(cell.value)
The above code gets the data for all the cells. What if you want to get data for a particular row? I'm just going to give it an index, because
sheet.rows
It's a generator type. You can't use indexes. You can use indexes after converting to list.
pip install openpyxl
0
This gets the tuple object on line 2.
for cell in list(sheet.rows)[2]:
print(cell.value)
How do I get arbitrary interval cells?
You can use the range function, written as follows, to obtain all cells with A1 as the upper left corner and B3 as the lower right corner of the rectangular region. Note that range starts at 1, because in openpyxl it is not the programming language's custom to represent the first value with 0 in order to match the expression 1 in Excel.
for i in range(1, 4):
for j in range(1, 3):
print(sheet.cell(row=i, column=j))
# out
<Cell mainbuilding33.A1>
<Cell mainbuilding33.B1>
<Cell mainbuilding33.A2>
<Cell mainbuilding33.B2>
<Cell mainbuilding33.A3>
<Cell mainbuilding33.B3>
It can also be used as a slice.
sheet['A1':'B3']
Returns 1 tuple, which is still a tuple internally, consisting of 1 tuple per row of cells.
for row_cell in sheet['A1':'B3']:
for cell in row_cell:
print(cell)
for cell in sheet['A1':'B3']:
print(cell)
# out
(<Cell mainbuilding33.A1>, <Cell mainbuilding33.B1>)
(<Cell mainbuilding33.A2>, <Cell mainbuilding33.B2>)
(<Cell mainbuilding33.A3>, <Cell mainbuilding33.B3>)
Gets the column number based on the letter, returns the letter based on the column number
You need to import these two functions
openpyxl.utils
from openpyxl.utils import get_column_letter, column_index_from_string
# Returns the letter based on the number of the column
print(get_column_letter(2)) # B
# Returns the number of the column based on the letter
print(column_index_from_string('D')) # 4
Write the data to Excel
Worksheet correlation
You need to import
WorkBook
from openpyxl import Workbook
wb = Workbook()
This creates a new worksheet (just not yet saved).
To specify write-only mode, you can specify parameters
write_only=True
. The default writable and readable mode is generally fine.
# Get all sheet The name of the
print(wb.get_sheet_names())
# According to the sheet The name for sheet
a_sheet = wb.get_sheet_by_name('Sheet1')
# To obtain sheet The name
print(a_sheet.title)
# Gets what is currently displayed sheet, You can also use wb.get_active_sheet()
sheet = wb.active
0
Write cell
You can also use formulas
# Get all sheet The name of the
print(wb.get_sheet_names())
# According to the sheet The name for sheet
a_sheet = wb.get_sheet_by_name('Sheet1')
# To obtain sheet The name
print(a_sheet.title)
# Gets what is currently displayed sheet, You can also use wb.get_active_sheet()
sheet = wb.active
1
But if you're reading it you have to add it
data_only=True
If you read B9, you return the number. If you do not add this parameter, you return the formula itself '=AVERAGE(B2:B8)'.
append function
You can add more than one line at a time, writing from the blank line at line 1 (all below are blank lines).
# Get all sheet The name of the
print(wb.get_sheet_names())
# According to the sheet The name for sheet
a_sheet = wb.get_sheet_by_name('Sheet1')
# To obtain sheet The name
print(a_sheet.title)
# Gets what is currently displayed sheet, You can also use wb.get_active_sheet()
sheet = wb.active
2
Due to the
append
Functions can only be written on a line. What if we want to write it as a column. Can append meet the requirements? Consider the list above nested as a matrix. You just take the transpose of the matrix. This can be done using the zip() function, but the inner list becomes a tuple. Both are iterable objects, not affected.
list(zip(*rows))
# out
[('Number', 2, 3, 4, 5, 6, 7),
('data1', 40, 40, 50, 30, 25, 50),
('data2', 30, 25, 30, 10, 5, 10)]
Explain what's up there
list(zip(*rows))
First of all,
*rows
I'm going to break up the list, so I'm going to fill in a bunch of parameters,
zip
Extracted from a list of the first value is combined into one tuple, to extract the second value from each list combined into one tuple, 1 until the very end of the list of the shortest one value extracted after the end, after a longer list of values, in other words, the number of tuples in the final by the original each parameter (object iteration) determined the shortest length. For example, now delete any value, the minimum length of the list is 2, data2, the value of the 1 column (see vertical) is all deleted.
# Get all sheet The name of the
print(wb.get_sheet_names())
# According to the sheet The name for sheet
a_sheet = wb.get_sheet_by_name('Sheet1')
# To obtain sheet The name
print(a_sheet.title)
# Gets what is currently displayed sheet, You can also use wb.get_active_sheet()
sheet = wb.active
4
Finally, zip returns an zip object, and no data is seen. Just use the list conversion. zip makes it easy to write data as columns.
Save the file
Remember to save the file at the end of all operations. Specify the path and file name with the suffix xlsx.
# Get all sheet The name of the
print(wb.get_sheet_names())
# According to the sheet The name for sheet
a_sheet = wb.get_sheet_by_name('Sheet1')
# To obtain sheet The name
print(a_sheet.title)
# Gets what is currently displayed sheet, You can also use wb.get_active_sheet()
sheet = wb.active
5
Set the cell style --Style
Lead in the required classes
read_only
0
You can specify font correlation, color, and alignment, respectively.
The font
# Get all sheet The name of the
print(wb.get_sheet_names())
# According to the sheet The name for sheet
a_sheet = wb.get_sheet_by_name('Sheet1')
# To obtain sheet The name
print(a_sheet.title)
# Gets what is currently displayed sheet, You can also use wb.get_active_sheet()
sheet = wb.active
6
The above code specifies the isoline 24 in bold italics with a red font color. Assign the Font object to Font using the font attribute directly.
alignment
The cell attribute is also used directly
read_only
1
, where vertical center and horizontal center are specified. In addition to center, it can also be used
right、left
And so on.
# Get all sheet The name of the
print(wb.get_sheet_names())
# According to the sheet The name for sheet
a_sheet = wb.get_sheet_by_name('Sheet1')
# To obtain sheet The name
print(a_sheet.title)
# Gets what is currently displayed sheet, You can also use wb.get_active_sheet()
sheet = wb.active
7
Set the row height and column width
Sometimes the data is too long to display, the need to lengthen the cell.
# The first 2 High line
sheet.row_dimensions[2].height = 40
# C Column column width
sheet.column_dimensions['C'].width = 30
Merge and split cells
A merged cell takes the cell in the upper left corner of the merged area as a reference and overwrites other cells to make them one large cell.
Instead, splitting the cell returns the value of the large cell to its original upper-left position.
# Get all sheet The name of the
print(wb.get_sheet_names())
# According to the sheet The name for sheet
a_sheet = wb.get_sheet_by_name('Sheet1')
# To obtain sheet The name
print(a_sheet.title)
# Gets what is currently displayed sheet, You can also use wb.get_active_sheet()
sheet = wb.active
9
After merging, you can only write data to the upper left corner, which is the coordinate in the interval: the left side.
If all the cells to be merged have data, only the upper-left corner is preserved and the rest is discarded. In other words, if data is not written in the upper left corner before merging, there will be no data in the merged cell.
Here is the code to split the cells. After splitting, the value is returned to A1.
sheet.unmerge_cells('A1:C3')
Here we take the commonly used words, specifically look at the openpyxl document