Python USES openpyxl to read and write excel files

  • 2020-06-07 04:43:44
  • OfStack

This is a third square library, and we can handle it xlsx Excel file in format. pip install openpyxl The installation. If you're using Aanconda, you should already have it.

Read the Excel file

You need to import the relevant functions.


from openpyxl import load_workbook

#  The default is read-write and can be specified if necessary write_only and read_only for True
wb = load_workbook('mainbuilding33.xlsx')

The default open file is read-write, and you can specify parameters if necessary read_only for True .

Get the worksheet --Sheet


#  Get all sheet The name of the 
print(wb.get_sheet_names())
#  According to the sheet The name for sheet
a_sheet = wb.get_sheet_by_name('Sheet1')
#  To obtain sheet The name 
print(a_sheet.title)
#  Gets what is currently displayed sheet,  You can also use wb.get_active_sheet()
sheet = wb.active 

Get cell


#  Get the value of a cell and observe excel Discovery is also a sequence of letters followed by Numbers, that is, columns followed by rows 
b4 = sheet['B4']
#  Return, respectively, 
print(f'({b4.column}, {b4.row}) is {b4.value}') #  The number returned is int type 

#  So instead of just using subscripts, you can use subscripts cell function ,  I'm going to put a number in here B2
b4_too = sheet.cell(row=4, column=2)
print(b4_too.value)

b4.column return B , b4.row Returns 4, and value is the value of that cell. cell also has 1 attribute coordinate Cells like b4 return coordinates B4 .

Get the largest row and the largest column


#  Get the largest column and the largest row 
print(sheet.max_row)
print(sheet.max_column)

Gets the rows and columns

sheet.rows is a generator that contains data for each row, which is then wrapped by one tuple. sheet.columns is similar, but inside each tuple is a cell per column.

#  Returns because it presses a row A1, B1, C1 In this order 
for row in sheet.rows:
  for cell in row:
    print(cell.value)

# A1, A2, A3 In this order 
for column in sheet.columns:
  for cell in column:
    print(cell.value)

The above code gets the data for all the cells. What if you want to get data for a particular row? I'm just going to give it an index, because sheet.rows It's a generator type. You can't use indexes. You can use indexes after converting to list. pip install openpyxl0 This gets the tuple object on line 2.


for cell in list(sheet.rows)[2]:
  print(cell.value)

How do I get arbitrary interval cells?

You can use the range function, written as follows, to obtain all cells with A1 as the upper left corner and B3 as the lower right corner of the rectangular region. Note that range starts at 1, because in openpyxl it is not the programming language's custom to represent the first value with 0 in order to match the expression 1 in Excel.


for i in range(1, 4):
  for j in range(1, 3):
    print(sheet.cell(row=i, column=j))
    
# out
<Cell mainbuilding33.A1>
<Cell mainbuilding33.B1>
<Cell mainbuilding33.A2>
<Cell mainbuilding33.B2>
<Cell mainbuilding33.A3>
<Cell mainbuilding33.B3>

It can also be used as a slice. sheet['A1':'B3'] Returns 1 tuple, which is still a tuple internally, consisting of 1 tuple per row of cells.


for row_cell in sheet['A1':'B3']:
  for cell in row_cell:
    print(cell)
    

for cell in sheet['A1':'B3']:
  print(cell)

# out
(<Cell mainbuilding33.A1>, <Cell mainbuilding33.B1>)
(<Cell mainbuilding33.A2>, <Cell mainbuilding33.B2>)
(<Cell mainbuilding33.A3>, <Cell mainbuilding33.B3>)

Gets the column number based on the letter, returns the letter based on the column number

You need to import these two functions openpyxl.utils


from openpyxl.utils import get_column_letter, column_index_from_string

#  Returns the letter based on the number of the column 
print(get_column_letter(2)) # B
#  Returns the number of the column based on the letter 
print(column_index_from_string('D')) # 4

Write the data to Excel

Worksheet correlation

You need to import WorkBook


from openpyxl import Workbook

wb = Workbook()

This creates a new worksheet (just not yet saved).

To specify write-only mode, you can specify parameters write_only=True . The default writable and readable mode is generally fine.


#  Get all sheet The name of the 
print(wb.get_sheet_names())
#  According to the sheet The name for sheet
a_sheet = wb.get_sheet_by_name('Sheet1')
#  To obtain sheet The name 
print(a_sheet.title)
#  Gets what is currently displayed sheet,  You can also use wb.get_active_sheet()
sheet = wb.active 

0

Write cell

You can also use formulas


#  Get all sheet The name of the 
print(wb.get_sheet_names())
#  According to the sheet The name for sheet
a_sheet = wb.get_sheet_by_name('Sheet1')
#  To obtain sheet The name 
print(a_sheet.title)
#  Gets what is currently displayed sheet,  You can also use wb.get_active_sheet()
sheet = wb.active 

1

But if you're reading it you have to add it data_only=True If you read B9, you return the number. If you do not add this parameter, you return the formula itself '=AVERAGE(B2:B8)'.

append function

You can add more than one line at a time, writing from the blank line at line 1 (all below are blank lines).


#  Get all sheet The name of the 
print(wb.get_sheet_names())
#  According to the sheet The name for sheet
a_sheet = wb.get_sheet_by_name('Sheet1')
#  To obtain sheet The name 
print(a_sheet.title)
#  Gets what is currently displayed sheet,  You can also use wb.get_active_sheet()
sheet = wb.active 

2

Due to the append Functions can only be written on a line. What if we want to write it as a column. Can append meet the requirements? Consider the list above nested as a matrix. You just take the transpose of the matrix. This can be done using the zip() function, but the inner list becomes a tuple. Both are iterable objects, not affected.


list(zip(*rows))

# out
[('Number', 2, 3, 4, 5, 6, 7),
 ('data1', 40, 40, 50, 30, 25, 50),
 ('data2', 30, 25, 30, 10, 5, 10)]

Explain what's up there list(zip(*rows)) First of all, *rows I'm going to break up the list, so I'm going to fill in a bunch of parameters, zip Extracted from a list of the first value is combined into one tuple, to extract the second value from each list combined into one tuple, 1 until the very end of the list of the shortest one value extracted after the end, after a longer list of values, in other words, the number of tuples in the final by the original each parameter (object iteration) determined the shortest length. For example, now delete any value, the minimum length of the list is 2, data2, the value of the 1 column (see vertical) is all deleted.


#  Get all sheet The name of the 
print(wb.get_sheet_names())
#  According to the sheet The name for sheet
a_sheet = wb.get_sheet_by_name('Sheet1')
#  To obtain sheet The name 
print(a_sheet.title)
#  Gets what is currently displayed sheet,  You can also use wb.get_active_sheet()
sheet = wb.active 

4

Finally, zip returns an zip object, and no data is seen. Just use the list conversion. zip makes it easy to write data as columns.

Save the file

Remember to save the file at the end of all operations. Specify the path and file name with the suffix xlsx.


#  Get all sheet The name of the 
print(wb.get_sheet_names())
#  According to the sheet The name for sheet
a_sheet = wb.get_sheet_by_name('Sheet1')
#  To obtain sheet The name 
print(a_sheet.title)
#  Gets what is currently displayed sheet,  You can also use wb.get_active_sheet()
sheet = wb.active 

5

Set the cell style --Style

Lead in the required classes read_only0

You can specify font correlation, color, and alignment, respectively.

The font


#  Get all sheet The name of the 
print(wb.get_sheet_names())
#  According to the sheet The name for sheet
a_sheet = wb.get_sheet_by_name('Sheet1')
#  To obtain sheet The name 
print(a_sheet.title)
#  Gets what is currently displayed sheet,  You can also use wb.get_active_sheet()
sheet = wb.active 

6

The above code specifies the isoline 24 in bold italics with a red font color. Assign the Font object to Font using the font attribute directly.

alignment

The cell attribute is also used directly read_only1 , where vertical center and horizontal center are specified. In addition to center, it can also be used right、left And so on.


#  Get all sheet The name of the 
print(wb.get_sheet_names())
#  According to the sheet The name for sheet
a_sheet = wb.get_sheet_by_name('Sheet1')
#  To obtain sheet The name 
print(a_sheet.title)
#  Gets what is currently displayed sheet,  You can also use wb.get_active_sheet()
sheet = wb.active 

7

Set the row height and column width

Sometimes the data is too long to display, the need to lengthen the cell.


#  The first 2 High line 
sheet.row_dimensions[2].height = 40
# C Column column width 
sheet.column_dimensions['C'].width = 30

Merge and split cells

A merged cell takes the cell in the upper left corner of the merged area as a reference and overwrites other cells to make them one large cell.

Instead, splitting the cell returns the value of the large cell to its original upper-left position.


#  Get all sheet The name of the 
print(wb.get_sheet_names())
#  According to the sheet The name for sheet
a_sheet = wb.get_sheet_by_name('Sheet1')
#  To obtain sheet The name 
print(a_sheet.title)
#  Gets what is currently displayed sheet,  You can also use wb.get_active_sheet()
sheet = wb.active 

9

After merging, you can only write data to the upper left corner, which is the coordinate in the interval: the left side.

If all the cells to be merged have data, only the upper-left corner is preserved and the rest is discarded. In other words, if data is not written in the upper left corner before merging, there will be no data in the merged cell.

Here is the code to split the cells. After splitting, the value is returned to A1.


sheet.unmerge_cells('A1:C3')

Here we take the commonly used words, specifically look at the openpyxl document


Related articles: