Operating ppt files with the python-pptx module

2021-08-31 08:32:46
OfStack

ppt has become an essential skill for professionals thanks to its strong visual presentation and demonstration effect. Designing a good ppt is a discipline in itself, and dedicated courses exist for both design skills and operating techniques.

This article introduces how to operate ppt files with Python. The advantage of programming is processing speed. High-end ppt design still needs to be "people-oriented", so this module is mainly used to extract and add the basic elements of a ppt. It suits converting large amounts of content, such as Word to ppt, and removes a lot of tedious manual work. Although it provides some basic styling, it cannot meet the aesthetic requirements of everyday office slides.

In this module, a ppt is split into the following elements

1. presentations. Represents the entire ppt document

2. slides. Represents each page of a ppt document

3. shapes

4. placeholders

Common operations corresponding to the above classification are as follows

1. presentations

Used to open, create, and save ppt documents as follows


>>> from pptx import Presentation
>>> # Create a new ppt document
>>> prs = Presentation()
>>> # Open an existing ppt document
>>> prs = Presentation('input.pptx')
>>> # Save the ppt document
>>> prs.save('test.pptx')

2. slides

When adding a slide, you need to specify its layout. The module's default template includes the following 9 standard layouts

1. Title

2. Title and Content

3. Section Header

4. Two Content

5. Comparison

6. Title Only

7. Blank

8. Content with Caption

9. Picture with Caption

These nine layouts are accessed by numeric index 0 through 8. A slide with a specified layout is added as follows


>>> title_slide_layout = prs.slide_layouts[0]
>>> slide = prs.slides.add_slide(title_slide_layout)

3. shapes

shapes is a container. When making a ppt, every basic element on a slide, such as a text box, table, or picture, occupies part of the page, either a rectangular area or some other custom shape. shapes is the collection of all these basic elements on a slide, and it is accessed as follows


shapes = slide.shapes

For an individual shape in the collection, we can get and set its properties. The most commonly used is the text property, shown here on the title shape of the slide


>>> shapes.title.text = 'hello world'

You can also add elements through the add_* family of methods. A text box is added as follows


>>> from pptx.util import Inches, Pt
>>> left = top = width = height = Inches(1)
>>> txBox = slide.shapes.add_textbox(left, top, width, height)
>>> tf = txBox.text_frame
>>> tf.text = "first paragraph"
>>> p = tf.add_paragraph()
>>> p.text = "second paragraph"

A table is added as follows


>>> rows = cols = 2
>>> left = top = Inches(2.0)
>>> width = Inches(6.0)
>>> height = Inches(0.8)
>>> table = shapes.add_table(rows, cols, left, top, width, height).table
>>> table.columns[0].width = Inches(2.0)
>>> table.columns[1].width = Inches(4.0)
>>> # write column headings
>>> table.cell(0, 0).text = 'Foo'
>>> table.cell(0, 1).text = 'Bar'

4. placeholders

shapes contains all the basic elements on a slide, while placeholders are the elements predefined by the slide layout, so placeholders is a subset of shapes. A placeholder is accessed by numeric index (the index is the placeholder's idx value rather than its position in the list), as follows


>>> slide.placeholders[1]
<pptx.shapes.placeholder.SlidePlaceholder object at 0x03F73A90>
>>> slide.placeholders[1].placeholder_format.idx
1
>>> slide.placeholders[1].name
'Subtitle 2'

Placeholders are elements that already exist on the page. After obtaining a placeholder, you can add new content to it through the insert_* family of methods, such as insert_picture on a picture placeholder.

Understanding the above hierarchy helps when reading and writing ppt files. Besides writing, you can also batch extract specific elements from a ppt by reading it. Taking text as an example, the extraction works as follows


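from pptx import Presentation

# path_to_presentation is the path of the .pptx file to read
prs = Presentation(path_to_presentation)

# text_runs collects the text of every run in the document
text_runs = []

for slide in prs.slides:
    for shape in slide.shapes:
        # skip shapes that cannot hold text, such as pictures
        if not shape.has_text_frame:
            continue
        for paragraph in shape.text_frame.paragraphs:
            for run in paragraph.runs:
                text_runs.append(run.text)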

With this module, we can quickly build the basic framework of a ppt and also extract specific elements from a ppt in batches, for example extracting text into a Word document or extracting tables into Excel files. In short, this module is well suited to replacing large amounts of tedious manual copying and pasting.