How to operate ppt files with the python-pptx module
- 2021-08-31 08:32:46
- OfStack
ppt has become an essential skill for professionals thanks to its strong visual presentation and demonstration effect. Designing a ppt is a discipline in its own right, and dedicated courses exist for both its design skills and its operating methods.
This article introduces how to operate ppt files with Python. The advantage of programming is processing speed. The design of a polished, high-end ppt still has to be done by people, so the main use case of this module is extracting and adding the basic elements of a ppt. It suits bulk content conversion, such as word to ppt, where it removes a lot of tedious manual work. Although it provides some basic style settings, it cannot meet the aesthetic requirements of everyday office ppt.
In this module, a ppt is split into the following elements
1. presentations, representing the entire ppt document
2. slides, representing each page of a ppt document
3. shapes
4. placeholders
Common operations for each of these elements are as follows
1. presentations
A Presentation object is used to create, open, and save ppt documents, as follows
>>> from pptx import Presentation
# Create a new ppt document
>>> prs = Presentation()
# Open an existing ppt document
>>> prs = Presentation('input.pptx')
# Save the ppt document
>>> prs.save('test.pptx')
2. slides
When adding a new slide, you need to specify its layout. The following 9 layouts are built into this module
1. Title
2. Title and Content
3. Section Header
4. Two Content
5. Comparison
6. Title Only
7. Blank
8. Content with Caption
9. Picture with Caption
Layouts are accessed by the numeric subscripts 0 through 8. A slide with a specified layout is added as follows
>>> title_slide_layout = prs.slide_layouts[0]
>>> slide = prs.slides.add_slide(title_slide_layout)
3. shapes
shapes is a container. On a slide, every basic element, such as a text box, table, or picture, occupies a region of the page, either a rectangular area or some other custom shape. shapes is the collection of all these elements on a slide, and it is accessed as follows
>>> shapes = slide.shapes
For an individual shape, we can get and set its properties. The most commonly used is the text property; for example, the text of the title shape is set as follows
>>> shapes.title.text = 'hello world'
New elements are added through the add_* family of methods. A text box is added as follows
>>> from pptx.util import Inches, Pt
>>> left = top = width = height = Inches(1)
>>> txBox = slide.shapes.add_textbox(left, top, width, height)
>>> tf = txBox.text_frame
>>> tf.text = "first paragraph"
>>> p = tf.add_paragraph()
>>> p.text = "second paragraph"
A table is added as follows
>>> rows = cols = 2
>>> left = top = Inches(2.0)
>>> width = Inches(6.0)
>>> height = Inches(0.8)
>>> table = shapes.add_table(rows, cols, left, top, width, height).table
>>> table.columns[0].width = Inches(2.0)
>>> table.columns[1].width = Inches(4.0)
>>> # write column headings
>>> table.cell(0, 0).text = 'Foo'
>>> table.cell(0, 1).text = 'Bar'
4. placeholders
shapes is the collection of all elements on a slide, while placeholders contains only the placeholder shapes, the pre-formatted containers a slide inherits from its layout, so placeholders is a subset of shapes. A placeholder is looked up by its idx value, which works like a dictionary key rather than a list position, as follows
>>> slide.placeholders[1]
<pptx.shapes.placeholder.SlidePlaceholder object at 0x03F73A90>
>>> slide.placeholders[1].placeholder_format.idx
1
>>> slide.placeholders[1].name
'Subtitle 2'
A placeholder is an element that already exists on the page. After obtaining a placeholder, you can add new content to it through the insert_* family of methods, for example insert_picture on a picture placeholder or insert_table on a table placeholder.
Understanding this hierarchy helps when reading and writing ppt files. Besides writing, you can also read a ppt and extract specific elements in batches. Taking text as an example, it is extracted as follows
from pptx import Presentation
# path_to_presentation is the path of the .pptx file to read
prs = Presentation(path_to_presentation)
text_runs = []
for slide in prs.slides:
    for shape in slide.shapes:
        # skip shapes that cannot hold text, such as pictures
        if not shape.has_text_frame:
            continue
        for paragraph in shape.text_frame.paragraphs:
            for run in paragraph.runs:
                text_runs.append(run.text)
With this module, we can quickly build the basic framework of a ppt, and also extract specific elements from a ppt in batches, such as extracting text and converting it to word, or extracting tables and converting them to excel files. In short, this module is suited to replacing large amounts of tedious manual copy and paste.