Converting PDF Files to PNG Images with Python on Windows

  • 2020-06-12 09:55:26
  • OfStack

This article shows, with a worked example, how to convert PDF files into PNG images with Python on Windows. It is shared here for reference; the details are as follows:

Recently I needed to convert PDF files into images and wanted to do it in Python, so I spent a long time searching online.

1. The first piece of code I found turned out to do the reverse when I tried it: it can only convert an image to a PDF, not a PDF into an image.

Reference link: https://zhidao.baidu.com/question/745221795058982452.html

The code is as follows:


#!/usr/bin/env python
# Embeds an image in a landscape A4 PDF (image -> PDF, not PDF -> image).
import os
import sys
from reportlab.lib.pagesizes import A4, landscape
from reportlab.pdfgen import canvas
def conpdf(image_path):
    # Name the PDF after the image, dropping the directory and the extension.
    filename = os.path.splitext(os.path.basename(image_path))[0]
    f_pdf = filename + '.pdf'
    (w, h) = landscape(A4)
    c = canvas.Canvas(f_pdf, pagesize=landscape(A4))
    # Stretch the image over the whole page.
    c.drawImage(image_path, 0, 0, w, h)
    c.save()
    print("Done: " + f_pdf)
conpdf(sys.argv[1])

2. The second article I found was more detailed. Unfortunately, its code targets Linux, so it was of no use to me on Windows.

3. The third article pointed out that the PythonMagick library can do this. It requires downloading a prebuilt wheel, PythonMagick-0.9.10-cp27-none-win_amd64.whl.

Here I made another mistake. I had downloaded Python 2.7 from the official Python website thinking it was the 64-bit build, but it was actually 32-bit. As a result, my Python (32-bit) did not match the PythonMagick wheel (64-bit) that I had downloaded.
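
A mismatch like this can be caught before installing anything. The following is a minimal sketch, not part of the original workflow: `python_bitness` and `wheel_matches_interpreter` are hypothetical helper names, and the check relies only on the standard `win32` and `win_amd64` platform tags at the end of a wheel's file name.

```python
import struct


def python_bitness():
    """Return 32 or 64, based on the pointer size of the running interpreter."""
    return struct.calcsize("P") * 8


def wheel_matches_interpreter(wheel_name, bits=None):
    """Check a Windows wheel's platform tag against the interpreter bitness.

    Hypothetical helper for illustration; it only looks at the file name.
    """
    if bits is None:
        bits = python_bitness()
    if wheel_name.endswith("win_amd64.whl"):
        return bits == 64
    if wheel_name.endswith("win32.whl"):
        return bits == 32
    # Not a Windows-specific wheel, so there is nothing to compare.
    return True


if __name__ == "__main__":
    wheel = "PythonMagick-0.9.10-cp27-none-win_amd64.whl"
    print("Interpreter is %d-bit" % python_bitness())
    print("Wheel matches: %s" % wheel_matches_interpreter(wheel))
```

Run it with the same interpreter that pip installs into, since several Python versions can coexist on one machine.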

4. I then kept searching and went through many Stack Overflow posts, where I found two pieces of code. Both of them also require the PyPDF2 and ghostscript modules.

First, install the PyPDF2, PythonMagick, and ghostscript modules with pip.


C:\Users\Administrator>pip install PyPDF2
Collecting PyPDF2
 Using cached PyPDF2-1.25.1.tar.gz
Installing collected packages: PyPDF2
 Running setup.py install for PyPDF2
Successfully installed PyPDF2-1.25.1
You are using pip version 7.1.2, however version 8.1.2 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.
C:\Users\Administrator>pip install C:\PythonMagick-0.9.10-cp27-none-win_amd64.whl
Processing c:\pythonmagick-0.9.10-cp27-none-win_amd64.whl
Installing collected packages: PythonMagick
Successfully installed PythonMagick-0.9.10
You are using pip version 7.1.2, however version 8.1.2 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.
C:\Users\Administrator>pip install ghostscript
Collecting ghostscript
 Downloading ghostscript-0.4.1.tar.bz2
Requirement already satisfied (use --upgrade to upgrade): setuptools in c:\python27\lib\site-packages (from ghostscript)
Installing collected packages: ghostscript
 Running setup.py install for ghostscript
Successfully installed ghostscript-0.4.1
You are using pip version 7.1.2, however version 8.1.2 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.
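
After the installs, it can be worth confirming that all three modules are visible to the interpreter before running any conversion code. The sketch below is an illustration only: `missing_modules` is a hypothetical helper, and it assumes Python 3.4 or later for `importlib.util.find_spec`, whereas the session above uses Python 2.7.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found by the import system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Import names as used by the code in this article.
    required = ["PyPDF2", "PythonMagick", "ghostscript"]
    missing = missing_modules(required)
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All required modules are installed.")
```

This only checks that the modules can be located; it does not verify that their native dependencies load correctly.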

Here's the code

Code 1:


import os
import ghostscript
from PyPDF2 import PdfFileReader, PdfFileWriter
from tempfile import NamedTemporaryFile
from PythonMagick import Image
reader = PdfFileReader(open("C:/deep.pdf", "rb"))
for page_num in range(reader.getNumPages()):
    # Write the current page to its own temporary single-page PDF.
    writer = PdfFileWriter()
    writer.addPage(reader.getPage(page_num))
    temp = NamedTemporaryFile(prefix=str(page_num), suffix=".pdf", delete=False)
    writer.write(temp)
    print(temp.name)
    tempname = temp.name
    # Close the file before another library opens it by name.
    temp.close()
    # Let PythonMagick read the single-page PDF and save it as a PNG.
    im = Image(tempname)
    #im.density("3000") # DPI, for better quality
    #im.read(tempname)
    im.write("some_%d.png" % (page_num))
    os.remove(tempname)
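
The temporary-file handling in Code 1 matters on Windows, where a `NamedTemporaryFile` generally cannot be opened a second time by name while it is still open. That is why the file is created with `delete=False`, closed, handed to the reader, and then removed by hand. The pattern can be sketched with the standard library alone; `round_trip_temp_file` is a hypothetical name, and a plain `open()` stands in for PythonMagick's `Image(tempname)`.

```python
import os
from tempfile import NamedTemporaryFile


def round_trip_temp_file(data, suffix=".pdf"):
    """Write data to a temp file, close it, read it back by name, then delete it.

    Returns (path, bytes_read). The path no longer exists when this returns.
    """
    temp = NamedTemporaryFile(suffix=suffix, delete=False)
    try:
        temp.write(data)
    finally:
        # Close first so that a second reader can open the file on Windows.
        temp.close()
    try:
        # Stand-in for the library call that would consume the file.
        with open(temp.name, "rb") as handle:
            return temp.name, handle.read()
    finally:
        # delete=False means cleanup is our responsibility.
        os.remove(temp.name)


if __name__ == "__main__":
    path, content = round_trip_temp_file(b"%PDF-1.4 placeholder")
    print("Read %d bytes via %s" % (len(content), path))
```

The `try`/`finally` blocks ensure the temporary file is removed even if the consumer raises an error.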

Code 2:


import sys
import PyPDF2
import PythonMagick
import ghostscript
pdffilename = "C:\deep.pdf"
pdf_im = PyPDF2.PdfFileReader(file(pdffilename, "rb"))
print '1'
npage = pdf_im.getNumPages()
print('Converting %d pages.' % npage)
for p in range(npage):
 im = PythonMagick.Image()
 im.density('300')
 im.read(pdffilename + '[' + str(p) +']')
 im.write('file_out-' + str(p)+ '.png')
 #print pdffilename + '[' + str(p) +']','file_out-' + str(p)+ '.png'

Code 2 fails with the following error:


Traceback (most recent call last):
 File "C:\c.py", line 15, in <module>
 im.read(pdffilename + '[' + str(p) +']')
RuntimeError: pythonw.exe: PostscriptDelegateFailed `C:\DEEP.pdf': No such file or directory @ error/pdf.c/ReadPDFImage/713

The error is always raised on the line im.read(pdffilename + '[' + str(p) + ']').

I searched online for the error message but found nothing useful. I suspected the problem was related to Ghostscript itself, so I looked for an installer and found a download link on GitHub, but when I followed it the page said the file could not be downloaded.

Finally, I found a Ghostscript 9.15 package in CSDN's download section that contains both 32-bit and 64-bit installers. After installing the 64-bit version, I ran the code above again and it worked.
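
A rough way to check whether a Ghostscript executable is reachable from Python is to search the PATH for it. The sketch below assumes Python 3.3 or later for `shutil.which`; `find_ghostscript` is a hypothetical helper. The names `gswin64c` and `gswin32c` are the console executables shipped by the Windows Ghostscript installers, and `gs` is the usual name on Linux and macOS. ImageMagick on Windows may locate Ghostscript by means other than the PATH, so a miss here is a hint, not proof that Ghostscript is absent.

```python
import shutil

# Candidate executable names, tried in order.
GS_NAMES = ("gswin64c", "gswin32c", "gs")


def find_ghostscript(names=GS_NAMES):
    """Return the full path of the first Ghostscript executable on PATH, or None."""
    for name in names:
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    location = find_ghostscript()
    if location:
        print("Ghostscript found at: " + location)
    else:
        print("No Ghostscript executable found on PATH.")
```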

However, Code 2 still needed the following changes (passing the page specifier to the Image constructor and dropping the separate im.read call); otherwise it raised the No such file or directory @ error/pdf.c/ReadPDFImage/713 error:


# code 2
import sys
import PyPDF2
import PythonMagick
import ghostscript
# Raw string so the backslash in the Windows path is never treated as an escape.
pdffilename = r"C:\deep.pdf"
pdf_im = PyPDF2.PdfFileReader(open(pdffilename, "rb"))
print('1')
npage = pdf_im.getNumPages()
print('Converting %d pages.' % npage)
for p in range(npage):
    # "file.pdf[p]" selects page p (zero-based); the constructor reads it directly.
    im = PythonMagick.Image(pdffilename + '[' + str(p) +']')
    im.density('300')
    #im.read(pdffilename + '[' + str(p) +']')
    im.write('file_out-' + str(p)+ '.png')
    #print pdffilename + '[' + str(p) +']','file_out-' + str(p)+ '.png'
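
The loop above relies on ImageMagick's `file.pdf[N]` syntax, where `N` is a zero-based page index appended to the file name. The naming logic can be separated from the conversion and checked on its own. This is a minimal sketch; `page_jobs` is a hypothetical helper, and its default output prefix simply mirrors the `file_out-` names used above.

```python
def page_jobs(pdf_path, page_count, prefix="file_out-"):
    """Return (input_spec, output_name) pairs, one per page of the PDF.

    input_spec uses the "file.pdf[N]" page selector with a zero-based index.
    """
    return [
        ("%s[%d]" % (pdf_path, page), "%s%d.png" % (prefix, page))
        for page in range(page_count)
    ]


if __name__ == "__main__":
    for spec, out_name in page_jobs(r"C:\deep.pdf", 3):
        print(spec + " -> " + out_name)
```

Each pair could then be passed to the image library in turn, which keeps the string handling easy to test without Ghostscript installed.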

The main lesson from this experience is that most of the time spent on the problem went into searching for information and checking whether what I found was actually useful. The ability to search for information effectively is very important.

During the search I found only a few articles about PythonMagick from China; most of the potentially helpful ones were from abroad. Even those did not solve my problem or give me useful clues, and in the end I solved it by reasoning it through myself.

I hope this article helps you with your Python programming.

