OiO.lk Blog pdf Python pdf to docx
pdf

Python pdf to docx


I have a lot of pdf files that have images and tables and I have to convert them to word (.doc or .docx) files. The main problem is that pdf2docx is very slow, but it’s the only library that I have found that is able to convert pdf with images. Do you, guys, know any libraries except pdf2docx or other ways that can do it?

I’ve tried this:

from pdf2docx import parse

parse('output.pdf', 'output2.docx')

And this:

from pdf2docx import Converter

cv = Converter('output.pdf')
cv.convert('output2.docx')
cv.close()



You need to sign in to view this answers

Exit mobile version