Delete content from PDF using Python


I need to strip everything except a single image from a large number of PDFs (the structure of the PDFs is always the same).

Here is a screenshot of the PDF content:

The image marked in yellow is the one I want to keep; all the paths, the text objects, and the other, smaller image should be deleted. I have looked at some Python PDF libraries such as PyPDF, but as far as I can tell they do not give me access to that content, only to comments, annotations, and the like.

Does anyone have a solution?
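
One possible approach, sketched here with PyMuPDF (imported as fitz) rather than PyPDF: instead of deleting the paths and text, copy the wanted image onto a fresh, otherwise empty page. The question does not say how to identify the yellow-marked image, so this sketch assumes it is the image with the largest pixel area on each page. The function names pick_largest_image and keep_only_largest_image are made up for illustration. Note that extract_image returns the raw image data without any soft mask, so images that rely on transparency may look different in the output.

```python
def pick_largest_image(images):
    """Return the xref of the image with the largest pixel area.

    `images` is a list of tuples shaped like the entries returned by
    PyMuPDF's Page.get_images(): (xref, smask, width, height, ...).
    ASSUMPTION: the image to keep is the largest one on the page.
    """
    if not images:
        raise ValueError("page contains no images")
    largest = max(images, key=lambda item: item[2] * item[3])
    return largest[0]


def keep_only_largest_image(src_path, dst_path):
    """Write a new PDF whose pages contain only the largest image of each source page."""
    import fitz  # PyMuPDF; imported here so pick_largest_image has no dependency

    src = fitz.open(src_path)
    dst = fitz.open()  # new, empty PDF
    try:
        for page in src:
            images = page.get_images(full=True)
            if not images:
                continue  # nothing to keep on this page
            xref = pick_largest_image(images)
            rects = page.get_image_rects(xref)  # where the image is drawn
            if not rects:
                continue
            image_bytes = src.extract_image(xref)["image"]
            # Same page size as the original, with the image in its old position.
            new_page = dst.new_page(width=page.rect.width, height=page.rect.height)
            new_page.insert_image(rects[0], stream=image_bytes)
        if dst.page_count == 0:
            raise ValueError("no images found in " + src_path)
        dst.save(dst_path)
    finally:
        dst.close()
        src.close()
```

For a batch of files, keep_only_largest_image("in.pdf", "out.pdf") could be called in a loop over the input paths (the file names here are placeholders). If the wanted image is not always the largest one, the selection rule in pick_largest_image is the only part that needs to change.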


