Is there any way to identify whether a PDF has been edited/tampered with, and the exact location of the edit, using Python?

  • Thread starter: Abhishek Tanksali (Guest)
I am working on identifying forgery/tampering in bank statement PDF documents. Info metadata and XMP metadata are not always present in the PDFs I have, so I am not able to create a generalized rule to identify tampered PDFs. I am using Python libraries such as PyMuPDF, PDFMiner, and PyPDF2.
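
To make the metadata-based check described above concrete, here is a minimal sketch using only the Python standard library. The function name `metadata_signals` and the choice of signals are assumptions made for illustration; they do not come from PyMuPDF, PDFMiner, or PyPDF2. It scans the raw bytes for Info dictionary entries, an XMP packet, and the number of `%%EOF` markers. None of these proves tampering: Info entries can sit inside compressed object streams where this scan will not see them, and more than one `%%EOF` can come from a linearized file or a legitimate incremental save.

```python
import re


def metadata_signals(pdf_bytes: bytes) -> dict:
    """Collect crude, non-conclusive signals from raw PDF bytes.

    Hypothetical helper for illustration only. It reads uncompressed,
    literal-string Info entries and will miss hex-encoded strings,
    strings containing escaped parentheses, and compressed objects.
    """

    def first(pattern: bytes):
        # Return the first captured group as text, or None if absent.
        match = re.search(pattern, pdf_bytes)
        return match.group(1).decode("latin-1") if match else None

    return {
        "producer": first(rb"/Producer\s*\(([^)]*)\)"),
        "creation_date": first(rb"/CreationDate\s*\(([^)]*)\)"),
        "mod_date": first(rb"/ModDate\s*\(([^)]*)\)"),
        # An XMP packet, when present, starts with this element.
        "has_xmp": b"<x:xmpmeta" in pdf_bytes,
        # Each saved revision of a PDF ends with an %%EOF marker.
        "eof_markers": pdf_bytes.count(b"%%EOF"),
    }
```

Running this on both the original and the edited file (read with `open(path, "rb").read()`) and comparing the two dictionaries shows which of these signals, if any, differ. When all of them are missing or identical, this kind of check has nothing to work with, which is the limitation described above.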

I have 2 questions:

  1. Is there any concrete way to identify whether a PDF has been tampered with (using Python or any other open-source technology)?
  2. If the PDF has been tampered with, which part of it was changed (using Python or any other open-source technology)?

I am attaching 2 PDFs for reference:

Original: "sbi statment_out2.pdf", link: https://drive.google.com/file/d/1DoWAKYcCudRO-Cwjbgf7RjiJUsF3DD3s/view?usp=sharing

Tampered using the Sejda online editor: "sbi statment_out2_Sejda_edited.pdf", link: https://drive.google.com/file/d/1J4eRy9tO3jN8AqEWNrKXtn40G6vdH5G3/view?usp=sharing

In the tampered PDF, I changed '2,412.00' under the 'Credit' column to '12.00'.

Kindly let me know of any open-source solution, preferably in Python.

Thanks.