I have a tagged PDF (PDF/UA compliant) with a simple structure (Document > Part > Sect > H1, P).
I need to add the annotation:
<?xml version="1.0" encoding="UTF-8"?>
<xfdf xmlns="http://ns.adobe.com/xfdf/">
<annots>
<text color="#FFC333" date="D:20240425000000" flags="print,nozoom,norotate" opacity="0.600000" page="0" rect="43.086,707.79,63.086,732.79" title="ABC">
<contents-richtext>
<body xmlns="http://www.w3.org/1999/xhtml">
<p dir="ltr">Sample comment</p>
</body>
</contents-richtext>
<popup flags="nozoom,norotate" open="yes" page="0" rect="595.0,507.78998,790.0,632.79"/>
</text>
</annots>
</xfdf>
The annotation should be nested in an Annot tag inside the P tag. I created this structure manually in Adobe Acrobat as a reference.
I’m able to import the XFDF XML into the PDF:
PDDocument document = PDDocument.load(new File("tagged.pdf"));
FDFDocument fdfDoc = FDFDocument.loadXFDF(new File("annotation.xfdf"));
List<FDFAnnotation> fdfAnnots = fdfDoc.getCatalog().getFDF().getAnnotations();
List<PDAnnotation> pageAnnotations = new ArrayList<>();
for (int i = 0; i < fdfAnnots.size(); i++) {
    FDFAnnotation fdfannot = fdfAnnots.get(i);
    PDAnnotation pdfannot = PDAnnotation.createAnnotation(fdfannot.getCOSObject());
    pdfannot.constructAppearances();
    pageAnnotations.add(pdfannot);
}
document.getPage(0).setAnnotations(pageAnnotations);
document.save(new File("tagged_with_annotation.pdf"));
but the veraPDF checker outputs:
An annotation, excluding annotations of subtype Widget, PrinterMark or Link, shall be nested within an Annot tag
and PDF Accessibility Checker (PAC tool) outputs:
Annotation is not nested inside an "Annot" structure element
This is expected, because I didn't create an Annot tag (or choose its position in the tree) for the imported annotation.
I can also obtain the object for the P tag:
PDDocumentCatalog pdDocumentCatalog = document.getDocumentCatalog();
PDStructureTreeRoot pdStructureTreeRoot = pdDocumentCatalog.getStructureTreeRoot();
COSArray aDocument = (COSArray) pdStructureTreeRoot.getK().getCOSObject();
COSObject oDocument = (COSObject) aDocument.get(0);
COSArray aPart = (COSArray) oDocument.getItem(COSName.K).getCOSObject();
COSObject oPart = (COSObject) aPart.get(0);
COSArray aSect = (COSArray) oPart.getItem(COSName.K).getCOSObject();
COSObject oSect = (COSObject) aSect.get(0);
COSArray aP = (COSArray) oSect.getItem(COSName.K).getCOSObject();
COSObject oP = (COSObject) aP.get(0);
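The chain of casts above hard-codes "first child" at every level and assumes every /K entry is an array, although the PDF specification also allows /K to be a single dictionary or an integer MCID. A search by structure type may be less brittle. The following is a library-free sketch on plain Java maps and lists standing in for the COS dictionaries and arrays; the class `StructSearchModel` and its method are hypothetical and are not PDFBox API.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical model: Map = structure element dictionary, List = /K array. */
final class StructSearchModel {

    /**
     * Returns the last element in document (pre-order) sequence whose /S
     * equals the given type, or null if there is none.
     * The node may be a /K array, a single element, or a content item.
     */
    static Map<?, ?> findLast(Object node, String type) {
        if (node instanceof List) {
            List<?> kids = (List<?>) node;
            // Walk the kids backwards so the last match is found first.
            for (int i = kids.size() - 1; i >= 0; i--) {
                Map<?, ?> hit = findLast(kids.get(i), type);
                if (hit != null) {
                    return hit;
                }
            }
            return null;
        }
        if (!(node instanceof Map)) {
            return null; // integer MCID, missing /K, or another content item
        }
        Map<?, ?> elem = (Map<?, ?>) node;
        // Descendants come after the element itself in document order,
        // so a nested match takes precedence over the element.
        Map<?, ?> nested = findLast(elem.get("K"), type);
        if (nested != null) {
            return nested;
        }
        return type.equals(elem.get("S")) ? elem : null;
    }
}
```

Calling `StructSearchModel.findLast(rootKids, "P")` on a model of the tree in this question would return the last P under Sect, regardless of whether each level stores its kids as an array or as a single entry.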
But I don't understand how to edit the existing tag tree in the PDF.
Question: how do I add the Annot tag and the annotation to the existing tag tree of the PDF using PDFBox?
UPDATE: the example PDF and XFDF are here.
UPDATE 2: I've updated the file tagged_with_annotation_acrobat.pdf on Google Drive. The previous version was incorrect: it was missing the Annot tag.
UPDATE 3: To select the last P, change:
COSArray aP = oSect.getCOSArray(COSName.K);
to
COSArray aH1P = oSect.getCOSArray(COSName.K);
COSDictionary oP = (COSDictionary) aH1P.getObject(1);
COSArray aP = oP.getCOSArray(COSName.K);
and, to add the annotation directly inside the last P (without an outer P), use:
// add the annotation element
COSDictionary anDict = new COSDictionary();
anDict.setItem(COSName.S, COSName.ANNOT);
anDict.setItem(COSName.P, oP);
anDict.setItem(COSName.PG, page);
PDObjectReference objRef = new PDObjectReference();
anDict.setItem(COSName.K, objRef);
objRef.setReferencedObject(page.getAnnotations().get(0));
aP.add(anDict);
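One thing the snippet above does not show is the back-link from the annotation to its structure element. Per the PDF specification, an annotation referenced through an OBJR carries a /StructParent integer, and the structure tree root's /ParentTree maps that integer to the enclosing structure element. The following is a library-free sketch on plain Java maps that shows which entries would need to exist; the class `AnnotTagModel` and its methods are hypothetical and are not PDFBox API. In PDFBox, `PDAnnotation.setStructParent(int)` and the parent tree of `PDStructureTreeRoot` would be the places to set them. Whether a given checker needs these entries in order to pass is an assumption worth testing on the actual file.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical model of the dictionaries involved; NOT PDFBox API. */
final class AnnotTagModel {

    /** Creates a structure element dictionary with an empty /K array. */
    static Map<String, Object> structElem(String type, Map<String, Object> parent) {
        Map<String, Object> elem = new LinkedHashMap<>();
        elem.put("Type", "StructElem");
        elem.put("S", type);
        elem.put("P", parent); // null only for the top of this model
        elem.put("K", new ArrayList<Object>());
        return elem;
    }

    /** Nests the annotation in a new Annot element under parentP. */
    @SuppressWarnings("unchecked")
    static Map<String, Object> tagAnnotation(Map<String, Object> parentP,
                                             Map<String, Object> annotation,
                                             Map<String, Object> page,
                                             TreeMap<Integer, Object> parentTree) {
        // 1. Annot structure element as a kid of the chosen P.
        Map<String, Object> annotElem = structElem("Annot", parentP);
        annotElem.put("Pg", page);
        ((List<Object>) parentP.get("K")).add(annotElem);

        // 2. Object reference (OBJR) kid that points at the annotation.
        Map<String, Object> objr = new LinkedHashMap<>();
        objr.put("Type", "OBJR");
        objr.put("Obj", annotation);
        ((List<Object>) annotElem.get("K")).add(objr);

        // 3. Back-link: /StructParent is a key into the parent tree whose
        //    value is the Annot element. A real file would take the key
        //    from /ParentTreeNextKey; the model uses the next free integer.
        int key = parentTree.isEmpty() ? 0 : parentTree.lastKey() + 1;
        annotation.put("StructParent", key);
        parentTree.put(key, annotElem);
        return annotElem;
    }

    /** Simplified lookup: annotation -> parent tree -> element with /S Annot. */
    static boolean isNestedInAnnot(Map<String, Object> annotation,
                                   Map<Integer, Object> parentTree) {
        Object key = annotation.get("StructParent");
        if (!(key instanceof Integer)) {
            return false;
        }
        Object elem = parentTree.get(key);
        return elem instanceof Map && "Annot".equals(((Map<?, ?>) elem).get("S"));
    }
}
```

In this model, `isNestedInAnnot` returns false for a freshly imported annotation and true after `tagAnnotation` has run, which mirrors the difference between the file that fails the check and the one produced in Acrobat.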