Converting a PDF with forms into a text-only PDF (preserving the data) with iText

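I have several PDFs (a.pdf, b.pdf, c[0-9].pdf, d[0-9].pdf, ez.pdf) whose AcroForms were filled with multiple records using PDFBox.
The resulting files (aflat.pdf, bflat.pdf, c[0-9]flat.pdf, d[0-9]flat.pdf, ezflat.pdf) should have their forms removed (the dictionaries and whatever else Adobe uses), but the filled-in field values must remain on the PDF as plain text (setReadOnly is not what I want!).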

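As far as I can tell, PdfStamper can only remove the fields without keeping their content, but I have found some references to PdfContentByte as a way to preserve the content. Alas, the documentation is too brief for me to understand how I am supposed to do this.

As a last resort I could use FieldPosition to write directly onto the PDF. Has anyone run into a problem like this? How can I solve it?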

UPDATE: Saving only a single page of b.pdf yields a valid bfilled.pdf but a blank bflattened.pdf. Saving the whole document solves the problem.

    // PDDocument and PDPage come from PDFBox; PdfReader and PdfStamper come from iText.
    populateB();
    try (FileOutputStream stream = new FileOutputStream("bfilled.pdf")) {
        // Wrong approach: importing a single page into a new document will corrupt the fields.
        // PDDocument doc = new PDDocument();
        // doc.importPage((PDPage) pdfDocuments.get(0).getDocumentCatalog().getAllPages().get(0));
        // doc.save(stream);

        // Right approach: save the whole document instead.
        pdfDocuments.get(0).save(stream);
    }
    try (FileOutputStream stream = new FileOutputStream("bflattened.pdf")) {
        PdfStamper stamper = new PdfStamper(new PdfReader("bfilled.pdf"), stream);
        stamper.setFormFlattening(true); // write the field values into the page content
        stamper.close();
    }

2 Answers

3

Use PdfStamper.setFormFlattening(true) to remove the fields and write their values into the page as content.
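To make the suggestion concrete, here is a minimal sketch of flattening one filled file per input name. It assumes iText 5 (the `com.itextpdf.text.pdf` package; older releases use `com.lowagie.text.pdf`) is on the classpath. The class name `FlattenForms` and the helpers `flatName` and `flatten` are hypothetical names chosen for illustration; only the `aflat.pdf`-style naming comes from the question.

```java
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.PdfStamper;

import java.io.FileOutputStream;
import java.io.IOException;

public class FlattenForms {

    // Hypothetical helper: maps "b.pdf" to "bflat.pdf", the naming used in the question.
    static String flatName(String filledName) {
        if (!filledName.endsWith(".pdf")) {
            throw new IllegalArgumentException("expected a .pdf file name: " + filledName);
        }
        String base = filledName.substring(0, filledName.length() - ".pdf".length());
        return base + "flat.pdf";
    }

    // Reads a filled form and writes a copy in which the field values are
    // part of the page content and the form fields themselves are gone.
    static void flatten(String src, String dest) throws IOException, DocumentException {
        PdfReader reader = new PdfReader(src);
        try (FileOutputStream out = new FileOutputStream(dest)) {
            PdfStamper stamper = new PdfStamper(reader, out);
            stamper.setFormFlattening(true);
            stamper.close(); // the flattened output is written when the stamper is closed
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws Exception {
        // Each argument is the path of an already filled PDF, e.g. a.pdf b.pdf ez.pdf
        for (String name : args) {
            flatten(name, flatName(name));
        }
    }
}
```

Run with the filled files as arguments, for example `java FlattenForms a.pdf b.pdf`, this should leave `aflat.pdf` and `bflat.pdf` next to the inputs. The inputs must be complete documents saved by PDFBox, not single imported pages.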

Answered on 2024-04-25T23:55:02+08:00

When working with AcroForms, always save the whole document rather than importing a single page.

populateB();
try (FileOutputStream stream = new FileOutputStream("bfilled.pdf")) {
    // Importing the page into a new document will corrupt the fields, so do not do this:
    // PDDocument doc = new PDDocument();
    // doc.importPage((PDPage) pdfDocuments.get(0).getDocumentCatalog().getAllPages().get(0));
    // doc.save(stream);

    // Save the whole document instead.
    pdfDocuments.get(0).save(stream);
}
try (FileOutputStream stream = new FileOutputStream("bflattened.pdf")) {
    PdfStamper stamper = new PdfStamper(new PdfReader("bfilled.pdf"), stream);
    stamper.setFormFlattening(true); // remove the fields and keep their values as content
    stamper.close();
}

Answered on 2024-04-25T23:55:02+08:00
