Getting the font properties of PDF text with PDFBox - Java 学习之路

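I am using PDFBox to extract the contents of a PDF file. I can extract the text, but I also need the font properties of that text. Can anyone help me extract the font properties?

I am also having trouble extracting some characters correctly. PDFBox outputs '?' when it cannot recognize a character. If possible, please also suggest how to solve that problem.

Thanks in advance.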

1 Answer

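import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.util.PDFTextStripper;

// Uses the PDFBox 1.x package layout (PDFTextStripper moved to org.apache.pdfbox.text in 2.x).
public class pdf2box
{
    public static void main(String[] args)
    {
        PDDocument pddDocument = null;
        try
        {
            pddDocument = PDDocument.load("table2.pdf");
            PDFTextStripper textStripper = new PDFTextStripper();
            // Extract and print the text of the whole document.
            System.out.println(textStripper.getText(pddDocument));
            // Print the stripper's font map instead of discarding the return value.
            System.out.println(textStripper.getFonts());
        }
        catch (Exception ex)
        {
            ex.printStackTrace();
        }
        finally
        {
            // Close the document even if extraction failed.
            if (pddDocument != null)
            {
                try
                {
                    pddDocument.close();
                }
                catch (Exception ex)
                {
                    ex.printStackTrace();
                }
            }
        }
    }
}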
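On the '?' characters from the question: one cause worth ruling out, independent of PDFBox, is the output step. If the extracted string is correct but is written through a charset that cannot represent some of its characters, Java substitutes '?' for each one. The following is a minimal sketch that uses only the JDK; the class name QuestionMarkDemo, the helper roundTrip, and the sample string are made up for illustration, and the string merely stands in for whatever getText(...) returns. It does not cover the case where the PDF itself lacks the mapping needed to recover the characters.

```java
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class QuestionMarkDemo {

    // Encode and decode with the same charset, imitating a console or file
    // that is limited to that charset. (Hypothetical helper for illustration.)
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Stand-in for extracted text: "PDF " followed by two Chinese characters.
        String extracted = "PDF \u6587\u672C";

        // US-ASCII cannot represent the two Chinese characters,
        // so each one is replaced by '?'.
        System.out.println(roundTrip(extracted, StandardCharsets.US_ASCII)); // prints "PDF ??"

        // UTF-8 can represent them, so the round trip is lossless.
        System.out.println(roundTrip(extracted, StandardCharsets.UTF_8).equals(extracted)); // prints "true"

        // Writing through an explicitly UTF-8 PrintStream avoids depending on
        // the platform default charset (the terminal must still display UTF-8).
        PrintStream utf8Out = new PrintStream(System.out, true, "UTF-8");
        utf8Out.println(extracted);
    }
}
```

If the text still contains '?' when written this way, the characters were probably lost earlier, during extraction, rather than at output.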

Answered on 2024-04-28T03:46:26+08:00
