How do I copy equations from one docx to a specific location in another docx?

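Hello, I am trying to write code that combines docx files. The files may contain text, images, tables, or equations. The code is meant to copy these objects and append them to a base docx. I can copy and merge text, images, and tables using the docx module's 'add_picture' and 'add_paragraph' methods, but I cannot do this for Word equations. I decided to dig into the docx XML and copy the equation parts from there. I can append the equations to my base document, but when I go on to append pictures, text, and tables, the equations end up at the end of the docx. My question is: why does this happen if I iterate through the appended objects in the order I want them to appear, and is there a way to keep the code from putting the equations at the end of the docx?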

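Here is an overview of the code:

Create the base document:

document = Document('basedoc.docx')

Get the list of sub-documents to append
Start looping over the list of sub-documents
For each sub-document I iterate and find the different parent and child objects. I found a function for this called 'iter_block_items' at https://github.com/python-openxml/python-docx/issues/276. The document items are called blocks.
For each block item of the sub-document, I classify the type, the style, and whether an equation is present: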

if isinstance(block, Paragraph):
    if "r:embed" in block._element.xml:
        append content, style, and equation arrays, content being a drawing/image
    elif "m:oMathPara" in block._element.xml:
        append content, style, and equation arrays, content being an equation
        equationXml.append(block._element.xml)
    elif 'w:br w:type="page"' in block._element.xml:
        append content, style, and equation arrays, content being a page break
    else:
        append content, style, and equation arrays, content being text
else:
    append content, style, and equation arrays, content being a table

Once I have my content and style arrays, I iterate over the content array and append the tables, drawings, page breaks, and text.

if equationXml[i] == '0':  # the content is either an image, table, text, or page break
    if "Table" in str(contentStyle[i]):
        insert table and caption
    else:
        if "drawing" in content[i]:
            insert image and caption
        elif "pageBreak" in content[i]:
            document.add_page_break()
        else:
            insert text
else:  # there is an equation present
    document = EquationInsert.AddEquation(document, equationXml[i])

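My EquationInsert file has a function called 'AddEquation', in which I essentially rewrite my document object (UpdateableZipFile is some code I found online for quickly updating a file inside a zip file):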

def AddEquation(self,document,equationContent):
    document.save('temp.docx')
    z = zipfile.ZipFile('temp.docx')
    tree=etree.parse(z.open('word/document.xml'))
    nmspcDict = tree.getroot().iter().next().nsmap

    for key in nmspcDict:
        ET.register_namespace(key, nmspcDict[key])
    tree2=etree.ElementTree(etree.fromstring(equationContent))
    xmlRoot2=tree2.getroot()
    xmlRoot=tree.getroot()
    xmlRoot[1].append(xmlRoot2) #note that [1] had to be used bc [0] was a comment. need to see if general case or not


    tree.write("document.xml",encoding="utf-8", xml_declaration=True, standalone="yes", pretty_print=True)

    with UpdateableZipFile.UpdateableZipFile("temp.docx","a") as o:
        o.write("document.xml","word/document.xml")

    document = Document('temp.docx') 
    os.remove('document.xml')
    z.close()
    os.remove('temp.docx')
    return document

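This code adds the equation, but as the main code keeps looping over the sub-document items, the equations somehow get pushed to the end of the base document. I have tried returning the docx from the insert-equation function and creating a new document from it, but that did nothing. If anyone has suggestions on how to keep the equations from going to the end of the file, I would greatly appreciate it. Otherwise, I will have to venture into figuring out how to convert these equations into images and/or something docx can handle. I am open to solutions/suggestions/comments. Thanks!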

1 Answer

I'm sure you'll find your answer in the XML. You can use opc-diag to conveniently browse the XML "parts" in the .docx "package".

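The paragraphs and tables in a Word document live in the document.xml part, as child elements of the <w:body> element. The last element in <w:body> is a section element (<w:sectPr>, IIRC). If you append your equations after this element, they will keep floating to the bottom, because new paragraphs and tables are added above the sectPr element.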
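The ordering rule described above can be sketched as follows. This is a minimal illustration that uses only the standard library on a hand-written <w:body> fragment; it is not the asker's code, and `insert_before_sectpr` and `build_demo` are hypothetical names introduced here.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
M_NS = "http://schemas.openxmlformats.org/officeDocument/2006/math"
SECTPR_TAG = "{%s}sectPr" % W_NS


def insert_before_sectpr(body, new_element):
    """Add new_element as the last content child of body, ahead of a trailing w:sectPr."""
    children = list(body)
    if children and children[-1].tag == SECTPR_TAG:
        # Keep w:sectPr as the final child so later content lands after new_element.
        body.insert(len(children) - 1, new_element)
    else:
        # No trailing section element: a plain append is already in the right place.
        body.append(new_element)
    return new_element


def build_demo():
    """Build a tiny body, add an equation paragraph, then add more text after it."""
    body = ET.fromstring(
        '<w:body xmlns:w="%s">'
        "<w:p><w:r><w:t>intro</w:t></w:r></w:p>"
        "<w:sectPr/>"
        "</w:body>" % W_NS
    )
    equation = ET.fromstring(
        '<w:p xmlns:w="%s" xmlns:m="%s">'
        "<m:oMathPara><m:oMath><m:r><m:t>x=1</m:t></m:r></m:oMath></m:oMathPara>"
        "</w:p>" % (W_NS, M_NS)
    )
    later_text = ET.fromstring(
        '<w:p xmlns:w="%s"><w:r><w:t>after the equation</w:t></w:r></w:p>' % W_NS
    )
    insert_before_sectpr(body, equation)
    insert_before_sectpr(body, later_text)
    return body
```

With python-docx, the same idea would presumably apply to the lxml elements it exposes (for example `document.element.body` as the body and `docx.oxml.parse_xml(equation_xml)` to build the equation paragraph), but that wiring is an assumption and is not exercised by this sketch.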

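I would use the shortest possible test document and inspect the XML your code generates, comparing it with what you want (perhaps created by hand in Word). That should quickly point out any element-ordering problems in your code.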
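One way to do that inspection without extra tools is to list the direct children of <w:body> in order. The sketch below uses only the standard library; `body_child_tags` and `make_sample_package` are hypothetical helpers, and the sample package contains only a tiny word/document.xml, so it is not a file Word could open.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def body_child_tags(docx_file):
    """Return the local tag names of the direct children of w:body, in document order."""
    with zipfile.ZipFile(docx_file) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    body = root.find("{%s}body" % W_NS)
    return [child.tag.split("}", 1)[-1] for child in body]


def make_sample_package():
    """Build an in-memory zip whose document.xml has a paragraph stranded after w:sectPr."""
    document_xml = (
        '<w:document xmlns:w="%s"><w:body>'
        "<w:p/><w:tbl/><w:sectPr/><w:p/>"
        "</w:body></w:document>" % W_NS
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", document_xml)
    buffer.seek(0)
    return buffer
```

Calling `body_child_tags` on a saved .docx path (or on the sample package above) gives a flat list of tags; any entry that appears after `sectPr` marks content that was appended in the wrong place.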

Answered 2024-05-02T11:17:54+08:00
