
Flattening child elements nested among text nodes


There are many flattening questions here, but none that deal with this level of complexity.

I have an XML document that looks like:

<document>
<div class='target-one'>
    maybe some text node, maybe not...1
    <randomElement>
        maybe some text node, maybe not...2
    </randomElement>

    <div class='target-one'>
        <randomElement>
            maybe some text node, maybe not...3
        </randomElement>
    </div>
    maybe some text node, maybe not...4
    <randomElement>
        maybe some text node, maybe not...5
    </randomElement>

    <div class='target-two'>
        maybe some text node, maybe not...6
        <randomElement>
            maybe some text node, maybe not...7
        </randomElement>
    </div>
    maybe some text node, maybe not...8
    <randomElement>
        maybe some text node, maybe not...9
    </randomElement>
</div>
<div class='target-two'>
    maybe some text node, maybe not...10
    <randomElement>
        maybe some text node, maybe not...11
    </randomElement>

    <div class='target-one'>
        <randomElement>
            maybe some text node, maybe not...12
        </randomElement>
    </div>
    maybe some text node, maybe not...13
    <randomElement>
        maybe some text node, maybe not...14
    </randomElement>

    <div class='target-two'>
        maybe some text node, maybe not...15
        <randomElement>
            maybe some text node, maybe not...16
        </randomElement>
    </div>
    maybe some text node, maybe not...17
    <randomElement>
        maybe some text node, maybe not...18
    </randomElement>
</div>

</document>

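So there is a list of target elements, and they can be nested in any order. I want to flatten them wherever they are nested: add more parent elements to hold the randomElement and text nodes, and turn target children into target siblings. In other words, the output should look like this: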

<document>
<div class='target-one'>
    maybe some text node, maybe not...1
    <randomElement>
        maybe some text node, maybe not...2
    </randomElement>
</div>
<div class='target-one'>
    <randomElement>
        maybe some text node, maybe not...3
    </randomElement>
</div>
<div class='target-one'>
    maybe some text node, maybe not...4
    <randomElement>
        maybe some text node, maybe not...5
    </randomElement>
</div>
<div class='target-two'>
    maybe some text node, maybe not...6
    <randomElement>
        maybe some text node, maybe not...7
    </randomElement>
</div>
<div class='target-one'>
    maybe some text node, maybe not...8
    <randomElement>
        maybe some text node, maybe not...9
    </randomElement>
</div>
<div class='target-two'>
    maybe some text node, maybe not...10
    <randomElement>
        maybe some text node, maybe not...11
    </randomElement>
</div>
<div class='target-one'>
    <randomElement>
        maybe some text node, maybe not...12
    </randomElement>
</div>
<div class='target-two'>
    maybe some text node, maybe not...13
    <randomElement>
        maybe some text node, maybe not...14
    </randomElement>
</div>
<div class='target-two'>
    maybe some text node, maybe not...15
    <randomElement>
        maybe some text node, maybe not...16
    </randomElement>
</div>
<div class='target-two'>
    maybe some text node, maybe not...17
    <randomElement>
        maybe some text node, maybe not...18
    </randomElement>
</div>

</document>

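So I end up with more parent divs, but all text and other nodes are in the right place. Note that randomElement might be a div that does not have a target class...

This is for reformatting ebooks for pagination in an online library, so there may be a large number of elements before we actually hit a problem div. We therefore need some way to select all the elements and text nodes between problem child divs as one group. Wrapping each of them in its own div does no good, because we would end up with every p, em, or span on its own page.

At the same time, most parents have no problem children. As long as the solution passes them through, I can clean up any empty divs with another run, but I do need this to work at least at the basic level, for text nodes as well as child elements.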
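The grouping described above can be sketched outside XSLT to make the recursion explicit. This is only an illustration in Python's standard `xml.etree.ElementTree`, not the stylesheet being asked for. The names `TARGET_CLASSES`, `is_target`, `flatten`, and `flatten_document` are hypothetical, and the sketch assumes target divs appear only as direct children of other target divs (a target div buried inside a randomElement is copied through unchanged).

```python
import copy
import xml.etree.ElementTree as ET

# Assumption: the target classes are exactly the two from the example.
TARGET_CLASSES = {"target-one", "target-two"}


def is_target(elem):
    return elem.tag == "div" and elem.get("class") in TARGET_CLASSES


def flatten(div):
    """Return the flat list of target divs that replaces `div`, in document order."""
    result = []
    current = None  # wrapper div collecting the current run of non-target nodes

    def wrapper():
        nonlocal current
        if current is None:
            # Start a new copy of the parent div (tag and attributes only).
            current = ET.Element(div.tag, dict(div.attrib))
            result.append(current)
        return current

    def add_text(text):
        if text and text.strip():  # skip whitespace-only text
            w = wrapper()
            if len(w):
                w[-1].tail = (w[-1].tail or "") + text
            else:
                w.text = (w.text or "") + text

    add_text(div.text)
    for child in div:
        if is_target(child):
            current = None  # a nested target ends the current run
            result.extend(flatten(child))
        else:
            kid = copy.deepcopy(child)
            kid.tail = None  # the tail is re-added below via add_text
            wrapper().append(kid)
        add_text(child.tail)
    return result


def flatten_document(root):
    out = ET.Element(root.tag, dict(root.attrib))
    for child in root:
        if is_target(child):
            out.extend(flatten(child))
        else:
            out.append(copy.deepcopy(child))
    return out


SAMPLE = (
    "<document><div class='target-one'>t1<r>t2</r>"
    "<div class='target-one'><r>t3</r></div>t4<r>t5</r>"
    "<div class='target-two'>t6<r>t7</r></div>t8<r>t9</r></div></document>"
)
flat = flatten_document(ET.fromstring(SAMPLE))
print(ET.tostring(flat, encoding="unicode"))
```

Each run of nodes between nested target divs gets one wrapper, so a long stretch of p, em, or span elements stays together in a single div rather than being split one per page.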

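This is my first question on StackOverflow, because I just don't get the recursion this requires.

Thanks!

Edit based on user52889's answer. This never worked, but I'm leaving it here for readability:

The XSL that I can run in Saxon: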

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<xsl:output method="html"
        indent="yes"
        encoding="utf-8"/>
<xsl:strip-space elements="*"/>
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>
<xsl:template match="/"> 
    <xsl:apply-templates />  
</xsl:template>
<xsl:template match="div[matches(@class,'target-one|target-two','i')]">
    <xsl:for-each select="node()">
        <xsl:choose>
            <xsl:when test="self::*[matches(@class,'target-one|target-two','i')]">
                <xsl:apply-templates select="."/>
            </xsl:when>
            <xsl:when test="preceding-sibling::node()[0][not(self::*[matches(@class,'target-one|target-two','i')])]">
                <!-- do nothing, it will be handled by the next case -->
            </xsl:when>
            <xsl:otherwise>
                <!--
      create a copy of the element matched by the template, with its attrs
      add to it the current node and all nodes which follow it, up to the next SIGNIFICANT node
      or, put another way, all following siblings which either
      a) do not have a preceding signficant node, or
      b) whose nearest preceding singificant node is the same as the nearest preceding significant node of the current node, i.e. its following sibling node is the current node.
    -->
                <xsl:element name="{../name()}">
                    <xsl:apply-templates select="../@*"/>
                    <xsl:apply-templates select="following-sibling::node()[
          not(preceding-sibling::*[matches(@class,'target-one|target-two','i')])
          or 
          count(preceding-sibling::*[matches(@class,'target-one|target-two','i')][0]/following-sibling::node()[0] | current()) = 1
        ]" />
                </xsl:element>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:for-each>
</xsl:template>
</xsl:stylesheet>

The current output from this file still contains nested children, plus duplicates:

<document>
<div class="target-one">
    <randomElement>
        maybe some text node, maybe not...2

    </randomElement>
    <div class="target-one"></div>
    maybe some text node, maybe not...4

    <randomElement>
        maybe some text node, maybe not...5

    </randomElement>
    <div class="target-two">
        <randomElement>
            maybe some text node, maybe not...7

        </randomElement>
    </div>
    <div class="target-two"></div>
    maybe some text node, maybe not...8

    <randomElement>
        maybe some text node, maybe not...9

    </randomElement>
</div>
<div class="target-one">
    <div class="target-one"></div>
    maybe some text node, maybe not...4

    <randomElement>
        maybe some text node, maybe not...5

    </randomElement>
    <div class="target-two">
        <randomElement>
            maybe some text node, maybe not...7

        </randomElement>
    </div>
    <div class="target-two"></div>
    maybe some text node, maybe not...8

    <randomElement>
        maybe some text node, maybe not...9

    </randomElement>
</div>
<div class="target-one"></div>
<div class="target-one">
    <randomElement>
        maybe some text node, maybe not...5

    </randomElement>
    <div class="target-two">
        <randomElement>
            maybe some text node, maybe not...7

        </randomElement>
    </div>
    <div class="target-two"></div>
    maybe some text node, maybe not...8

    <randomElement>
        maybe some text node, maybe not...9

    </randomElement>
</div>
<div class="target-one">
    <div class="target-two">
        <randomElement>
            maybe some text node, maybe not...7

        </randomElement>
    </div>
    <div class="target-two"></div>
    maybe some text node, maybe not...8

    <randomElement>
        maybe some text node, maybe not...9

    </randomElement>
</div>
<div class="target-two">
    <randomElement>
        maybe some text node, maybe not...7

    </randomElement>
</div>
<div class="target-two"></div>
<div class="target-one">
    <randomElement>
        maybe some text node, maybe not...9

    </randomElement>
</div>
<div class="target-one"></div>
<div class="target-two">
    <randomElement>
        maybe some text node, maybe not...11

    </randomElement>
    <div class="target-one"></div>
    maybe some text node, maybe not...13

    <randomElement>
        maybe some text node, maybe not...14

    </randomElement>
    <div class="target-two">
        <randomElement>
            maybe some text node, maybe not...16

        </randomElement>
    </div>
    <div class="target-two"></div>
    maybe some text node, maybe not...17

    <randomElement>
        maybe some text node, maybe not...18

    </randomElement>
</div>
<div class="target-two">
    <div class="target-one"></div>
    maybe some text node, maybe not...13

    <randomElement>
        maybe some text node, maybe not...14

    </randomElement>
    <div class="target-two">
        <randomElement>
            maybe some text node, maybe not...16

        </randomElement>
    </div>
    <div class="target-two"></div>
    maybe some text node, maybe not...17

    <randomElement>
        maybe some text node, maybe not...18

    </randomElement>
</div>
<div class="target-one"></div>
<div class="target-two">
    <randomElement>
        maybe some text node, maybe not...14

    </randomElement>
    <div class="target-two">
        <randomElement>
            maybe some text node, maybe not...16

        </randomElement>
    </div>
    <div class="target-two"></div>
    maybe some text node, maybe not...17

    <randomElement>
        maybe some text node, maybe not...18

    </randomElement>
</div>
<div class="target-two">
    <div class="target-two">
        <randomElement>
            maybe some text node, maybe not...16

        </randomElement>
    </div>
    <div class="target-two"></div>
    maybe some text node, maybe not...17

    <randomElement>
        maybe some text node, maybe not...18

    </randomElement>
</div>
<div class="target-two">
    <randomElement>
        maybe some text node, maybe not...16

    </randomElement>
</div>
<div class="target-two"></div>
<div class="target-two">
    <randomElement>
        maybe some text node, maybe not...18

    </randomElement>
</div>
<div class="target-two"></div>
</document>

3 Answers

  • 2

    I tried to treat it as a grouping problem:

    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    
    <xsl:param name="prefix" select="'target-'"/>
    
    <xsl:output indent="yes"/>
    
    <xsl:template match="document">
      <xsl:copy>
        <xsl:for-each-group select="descendant::text()[normalize-space()]"
          group-adjacent="generate-id(ancestor::div[starts-with(@class, $prefix)][1])">
          <xsl:apply-templates select="ancestor::div[starts-with(@class, $prefix)][1]" mode="g">
            <xsl:with-param name="group" select="current-group()"/>
          </xsl:apply-templates>
        </xsl:for-each-group>
      </xsl:copy>
    </xsl:template>
    
    <xsl:template match="*" mode="g">
      <xsl:param name="group"/>
      <xsl:if test=". intersect $group/ancestor::*">
        <xsl:copy>
          <xsl:copy-of select="@*"/>
          <xsl:apply-templates select="node()" mode="g">
            <xsl:with-param name="group" select="$group"/>
          </xsl:apply-templates>
        </xsl:copy>
      </xsl:if>
    </xsl:template>
    
    <xsl:template match="text()" mode="g">
      <xsl:param name="group"/>
      <xsl:if test=". intersect $group">
        <xsl:copy/>
      </xsl:if>
    </xsl:template>
    
    </xsl:stylesheet>
    

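    This basically groups all non-whitespace text-node descendants by their nearest ancestor div with the class you are looking for, then uses the text nodes of each group to recreate the subtree contained in that ancestor.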
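To make the grouping key concrete, here is a rough Python analogue of the `group-adjacent` step only, using `itertools.groupby` from the standard library. It is a sketch, not a translation of the stylesheet: it reports which text ends up in which group and does not rebuild the subtree. The helper names (`text_nodes`, `nearest_target`, `group_texts`) are hypothetical.

```python
from itertools import groupby
import xml.etree.ElementTree as ET

PREFIX = "target-"  # mirrors the $prefix stylesheet parameter


def text_nodes(elem):
    """Yield (containing element, stripped text) for non-blank text, in document order."""
    if elem.text and elem.text.strip():
        yield elem, elem.text.strip()
    for child in elem:
        yield from text_nodes(child)
        if child.tail and child.tail.strip():
            yield elem, child.tail.strip()  # tail text belongs to the parent


def nearest_target(elem, parent_of):
    """Walk up from elem (inclusive) to the closest div whose class starts with PREFIX."""
    while elem is not None:
        if elem.tag == "div" and elem.get("class", "").startswith(PREFIX):
            return elem
        elem = parent_of.get(elem)
    return None


def group_texts(root):
    # ElementTree has no parent pointers, so build a child -> parent map first.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    keyed = [(nearest_target(owner, parent_of), text) for owner, text in text_nodes(root)]
    groups = []
    # groupby only merges adjacent items with the same key, like group-adjacent.
    for ancestor, run in groupby(keyed, key=lambda pair: pair[0]):
        label = ancestor.get("class") if ancestor is not None else None
        groups.append((label, [text for _, text in run]))
    return groups


SAMPLE = (
    "<document><div class='target-one'>t1<r>t2</r>"
    "<div class='target-one'><r>t3</r></div>t4<r>t5</r>"
    "<div class='target-two'>t6<r>t7</r></div>t8<r>t9</r></div></document>"
)
groups = group_texts(ET.fromstring(SAMPLE))
print(groups)
```

One consequence of keying on adjacent text is visible in this sketch: a nested target div that contains no non-blank text contributes no key, so the text before and after it falls into a single group.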

  • 1

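    There is only one example to go on. The following stylesheet produces the requested result; perhaps that's what you're looking for. If not, please edit your question and explain the logic behind the requested transformation.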

    XSLT 2.0 (or 1.0)

    <xsl:stylesheet version="2.0" 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
    <xsl:strip-space elements="*"/>
    
    <xsl:template match="/document">
        <document>
            <xsl:for-each select="//randomElement">
                <div class='{../@class}'>
                    <xsl:copy-of select=". | preceding-sibling::text()[1]"/>
                </div>
            </xsl:for-each>
        </document>
    </xsl:template>
    
    </xsl:stylesheet>
    
  • 0

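    It sounds like you want something like the following, where SIGNIFICANT is some expression that describes all, and only, the elements you want to become new list items (e.g. div[substring(@class,1,6)='target'] )...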

    <xsl:template match="SIGNIFICANT">
      <xsl:for-each select="node()">
        <xsl:choose>
          <xsl:when test="self::SIGNIFICANT">
            <!-- a nested significant element: process it recursively -->
            <xsl:apply-templates select="."/>
          </xsl:when>
          <!-- XPath positions are 1-based: [1] is the nearest preceding sibling -->
          <xsl:when test="preceding-sibling::node()[1][not(self::SIGNIFICANT)]">
            <!-- do nothing, it will be handled by the next case -->
          </xsl:when>
          <xsl:otherwise>
            <!--
              create a copy of the element matched by the template, with its attrs
              add to it the current node and all nodes which follow it, up to the next SIGNIFICANT node
              or, put another way, all non-significant following siblings which either
              a) do not have a preceding significant node, or
              b) whose nearest preceding significant node is the same as the nearest preceding significant node of the current node, i.e. its following sibling node is the current node.
            -->
            <xsl:element name="{name(..)}">
              <xsl:apply-templates select="../@*"/>
              <xsl:apply-templates select=". | following-sibling::node()[
                  not(self::SIGNIFICANT)
                  and (
                    not(preceding-sibling::SIGNIFICANT)
                    or
                    count(preceding-sibling::SIGNIFICANT[1]/following-sibling::node()[1] | current()) = 1
                  )
                ]"/>
            </xsl:element>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each>
    </xsl:template>
    

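    Note: this means that top-level divs with no child nodes will be removed entirely. If you don't want this behaviour, you can wrap it in a simple choose/when.

    Also note: for extremely long lists, there may be a more efficient way to do this recursively.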
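The question mentions cleaning up empty divs in a separate run. As a rough illustration of such a pass, again in Python's `xml.etree.ElementTree` rather than XSLT, the sketch below removes divs that have no children and no non-blank text, working bottom-up. The function names are hypothetical, and "empty" here is an assumption about what the asker wants removed.

```python
import xml.etree.ElementTree as ET


def is_empty(elem):
    """True if elem has no child elements and no non-blank text."""
    return len(elem) == 0 and not (elem.text and elem.text.strip())


def remove_empty_divs(parent):
    """Remove empty <div> descendants of `parent` in place, keeping any tail text."""
    previous = None  # last sibling that was kept
    for child in list(parent):  # copy the child list, since we mutate parent
        remove_empty_divs(child)  # bottom-up: a div may become empty here
        if child.tag == "div" and is_empty(child):
            if child.tail:
                # Re-attach text that followed the removed element.
                if previous is None:
                    parent.text = (parent.text or "") + child.tail
                else:
                    previous.tail = (previous.tail or "") + child.tail
            parent.remove(child)
        else:
            previous = child


root = ET.fromstring(
    '<document><div class="a"></div>'
    '<div class="b">x<div class="c"> </div>y</div>'
    '<div class="d"><div class="e"/></div></document>'
)
remove_empty_divs(root)
cleaned = ET.tostring(root, encoding="unicode")
print(cleaned)
```

Because the recursion runs before the emptiness check, a div that only contained empty divs is removed as well.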
