Selenium XPath: scraping a mixed-content HTML span

I am trying to scrape a span element that has mixed content:

<span id="span-id">
  <!--starts with some whitespace-->
  <b>bold title</b>
  <br/>
  text here that I want to grab....
</span>

Here is the code snippet that locates the span. It finds the element without any problem, but the text field of the resulting web element is empty.

IWebDriver driver = new FirefoxDriver();
driver.Navigate().GoToUrl("http://page-to-examine.com");
var query = driver.FindElement(By.XPath("//span[@id='span-id']"));

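I have tried adding /text() to the expression, which also returns nothing. If I add /b, I get the text content of the bold element, which happens to be the title I am not interested in.

I am sure a bit of XPath magic should make this easy, but so far I have not found it! Or is there a better way? Any comments gratefully received.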

2 Answers

3
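"I have tried adding /text() to the expression, which also returns nothing"

This selects all text-node children of the context node, and there are three of them.

What you call "nothing" is most probably the first of these, a whitespace-only text node (which is why you see "nothing" in it).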
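
To make the three text nodes visible, here is a minimal Python sketch that uses only the standard library's `xml.dom.minidom`. It is not Selenium and not the asker's C# code. It assumes the span contains a `<br/>` after the `<b>` element (the expressions in this answer rely on one) and that the comment has been removed. The `SPAN_XML` string and the helper name `text_node_children` are illustrative, not part of the original post.

```python
from xml.dom import minidom

# Assumed markup: the question's span with the comment removed and a <br/>
# after the <b> element (an assumption based on the XPath expressions used here).
SPAN_XML = """<span id="span-id">
  <b>bold title</b>
  <br/>
  text here that I want to grab....
</span>"""


def text_node_children(element):
    """Return the data of every text-node child, in document order.

    This mirrors what the XPath step text() selects for a context element.
    """
    return [child.data for child in element.childNodes
            if child.nodeType == child.TEXT_NODE]


span = minidom.parseString(SPAN_XML).documentElement
nodes = text_node_children(span)

for position, data in enumerate(nodes, start=1):
    # repr() makes the whitespace-only nodes visible
    print(position, repr(data))

# XPath positions are 1-based, so text()[3] corresponds to nodes[2]
wanted = nodes[2].strip()
print(wanted)
```

The first two entries print as whitespace only, which matches the "nothing" described above; only the third carries the wanted text.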

What you need is:
```
//span[@id='span-id']/text()[3]
```
Of course, other variations are possible:
```
//span[@id='span-id']/text()[last()]
```
Or:
```
//span[@id='span-id']/br/following-sibling::text()[1]
```
XSLT-based verification:
```
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="node()|@*">
     "<xsl:copy-of select="//span[@id='span-id']/text()[3]"/>"
 </xsl:template>

</xsl:stylesheet>
```
This transformation simply outputs whatever the XPath expression selects. When applied to the provided XML document (with the comment removed):
```
<span id="span-id">
    <b>bold title</b>
    <br/>
    text here that I want to grab....   
</span>
```
the wanted result is produced:
```
"
    text here that I want to grab....   
"
```
Answered 2024-04-28T11:44:43+08:00
4
I believe the following XPath query should work for your case. The following-sibling axis is useful for what you are trying to do.
```
//span[@id='span-id']/br/following-sibling::text()
```
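
As a rough illustration of what the `following-sibling` axis does here, the Python sketch below walks the siblings after the `<br/>` with the standard library's `xml.dom.minidom`. It is a stand-in for the XPath evaluation, not Selenium code. The markup in `SPAN_XML` (including the `<br/>`) and the helper name `following_sibling_text` are assumptions made for the example.

```python
from xml.dom import minidom

# Assumed markup: the question's span with a <br/> after the <b> element.
SPAN_XML = """<span id="span-id">
  <b>bold title</b>
  <br/>
  text here that I want to grab....
</span>"""


def following_sibling_text(node):
    """Collect the data of all text-node siblings that come after node.

    This mirrors following-sibling::text() with node as the context node.
    """
    found = []
    sibling = node.nextSibling
    while sibling is not None:
        if sibling.nodeType == sibling.TEXT_NODE:
            found.append(sibling.data)
        sibling = sibling.nextSibling
    return found


span = minidom.parseString(SPAN_XML).documentElement
# getElementsByTagName searches descendants; in this markup the only <br/>
# is a direct child of the span, so it matches what span/br would select.
br = span.getElementsByTagName("br")[0]
after_br = following_sibling_text(br)
print([text.strip() for text in after_br])
```

In this markup only one text node follows the `<br/>`, so the expression with and without a trailing `[1]` picks out the same node.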
Answered 2024-04-28T11:44:43+08:00
