Matching text with XPath?

I'm screen-scraping an HTML page that contains:

<table border=1 class="searchresult" cellpadding=2> 
<tr><th colspan=2>Last search</th></tr> 
<tr><th align=left>Search term</th><td>xxxxxx</td></tr> 
<tr><th align=left>Result</th><td>yyyyyyyy</td></tr> 
</table>

I want to write an XPath expression that gets the data cell containing "yyyyyyyy". I've got as far as

.//table[@class='searchresult']//tr/th

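which gives me a list of all the header nodes in the table. I can iterate over them in user code, find the one whose .text is "Result", and then call .getnext() on it to get the table data. But is there a cleaner way to do this by writing a more specific XPath pattern? It seems like there should be, but I don't understand XPath well enough yet to work it out.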

In case it matters, I'm doing this in Python with lxml.
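For reference, the iterate-and-getnext() approach described above might look roughly like this. It is only a sketch: the `PAGE` string is the sample table from the question (with the closing `</td>` written out), and the variable names are made up for illustration.

```python
from lxml import html

# Sample markup from the question; PAGE is a hypothetical name.
PAGE = """
<table border=1 class="searchresult" cellpadding=2>
<tr><th colspan=2>Last search</th></tr>
<tr><th align=left>Search term</th><td>xxxxxx</td></tr>
<tr><th align=left>Result</th><td>yyyyyyyy</td></tr>
</table>
"""

# Parse as a full document so that .//table is searched from the root.
doc = html.document_fromstring(PAGE)

# All header cells of the search-result table.
headers = doc.xpath(".//table[@class='searchresult']//tr/th")

# Find the header labelled "Result" and take its next sibling element.
result_cell = None
for th in headers:
    if th.text == "Result":
        result_cell = th.getnext()
        break

print(result_cell.text if result_cell is not None else None)
```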

2 Answers

0

.//table[@class='searchresult']//tr/td[preceding-sibling::th] might give you what you need.
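As a quick check with lxml, the sketch below runs that expression against the sample table. On this markup it matches every td that follows a th in its row, so it returns both the "Search term" cell and the "Result" cell. The second expression, which also tests the header text, is a variant added here for illustration and is not part of the answer; `PAGE` and the variable names are likewise assumptions.

```python
from lxml import html

# Sample markup from the question; PAGE is a hypothetical name.
PAGE = """
<table border=1 class="searchresult" cellpadding=2>
<tr><th colspan=2>Last search</th></tr>
<tr><th align=left>Search term</th><td>xxxxxx</td></tr>
<tr><th align=left>Result</th><td>yyyyyyyy</td></tr>
</table>
"""

doc = html.document_fromstring(PAGE)

# The expression as given: any td that has a th before it in the same row.
broad = doc.xpath(
    ".//table[@class='searchresult']//tr/td[preceding-sibling::th]"
)
broad_texts = [td.text for td in broad]

# Illustrative variant: additionally require the header text to be "Result".
narrow = doc.xpath(
    ".//table[@class='searchresult']//tr"
    "/td[preceding-sibling::th[normalize-space()='Result']]"
)
narrow_texts = [td.text for td in narrow]

print(broad_texts)
print(narrow_texts)
```

Using normalize-space() rather than a plain string comparison is a precaution against stray whitespace around the header text in real pages.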

Here are two comprehensive papers on semi-automatically creating XPath statements, written specifically for screen-scraping purposes:

http://tobiasanton.com/Tobias_Anton/Academia.html

Answered 2024-04-28T00:02:13+08:00
1
Use:
```
//table/tr[last()]/td
```
This selects every td element that is a child of a tr that is the last tr child of any table in this XHTML document.

This may select more than one td element if the XHTML document contains more than one table. In that case you need to make the expression more precise.

For example, if the table in question is the first in the document, use:
```
(//table)[1]/tr[last()]/td
```
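A small lxml sketch of the difference between the two expressions. The second table in `PAGE` is invented here purely to show the multiple-table case, and the variable names are illustrative. The sketch also assumes that lxml's HTML parser leaves the tr elements as direct children of table (it does not add tbody the way browsers do); if the source markup has an explicit tbody, the `/tr` step would need adjusting.

```python
from lxml import html

# Sample table from the question plus a second, made-up table.
PAGE = """
<table border=1 class="searchresult" cellpadding=2>
<tr><th colspan=2>Last search</th></tr>
<tr><th align=left>Search term</th><td>xxxxxx</td></tr>
<tr><th align=left>Result</th><td>yyyyyyyy</td></tr>
</table>
<table class="other">
<tr><td>first</td></tr>
<tr><td>other</td></tr>
</table>
"""

doc = html.document_fromstring(PAGE)

# Last row of every table in the document.
all_last_rows = [td.text for td in doc.xpath("//table/tr[last()]/td")]

# Last row of the first table only.
first_table_only = [
    td.text for td in doc.xpath("(//table)[1]/tr[last()]/td")
]

print(all_last_rows)
print(first_table_only)
```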
Answered 2024-04-28T00:02:13+08:00
