
Scrapy XPath returns no value


I have a website from which I want to save the values of two span elements.

Here is the relevant part of my HTML:

<div class="box-search-product-filter-row">

    <span class="result-numbers" sth-bind="model.navigationSettings.showFilter">

    <span class="number" sth-bind="span1"></span>

    <span class="result" sth-bind="span2"></span>

    </span>

</div>

I created a spider:

from scrapy.spiders import Spider
from scrapy.selector import Selector

class MySpider(Spider):

    name = "list"
    allowed_domains = ["example.com"]
    start_urls = [
        "https://www.example.com"]

    def parse(self, response):
        sel = Selector(response)
        divs = sel.xpath("//div[@class='box-search-product-filter-row']")


        for div in divs:
            sth = div.xpath("/span[class='result']/text()").extract()

            print(sth)

When I run the spider, it only prints:

[]

Can anyone help me get the values from my two span elements (class "number" and class "result")?

2 Answers

  • 1

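    You forgot the @ in your XPath "/span[class='result']/text()". Without the @, class is read as the name of a child element rather than as an attribute. In addition, the span you are looking for is not a direct child of the div, so you need to use .// instead of / .
    Source: http://www.w3schools.com/xsl/xpath_syntax.asp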

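    The complete, correct XPath would be ".//span[@class='result']". Append '/text()' if you only want to select the text, but the nodes in your example contain no text, so that will not return anything here.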

  • 0

    This should work for you:

    EDIT:

    from scrapy.spiders import Spider


    class MySpider(Spider):
        name = "list"
        allowed_domains = ["example.com"]
        start_urls = ["https://www.example.com"]

        def parse(self, response):
            # response.xpath() can be called directly; no explicit Selector is needed
            divs = response.xpath("//div[@class='box-search-product-filter-row']")

            for div in divs:
                # ".//" searches every descendant of this div; "@class" tests the attribute
                sth = div.xpath(".//span[@class='result']/text()").extract()
                print(sth)
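
The two fixes from the first answer (adding @ and using .// instead of /) can be checked offline against the HTML from the question. The sketch below is only an illustration: it uses the standard library's xml.etree.ElementTree rather than Scrapy, so it runs without a crawl. ElementTree supports only a subset of XPath (for example, it has no text() step), so the text is read through the element's .text attribute instead. The variable names are made up for this example.

```python
import xml.etree.ElementTree as ET

# The HTML fragment from the question (it happens to be well-formed XML).
HTML = """
<div class="box-search-product-filter-row">
    <span class="result-numbers" sth-bind="model.navigationSettings.showFilter">
        <span class="number" sth-bind="span1"></span>
        <span class="result" sth-bind="span2"></span>
    </span>
</div>
"""

div = ET.fromstring(HTML)

# Missing "@": looks for a direct-child <span> that contains a child
# *element* named <class> whose text is "result". Nothing matches.
no_at = div.findall("span[class='result']")

# With "@", but still restricted to direct children of the div.
# The only direct child has class "result-numbers", so nothing matches.
direct_child = div.findall("span[@class='result']")

# ".//" searches all descendants, so the nested span is found.
descendants = div.findall(".//span[@class='result']")

print(len(no_at), len(direct_child), len(descendants))

# The matched span is empty in the question's markup, so it has no text.
print(descendants[0].get("sth-bind"), repr(descendants[0].text))
```

Only the last expression finds the span, and its text is empty. That matches the first answer's remark that '/text()' returns nothing for this markup even with the corrected XPath.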
    
