Keep getting "TypeError: 'NoneType' object is not callable" with Beautiful Soup and Python 3


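I'm a beginner struggling through a course, so this problem may be very simple. I'm running this (admittedly messy) code, saved as x.py, to extract a link and a name from a website whose list items look like this: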

<li style="margin-top: 21px;">
  <a href="http://py4e-data.dr-chuck.net/known_by_Prabhjoit.html">Prabhjoit</a>
</li>

So I set this up:

import urllib.request, urllib.parse, urllib.error
from bs4 import BeautifulSoup
import ssl

# Ignore SSL certificate errors
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

url = input('Enter - ')
html = urllib.request.urlopen(url, context=ctx).read()
soup = BeautifulSoup(html, 'html.parser')
for line in soup:
    if not line.startswith('<li'):
        continue
    stuff = line.split('"')
    link = stuff[3]
    thing = stuff[4].split('<')
    name = thing[0].split('>')
    count = count + 1
    if count == 18:
        break
print(name[1])
print(link)

It keeps producing this error:

Traceback (most recent call last):
  File "x.py", line 15, in <module>
    if not line.startswith('<li'):
TypeError: 'NoneType' object is not callable

I've been struggling with this for hours and would appreciate any advice.

1 Answer

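line is not a string, so it has no startswith() method. Iterating over soup yields its parsed child nodes, and here line is a BeautifulSoup Tag object, because BeautifulSoup has parsed the HTML source text into a rich object model. Don't try to treat it as text!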

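The error occurs because when you access an attribute that a Tag object doesn't know, it searches for a child element with that name (so line.startswith is equivalent to line.find('startswith')). No element has that name, so the lookup returns None. The code then tries to call that result, None('<li'), which fails with the error you see.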
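The lookup can be reproduced on a small document. This is a minimal sketch, assuming bs4 is installed; the HTML string is an invented sample modeled on the question's markup, and the variable names are illustrative only.

```python
from bs4 import BeautifulSoup

# Invented sample markup, modeled on the list item shown in the question.
sample_html = (
    '<ul><li style="margin-top: 21px;">'
    '<a href="http://py4e-data.dr-chuck.net/known_by_Prabhjoit.html">Prabhjoit</a>'
    '</li></ul>'
)
sample_soup = BeautifulSoup(sample_html, 'html.parser')

# Iterating over the soup yields parsed nodes, not lines of text.
first_node = next(iter(sample_soup))
node_type = type(first_node).__name__
print(node_type)  # Tag

# An attribute name the Tag does not define is treated as a child-tag search.
attr = first_node.startswith
print(attr)  # None, because there is no <startswith> element

# Calling that None is what raises the TypeError from the question.
error_message = None
try:
    first_node.startswith('<li')
except TypeError as exc:
    error_message = str(exc)
print(error_message)
```

The same shortcut explains why first_node.li returns the nested li element: any attribute name that is not defined on Tag is looked up as a tag name.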

If you want the link in the 18th <li> element, just ask BeautifulSoup for that specific element:
```
soup = BeautifulSoup(html, 'html.parser')
li_link_elements = soup.select('li a[href]', limit=18)
if len(li_link_elements) == 18:
    last = li_link_elements[-1]
    print(last.get_text())
    print(last['href'])
```
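This uses a CSS selector to find only <a> link elements that are nested inside an <li> element and have an href attribute. The search is limited to 18 such tags, and the last one is printed, but only if 18 were actually found in the page.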

The Element.get_text() method retrieves the element's text, which includes text from any nested elements (such as <span>, <strong>, or other extra markup). The href attribute is accessed using standard indexing notation.
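Both access styles can be seen on a link whose text is split across nested elements. This is a rough sketch, assuming bs4 is installed; the markup is an invented sample, and the use of Tag.get() for a missing attribute is an addition not mentioned in the answer.

```python
from bs4 import BeautifulSoup

# Invented sample: the first link's text is split across nested elements,
# and the second <a> has no href attribute at all.
sample_html = (
    '<li><a href="http://py4e-data.dr-chuck.net/known_by_Prabhjoit.html">'
    '<strong>Prabh</strong><span>joit</span></a></li>'
    '<li><a>no link target</a></li>'
)
sample_soup = BeautifulSoup(sample_html, 'html.parser')

linked, unlinked = sample_soup.find_all('a')

# get_text() joins the text of all nested elements.
link_text = linked.get_text()
print(link_text)  # Prabhjoit

# Indexing reads an attribute; it raises KeyError if the attribute is absent.
link_href = linked['href']
print(link_href)

# Tag.get() returns None instead of raising when the attribute is missing.
missing_href = unlinked.get('href')
print(missing_href)  # None
```

Because the selector in the answer already requires [href], plain indexing is safe there; get() is only useful when an attribute may be absent.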
Answered 2024-04-29T16:58:46+08:00
