Java 学习之路

1 votes

Error when importing lxml

I am running "from docx import Document". This gives an error: File "", line 1, in File "/usr/local/lib/python2.7/site-packages/docx.py", line 17, in from lxml import etree ImportError: /usr/local/lib/python2.7/site-packages/lxml-3.7.0-py2.7-lin...

python-2.7 lxml python-import importerror
1 votes

SSL: CERTIFICATE_VERIFY_FAILED certificate verify failed

from lxml import html import requests url = "https://website.com/" page = requests.get(url) tree = html.fromstring(page.content) page.content SSLError: [SSL: CERTIFICATE_VERIFY_FAILED]...

python python-3.x ssl xmlhttprequest lxml
0 votes

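Python lxml will not install on the appropriate platform

I have the following configuration: Windows 7 Enterprise SP1 64-bit; Python 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 10:45:13) [MSC v.1600 64 bit (AMD64)] on win32; pip 1.5.6; I have no Internet access; I am not an administrator of the Windows system. I am trying to install the lxml library on my Python. From https://pypi....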

python pip lxml python-3.4 python-wheel
0 votes

<?xml version="1.0" encoding="UTF-8"?> not <?xml version='1.0' encoding='UTF-8'?>

I am using lxml's tree.write(xmlFileOut, pretty_print = True, xml_declaration = True, encoding='UTF-8') to write out an XML file that I have opened and edited, but I absolutely need the XML declaration to be <?xml version="1.0" encoding="UTF-8"?> and not <?xml version='1.0' ...

python xml lxml xml-declaration
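
One way to get a double-quoted declaration is to serialize the tree without any declaration and prepend the exact line yourself. The sketch below uses the standard library's xml.etree.ElementTree so that it runs without lxml installed; lxml's etree.tostring also accepts encoding="unicode", so the same idea should carry over. The function name and the sample elements are illustrative assumptions, not taken from the question.

```python
import xml.etree.ElementTree as ET


def serialize_with_double_quoted_declaration(root):
    """Serialize `root` to UTF-8 bytes with a hand-written XML declaration."""
    # encoding="unicode" returns a str and emits no declaration at all.
    body = ET.tostring(root, encoding="unicode")
    declaration = '<?xml version="1.0" encoding="UTF-8"?>\n'
    return (declaration + body).encode("UTF-8")


# Hypothetical sample tree, only for demonstration.
root = ET.Element("maintag")
ET.SubElement(root, "content").text = "lorem ipsum"

data = serialize_with_double_quoted_declaration(root)
print(data.decode("UTF-8"))
```

The resulting bytes can be written to the output file in binary mode. Both quoting styles are equivalent to an XML parser, so this only matters when a downstream consumer compares the declaration as literal text.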
0 votes

lxml XML parsing

<xml> <maintag> <content> lorem ipsum <strong> dolor sit </strong> and so on </content> </maintag> </xml> The XML files I parse regularly may have tags inside the content tag, as shown above. This is where I parse the file: p...

python xml xml-parsing lxml
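
When an element mixes text and child tags, its .text attribute only holds the text before the first child. A minimal sketch of collecting all the text with itertext() follows. It uses the standard library's xml.etree.ElementTree, and lxml elements offer an itertext() method as well. The helper name and the whitespace normalization are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET

# The sample document from the question excerpt above.
SAMPLE = (
    "<xml> <maintag> <content> lorem ipsum <strong> dolor sit </strong>"
    " and so on </content> </maintag> </xml>"
)


def content_text(xml_source):
    """Return all text inside <content>, including text of nested tags."""
    root = ET.fromstring(xml_source)
    content = root.find("./maintag/content")
    # itertext() walks the element and its descendants in document order.
    raw = "".join(content.itertext())
    # Collapse runs of whitespace so the result reads as one sentence.
    return " ".join(raw.split())


text = content_text(SAMPLE)
print(text)
```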
11 votes

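Ubuntu 11.04: lxml etree import problem with a custom Python

Ubuntu 11.04 ships with a native python2.7. I built python2.5 from source into /usr/local/python2.5/bin and tried to install lxml for my custom python2.5 installation. I also use virtualenv. I switched to my environment with python2.5. I get an error when importing lxml. from lxml import etree ImportError: /home/se7en/.virt...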

python lxml
7 votes

Error importing lxml.etree into Python

I installed lxml on my Mac, and when I start python like this: localhost:lxml-3.0.1 apple$ python Python 2.7.3 (v2.7.3:70274d53c1dd, Apr 9 2012, 20:52:43) [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin Type "help"...

python python-2.7 lxml
1 votes

lxml on python-3.3.0 ImportError: undefined symbol: xmlBufContent

I am having a hard time installing lxml (3.1.0) on python-3.3.0. It installs without errors and I can see lxml-3.1.0-py3.3-linux-i686.egg in the correct folder (/usr/local/lib/python3.3/site-packages/), but when I try to import etree, I get this: from lxml import etree Traceback (most recent call last): ImportErr...

lxml importerror python-3.3
3 votes

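Using XPath 1.0 to extract URLs whose link text matches a regular expression

I want to use XPath in Scrapy to extract URLs of this type (the link text is a number with an arbitrary number of digits, and the href is some random text). <a href="http://www.example.com/link_to_some_page.html">3</a> <a href="http://www.example.com/another_link-abcd....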

python regex xpath lxml scrapy
26 votes

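How do I get the path of an element in lxml?

I am searching an HTML document using XPath from lxml in Python. How can I get the path to a certain element? Here is the example from Ruby's Nokogiri: page.xpath('//text()').each do |textnode| path = textnode.path puts path end which prints, for example, '/html/body/div/div[1]/div[1]/p/text(...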

python xpath lxml
1 votes

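Python lxml XPath gives no output

For educational purposes, I am trying to scrape this page using lxml and requests in Python. Specifically, I just want to print the research areas of all the professors on the page. This is what I have done so far: import requests from lxml import html response=requests.get('http://cse.iitkgp.ac.in/index.php?secret=d2RkOUgybWlNZzJwQ...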

python-2.7 xpath web-scraping python-requests lxml
1 votes

Cannot install with easy_install or pip on Mac

I am trying to install the lxml and pycrypto modules using easy_install (and pip) but I get an error message: Running lxml-2.3.4/setup.py -q bdist_egg --dist-dir /tmp/easy_install-kGsWMh/lxml-2.3.4/egg-dist-tmp-Gjqy3f Building lxml version 2.3.4. Building wit...

python lxml pip easy-install pycrypto
3 votes

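lxml / BeautifulSoup parser warning

Using Python 3, I am trying to parse ugly HTML (which is not under my control) by using lxml with BeautifulSoup, as described here: http://lxml.de/elementsoup.html Specifically, I want to use lxml, but the HTML is ugly and lxml on its own rejects it. The link above says: "All you need to do is pass it to the fromstring() function:" from lxml.html.soupparser...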

python python-3.x beautifulsoup lxml
7 votes

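Find and replace by text in HTML with BeautifulSoup

I am trying to mark up an HTML file (literally wrapping strings in "mark" tags) using Python and BeautifulSoup. The problem is basically as follows... Say I have my original HTML document: test = "<h1>oh hey</h1><div>here is some <b>SILLY</b> text</div>" ...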

python regex html-parsing beautifulsoup lxml
0 votes

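Problem with BeautifulSoup and the lxml parser

I noticed strange behavior when scraping some web pages with BeautifulSoup 4.1.0 and the lxml parser. The built-in html.parser did not work for the web page I was trying to scrape, so I decided to use the lxml parser. The result printed on my Eclipse console looks fine for a fraction of a second, and then it automatically switches to an incomplete, useless and not very good-looking output with spaces between all the characters: ! - - S w i t c h - - > ...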

python web-scraping beautifulsoup lxml
0 votes

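Approaches to parsing source code (Python): differences between Beautiful Soup, lxml and html5lib?

I have a large HTML source I want to parse (~200,000 lines), and I am fairly sure the formatting is poor throughout. I have been researching some parsers, and it seems Beautiful Soup, lxml and html5lib are the most popular. From reading this site, it seems lxml is the most commonly used and the fastest, while Beautiful Soup is slower but accounts for more errors and variation. Regarding the Beautiful Soup documentation, http://www.crummy.com/software/Beautiful...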

python parsing beautifulsoup lxml
4 votes

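Difference between lxml and html5lib in the context of BeautifulSoup

In the context of BeautifulSoup, is there a functional difference between the lxml and html5lib parsers? I am trying to learn to use BS4 and am using the following code construct - ret = requests.get('http://www.olivegarden.com') soup = BeautifulSoup(ret.text, 'html5lib') for item in soup.find_all('a'): pri...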

python beautifulsoup lxml html5lib
0 votes

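Parse and modify HTML using BeautifulSoup or lxml: surround text that sits directly under the <body> tag with some HTML tag

I am working in Python 2.7 as a beginner. I want to parse and modify some HTML files. For this I am using Beautiful Soup, and lxml is also an option. The problem is how to modify the HTML so that the text is wrapped in some HTML tag. The text sits directly under the 'body' tag, so I want to modify the HTML so that whatever text is directly under the body tag ends up inside a tag of my choosing. Then I can parse it and easily find where this text is. <html...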

python html-parsing beautifulsoup lxml
-1 votes

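Cannot install lxml via pip? Are there any alternatives? [duplicate]

This question already has answers here: Installing lxml, libxml2, libxslt on Windows 8.1 (4 answers). Actually, I was trying to install lxml because of this: UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("html.parser"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently. So I...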

python pip lxml bs4
2 votes

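BeautifulSoup suppresses lxml parse errors?

I am using lxml together with BeautifulSoup to parse and navigate XML files. I noticed strange behavior. When reading a malformed XML file (e.g. a truncated document or a missing closing tag), BeautifulSoup suppresses the exceptions thrown by the lxml parser. Example: from bs4 import BeautifulSoup soup = BeautifulSoup("<foo><bar>trololo...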

xml beautifulsoup lxml
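
If malformed input needs to be rejected rather than silently repaired, one option is to run the document through a strict XML parser before handing it to BeautifulSoup. The sketch below uses the standard library's xml.etree.ElementTree as a stand-in so that it runs without extra packages; lxml.etree.fromstring behaves analogously and raises etree.XMLSyntaxError by default. The function name and both sample strings are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_source):
    """Return True if a strict XML parser accepts `xml_source`."""
    try:
        ET.fromstring(xml_source)
    except ET.ParseError:
        # Raised for truncated documents, missing closing tags, and so on.
        return False
    return True


# Hypothetical inputs: one complete document and one truncated document.
complete = "<foo><bar>text</bar></foo>"
truncated = "<foo><bar>text"

print(is_well_formed(complete))
print(is_well_formed(truncated))
```

A caller can then decide whether to skip the file, log it, or still pass it to a lenient parser.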
0 votes

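lxml incorrectly parses the Doctype while looking for links

I have a BeautifulSoup4 (4.2.1) parser which collects all href attributes from our template files, and until now it has worked flawlessly. But with lxml installed, one of our guys is now getting a TypeError: string indices must be integers. I managed to replicate this on my Linux Mint VM, and the only difference seems to be lxml, so I assume the problem occurs when bs4 uses that HTML parser ...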

python html beautifulsoup lxml
3 votes

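lxml not found in Beautiful Soup

I am trying to use beautifulsoup4 to parse a series of web pages written in XHTML. I assume that for best results I should pair it with an XML parser, and the only one I know of that is supported by beautifulsoup is lxml. However, when I try to run the following, as per the beautifulsoup documentation: import requests from bs4 import BeautifulSoup r = requests.get('...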

python-3.x beautifulsoup lxml anaconda
9 votes

XPath: selecting tags with empty values

How can I find all rows with an empty col name="POW" in XPath 1.0? <row> <col name="WOJ">02</col> <col name="POW"/> <col name="GMI"/> <col name="RODZ...

python xml xpath lxml
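
In a full XPath 1.0 engine such as lxml's xpath() method, a predicate along the lines of //row[col[@name='POW'][not(node())]] selects rows whose POW column has no child nodes. The sketch below shows roughly the same filter with the standard library's xml.etree.ElementTree, which supports only a subset of XPath, so the emptiness check is done in Python. Unlike not(node()), it also treats whitespace-only text as empty. The sample data, the rows wrapper element and the function name are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data modelled on the <row>/<col> layout in the question.
SAMPLE = """
<rows>
  <row><col name="WOJ">02</col><col name="POW"/><col name="GMI"/></row>
  <row><col name="WOJ">04</col><col name="POW">01</col><col name="GMI"/></row>
</rows>
"""


def rows_with_empty_col(root, col_name):
    """Return <row> elements whose <col name=col_name> has no text or children."""
    matches = []
    for row in root.iter("row"):
        for col in row.findall("col[@name='%s']" % col_name):
            has_text = bool((col.text or "").strip())
            if not has_text and len(col) == 0:
                matches.append(row)
                break
    return matches


root = ET.fromstring(SAMPLE)
empty_rows = rows_with_empty_col(root, "POW")
print(len(empty_rows))
```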
37 votes

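lxml runtime error: Reason: Incompatible library version: etree.so requires version 12.0.0 or later, but libxml2.2.dylib provides version 10.0.0

I have a perplexing problem. I am using Mac version 10.9, Anaconda 3.4.1 and Python 2.7.6, developing a web application with python-amazon-product-api. I overcame an obstacle installing lxml by referring to clang error: unknown argument: '-mno-fused-madd' (python package installation failure...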

python amazon lxml osx-mavericks
2 votes

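Parsing XML data with Python

I am currently developing a small project that needs to parse publicly available XML data. My goal is to write the data to a MySQL database for further processing. Link to the XML data: http://offenedaten.frankfurt.de/dataset/912fe0ab-8976-4837-b591-57dbf163d6e5/resource/48378186-5732-41f3-9823-9d1938f2695e/download/...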

python xml parsing lxml
1 votes

lxml XML parsing with HTML tags inside XML tags

<xml> <maintag> <content> lorem <br>ipsum</br> <strong> dolor sit </strong> and so on </content> </maintag> </xml> The XML files I parse regularly may, inside the content tag...

python html xml xml-parsing lxml
1 votes

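easy_install lxml error on Red Hat

I have tried several ways to install lxml (actually I need to install Scrapy, which depends on lxml): easy_install, pip and building from source, but none of them worked. Now I use: STATIC_DEPS=true easy_install lxml. But I get the following error: /usr/bin/ld: /tmp/easy_install-mptFTT/lxml-3.0alpha2/build/tmp...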

python lxml scrapy
10 votes

Cannot install lxml on MacOS 10.8.4

I am having trouble installing lxml on Mac OS. I get the following error when building it. This is the error I get when I use pip install lxml: /private/var/folders/9s/s5hl5w4x7zjdjkdljw9cnsrm0000gn/T/pip-build-khuevu/lxml/src/lxml/includes/etree_defs.h:9:10: fatal error: could not find 'libxml/...

python macos lxml libxml2
0 votes

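UnicodeEncodeError: 'charmap' codec can't encode character u'\xfd' in Python 2.7

I am downloading company names from different websites on my localhost, and sometimes I run into this problem, which interrupts the download program. My script works fine for other countries, but this type of error occurs when I download data for the Czech Republic. Total companies processed so far: 0 Traceback (most recent call last): File "process1.py", line 261, print "Company Name:" hit.text File "C:\Python27\lib\encodings\...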

mysql python-2.7 beautifulsoup lxml command-prompt
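
This kind of error typically appears when printing text that contains a character the console's code page cannot represent. A minimal sketch of one workaround follows: encode with errors="replace" before printing, so that unsupported characters become "?" instead of raising. It is written in Python 3 syntax, while the question targets Python 2.7. The cp437 code page, the function name and the sample company name are all assumptions, since the excerpt does not show which encoding the console uses.

```python
def to_console_safe(text, encoding="cp437"):
    """Replace characters that `encoding` cannot represent with '?'."""
    # errors="replace" substitutes '?' rather than raising UnicodeEncodeError.
    return text.encode(encoding, errors="replace").decode(encoding)


# Hypothetical company name containing U+00FD, which cp437 lacks.
name = "Sp\xfdchal"
safe_name = to_console_safe(name)
print("Company Name:", safe_name)
```

This loses the original character in the printed output only. The unmodified string can still be stored in the database with a Unicode-capable encoding such as UTF-8.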
22 votes

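Installing lxml with pip and Python 2.7 on Windows 7

When I try to upgrade lxml using pip on my Windows 7 machine, I get the log printed below. When I uninstall and try to install from scratch, I get the same error. Any ideas? Downloading/unpacking lxml from https://pypi.python.org/packages/source/l/lxml/lxml-3.2.4.tar.gz#md5=cc363499060f615aca1ec8dcc04df331 Downloading l...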

python-2.7 lxml pip
