Multi-line output from a single-line match in Python

I'm still fairly new to Python, but I'm trying to write code that parses NOAA weather and displays it in the order of our radio broadcast.

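I've managed to put together a list of current conditions using a Python regular expression: the HTML file is chopped into a list of lines and then re-output in the correct order, but each entry is a single line of data. The code looks like this: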

#other function downloads  
#http://www.arh.noaa.gov/wmofcst_pf.php?wmo=ASAK48PAFC&type=public
#and renames it currents.html
from bs4 import BeautifulSoup as bs
import re
soup = bs(open('currents.html'), 'html.parser')
weatherRaw = soup.pre.string
towns = ['PAOM', 'PAUN', 'PAGM', 'PASA']
townOut = []
weatherLines = weatherRaw.splitlines()
for i in range(len(towns)):
    p = re.compile(towns[i] + '.*')
    for line in weatherLines:
        matched = p.match(line)
        if matched:
            townOut.append(matched.group())

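Now I'm working on the forecast part, and I've hit a problem: each forecast necessarily runs over multiple lines, and I've already chopped the file into a list of lines.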

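So: what I'm looking for is an expression that lets me use a similar loop, this time starting to append at the matched line and stopping at a line that contains only &&. Something like this: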

#sample data from http://www.arh.noaa.gov/wmofcst.php?wmo=FPAK52PAFG&type=public
#BeautifulSouped into list fcst (forecast.pre.get_text().splitlines())
zones = ['AKZ214', 'AKZ215', 'AKZ213'] #note the out-of-numerical-order zones
weatherFull = []
for i in range(len(zones)):
    start = re.compile(zones[i] + '.*')
    end = re.compile('&&')
    for line in fcst:
        matched = start.match(line)
        if matched:
            weatherFull.append(matched.group())
            #and the other lines of various contents and length
            #until reaching the end match object

What can I do to improve this code? I know it's very verbose, but while I'm starting out I like being able to track what I'm doing. Thanks in advance!

1 Answer

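Apologies if this isn't quite what you're after (happy to adjust if so). It's great that you're using BeautifulSoup, but you can actually take it a step further. Looking at the HTML, each block appears to start with an <a name=zone> element and end at the next <a name=zone>. If so, you can do the following to pull out the corresponding HTML for each zone: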

from bs4 import BeautifulSoup

# I put the HTML in a file, but this will work with a URL as well
with open('weather.html', 'r') as f:
  fcst = f.read()

# Turn the html into a navigable soup object
soup = BeautifulSoup(fcst, 'html.parser')

# Define your zones
zones = ['AKZ214', 'AKZ215', 'AKZ213']

weatherFull = []

# This is a more Pythonic loop structure - instead of looping over
# a range of len(zones), simply iterate over each element itself
for zone in zones:
  # Here we use BS's built-in 'find' function to find the 'a' element
  # with a name = the zone in question (as this is the pattern).
  zone_node = soup.find('a', {'name': zone})

  # This loop will continue to cycle through the elements after the 'a'
  # tag until it hits another 'a' (this is highly structure dependent :) )
  # or runs out of siblings; it is skipped entirely if the zone was not found
  while zone_node is not None:
    weatherFull.append(zone_node)
    # Set the tag node = to the next node
    zone_node = zone_node.next_sibling
    # If the next node's tag name = 'a', break out and go to the next zone
    if getattr(zone_node, 'name', None) == 'a':
      break

# Process weatherFull however you like
print(weatherFull)
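
If you would rather keep working with the list of lines, as in the question, the "start at the matched line, stop at a line containing only &&" idea can be sketched with a small state flag instead of a regular expression. This is only a sketch under assumptions: the helper name collect_blocks and the sample lines are made up for illustration, and it assumes each block begins with a line that starts with the zone code.

```python
def collect_blocks(lines, zones, end_marker="&&"):
    """Collect the lines of each wanted zone, returned in the order of `zones`.

    `collect_blocks` is a hypothetical helper, not part of any library.
    Assumes a block starts on a line beginning with the zone code and
    ends on a line that contains only `end_marker`.
    """
    blocks = {}
    current = None  # zone whose block is being collected, if any
    for line in lines:
        if current is None:
            # Not inside a block: does this line start a wanted zone?
            current = next((z for z in zones if line.startswith(z)), None)
            if current is not None:
                blocks[current] = [line]
        elif line.strip() == end_marker:
            # End of the current block; the marker itself is not kept.
            current = None
        else:
            blocks[current].append(line)
    # Reorder from file order into broadcast order, skipping missing zones.
    return [blocks[z] for z in zones if z in blocks]


# Made-up sample lines standing in for forecast.pre.get_text().splitlines()
sample = [
    "AKZ213-",
    "ST LAWRENCE ISLAND",
    ".TODAY...MADE-UP TEXT.",
    "&&",
    "UNRELATED LINE",
    "AKZ214-",
    ".TODAY...MORE MADE-UP TEXT.",
    "&&",
]

weather_full = collect_blocks(sample, ["AKZ214", "AKZ215", "AKZ213"])
for block in weather_full:
    print("\n".join(block))
```

The file is scanned once, and the blocks are put into the order of the zones list at the end, so the out-of-numerical-order zones come out in broadcast order. Zones that never appear (AKZ215 in the made-up sample) are simply skipped.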

Hope this helps (or at least gets you in the vicinity of what you want!).

Answered on 2024-05-03T05:44:59+08:00
