
Beautiful Soup: get only the Tag children of a tag (not NavigableStrings)


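The Beautiful Soup documentation provides the attributes .contents and .children for accessing the children of a given tag (as a list and an iterator, respectively), and both include NavigableStrings as well as Tags. I want only the children of type Tag.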

I am currently using a list comprehension to do this:

rows = [x for x in table.tbody.children if type(x) == bs4.element.Tag]

But I am wondering whether there is a better, more Pythonic, or built-in way to get only the Tag children.
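
To make the situation concrete, here is a minimal sketch of what `.children` yields on a small table. The markup is a made-up stand-in for the real page, and the built-in `html.parser` is assumed as the parser.

```python
import bs4
from bs4 import BeautifulSoup

# Hypothetical, stripped-down markup (not the real page).
html = """<table><tbody>
<tr><td>a</td></tr>
<tr><td>b</td></tr>
</tbody></table>"""

soup = BeautifulSoup(html, "html.parser")
table = soup.table

# .children yields Tag objects as well as the whitespace between them,
# which shows up as NavigableString objects.
children = list(table.tbody.children)
kinds = [type(x).__name__ for x in children]

# The same filtering idea as the list comprehension, written with isinstance().
rows = [x for x in table.tbody.children if isinstance(x, bs4.element.Tag)]
```

Here `kinds` alternates between `NavigableString` and `Tag`, while `rows` holds only the two `tr` tags.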

1 Answer

  • 14

    Thanks to J.F. Sebastian, the following works:

    rows = table.tbody.find_all(True, recursive=False)
    

    Documentation here: http://www.crummy.com/software/BeautifulSoup/bs4/doc/#true

    In my case I needed the actual rows of the table, so I ended up using the following, which is more precise and, I think, more readable:

    rows = table.tbody.find_all('tr')
    

    Again, the docs: http://www.crummy.com/software/BeautifulSoup/bs4/doc/#navigating-using-tag-names

    I believe this is better than iterating over all the children of the tag.

    This worked with the following input:

    <table cellspacing="0" cellpadding="0">
      <thead>
        <tr class="title-row">
          <th class="title" colspan="100">
            <div style="position:relative;">
              President
                <span class="pct-rpt">
                    99% reporting
                </span>
            </div>
          </th>
        </tr>
        <tr class="header-row">
            <th class="photo first">
    
            </th>
            <th class="candidate ">
              Candidate
            </th>
            <th class="party ">
              Party
            </th>
            <th class="votes ">
              Votes
            </th>
            <th class="pct ">
              Pct.
            </th>
            <th class="change ">
              Change from &lsquo;08
            </th>
            <th class="evotes last">
              Electoral Votes
            </th>
        </tr>
      </thead>
      <tbody>
          <tr class="">
              <td class="photo first">
                <div class="photo_wrap"><img alt="P-barack-obama" height="48" src="http://i1.nyt.com/projects/assets/election_2012/images/candidate_photos/election_night/p-barack-obama.jpg?1352320690" width="68" /></div>
              </td>
              <td class="candidate ">
                <div class="winner dem"><img alt="Hp-checkmark@2x" height="9" src="http://i1.nyt.com/projects/assets/election_2012/images/swatches/hp-checkmark@2x.png?1352320690" width="10" />Barack Obama</div>
              </td>
              <td class="party ">
                Dem.
              </td>
              <td class="votes ">
                2,916,811
              </td>
              <td class="pct ">
                57.3%
              </td>
              <td class="change ">
                -4.6%
              </td>
              <td class="evotes last">
                20
              </td>
          </tr>
          <tr class="">
              <td class="photo first">
    
              </td>
              <td class="candidate ">
                <div class="not-winner">Mitt Romney</div>
              </td>
              <td class="party ">
                Rep.
              </td>
              <td class="votes ">
                2,090,116
              </td>
              <td class="pct ">
                41.1%
              </td>
              <td class="change ">
                +4.3%
              </td>
              <td class="evotes last">
                0
              </td>
          </tr>
          <tr class="">
              <td class="photo first">
    
              </td>
              <td class="candidate ">
                <div class="not-winner">Gary Johnson</div>
              </td>
              <td class="party ">
                Lib.
              </td>
              <td class="votes ">
                54,798
              </td>
              <td class="pct ">
                1.1%
              </td>
              <td class="change ">
                &ndash;
              </td>
              <td class="evotes last">
                0
              </td>
          </tr>
          <tr class="last-row">
              <td class="photo first">
    
              </td>
              <td class="candidate ">
                <div class="not-winner">Jill Stein</div>
              </td>
              <td class="party ">
                Green
              </td>
              <td class="votes ">
                29,336
              </td>
              <td class="pct ">
                0.6%
              </td>
              <td class="change ">
                &ndash;
              </td>
              <td class="evotes last">
                0
              </td>
          </tr>
          <tr>
            <td class="footer" colspan="100">
              <a href="/2012/results/president">President Map</a> &nbsp;|&nbsp;
              <a href="/2012/results/president/big-board">President Big Board</a>&nbsp;|&nbsp;
              <a href="/2012/results/president/exit-polls?state=il">Exit Polls</a>
            </td>
          </tr>
      </tbody>
    </table>
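
The two calls in this answer can be compared on a smaller table. The following is only a sketch, not the original poster's code: the markup is a hypothetical cut-down table with a nested table added to show where the calls differ, and the built-in `html.parser` is assumed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: two outer rows, the first containing a nested table.
html = """<table><tbody>
<tr class="outer"><td>
  <table><tbody><tr class="inner"><td>nested</td></tr></tbody></table>
</td></tr>
<tr class="outer"><td>plain</td></tr>
</tbody></table>"""

soup = BeautifulSoup(html, "html.parser")
tbody = soup.table.tbody

# True matches every tag; recursive=False limits the search to direct
# children, so the NavigableString whitespace is skipped.
direct_children = tbody.find_all(True, recursive=False)

# find_all('tr') searches all descendants, so it also finds the nested row.
all_rows = tbody.find_all("tr")

# Combining a tag name with recursive=False keeps only direct-child rows.
direct_rows = tbody.find_all("tr", recursive=False)
```

On input without nested tables, such as the table above, `find_all('tr')` and `find_all('tr', recursive=False)` should return the same rows; the distinction only matters when rows can contain further tables.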
    
