How To Check If A Soup Contains An Element?
I have an HTML document. I would like to check whether it contains at least one English section. Such a section is signified by a heading like this:
<h2 id="English">English</h2>
Solution 1:
You can try select_one instead of find. Something like this:
soup.select_one('details[data-level="2"] summary.section-heading h2#English')
If the element exists, the result will be the following (otherwise select_one returns None):
<h2 id="English">English</h2>
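To turn that lookup into a yes/no check, here is a minimal sketch. It assumes BeautifulSoup 4.7 or later, where select_one is backed by the soupsieve package. The helper name has_language_section and the trimmed sample markup are illustrative and not part of the original question.

```python
from bs4 import BeautifulSoup

# Trimmed-down sample markup (illustrative only).
SAMPLE_HTML = """
<div id="mw-content-text">
  <details data-level="2" open="">
    <summary class="section-heading"><h2 id="English">English</h2></summary>
    <details data-level="3" open="">abc</details>
  </details>
  <details data-level="2" open="">
    <summary class="section-heading"><h2 id="French">French</h2></summary>
    <details data-level="3" open="">abc</details>
  </details>
</div>
"""


def has_language_section(soup, lang):
    """Return True if a level-2 section has an h2 whose id equals lang.

    Hypothetical helper: lang is interpolated into the selector as-is,
    so it is assumed not to contain double quotes.
    """
    selector = f'details[data-level="2"] summary.section-heading h2[id="{lang}"]'
    # select_one returns the first matching Tag, or None when nothing matches.
    return soup.select_one(selector) is not None


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
print(has_language_section(soup, "English"))  # True
print(has_language_section(soup, "German"))   # False
```

Comparing against None keeps the check explicit and returns a plain boolean rather than a Tag.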
Solution 2:
You can use find_all to collect the level-2 sections and then search each one for the language you want:
from bs4 import BeautifulSoup
texte = """
<div id="bodyContent" class="content mw-parser-output"><div id="mw-content-text" style="direction: ltr;"><h1 class="section-heading" tabindex="0" aria-haspopup="true" data-section-id="0"><span class="mw-headline" id="title_0">pomme</span></h1><details data-level="2" open=""><summary class="section-heading"><h2 id="English">English</h2></summary><details data-level="3" open="">abc</details></details><details data-level="2" open=""><summary class="section-heading"><h2 id="French">French</h2></summary><details data-level="3" open="">abc</details></details></div></div>
"""
soup = BeautifulSoup(texte, 'html.parser')
details = soup.find_all("details", {"data-level": "2"})
lang = "English"
for detail in details:
    # Serialize the section and look for the language name anywhere in it.
    detail_str = str(detail)
    if lang in detail_str:
        print(detail)
Outputs:
<details data-level="2" open=""><summary class="section-heading"><h2 id="English">English</h2></summary><details data-level="3" open="">abc</details></details>
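Note that a substring test on the serialized section also matches when the language name only appears in the body text. The sketch below contrasts it with a check on the heading's id. The sample markup (a French section that mentions English) is made up for illustration.

```python
from bs4 import BeautifulSoup

# Made-up sample: a French section whose body mentions the word "English".
html_doc = """
<details data-level="2" open=""><summary class="section-heading"><h2 id="French">French</h2></summary>
<details data-level="3" open="">Borrowed from English.</details></details>
"""

soup = BeautifulSoup(html_doc, "html.parser")
lang = "English"
sections = soup.find_all("details", {"data-level": "2"})

# Substring test: matches because "English" occurs in the section body.
substring_hit = any(lang in str(section) for section in sections)

# Heading test: looks only for an h2 whose id equals the language name.
heading_hit = any(section.find("h2", id=lang) is not None for section in sections)

print(substring_hit, heading_hit)  # True False
```

Matching on the h2 id avoids this kind of false positive, at the cost of depending on the page keeping that id convention.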
Solution 3:
As BeautifulSoup doesn't have XPath support, we can use lxml as an alternative.
from lxml import html
texte = """
<div id="bodyContent" class="content mw-parser-output">
<div id="mw-content-text" style="direction: ltr;">
<h1 class="section-heading" tabindex="0" aria-haspopup="true" data-section-id="0">
<span class="mw-headline" id="title_0">pomme</span>
</h1>
<details data-level="2" open="">
<summary class="section-heading"><h2 id="English">English</h2></summary>
<details data-level="3" open="">abc</details>
</details>
<details data-level="2" open="">
<summary class="section-heading"><h2 id="French">French</h2></summary>
<details data-level="3" open="">abc</details>
</details>
</div>
</div>
"""
tree = html.fromstring(texte)
element = tree.xpath('//details[@data-level="2"]//h2[contains(text(),"English")]')
if element:
    print("Found")
else:
    print("Not found")
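As a variation, the XPath boolean() function lets lxml return a Python bool directly, and an XPath variable avoids building the expression by string concatenation. This is a sketch only: the helper name has_section and the trimmed markup are illustrative, and it matches on the h2 id rather than on the heading text.

```python
from lxml import html

# Trimmed-down sample markup (illustrative only).
texte = """
<div id="mw-content-text">
  <details data-level="2" open="">
    <summary class="section-heading"><h2 id="English">English</h2></summary>
    <details data-level="3" open="">abc</details>
  </details>
</div>
"""

tree = html.fromstring(texte)


def has_section(tree, lang):
    """Return True if a level-2 details element contains an h2 with id == lang."""
    # $lang is an XPath variable; lxml binds it from the keyword argument.
    return tree.xpath('boolean(//details[@data-level="2"]//h2[@id=$lang])', lang=lang)


print(has_section(tree, "English"))  # True
print(has_section(tree, "French"))   # False
```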