Python BeautifulSoup - Find Elements by Text
To select elements in an HTML page by their text using BeautifulSoup, first find candidate elements with a selection criterion such as a CSS selector or a tag name, and then filter those elements by the value of their text property.
In this tutorial, we shall go through a step-by-step process to find elements in a webpage that contain specific text.
Steps to find elements by text using BeautifulSoup
Let us see the steps to find div elements that contain the text 'Article'.
- Import BeautifulSoup from the bs4 library.
- Take the HTML content as a string in html_content, and the text to search for in search_text.
- Parse the HTML content using the BeautifulSoup() constructor, and store the returned object in soup.
- Call the select() method on the soup object, and pass the required selector value. The select() method returns a list of Tag elements.
- Use a Python for loop to iterate over the list, and for each element, check if the element's text contains the search text.
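The steps above can also be condensed into one small helper. The following is only a sketch: the function name find_elements_by_text and its parameters are our own choices for illustration, not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup


def find_elements_by_text(html_content, selector, search_text):
    """Hypothetical helper: return elements matching `selector` whose text contains `search_text`."""
    # Parse the HTML content
    soup = BeautifulSoup(html_content, "html.parser")
    # Select candidate elements, then keep those whose text contains the search text
    return [element for element in soup.select(selector)
            if search_text in element.get_text()]


sample_html = "<div>Article 1</div><div>Story 1</div>"
matches = find_elements_by_text(sample_html, "div", "Article")
for element in matches:
    print(element)
```

The check is a plain substring test, so it is case-sensitive. Note that get_text() includes the text of nested elements as well.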
Example: Get Div Elements with the Text 'Article'
In the following program, we take sample HTML content, and then find the div elements containing the text 'Article' using the BeautifulSoup.select() method.
Python Program
from bs4 import BeautifulSoup
# Sample HTML content
html_content = """
<html>
<body>
<div>Article 1</div>
<div>Article 2</div>
<div>Story 1</div>
<div>Story 2</div>
</body>
</html>
"""
# Search text
search_text = "Article"
# Parse the HTML content
soup = BeautifulSoup(html_content, "html.parser")
# Get all div elements
div_elements = soup.select('div')
# Filter div elements based on text
elements_with_text = []
for element in div_elements:
    if search_text in element.text:
        elements_with_text.append(element)
# Print filtered elements
for element in elements_with_text:
    print(element)
Output
<div>Article 1</div>
<div>Article 2</div>
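As an alternative to filtering in a loop, the find_all() method accepts a string argument that can be combined with a tag name. The sketch below assumes the same sample HTML and uses a compiled regular expression for the search text. It is one possible variation, not a replacement for the program above.

```python
import re

from bs4 import BeautifulSoup

# Same sample HTML content as in the example above
html_content = """
<html>
<body>
<div>Article 1</div>
<div>Article 2</div>
<div>Story 1</div>
<div>Story 2</div>
</body>
</html>
"""
search_text = "Article"

soup = BeautifulSoup(html_content, "html.parser")

# re.escape() makes the search text match literally, even if it
# contains characters that are special in regular expressions
pattern = re.compile(re.escape(search_text))

# With a tag name and string=, find_all() tests the pattern
# against the .string value of each div element
matching_divs = soup.find_all("div", string=pattern)

for element in matching_divs:
    print(element)
```

Keep in mind that .string is None for an element that has more than one child, for example a div that mixes text with nested tags. Such elements are skipped by this approach, whereas the loop that checks element.text still finds them.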
Summary
In this Python BeautifulSoup tutorial, we have seen how to find elements in an HTML page by their text using BeautifulSoup, with a step-by-step process and an example.