Python BeautifulSoup - Get text of HTML Element
To get the text of an HTML element in Python using BeautifulSoup, you can use the Tag.text property. It returns a string containing the text content of the element and all of its descendants, with the tags and attributes removed.
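As a quick illustration of "just the text, no tags or attributes", here is a minimal sketch. The one-line HTML snippet and the variable names are made up for this example and are not part of the tutorial's sample content.

```python
from bs4 import BeautifulSoup

# Hypothetical snippet: a paragraph with a nested <b> tag and a class attribute
snippet = '<p class="greeting">Hello <b>world</b>!</p>'

soup = BeautifulSoup(snippet, "html.parser")
paragraph = soup.find("p")

# .text joins the text of the element and its descendants; markup is dropped
paragraph_text = paragraph.text
print(paragraph_text)
```

The printed string contains the text of the nested b element but neither the tags nor the class attribute.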
1. Get text of a Div element using Tag.text property in Python
In the following program, we take sample HTML content in the html_content variable, find the div element with id="my_div", and then get the text of this div element using the Tag.text property.
Python Program
from bs4 import BeautifulSoup
# Sample HTML content
html_content = """
<html>
<body>
<div id="my_div" class="article sample">
This is a sample div element.
</div>
</body>
</html>
"""
# Parse the HTML content
soup = BeautifulSoup(html_content, 'html.parser')
# Consider this as given element
element = soup.find(id="my_div")
# Get text of element
text = element.text.strip()
print(text)
Output
This is a sample div element.
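Note that find() returns None when no element matches, and reading .text on None raises an AttributeError. The following is a minimal sketch of one way to guard against that. The helper name text_by_id and its default parameter are hypothetical, introduced only for illustration.

```python
from bs4 import BeautifulSoup

html_content = """
<html>
<body>
<div id="my_div" class="article sample">
This is a sample div element.
</div>
</body>
</html>
"""

soup = BeautifulSoup(html_content, "html.parser")


def text_by_id(soup, element_id, default=""):
    """Hypothetical helper: stripped text of the element with this id, or default."""
    element = soup.find(id=element_id)
    if element is None:
        # No matching element, so there is no text to read
        return default
    return element.text.strip()


found_text = text_by_id(soup, "my_div")
missing_text = text_by_id(soup, "no_such_id", default="(not found)")
print(found_text)
print(missing_text)
```

Returning a default keeps the calling code simple. Raising an exception instead may be the better choice when a missing element indicates a real problem with the page.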
2. Get text of a Div element having child elements in Python
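In the following program, we take sample HTML content in the html_content variable, find the div element with id="my_div", which contains a heading and a paragraph as child elements, and then get the text of this div element using the Tag.text property. The returned string includes the text of all the child elements.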
Python Program
from bs4 import BeautifulSoup
# Sample HTML content
html_content = """
<html>
<body>
<div id="my_div" class="article sample">
<h2>Welcome!</h2>
<p>This is a paragraph.</p>
</div>
</body>
</html>
"""
# Parse the HTML content
soup = BeautifulSoup(html_content, 'html.parser')
# Consider this as given element
element = soup.find(id="my_div")
# Get text of element
text = element.text
print(text)
Output
Welcome!
This is a paragraph.
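The text returned here starts and ends with a newline, because the div contains line breaks around its child elements. You can trim it with the string strip() method to remove these leading and trailing newlines.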
text = element.text.strip()
Then, you would get the following output.
Welcome!
This is a paragraph.
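If you need more control over how the text of the child elements is joined, the Tag.get_text() method and the Tag.stripped_strings generator may be useful. The sketch below reuses the sample HTML from this section. The choice of a single space as separator is only an assumption for the example.

```python
from bs4 import BeautifulSoup

html_content = """
<html>
<body>
<div id="my_div" class="article sample">
<h2>Welcome!</h2>
<p>This is a paragraph.</p>
</div>
</body>
</html>
"""

soup = BeautifulSoup(html_content, "html.parser")
element = soup.find(id="my_div")

# Strip each piece of text, drop the empty ones, and join with a space
joined_text = element.get_text(separator=" ", strip=True)
print(joined_text)

# Or collect the stripped pieces individually
pieces = list(element.stripped_strings)
print(pieces)
```

Here get_text() produces a single line, while stripped_strings gives one entry per piece of text, which is convenient when each child element should be processed separately.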
Summary
In this Python BeautifulSoup tutorial, we have seen how to get the text of a given HTML element as a string using the Tag.text property, including the case where the element has child elements, and how to trim the surrounding whitespace with strip().