Python BeautifulSoup - Get text of HTML Element


To get the text of an HTML element in Python using BeautifulSoup, use the Tag.text property. It returns a string containing the text content of the element and all of its descendants, with the tags and attributes left out.
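As a quick illustration of "just the text", the sketch below parses a one-line snippet (the markup and variable names are made up for this example) and shows that nested tags and attributes are dropped. It also checks that Tag.text gives the same result as calling Tag.get_text() with no arguments.

```python
from bs4 import BeautifulSoup

# Illustrative markup, not taken from the examples in this tutorial
snippet = '<p class="note">Hello <b>world</b>!</p>'

soup = BeautifulSoup(snippet, 'html.parser')
paragraph = soup.find('p')

# Only the text nodes are joined; <b> and class="note" do not appear
text = paragraph.text
print(text)

# Tag.text and Tag.get_text() with no arguments return the same string
same = paragraph.text == paragraph.get_text()
print(same)
```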

1. Get text of a Div element using Tag.text property in Python

In the following program, we store sample HTML in the html_content variable, find the div element with id="my_div", and get its text using the Tag.text property. We also call strip() on the result to remove the whitespace around the text.

Python Program

from bs4 import BeautifulSoup

# Sample HTML content
html_content = """
<html>
    <body>
        <div id="my_div" class="article sample">
            This is a sample div element.
        </div>
    </body>
</html>
"""

# Parse the HTML content
soup = BeautifulSoup(html_content, 'html.parser')

# Find the element whose text we want
element = soup.find(id="my_div")

# Get the text of the element, without surrounding whitespace
text = element.text.strip()

print(text)

Output

This is a sample div element.
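Note that soup.find() returns None when nothing matches, and reading .text on None raises an AttributeError. A minimal sketch of one way to guard against that follows; text_or_default is a hypothetical helper written for this example, not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup

html_content = '<div id="my_div">This is a sample div element.</div>'
soup = BeautifulSoup(html_content, 'html.parser')

def text_or_default(soup, element_id, default=""):
    """Hypothetical helper: stripped text of the element with this id, or a default."""
    element = soup.find(id=element_id)
    if element is None:
        # No element matched, so there is no text to read
        return default
    return element.text.strip()

found_text = text_or_default(soup, "my_div")
missing_text = text_or_default(soup, "no_such_id", default="(not found)")

print(found_text)
print(missing_text)
```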

2. Get text of a Div element having child elements in Python

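In the following program, we store sample HTML in the html_content variable, find the div element with id="my_div", and get its text using the Tag.text property. This div contains child elements, so the returned string includes the text of each child along with the whitespace between them.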

Python Program

from bs4 import BeautifulSoup

# Sample HTML content
html_content = """
<html>
    <body>
        <div id="my_div" class="article sample">
            <h2>Welcome!</h2>
            <p>This is a paragraph.</p>
        </div>
    </body>
</html>
"""

# Parse the HTML content
soup = BeautifulSoup(html_content, 'html.parser')

# Find the element whose text we want
element = soup.find(id="my_div")

# Get text of element
text = element.text

print(text)

Output


            Welcome!
            This is a paragraph.

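You can trim this text with the string strip() method to remove the surrounding newlines and spaces. strip() removes only leading and trailing whitespace, so the indentation between the two child elements stays in place.

text = element.text.strip()

The output then becomes:

Welcome!
            This is a paragraph.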
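If the whitespace between child elements is also unwanted, one option is Tag.get_text() with its separator and strip arguments, which strips each piece of text and joins the non-empty pieces with the given separator. The sketch below reuses the sample div from this section; the variable name clean_text is chosen here for illustration.

```python
from bs4 import BeautifulSoup

html_content = """
<html>
    <body>
        <div id="my_div" class="article sample">
            <h2>Welcome!</h2>
            <p>This is a paragraph.</p>
        </div>
    </body>
</html>
"""

soup = BeautifulSoup(html_content, 'html.parser')
element = soup.find(id="my_div")

# strip=True trims each text fragment and skips the empty ones;
# separator="\n" puts each remaining fragment on its own line
clean_text = element.get_text(separator="\n", strip=True)

print(clean_text)
```

This is handy when the text feeds into further processing and the indentation of the HTML source should not leak into the result.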

Summary

In this Python BeautifulSoup tutorial, we have seen how to get the text of a given HTML element as a string using the Tag.text property, and how to trim the surrounding whitespace with strip().