Python - Count number of occurrences of a word in Text File


Count Occurrences of a Word in Text File

To count the number of occurrences of a specific word in a text file, read the content of the file into a string and call the string's count() method. This tutorial walks through the process with examples and explanations.


Syntax of count()

The count() method of a string counts the non-overlapping occurrences of a substring in that string.

n = string.count(word)

Parameters:

  • word: The substring to search for in the string.

Return Value:

  • Returns an integer indicating the number of times the substring appears in the string.
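To make the syntax above concrete, here is a minimal sketch of count() on an in-memory string. The sample sentence is made up for illustration. Besides the substring, count() also accepts optional start and end positions, which the sketch uses to restrict the search to the first sentence.

```python
text = "Python is fun. python is popular. I like Python."

# Count every occurrence of the exact substring "Python" (case-sensitive)
total = text.count("Python")

# Optional start/end arguments restrict the search to a slice of the string;
# here only the first 14 characters ("Python is fun.") are searched
in_first_sentence = text.count("Python", 0, 14)

# Occurrences are counted without overlap: "aa" fits into "aaaa" twice, not three times
non_overlapping = "aaaa".count("aa")

print(total, in_first_sentence, non_overlapping)
```

The lowercase "python" in the second sentence is not counted, because the comparison is case-sensitive.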

Examples

1. Count How Many Times the Word "python" Occurred in a Text File

In this example, we will count the occurrences of the word "python" in the following text file.

Text File

Welcome to www.pythonexamples.org. Here, you will find python programs for all general use cases.

Python Program

# Open the file in read mode; the raw string keeps the backslashes literal,
# and the with statement closes the file automatically
with open(r"C:\workspace\python\data.txt", "r") as file:
    # Read content of file to a string
    data = file.read()

# Get number of occurrences of the word
occurrences = data.count("python")

print('Number of occurrences of the word:', occurrences)

Explanation (Step-by-Step):

  1. file.read(): Reads the file content into the variable data.
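  2. data.count("python"): Counts the number of times the substring "python" appears in the text. It matches once inside "www.pythonexamples.org" and once as the standalone word, which is why the result is 2.
  3. The result is printed to the console.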

Output

Number of occurrences of the word: 2
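For large files, reading the whole content with file.read() may use more memory than necessary. The following is a sketch of one possible alternative, not part of the program above: it counts line by line and adds up the results. The helper name count_in_file and the temporary file are assumptions made only so that the example runs on its own.

```python
import os
import tempfile

def count_in_file(path, word):
    """Count substring occurrences of word, reading the file one line at a time."""
    total = 0
    with open(path, "r", encoding="utf-8") as file:
        for line in file:
            total += line.count(word)
    return total

# Write the sample text to a temporary file so the sketch is self-contained
sample = "Welcome to www.pythonexamples.org. Here, you will find python programs for all general use cases.\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, encoding="utf-8") as tmp:
    tmp.write(sample)

result = count_in_file(tmp.name, "python")
os.remove(tmp.name)  # clean up the temporary file

print('Number of occurrences of the word:', result)
```

Because a word cannot span a line break, the per-line counts add up to the same total as counting on the whole string.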

2. Count Occurrences Case-Insensitively

By default, the count() function is case-sensitive. To count occurrences irrespective of case, convert the text and the word to lowercase before using count().

Text File

Python is a versatile language. python supports multiple paradigms. PYTHON is popular for its simplicity.

Python Program

# Open the file in read mode; the raw string keeps the backslashes literal,
# and the with statement closes the file automatically
with open(r"C:\workspace\python\data.txt", "r") as file:
    # Read content of file to a string and convert to lowercase
    data = file.read().lower()

# Get number of occurrences of the word (case-insensitive)
occurrences = data.count("python")

print('Number of occurrences of the word (case-insensitive):', occurrences)

Explanation:

  1. file.read().lower(): Reads the file content and converts the entire text to lowercase to enable case-insensitive comparison.
  2. data.count("python"): Counts the occurrences of "python" in the lowercase text.

Output

Number of occurrences of the word (case-insensitive): 3
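Converting to lowercase works well for plain English text. For text with non-ASCII letters, str.casefold() is a more aggressive alternative to lower(). The sketch below uses a made-up German sample string (an assumption, not one of the tutorial's files) to show the difference.

```python
text = "Straße, STRASSE and strasse all name the same thing."
word = "strasse"

# lower() leaves "ß" unchanged, so "Straße" is not matched
count_lower = text.lower().count(word.lower())

# casefold() maps "ß" to "ss", so all three spellings match
count_casefold = text.casefold().count(word.casefold())

print(count_lower, count_casefold)
```

With lower(), only two of the three spellings are found; with casefold(), all three are.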

3. Count Occurrences of a Word with Word Boundaries

Sometimes, you might want to count only exact matches of the word, excluding cases where it appears as part of another word (e.g., "python" should not count in "pythonic").

Text File

Python is great. pythonic is not the same as python. Learn Python.

Python Program

import re

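# Open the file in read mode; the raw string keeps the backslashes literal,
# and the with statement closes the file automatically
with open(r"C:\workspace\python\data.txt", "r") as file:
    # Read content of file to a string
    data = file.read()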

# Use regex to find word occurrences (case-insensitive)
pattern = r'\bpython\b'
occurrences = len(re.findall(pattern, data, re.IGNORECASE))

print('Number of occurrences of the word (exact match):', occurrences)

Explanation:

  1. re.findall(): Returns a list of all non-overlapping matches of the regex pattern \bpython\b in the text. \b matches only at a word boundary, so the "python" inside "pythonic" is not counted.
  2. re.IGNORECASE: Ensures the search is case-insensitive.
  3. len(): Returns the number of matches found by findall().

Output

Number of occurrences of the word (exact match): 3
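If the word to search for is not fixed, the pattern can be built at run time. The following sketch wraps the approach above in a helper; the name count_word and its ignore_case parameter are hypothetical. It uses re.escape() so that any regex metacharacters in the word are matched literally, and it assumes the word starts and ends with a letter, digit, or underscore, since \b only matches next to such characters.

```python
import re

def count_word(text, word, ignore_case=True):
    """Count whole-word matches of word in text."""
    # re.escape() makes any regex metacharacters in word match literally
    pattern = r'\b' + re.escape(word) + r'\b'
    flags = re.IGNORECASE if ignore_case else 0
    return len(re.findall(pattern, text, flags))

data = "Python is great. pythonic is not the same as python. Learn Python."

whole_any_case = count_word(data, "python")                       # whole words, any case
whole_exact_case = count_word(data, "python", ignore_case=False)  # whole words, exact case
substring_any_case = data.lower().count("python")                 # substrings, any case

print(whole_any_case, whole_exact_case, substring_any_case)
```

On the sample text, the substring count is one higher than the whole-word count because it also matches the "python" in "pythonic".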

Summary

In this tutorial of Python Examples, we explored how to count occurrences of a word in a text file using Python. We discussed:

  • Basic usage of the count() function.
  • Case-insensitive word counting by converting text to lowercase.
  • Exact word matching using regular expressions.

These techniques are versatile and can be adapted for different use cases based on your requirements.

