Count Overlapping Occurrences of Substring - Python Examples


Python - Find number of overlapping occurrences of a substring

There could be scenarios where the occurrences of a substring in a string overlap. For example, the string abababa has overlapping occurrences of the substring aba.

In this tutorial, we will learn how to find the number of these overlapping occurrences of a substring in a given string.


Steps to find the number of overlapping occurrences

To get the total number of occurrences of a substring in a string, including overlapping ones, follow these steps:

  1. Initialize a count variable to store the number of occurrences.
  2. Use a while loop to find the substring in the string iteratively using string.find(substring, start).
  3. If the substring is found, increment the count, update the start position to index + 1, and repeat the search.
  4. If the substring is not found, exit the loop.

Examples

1. Find the number of overlapping occurrences of a substring in a string

In this example, we find overlapping occurrences of the substring ghg in the string abcdefghghghghghgh.

Python Program

string = 'abcdefghghghghghgh'
substring = 'ghg'

count = 0
start = 0

while True:
    index = string.find(substring, start)
    if index == -1:
        break
    count += 1
    start = index + 1

print(count)

Explanation:

  1. The initial string is abcdefghghghghghgh and the substring is ghg.
  2. Start searching for the substring from index 0.
  3. Whenever the substring is found, increment the count and update the starting position to index + 1 to account for overlapping.
  4. Stop the loop when no more occurrences are found (when find returns -1).

Output:

5

2. Case-insensitive search for overlapping occurrences

In this example, we search for overlapping occurrences of a substring in a case-insensitive manner.

Python Program

string = 'AbAbAbABA'
substring = 'aba'

count = 0
start = 0
string_lower = string.lower()
substring_lower = substring.lower()

while True:
    index = string_lower.find(substring_lower, start)
    if index == -1:
        break
    count += 1
    start = index + 1

print(count)

Explanation:

  1. Convert both the string and the substring to lowercase using lower().
  2. Use the same logic as in the first example to find overlapping occurrences.

Output:

4

3. Finding overlapping occurrences in a larger string

This example demonstrates handling larger strings efficiently.

Python Program

string = 'a' * 1000 + 'aba' + 'a' * 1000
substring = 'aba'

count = 0
start = 0

while True:
    index = string.find(substring, start)
    if index == -1:
        break
    count += 1
    start = index + 1

print(count)

Explanation:

  1. The string contains 1000 'a's, followed by 'aba', followed by another 1000 'a's.
  2. Using the same logic as earlier, the program efficiently finds and counts overlapping occurrences.

Output:

1

Summary

In this tutorial, we learned how to find overlapping occurrences of a substring in a string using a while loop and the find() method. We covered examples for:

  • Basic substring searches with overlapping.
  • Case-insensitive searches.
  • Efficient handling of larger strings.

These techniques are useful for text processing tasks where precise control over substring matching is required.




Python Libraries